
Google Vertex AI

Neha Duggal

Dec 17, 2025


Generative AI is now a foundational capability inside modern enterprises, and Google Vertex AI has emerged as a core platform for building, tuning, and deploying Large Language Models and multimodal AI workloads at scale. With support for models like Gemini, PaLM, Claude, Llama, and a broad ecosystem of third-party and open-source offerings, Vertex gives organizations a unified environment to experiment, operationalize, and accelerate AI adoption.                              

Recently, I wrote about the governance challenges and risks associated with Amazon Bedrock and today I’ll explore how the same principles apply to Google Vertex.    

As teams expand the contexts in which they use AI models, from prototype notebooks to production endpoints, Vertex also introduces a powerful new identity surface. Model inputs often carry sensitive business logic, customer information, or proprietary data. Access to fine-tuning pipelines or batch prediction jobs can expose internal data or inadvertently propagate unvetted training content. In multi-team, multi-project environments, even small misconfigurations can silently bypass enterprise identity and governance controls.

As with Amazon Bedrock, managing who can invoke, tune, deploy, and share models in Vertex AI is therefore fundamentally an identity governance challenge rather than a cloud operations one. This post explores where the hidden risks emerge, which controls matter most, and what “good” identity maturity looks like for organizations scaling AI on Vertex.

Mapping Identity Risks in Google Vertex AI    

A quick review of the permission and operational workflow of Vertex reveals several risk areas that security and identity teams should be mindful of:    

Runtime invocation risk    

Vertex model execution relies on permissions such as aiplatform.endpoints.predict and aiplatform.endpoints.explain. Broadly assigned roles (especially Google’s default high-privilege roles like “Vertex AI Admin” or “Editor”) allow any user or workload to run inference on production endpoints.    

This can lead to:    

  • Sensitive data exposure through inference inputs
  • Unmonitored mass prediction jobs that drive cost spikes
  • Abuse of internal ML services from compromised credentials

As in other AI platforms, invocation rights become a form of privilege and require the same rigour as traditional privileged access.    
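
For illustration, here is a minimal sketch of how little stands between an over-permissioned identity and a production model. It uses the Vertex AI Python SDK; the project and endpoint IDs are placeholders, and any principal whose role includes aiplatform.endpoints.predict could run it:

  # Any identity holding aiplatform.endpoints.predict on this endpoint can
  # do this; project and endpoint IDs below are hypothetical.
  from google.cloud import aiplatform

  aiplatform.init(project="prod-project", location="us-central1")

  endpoint = aiplatform.Endpoint(
      "projects/prod-project/locations/us-central1/endpoints/1234567890"
  )

  # Inference inputs may carry sensitive business or customer data.
  response = endpoint.predict(instances=[{"prompt": "Summarise this contract: ..."}])
  print(response.predictions)

Nothing in this call distinguishes a legitimate application from a compromised credential running bulk predictions, which is why invocation rights deserve privileged-access treatment.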

Model lifecycle and tuning risk    

Vertex allows users to fine-tune foundation models, create pipelines, and register custom models in the Model Registry. Permissions like aiplatform.trainingPipelines.create or aiplatform.models.upload grant the ability to:    

  • Inject unvalidated training data
  • Modify model behavior
  • Push new model versions into shared registries

If left unchecked, development teams can unintentionally introduce compliance issues, or malicious insiders could embed harmful instructions or data.    
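
As a sketch of how low the bar is (the display name, bucket path, and container image are placeholders), a single SDK call is enough to register a new model in a shared registry once an identity holds aiplatform.models.upload; IAM sees the permission, not whether the artifact or its training data was ever reviewed:

  # Registers a new model artifact in the Model Registry of a shared project.
  from google.cloud import aiplatform

  aiplatform.init(project="ml-shared-registry", location="us-central1")

  model = aiplatform.Model.upload(
      display_name="contract-summariser",
      artifact_uri="gs://team-bucket/models/contract-summariser/v7",
      serving_container_image_uri=(
          "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-3:latest"
      ),
  )
  print(model.resource_name)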

Cross-project and Cross-region risk    

Vertex is tightly integrated with Google Cloud Resource Manager. While this enables cross-project model consumption and multi-region deployments, it also creates new propagation paths for access. Without strong governance:    

  • A model endpoint in a “sandbox” project may become callable from a production project
  • Data residency requirements may be bypassed by invoking models from regions with different regulatory constraints
  • Service accounts reused across projects may unintentionally inherit expansive privileges

Multi-project GCP estates must treat Vertex as a shared identity boundary, not an isolated ML service.    
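
A sketch of how quietly that propagation happens (project IDs are hypothetical): once a production service account is granted predict rights on a sandbox resource, the cross-project call looks like any other inference request:

  # Credentials belong to a service account in "prod-project", but the
  # endpoint lives in "sandbox-project"; the call succeeds as long as that
  # account holds predict rights on the sandbox resource, and nothing in the
  # request itself flags the boundary crossing.
  from google.cloud import aiplatform

  aiplatform.init(project="prod-project", location="us-central1")

  sandbox_endpoint = aiplatform.Endpoint(
      "projects/sandbox-project/locations/us-central1/endpoints/9876543210"
  )
  response = sandbox_endpoint.predict(instances=[{"prompt": "..."}])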

Auditability and attribution risk    

Cloud Audit Logs capture Vertex activities—but correlating an inference request or training pipeline to a human or workload identity is non-trivial if:    

  • Service accounts are long-lived
  • API keys are shared
  • Workload Identity Federation mappings are inconsistent
  • IAM Conditions are not configured to restrict context

Without reliable mapping, audit logs become noisy and attribution becomes guesswork.    
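
To see where the trail tends to go cold, here is a sketch (project and filter are illustrative) that pulls recent Vertex AI audit entries and prints the principal behind each one; when that principal is a shared, long-lived service account, attribution stops at the workload rather than the human behind it:

  # Lists recent Vertex AI audit entries and prints who made each call.
  from google.cloud import logging

  client = logging.Client(project="prod-project")
  log_filter = (
      'protoPayload.serviceName="aiplatform.googleapis.com" '
      'AND logName:"cloudaudit.googleapis.com"'
  )

  for entry in client.list_entries(
      filter_=log_filter, order_by=logging.DESCENDING, max_results=20
  ):
      payload = entry.payload or {}
      auth = payload.get("authenticationInfo", {})
      print(entry.timestamp, payload.get("methodName"), auth.get("principalEmail"))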

Vertex AI Governance Guidance    

Organizations that want to safely scale AI on Vertex should treat identity governance as a fundamental operational requirement. At P0, we recommend that security teams prioritise the following:    

Replace standing access with just-in-time invocation    

Vertex permissions are often granted through broad IAM roles attached to human users or long-lived service accounts, and over time these accumulate dormant access that can be misused. Instead, implement controls that require users and workloads to obtain short-lived, scoped credentials for only the endpoints, models, or pipelines they need. This drastically limits the impact of compromised credentials or misconfigured roles.
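
One way to approximate this on GCP, sketched below with hypothetical principals and role names, is to have callers impersonate a narrowly scoped service account for a short lifetime rather than hold standing keys:

  # The caller holds no standing Vertex permissions; it mints a 10-minute
  # token for a service account whose only role is a narrow "endpoint
  # invoker" role, then uses it for a single inference session.
  import google.auth
  from google.auth import impersonated_credentials
  from google.cloud import aiplatform

  source_credentials, _ = google.auth.default()

  short_lived = impersonated_credentials.Credentials(
      source_credentials=source_credentials,
      target_principal="vertex-invoker@prod-project.iam.gserviceaccount.com",
      target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
      lifetime=600,  # seconds
  )

  aiplatform.init(
      project="prod-project", location="us-central1", credentials=short_lived
  )
  endpoint = aiplatform.Endpoint(
      "projects/prod-project/locations/us-central1/endpoints/1234567890"
  )
  response = endpoint.predict(instances=[{"prompt": "..."}])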

Enforce separation of duties across the model lifecycle    

Teams that manage, tune, and deploy models should not be the same identities that consume them in production. Security teams should split roles for model training, registration, and invocation while using appropriate roles to enforce least privilege. This reduces insider risk and improves traceability across the AI supply chain.    
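
A sketch of what that split can look like at the project level (the groups, custom role, and project ID are placeholders); the point is that no single identity appears on both sides of the lifecycle:

  # Grants broad Vertex usage to the ML engineering group for building and
  # registering models, and a predict-only custom role to the consuming
  # workload, keeping the two sides of the lifecycle on separate identities.
  from googleapiclient import discovery

  crm = discovery.build("cloudresourcemanager", "v1")
  project = "prod-project"

  policy = crm.projects().getIamPolicy(resource=project, body={}).execute()
  policy["bindings"] += [
      {   # builds, tunes, and registers models
          "role": "roles/aiplatform.user",
          "members": ["group:ml-engineering@example.com"],
      },
      {   # invokes endpoints only, via a custom role limited to
          # aiplatform.endpoints.predict
          "role": "projects/prod-project/roles/vertexEndpointInvoker",
          "members": [
              "serviceAccount:order-service@prod-project.iam.gserviceaccount.com"
          ],
      },
  ]
  crm.projects().setIamPolicy(resource=project, body={"policy": policy}).execute()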

Establish identity provenance and a clear trust chain    

Vertex activities frequently originate from federated identities or service workloads running in Cloud Run, GKE, or CI/CD pipelines. To ensure traceability, normalise workload identities via Workload Identity Federation, map identities between your enterprise IdP and GCP IAM, and require each inference request or lifecycle action to be associated with a verified identity in your authoritative IdP. This ensures every action in Vertex has a clear, auditable origin.
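
As a sketch, assuming a Workload Identity Federation credential configuration file has already been generated for your identity pool and provider, a workload can pick up federated credentials instead of a service account key before it ever touches Vertex:

  # Loads credentials from a workload identity federation configuration
  # (no service account key on disk); the federated identity is what shows
  # up in Cloud Audit Logs for subsequent Vertex calls.
  import google.auth
  from google.cloud import aiplatform

  credentials, project_id = google.auth.load_credentials_from_file(
      "wif-credential-config.json",  # generated for your pool/provider
      scopes=["https://www.googleapis.com/auth/cloud-platform"],
  )

  aiplatform.init(
      project=project_id or "prod-project",
      location="us-central1",
      credentials=credentials,
  )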

Govern cross-project, cross-region, and shared-service usage    

Vertex’s ability to share models across teams is powerful but dangerous without guardrails. Security teams should use appropriate GCP Organization Policies to restrict model invocation to approved regions, apply project-level boundaries for any data residency rules, and limit cross-project access to trusted projects using a principled resource hierarchy design. While these are technical controls, you should also work with your security and governance teams to ensure that AI adoption aligns with your corporate governance policies.
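
As one example of such a guardrail, here is a sketch (using the Resource Manager v1 org-policy surface, with placeholder values) that pins where resources in a project can be created, which also constrains where Vertex workloads can land:

  # Restricts resource creation in the project to EU locations via the
  # gcp.resourceLocations list constraint; project and values are placeholders.
  from googleapiclient import discovery

  crm = discovery.build("cloudresourcemanager", "v1")

  crm.projects().setOrgPolicy(
      resource="projects/prod-project",
      body={
          "policy": {
              "constraint": "constraints/gcp.resourceLocations",
              "listPolicy": {"allowedValues": ["in:eu-locations"]},
          }
      },
  ).execute()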

In a nutshell

Platforms like Google Vertex AI empower organizations to rapidly scale generative AI experimentation and production workloads, but they also redefine the identity attack surface. The ability to invoke a model, launch a fine-tuning pipeline, or deploy a new model version becomes a high-impact privilege and one that demands least-privilege enforcement, separation of duties, and reliable identity attribution.    

Security maturity in the age of generative AI will increasingly be measured not only by whether teams can innovate quickly, but whether they can innovate securely, ensuring the right identities have the right access at the right time. With structured identity governance, Vertex can become an engine of safe, compliant AI acceleration rather than a hidden vector for risk.

Struggling to control production access in hybrid or multi-cloud environments?

Get a demo of P0 Security, the next-gen PAM platform built for every identity.