
Claude on Google Vertex AI: GCP Integration Guide

By Dorian Laurenceau

📅 Last reviewed: April 24, 2026. Updated with April 2026 findings and community feedback.

🔗 Pillar article: Claude API: Complete Guide


Why Claude on Vertex AI?

If your infrastructure runs on Google Cloud, Vertex AI is the most direct path to Claude.

| Advantage | Description |
| --- | --- |
| GCP billing | Claude appears on your Google Cloud bill |
| Google Cloud IAM | Access control with service accounts and roles |
| VPC Service Controls | Network security perimeter |
| BigQuery | Direct BigQuery ↔ Claude connection for analytics |
| Model Garden | Unified model catalog (Claude + Gemini + open source) |
| Compliance | SOC 2, ISO 27001, HIPAA, FedRAMP |
| Managed quotas | Capacity management via GCP quotas |

The practical reason teams end up on Vertex AI instead of the direct Anthropic API, visible in the ongoing threads on r/googlecloud and r/devops: it's almost never a technical preference. It's procurement. If your org already has a negotiated GCP commit, a VPC-SC perimeter, and a centralized billing process, adding another SaaS vendor with its own contract, its own DPA, and its own invoicing is a 3-month fight — whereas enabling Claude in Model Garden is three clicks. The Vertex AI Model Garden page for Claude exists exactly because this friction is real and widespread.

Where the community correctly pushes back: "it's the same model" is almost but not quite true. Model availability lags the direct API (new Claude versions land on Anthropic first, then Bedrock and Vertex), some features like extended thinking or fine-grained safety controls can arrive later, and quota/region footprints are different. If you're building a product that must ship on the day a new Claude version releases, Vertex is the wrong plane of abstraction; if you're building an enterprise internal tool, it's exactly the right one.

Pragmatic heuristic: choose the direct API for greenfield experimentation and anything latency-sensitive, choose Vertex (or Bedrock) when IT, legal, or finance would otherwise block the rollout.

Setup: Step by Step

1. Enable Claude in the Model Garden

  1. Go to the GCP console > Vertex AI > Model Garden
  2. Search for "Claude" in the catalog
  3. Click Enable for the desired models
  4. Accept the Anthropic terms of use

2. Configure a Service Account

# Create a service account
gcloud iam service-accounts create claude-vertex \
    --display-name="Claude Vertex AI"

# Assign the Vertex AI User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:claude-vertex@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

# Generate a key (for local development)
gcloud iam service-accounts keys create key.json \
    --iam-account=claude-vertex@YOUR_PROJECT_ID.iam.gserviceaccount.com
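For local development, point Application Default Credentials at the key you just generated via the standard `GOOGLE_APPLICATION_CREDENTIALS` environment variable; the Anthropic SDK (like any google-auth client) picks it up automatically:

```shell
# Point Application Default Credentials at the downloaded key
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/key.json"

# Any google-auth-based client in this shell now authenticates with it
echo "$GOOGLE_APPLICATION_CREDENTIALS"
```

Prefer this only for local work; on GCP itself, skip the key entirely (see the authentication section below).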

3. Install the SDKs

pip install "anthropic[vertex]" google-auth

Code Examples

Basic Call with the Anthropic SDK

import anthropic

client = anthropic.AnthropicVertex(
    project_id="your-gcp-project-id",
    region="us-east5"
)

message = client.messages.create(
    model="claude-sonnet-4@20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the benefits of BigQuery for data analysis."}
    ]
)

print(message.content[0].text)

With System Prompt

message = client.messages.create(
    model="claude-sonnet-4@20250514",
    max_tokens=2048,
    system="You are an expert GCP cloud architect. Recommend Google Cloud solutions.",
    messages=[
        {"role": "user", "content": "I need an architecture for a real-time data pipeline."}
    ]
)

Streaming

with client.messages.stream(
    model="claude-sonnet-4@20250514",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Write a GKE migration guide."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Authentication on GCP (no JSON key)

# Automatic authentication via Application Default Credentials
# Works on GCE, Cloud Run, GKE, Cloud Functions
client = anthropic.AnthropicVertex(
    project_id="your-project-id",
    region="us-east5"
    # No explicit credentials needed!
)
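One pattern that keeps the same code working locally and on GCP is to read the project and region from environment variables with local-dev defaults. The variable names `GCP_PROJECT` and `CLAUDE_REGION` here are illustrative choices, not anything the SDK reads itself:

```python
import os

def vertex_config():
    """Read Vertex settings from the environment, with local-dev defaults.

    GCP_PROJECT / CLAUDE_REGION are illustrative names -- the Anthropic SDK
    does not look for them itself; you pass the values in explicitly.
    """
    project_id = os.environ.get("GCP_PROJECT", "your-project-id")
    region = os.environ.get("CLAUDE_REGION", "us-east5")
    return project_id, region

# project_id, region = vertex_config()
# client = anthropic.AnthropicVertex(project_id=project_id, region=region)
```

Set the variables once in your Cloud Run or GKE deployment config and the same image runs in any region without code changes.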

Pricing: Vertex AI vs Direct API

| Model | Vertex Input | Vertex Output | Direct API Input | Direct API Output |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | $15.00/M | $75.00/M | $15.00/M | $75.00/M |
| Claude Sonnet 4 | $3.00/M | $15.00/M | $3.00/M | $15.00/M |
| Claude Haiku 3.5 | $0.80/M | $4.00/M | $0.80/M | $4.00/M |

Vertex pricing advantages:

  • Committed Use Discounts (CUDs): Volume commitment discounts
  • Per-second billing: No minimum
  • GCP credits: Usable for Claude (including free startup credits)
  • Batch predictions: Reduction for batch processing
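Since per-token prices are identical on both channels, a back-of-envelope helper makes it easy to estimate what a request costs before committing to a model tier (defaults below are the Sonnet 4 figures from the table; adjust if pricing changes):

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m=3.00, output_price_per_m=15.00):
    """Estimate one request's cost in USD from per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token answer on Sonnet 4:
# request_cost(2000, 500) -> 0.0135 USD
```

Multiply by your expected daily request volume to sanity-check a CUD commitment before signing it.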

BigQuery Integration

A powerful use case: using Claude to analyze BigQuery data.

from google.cloud import bigquery
import anthropic

# Retrieve BigQuery data
bq_client = bigquery.Client()
query = """
    SELECT product_name, SUM(revenue) as total_revenue, COUNT(*) as orders
    FROM `project.dataset.sales`
    WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY product_name
    ORDER BY total_revenue DESC
    LIMIT 20
"""
results = bq_client.query(query).to_dataframe()

# Analyze with Claude
vertex_client = anthropic.AnthropicVertex(
    project_id="your-project-id",
    region="us-east5"
)

message = vertex_client.messages.create(
    model="claude-sonnet-4@20250514",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": f"""Analyze this sales data from the last 30 days:

{results.to_markdown()}

Provide:
1. The top 3 performing products and why
2. Notable trends
3. Recommendations for next month"""
    }]
)

print(message.content[0].text)
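Real BigQuery results are often far larger than the LIMIT 20 above. A simple guard before interpolating the rendered table into the prompt keeps request size predictable; the character budget here is an arbitrary illustration, not an API constraint:

```python
def clamp_for_prompt(table_markdown: str, max_chars: int = 8000) -> str:
    """Truncate a rendered table so the prompt stays a predictable size.

    max_chars is an arbitrary budget for illustration, not an API limit.
    """
    if len(table_markdown) <= max_chars:
        return table_markdown
    truncated = table_markdown[:max_chars]
    # Cut at the last complete line so we do not send a half-formed row
    truncated = truncated.rsplit("\n", 1)[0]
    return truncated + "\n... (table truncated)"
```

For genuinely large datasets, aggregate in SQL first (as the query above does) rather than asking Claude to crunch raw rows.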

Enterprise Architecture on GCP


Cloud Run + Claude

# app.py - Cloud Run service with Claude
from flask import Flask, request, jsonify
import anthropic

app = Flask(__name__)
client = anthropic.AnthropicVertex(
    project_id="your-project-id",
    region="us-east5"
)

@app.route("/analyze", methods=["POST"])
def analyze():
    data = request.json
    
    message = client.messages.create(
        model="claude-sonnet-4@20250514",
        max_tokens=2048,
        messages=[{"role": "user", "content": data["question"]}]
    )
    
    return jsonify({
        "answer": message.content[0].text,
        "model": message.model,
        "tokens": {
            "input": message.usage.input_tokens,
            "output": message.usage.output_tokens
        }
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
CMD ["python", "app.py"]
# Deploy to Cloud Run
gcloud run deploy claude-service \
    --source . \
    --region us-east5 \
    --service-account claude-vertex@YOUR_PROJECT.iam.gserviceaccount.com
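Vertex enforces per-region quotas, so a production handler should expect 429s. A minimal retry loop with exponential backoff might look like this; the exception handling is deliberately generic, and in practice you would catch `anthropic.RateLimitError` or `anthropic.APIStatusError` rather than bare `Exception`:

```python
import time

def call_with_retry(fn, max_attempts=4, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on failure.

    Generic sketch: in production, catch anthropic.RateLimitError /
    anthropic.APIStatusError instead of every Exception.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# message = call_with_retry(lambda: client.messages.create(...))
```

Pair this with a request timeout on the Cloud Run service so a stuck call cannot hold a container instance indefinitely.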

Bedrock vs Vertex AI Comparison

| Aspect | Amazon Bedrock | Google Vertex AI |
| --- | --- | --- |
| Cloud provider | AWS | GCP |
| Available models | Claude, Llama, Mistral, Titan | Claude, Gemini, Llama, Mistral |
| Data analytics | Athena, Redshift | BigQuery (advantage) |
| Container orchestration | ECS/EKS | GKE, Cloud Run |
| Guardrails | Built-in | Via Model Monitoring |
| Claude regions | us-east-1, us-west-2, eu-west-1 | us-east5, us-central1, europe-west1 |
| Batch API | Yes | Yes |

Available Regions

| Region | Code | Latency (from London) |
| --- | --- | --- |
| US East (Ohio) | us-east5 | ~100ms |
| US Central (Iowa) | us-central1 | ~120ms |
| Europe West (Belgium) | europe-west1 | ~15ms |

Recommendation: Use europe-west1 for European applications (minimal latency + easier GDPR compliance).

Migrating from the Direct API

| Change | Direct API | Vertex AI |
| --- | --- | --- |
| Client | anthropic.Anthropic() | anthropic.AnthropicVertex() |
| Auth | API key | GCP Service Account |
| Model ID | claude-sonnet-4-20250514 | claude-sonnet-4@20250514 |
| Additional params | (none) | project_id, region |
| Rest of code | Identical | Identical |
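Beyond swapping the client class, the only code-level change is the model identifier: Vertex separates the name and date with @ instead of a final hyphen. A tiny helper smooths a migration (it assumes the ID ends in an 8-digit date, which holds for current models but is not a guarantee from Anthropic):

```python
def to_vertex_model_id(direct_id: str) -> str:
    """Convert a direct-API model ID to its Vertex AI form.

    'claude-sonnet-4-20250514' -> 'claude-sonnet-4@20250514'
    Assumes the ID ends in an 8-digit date (true for current models).
    """
    name, date = direct_id.rsplit("-", 1)
    if not (len(date) == 8 and date.isdigit()):
        raise ValueError(f"Unexpected model ID format: {direct_id}")
    return f"{name}@{date}"
```

Keeping the direct-API ID in config and converting at the call site lets the same configuration drive both clients.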


Dorian Laurenceau

Full-Stack Developer & Learning Designer

Full-stack web developer and learning designer. I spent 4 years as a freelance full-stack developer and 4 years teaching React, JavaScript, HTML/CSS and WordPress to adult learners. Today I design learning paths in web development and AI, grounded in learning science. I founded learn-prompting.fr to make AI practical and accessible, and built the Bluff app to gamify political transparency.

Prompt Engineering · LLMs · Full-Stack Development · Learning Design · React

Published: March 10, 2026 · Updated: April 24, 2026

FAQ

What is Google Vertex AI?

Vertex AI is Google Cloud's MLOps platform that provides access to AI models (including Claude) via the Model Garden. It natively integrates with BigQuery, Cloud Storage, and the GCP ecosystem.

Why use Claude via Vertex AI instead of the direct API?

Vertex AI offers unified GCP billing, integration with BigQuery for analytics, Google Cloud IAM quotas and permissions, VPC Service Controls, and GCP enterprise compliance.

How do I configure access to Claude on Vertex AI?

Enable Claude in the Vertex AI Model Garden, create a service account with the Vertex AI User role, configure authentication, and use the Anthropic SDK with the vertex parameter.

Which GCP regions support Claude?

Claude is available on Vertex AI in us-east5, us-central1, and europe-west1 regions. Available regions may vary depending on the model and demand.