Claude on Google Vertex AI: GCP Integration Guide
By Dorian Laurenceau
📅 Last reviewed: April 24, 2026. Updated with April 2026 findings and community feedback.
🔗 Pillar article: Claude API: Complete Guide
Why Claude on Vertex AI?
If your infrastructure runs on Google Cloud, Vertex AI is the most direct path to Claude.
| Advantage | Description |
|---|---|
| GCP billing | Claude appears on your Google Cloud bill |
| Google Cloud IAM | Access control with service accounts and roles |
| VPC Service Controls | Network security perimeter |
| BigQuery | Direct BigQuery ↔ Claude connection for analytics |
| Model Garden | Unified model catalog (Claude + Gemini + open source) |
| Compliance | SOC 2, ISO 27001, HIPAA, FedRAMP |
| Managed quotas | Capacity management via GCP quotas |
The practical reason teams end up on Vertex AI instead of the direct Anthropic API, visible in ongoing threads on r/googlecloud and r/devops, is almost never a technical preference. It's procurement. If your org already has a negotiated GCP commit, a VPC-SC perimeter, and a centralized billing process, adding another SaaS vendor with its own contract, its own DPA, and its own invoicing is a three-month fight; enabling Claude in Model Garden is three clicks. The Vertex AI Model Garden listing for Claude exists precisely because this friction is real and widespread.
Where the community correctly pushes back: "it's the same model" is almost, but not quite, true. Model availability lags the direct API (new Claude versions land on Anthropic first, then on Bedrock and Vertex), some features such as extended thinking or fine-grained safety controls can arrive later, and the quota and region footprints differ. If you're building a product that must ship the day a new Claude version releases, Vertex is the wrong plane of abstraction; if you're building an internal enterprise tool, it's exactly the right one.
Pragmatic heuristic: choose the direct API for greenfield experimentation and anything latency-sensitive, choose Vertex (or Bedrock) when IT, legal, or finance would otherwise block the rollout.
Setup: Step by Step
1. Enable Claude in the Model Garden
- Go to the GCP console > Vertex AI > Model Garden
- Search for "Claude" in the catalog
- Click Enable for the desired models
- Accept the Anthropic terms of use
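If you script project setup, the underlying Vertex AI API can also be enabled from the CLI (assuming gcloud is installed and authenticated). Note that model enablement and accepting the Anthropic terms still happen in the console:

```shell
# Enable the Vertex AI API in the current project
gcloud services enable aiplatform.googleapis.com
```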
2. Configure a Service Account
# Create a service account
gcloud iam service-accounts create claude-vertex \
--display-name="Claude Vertex AI"
# Assign the Vertex AI User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:claude-vertex@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
# Generate a key (for local development)
gcloud iam service-accounts keys create key.json \
--iam-account=claude-vertex@YOUR_PROJECT_ID.iam.gserviceaccount.com
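For local development with the key generated above, point Application Default Credentials at the file so the SDK picks it up automatically:

```shell
# Make the key visible to ADC (local development only;
# prefer keyless ADC on GCP infrastructure in production)
export GOOGLE_APPLICATION_CREDENTIALS="$(pwd)/key.json"
```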
3. Install the SDKs
pip install "anthropic[vertex]" google-auth  # quotes prevent shells like zsh from expanding the brackets
Code Examples
Basic Call with the Anthropic SDK
import anthropic
client = anthropic.AnthropicVertex(
project_id="your-gcp-project-id",
region="us-east5"
)
message = client.messages.create(
model="claude-sonnet-4@20250514",
max_tokens=1024,
messages=[
{"role": "user", "content": "Explain the benefits of BigQuery for data analysis."}
]
)
print(message.content[0].text)
With System Prompt
message = client.messages.create(
model="claude-sonnet-4@20250514",
max_tokens=2048,
system="You are an expert GCP cloud architect. Recommend Google Cloud solutions.",
messages=[
{"role": "user", "content": "I need an architecture for a real-time data pipeline."}
]
)
Streaming
with client.messages.stream(
model="claude-sonnet-4@20250514",
max_tokens=2048,
messages=[{"role": "user", "content": "Write a GKE migration guide."}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
Authentication on GCP (no JSON key)
# Automatic authentication via Application Default Credentials
# Works on GCE, Cloud Run, GKE, Cloud Functions
client = anthropic.AnthropicVertex(
project_id="your-project-id",
region="us-east5"
# No explicit credentials needed!
)
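On a developer workstation without a service-account key, the same pattern works after a one-time ADC login:

```shell
# Create local Application Default Credentials tied to your user account
gcloud auth application-default login
```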
Pricing: Vertex AI vs Direct API
| Model | Vertex Input | Vertex Output | Direct API Input | Direct API Output |
|---|---|---|---|---|
| Claude Opus 4.6 | $15.00/M | $75.00/M | $15.00/M | $75.00/M |
| Claude Sonnet 4 | $3.00/M | $15.00/M | $3.00/M | $15.00/M |
| Claude Haiku 3.5 | $0.80/M | $4.00/M | $0.80/M | $4.00/M |
Vertex pricing advantages:
- Committed Use Discounts (CUDs): discounts for volume commitments
- Per-second billing: no minimum
- GCP credits: usable for Claude (including free startup credits)
- Batch predictions: reduced rates for batch processing
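As a quick sanity check on the table above, here is a small cost estimator. The prices are hardcoded from the table; verify them against current GCP pricing before relying on the numbers:

```python
# Per-million-token prices (input, output) in USD, taken from the table above.
PRICES = {
    "claude-opus-4": (15.00, 75.00),
    "claude-sonnet-4": (3.00, 15.00),
    "claude-haiku-3.5": (0.80, 4.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost for one call, ignoring caching and batch discounts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: 50k input + 2k output tokens on Sonnet
print(round(estimate_cost("claude-sonnet-4", 50_000, 2_000), 4))  # → 0.18
```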
BigQuery Integration
A powerful use case: using Claude to analyze BigQuery data.
from google.cloud import bigquery
import anthropic
# Retrieve BigQuery data
bq_client = bigquery.Client()
query = """
SELECT product_name, SUM(revenue) as total_revenue, COUNT(*) as orders
FROM `project.dataset.sales`
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY product_name
ORDER BY total_revenue DESC
LIMIT 20
"""
results = bq_client.query(query).to_dataframe()  # requires pandas (and db-dtypes for some column types)
# Analyze with Claude
vertex_client = anthropic.AnthropicVertex(
project_id="your-project-id",
region="us-east5"
)
message = vertex_client.messages.create(
model="claude-sonnet-4@20250514",
max_tokens=2048,
messages=[{
"role": "user",
"content": f"""Analyze this sales data from the last 30 days:
{results.to_markdown()}
Provide:
1. The top 3 performing products and why
2. Notable trends
3. Recommendations for next month"""
}]
)
print(message.content[0].text)
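One practical caveat with this pattern: a large result set can blow past your prompt budget. A minimal sketch (pure string handling; the function name is mine) that caps a markdown table before embedding it in the prompt:

```python
def cap_markdown_rows(table: str, max_rows: int = 50) -> str:
    """Keep the header, separator, and at most max_rows data rows of a markdown table."""
    lines = table.splitlines()
    header, body = lines[:2], lines[2:]
    if len(body) <= max_rows:
        return table
    kept = header + body[:max_rows]
    kept.append(f"... ({len(body) - max_rows} rows omitted)")
    return "\n".join(kept)
```

Use it as `cap_markdown_rows(results.to_markdown())` before interpolating the table into the prompt; the LIMIT 20 in the SQL above already bounds this example, but ad-hoc queries often don't.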
Enterprise Architecture on GCP
Typical Architecture
Cloud Run + Claude
# app.py - Cloud Run service with Claude
from flask import Flask, request, jsonify
import anthropic
app = Flask(__name__)
client = anthropic.AnthropicVertex(
project_id="your-project-id",
region="us-east5"
)
@app.route("/analyze", methods=["POST"])
def analyze():
data = request.json
message = client.messages.create(
model="claude-sonnet-4@20250514",
max_tokens=2048,
messages=[{"role": "user", "content": data["question"]}]
)
return jsonify({
"answer": message.content[0].text,
"model": message.model,
"tokens": {
"input": message.usage.input_tokens,
"output": message.usage.output_tokens
}
})
if __name__ == "__main__":
app.run(host="0.0.0.0", port=8080)
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
CMD ["python", "app.py"]
# Deploy to Cloud Run
gcloud run deploy claude-service \
--source . \
--region us-east5 \
--service-account claude-vertex@YOUR_PROJECT.iam.gserviceaccount.com
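Once deployed, a quick smoke test against the service URL printed by `gcloud run deploy` (the URL below is a placeholder):

```shell
# Replace the URL with the one printed by the deploy command
curl -X POST "https://claude-service-xxxx.run.app/analyze" \
  -H "Content-Type: application/json" \
  -d '{"question": "Summarize our GKE migration options."}'
```

If the service is not public, add an `-H "Authorization: Bearer $(gcloud auth print-identity-token)"` header.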
Bedrock vs Vertex AI Comparison
| Aspect | Amazon Bedrock | Google Vertex AI |
|---|---|---|
| Cloud provider | AWS | GCP |
| Available models | Claude, Llama, Mistral, Titan | Claude, Gemini, Llama, Mistral |
| Data analytics | Athena, Redshift | BigQuery (advantage) |
| Container orchestration | ECS/EKS | GKE, Cloud Run |
| Guardrails | Built-in | Via Model Monitoring |
| Claude regions | us-east-1, us-west-2, eu-west-1 | us-east5, us-central1, europe-west1 |
| Batch API | ✅ | ✅ |
Available Regions
| Region | Code | Latency (from London) |
|---|---|---|
| US East (Ohio) | us-east5 | ~100ms |
| US Central (Iowa) | us-central1 | ~120ms |
| Europe West (Belgium) | europe-west1 | ~15ms |
Recommendation: Use europe-west1 for European applications (minimal latency + easier GDPR compliance).
Migrating from the Direct API
| Change | Direct API | Vertex AI |
|---|---|---|
| Client | anthropic.Anthropic() | anthropic.AnthropicVertex() |
| Auth | API key | GCP Service Account |
| Model ID | claude-sonnet-4-20250514 | claude-sonnet-4@20250514 |
| Additional params | none | project_id, region |
| Rest of code | Identical | Identical |
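The only mechanical change in existing code is the model ID separator; a tiny helper (the function name is mine) to translate direct-API IDs into the Vertex format shown above:

```python
def to_vertex_model_id(direct_id: str) -> str:
    """Convert a direct-API model ID to the Vertex format,
    which uses '@' instead of '-' before the date suffix."""
    base, _, date = direct_id.rpartition("-")
    return f"{base}@{date}"

print(to_vertex_model_id("claude-sonnet-4-20250514"))  # → claude-sonnet-4@20250514
```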
FAQ
What is Google Vertex AI?
Vertex AI is Google Cloud's MLOps platform that provides access to AI models (including Claude) via the Model Garden. It natively integrates with BigQuery, Cloud Storage, and the GCP ecosystem.
Why use Claude via Vertex AI instead of the direct API?
Vertex AI offers unified GCP billing, integration with BigQuery for analytics, Google Cloud IAM quotas and permissions, VPC Service Controls, and GCP enterprise compliance.
How do I configure access to Claude on Vertex AI?
Enable Claude in the Vertex AI Model Garden, create a service account with the Vertex AI User role, configure authentication, and use the Anthropic SDK's AnthropicVertex client.
Which GCP regions support Claude?
Claude is available on Vertex AI in us-east5, us-central1, and europe-west1 regions. Available regions may vary depending on the model and demand.