ai-api

Google Gemini API

Gemini 2.5 Pro, Flash, Flash-Lite — multimodal + 2M context

Gemini 2.5 Pro (1M-2M context), 2.5 Flash (fast/cheap), 2.5 Flash-Lite (cheapest). Multimodal (text, image, video, audio). Also available via Vertex AI (enterprise) + AI Studio (free dev).

Google AI Studio ↗Docs ↗

Pricing

Tier	Price	Notes
Free Tier (AI Studio)	Free	Generous free tier with rate limits. Good for dev + prototyping. Data may be used to improve Google products.
Paid API (Gemini API)	Free	Pay-as-you-go per-token. Data NOT used for training.
Vertex AI (GCP)	Free	Enterprise deployment via Google Cloud. Same pricing structure + GCP features (IAM, VPC-SC, CMEK).
Gemini Enterprise	Custom	Custom. Gemini 2.5 Deep Think model access + Google Workspace + Agentspace.

Limits

Tier	Metric	Value	Notes
—	batch api discount	50% off all models	Batch API
—	context caching	Up to 90% savings on cached prefixes	Context caching
—	context window pro	1048576 tokens	Gemini 2.5 Pro context (1M). 2M rolling out.
—	embedding-004	$0.025/M tokens	text-embedding-004
—	free tier data usage	Free tier prompts MAY be used to train/improve Google products	Critical privacy note
—	function calling	Full support incl. parallel function calls	Function calling
—	gemini-2.5-flash input	$0.30/M tokens	Gemini 2.5 Flash input
—	gemini-2.5-flash-lite input	$0.10/M tokens	Flash-Lite input
—	gemini-2.5-flash-lite output	$0.40/M tokens	Flash-Lite output
—	gemini-2.5-flash output	$2.50/M tokens	Gemini 2.5 Flash output
—	gemini-2.5-pro input	$1.25/M (≤200k tokens), $2.50/M (>200k)	Gemini 2.5 Pro input
—	gemini-2.5-pro output	$10/M (≤200k), $15/M (>200k)	Gemini 2.5 Pro output
—	grounding with search	Google Search grounding available for factual queries	Search grounding
—	imagen-3	$0.04/image (standard)	Imagen 3 generation
—	multimodal inputs	text, image, video, audio, PDF	Accepted input types
—	paid tier data usage	Paid tier prompts are NEVER used for training	Paid tier privacy

Features

Batch API — 50% discount for async processing.
Code Execution — Python code interpreter tool (sandboxed). · docs
Context Caching — Cache system instructions + tools for up to 90% savings. · docs
File API — Upload large files (up to 2 GB) for multimodal prompts.
Function Calling — JSON schema-based tool calling. Parallel supported. · docs
generateContent API — Core generation endpoint. · docs
Grounding with Search — Augment answers with Google Search results. Fact-checked citations returned. · docs
Model Tuning — Supervised fine-tuning via AI Studio.
Multimodal Live API — Bidirectional streaming voice + video (WebSocket). · docs
Safety Settings — Configurable thresholds for harm categories.
streamGenerateContent — Streaming variant with SSE.

Developer interfaces

Slug	Name	Kind	Version
mcp	Gemini MCP	mcp	—
rest-api	Gemini REST API	rest	v1beta
sdk-node	@google/genai	sdk	1.x
sdk-go	google-genai-go	sdk	1.x
sdk-python	google-genai (Python)	sdk	1.x
vertex-ai	Vertex AI Endpoint	rest	v1

Compare Google Gemini API with

ai-api

Google Gemini API vs Anthropic API

Side-by-side breakdown.

ai-api

Google Gemini API vs AssemblyAI

Side-by-side breakdown.

ai-api

Google Gemini API vs Deepgram

Side-by-side breakdown.

ai-api

Google Gemini API vs ElevenLabs

Side-by-side breakdown.

ai-api

Google Gemini API vs Groq

Side-by-side breakdown.

ai-api

Google Gemini API vs OpenAI API

Side-by-side breakdown.

ai-api

Google Gemini API vs Replicate

Side-by-side breakdown.

ai-api

Google Gemini API vs Together AI

Side-by-side breakdown.

Staxly is an independent catalog of developer platforms. Outbound links to Google Gemini API are plain references to their official pages. Pricing is verified at publication time — reconfirm on the vendor site before buying.