ai-api
Google Gemini API
Gemini 2.5 Pro, Flash, Flash-Lite — multimodal + 2M context
Gemini 2.5 Pro (1M-2M context), 2.5 Flash (fast/cheap), 2.5 Flash-Lite (cheapest). Multimodal (text, image, video, audio). Also available via Vertex AI (enterprise) + AI Studio (free dev).
Pricing
| Tier | Price | Notes |
|---|---|---|
| Free Tier (AI Studio) | Free | Generous free tier with rate limits. Good for dev + prototyping. Data may be used to improve Google products. |
| Paid API (Gemini API) | Free | Pay-as-you-go per-token. Data NOT used for training. |
| Vertex AI (GCP) | Free | Enterprise deployment via Google Cloud. Same pricing structure + GCP features (IAM, VPC-SC, CMEK). |
| Gemini Enterprise | Custom | Custom. Gemini 2.5 Deep Think model access + Google Workspace + Agentspace. |
Limits
| Tier | Metric | Value | Notes |
|---|---|---|---|
| — | batch api discount | 50% off all models | Batch API |
| — | context caching | Up to 90% savings on cached prefixes | Context caching |
| — | context window pro | 1048576 tokens | Gemini 2.5 Pro context (1M). 2M rolling out. |
| — | embedding-004 | $0.025/M tokens | text-embedding-004 |
| — | free tier data usage | Free tier prompts MAY be used to train/improve Google products | Critical privacy note |
| — | function calling | Full support incl. parallel function calls | Function calling |
| — | gemini-2.5-flash input | $0.30/M tokens | Gemini 2.5 Flash input |
| — | gemini-2.5-flash-lite input | $0.10/M tokens | Flash-Lite input |
| — | gemini-2.5-flash-lite output | $0.40/M tokens | Flash-Lite output |
| — | gemini-2.5-flash output | $2.50/M tokens | Gemini 2.5 Flash output |
| — | gemini-2.5-pro input | $1.25/M (≤200k tokens), $2.50/M (>200k) | Gemini 2.5 Pro input |
| — | gemini-2.5-pro output | $10/M (≤200k), $15/M (>200k) | Gemini 2.5 Pro output |
| — | grounding with search | Google Search grounding available for factual queries | Search grounding |
| — | imagen-3 | $0.04/image (standard) | Imagen 3 generation |
| — | multimodal inputs | text, image, video, audio, PDF | Accepted input types |
| — | paid tier data usage | Paid tier prompts are NEVER used for training | Paid tier privacy |
Features
- Batch API — 50% discount for async processing.
- Code Execution — Python code interpreter tool (sandboxed). · docs
- Context Caching — Cache system instructions + tools for up to 90% savings. · docs
- File API — Upload large files (up to 2 GB) for multimodal prompts.
- Function Calling — JSON schema-based tool calling. Parallel supported. · docs
- generateContent API — Core generation endpoint. · docs
- Grounding with Search — Augment answers with Google Search results. Fact-checked citations returned. · docs
- Model Tuning — Supervised fine-tuning via AI Studio.
- Multimodal Live API — Bidirectional streaming voice + video (WebSocket). · docs
- Safety Settings — Configurable thresholds for harm categories.
- streamGenerateContent — Streaming variant with SSE.
Developer interfaces
| Slug | Name | Kind | Version |
|---|---|---|---|
| mcp | Gemini MCP | mcp | — |
| rest-api | Gemini REST API | rest | v1beta |
| sdk-node | @google/genai | sdk | 1.x |
| sdk-go | google-genai-go | sdk | 1.x |
| sdk-python | google-genai (Python) | sdk | 1.x |
| vertex-ai | Vertex AI Endpoint | rest | v1 |
Compare Google Gemini API with
ai-api
Google Gemini API vs Anthropic API
Side-by-side breakdown.
ai-api
Google Gemini API vs AssemblyAI
Side-by-side breakdown.
ai-api
Google Gemini API vs Deepgram
Side-by-side breakdown.
ai-api
Google Gemini API vs ElevenLabs
Side-by-side breakdown.
ai-api
Google Gemini API vs Groq
Side-by-side breakdown.
ai-api
Google Gemini API vs OpenAI API
Side-by-side breakdown.
ai-api
Google Gemini API vs Replicate
Side-by-side breakdown.
ai-api
Google Gemini API vs Together AI
Side-by-side breakdown.
Staxly is an independent catalog of developer platforms. Outbound links to Google Gemini API are plain references to their official pages. Pricing is verified at publication time — reconfirm on the vendor site before buying.