google-gemini-api
Comprehensive integration guide and examples for Google Gemini using @google/genai, covering SDK and fetch usage, multimodal inputs, caching, code execution, grounding, and templates. Requires GEMINI_API_KEY and calls https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent; bundled shell commands are purpose-aligned (no secret exfiltration found).
Google Gemini API Skill for Claude Code CLI
Status: Phase 2 Complete ✅
Latest SDK: @google/genai@1.34.0 (⚠️ NOT @google/generative-ai, which is DEPRECATED)
API Coverage: Text Generation, Multimodal, Function Calling, Streaming, Thinking Mode, Context Caching, Code Execution, Grounding with Google Search
⚠️ CRITICAL: SDK Migration Warning
DEPRECATED: @google/generative-ai (sunset Nov 30, 2025)
CURRENT: @google/genai v1.34+ (use this!)
If you see code using @google/generative-ai, it is outdated. This skill uses the correct, current SDK.
What This Skill Does
This skill provides comprehensive knowledge for building applications with Google Gemini API using the correct current SDK (@google/genai v1.34+) and accurate 2025-2026 model information.
Key Capabilities
Phase 1 - Core Features
✅ Text Generation with Gemini 2.5 Pro/Flash/Flash-Lite (GA models)
✅ Streaming with Server-Sent Events (SSE) and async iteration
✅ Multimodal inputs (text + images + video + audio + PDFs)
✅ Function calling (basic + parallel execution)
✅ Thinking mode (adaptive reasoning on 2.5 models)
✅ System instructions for behavior guidance
✅ Multi-turn chat (conversation history management)
✅ Both the Node.js SDK (@google/genai) and fetch-based (Cloudflare Workers) approaches
✅ Accurate context windows: 1,048,576 input / 65,536 output tokens (NOT 2M for 2.5 models!)
Phase 2 - Advanced Features
✅ Context Caching (cost optimization with TTL-based caching - up to 90% savings)
✅ Code Execution (built-in Python sandbox for data analysis and computation)
✅ Grounding with Google Search (real-time web information + citations)
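Context caching works by creating a cache once (via the SDK's caches API) and then referencing it by name on subsequent requests, so the cached tokens are billed at the discounted rate. A minimal sketch of the second half of that flow, assuming the REST/SDK field name `cachedContent` and a cache name of the form `cachedContents/...` returned by a prior create call:

```typescript
// Sketch: attaching a previously created cache to a generateContent request.
// The `cachedContent` field name follows the v1beta REST API; the cache name
// ("cachedContents/...") comes from an earlier caches.create call (not shown).

interface GenerateRequest {
  contents: Array<{ role?: string; parts: Array<{ text: string }> }>;
  cachedContent?: string;
}

function buildCachedRequest(cacheName: string, prompt: string): GenerateRequest {
  return {
    cachedContent: cacheName, // e.g. "cachedContents/abc123"
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
  };
}

// Usage (network call omitted):
// const body = JSON.stringify(
//   buildCachedRequest('cachedContents/abc123', 'Summarize the cached document')
// );
```

Only the new prompt tokens are sent at full price; everything referenced by the cache name is served from the cache until its TTL expires.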
Auto-Trigger Keywords
Primary Keywords (Core API)
gemini api, google gemini, @google/genai, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, genai sdk, google ai, gemini sdk
Model Names
gemini 2.5, gemini 2.0, gemini pro, gemini flash, gemini flash lite
Text Generation
gemini text generation, generate content gemini, gemini chat, gemini completion, gemini response
Streaming
gemini streaming, stream gemini, gemini sse, server-sent events gemini, async iteration gemini, streaming tokens gemini
Multimodal Keywords
gemini multimodal, gemini vision, gemini image, gemini video, gemini audio, gemini pdf, image understanding gemini, analyze image gemini, video understanding gemini, audio transcription gemini, pdf parsing gemini
Function Calling
function calling gemini, gemini tools, tool calling gemini, function declarations gemini, parallel function calling gemini, compositional function calling, gemini tool use
Thinking Mode
thinking mode gemini, gemini thinking, adaptive reasoning gemini, thinking budget gemini, gemini reasoning
System Instructions & Chat
system instructions gemini, gemini system prompt, multi-turn gemini, conversation history gemini, chat history gemini, gemini chat sdk
Configuration
gemini temperature, gemini top-p, gemini top-k, stop sequences gemini, generation config gemini, response mime type gemini
Context Caching (Phase 2)
context caching gemini, gemini caching, gemini cache ttl, prompt caching gemini, cache tokens gemini, gemini cost optimization, reduce cost gemini, gemini cache video, cache documents gemini, 90% savings gemini
Code Execution (Phase 2)
code execution gemini, gemini python, gemini code interpreter, run code gemini, executable code gemini, gemini data analysis, gemini calculations, gemini pandas, gemini numpy, gemini matplotlib, generate and run code
Grounding with Google Search (Phase 2)
grounding gemini, google search gemini, gemini grounding, real-time information gemini, gemini search retrieval, gemini citations, fact-checking gemini, gemini sources, web search gemini, current events gemini
SDK Migration
@google/generative-ai deprecated, migrate gemini sdk, gemini sdk migration, generative-ai deprecated, update gemini sdk
Context Window
gemini context window, gemini token limit, gemini 1m tokens, gemini 2m tokens (⚠️ only Gemini 1.5 Pro has 2M, NOT 2.5 models!), context length gemini
Error Keywords
gemini api error, gemini 401, gemini 429, gemini rate limit, invalid api key gemini, model not found gemini, gemini context window exceeded, function calling error gemini, tool schema invalid gemini, streaming parse error gemini, multimodal format error gemini, thinking mode not supported, deprecated sdk error, @google/generative-ai not found
Error Keywords - Phase 2
cache not found gemini, cache ttl expired, invalid model version gemini, code execution failed gemini, execution timeout gemini, python package not available gemini, grounding requires google cloud, grounding not working gemini, no grounding metadata
Integration Keywords
nextjs gemini, react gemini, cloudflare workers gemini, vercel gemini, gemini backend, gemini server, gemini edge runtime
Comparison Keywords
gemini vs openai, gemini vs claude, gemini vs gpt, google ai vs openai
When to Use This Skill
✅ Use google-gemini-api When:
- Building AI applications with Google's Gemini models
- Need multimodal AI (text + images + video + audio + PDFs)
- Implementing long-context applications (1M+ tokens)
- Using thinking mode for complex reasoning
- Need function calling with parallel execution
- Want streaming responses for better UX
- Deploying to Cloudflare Workers or other edge runtimes
- Building chat applications with conversation history
- Need to migrate from deprecated @google/generative-ai
❌ Don't Use google-gemini-api When:
- You specifically need embeddings (see the separate google-gemini-embeddings skill for text-embedding-004)
- You're using a different AI API provider (OpenAI, Anthropic Claude, etc.)
Quick Example
Text Generation (Node.js SDK)
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Explain quantum computing in simple terms'
});
console.log(response.text);
Text Generation (Fetch - Cloudflare Workers)
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
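The same fetch payload can carry multimodal parts alongside text. A hedged sketch of building an inline base64 image part (camelCase field names `inlineData`/`mimeType` as accepted by the API; REST examples sometimes show the snake_case equivalents):

```typescript
// Sketch: a multimodal `contents` payload pairing an inline base64 image
// with a text question, in the part shape the Gemini API expects.

function buildImagePrompt(base64Png: string, question: string) {
  return {
    contents: [{
      parts: [
        { inlineData: { mimeType: 'image/png', data: base64Png } },
        { text: question },
      ],
    }],
  };
}

// In Node, the base64 data typically comes from a file, e.g.:
// const base64Png = fs.readFileSync('chart.png').toString('base64');
```

The resulting object can be passed as the JSON body of the same `:generateContent` fetch call shown above.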
Streaming
const response = await ai.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: 'Write a 200-word story about AI'
});
for await (const chunk of response) {
process.stdout.write(chunk.text);
}
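With fetch instead of the SDK, streaming uses the `:streamGenerateContent?alt=sse` endpoint, where each event arrives as a `data: {JSON}` line whose JSON mirrors a non-streaming response. A minimal parser sketch for one received chunk (a production version would also buffer events split across chunk boundaries):

```typescript
// Sketch: extracting text from one SSE chunk of a
// `:streamGenerateContent?alt=sse` response body.

function extractSseText(chunk: string): string {
  let out = '';
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length).trim();
    if (!payload) continue;
    try {
      const json = JSON.parse(payload);
      const parts = json.candidates?.[0]?.content?.parts ?? [];
      for (const part of parts) {
        if (typeof part.text === 'string') out += part.text;
      }
    } catch {
      // Partial JSON means the event was split across chunks; a full
      // implementation would buffer it until the next chunk arrives.
    }
  }
  return out;
}
```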
Known Issues Prevented
| Issue | Cause | Solution in Skill |
|---|---|---|
| Using deprecated SDK | Installing @google/generative-ai | Prominent warnings + migration guide |
| Wrong context window claims | Claiming 2M tokens for 2.5 models | Accurate: 1,048,576 input tokens |
| Model not found errors | Using old/wrong model names | Current model list (gemini-2.5-pro/flash/flash-lite) |
| Chat not working with fetch | Chat helpers are an SDK-only feature | Document SDK requirement for chat helpers |
| Function calling on Flash-Lite | Model doesn't support it | Model capabilities matrix |
| Invalid API key (401) | Missing GEMINI_API_KEY | Environment setup guide |
| Rate limit errors (429) | Too many requests | Exponential backoff pattern |
| Streaming parse errors | Incorrect SSE parsing | Complete SSE implementation |
| Multimodal format errors | Wrong image/video encoding | Base64/URL examples |
| Function schema errors | Invalid OpenAPI subset | Schema validation examples |
| Thinking mode on old models | Only 2.5 models support it | Model feature matrix |
| Parameter conflicts | Using unsupported params | Generation config reference |
| Token counting errors | Multimodal token estimation | Token counting guide |
| System instruction placement | Wrong position in request | Correct structure examples |
| Parallel function call errors | Dependencies not handled | Compositional vs parallel guide |
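The exponential-backoff pattern referenced for 429 errors can be sketched as follows; the retry budget, base delay, and cap are illustrative values, not API-mandated numbers:

```typescript
// Sketch: exponential backoff with full jitter for 429 (rate limit) responses.

function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * exp); // full jitter: uniform in [0, exp)
}

async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxRetries = 5
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

Full jitter spreads retries out so that many clients rate-limited at the same moment do not retry in lockstep.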
Current Models (2025-2026)
Gemini 3 Series (December 2025)
- gemini-3-flash: 🆕 Best speed+quality balance for production (1,048,576 input / 65,536 output tokens)
Gemini 2.5 Series (General Availability - Stable)
- gemini-2.5-pro: State-of-the-art thinking model (1,048,576 input / 65,536 output tokens)
- gemini-2.5-flash: Best price-performance (1,048,576 input / 65,536 output tokens)
- gemini-2.5-flash-lite: Cost-optimized, fastest (1,048,576 input / 65,536 output tokens)
Feature Matrix
| Feature | 3-Flash | 2.5-Pro | 2.5-Flash | 2.5-Flash-Lite |
|---|---|---|---|---|
| Thinking Mode | ✅ | ✅ | ✅ | ✅ |
| Function Calling | ✅ | ✅ | ✅ | ❌ |
| Multimodal | ✅ | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ | ✅ |
⚠️ Note: Gemini 2.5 Flash-Lite does NOT support function calling!
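For the models that do support function calling, declarations use the OpenAPI schema subset. A hedged sketch of a single-tool config (the @google/genai SDK also exports a `Type` enum for the schema `type` fields; plain strings are shown here for clarity):

```typescript
// Sketch: a tools entry with one function declaration in the OpenAPI schema
// subset the Gemini API expects. `get_weather` is a hypothetical example tool.

const getWeatherTool = {
  functionDeclarations: [{
    name: 'get_weather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name, e.g. "Sydney"' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
      },
      required: ['city'],
    },
  }],
};

// Passed via config (SDK call sketched, not executed here):
// const response = await ai.models.generateContent({
//   model: 'gemini-2.5-flash',
//   contents: 'What is the weather in Sydney?',
//   config: { tools: [getWeatherTool] },
// });
```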
Token Efficiency
Without Skill
- Research APIs + SDK: ~22,000 tokens
- Fall into deprecated SDK trap: +5,000 tokens (debugging)
- Context window confusion: +3,000 tokens (debugging)
- Total: ~30,000 tokens
With Skill
- Skill discovery + implementation: ~10,500 tokens
- Zero debugging (all errors prevented)
- Total: ~10,500 tokens
Savings: ~65% (19,500 tokens)
What You Get
SKILL.md Content
- Complete API reference (1200+ lines)
- Gemini 2.5 specific guidance
- Streaming patterns (SDK + fetch)
- Function calling (basic + parallel)
- Multimodal examples (images, video, audio, PDFs)
- Thinking mode configuration
- System instructions & chat
- SDK migration guide
- Top 15 errors with solutions
- Production best practices
11 Templates
- package.json (dependencies)
- text-generation-basic.ts (SDK)
- text-generation-fetch.ts (Cloudflare Workers)
- streaming-chat.ts (SDK with async iteration)
- streaming-fetch.ts (SSE parsing)
- multimodal-image.ts (vision)
- multimodal-video-audio.ts (video/audio understanding)
- function-calling-basic.ts (tool use)
- function-calling-parallel.ts (parallel execution)
- thinking-mode.ts (configure thinking budget)
- cloudflare-worker.ts (complete Worker example)
8 Reference Docs
- models-guide.md (2.5 Pro/Flash/Flash-Lite comparison with ACCURATE context windows)
- sdk-migration-guide.md (complete migration from deprecated SDK)
- function-calling-patterns.md (tool use best practices)
- multimodal-guide.md (images, video, audio, PDFs)
- thinking-mode-guide.md (when to use, budget configuration)
- generation-config.md (all parameters explained)
- streaming-patterns.md (SSE implementation for SDK + fetch)
- top-errors.md (15+ documented errors with solutions)
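As a companion to thinking-mode-guide.md, here is a hedged sketch of the request shape for configuring a thinking budget on 2.5 models. Per the current docs, `thinkingBudget` caps the model's internal reasoning tokens, `0` disables thinking on Flash, and `-1` requests dynamic budgeting; verify these values for your model before relying on them:

```typescript
// Sketch: a generateContent request with an explicit thinking budget.
// Field names (`config.thinkingConfig.thinkingBudget`) follow the
// @google/genai SDK shape as currently documented.

function buildThinkingRequest(prompt: string, thinkingBudget: number) {
  return {
    model: 'gemini-2.5-flash',
    contents: prompt,
    config: { thinkingConfig: { thinkingBudget } },
  };
}

// e.g. buildThinkingRequest('Plan a 3-step proof outline', 1024)
// or   buildThinkingRequest('Quick factual answer', 0) to skip thinking on Flash
```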
1 Script
- check-versions.sh (verify @google/genai version, warn if using deprecated SDK)
Installation
# From claude-skills repo root
./scripts/install-skill.sh google-gemini-api
# Verify installation
ls -la ~/.claude/skills/google-gemini-api
Quick Reference
Package Version
npm install @google/genai@1.34.0
⚠️ NOT:
npm install @google/generative-ai # DEPRECATED!
Environment Variables
export GEMINI_API_KEY="..."
Models Overview (2025-2026)
- Gemini 3 Flash: gemini-3-flash (🆕 best speed+quality for production)
- Gemini 2.5 Pro: gemini-2.5-pro (thinking, function calling, multimodal)
- Gemini 2.5 Flash: gemini-2.5-flash (proven price-performance)
- Gemini 2.5 Flash-Lite: gemini-2.5-flash-lite (fastest, no function calling)
Context Windows (ACCURATE)
- Gemini 2.5 models: 1,048,576 input / 65,536 output tokens
- NOT 2M tokens (only Gemini 1.5 Pro has 2M, which is an older model)
Official Documentation
- Gemini API Overview: https://ai.google.dev/gemini-api/docs
- @google/genai SDK: https://github.com/googleapis/js-genai
- Models Guide: https://ai.google.dev/gemini-api/docs/models
- Text Generation: https://ai.google.dev/gemini-api/docs/text-generation
- Function Calling: https://ai.google.dev/gemini-api/docs/function-calling
- Multimodal: https://ai.google.dev/gemini-api/docs/vision
- Streaming: https://ai.google.dev/gemini-api/docs/streaming
- Migration Guide: https://ai.google.dev/gemini-api/docs/migrate-to-genai
Production Validated: Templates tested with @google/genai@1.34.0
Last Updated: 2026-01-03
Maintainer: Jeremy Dawes | Jezweb