Install the SDK and Get an API Key
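The install itself is one command (note the package name, which matters — see the troubleshooting section):

```shell
pip install google-genai
```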
Get a free API key from Google AI Studio. Set it as an environment variable:
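On macOS or Linux, that looks like this (the key value is a placeholder — paste your own):

```shell
export GEMINI_API_KEY="your-api-key-here"
```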
The SDK reads GEMINI_API_KEY automatically. You can also pass it explicitly when creating the client. Python 3.9+ is required.
The package name is google-genai, not google-generativeai. The old google-generativeai package hit end-of-life in November 2025. If you’re following an older tutorial that uses import google.generativeai as genai, stop and switch to the new SDK.
Your First API Call
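A minimal first call looks like this (the prompt is just an example):

```python
from google import genai

# The client picks up GEMINI_API_KEY from the environment automatically.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain recursion in one sentence.",
)
print(response.text)
```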
That’s the entire pattern. The client.models.generate_content() method is the workhorse for everything – text, images, PDFs, function calling. The model parameter takes a model string. As of early 2026, your best options are:
- gemini-2.5-flash – Fast, cheap, great for most tasks. This is what you should default to.
- gemini-2.5-pro – More capable for complex reasoning, code analysis, and multi-document tasks. Costs more.
- gemini-2.5-flash-lite – Cheapest option, best for high-volume classification or summarization.
Google also has gemini-3-flash-preview and gemini-3-pro-preview available in preview. They’re more powerful, but their APIs may change before general availability.
Configuring Generation Parameters
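Parameters go in a types.GenerateContentConfig passed as config. A sketch, with illustrative values:

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="List the planets of the solar system.",
    config=types.GenerateContentConfig(
        temperature=0,          # deterministic output, good for extraction
        max_output_tokens=512,  # cap the response length
        top_p=0.95,
    ),
)
print(response.text)
```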
Set temperature=0 for deterministic output (useful for extraction tasks). The types module holds all the configuration classes you’ll need throughout the SDK.
Streaming Responses
For chatbots and real-time UIs, stream tokens as they arrive instead of waiting for the full response:
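A minimal streaming loop (the prompt is illustrative):

```python
from google import genai

client = genai.Client()

# Iterate over chunks as the model produces them.
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a short poem about the ocean.",
):
    print(chunk.text, end="", flush=True)
```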
The generate_content_stream() method returns an iterator of chunks. Each chunk has a .text attribute with the latest tokens. Streaming doesn’t change the total generation time, but the user sees output immediately.
Multi-Turn Conversations
The SDK has a built-in chat session that tracks conversation history for you:
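A two-turn sketch showing the session carrying context forward:

```python
from google import genai

client = genai.Client()
chat = client.chats.create(model="gemini-2.5-flash")

first = chat.send_message("My name is Ada. What's a good Python linter?")
print(first.text)

# The chat remembers earlier turns, so this follow-up has context.
second = chat.send_message("What was my name again?")
print(second.text)
```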
The chat object accumulates messages automatically. You don’t need to manually track and pass the conversation history on each call. This also works with streaming via chat.send_message_stream().
You can seed a chat with history if you’re restoring a previous session:
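A sketch of restoring a session by passing alternating user/model turns as history (the turns here are made up):

```python
from google import genai
from google.genai import types

client = genai.Client()

chat = client.chats.create(
    model="gemini-2.5-flash",
    history=[
        types.Content(role="user", parts=[types.Part(text="I'm planning a trip to Kyoto.")]),
        types.Content(role="model", parts=[types.Part(text="Great choice! When are you going?")]),
    ],
)
response = chat.send_message("Remind me where I said I was going.")
print(response.text)
```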
Multimodal Input: Images and PDFs
Gemini is natively multimodal. You can send images, PDFs, audio, and video alongside text in a single request.
Analyzing a Local Image
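One way to send a local image is as inline bytes ("photo.jpg" is a placeholder path; match the MIME type to your file):

```python
from google import genai
from google.genai import types

client = genai.Client()

with open("photo.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Describe what's in this image.",
    ],
)
print(response.text)
```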
Uploading and Analyzing a PDF
For larger files, upload them first with the Files API:
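A sketch of the upload-then-reference pattern ("report.pdf" is a placeholder; recent SDK versions take the path via the file keyword):

```python
from google import genai

client = genai.Client()

# Upload once, then reference the returned file handle in any request.
uploaded = client.files.upload(file="report.pdf")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[uploaded, "Summarize the key findings in this PDF."],
)
print(response.text)
```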
The Files API handles files up to 2GB. Uploaded files are available for 48 hours. This is the easiest way to process PDFs, long videos, or audio files that would be too large to send inline.
Function Calling
Function calling lets Gemini invoke your Python functions to fetch real-time data or perform actions. The SDK can handle the full loop automatically.
Automatic Function Calling
Pass Python functions directly as tools. The SDK extracts the schema from type hints and docstrings, calls your function when the model requests it, and sends the result back:
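A sketch of the automatic loop; get_stock_price is a stub here, but the type hints and docstring are what the SDK turns into the tool schema:

```python
from google import genai
from google.genai import types

def get_stock_price(ticker: str) -> float:
    """Return the current price for a stock ticker symbol."""
    # Illustrative stub; a real implementation would call a market-data API.
    return 123.45

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What is the current price of GOOG?",
    config=types.GenerateContentConfig(tools=[get_stock_price]),
)
print(response.text)
```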
The SDK automatically calls get_stock_price when Gemini requests it, then sends the result back to the model to produce a natural language answer. No manual loop needed.
Manual Function Calling
If you need control over function execution (for validation, logging, or async calls), disable automatic calling:
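A sketch of the manual loop, reusing the same illustrative stub: disable automatic calling, inspect the returned function call, run it yourself, and send the result back:

```python
from google import genai
from google.genai import types

def get_stock_price(ticker: str) -> float:
    """Return the current price for a stock ticker symbol."""
    return 123.45  # illustrative stub

client = genai.Client()
config = types.GenerateContentConfig(
    tools=[get_stock_price],
    automatic_function_calling=types.AutomaticFunctionCallingConfig(disable=True),
)

prompt = "What is the current price of GOOG?"
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=prompt,
    config=config,
)

# With automatic calling disabled, the model returns a function call
# instead of executing it.
if response.function_calls:
    call = response.function_calls[0]
    result = get_stock_price(**call.args)  # validate or log here first

    followup = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=[
            prompt,
            response.candidates[0].content,  # the model's function-call turn
            types.Part.from_function_response(name=call.name, response={"result": result}),
        ],
        config=config,
    )
    print(followup.text)
```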
This gives you a chance to validate arguments, add logging, or route the call to an async worker before sending the result back.
Structured Output with Pydantic
When you need JSON that matches a specific schema – for data extraction, API responses, or pipeline steps – use structured output with Pydantic models:
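A sketch with a hypothetical Invoice schema — pass the Pydantic model as response_schema and read the parsed instance back:

```python
from pydantic import BaseModel
from google import genai
from google.genai import types

class Invoice(BaseModel):
    vendor: str
    total: float
    line_items: list[str]

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Acme Corp billed $120.50 for hosting and $30 for support.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=Invoice,
    ),
)

invoice = response.parsed  # an Invoice instance, already validated
print(invoice.vendor, invoice.total)
```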
The model is constrained to output valid JSON matching your schema. No more parsing broken JSON or hoping the model follows your prompt instructions. This works with enums, nested models, lists, and optional fields.
For simpler cases, you can skip Pydantic and pass a dictionary schema directly. But Pydantic gives you automatic validation and type safety, so use it.
Safety Settings
Gemini has built-in content safety filters that can block requests or responses. The defaults are reasonable, but you may need to adjust them for applications that handle sensitive content legitimately:
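A sketch that relaxes one filter — here, the dangerous-content category for a hypothetical security-education app:

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how SQL injection attacks work.",
    config=types.GenerateContentConfig(
        safety_settings=[
            types.SafetySetting(
                category="HARM_CATEGORY_DANGEROUS_CONTENT",
                threshold="BLOCK_ONLY_HIGH",
            ),
        ],
    ),
)
print(response.text)
```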
The available categories are HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, and HARM_CATEGORY_CIVIC_INTEGRITY. Thresholds range from BLOCK_NONE to BLOCK_LOW_AND_ABOVE. Even with BLOCK_NONE, Google’s internal filters may still block certain content.
If your request gets blocked, the response’s text attribute will be empty and candidates[0].finish_reason will be SAFETY. Check response.candidates[0].safety_ratings to see which category triggered the block.
Common Errors and Fixes
ModuleNotFoundError: No module named 'google.genai'
You installed the wrong package. The correct command is pip install google-genai. If you previously had google-generativeai installed, uninstall it first to avoid conflicts:
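```shell
pip uninstall -y google-generativeai
pip install google-genai
```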
400 INVALID_ARGUMENT: API key not valid
Your API key is missing, expired, or malformed. Verify it’s set correctly:
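A quick check in bash (prints the key's length rather than the key itself):

```shell
# 0 means the variable isn't set in this shell.
echo ${#GEMINI_API_KEY}
```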
If you’re passing it explicitly, make sure there are no trailing spaces or newline characters. Generate a new key at Google AI Studio if needed.
429 RESOURCE_EXHAUSTED
You’ve hit a rate limit. The free tier has generous but real limits – 15 requests per minute for gemini-2.5-flash and 2 RPM for gemini-2.5-pro. Solutions:
- Add exponential backoff with time.sleep() between requests.
- Switch to a smaller model (gemini-2.5-flash-lite) for high-volume workloads.
- Upgrade to a paid plan for higher quotas.
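The backoff idea can be sketched as a small retry wrapper; with_backoff is a hypothetical helper, and in practice you’d catch the SDK’s specific rate-limit exception rather than matching the message:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff when a rate-limit error is raised."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            # Give up on non-rate-limit errors or once retries are exhausted.
            if "RESOURCE_EXHAUSTED" not in str(exc) or attempt == max_retries - 1:
                raise
            # Double the wait each attempt, plus jitter to avoid thundering herds.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Wrap any generate_content call in it, e.g. `with_backoff(lambda: client.models.generate_content(...))`.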
Empty response with finish_reason: SAFETY
Your prompt or the generated response triggered a safety filter. Check which category caused it:
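A sketch of inspecting the ratings (replace the placeholder prompt with the one that was blocked):

```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="...the prompt that was blocked...",
)

candidate = response.candidates[0]
if candidate.finish_reason == types.FinishReason.SAFETY:
    for rating in candidate.safety_ratings:
        print(rating.category, rating.probability)
```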
Adjust safety settings to be more permissive for that category, or rephrase your prompt to avoid triggering the filter.
TypeError: generate_content() got an unexpected keyword argument
You’re mixing up the old google-generativeai SDK patterns with the new google-genai SDK. The old SDK used model.generate_content() on a model object. The new SDK uses client.models.generate_content() on the client. Check your imports – you should have from google import genai, not import google.generativeai as genai.
Gemini vs. Claude vs. GPT
Pick the right model for the job:
- Gemini 2.5 Flash wins on price and speed. Google’s free tier is the most generous of the three providers. For prototyping, high-volume extraction, or budget-constrained projects, it’s the obvious choice. Native multimodal support (images, video, audio, PDFs in a single call) is smoother than competitors.
- Claude (via the Anthropic SDK) is strongest at long-context analysis, nuanced writing, and instruction following. If your app needs to process 100k+ token documents or produce carefully structured output, Claude is worth the premium.
- GPT-4o (via the OpenAI SDK) has the broadest ecosystem. More third-party tools, more fine-tuning options, more community resources. If you need function calling with complex tool chains or are already deep in the OpenAI ecosystem, stay there.
All three are excellent. For most new projects in early 2026, start with Gemini Flash for cost, evaluate Claude for quality-sensitive tasks, and use GPT-4o when ecosystem integration matters most.
Related Guides
- How to Use the Google Vertex AI Gemini API for Multimodal Tasks
- How to Use the Anthropic Python SDK for Claude
- How to Use the Anthropic Multi-Turn Conversation API with Tool Use
- How to Use the Mistral API for Code Generation and Chat
- How to Use the OpenAI Realtime API for Voice Applications
- How to Use the Anthropic Token Counting API for Cost Estimation
- How to Use the xAI Grok API for Chat and Function Calling
- How to Use the Anthropic Claude Files API for Large Document Processing
- How to Run Models with the Hugging Face Inference API
- How to Use the Anthropic PDF Processing API for Document Analysis