gemini-3-pro-api
Enables seamless integration with Google's Gemini 3 Pro API for advanced text generation, reasoning, and chat applications using Python or Node.js SDKs.
Installation
pip install google-generativeai
Skill Details
Gemini 3 Pro API/SDK integration for text generation, reasoning, and chat. Covers setup, authentication, thinking levels, streaming, and production deployment. Use when working with Gemini 3 Pro API, Python SDK, Node.js SDK, text generation, chat applications, or advanced reasoning tasks.
Overview
# Gemini 3 Pro API Integration
Comprehensive guide for integrating Google's Gemini 3 Pro API/SDK into your applications. Covers setup, authentication, text generation, advanced reasoning with dynamic thinking, chat applications, streaming responses, and production deployment patterns.
Overview
Gemini 3 Pro (gemini-3-pro-preview) is Google's most intelligent model designed for complex tasks requiring advanced reasoning and broad world knowledge. This skill provides complete workflows for API integration using Python or Node.js SDKs.
Key Capabilities
- Massive Context: 1M token input, 64k token output
- Dynamic Thinking: Adaptive reasoning with high/low modes
- Streaming: Real-time token delivery
- Chat: Multi-turn conversations with history
- Production-Ready: Error handling, retry logic, cost optimization
When to Use This Skill
- Setting up Gemini 3 Pro API access
- Building text generation applications
- Implementing chat applications with reasoning
- Configuring advanced thinking modes
- Deploying production Gemini applications
- Optimizing API usage and costs
---
Quick Start
Prerequisites
- API Key: Get from [Google AI Studio](https://aistudio.google.com/)
- Python 3.9+ or Node.js 18+
Python Quick Start
```python
# Install the SDK first (shell): pip install google-generativeai
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro-preview")
response = model.generate_content("Explain quantum computing")
print(response.text)
```
Node.js Quick Start
```typescript
// Install the SDK first (shell): npm install @google/generative-ai
// Basic usage
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro-preview" });
const result = await model.generateContent("Explain quantum computing");
console.log(result.response.text());
```
---
Core Workflows
Workflow 1: Quick Start Setup
Goal: Get from zero to first successful API call in < 5 minutes.
Steps:
- Get API Key
  - Visit [Google AI Studio](https://aistudio.google.com/)
  - Create or select project
  - Generate API key
  - Copy key securely
- Install SDK
```bash
# Python
pip install google-generativeai
# Node.js
npm install @google/generative-ai
```
- Configure Authentication
```python
# Python - using environment variable (recommended)
import os
import google.generativeai as genai
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
```
```typescript
// Node.js - using environment variable (recommended)
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
```
- Make First API Call
```python
# Python
model = genai.GenerativeModel("gemini-3-pro-preview")
response = model.generate_content("Write a haiku about coding")
print(response.text)
```
- Verify Success (a quick check is sketched below)
  - Check response received
  - Verify text output
  - Note token usage
  - Confirm API key working
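A minimal verification sketch, reusing the `response` object from the step above to confirm these checks in code:
```python
# Confirm the call from the previous step succeeded
assert response.text, "Empty response - check your API key and model name"

# usage_metadata reports what the call cost in tokens
usage = response.usage_metadata
print(f"Input tokens: {usage.prompt_token_count}, "
      f"output tokens: {usage.candidates_token_count}, "
      f"total: {usage.total_token_count}")
```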
Expected Outcome: Working API integration in under 5 minutes.
---
Workflow 2: Chat Application Development
Goal: Build a production-ready chat application with conversation history and streaming.
Steps:
- Initialize Chat Model
```python
# Python
model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    generation_config={
        "thinking_level": "high",  # Dynamic reasoning
        "temperature": 1.0,        # Keep at 1.0 for best results
        "max_output_tokens": 8192
    }
)
```
- Start Chat Session
```python
chat = model.start_chat(history=[])
```
- Send Message with Streaming
```python
response = chat.send_message(
    "Explain how neural networks learn",
    stream=True
)

# Stream tokens in real-time
for chunk in response:
    print(chunk.text, end="", flush=True)
```
- Manage Conversation History
```python
# History is automatically maintained
# Access it anytime
print(f"Conversation turns: {len(chat.history)}")
# Continue conversation
response = chat.send_message("Can you give an example?")
```
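To resume a session later, the history can be serialized and passed back to `start_chat`. A minimal sketch, assuming text-only history entries that expose `role` and `parts` (with `part.text`) as in the `google-generativeai` SDK, and that `start_chat` accepts role/parts dictionaries; `save_history` and `load_chat` are illustrative helpers:
```python
import json

def save_history(chat, path="chat_history.json"):
    # Convert SDK history entries to plain role/parts dicts (text parts only)
    serializable = [
        {"role": turn.role, "parts": [part.text for part in turn.parts]}
        for turn in chat.history
    ]
    with open(path, "w") as f:
        json.dump(serializable, f)

def load_chat(model, path="chat_history.json"):
    with open(path) as f:
        history = json.load(f)
    # Resume the conversation with the saved turns
    return model.start_chat(history=history)
```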
- Handle Thought Signatures
  - SDKs handle these automatically in standard chat flows
  - No manual intervention needed for basic use
  - See references/thought-signatures.md for advanced cases
- Implement Error Handling
```python
from google.api_core import retry, exceptions

@retry.Retry(predicate=retry.if_exception_type(
    exceptions.ResourceExhausted,
    exceptions.ServiceUnavailable
))
def send_with_retry(chat, message):
    return chat.send_message(message)

try:
    response = send_with_retry(chat, user_input)
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
```
Expected Outcome: Production-ready chat application with streaming, history, and error handling.
---
Workflow 3: Production Deployment
Goal: Deploy Gemini 3 Pro integration with monitoring, cost control, and reliability.
Steps:
- Setup Authentication (Production)
```python
# Use environment variables (never hardcode keys)
import os
# Option 1: Environment variable
api_key = os.getenv("GEMINI_API_KEY")
# Option 2: Secrets manager (recommended for production)
# Use Google Secret Manager, AWS Secrets Manager, etc.
```
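For Option 2, a sketch using Google Secret Manager via the `google-cloud-secret-manager` client; the project ID and the secret name `gemini-api-key` are illustrative:
```python
# pip install google-cloud-secret-manager
from google.cloud import secretmanager

def get_api_key(project_id: str, secret_id: str = "gemini-api-key") -> str:
    # Fetch the latest version of the secret holding the API key
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

genai.configure(api_key=get_api_key("my-gcp-project"))
```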
- Configure Production Settings
```python
model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    generation_config={
        "thinking_level": "high",  # or "low" for simple tasks
        "temperature": 1.0,        # CRITICAL: Keep at 1.0
        "max_output_tokens": 4096,
        "top_p": 0.95,
        "top_k": 40
    },
    safety_settings={
        # Configure content filtering as needed
    }
)
```
- Implement Comprehensive Error Handling
```python
from google.api_core import exceptions, retry
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def generate_with_fallback(prompt):
    @retry.Retry(
        predicate=retry.if_exception_type(
            exceptions.ResourceExhausted,
            exceptions.ServiceUnavailable,
            exceptions.DeadlineExceeded
        ),
        initial=1.0,
        maximum=10.0,
        multiplier=2.0,
        deadline=60.0
    )
    def _generate():
        return model.generate_content(prompt)

    try:
        return _generate()
    except exceptions.InvalidArgument as e:
        logger.error(f"Invalid argument: {e}")
        raise
    except exceptions.PermissionDenied as e:
        logger.error(f"Permission denied: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        # Fall back to a simpler model or cached response
        return None
```
- Monitor Usage and Costs
```python
def log_usage(response):
    usage = response.usage_metadata
    logger.info(f"Tokens - Input: {usage.prompt_token_count}, "
                f"Output: {usage.candidates_token_count}, "
                f"Total: {usage.total_token_count}")

    # Estimate cost (for prompts ≤ 200k tokens)
    input_cost = (usage.prompt_token_count / 1_000_000) * 2.00
    output_cost = (usage.candidates_token_count / 1_000_000) * 12.00
    total_cost = input_cost + output_cost
    logger.info(f"Estimated cost: ${total_cost:.6f}")

# Log after every call
response = model.generate_content(prompt)
log_usage(response)
```
- Implement Rate Limiting
```python
import time
from collections import deque

class RateLimiter:
    def __init__(self, max_requests_per_minute=60):
        self.max_rpm = max_requests_per_minute
        self.requests = deque()

    def wait_if_needed(self):
        now = time.time()
        # Remove requests older than 1 minute
        while self.requests and self.requests[0] < now - 60:
            self.requests.popleft()
        # If at the limit, sleep until the oldest request ages out
        if len(self.requests) >= self.max_rpm:
            sleep_time = 60 - (now - self.requests[0])
            if sleep_time > 0:
                time.sleep(sleep_time)
        self.requests.append(now)

limiter = RateLimiter(max_requests_per_minute=60)

def generate_with_rate_limit(prompt):
    limiter.wait_if_needed()
    return model.generate_content(prompt)
```
- Setup Logging and Monitoring
```python
import logging
from datetime import datetime

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('gemini_api.log'),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

def monitored_generate(prompt):
    start_time = datetime.now()
    try:
        response = model.generate_content(prompt)
        duration = (datetime.now() - start_time).total_seconds()
        logger.info(f"Success - Duration: {duration}s, "
                    f"Tokens: {response.usage_metadata.total_token_count}")
        return response
    except Exception as e:
        duration = (datetime.now() - start_time).total_seconds()
        logger.error(f"Failed - Duration: {duration}s, Error: {e}")
        raise
```
Expected Outcome: Production-ready deployment with monitoring, cost control, error handling, and rate limiting.
---
Thinking Levels
Dynamic Thinking System
Gemini 3 Pro introduces thinking_level to control reasoning depth:
thinking_level: "high" (default)
- Maximum reasoning depth
- Best quality for complex tasks
- Slower first-token response
- Higher cost
- Use for: Complex reasoning, coding, analysis, research
thinking_level: "low"
- Minimal reasoning overhead
- Faster response
- Lower cost
- Simpler output
- Use for: Simple questions, factual answers, quick queries
Configuration
```python
# Python
model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    generation_config={
        "thinking_level": "high"  # or "low"
    }
)
```
```typescript
// Node.js
const model = genAI.getGenerativeModel({
  model: "gemini-3-pro-preview",
  generationConfig: {
    thinking_level: "high" // or "low"
  }
});
```
Critical Notes
⚠️ Temperature MUST stay at 1.0 - Changing temperature can cause looping or degraded performance on complex reasoning tasks.
⚠️ Do not combine thinking_level with the legacy thinking_budget parameter.
See references/thinking-levels.md for detailed guide.
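To quantify the speed/depth trade-off on your own prompts, a small timing harness (a sketch assuming the thinking_level configuration shown above; absolute timings will vary):
```python
import time

def time_thinking_level(prompt, level):
    model = genai.GenerativeModel(
        "gemini-3-pro-preview",
        generation_config={"thinking_level": level}
    )
    start = time.monotonic()
    response = model.generate_content(prompt)
    elapsed = time.monotonic() - start
    print(f"{level}: {elapsed:.1f}s, "
          f"{response.usage_metadata.total_token_count} total tokens")

# Compare both levels on a representative prompt
for level in ("low", "high"):
    time_thinking_level("Summarize the trade-offs of microservices", level)
```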
---
Streaming Responses
Python Streaming
```python
response = model.generate_content(
    "Write a long article about AI",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)
```
Node.js Streaming
```typescript
const result = await model.generateContentStream("Write a long article about AI");
for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
```
Benefits
- Lower perceived latency
- Real-time user feedback
- Better UX for long responses
- Can process tokens as they arrive
See references/streaming.md for advanced patterns.
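A common pattern is to display chunks as they arrive while also accumulating the full text for later use. A minimal sketch building on the Python streaming call above (`stream_and_collect` is an illustrative helper):
```python
def stream_and_collect(prompt):
    response = model.generate_content(prompt, stream=True)
    chunks = []
    for chunk in response:
        print(chunk.text, end="", flush=True)  # show tokens in real time
        chunks.append(chunk.text)
    print()
    return "".join(chunks)  # full text for logging or post-processing

article = stream_and_collect("Write a long article about AI")
```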
---
Cost Optimization
Pricing (Gemini 3 Pro)
| Context Size | Input | Output |
|-------------|-------|--------|
| ≤ 200k tokens | $2/1M | $12/1M |
| > 200k tokens | $4/1M | $18/1M |
Optimization Strategies
- Keep prompts under 200k tokens (50% cheaper)
- Use thinking_level: "low" for simple tasks (faster, lower cost)
- Implement context caching for reusable contexts (see the gemini-3-advanced skill)
- Monitor token usage and set budgets (an estimator is sketched below)
- Use Gemini 1.5 Flash for simple tasks (20x cheaper)
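The tiered pricing above can be encoded in a small estimator (rates hardcoded from the table; verify against current pricing before relying on it):
```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    # Tiered rates from the pricing table (USD per 1M tokens)
    if input_tokens <= 200_000:
        input_rate, output_rate = 2.00, 12.00
    else:
        input_rate, output_rate = 4.00, 18.00
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Usage, assuming `response` from an earlier call
usage = response.usage_metadata
print(f"~${estimate_cost(usage.prompt_token_count, usage.candidates_token_count):.4f}")
```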
See references/best-practices.md for comprehensive cost optimization.
---
Model Selection
Gemini 3 Pro vs Other Models
| Model | Context | Output | Input Price | Best For |
|-------|---------|--------|-------------|----------|
| gemini-3-pro-preview | 1M | 64k | $2-4/1M | Complex reasoning, coding |
| gemini-1.5-pro | 1M | 8k | $7-14/1M | General use, multimodal |
| gemini-1.5-flash | 1M | 8k | $0.35-0.70/1M | Simple tasks, cost-sensitive |
When to Use Gemini 3 Pro
✅ Complex reasoning tasks
✅ Advanced coding problems
✅ Long-context analysis (up to 1M tokens)
✅ Large output requirements (up to 64k tokens)
✅ Tasks requiring dynamic thinking
When to Use Alternatives
- Gemini 1.5 Flash: Simple tasks, cost-sensitive applications
- Gemini 1.5 Pro: Multimodal tasks, general use
- Gemini 2.5 models: Experimental features, specific capabilities
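One way to act on this comparison is a small router that picks a model and thinking level per request. A sketch with illustrative thresholds, not official guidance:
```python
def pick_model(prompt: str, complex_task: bool = False):
    # Deep reasoning: Gemini 3 Pro with high thinking
    if complex_task:
        return genai.GenerativeModel(
            "gemini-3-pro-preview",
            generation_config={"thinking_level": "high"}
        )
    # Short, simple work: route to the cheaper Flash model
    if len(prompt) < 2_000:  # rough proxy for task size
        return genai.GenerativeModel("gemini-1.5-flash")
    # Everything else: Gemini 3 Pro with low thinking
    return genai.GenerativeModel(
        "gemini-3-pro-preview",
        generation_config={"thinking_level": "low"}
    )
```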
---
Error Handling
Common Errors
| Error | Cause | Solution |
|-------|-------|----------|
| ResourceExhausted | Rate limit exceeded | Implement retry with backoff |
| InvalidArgument | Invalid parameters | Validate input, check docs |
| PermissionDenied | Invalid API key | Check authentication |
| DeadlineExceeded | Request timeout | Reduce context, retry |
Production Error Handling
```python
from google.api_core import exceptions, retry

@retry.Retry(
    predicate=retry.if_exception_type(
        exceptions.ResourceExhausted,
        exceptions.ServiceUnavailable
    ),
    initial=1.0,
    maximum=60.0,
    multiplier=2.0
)
def safe_generate(prompt):
    try:
        return model.generate_content(prompt)
    except exceptions.InvalidArgument as e:
        logger.error(f"Invalid argument: {e}")
        raise
    except exceptions.PermissionDenied as e:
        logger.error(f"Permission denied - check API key: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise
```
See references/error-handling.md for comprehensive patterns.
---
References
Setup & Configuration
- [Setup Guide](references/setup-guide.md) - Installation, authentication, configuration
- [Best Practices](references/best-practices.md) - Optimization, cost control, tips
Features
- [Text Generation](references/text-generation.md) - Detailed text generation patterns
- [Chat Patterns](references/chat-patterns.md) - Chat conversation management
- [Thinking Levels](references/thinking-levels.md) - Dynamic thinking system guide
- [Streaming](references/streaming.md) - Streaming response patterns
Production
- [Error Handling](references/error-handling.md) - Error handling and retry strategies
Official Resources
- [Gemini 3 Documentation](https://ai.google.dev/gemini-api/docs/gemini-3)
- [Python SDK Docs](https://googleapis.github.io/python-genai/)
- [Node.js SDK Docs](https://github.com/google/generative-ai-js)
- [API Reference](https://ai.google.dev/gemini-api/docs)
- [Pricing](https://ai.google.dev/pricing)
---
Next Steps
After Basic Setup
- Explore chat applications - Build conversational interfaces
- Add multimodal capabilities - Use the gemini-3-multimodal skill
- Add image generation - Use the gemini-3-image-generation skill
- Add advanced features - Use the gemini-3-advanced skill (caching, tools, batch)
Common Integration Patterns
- Simple Chatbot: This skill only
- Multimodal Assistant: This skill + gemini-3-multimodal
- Creative Bot: This skill + gemini-3-image-generation
- Production App: All 4 Gemini 3 skills
---
Troubleshooting
Issue: API key not working
Solution: Verify API key in Google AI Studio, check environment variable
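A quick way to run that check from code (`check_api_key` is an illustrative helper; it simply attempts a minimal call and reports the failure):
```python
import os
import google.generativeai as genai

def check_api_key():
    key = os.getenv("GEMINI_API_KEY")
    if not key:
        print("GEMINI_API_KEY is not set")
        return False
    genai.configure(api_key=key)
    try:
        # Minimal call to confirm the key is accepted
        genai.GenerativeModel("gemini-3-pro-preview").generate_content("ping")
        print("API key OK")
        return True
    except Exception as e:
        print(f"API call failed: {e}")
        return False
```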
Issue: Rate limit errors
Solution: Implement rate limiting, upgrade to paid tier, reduce request frequency
Issue: Slow responses
Solution: Use thinking_level: "low" for simple tasks, enable streaming, reduce context size
Issue: High costs
Solution: Keep prompts under 200k tokens, use appropriate thinking level, consider Gemini 1.5 Flash for simple tasks
Issue: Temperature warnings
Solution: Keep temperature at 1.0 (default) - do not modify for complex reasoning tasks
---
Summary
This skill provides everything needed to integrate Gemini 3 Pro API into your applications:
✅ Quick setup (< 5 minutes)
✅ Production-ready chat applications
✅ Dynamic thinking configuration
✅ Streaming responses
✅ Error handling and retry logic
✅ Cost optimization strategies
✅ Monitoring and logging patterns
For multimodal, image generation, and advanced features, see the companion skills.
Ready to build? Start with Workflow 1: Quick Start Setup above!