---
name: gemini-3-pro-api
description: Gemini 3 Pro API/SDK integration for text generation, reasoning, and chat. Covers setup, authentication, thinking levels, streaming, and production deployment. Use when working with Gemini 3 Pro API, Python SDK, Node.js SDK, text generation, chat applications, or advanced reasoning tasks.
---

# Gemini 3 Pro API Integration

Comprehensive guide for integrating Google's Gemini 3 Pro API/SDK into your applications. Covers setup, authentication, text generation, advanced reasoning with dynamic thinking, chat applications, streaming responses, and production deployment patterns.

## Overview

Gemini 3 Pro (`gemini-3-pro-preview`) is Google's most intelligent model, designed for complex tasks that require advanced reasoning and broad world knowledge. This skill provides complete workflows for API integration using the Python or Node.js SDKs.

### Key Capabilities

- **Massive Context**: 1M-token input, 64k-token output
- **Dynamic Thinking**: Adaptive reasoning with high/low modes
- **Streaming**: Real-time token delivery
- **Chat**: Multi-turn conversations with history
- **Production-Ready**: Error handling, retry logic, cost optimization

### When to Use This Skill

- Setting up Gemini 3 Pro API access
- Building text generation applications
- Implementing chat applications with reasoning
- Configuring advanced thinking modes
- Deploying production Gemini applications
- Optimizing API usage and costs

---

## Quick Start

### Prerequisites

- **API key**: Get one from [Google AI Studio](https://aistudio.google.com/)
- **Runtime**: Python 3.9+ or Node.js 18+

### Python Quick Start

Install the SDK:

```bash
pip install google-generativeai
```

Basic usage:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro-preview")
response = model.generate_content("Explain quantum computing")
print(response.text)
```

### Node.js Quick Start

Install the SDK:

```bash
npm install @google/generative-ai
```

Basic usage:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro-preview" });

const result = await model.generateContent("Explain quantum computing");
console.log(result.response.text());
```

---

## Core Workflows

### Workflow 1: Quick Start Setup

**Goal**: Get from zero to a first successful API call in under 5 minutes.

Steps:

1. **Get an API key**
   - Visit [Google AI Studio](https://aistudio.google.com/)
   - Create or select a project
   - Generate an API key
   - Copy the key securely

2. **Install the SDK**

   ```bash
   # Python
   pip install google-generativeai

   # Node.js
   npm install @google/generative-ai
   ```

3. **Configure authentication**

   ```python
   # Python - using an environment variable (recommended)
   import os
   import google.generativeai as genai

   genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
   ```

   ```typescript
   // Node.js - using an environment variable (recommended)
   const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
   ```

4. **Make the first API call**

   ```python
   # Python
   model = genai.GenerativeModel("gemini-3-pro-preview")
   response = model.generate_content("Write a haiku about coding")
   print(response.text)
   ```

5. **Verify success** (see the sanity-check sketch below)
   - Check that a response is received
   - Verify the text output
   - Note the token usage
   - Confirm the API key is working
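
   A minimal sanity check, assuming the `model` from step 4 (the `usage_metadata` field names follow the Python SDK and may vary by version):

   ```python
   # Sanity-check sketch: confirm text output and note token usage.
   response = model.generate_content("Say hello in one sentence")
   assert response.text, "Empty response - check the model name and API key"
   print(response.text)

   usage = response.usage_metadata
   print(f"Input tokens: {usage.prompt_token_count}, "
         f"output tokens: {usage.candidates_token_count}")
   ```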

**Expected Outcome**: Working API integration in under 5 minutes.

---

### Workflow 2: Chat Application Development

**Goal**: Build a production-ready chat application with conversation history and streaming.

Steps:

1. **Initialize the chat model**

   ```python
   # Python
   model = genai.GenerativeModel(
       "gemini-3-pro-preview",
       generation_config={
           "thinking_level": "high",   # dynamic reasoning
           "temperature": 1.0,         # keep at 1.0 for best results
           "max_output_tokens": 8192,
       },
   )
   ```

2. **Start a chat session**

   ```python
   chat = model.start_chat(history=[])
   ```

3. **Send a message with streaming**

   ```python
   response = chat.send_message(
       "Explain how neural networks learn",
       stream=True,
   )

   # Stream tokens in real time
   for chunk in response:
       print(chunk.text, end="", flush=True)
   ```

4. **Manage conversation history**

   ```python
   # History is maintained automatically; access it at any time
   print(f"Conversation turns: {len(chat.history)}")

   # Continue the conversation
   response = chat.send_message("Can you give an example?")
   ```
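
   To resume a stored conversation, history can also be seeded explicitly. A sketch, assuming the role/parts chat format of the `google-generativeai` SDK:

   ```python
   # Resuming from stored history (a sketch; the role/parts dicts
   # follow the legacy google-generativeai chat format).
   saved_history = [
       {"role": "user", "parts": ["What is a neural network?"]},
       {"role": "model", "parts": ["A neural network is a layered model..."]},
   ]
   chat = model.start_chat(history=saved_history)
   response = chat.send_message("Summarize that in one sentence")
   print(response.text)
   ```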

5. **Handle thought signatures**
   - The SDKs handle these automatically in standard chat flows
   - No manual intervention is needed for basic use
   - See references/thought-signatures.md for advanced cases

6. **Implement error handling**

   ```python
   from google.api_core import retry, exceptions

   @retry.Retry(predicate=retry.if_exception_type(
       exceptions.ResourceExhausted,
       exceptions.ServiceUnavailable,
   ))
   def send_with_retry(chat, message):
       return chat.send_message(message)

   try:
       response = send_with_retry(chat, user_input)
   except exceptions.GoogleAPIError as e:
       print(f"API error: {e}")
   ```

**Expected Outcome**: A production-ready chat application with streaming, history, and error handling.
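
Tying the steps together, a minimal command-line loop might look like this (a sketch, not production code; it assumes the `model` configured in step 1):

```python
# Minimal CLI chat loop combining the steps above (a sketch;
# assumes `model` from step 1 of this workflow).
chat = model.start_chat(history=[])

while True:
    user_input = input("\nYou: ").strip()
    if user_input.lower() in {"quit", "exit"}:
        break
    print("Model: ", end="", flush=True)
    # Stream the reply; history accumulates on `chat` automatically.
    for chunk in chat.send_message(user_input, stream=True):
        print(chunk.text, end="", flush=True)
    print()
```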

---

### Workflow 3: Production Deployment

**Goal**: Deploy a Gemini 3 Pro integration with monitoring, cost control, and reliability.

Steps:

1. **Set up authentication (production)**

   ```python
   # Use environment variables - never hardcode keys
   import os

   # Option 1: environment variable
   api_key = os.getenv("GEMINI_API_KEY")

   # Option 2: a secrets manager (recommended for production),
   # e.g. Google Secret Manager or AWS Secrets Manager (sketch below)
   ```
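
   For option 2, a sketch using Google Secret Manager (assumes `pip install google-cloud-secret-manager`; the project and secret names are placeholders, not part of this skill):

   ```python
   # Fetch the API key from Google Secret Manager (a sketch;
   # substitute your own project and secret IDs).
   from google.cloud import secretmanager
   import google.generativeai as genai

   def get_gemini_api_key(project_id: str, secret_id: str = "gemini-api-key") -> str:
       client = secretmanager.SecretManagerServiceClient()
       name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
       response = client.access_secret_version(request={"name": name})
       return response.payload.data.decode("UTF-8")

   genai.configure(api_key=get_gemini_api_key("my-project"))
   ```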

2. **Configure production settings**

   ```python
   model = genai.GenerativeModel(
       "gemini-3-pro-preview",
       generation_config={
           "thinking_level": "high",   # or "low" for simple tasks
           "temperature": 1.0,         # CRITICAL: keep at 1.0
           "max_output_tokens": 4096,
           "top_p": 0.95,
           "top_k": 40,
       },
       safety_settings={
           # Configure content filtering as needed
       },
   )
   ```

3. **Implement comprehensive error handling**

   ```python
   import logging

   from google.api_core import exceptions, retry

   logging.basicConfig(level=logging.INFO)
   logger = logging.getLogger(__name__)

   def generate_with_fallback(prompt):
       @retry.Retry(
           predicate=retry.if_exception_type(
               exceptions.ResourceExhausted,
               exceptions.ServiceUnavailable,
               exceptions.DeadlineExceeded,
           ),
           initial=1.0,
           maximum=10.0,
           multiplier=2.0,
           deadline=60.0,
       )
       def _generate():
           return model.generate_content(prompt)

       try:
           return _generate()
       except exceptions.InvalidArgument as e:
           logger.error(f"Invalid argument: {e}")
           raise
       except exceptions.PermissionDenied as e:
           logger.error(f"Permission denied: {e}")
           raise
       except Exception as e:
           logger.error(f"Unexpected error: {e}")
           # Fall back to a simpler model or a cached response
           return None
   ```

4. **Monitor usage and costs**

   ```python
   def log_usage(response):
       usage = response.usage_metadata
       logger.info(f"Tokens - input: {usage.prompt_token_count}, "
                   f"output: {usage.candidates_token_count}, "
                   f"total: {usage.total_token_count}")

       # Estimate cost (rates for prompts ≀ 200k tokens)
       input_cost = (usage.prompt_token_count / 1_000_000) * 2.00
       output_cost = (usage.candidates_token_count / 1_000_000) * 12.00
       total_cost = input_cost + output_cost
       logger.info(f"Estimated cost: ${total_cost:.6f}")

   response = model.generate_content(prompt)
   log_usage(response)
   ```

5. **Implement rate limiting**

   ```python
   import time
   from collections import deque

   class RateLimiter:
       def __init__(self, max_requests_per_minute=60):
           self.max_rpm = max_requests_per_minute
           self.requests = deque()

       def wait_if_needed(self):
           now = time.time()
           # Drop requests older than one minute
           while self.requests and self.requests[0] < now - 60:
               self.requests.popleft()
           # Sleep if at the limit
           if len(self.requests) >= self.max_rpm:
               sleep_time = 60 - (now - self.requests[0])
               if sleep_time > 0:
                   time.sleep(sleep_time)
           self.requests.append(now)

   limiter = RateLimiter(max_requests_per_minute=60)

   def generate_with_rate_limit(prompt):
       limiter.wait_if_needed()
       return model.generate_content(prompt)
   ```

6. **Set up logging and monitoring**

   ```python
   import logging
   from datetime import datetime

   # Configure logging
   logging.basicConfig(
       level=logging.INFO,
       format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
       handlers=[
           logging.FileHandler('gemini_api.log'),
           logging.StreamHandler(),
       ],
   )
   logger = logging.getLogger(__name__)

   def monitored_generate(prompt):
       start_time = datetime.now()
       try:
           response = model.generate_content(prompt)
           duration = (datetime.now() - start_time).total_seconds()
           logger.info(f"Success - duration: {duration}s, "
                       f"tokens: {response.usage_metadata.total_token_count}")
           return response
       except Exception as e:
           duration = (datetime.now() - start_time).total_seconds()
           logger.error(f"Failed - duration: {duration}s, error: {e}")
           raise
   ```

**Expected Outcome**: A production-ready deployment with monitoring, cost control, error handling, and rate limiting.

---

## Thinking Levels

### Dynamic Thinking System

Gemini 3 Pro introduces `thinking_level` to control reasoning depth:

`thinking_level: "high"` (default)

- Maximum reasoning depth
- Best quality for complex tasks
- Slower first-token response
- Higher cost
- Use for: complex reasoning, coding, analysis, research

`thinking_level: "low"`

- Minimal reasoning overhead
- Faster responses
- Lower cost
- Simpler output
- Use for: simple questions, factual answers, quick queries

### Configuration

```python
# Python
model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    generation_config={
        "thinking_level": "high"  # or "low"
    },
)
```

```typescript
// Node.js
const model = genAI.getGenerativeModel({
  model: "gemini-3-pro-preview",
  generationConfig: {
    thinking_level: "high", // or "low"
  },
});
```

### Critical Notes

⚠️ **Temperature must stay at 1.0.** Changing the temperature can cause looping or degraded performance on complex reasoning tasks.

⚠️ **`thinking_level` cannot be combined** with the legacy `thinking_budget` parameter.

See references/thinking-levels.md for a detailed guide.

---

## Streaming Responses

### Python Streaming

```python
response = model.generate_content(
    "Write a long article about AI",
    stream=True,
)

for chunk in response:
    print(chunk.text, end="", flush=True)
```

### Node.js Streaming

```typescript
const result = await model.generateContentStream("Write a long article about AI");

for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
```

### Benefits

- Lower perceived latency
- Real-time user feedback
- Better UX for long responses
- Tokens can be processed as they arrive (see the sketch below)
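
For the last point, one pattern is to hand each chunk to a callback while accumulating the full reply. A sketch: `on_token` is a hypothetical callback, and `model` is the configured `GenerativeModel` from above:

```python
# Forward chunks to a callback while accumulating the full reply
# (a sketch; `on_token` stands in for a websocket push or UI update,
# and `model` is the GenerativeModel configured earlier).
def stream_to_callback(prompt, on_token):
    parts = []
    for chunk in model.generate_content(prompt, stream=True):
        on_token(chunk.text)     # deliver the token in real time
        parts.append(chunk.text)
    return "".join(parts)        # full text, e.g. for logging or caching

full_text = stream_to_callback(
    "Write a long article about AI",
    lambda t: print(t, end="", flush=True),
)
```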

See references/streaming.md for advanced patterns.

---

## Cost Optimization

### Pricing (Gemini 3 Pro)

| Context Size | Input | Output |
|---------------|--------|---------|
| ≀ 200k tokens | $2/1M | $12/1M |
| > 200k tokens | $4/1M | $18/1M |

### Optimization Strategies

1. Keep prompts under 200k tokens (50% cheaper)
2. Use `thinking_level: "low"` for simple tasks (faster, lower cost)
3. Implement context caching for reusable contexts (see the gemini-3-advanced skill)
4. Monitor token usage and set budgets
5. Use Gemini 1.5 Flash for simple tasks (20x cheaper); a simple routing sketch follows this list
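
As a sketch of strategies 2 and 5, a router can pick the model and thinking level from a caller-supplied complexity flag (the `is_complex` flag and the model choices below are illustrative assumptions, not SDK features):

```python
# Route requests by task complexity (a sketch; assumes genai.configure()
# has already been called with a valid API key).
import google.generativeai as genai

def make_model(is_complex: bool) -> genai.GenerativeModel:
    if is_complex:
        # Hard tasks: Gemini 3 Pro with full reasoning depth
        return genai.GenerativeModel(
            "gemini-3-pro-preview",
            generation_config={"thinking_level": "high", "temperature": 1.0},
        )
    # Simple tasks: a much cheaper model, no deep reasoning needed
    return genai.GenerativeModel("gemini-1.5-flash")

response = make_model(is_complex=False).generate_content("What is 2 + 2?")
print(response.text)
```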

See references/best-practices.md for comprehensive cost optimization.

---

## Model Selection

### Gemini 3 Pro vs Other Models

| Model | Context | Output | Input Price | Best For |
|-------|---------|--------|-------------|----------|
| gemini-3-pro-preview | 1M | 64k | $2-4/1M | Complex reasoning, coding |
| gemini-1.5-pro | 1M | 8k | $7-14/1M | General use, multimodal |
| gemini-1.5-flash | 1M | 8k | $0.35-0.70/1M | Simple tasks, cost-sensitive |

### When to Use Gemini 3 Pro

βœ… Complex reasoning tasks

βœ… Advanced coding problems

βœ… Long-context analysis (up to 1M tokens)

βœ… Large output requirements (up to 64k tokens)

βœ… Tasks requiring dynamic thinking

### When to Use Alternatives

- **Gemini 1.5 Flash**: Simple tasks, cost-sensitive applications
- **Gemini 1.5 Pro**: Multimodal tasks, general use
- **Gemini 2.5 models**: Experimental features, specific capabilities

---

## Error Handling

### Common Errors

| Error | Cause | Solution |
|-------|-------|----------|
| ResourceExhausted | Rate limit exceeded | Implement retry with backoff |
| InvalidArgument | Invalid parameters | Validate input, check docs |
| PermissionDenied | Invalid API key | Check authentication |
| DeadlineExceeded | Request timeout | Reduce context, retry |

### Production Error Handling

```python
from google.api_core import exceptions, retry

@retry.Retry(
    predicate=retry.if_exception_type(
        exceptions.ResourceExhausted,
        exceptions.ServiceUnavailable,
    ),
    initial=1.0,
    maximum=60.0,
    multiplier=2.0,
)
def safe_generate(prompt):
    try:
        return model.generate_content(prompt)
    except exceptions.InvalidArgument as e:
        logger.error(f"Invalid argument: {e}")
        raise
    except exceptions.PermissionDenied as e:
        logger.error(f"Permission denied - check API key: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise
```

See references/error-handling.md for comprehensive patterns.

---

## References

### Setup & Configuration

- [Setup Guide](references/setup-guide.md) - Installation, authentication, configuration
- [Best Practices](references/best-practices.md) - Optimization, cost control, tips

### Features

- [Text Generation](references/text-generation.md) - Detailed text generation patterns
- [Chat Patterns](references/chat-patterns.md) - Chat conversation management
- [Thinking Levels](references/thinking-levels.md) - Dynamic thinking system guide
- [Streaming](references/streaming.md) - Streaming response patterns

### Production

- [Error Handling](references/error-handling.md) - Error handling and retry strategies

### Official Resources

- [Gemini 3 Documentation](https://ai.google.dev/gemini-api/docs/gemini-3)
- [Python SDK Docs](https://googleapis.github.io/python-genai/)
- [Node.js SDK Docs](https://github.com/google/generative-ai-js)
- [API Reference](https://ai.google.dev/gemini-api/docs)
- [Pricing](https://ai.google.dev/pricing)

---

## Next Steps

### After Basic Setup

1. **Explore chat applications** - Build conversational interfaces
2. **Add multimodal capabilities** - Use the gemini-3-multimodal skill
3. **Add image generation** - Use the gemini-3-image-generation skill
4. **Add advanced features** - Use the gemini-3-advanced skill (caching, tools, batch)

### Common Integration Patterns

- **Simple chatbot**: This skill only
- **Multimodal assistant**: This skill + gemini-3-multimodal
- **Creative bot**: This skill + gemini-3-image-generation
- **Production app**: All four Gemini 3 skills

---

## Troubleshooting

**Issue**: API key not working

**Solution**: Verify the API key in Google AI Studio and check the environment variable (a quick diagnostic sketch follows).
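
A minimal diagnostic, assuming the `google-generativeai` SDK used throughout this guide:

```python
# API-key diagnostic sketch: check the environment, then make a tiny call.
import os
import google.generativeai as genai

key = os.getenv("GEMINI_API_KEY")
if not key:
    raise SystemExit("GEMINI_API_KEY is not set in this environment")

genai.configure(api_key=key)
model = genai.GenerativeModel("gemini-3-pro-preview")
# If this raises PermissionDenied, the key itself is invalid.
print(model.generate_content("ping").text)
```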

**Issue**: Rate limit errors

**Solution**: Implement rate limiting, upgrade to a paid tier, or reduce request frequency.

**Issue**: Slow responses

**Solution**: Use `thinking_level: "low"` for simple tasks, enable streaming, and reduce the context size.

**Issue**: High costs

**Solution**: Keep prompts under 200k tokens, use the appropriate thinking level, and consider Gemini 1.5 Flash for simple tasks.

**Issue**: Temperature warnings

**Solution**: Keep the temperature at 1.0 (the default); do not modify it for complex reasoning tasks.

---

## Summary

This skill provides everything needed to integrate the Gemini 3 Pro API into your applications:

βœ… Quick setup (< 5 minutes)

βœ… Production-ready chat applications

βœ… Dynamic thinking configuration

βœ… Streaming responses

βœ… Error handling and retry logic

βœ… Cost optimization strategies

βœ… Monitoring and logging patterns

For multimodal, image generation, and advanced features, see the companion skills.

Ready to build? Start with Workflow 1: Quick Start Setup above!