gemini-tts
Skill from akrindev/google-studio-skills
Generates natural-sounding speech from text using Google Gemini TTS models, supporting multiple voices, streaming, and multi-speaker conversations.
Part of akrindev/google-studio-skills (5 items)
Installation
python scripts/tts.py "Hello, world! Have a wonderful day."
python scripts/tts.py "Welcome to our podcast about technology trends" --voice Puck --output welcome
python scripts/tts.py "TTS the following conversation:
python scripts/tts.py "This is a very long text that would benefit from streaming..." --stream --output long-form
python scripts/tts.py "Welcome to our quarterly earnings presentation. Today we'll discuss our growth metrics and future plans." --voice Charon --output voiceover
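For calling the script from other Python code, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the flags visible in those commands (--voice, --output, --stream). The helper names build_tts_command and run_tts are hypothetical and not part of the repository, and the script may accept other options that are not shown here.

```python
import subprocess
import sys
from typing import List, Optional


def build_tts_command(
    text: str,
    voice: Optional[str] = None,
    output: Optional[str] = None,
    stream: bool = False,
    script: str = "scripts/tts.py",
) -> List[str]:
    """Build an argv list for the TTS script.

    Only the flags that appear in the listed commands are used;
    this helper is illustrative, not part of the repository.
    """
    cmd = [sys.executable, script, text]
    if voice:
        cmd += ["--voice", voice]
    if output:
        cmd += ["--output", output]
    if stream:
        cmd.append("--stream")
    return cmd


def run_tts(text: str, **options) -> int:
    """Run the script and return its exit code.

    Assumes the current directory is a checkout of the repository
    so that scripts/tts.py exists.
    """
    return subprocess.run(build_tts_command(text, **options)).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the text itself contains quotes or apostrophes, as in the earnings-presentation command.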
Skill Details
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".
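As background on the audio side, Google's Gemini TTS documentation describes the returned audio as raw 16-bit PCM at 24 kHz, mono, which needs a WAV header before most players will open it. The sketch below shows that wrapping step with Python's standard wave module. It is an assumption that scripts/tts.py does something similar; the function names and the default format parameters here are illustrative only.

```python
import wave


def save_pcm_as_wav(
    path: str,
    pcm: bytes,
    sample_rate: int = 24000,  # assumed Gemini TTS output rate
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes)
) -> None:
    """Wrap raw PCM bytes in a WAV container."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()
```

With these defaults, one second of audio corresponds to 48,000 bytes of PCM data (24,000 samples at 2 bytes each).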
More from this repository (4)
Efficiently processes large volumes of AI requests using the Gemini Batch API, enabling cost-effective bulk text generation and async job execution via scripts.
Generates high-quality text embeddings using Gemini API for semantic search, similarity analysis, clustering, and RAG applications.
Generates high-quality AI images from text prompts using Google's Gemini and Imagen models, supporting multiple resolutions, aspect ratios, and creative styles.
Generates text content using Google Gemini models with advanced capabilities like multimodal prompts, thinking mode, JSON output, and search grounding.