speech-use
Skill from cnemri/google-genai-skills
Generates speech (TTS), transcribes audio (STT), and clones voices using Google's GenAI and Cloud Speech SDKs, with support for Gemini-TTS, Chirp 3, and custom voice models.
Part of
cnemri/google-genai-skills (9 items)
Installation
uv run skills/speech-use/scripts/generate_speech.py "Hello world, this is a test." --voice Puck --output hello.wav
uv run skills/speech-use/scripts/generate_speech.py "This is my custom voice speaking." --voice-cloning-key "YOUR_KEY_HERE" --output custom.wav
uv run skills/speech-use/scripts/create_custom_voice.py --reference-audio reference.wav --consent-audio consent.wav
uv run skills/speech-use/scripts/transcribe_audio.py audio.wav --language en-US --output transcript.txt
Skill Details
"Generate (TTS), Transcribe (STT), and Clone voices using Google's GenAI and Cloud Speech SDKs. Supports Gemini-TTS, Chirp 3, and Instant Custom Voice."
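The generate_speech.py commands listed under Installation can also be assembled from Python when the skill is driven by another script. The following is a minimal sketch, not part of the skill itself: the helper name build_tts_command is hypothetical, and treating --voice and --voice-cloning-key as mutually exclusive is an assumption based only on the two example commands, which use one flag each.

```python
import shlex
from typing import List, Optional

# Script path as it appears in the example commands.
SCRIPT = "skills/speech-use/scripts/generate_speech.py"


def build_tts_command(
    text: str,
    output: str,
    voice: Optional[str] = None,
    voice_cloning_key: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a generate_speech.py invocation.

    Hypothetical helper. Assumption: exactly one of `voice` (a prebuilt
    voice such as "Puck") or `voice_cloning_key` (a custom voice key)
    is passed, mirroring the two example commands.
    """
    if (voice is None) == (voice_cloning_key is None):
        raise ValueError("pass exactly one of voice or voice_cloning_key")

    cmd = ["uv", "run", SCRIPT, text]
    if voice is not None:
        cmd += ["--voice", voice]
    else:
        cmd += ["--voice-cloning-key", voice_cloning_key]
    cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    cmd = build_tts_command(
        "Hello world, this is a test.", "hello.wav", voice="Puck"
    )
    # Print a shell-quoted form for inspection; to actually run it, pass
    # the list to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when the spoken text contains quotes or other special characters.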
More from this repository (8)
Provides expert guidance and Python code examples for building, configuring, and deploying intelligent agents using the Google Agent Development Kit (ADK).
Provides expert Python code guidance for leveraging Google's Gemini API with the official GenAI SDK, covering text, chat, multimodal, and generative AI tasks.
Autonomously conducts multi-step research by searching web, analyzing files, and generating comprehensive, cited reports using Gemini.
Generates and edits videos using Google's Veo AI models with text, image, and reference-based inputs across multiple creative modes.
Generates compact, efficient Python code snippets for processing and analyzing small banana-related datasets with minimal computational overhead.
Generates and edits videos using Google's Veo AI models, supporting text-to-video, image-to-video, and advanced video manipulation techniques.
Generates and edits high-quality images using Gemini's Nano Banana models, supporting text-to-image, style transfer, and character consistency.
Generates speech audio from text using Google's text-to-speech technology, enabling easy audio conversion for various applications.