veo-use
Skill from cnemri/google-genai-skills
Generates and edits videos using Google's Veo AI models with text, image, and reference-based inputs across multiple creative modes.
Part of cnemri/google-genai-skills (9 items)
Installation
uv run skills/veo-use/scripts/text_to_video.py "A cinematic drone shot of a futuristic city" --output city.mp4
uv run skills/veo-use/scripts/image_to_video.py "Zoom out from the flower" --image start.png --output flower.mp4
uv run skills/veo-use/scripts/reference_to_video.py "A man walking on the moon" --reference-image man.png --output moon_walk.mp4
uv run skills/veo-use/scripts/edit_video.py --video input.mp4 --mask mask.png --mode INSERT --prompt "A flying car" --output edited.mp4
uv run skills/veo-use/scripts/extend_video.py --video clip.mp4 --prompt "The car flies away into the sunset" --duration 6 --output extended.mp4
Skill Details
"Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Reference-to-Video, Inpainting, and Video Extension. Available parameters: prompt, image, mask, mode, duration, aspect-ratio. Always confirm parameters with the user or explicitly state defaults before running."
More from this repository (8)
Provides expert guidance and Python code examples for building, configuring, and deploying intelligent agents using the Google Agent Development Kit (ADK).
Provides expert Python code guidance for leveraging Google's Gemini API with the official GenAI SDK, covering text, chat, multimodal, and generative AI tasks.
Autonomously conducts multi-step research by searching web, analyzing files, and generating comprehensive, cited reports using Gemini.
Generates compact, efficient Python code snippets for processing and analyzing small banana-related datasets with minimal computational overhead.
Generates, transcribes, and clones voices using Google's GenAI and Cloud Speech SDKs with support for Gemini-TTS, Chirp 3, and custom voice models.
Generates and edits videos using Google's Veo AI models, supporting text-to-video, image-to-video, and advanced video manipulation techniques.
Generates speech audio from text using Google's text-to-speech technology, enabling easy audio conversion for various applications.
Generates and edits high-quality images using Gemini's Nano Banana models, supporting text-to-image, style transfer, and character consistency.