fal-ai
🎯Skillfrom vorbei/fal-skill
Generates diverse AI media including images, videos, audio, music, and more with intelligent intent classification and multi-language support.
Installation
npx skills add vorbei/fal-skill --skill fal-aiSkill Details
Overview
# /fal-generate
Universal AI media generation - generates images, videos, audio, music, and more.
Instructions
Analyze the user's request and determine what they want to generate. Claude Code should understand the intent directly and call the appropriate fal_api.py command.
Prerequisites
Before executing any generation, verify the API key is configured:
```bash
if [ ! -f ~/.config/fal-skill/.env ]; then
echo "❌ API key not configured. Please run /fal-setup first."
exit 1
fi
```
Intent Classification
| Intent | Triggers (examples) | Command |
|--------|---------------------|---------|
| Image Generation | Default; "画", "生成图片", "create image", "draw" | generate |
| Video Generation | "视频", "video", "animate", "动画", "clip" | video |
| Image-to-Video | Has image + "变成视频", "animate this", "make it move" | video --image-url |
| Background Removal | "去背景", "抠图", "remove bg", "transparent", "cutout" | Use /fal-remove-bg |
| TTS | "朗读", "说", "speak", "voice", "tts", "read aloud" | tts |
| Music | "音乐", "music", "歌曲", "song", "bgm", "soundtrack" | music |
| Sound Effects | "音效", "sound effect", "sfx" | music (cassetteai model) |
| Avatar/Lipsync | "口型", "avatar", "lipsync", "talking head" | avatar |
| Transcribe | "转文字", "transcribe", "听写", "speech to text" | transcribe |
| Upscale | "放大", "upscale", "enhance", "超分", "2x/4x/8x" | upscale |
| Photo Edit | "调色", "打光", "colorize", "relight", "reseason" | edit |
Intent Priority (Conflict Resolution)
When a request matches multiple intents, use this priority order:
- Upscale - If explicit scale keywords (2x, 4x, 8x, upscale, 放大, 超分)
- Background Removal - If "background", "去背景", "抠图", "transparent"
- Transcribe - If "transcribe", "转文字", "听写" with audio file
- Avatar - If "avatar", "lipsync", "口型" with image+audio
- TTS - If "speak", "朗读", "tts", "voice" (speech output)
- Music/SFX - If "music", "音乐", "sfx", "音效" (no video context)
- Video - If "video", "视频", "animate", "动画"
- Photo Edit - If editing keywords with existing image (colorize, relight, reseason)
- Image Generation - Default fallback for creative prompts
Ambiguous Cases
| Request | Correct Intent | Reasoning |
|---------|---------------|-----------|
| "remove the person from photo.jpg" | Photo Edit (remove-object) | "remove" + object subject = object removal |
| "remove background from photo.jpg" | Background Removal | "background" keyword = bg removal |
| "make the image larger" | Upscale | "larger" without creative context = upscale |
| "create a larger castle image" | Image Generation | "larger" in creative context = image gen |
| "music video of a band" | Video Generation | "video" takes precedence over "music" |
| "generate music for my video" | Music Generation | "music" is the output, video is context |
| "enhance photo.jpg" | Upscale | "enhance" alone = quality improvement |
| "enhance the colors in photo.jpg" | Photo Edit (colorize) | "colors" specifies editing operation |
Parameter Extraction
Extract these parameters from the user's request:
Video parameters:
- Duration: "短视频" → 5, "长一点" → 10, "10秒/10 seconds" → 10
- Aspect ratio: "手机看/vertical/portrait/tiktok" → 9:16, "横屏/widescreen" → 16:9, "方形/square" → 1:1
Image parameters:
- Size: "竖版/portrait" → portrait_16_9, "横版/landscape" → landscape_16_9, default → square_hd
- Quality: "快速/fast" → fewer steps, "精细/detailed" → more steps
Upscale parameters:
- Scale: "2倍/2x" → 2, "4倍/4x" → 4, "8倍/8x" → 8 (default: 2)
Default Models
| Task | Model |
|------|-------|
| Image Generation | fal-ai/flux.2/dev |
| Text-to-Video | fal-ai/kling-video/v2/standard/text-to-video |
| Image-to-Video | fal-ai/kling-video/v2/standard/image-to-video |
| TTS | fal-ai/kokoro/american-english |
| Music | fal-ai/minimax-music/v2 |
| Sound Effects | cassetteai/sound-effects-generator |
| Avatar | fal-ai/kling-video/ai-avatar/v2/standard |
| Transcribe | fal-ai/elevenlabs/speech-to-text/scribe-v2 |
| Upscale (Image) | fal-ai/crystal/upscale |
| Photo Colorize | fal-ai/fibo-edit/colorize |
| Photo Relight | fal-ai/fibo-edit/relight |
| Photo Reseason | fal-ai/fibo-edit/reseason |
| Photo Restyle | fal-ai/fibo-edit/restyle |
Execution
The skill directory is at: ~/.claude/skills/fal-ai/fal-ai
Run commands using uv:
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py
```
File/URL Handling
- If user provides a local file path, first upload it:
uv run python scripts/upload_image.py - Use the returned URL as
--image-url,--video-url, or--audio-url - If user says "这张图", "this image", etc., use the most recently generated/mentioned file
Examples
Image Generation
User: "帮我画一只猫"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py generate \
--model fal-ai/flux.2/dev \
--prompt "a cat"
```
User: "a portrait of a warrior in high quality"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py generate \
--model fal-ai/flux.2/dev \
--prompt "a portrait of a warrior" \
--size portrait_16_9 \
--steps 50
```
Video Generation
User: "生成一个海浪的视频"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py video \
--model fal-ai/kling-video/v2/standard/text-to-video \
--prompt "ocean waves crashing on beach" \
--duration 5 \
--aspect-ratio 16:9
```
User: "适合手机看的短视频,一只猫在跳舞"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py video \
--model fal-ai/kling-video/v2/standard/text-to-video \
--prompt "a cat dancing" \
--duration 5 \
--aspect-ratio 9:16
```
Image-to-Video
User: "把这张图变成视频 cat.jpg"
```bash
# First upload the image
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py cat.jpg
# Then generate video (use returned URL)
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py video \
--model fal-ai/kling-video/v2/standard/image-to-video \
--image-url
--prompt "the cat starts walking" \
--duration 5
```
Text-to-Speech
User: "朗读这段话:今天天气真好"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py tts \
--model fal-ai/kokoro/american-english \
--text "今天天气真好"
```
User: "speak this in a warm voice: Hello world"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py tts \
--model fal-ai/kokoro/american-english \
--text "Hello world" \
--voice af_heart
```
Music Generation
User: "生成一段轻松的背景音乐"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py music \
--model fal-ai/minimax-music/v2 \
--prompt "relaxing background music, ambient, calm" \
--duration 30
```
User: "explosion sound effect"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py music \
--model cassetteai/sound-effects-generator \
--prompt "explosion, cinematic"
```
Avatar / Lipsync
User: "用这张照片和音频做口型同步 portrait.jpg audio.mp3"
```bash
# Upload both files first
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py portrait.jpg
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py audio.mp3
# Then create avatar
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py avatar \
--model fal-ai/kling-video/ai-avatar/v2/standard \
--image-url
--audio-url
```
Transcription
User: "把这段音频转成文字 meeting.mp3"
```bash
# Upload audio first
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py meeting.mp3
# Then transcribe
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py transcribe \
--model fal-ai/elevenlabs/speech-to-text/scribe-v2 \
--audio-url
```
Upscale
User: "放大这张图4倍 img.png"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py img.png
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py upscale \
--model fal-ai/crystal/upscale \
--image-url
--scale 4
```
Photo Editing
User: "让照片更温暖 photo.jpg"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py photo.jpg
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py edit \
--model fal-ai/fibo-edit/relight \
--image-url
--light-type "sunrise light"
```
User: "把这张照片变成冬天的场景 landscape.jpg"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py landscape.jpg
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py edit \
--model fal-ai/fibo-edit/reseason \
--image-url
--season winter
```
User: "colorize this old photo vintage.jpg"
```bash
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/upload_image.py vintage.jpg
cd ~/.claude/skills/fal-ai/fal-ai && uv run python scripts/fal_api.py edit \
--model fal-ai/fibo-edit/colorize \
--image-url
--color "contemporary color"
```
Error Handling
If a command fails:
- Check if the API key is configured:
~/.config/fal-skill/.env - Check if the file exists and is accessible
- Report the error message to the user
- Suggest alternatives or corrections
Related Skills
/fal-remove-bg- Background removal (specialized)/fal-generate-video- Video generation (specialized)/fal-setup- Configure API key