lipsync

Skill

from agentspace-so/runcomfy-agent-skills

What it does

A Claude Code skill for lip-syncing faces to audio tracks using RunComfy CLI, routing across OmniHuman, Sync Labs, Kling lipsync, and Creatify endpoints for portrait-to-avatar or video mouth-sync workflows.

Overview

Lipsync is a Claude Code skill that synchronizes a face's mouth movements to an audio track via the RunComfy CLI. It routes across four endpoint families: ByteDance OmniHuman for full-body avatar generation from a portrait plus audio, Sync Labs sync v2 and Pro for premium mouth-sync on existing video, Kling lipsync for audio-to-video and text-to-video with synced speech, and Creatify lipsync. The skill classifies user intent by input shape (portrait still plus audio, source video plus audio, or script only) and selects the endpoint that fits the required quality tier and budget.
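The input-shape routing described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual logic: `LipsyncRequest`, `choose_route`, and the returned route labels are hypothetical names invented here and are not real RunComfy model IDs. Mapping script-only input to Kling text-to-video is an inference from the overview, and the Creatify path is left out because the page does not say when it is chosen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LipsyncRequest:
    """Hypothetical description of what the user supplied."""
    portrait: Optional[str] = None  # path to a still portrait image
    video: Optional[str] = None     # path to existing source footage
    audio: Optional[str] = None     # path to the voice track
    script: Optional[str] = None    # text to be spoken
    premium: bool = False           # True when final-delivery quality is wanted


def choose_route(req: LipsyncRequest) -> str:
    """Return an illustrative route label (assumed names, not real endpoint IDs)."""
    if req.video and req.audio:
        # Source video + audio: mouth-sync applied to the existing footage.
        return "sync-labs-pro" if req.premium else "sync-labs-standard"
    if req.portrait and req.audio:
        # Portrait still + audio: generate a talking video from one image.
        return "omnihuman"
    if req.script and not (req.audio or req.video or req.portrait):
        # Script only: text-to-video with synced speech (assumed mapping).
        return "kling-text-to-video"
    raise ValueError("unsupported input shape for this sketch")
```

For example, `choose_route(LipsyncRequest(portrait="face.png", audio="vo.wav"))` picks the portrait-to-video route, while the same audio paired with a `video` path picks one of the Sync Labs tiers depending on `premium`.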

Key Features

Source video + audio lip sync - The Sync Labs sync v2 Pro (premium) and standard tiers apply mouth motion to existing video footage while leaving the rest of the frame untouched, for hero-quality and foreign-language dubs.
Portrait-to-talking-video - OmniHuman generates a full speaking/singing video from a single portrait image plus an audio file, producing natural head, mouth, and body movement.
Multiple quality and budget tiers - From premium Sync Labs Pro for final delivery to standard tiers for iteration, with Kling lipsync offering additional audio-to-video and text-to-video paths with synced speech.
Built-in consent guidance - Includes responsible-use reminders about the dual-use nature of lip-sync technology, recommending consent verification before processing real faces.

Who is this for?

Video producers and localization teams dubbing content into different languages who need precise mouth-to-audio synchronization on existing footage
Content creators building talking-head videos from portrait photos and voiceover audio without filming
Developers building lip-sync pipelines who need automatic model selection based on input type (still portrait vs existing video) and quality requirements

Same repository

agentspace-so/runcomfy-agent-skills (30 items)

Installation

Vibe Index Install: installs to .claude/skills/

npx vibeindex add agentspace-so/runcomfy-agent-skills --skill lipsync

skills.sh Install: ⚠ installs to .agents/skills/

npx skills add agentspace-so/runcomfy-agent-skills --skill lipsync

Manual Install: copy the SKILL.md content and save it to the path below

~/.claude/skills/lipsync/SKILL.md
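If you prefer to script the manual install, the step amounts to writing the copied content to that path. The sketch below assumes you already have the SKILL.md text in a string; `install_skill` is a hypothetical helper written for illustration, not part of any RunComfy or Claude Code tooling.

```python
from pathlib import Path
from typing import Optional


def install_skill(skill_md: str, home: Optional[Path] = None) -> Path:
    """Write SKILL.md content to <home>/.claude/skills/lipsync/SKILL.md."""
    base = home if home is not None else Path.home()
    target = base / ".claude" / "skills" / "lipsync" / "SKILL.md"
    # Create the skill directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(skill_md, encoding="utf-8")
    return target
```

Passing an explicit `home` makes it easy to try the helper against a scratch directory before touching the real `~/.claude` folder.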

293,703 installs

Added May 13, 2026

More from this repository (10)

nano-banana-2 (Skill)

A RunComfy skill that generates images using Google Nano Banana 2, the flash-tier text-to-image model in the Gemini family. Optimized for rapid iteration, social thumbnails, and in-image typography with configurable resolution tiers and safety tolerance.

image-edit (Skill)

A smart intent-routing skill for image editing on RunComfy that selects the best model based on the editing task. Routes to Nano Banana Edit for batch edits up to 20 images, GPT Image 2 for multilingual text rewrite, Flux Kontext Pro for single-shot precise edits, or Z-Image Turbo for mask-driven inpainting.

kling-3-0 (Skill)

Provides Kling 3.0 video generation on RunComfy, covering all six endpoints across three quality tiers (Standard, Pro, 4K) and two modes (text-to-video, image-to-video) for Kuaishou's third-generation cinematic video model with native synchronized audio.

nano-banana-edit (Skill)

Edit images with Google Nano Banana 2 on RunComfy, supporting batch edits of up to 20 images per call with strong identity preservation. Features localized edits using spatial language, background swaps, and configurable resolution up to 4K.

wan-2-7 (Skill)

Generate text-to-video with Wan-AI's Wan 2.7 on RunComfy, featuring multi-reference conditioning and audio-driven lip-sync via custom audio tracks. Supports prompt expansion, negative prompts, and up to 1080p resolution through the RunComfy CLI.

gpt-image-edit (Skill)

Edit images with OpenAI GPT Image 2 on RunComfy, excelling at multilingual in-image text editing across any script (Latin, kana, CJK, Cyrillic, Arabic) and multi-reference composition with up to 10 input images. Ideal for identity-preserving edits and layout-precise repositioning.

happyhorse-1-0 (Skill)

Generate text-to-video with HappyHorse 1.0 on RunComfy, currently ranked #1 on Artificial Analysis Video Arena. Supports native 1080p with in-pass synchronized audio, multi-shot character consistency, and 6-language prompt support via the RunComfy CLI.

seedance-v2 (Skill)

Generate cinematic short-form video with ByteDance Seedance 2.0 Pro on RunComfy, supporting multi-modal references including up to 9 images, 3 videos, and 3 audio tracks. Features native lip-synced audio generation and is ideal for brand-consistent multi-language narratives.

flux-2-klein (Skill)

A RunComfy skill for generating images with Black Forest Labs' Flux 2 Klein, the distilled low-latency variant of Flux 2. Supports 9B and 4B model variants with sub-second inference for real-time art direction, rapid concepting, and multi-reference brand styling.

runcomfy-cli (Skill)

The foundation skill for the RunComfy platform, providing a single CLI to install, authenticate, and invoke hundreds of model endpoints including image generation, video, face-swap, lip-sync, and LoRA training.