🎯 image-gen (Skill from marswaveai/skills)

What it does

Generates AI images via the `listenhub image create` CLI with `gemini-3-pro-image-preview` (pro) or `gemini-3.1-flash-image-preview` (flash) models, supporting 1K/2K/4K resolutions, multiple aspect ratios (16:9, 1:1, 9:16, plus 1:4/4:1/1:8/8:1 on flash), and up to 5 reference images via `--reference`. Saves outputs to `.listenhub/image-gen/YYYY-MM-DD-{jobId}/` with inline/download/both display modes.
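The listing names the models, the `--reference` flag, and the dated output layout. A minimal sketch of how an invocation and its output directory might fit together; note that only `listenhub image create` and `--reference` appear in the listing, while flag names such as `--model`, `--resolution`, and `--aspect-ratio` are assumptions:

```shell
# Hypothetical invocation (flag names other than --reference are assumed,
# not confirmed by the listing):
#
#   listenhub image create "a lighthouse at dusk" \
#     --model gemini-3-pro-image-preview \
#     --resolution 2K --aspect-ratio 16:9 \
#     --reference ref1.png --reference ref2.png
#
# Per the listing, outputs are saved under .listenhub/image-gen/YYYY-MM-DD-{jobId}/.
# Reconstructing that path for a known job id:
job_id="demo1234"
out_dir=".listenhub/image-gen/$(date +%F)-${job_id}"   # %F prints YYYY-MM-DD
mkdir -p "$out_dir"
echo "$out_dir"
```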


Installation

Vibe Index install (installs to `.claude/skills/`; auto-recognized by Claude Code):
    npx vibeindex add marswaveai/skills --skill image-gen

skills.sh install (⚠ installs to `.agents/skills/`; may not be auto-recognized by Claude Code):
    npx skills add marswaveai/skills --skill image-gen

Manual install (copy the SKILL.md content and save it to the path below):
    ~/.claude/skills/image-gen/SKILL.md

SKILL.md

670 installs
Added Apr 13, 2026

More from this repository (10)

🎯 listenhub (Skill)

Provides AI skills for ListenHub to explain content from videos, podcasts, and other media formats.

🎯 tts (Skill)

Text-to-speech via the `listenhub` CLI with two modes: Quick (single voice, low-latency, synchronous) for snippets and casual reading, and Script (multi-speaker, per-segment voice assignment) for dialogue and audiobooks. Enforces AskUserQuestion-based parameter collection and follows shared CLI auth, config, and speaker-selection patterns.

🎯 asr (Skill)

Transcribe audio files to text fully offline via the `coli asr` CLI, using local speech-recognition models โ€” `sensevoice` for Chinese/English/Japanese/Korean/Cantonese (with language, emotion, and audio-event detection) or `whisper-tiny.en` for English only. Optionally polishes transcripts (punctuation cleanup, filler removal) and can export to Markdown with front-matter metadata.

🎯 podcast (Skill)

Generate podcast episodes (1โ€“2 AI speakers) from a topic, URL, or text via the `listenhub` CLI in three modes โ€” Quick (brief overview), Deep (in-depth analysis), and Debate (two-speaker argument). Triggers on Chinese/English keywords (`podcast`, `ๆ’ญๅฎข`, `ๅฝ•ไธ€ๆœŸ่Š‚็›ฎ`, `debate`, etc.), uses AskUserQuestion for every choice, defaults to 2 speakers, auto-detects language, and only calls generation APIs after the user confirms the summary.

🎯 content-parser (Skill)

🎯 explainer (Skill)

🎯 creator (Skill)

A listenhub-driven creator workflow that turns a topic/URL/text/audio into a platform-ready content package (WeChat article, Xiaohongshu post, narration script) with article + images + metadata — enforces one-question-at-a-time AskUserQuestion, explicit confirmation gates, UI-language mirroring, and `listenhub auth`/`LISTENHUB_API_KEY` checks before remote calls.

🎯 music (Skill)

Generates original AI music from text prompts or creates cover versions from reference audio, supporting custom styles, titles, and instrumental-only options through the ListenHub CLI.

🎯 listenhub-cli (Skill)

A router skill that identifies user intent and delegates to specialized ListenHub skills for podcasts, explainer videos, slides, text-to-speech, image generation, music creation, content extraction, and audio transcription.

🎯 slides (Skill)

Generates slide decks with AI-created visuals from topics, URLs, or text input, with optional audio narration support โ€” ideal for presentations, summaries, and visual storytelling.