🎯 image-gen (Skill from marswaveai/skills)

What it does

Generates AI images via the `listenhub image create` CLI with `gemini-3-pro-image-preview` (pro) or `gemini-3.1-flash-image-preview` (flash) models, supporting 1K/2K/4K resolutions, multiple aspect ratios (16:9, 1:1, 9:16, plus 1:4/4:1/1:8/8:1 on flash), and up to 5 reference images via `--reference`. Saves outputs to `.listenhub/image-gen/YYYY-MM-DD-{jobId}/` with inline/download/both display modes.
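The listing names the models, the `--reference` flag, and the dated output layout. A minimal sketch of how an invocation and its output directory might fit together; note that only `listenhub image create` and `--reference` appear in the listing, while flag names such as `--model`, `--resolution`, and `--aspect-ratio` are assumptions:

```shell
# Hypothetical invocation (flag names other than --reference are assumed,
# not confirmed by the listing):
#
#   listenhub image create "a lighthouse at dusk" \
#     --model gemini-3-pro-image-preview \
#     --resolution 2K --aspect-ratio 16:9 \
#     --reference ref1.png --reference ref2.png
#
# Per the listing, outputs are saved under .listenhub/image-gen/YYYY-MM-DD-{jobId}/.
# Reconstructing that path for a known job id:
job_id="demo1234"
out_dir=".listenhub/image-gen/$(date +%F)-${job_id}"   # %F prints YYYY-MM-DD
mkdir -p "$out_dir"
echo "$out_dir"
```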


Installation

Vibe Index install (installs to `.claude/skills/`; auto-recognized by Claude Code):
    npx vibeindex add marswaveai/skills --skill image-gen

skills.sh install (⚠ installs to `.agents/skills/`; may not be auto-recognized by Claude Code):
    npx skills add marswaveai/skills --skill image-gen

Manual install (copy the SKILL.md content and save it to the path below):
    ~/.claude/skills/image-gen/SKILL.md

SKILL.md

670 installs
Added Apr 13, 2026

More from this repository (10)

🎯 listenhub (Skill)

Provides AI skills for ListenHub to explain content from videos, podcasts, and other media formats.

🎯 tts (Skill)

Text-to-speech via the `listenhub` CLI with two modes: Quick (single voice, low-latency, synchronous) for snippets and casual reading, and Script (multi-speaker, per-segment voice assignment) for dialogue and audiobooks. Enforces AskUserQuestion-based parameter collection and follows shared CLI auth, config, and speaker-selection patterns.

🎯 asr (Skill)

Transcribe audio files to text fully offline via the `coli asr` CLI, using local speech-recognition models โ€” `sensevoice` for Chinese/English/Japanese/Korean/Cantonese (with language, emotion, and audio-event detection) or `whisper-tiny.en` for English only. Optionally polishes transcripts (punctuation cleanup, filler removal) and can export to Markdown with front-matter metadata.

🎯 podcast (Skill)

Generate podcast episodes (1โ€“2 AI speakers) from a topic, URL, or text via the `listenhub` CLI in three modes โ€” Quick (brief overview), Deep (in-depth analysis), and Debate (two-speaker argument). Triggers on Chinese/English keywords (`podcast`, `ๆ’ญๅฎข`, `ๅฝ•ไธ€ๆœŸ่Š‚็›ฎ`, `debate`, etc.), uses AskUserQuestion for every choice, defaults to 2 speakers, auto-detects language, and only calls generation APIs after the user confirms the summary.

🎯 content-parser (Skill)

🎯 explainer (Skill)

🎯 creator (Skill)

A listenhub-driven creator workflow that turns a topic/URL/text/audio into a platform-ready content package (WeChat article, Xiaohongshu post, narration script) with article + images + metadata — enforces one-question-at-a-time AskUserQuestion, explicit confirmation gates, UI-language mirroring, and `listenhub auth`/`LISTENHUB_API_KEY` checks before remote calls.

🎯 music (Skill)

Generates original AI music from text prompts or creates cover versions from reference audio, supporting custom styles, titles, and instrumental-only options through the ListenHub CLI.

🎯 listenhub-cli (Skill)

A router skill that identifies user intent and delegates to specialized ListenHub skills for podcasts, explainer videos, slides, text-to-speech, image generation, music creation, content extraction, and audio transcription.

🎯 slides (Skill)

Generates slide decks with AI-created visuals from topics, URLs, or text input, with optional audio narration support โ€” ideal for presentations, summaries, and visual storytelling.