marswaveai/skills

12 resources in this repository

Skills (12)

listenhub (Skill)

Provides AI skills for ListenHub to explain content from videos, podcasts, and other media formats.

tts (Skill)

Text-to-speech via the `listenhub` CLI with two modes: Quick (single voice, low-latency, synchronous) for snippets and casual reading, and Script (multi-speaker, per-segment voice assignment) for dialogue and audiobooks. Enforces AskUserQuestion-based parameter collection and follows shared CLI auth, config, and speaker-selection patterns.

asr (Skill)

Transcribe audio files to text fully offline via the `coli asr` CLI, using local speech-recognition models: `sensevoice` for Chinese/English/Japanese/Korean/Cantonese (with language, emotion, and audio-event detection) or `whisper-tiny.en` for English only. Optionally polishes transcripts (punctuation cleanup, filler removal) and can export to Markdown with front-matter metadata.
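
As a rough illustration of the Markdown export mentioned above, the sketch below renders a transcript with a front-matter block. The `to_markdown` helper and the field names (`source`, `model`, `language`) are assumptions made for this example; the skill's actual output format is not documented here.

```python
def to_markdown(transcript: str, metadata: dict[str, str]) -> str:
    """Render a transcript as Markdown preceded by a front-matter block.

    Hypothetical helper: the metadata keys are illustrative only.
    """
    lines = ["---"]
    for key, value in metadata.items():
        # Escape backslashes and quotes so the quoted value stays valid.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    lines.append("")  # blank line between front matter and body
    lines.append(transcript.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        to_markdown(
            "Hello and welcome to the show.",
            {"source": "episode.wav", "model": "sensevoice", "language": "en"},
        )
    )
```

Quoting every value keeps the front matter parseable even when a title or file name contains a colon.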

podcast (Skill)

Generate podcast episodes (1-2 AI speakers) from a topic, URL, or text via the `listenhub` CLI in three modes: Quick (brief overview), Deep (in-depth analysis), and Debate (two-speaker argument). Triggers on Chinese/English keywords (`podcast`, `播客`, `录一期节目`, `debate`, etc.), uses AskUserQuestion for every choice, defaults to 2 speakers, auto-detects language, and only calls generation APIs after the user confirms the summary.
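
The trigger matching and confirmation gate described above could be modeled roughly as follows. This is a minimal sketch, not the skill's implementation: `matches_trigger`, `PodcastRequest`, and `ready_to_generate` are hypothetical names, and the keyword tuple holds only the four keywords quoted in the description (the real list is longer).

```python
from dataclasses import dataclass

# Only the keywords quoted in the description; the full list is not shown.
TRIGGER_KEYWORDS = ("podcast", "播客", "录一期节目", "debate")
MODES = ("quick", "deep", "debate")


def matches_trigger(message: str) -> bool:
    """Return True if the message contains any known trigger keyword."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in TRIGGER_KEYWORDS)


@dataclass
class PodcastRequest:
    source: str              # topic, URL, or raw text
    mode: str                # one of MODES, chosen by the user
    speakers: int = 2        # the description states 2 speakers as the default
    confirmed: bool = False  # set only after the user approves the summary

    def ready_to_generate(self) -> bool:
        """Allow generation only for a valid request the user has confirmed."""
        return self.confirmed and self.mode in MODES and self.speakers in (1, 2)


if __name__ == "__main__":
    request = PodcastRequest(source="ocean tides", mode="debate")
    print(matches_trigger("Make a podcast about ocean tides"))
    print(request.ready_to_generate())  # False until the user confirms
```

Keeping `confirmed` as an explicit field makes the "confirm before calling the API" rule something the code can check in one place.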

content-parser (Skill)

image-gen (Skill)

Generates AI images via the `listenhub image create` CLI with `gemini-3-pro-image-preview` (pro) or `gemini-3.1-flash-image-preview` (flash) models, supporting 1K/2K/4K resolutions, multiple aspect ratios (16:9, 1:1, 9:16, plus 1:4/4:1/1:8/8:1 on flash), and up to 5 reference images via `--reference`. Saves outputs to `.listenhub/image-gen/YYYY-MM-DD-{jobId}/` with inline/download/both display modes.
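
The constraints in this description (aspect ratios per model tier, the reference-image cap, and the output directory pattern) can be sketched as below. The helper names `validate_options` and `output_dir` are hypothetical, the ratio sets contain only the values listed above (the real set may be larger), and the sketch assumes all three resolutions are available on both tiers, which the description does not confirm.

```python
from datetime import date
from pathlib import Path

# Values taken from the description; treated here as the full allowed set.
COMMON_RATIOS = {"16:9", "1:1", "9:16"}
FLASH_ONLY_RATIOS = {"1:4", "4:1", "1:8", "8:1"}
RESOLUTIONS = {"1K", "2K", "4K"}
MAX_REFERENCES = 5


def validate_options(model: str, ratio: str, resolution: str, references: list[str]) -> None:
    """Raise ValueError if the options contradict the documented limits."""
    if model not in ("pro", "flash"):
        raise ValueError(f"unknown model tier: {model!r}")
    allowed = (COMMON_RATIOS | FLASH_ONLY_RATIOS) if model == "flash" else COMMON_RATIOS
    if ratio not in allowed:
        raise ValueError(f"aspect ratio {ratio} is not listed for the {model} tier")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if len(references) > MAX_REFERENCES:
        raise ValueError(f"at most {MAX_REFERENCES} reference images are allowed")


def output_dir(job_id: str, day: date) -> Path:
    """Build .listenhub/image-gen/YYYY-MM-DD-{jobId}/ for a finished job."""
    return Path(".listenhub") / "image-gen" / f"{day.isoformat()}-{job_id}"


if __name__ == "__main__":
    validate_options("flash", "1:8", "2K", ["style.png"])
    print(output_dir("abc123", date.today()))
```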

explainer (Skill)

creator (Skill)

A `listenhub`-driven creator workflow that turns a topic, URL, text, or audio into a platform-ready content package (WeChat article, Xiaohongshu post, or narration script) made up of an article, images, and metadata. Enforces one-question-at-a-time AskUserQuestion, explicit confirmation gates, UI-language mirroring, and `listenhub auth`/`LISTENHUB_API_KEY` checks before remote calls.
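
The credential check before remote calls might look something like the sketch below. Only the `LISTENHUB_API_KEY` variable name comes from the description; `has_api_key` and `require_credentials` are hypothetical helpers, and the sketch deliberately does not guess at how the `listenhub auth` command reports its status.

```python
import os
from collections.abc import Mapping


def has_api_key(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if LISTENHUB_API_KEY is set to a non-blank value."""
    return bool(env.get("LISTENHUB_API_KEY", "").strip())


def require_credentials(env: Mapping[str, str] = os.environ) -> None:
    """Stop before any remote call when no API key is configured."""
    if not has_api_key(env):
        raise RuntimeError(
            "LISTENHUB_API_KEY is not set; authenticate with the listenhub CLI first"
        )


if __name__ == "__main__":
    print("API key present:", has_api_key())
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.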

music (Skill)

Generates original AI music from text prompts or creates cover versions from reference audio, supporting custom styles, titles, and instrumental-only options through the ListenHub CLI.

listenhub-cli (Skill)

A router skill that identifies user intent and delegates to specialized ListenHub skills for podcasts, explainer videos, slides, text-to-speech, image generation, music creation, content extraction, and audio transcription.

slides (Skill)

Generates slide decks with AI-created visuals from topics, URLs, or text input, with optional audio narration support. Suited to presentations, summaries, and visual storytelling.

cola-avatar-pack (Skill)

Generates pixel-art self-portraits, profile cards, emoji GIFs, and meme stickers for the Cola AI companion, with automatic mood-based expression selection and language-adaptive interactions in Chinese and English.
