image-generate

from agntswrm/agent-media

What it does

Generates images from text prompts using AI image generation services like Fal.ai, Replicate, or Runpod.

Installation

Install skill:
npx skills add https://github.com/agntswrm/agent-media --skill image-generate
Added Jan 25, 2026

Skill Details

# agent-media

Media processing CLI for AI agents.

  • Image: generate, edit, remove-background, resize, convert, extend, crop
  • Video: generate (text-to-video and image-to-video)
  • Audio: extract from video, transcribe (with speaker identification)

Installation

Global

```bash
npm install -g agent-media@latest
```

From Source

```bash
git clone https://github.com/agntswrm/agent-media
cd agent-media
pnpm install && pnpm build && pnpm link --global
```

Via bunx / npx

Run directly without installing:

```bash
bunx agent-media@latest --help
npx agent-media@latest --help
```

Skills for AI Agents

Install agent-media skills to your coding agent (Claude Code, Cursor, Codex, etc.):

```bash
npx skills add agntswrm/agent-media
```

This adds media processing skills that your AI agent can use automatically. Available skills:

  • agent-media - Overview of all capabilities
  • image-generate - Generate images from text
  • image-edit - Edit images with text prompts
  • image-resize - Resize images
  • image-convert - Convert image formats
  • image-extend - Extend image canvas with padding
  • image-remove-background - Remove backgrounds
  • image-crop - Crop images to specified dimensions
  • audio-extract - Extract audio from video
  • audio-transcribe - Transcribe audio to text
  • video-generate - Generate videos from text or images

Quick Start

```bash
# generate an image
agent-media image generate --prompt "a robot" --out rob.png

# remove background
agent-media image remove-background --in rob.png --out rob_nobg.png

# edit the image
agent-media image edit --in rob_nobg.png --prompt "the robot is sitting on a bench next to a cat, in the background you can see the Eiffel Tower in Paris" --out rob_cat_paris.png

# generate a video with audio (cat meows, robot speaks!)
agent-media video generate --in rob_cat_paris.png --prompt "the cat meows and the robot says: \"Yes, me too.\"" --audio --out rob_cat_video.mp4

# extract audio from video
agent-media audio extract --in rob_cat_video.mp4 --out rob_cat_audio.mp3

# transcribe the audio
agent-media audio transcribe --in rob_cat_audio.mp3
```

Requirements

  • Node.js >= 18.0.0
  • API key from [fal.ai](https://fal.ai/dashboard/keys), [Replicate](https://replicate.com/account/api-tokens), [Runpod](https://www.runpod.io/console/user/settings), or [AI Gateway](https://vercel.com/ai-gateway) for AI features

Local processing (no API key): resize, convert, extend, crop, audio extract, remove-background, transcribe

Cloud processing (API key required): image generate, image edit, video generate, remove-background, transcribe

> Note: You may see a "mutex lock failed" error when using local remove-background or transcribe. Ignore it; the output is correct if the JSON shows `"ok": true`.
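
The note above treats the JSON result as the source of truth. A minimal sketch of that check follows, assuming the CLI prints its JSON result to stdout. The README does not show the exact output shape, so the sample string and the `is_ok` helper are hypothetical:

```bash
# Succeeds when the given JSON text contains "ok": true.
# Plain grep keeps the sketch dependency-free; a JSON parser would be stricter.
is_ok() {
  printf '%s' "$1" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical sample output; the real result may carry more fields.
sample='{"ok": true}'
if is_ok "$sample"; then
  echo "result ok"
else
  echo "result failed"
fi

# With the real CLI (assumption: JSON goes to stdout, the mutex warning to stderr):
# result=$(agent-media image remove-background --in rob.png --out rob_nobg.png 2>/dev/null)
# is_ok "$result" || echo "remove-background failed" >&2
```

Checking the `ok` field rather than the exit status or stderr keeps a script from aborting on the harmless mutex message.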

---

image

```bash
agent-media image resize --in [options]
agent-media image convert --in --format
agent-media image extend --in --padding --color
agent-media image crop --in --width --height
agent-media image generate --prompt
agent-media image edit --in --prompt
agent-media image remove-background --in
```

resize

local

```bash
agent-media image resize --in sunset-mountains.jpg --width 800
agent-media image resize --in sunset-mountains.jpg --height 600
agent-media image resize --in https://ytrzap04kkm0giml.public.blob.vercel-storage.com/sunset-mountains.jpg --width 800
```

| Option | Description |
|--------|-------------|
| --in | Input file path or URL (required) |
| --width | Target width in pixels |
| --height | Target height in pixels |
| --out | Output path, filename or directory (default: ./) |
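
Because `--out` accepts a directory, resizing a folder of images can be scripted with a loop. The sketch below uses only the flags from the table above. The `resize_all` helper, the `AGENT_MEDIA` override variable, and the directory names are hypothetical and not part of agent-media:

```bash
# resize_all SRC_DIR OUT_DIR WIDTH
# Resizes every .jpg in SRC_DIR to WIDTH pixels wide, writing into OUT_DIR.
# AGENT_MEDIA is a hypothetical override so the loop can be dry-run with `echo`.
resize_all() {
  mkdir -p "$2"
  for f in "$1"/*.jpg; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    "${AGENT_MEDIA:-agent-media}" image resize --in "$f" --width "$3" --out "$2"
  done
}

# Dry run: print the commands instead of executing them.
# export AGENT_MEDIA=echo
# resize_all ./photos ./resized 800
```

Running the dry run first shows exactly which commands would be issued before any file is written.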

convert

local

```bash

agent-media image convert --in sunset-mountains.png --format webp

agent-media image convert --in sunset-mountains.jpg --format png

agent-media image convert --in https://ytrzap04kkm0giml.public.blob.vercel-stor