baoyu-image-gen

from siatwangmin/coco-skills

What it does

Generates AI images using OpenAI and Google APIs with flexible text-to-image generation options.

Part of siatwangmin/coco-skills (11 items)

Installs: 5

Added: Feb 4, 2026

Skill Details

SKILL.md

AI image generation with OpenAI and Google APIs. Supports text-to-image, reference images, aspect ratios, and parallel generation (recommended: 4 concurrent subagents). Use when the user asks to generate, create, or draw images.

Overview

# Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI and Google providers.

Script Directory

Agent Execution:

SKILL_DIR = this SKILL.md file's directory
Script path = ${SKILL_DIR}/scripts/main.ts

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):

```bash
# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"

# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
```

| Path | Location |
|------|----------|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |

| Result | Action |
|--------|--------|
| Found | Read, parse, apply settings |
| Not found | Use defaults |

EXTEND.md supports: default provider, default quality, default aspect ratio

Usage

```bash
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google multimodal only)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
```

Options

| Option | Description |
|--------|-------------|
| --prompt, -p | Prompt text |
| --promptfiles | Read prompt from files (concatenated) |
| --image | Output image path (required) |
| --provider google\|openai | Force provider (default: google) |
| --model, -m | Model ID |
| --ar | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google (default: from quality) |
| --ref | Reference images (Google multimodal only) |
| --n | Number of images |
| --json | JSON output |
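
As a rough illustration of how the flags combine, the sketch below assembles an argument list from a typed options object. `ImageGenOptions` and `buildArgs` are hypothetical names, not part of the skill. Requiring either a prompt or prompt files, and passing several `--ref` paths space-separated (as the usage example does for `--promptfiles`), are assumptions.

```typescript
// Hypothetical helper, not part of the skill: it only emits flags that
// appear in the options table.
interface ImageGenOptions {
  prompt?: string;
  promptFiles?: string[];
  image: string; // output image path (required)
  provider?: "google" | "openai";
  model?: string;
  ar?: string;
  size?: string;
  quality?: "normal" | "2k";
  imageSize?: "1K" | "2K" | "4K";
  ref?: string[];
  n?: number;
  json?: boolean;
}

function buildArgs(opts: ImageGenOptions): string[] {
  const hasFiles = opts.promptFiles !== undefined && opts.promptFiles.length > 0;
  // Assumption: the script needs a prompt from one of the two sources.
  if (!opts.prompt && !hasFiles) {
    throw new Error("Provide either prompt or promptFiles");
  }
  const args: string[] = [];
  if (opts.prompt) args.push("--prompt", opts.prompt);
  if (hasFiles) args.push("--promptfiles", ...(opts.promptFiles as string[]));
  args.push("--image", opts.image);
  if (opts.provider) args.push("--provider", opts.provider);
  if (opts.model) args.push("--model", opts.model);
  if (opts.ar) args.push("--ar", opts.ar);
  if (opts.size) args.push("--size", opts.size);
  if (opts.quality) args.push("--quality", opts.quality);
  if (opts.imageSize) args.push("--imageSize", opts.imageSize);
  // Assumption: multiple reference images are passed space-separated.
  if (opts.ref !== undefined && opts.ref.length > 0) args.push("--ref", ...opts.ref);
  if (opts.n !== undefined) args.push("--n", String(opts.n));
  if (opts.json) args.push("--json");
  return args;
}

// Usage: these arguments would follow `npx -y bun ${SKILL_DIR}/scripts/main.ts`.
const landscapeArgs = buildArgs({ prompt: "A landscape", image: "out.png", ar: "16:9" });
```

Returning an array rather than a joined string leaves shell quoting to whatever process launcher consumes it.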

Environment Variables

| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_API_KEY | Google API key |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| GOOGLE_IMAGE_MODEL | Google model override |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| GOOGLE_BASE_URL | Custom Google endpoint |

Load Priority: CLI args > env vars > /.baoyu-skills/.env > ~/.baoyu-skills/.env
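
The priority chain can be read as "first source that defines the value wins". Here is a minimal sketch of that idea, assuming each source has already been parsed into a plain key/value record. `resolveSetting` and the sample values are hypothetical, and treating an empty string as unset is an assumption.

```typescript
type Settings = Record<string, string | undefined>;

// `sources` must be ordered from highest to lowest priority.
function resolveSetting(key: string, sources: Settings[]): string | undefined {
  for (const source of sources) {
    const value = source[key];
    // Assumption: an empty string counts as "not set".
    if (value !== undefined && value !== "") return value;
  }
  return undefined;
}

// Hypothetical stand-ins for: CLI args, process environment, and the
// first and second .env files in the priority list.
const cliSettings: Settings = {};
const envSettings: Settings = { GOOGLE_IMAGE_MODEL: "model-from-env" };
const firstDotenv: Settings = { GOOGLE_IMAGE_MODEL: "model-from-first-dotenv", GOOGLE_API_KEY: "key-from-first-dotenv" };
const secondDotenv: Settings = { GOOGLE_API_KEY: "key-from-second-dotenv" };

const priority = [cliSettings, envSettings, firstDotenv, secondDotenv];
const model = resolveSetting("GOOGLE_IMAGE_MODEL", priority); // "model-from-env"
const apiKey = resolveSetting("GOOGLE_API_KEY", priority); // "key-from-first-dotenv"
```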

Provider Selection

--provider specified → use it
Only one API key available → use that provider
Both available → default to Google
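
These rules reduce to a small decision function. The sketch below is illustrative only: `selectProvider` and `ProviderInputs` are hypothetical names, and the no-key branch mirrors the "Missing API key" error described under Error Handling.

```typescript
type Provider = "google" | "openai";

interface ProviderInputs {
  requested?: Provider; // value of --provider, if given
  hasGoogleKey: boolean; // GOOGLE_API_KEY is set
  hasOpenAIKey: boolean; // OPENAI_API_KEY is set
}

function selectProvider(input: ProviderInputs): Provider {
  // Rule 1: an explicit --provider always wins.
  if (input.requested) return input.requested;
  // Rules 2 and 3: Google when its key exists (alone or alongside OpenAI's),
  // otherwise OpenAI when only its key exists.
  if (input.hasGoogleKey) return "google";
  if (input.hasOpenAIKey) return "openai";
  throw new Error("No API key found: set GOOGLE_API_KEY or OPENAI_API_KEY");
}

// Usage: both keys present, no --provider flag.
const chosen = selectProvider({ hasGoogleKey: true, hasOpenAIKey: true }); // "google"
```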

Quality Presets

| Preset | Google imageSize | OpenAI Size | Use Case |
|--------|------------------|-------------|----------|
| normal | 1K | 1024px | Quick previews |
| 2k (default) | 2K | 2048px | Covers, illustrations, infographics |

Google imageSize: Can be overridden with --imageSize 1K|2K|4K
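
One way to picture the preset table and the `--imageSize` override is a lookup with an optional override argument. This is a sketch, not the script's actual code: `QUALITY_PRESETS` and `resolveGoogleImageSize` are hypothetical names, and `pixels` simply records the pixel figure listed for each preset.

```typescript
type Quality = "normal" | "2k";
type ImageSize = "1K" | "2K" | "4K";

interface Preset {
  googleImageSize: ImageSize;
  pixels: number; // pixel figure listed in the preset table
}

const QUALITY_PRESETS: Record<Quality, Preset> = {
  normal: { googleImageSize: "1K", pixels: 1024 },
  "2k": { googleImageSize: "2K", pixels: 2048 },
};

// An explicit --imageSize wins; otherwise the size follows the quality
// preset, which defaults to 2k.
function resolveGoogleImageSize(quality: Quality = "2k", override?: ImageSize): ImageSize {
  return override ?? QUALITY_PRESETS[quality].googleImageSize;
}

// Usage: --quality normal --imageSize 4K resolves to "4K".
const resolvedSize = resolveGoogleImageSize("normal", "4K");
```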

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

Google multimodal: uses imageConfig.aspectRatio
Google Imagen: uses aspectRatio parameter
OpenAI: maps to closest supported size
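
"Maps to closest supported size" could be implemented by comparing width/height ratios. The sketch below picks the candidate whose ratio is nearest on a log scale, so that 2:1 and 1:2 are treated as equally far from 1:1. `closestSize` is a hypothetical name, and the candidate list is an assumed example: the sizes a given model accepts are not stated here.

```typescript
// Assumption: example candidate sizes, not taken from the skill.
const CANDIDATE_SIZES = ["1024x1024", "1536x1024", "1024x1536"];

// Parses "W<sep>H" into W / H, or undefined when the text is malformed.
function parseRatio(text: string, sep: string): number | undefined {
  const parts = text.split(sep);
  if (parts.length !== 2) return undefined;
  const w = Number(parts[0]);
  const h = Number(parts[1]);
  if (!Number.isFinite(w) || !Number.isFinite(h) || w <= 0 || h <= 0) return undefined;
  return w / h;
}

function closestSize(aspectRatio: string, sizes: string[] = CANDIDATE_SIZES): string | undefined {
  const target = parseRatio(aspectRatio, ":");
  if (target === undefined) return undefined; // caller can warn and fall back
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const size of sizes) {
    const ratio = parseRatio(size, "x");
    if (ratio === undefined) continue;
    // Log distance keeps landscape and portrait errors symmetric.
    const distance = Math.abs(Math.log(ratio / target));
    if (distance < bestDistance) {
      best = size;
      bestDistance = distance;
    }
  }
  return best;
}

// Usage: with the assumed candidates, 16:9 maps to the landscape size.
const widescreen = closestSize("16:9"); // "1536x1024"
```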

Parallel Generation

Supports concurrent image generation via background subagents for batch operations.

| Setting | Value |
|---------|-------|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Batch generation (slides, comics, infographics) |

Agent Implementation:

```
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
```

Best Practice: When generating 4+ images, spawn background subagents (recommended 4 concurrent) instead of sequential execution.
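
The concurrency limits can be illustrated with an in-process worker pool. This sketch uses plain promises to show bounded concurrency only; it is not how the Task tool launches subagents. `runPool` and `clampConcurrency` are hypothetical names, and the `worker` callback stands in for whatever starts one generation.

```typescript
const RECOMMENDED_CONCURRENCY = 4;
const MAX_CONCURRENCY = 8;

// Keeps the requested lane count within 1..8, defaulting to 4.
function clampConcurrency(requested: number = RECOMMENDED_CONCURRENCY): number {
  if (!Number.isFinite(requested)) return RECOMMENDED_CONCURRENCY;
  return Math.min(MAX_CONCURRENCY, Math.max(1, Math.floor(requested)));
}

// Runs `worker` over `items` with at most `concurrency` calls in flight.
// Results come back in input order.
async function runPool<T, R>(
  items: T[],
  worker: (item: T, index: number) => Promise<R>,
  concurrency: number = RECOMMENDED_CONCURRENCY,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const laneCount = Math.min(clampConcurrency(concurrency), items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

A pool keeps all four lanes busy until the queue drains, whereas fixed batches of four would wait for the slowest job in each batch.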

Error Handling

Missing API key → error with setup instructions
Generation failure → auto-retry once
Invalid aspect ratio → warning, proceed with default
Reference images with non-multimodal model → warning, ignore refs
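
The "auto-retry once" rule can be expressed as a small wrapper. This is a minimal sketch with no delay or backoff, since none is described. `withSingleRetry` is a hypothetical name, and `attempt` stands in for a single generation request.

```typescript
// Runs `attempt`; if it rejects, runs it exactly one more time.
// A second failure propagates to the caller.
async function withSingleRetry<T>(attempt: () => Promise<T>): Promise<T> {
  try {
    return await attempt();
  } catch {
    return await attempt();
  }
}

// Usage with a hypothetical flaky operation that fails on its first call.
let attemptsMade = 0;
const flakyResult: Promise<string> = withSingleRetry(async () => {
  attemptsMade++;
  if (attemptsMade === 1) throw new Error("transient failure");
  return "image saved";
});
```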

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.

More from this repository (10)

baoyu-xhs-images

baoyu-infographic: Generates professional infographics by analyzing content and creating publication-ready visuals with 20 layout types and 17 visual styles.

baoyu-article-illustrator: Generates contextually appropriate illustrations for articles by analyzing content structure and automatically selecting optimal image types and visual styles.

release-skills: Automatically detects and updates version files, changelogs, and git tags across multiple project types with multi-language support and intelligent version bumping.

baoyu-danger-x-to-markdown

baoyu-cover-image: Generates customizable article cover images with 5-dimensional design options, supporting multiple styles, palettes, and aspect ratios.

baoyu-danger-gemini-web: Generates images and text via reverse-engineered Gemini Web API, supporting multi-turn conversations and vision-capable AI generation.

baoyu-slide-deck: Generates professional slide deck images from content, creating stylized presentations with customizable options for audience, style, and language.

baoyu-compress-image: Automatically compresses and optimizes images to WebP or other formats using the best available tool.

baoyu-post-to-wechat: Automates posting markdown articles and image-text content to WeChat Official Account using Chrome browser automation.