🎯

image-prompt-generator

🎯Skill

from cdeistopened/skill-stack

What it does

Generates professional AI images using Google's Gemini API, creating custom visuals for thumbnails, social posts, and creative content with iterative styling.

📦

Part of

cdeistopened/skill-stack(25 items)

image-prompt-generator

Installation

pip installInstall Python package

pip install google-generativeai python-dotenv pillow

PythonRun Python server

python scripts/images/generate_image.py "YOUR PROMPT" \

PythonRun Python server

python scripts/images/generate_image.py "prompt here" --model pro --aspect 16:9 --size 2K

PythonRun Python server

python scripts/images/generate_image.py "prompt here" --model pro --size 4K

PythonRun Python server

python scripts/images/generate_image.py "prompt" --output "./images" --name "my_image" --size 2K

+ 5 more commands

📖 Extracted from docs: cdeistopened/skill-stack

Need more details? View full documentation on GitHub →

2Installs

AddedFeb 4, 2026

View on GitHub Back to Skills

Skill Details

SKILL.md

Generate AI images using Gemini image generation API. Use this skill when content needs images - thumbnails, social posts, blog headers, or creative visuals. Follows an iterative workflow - brainstorm concepts, select direction, generate in multiple styles, then produce via API.

Overview

# Image Prompt Generator

Generate professional, non-generic images using Google's Gemini API for image generation.

Prerequisites & Setup

Getting Your Gemini API Key

Go to [Google AI Studio](https://aistudio.google.com/app/apikey)
Sign in with your Google account
Click "Create API Key"
Copy the generated key

Configuring the API Key

Option 1: Environment file (recommended)

Create a .env file in your project root:

```bash

GEMINI_API_KEY=your_api_key_here

```

Option 2: Direct environment variable

```bash

export GEMINI_API_KEY=your_api_key_here

```

Install Dependencies

```bash

pip install google-generativeai python-dotenv pillow

```

Available Models

| Model | API Name | Best For |

|-------|----------|----------|

| Flash | gemini-2.5-flash-image | Speed, drafts, iteration |

| Pro | gemini-3-pro-image-preview | Final assets, 16:9 aspect ratio, quality |

Image Size (CRITICAL for quality)

| Size | Output | Use For |

|------|--------|---------|

| 1K | ~400KB | Drafts, iteration |

| 2K | ~2MB | Thumbnails, web assets (default) |

| 4K | ~7MB | Print, high-resolution needs |

CRITICAL: Always specify --size 2K or --size 4K for final assets. Without this, output defaults to 1K which looks low-quality.

Skill Stack Thumbnails - MANDATORY SETTINGS

For ANY Skill Stack thumbnail, you MUST use:

```bash

python scripts/images/generate_image.py "YOUR PROMPT" \

--model pro \

--size 2K \

--aspect 16:9 \

--output public/images/thumbnails \

--name your-slug

```

And your prompt MUST include risograph style instructions:

```

STYLE: Risograph / screen print aesthetic. Visible halftone dots throughout.

Slight misregistration between color layers. Indie printmaking quality - warm, tactile, handmade.

COLOR: Warm cream base. Charcoal and gray tones. Terracotta (#D4654A) accent on ONE key element.

TEXTURE: Visible halftone pattern. Paper grain. NOT smooth digital gradients.

AVOID: Empty white borders, photorealism, glossy renders, smooth gradients, clip art aesthetic.

```

WHY: Without --size 2K, output is ~400KB garbage that looks like it was made with Flash. Without risograph style, output looks generic and AI-generated.

---

Workflow Overview

Brainstorm Concepts - Generate 4-6 high-level visual ideas
Select Direction - User picks the concept they like
Optimize Prompt - Refine into a strong, detailed prompt
Style Variations - Adapt to 2-3 different visual styles
Generate Images - Run via Gemini API

Step 1: Brainstorm Concepts

When the user provides a topic or use case, generate 3 high-level visual concepts and indicate your pick. Each concept should be:

One sentence describing the visual idea
Concrete and immediate - you can picture it instantly
Conceptual but not abstract - a clear object/scene with meaning
Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)

Format:

```

Concept A: One sentence description of the visual concept and why it works.

Concept B: One sentence description...

Concept C: One sentence description...

My pick: Concept X — Brief reasoning why this one is strongest.

```

Example for "newsletter about personal productivity":

```

Concept A: A vintage compass where the needle points toward a coffee ring stain on a map—direction emerges from daily rituals.

Concept B: A clock where the 12 hours show seasonal changes—time management over long arcs, not just hours.

Concept C: One small key attached to dozens of decorative keychains—we overcomplicate simple solutions.

My pick: Concept A — The coffee stain is unexpected and personal, the compass is concrete without being cliché.

```

Wait for user to confirm or redirect before proceeding.

Step 2: Optimize the Prompt

Once the user selects a concept, develop it into a full prompt. Structure:

```

Create a [style type] illustration of [subject].

CONCEPT: [Expand the one-sentence idea into a clear visual description]

STYLE: [Artistic approach - load from references/styles/ if brand-specific]

COMPOSITION: [Framing, focal point, negative space, balance]

COLORS: [Palette - describe by name, not hex codes which may render as text]

TEXTURE: [Surface qualities, analog/digital feel]

AVOID: [What should NOT appear - be specific]

FORMAT: [Aspect ratio]

```

Key principles:

Natural language, full sentences - no tag soup
Describe colors by name (burnt orange, sky blue, near-black) not hex codes
Maximum 2-3 elements - if it feels busy, remove something
Favor metaphor over literal depiction

Step 3: Style Variations

MANDATORY for Skill Stack: Risograph style. Every thumbnail must include risograph style instructions in the prompt. See references/styles/risograph.md.

Key risograph elements to ALWAYS include in prompt:

"Visible halftone dots throughout"
"Slight misregistration between color layers"
"Indie printmaking quality - warm, tactile, handmade"
"NOT smooth digital gradients"

Available styles in references/styles/ (for non-Skill-Stack projects):

risograph.md - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic.
minimalist-ink.md - High-contrast black and white, crosshatching.
watercolor-line.md - Ink linework with watercolor washes.
editorial-conceptual.md - Conceptual, sophisticated, editorial wit.

Step 4: Generate via API

Running the Script

```bash

# Generate 2K thumbnail (recommended for web)

python scripts/images/generate_image.py "prompt here" --model pro --aspect 16:9 --size 2K

# Generate 4K for print/high-res

python scripts/images/generate_image.py "prompt here" --model pro --size 4K

# Save to specific folder

python scripts/images/generate_image.py "prompt" --output "./images" --name "my_image" --size 2K

```

Options:

--model pro (default, higher quality) or --model flash (faster)
--size 2K (default), 1K, or 4K - always use 2K or 4K for final assets
--aspect 16:9 (default), 1:1, 9:16, 3:4, 4:3
--variations N - generate N versions
--output ./path - save location
--name prefix - filename prefix

Output location: Save images alongside the content they belong to - not a generic images dump.

Step 5: Iterate

After user reviews generated images:

80% good? Request specific edits conversationally rather than regenerating
Composition off? Adjust framing or element placement in prompt
Wrong style? Try a different style reference
Too busy? Simplify to fewer elements
Colors wrong? Be more explicit about palette

Prompting Principles

Write Like a Creative Director

Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.

| Don't | Do |

|-------|-----|

| "Cool car, neon, city, night, 8k" | "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis." |

Be specific about:

Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage chanel-style suit"
Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
Setting: Define location, time of day, weather
Lighting: Specify mood and light source
Mood: Emotional tone of the image

Provide Context

Context helps the model make logical artistic decisions. Include the "why" or "for whom."

Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook."

(Model infers: professional plating, shallow depth of field, perfect lighting)

Keep It Simple

One clear focal point
Maximum 2-3 elements total
Generous negative space
If it feels busy, remove something

Avoid the Generic

Hard rules:

NEVER include text in images - Text renders poorly and looks amateur
No lightbulbs for "ideas"
No handshakes for "partnership"
No audio waveforms for "podcasts" or "audio"
No gears/cogs for "systems" or "process"
No happy stock photo poses
No glossy AI aesthetic
No puzzle pieces for "connection" or "integration"

Metaphors That Work

Strong concepts from past generations:

Orange cut open revealing neural network - organic exterior, systematic interior (knowledge systems, RAG)
Mail slot with origami bird emerging - delivery transforms into something alive (newsletters)
Exercise club crossed with quill pen - physical culture meets intellectual work (body + mind)
Typewriter key extreme close-up with radiating impact - decisive action, command (books, authority)
Medicine dropper with mandala bloom - clinical precision meets transcendence (therapy, transformation)
Straight razor on leather strop - precision tool, polishing, refinement (editing, production)
Prism dispersing light into distinct beams - one input, multiple outputs (frameworks)
Compass with coffee stain on map - direction from daily rituals (productivity)

Resources

references/styles/

Aesthetic style definitions:

risograph.md - DEFAULT - Halftone, misregistration, indie printmaking
minimalist-ink.md - Black and white ink illustration
watercolor-line.md - Ink with watercolor washes
editorial-conceptual.md - Conceptual editorial style

scripts/

generate_image.py - Gemini API image generation

Prompt Modifiers Reference

| Category | Examples |

|----------|----------|

| Lighting | golden hour, dramatic shadows, soft diffused light, neon glow, overcast |

| Style | cinematic, editorial, technical diagram, hand-drawn, photorealistic |

| Texture | matte finish, brushed steel, soft velvet, crumpled paper, weathered wood |

| Composition | wide shot, close-up, bird's eye view, dutch angle, symmetrical |

| Mood | energetic, serene, dramatic, playful, sophisticated |

| Quality | 4K, high-fidelity, pixel-perfect, professional grade |

Advanced Capabilities

Text Rendering & Infographics

Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."

Example prompts:

```

Earnings Report Infographic:

"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."

```

Whiteboard Summary:

"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."

```

Character Consistency & Thumbnails

Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.

Example prompt:

```

Viral Thumbnail:

"Design a viral video thumbnail using the person from Image 1.

Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.

Action: Pose the person on the left side, pointing their finger towards the right side of the frame.

Subject: On the right side, place a high-quality image of a delicious avocado toast.

Graphics: Add a bold yellow arrow connecting the person's finger to the toast.

Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.

Background: A blurred, bright kitchen background. High saturation and contrast."

```

Image Reworking (Edit Existing Images)

The --input flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.

Key use cases:

Small tweaks - Adjust colors, add/remove elements, change lighting
Style transfer - Keep composition but change artistic style
Object manipulation - Remove, add, or modify specific objects
Seasonal/temporal changes - Same scene, different time/season

Running in rework mode:

```bash

# Basic edit - add something

python scripts/generate_image.py "Add snow to the roof and yard" \

--input ./house.png \

--model pro

# Color adjustment

python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \

--input ./thumbnail.png \

--model pro

# Style transfer - keep composition, change aesthetic

python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \

--input ./photo.png \

--model pro

# Generate variations of an edit

python scripts/generate_image.py "Make the lighting warmer, like golden hour" \

--input ./portrait.png \

--variations 3 \

--model pro

```

Prompting tips for rework mode:

Be specific about what to preserve:

- "Keep the person's facial features exactly the same"

- "Maintain the composition and framing"

- "Don't change the background"

Be explicit about what to change:

- "Change ONLY the color of the shirt from blue to red"

- "Add snow to the roof and nothing else"

- "Remove the text overlay"

Use comparative language:

- "Make the colors more vibrant"

- "Increase the contrast slightly"

- "Make the lighting softer and more diffused"

Output naming: Files from rework mode are named {prefix}_{timestamp}_edit_{model}.png to distinguish from generated images (_gen_).

Advanced Editing Examples

Object Removal:

```bash

python scripts/generate_image.py \

"Remove the tourists from the background and fill with matching cobblestones and storefronts" \

--input ./street-photo.png \

--model pro

```

Seasonal Control:

```bash

python scripts/generate_image.py \

"Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \

--input ./house-summer.png \

--model pro

```

Character Consistency (thumbnail series):

```bash

python scripts/generate_image.py \

"Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \

--input ./person-reference.png \

--model pro

```

---

Related Skills

youtube-title-creator - Pair generated images with optimized titles
social-content-creation - Use images in platform-optimized posts

---

For custom brand styles, create new style files in references/styles/ following the existing format

More from this repository10

🎯

video-caption-creation🎯Skill

Generates optimized captions and hooks for short-form videos, maximizing algorithm reach and audience engagement using the Triple Word Score system.

🎯

anti-ai-writing🎯Skill

Transforms AI-generated drafts into authentic, human-sounding content by eliminating robotic patterns and applying natural writing techniques.

🎯

ghostwriter🎯Skill

I apologize, but I cannot generate a description without seeing the actual details of the "ghostwriter" skill. Could you provide more context about what the skill does, its purpose, or share some i...

🎯

amazon-category-research🎯Skill

Helps authors research and select low-competition, high-visibility Amazon book categories to maximize publishing success.

🎯

podcast-blog-post-creator🎯Skill

Transforms podcast episodes into narrative-driven, SEO-optimized blog posts that capture expert insights and conversational energy.

🎯

hook-and-headline-writing🎯Skill

Generates high-converting hooks and headlines using 15 proven formulas, systematic frameworks, and the 4 U's test to create compelling content across platforms.

🎯

cold-open-creator🎯Skill

Generates compelling 25-35 second podcast cold opens by extracting story arcs from transcripts, strategically cutting at peak narrative tension to hook listeners.

🎯

youtube-title-creator🎯Skill

Generates creative and engaging YouTube video titles optimized for attracting viewer attention and improving search rankings.

🎯

podcast-production🎯Skill

Transforms raw podcast transcripts into multi-platform content assets through strategic storytelling and targeted marketing checkpoints.

🎯

voice-analyzer🎯Skill

Extracts and codifies distinctive writing voice patterns from 3-5 writing samples to create a reusable, authentic style guide.