🎯

youtube-downloader

🎯Skill

from cdeistopened/skill-stack

VibeIndex|
What it does

Downloads YouTube video transcripts by extracting subtitles/captions from a provided YouTube URL, supporting both manual and auto-generated transcripts.

πŸ“¦

Part of

cdeistopened/skill-stack(25 items)

youtube-downloader

Installation

pip installInstall Python package
pip3 install yt-dlp
PythonRun Python server
python3 -m pip install yt-dlp
πŸ“– Extracted from docs: cdeistopened/skill-stack
1Installs
-
AddedFeb 4, 2026

Skill Details

SKILL.md

Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.

Overview

# YouTube Transcript Downloader

This skill helps download transcripts (subtitles/captions) from YouTube videos using yt-dlp.

When to Use This Skill

Activate this skill when the user:

  • Provides a YouTube URL and wants the transcript
  • Asks to "download transcript from YouTube"
  • Wants to "get captions" or "get subtitles" from a video
  • Asks to "transcribe a YouTube video"
  • Needs text content from a YouTube video

How It Works

Default Workflow:

  1. Check if yt-dlp is installed - install if needed
  2. Try manual subtitles first (--write-sub) - highest quality, human-created
  3. Fallback to auto-generated (--write-auto-sub) - usually available
  4. Convert to plain text - deduplicate and clean up VTT format
  5. Save to output directory as a markdown file with filename based on video title
  6. Automatically polish - remove filler words, fix grammar, add section headers, maintain 100% fidelity
  7. Confirm the download and show the user where the file is saved

Installation Check

IMPORTANT: Always check if yt-dlp is installed first:

```bash

which yt-dlp || command -v yt-dlp

```

If Not Installed

Attempt automatic installation based on the system:

macOS (Homebrew):

```bash

brew install yt-dlp

```

Linux (apt/Debian/Ubuntu):

```bash

sudo apt update && sudo apt install -y yt-dlp

```

Alternative (pip - works on all systems):

```bash

pip3 install yt-dlp

# or

python3 -m pip install yt-dlp

```

If installation fails: Inform the user they need to install yt-dlp manually and provide them with installation instructions from https://github.com/yt-dlp/yt-dlp#installation

Check Available Subtitles

ALWAYS do this first before attempting to download:

```bash

yt-dlp --list-subs "YOUTUBE_URL"

```

This shows what subtitle types are available without downloading anything. Look for:

  • Manual subtitles (better quality)
  • Auto-generated subtitles (usually available)
  • Available languages

Download Strategy

Option 1: Manual Subtitles (Preferred)

Try this first - highest quality, human-created:

```bash

yt-dlp --write-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"

```

Option 2: Auto-Generated Subtitles (Fallback)

If manual subtitles aren't available:

```bash

yt-dlp --write-auto-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"

```

Both commands create a .vtt file (WebVTT subtitle format).

Getting Video Information

Extract Video Title (for filename)

```bash

yt-dlp --print "%(title)s" "YOUTUBE_URL"

```

Use this to create meaningful filenames based on the video title. Clean the title for filesystem compatibility:

  • Replace / with -
  • Replace special characters that might cause issues
  • Consider using sanitized version: $(yt-dlp --print "%(title)s" "URL" | tr '/' '-' | tr ':' '-')

Post-Processing

Convert to Plain Text (Recommended)

YouTube's auto-generated VTT files contain duplicate lines because captions are shown progressively with overlapping timestamps. Always deduplicate when converting to plain text while preserving the original speaking order.

```bash

python3 -c "

import sys, re

seen = set()

with open('transcript.en.vtt', 'r') as f:

for line in f:

line = line.strip()

if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:

clean = re.sub('<[^>]*>', '', line)

clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')

if clean and clean not in seen:

print(clean)

seen.add(clean)

" > transcript.txt

```

Complete Post-Processing with Video Title

```bash

# Get video title

VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')

# Find the VTT file

VTT_FILE=$(ls *.vtt | head -n 1)

# Convert with deduplication

python3 -c "

import sys, re

seen = set()

with open('$VTT_FILE', 'r') as f:

for line in f:

line = line.strip()

if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:

clean = re.sub('<[^>]*>', '', line)

clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')

if clean and clean not in seen:

print(clean)

seen.add(clean)

" > "${VIDEO_TITLE}.txt"

echo "Saved to: ${VIDEO_TITLE}.txt"

# Clean up VTT file

rm "$VTT_FILE"

echo "Cleaned up temporary VTT file"

```

Output Formats

  • VTT format (.vtt): Includes timestamps and formatting, good for video players
  • Plain text (.txt): Just the text content, good for reading or analysis

Tips

  • The filename will be {output_name}.{language_code}.vtt (e.g., transcript.en.vtt)
  • Most YouTube videos have auto-generated English subtitles
  • Some videos may have multiple language options
  • If auto-subtitles aren't available, try --write-sub instead for manual subtitles

Complete Workflow Example

```bash

VIDEO_URL="https://www.youtube.com/watch?v=VIDEO_ID"

OUTPUT_DIR="transcripts"

# Create output directory if it doesn't exist

mkdir -p "$OUTPUT_DIR"

# Get video title for filename

VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')

OUTPUT_NAME="$OUTPUT_DIR/transcript_temp"

# ============================================

# STEP 1: Check if yt-dlp is installed

# ============================================

if ! command -v yt-dlp &> /dev/null; then

echo "yt-dlp not found, attempting to install..."

if command -v brew &> /dev/null; then

brew install yt-dlp

elif command -v apt &> /dev/null; then

sudo apt update && sudo apt install -y yt-dlp

else

pip3 install yt-dlp

fi

fi

# ============================================

# STEP 2: Try manual subtitles first

# ============================================

echo "Downloading subtitles for: $VIDEO_TITLE"

if yt-dlp --write-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then

echo "Manual subtitles downloaded!"

else

# ============================================

# STEP 3: Fallback to auto-generated

# ============================================

echo "Trying auto-generated subtitles..."

if yt-dlp --write-auto-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then

echo "Auto-generated subtitles downloaded!"

else

echo "No subtitles available for this video."

exit 1

fi

fi

# ============================================

# STEP 4: Convert to readable markdown with deduplication

# ============================================

VTT_FILE=$(ls ${OUTPUT_NAME}*.vtt 2>/dev/null | head -n 1)

if [ -f "$VTT_FILE" ]; then

echo "Converting to markdown format..."

python3 -c "

import sys, re

seen = set()

with open('$VTT_FILE', 'r') as f:

for line in f:

line = line.strip()

if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:

clean = re.sub('<[^>]*>', '', line)

clean = clean.replace('&', '&').replace('>', '>').replace('<', '<')

if clean and clean not in seen:

print(clean)

seen.add(clean)

" > "$OUTPUT_DIR/${VIDEO_TITLE}.md"

echo "Saved raw transcript to: $OUTPUT_DIR/${VIDEO_TITLE}.md"

# Clean up temporary VTT file

rm "$VTT_FILE"

else

echo "No VTT file found to convert"

exit 1

fi

# ============================================

# STEP 5: Automatically polish the transcript

# ============================================

echo "Polishing transcript (removing filler, fixing grammar, adding structure)..."

python3 << 'POLISH_EOF'

import re

with open('$OUTPUT_DIR/${VIDEO_TITLE}.md', 'r') as f:

content = f.read()

# Preserve metadata and header

lines = content.split('\n')

metadata = []

content_start = 0

for i, line in enumerate(lines):

if line.startswith('#') or line.startswith('Source:') or line.startswith('---'):

metadata.append(line)

content_start = i + 1

else:

break

transcript_text = '\n'.join(lines[content_start:]).strip()

# Remove filler words and phrases aggressively (maintain 100% meaning)

filler_patterns = [

(r'\b(um|uh|ah|er|hmm)\b', ''),

(r'\byou\s+know\b', ''),

(r',\s+(so|basically|actually)\s+', ', '),

(r'\b(basically|actually|really)\s+', ''),

(r'\b(kind|sort)\s+of\s+', ''),

(r'\bi\s+(think|mean)\s+', ''),

]

polished = transcript_text

for pattern, replacement in filler_patterns:

polished = re.sub(pattern, replacement, polished, flags=re.IGNORECASE)

# Join broken lines while preserving sentence structure

polished = re.sub(r'(?<=[a-z]),\n(?=[a-z])', ', ', polished)

polished = re.sub(r'(?<=[a-z])\n(?![\n#])', ' ', polished)

# Clean up spacing and punctuation

polished = re.sub(r' +', ' ', polished)

polished = re.sub(r'\s+([.!?,;:])', r'\1', polished)

# Reconstruct with metadata

final = '\n'.join(metadata) + '\n\n' + polished.strip()

with open('$OUTPUT_DIR/${VIDEO_TITLE}.md', 'w') as f:

f.write(final)

print("Transcript polished")

POLISH_EOF

echo "Complete!"

```

Notes:

  • Output directory can be customized (default: transcripts/)
  • Files are named based on the video title with special characters sanitized
  • Transcripts are automatically deduplicated to remove caption overlaps
  • Polishing step removes filler words/phrases while maintaining 100% meaning fidelity
  • Grammar and run-on sentences are automatically fixed
  • Paragraph breaks consolidate content into logical sections
  • The temporary VTT file is cleaned up after conversion

Error Handling

Common Issues and Solutions:

1. yt-dlp not installed

  • Attempt automatic installation based on system (Homebrew/apt/pip)
  • If installation fails, provide manual installation link
  • Verify installation before proceeding

2. No subtitles available

  • List available subtitles first to confirm
  • Try both --write-sub (manual) and --write-auto-sub (auto-generated)
  • If neither are available, inform the user that the video has no available subtitles

3. Invalid or private video

  • Check if URL is correct format: https://www.youtube.com/watch?v=VIDEO_ID
  • Some videos may be private, age-restricted, or geo-blocked
  • Inform user of the specific error from yt-dlp

4. Download interrupted or failed

  • Check internet connection
  • Verify sufficient disk space
  • Try again with --no-check-certificate if SSL issues occur

5. Multiple subtitle languages

  • By default, yt-dlp downloads all available languages
  • Can specify with --sub-langs en for English only
  • List available with --list-subs first

Best Practices:

  • Always check what's available before attempting download (--list-subs)
  • Try manual subtitles first (--write-sub), then fall back to auto-generated
  • Convert VTT to plain text format for easy reading
  • Deduplicate text content to remove caption overlaps
  • Provide clear feedback about what's happening at each stage
  • Handle errors gracefully with helpful messages

More from this repository10

🎯
video-caption-creation🎯Skill

Generates optimized captions and hooks for short-form videos, maximizing algorithm reach and audience engagement using the Triple Word Score system.

🎯
anti-ai-writing🎯Skill

Transforms AI-generated drafts into authentic, human-sounding content by eliminating robotic patterns and applying natural writing techniques.

🎯
ghostwriter🎯Skill

I apologize, but I cannot generate a description without seeing the actual details of the "ghostwriter" skill. Could you provide more context about what the skill does, its purpose, or share some i...

🎯
amazon-category-research🎯Skill

Helps authors research and select low-competition, high-visibility Amazon book categories to maximize publishing success.

🎯
podcast-blog-post-creator🎯Skill

Transforms podcast episodes into narrative-driven, SEO-optimized blog posts that capture expert insights and conversational energy.

🎯
hook-and-headline-writing🎯Skill

Generates high-converting hooks and headlines using 15 proven formulas, systematic frameworks, and the 4 U's test to create compelling content across platforms.

🎯
cold-open-creator🎯Skill

Generates compelling 25-35 second podcast cold opens by extracting story arcs from transcripts, strategically cutting at peak narrative tension to hook listeners.

🎯
youtube-title-creator🎯Skill

Generates creative and engaging YouTube video titles optimized for attracting viewer attention and improving search rankings.

🎯
podcast-production🎯Skill

Transforms raw podcast transcripts into multi-platform content assets through strategic storytelling and targeted marketing checkpoints.

🎯
voice-analyzer🎯Skill

Extracts and codifies distinctive writing voice patterns from 3-5 writing samples to create a reusable, authentic style guide.