pdf

Skill from autumnsgrove/claudeskills

What it does

Extracts, manipulates, and generates PDFs with advanced capabilities like text extraction, form filling, merging, and annotations.


Added Feb 4, 2026

Skill Details

SKILL.md

"Comprehensive PDF manipulation, extraction, and generation with support for text extraction, form filling, merging, splitting, annotations, and creation. Use when working with .pdf files for: (1) Extracting text and tables, (2) Filling PDF forms, (3) Merging/splitting PDFs, (4) Creating PDFs programmatically, (5) Adding watermarks/annotations, (6) PDF metadata management"

Overview

# PDF Manipulation Skill

A guide to working with PDF files in Python: extraction, manipulation, creation, and advanced operations. It uses progressive disclosure: this page covers the common workflows, and the linked reference files hold the details.

Core Capabilities

Extract and manipulate PDF content:

Extract text with layout preservation
Extract tables and parse structured data
Fill PDF forms programmatically
Merge multiple PDFs into a single document
Split PDFs by pages or ranges
Create PDFs from scratch with text, images, and graphics
Add watermarks and annotations
Extract and modify metadata (author, title, keywords)
Add password protection and encryption
Perform OCR on scanned documents
Convert images to PDF
Compress and optimize PDF files
Extract images from PDFs
Rotate and reorder pages

Quick Start

Install required libraries:

```bash
pip install pypdf pdfplumber reportlab PyMuPDF pdf2image pytesseract pillow
```

For detailed installation instructions including system dependencies, see:

[Library Installation Guide](./references/library-installation.md)

Python Libraries Overview

pypdf: Basic operations (merge, split, rotate, metadata)

pdfplumber: Advanced text/table extraction with layout awareness

reportlab: Create PDFs from scratch (reports, invoices, documents)

PyMuPDF (fitz): Advanced manipulation, annotations, compression

pdf2image: Convert PDF pages to images (requires poppler)

pytesseract: OCR for scanned documents (requires tesseract)

Text Extraction Workflow

Basic Extraction

```python
from pypdf import PdfReader

reader = PdfReader("document.pdf")
for page in reader.pages:
    text = page.extract_text()
    print(text)
```

Layout-Aware Extraction

```python
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text()
        words = page.extract_words()  # With positioning
        print(text)
```

Extract from Specific Region

```python
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    page = pdf.pages[0]
    # (x0, top, x1, bottom) in points, measured from the page's top-left corner
    bbox = (0, 0, 612, 100)
    header = page.crop(bbox).extract_text()
```

For detailed text extraction methods including OCR fallback and encoding handling, see:

[Text Extraction Reference](./references/text-extraction.md)

Table Extraction Workflow

Extract All Tables

```python
import pdfplumber

with pdfplumber.open("report.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            print(table)
```

Advanced Table Detection

```python
# `page` is a pdfplumber page, e.g. pdf.pages[0]
table_settings = {
    "vertical_strategy": "lines",
    "horizontal_strategy": "lines",
    "snap_tolerance": 3,
}
tables = page.extract_tables(table_settings=table_settings)
```

For detailed table extraction strategies and data cleaning, see:

[Table Extraction Reference](./references/table-extraction.md)
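
As a minimal sketch of the data-cleaning step mentioned above: `extract_tables()` returns each table as a list of rows whose cells are strings or `None`. The helper below turns such a table into a list of dicts. `table_to_records` is a hypothetical name, not part of pdfplumber, and treating the first row as the header is an assumption that will not hold for every PDF.

```python
def table_to_records(table):
    """Convert a pdfplumber-style table (list of rows) into a list of dicts.

    Assumes the first row is the header. Cells may be None or contain
    stray whitespace and line breaks.
    """
    def clean(cell):
        # Treat missing cells as empty strings and collapse internal whitespace.
        return " ".join(cell.split()) if cell else ""

    if not table:
        return []
    header = [clean(cell) for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [clean(cell) for cell in row]
        if not any(cells):
            continue  # skip rows that are entirely empty
        records.append(dict(zip(header, cells)))
    return records


# Illustrative data shaped like one entry of page.extract_tables()
sample = [
    ["Product", "Quantity", "Price"],
    ["Widget A", "10", "$50"],
    [None, None, None],
    ["Widget\nB", " 5 ", None],
]
print(table_to_records(sample))
```

Keeping the cleaning separate from extraction makes it easy to test against hand-written rows before pointing it at a real document.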

PDF Form Operations

Fill Form Fields

```python
import fitz

doc = fitz.open("form.pdf")
for page in doc:
    for widget in page.widgets():
        if widget.field_name == "name":
            widget.field_value = "John Doe"
            widget.update()
doc.save("filled.pdf")
doc.close()
```

Extract Form Field Names

```python
import fitz

doc = fitz.open("form.pdf")
for page in doc:
    for widget in page.widgets():
        print(f"{widget.field_name}: {widget.field_type_string}")
doc.close()
```

For form filling, flattening, and debugging, see:

[PDF Operations Reference](./references/pdf-operations.md)

Merging and Splitting

Merge PDFs

```python
from pypdf import PdfWriter

# PdfMerger was removed in pypdf 5.0; PdfWriter.append() covers merging.
writer = PdfWriter()
for pdf in ["file1.pdf", "file2.pdf", "file3.pdf"]:
    writer.append(pdf)
writer.write("merged.pdf")
writer.close()
```

Merge with Page Ranges

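```python
from pypdf import PdfWriter

# PdfWriter.append() replaces PdfMerger, which was removed in pypdf 5.0.
writer = PdfWriter()
writer.append("doc1.pdf", pages=(0, 3))  # First 3 pages (start, stop)
writer.append("doc2.pdf")  # All pages
writer.write("compiled.pdf")
writer.close()
```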

Split into Individual Pages

```python
from pypdf import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i+1}.pdf", 'wb') as f:
        writer.write(f)
```

For merging with bookmarks and splitting by size, see:

[PDF Operations Reference](./references/pdf-operations.md)
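
The capability list mentions splitting by ranges, while the example above only splits into single pages. The following is a rough sketch of one way to do it, not the skill's own helper: `parse_page_ranges` and `split_by_ranges` are hypothetical names, and the `"1-3,5"` spec format (1-based, inclusive) is an assumption chosen for illustration.

```python
def parse_page_ranges(spec, total_pages):
    """Turn a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first
        if first < 1 or last > total_pages or first > last:
            raise ValueError(f"Invalid page range: {part!r}")
        indices.extend(range(first - 1, last))
    return indices


def split_by_ranges(src, dest, spec):
    """Write the pages selected by `spec` from `src` into `dest`."""
    # Imported here so the parser above can be used and tested without pypdf.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    for index in parse_page_ranges(spec, len(reader.pages)):
        writer.add_page(reader.pages[index])
    with open(dest, "wb") as f:
        writer.write(f)


print(parse_page_ranges("1-3,5", total_pages=6))
```

A call such as `split_by_ranges("document.pdf", "excerpt.pdf", "1-3,5")` would then write pages 1, 2, 3, and 5 to a new file.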

Creating PDFs

Simple Text PDF

```python
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

c = canvas.Canvas("output.pdf", pagesize=letter)
c.setFont("Helvetica", 12)
c.drawString(50, 750, "Hello, World!")
c.save()
```

Styled Report

```python
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet

doc = SimpleDocTemplate("report.pdf")
styles = getSampleStyleSheet()
story = []
story.append(Paragraph("Report Title", styles['Title']))
story.append(Spacer(1, 12))
story.append(Paragraph("Content here", styles['BodyText']))
doc.build(story)
```

PDF with Table

```python
from reportlab.platypus import Table, TableStyle
from reportlab.lib import colors

data = [
    ['Product', 'Quantity', 'Price'],
    ['Widget A', '10', '$50'],
    ['Widget B', '5', '$75'],
]
table = Table(data)
table.setStyle(TableStyle([
    ('BACKGROUND', (0, 0), (-1, 0), colors.grey),  # header row
    ('GRID', (0, 0), (-1, -1), 1, colors.black),   # all cells
]))
# Add `table` to a story list and pass it to SimpleDocTemplate.build() to render it.
```

For complete PDF creation workflows including images, multi-column layouts, and custom fonts, see:

[PDF Creation Reference](./references/pdf-creation.md)

For practical examples:

[Invoice Generator](./examples/invoice-generator.md)
[Report Automation](./examples/report-automation.md)

Metadata and Security

Extract Metadata

```python
from pypdf import PdfReader

reader = PdfReader("document.pdf")
metadata = reader.metadata  # None if the file has no document info
if metadata:
    print(f"Title: {metadata.get('/Title')}")
    print(f"Author: {metadata.get('/Author')}")
```

Modify Metadata

```python
from pypdf import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
writer = PdfWriter()
for page in reader.pages:
    writer.add_page(page)
writer.add_metadata({
    '/Title': 'New Title',
    '/Author': 'John Doe'
})
with open("updated.pdf", 'wb') as f:
    writer.write(f)
```

Add Password Protection

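```python
# `writer` is a PdfWriter that already holds the pages to protect.
# AES algorithms need an optional crypto backend (pip install "pypdf[crypto]").
writer.encrypt(
    user_password="user123",
    owner_password="owner456",
    algorithm="AES-256"
)
```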

For detailed security operations and comprehensive metadata management, see:

[Metadata, Security, and OCR Reference](./references/metadata-security-ocr.md)

OCR for Scanned Documents

Basic OCR

```python
from pdf2image import convert_from_path
import pytesseract

images = convert_from_path("scanned.pdf")
for i, image in enumerate(images):
    text = pytesseract.image_to_string(image)
    print(f"Page {i+1}:\n{text}")
```

Multi-Language OCR

```python
text = pytesseract.image_to_string(image, lang='eng+fra+deu')
```

For searchable PDF creation and OCR preprocessing, see:

[Metadata, Security, and OCR Reference](./references/metadata-security-ocr.md)

Watermarks and Annotations

Add Text Watermark

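```python
import fitz

doc = fitz.open("document.pdf")
for page in doc:
    center = fitz.Point(page.rect.width / 2, page.rect.height / 2)
    page.insert_text(
        center,
        "CONFIDENTIAL",
        fontsize=50,
        color=(0.7, 0.7, 0.7),
        fill_opacity=0.3,  # insert_text has no `opacity` argument
        # `rotate` only accepts multiples of 90, so rotate 45° with morph
        morph=(center, fitz.Matrix(45)),
    )
doc.save("watermarked.pdf")
doc.close()
```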

Add Annotations

```python
page.add_highlight_annot(rect)  # Highlight
page.add_text_annot(point, "Note")  # Text note
page.add_underline_annot(rect)  # Underline
```

For stamps and image watermarks, see:

[Metadata, Security, and OCR Reference](./references/metadata-security-ocr.md)

Page Operations

Rotate Pages

```python
from pypdf import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
writer = PdfWriter()
for page in reader.pages:
    page.rotate(90)
    writer.add_page(page)
with open("rotated.pdf", 'wb') as f:
    writer.write(f)
```

Extract Images

```python
import fitz

doc = fitz.open("document.pdf")
for page_num in range(len(doc)):
    page = doc[page_num]
    for img_index, img in enumerate(page.get_images()):
        xref = img[0]
        base_image = doc.extract_image(xref)
        # Embedded images are not always PNG; use the extension PyMuPDF reports.
        ext = base_image["ext"]
        with open(f"image_{page_num}_{img_index}.{ext}", "wb") as f:
            f.write(base_image["image"])
doc.close()
```

Convert Images to PDF

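```python
from PIL import Image
from reportlab.pdfgen import canvas

c = canvas.Canvas("output.pdf")
for img_path in ["img1.jpg", "img2.jpg"]:
    # Read the pixel dimensions, then close the file handle.
    with Image.open(img_path) as img:
        width, height = img.size
    # One page per image, sized so that 1 pixel maps to 1 point.
    c.setPageSize((width, height))
    c.drawImage(img_path, 0, 0, width=width, height=height)
    c.showPage()
c.save()
```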

For detailed page operations, see:

[PDF Operations Reference](./references/pdf-operations.md)

Optimization

Compress PDF

```python
import fitz

doc = fitz.open("large.pdf")
doc.save(
    "optimized.pdf",
    garbage=4,
    deflate=True,
    clean=True
)
doc.close()
```

Best Practices

Memory Management

Process large PDFs in chunks:

```python
from pypdf import PdfReader
import gc

reader = PdfReader("large.pdf")
for i, page in enumerate(reader.pages):
    text = page.extract_text()
    # Process text
    if i % 10 == 0:
        gc.collect()
```
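
One way to make the chunking explicit is to iterate over fixed-size page windows, so that only one batch of extracted text is held at a time. This is a sketch under assumptions: `page_batches` is a hypothetical helper, and the batch size of 10 simply mirrors the interval used above.

```python
def page_batches(total_pages, batch_size=10):
    """Yield (start, stop) index pairs that cover range(total_pages) in batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, total_pages, batch_size):
        yield start, min(start + batch_size, total_pages)


# Hypothetical usage with pypdf (not executed here):
#   reader = PdfReader("large.pdf")
#   for start, stop in page_batches(len(reader.pages)):
#       texts = [reader.pages[i].extract_text() for i in range(start, stop)]
#       ...  # process or persist `texts` before moving to the next batch

print(list(page_batches(25, 10)))  # → [(0, 10), (10, 20), (20, 25)]
```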

Error Handling

Always handle encryption and errors:

```python
from pypdf import PdfReader

password = "secret"  # placeholder: supply the real password

try:
    reader = PdfReader("document.pdf")
    if reader.is_encrypted:
        reader.decrypt(password)
    for page in reader.pages:
        text = page.extract_text()
except Exception as e:
    print(f"Error: {e}")
```

OCR Fallback

Detect and handle scanned documents:

```python
import fitz

doc = fitz.open("document.pdf")
text = doc[0].get_text()
if not text.strip():
    # Use OCR for scanned document
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path("document.pdf")
    text = pytesseract.image_to_string(images[0])
```

For comprehensive best practices, common pitfalls, and troubleshooting, see:

[Best Practices and Common Pitfalls](./references/best-practices.md)

Common Pitfalls

Scanned Documents: Text extraction returns empty for scanned PDFs. Use OCR (pytesseract).

Table Detection: Tables not detected correctly. Adjust table_settings strategies.

Encrypted PDFs: Operations fail. Check and decrypt with password first.

Form Fields: Can't find field names. Use debug helper to list all fields.

Memory Issues: Large PDFs cause crashes. Process in chunks with garbage collection.

Encoding Issues: Special characters corrupted. Handle with UTF-8 encoding explicitly.

For detailed solutions and debugging strategies, see:

[Best Practices and Common Pitfalls](./references/best-practices.md)
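
For the encoding pitfall, a minimal sketch of what "explicit UTF-8" can look like when saving extracted text. The Unicode normalization step is an addition for illustration rather than something the skill prescribes, and `normalize_extracted_text` and `save_text` are hypothetical names.

```python
import unicodedata


def normalize_extracted_text(text):
    """Normalize extracted PDF text before saving or comparing it."""
    # NFKC folds compatibility characters: the U+FB01 ligature becomes "fi"
    # and the no-break space U+00A0 becomes a regular space.
    text = unicodedata.normalize("NFKC", text)
    # Drop soft hyphens (U+00AD), which are invisible but break searches.
    return text.replace("\u00ad", "")


def save_text(text, path):
    """Write text as UTF-8 instead of relying on the platform default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(normalize_extracted_text(text))


print(normalize_extracted_text("\ufb01le\u00a0na\u00adme"))
```

NFKC is lossy by design, so keep the raw extraction as well if the exact original characters matter.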

Quick Reference

Text Extraction:

Simple: pypdf - page.extract_text()
Advanced: pdfplumber - page.extract_text() + page.extract_words()

Table Extraction:

Always use: pdfplumber - page.extract_tables()

PDF Creation:

Use: reportlab - canvas.Canvas() or SimpleDocTemplate()

Advanced Operations:

Use: PyMuPDF (fitz) - forms, annotations, compression

OCR:

Use: pytesseract + pdf2image

Merging/Splitting:

Use: pypdf - PdfWriter() (append() to merge, add_page() to split; PdfMerger was removed in pypdf 5.0)

Helper Scripts

The skill includes helper scripts for common operations:

```bash
# See scripts directory for utilities
python scripts/pdf_helper.py --help
```

Additional Resources

Comprehensive References:

[Library Installation](./references/library-installation.md) - Setup and dependencies
[Text Extraction](./references/text-extraction.md) - All extraction methods
[Table Extraction](./references/table-extraction.md) - Table detection strategies
[PDF Operations](./references/pdf-operations.md) - Forms, merge, split, pages
[PDF Creation](./references/pdf-creation.md) - Creating PDFs from scratch
[Metadata, Security, OCR](./references/metadata-security-ocr.md) - Advanced operations
[Best Practices](./references/best-practices.md) - Pitfalls and solutions

Practical Examples:

[Invoice Generator](./examples/invoice-generator.md) - Professional invoice templates
[Report Automation](./examples/report-automation.md) - Automated report generation

Implementation Guidelines

When working with PDFs:

Choose the right library for your task (see Quick Reference)
Handle errors with try-except blocks
Check for encryption before processing
Use OCR fallback for scanned documents
Process large files in chunks to manage memory
Validate input files before operations
Close documents to free resources: doc.close()

For production use, always implement proper error handling, validate inputs, and test with various PDF types and versions.
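
The "validate input files" guideline can be sketched as a cheap pre-check that runs before a file reaches any PDF library. `looks_like_pdf` is a hypothetical helper, and scanning the first 1024 bytes for the signature is an assumption made to tolerate files with a few stray bytes before the header. It does not prove the file is a well-formed PDF.

```python
import tempfile
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file exists, is non-empty, and has a %PDF- signature near its start."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    with p.open("rb") as f:
        head = f.read(1024)
    return b"%PDF-" in head


# Small demonstration with throwaway files
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.pdf"
    good.write_bytes(b"%PDF-1.7\n%%EOF\n")
    bad = Path(tmp) / "bad.pdf"
    bad.write_bytes(b"not a pdf")
    missing = Path(tmp) / "missing.pdf"
    results = (looks_like_pdf(good), looks_like_pdf(bad), looks_like_pdf(missing))

print(results)
```

A check like this only filters out obvious mistakes such as a wrong path or an HTML error page saved with a `.pdf` extension; parsing errors still need the try-except handling described above.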
