pdf-to-markdown

from duc01226/easyplatform

What it does

pdf-to-markdown skill from duc01226/easyplatform

Installation

Install skill:
npx skills add https://github.com/duc01226/easyplatform --skill pdf-to-markdown
Last Updated: Jan 23, 2026

Skill Details

SKILL.md

Convert PDF files to Markdown. Use when extracting text from PDFs, creating editable documentation from PDF reports, or converting PDF content to version-controlled markdown files.

Overview

# pdf-to-markdown

Convert PDF files to Markdown format.

Installation Required

```bash
cd .claude/skills/pdf-to-markdown
npm install
```

Dependencies: pdf-parse

Quick Start

```bash
# Basic conversion
node .claude/skills/pdf-to-markdown/scripts/convert.cjs \
  --file ./document.pdf

# Custom output path
node .claude/skills/pdf-to-markdown/scripts/convert.cjs \
  --file ./doc.pdf \
  --output ./output/doc.md
```

CLI Options

| Option   | Required | Description                                      |
| -------- | -------- | ------------------------------------------------ |
| --file   | Yes      | Input PDF file                                   |
| --output | No       | Output Markdown path (default: input name + .md) |
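
The table describes the default output as "input name + .md", which leaves open whether the `.pdf` extension is replaced or kept. The sketch below shows one plausible reading, assuming the extension is swapped. `defaultOutputPath` is a hypothetical helper, not a function taken from convert.cjs, so check the script's actual behavior before relying on it.

```javascript
const path = require("node:path");

// Hypothetical helper: derive a Markdown path next to the input PDF.
// Assumption: the ".pdf" extension is replaced by ".md"
// (the real script may append ".md" instead).
function defaultOutputPath(inputPath) {
  const { dir, name } = path.parse(inputPath);
  return path.join(dir, `${name}.md`);
}

console.log(defaultOutputPath("reports/q3-summary.pdf"));
```

Passing `--output` explicitly avoids depending on either interpretation.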

Output Format (JSON)

```json
{
  "success": true,
  "input": "/path/to/input.pdf",
  "output": "/path/to/output.md",
  "wordCount": 1523,
  "warnings": ["Tables may not be accurately converted"]
}
```
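
A caller can act on this report programmatically. The sketch below assumes the script prints a single JSON object shaped like the example above. `summarizeResult` is a hypothetical helper, the sample string stands in for real script output, and the failure branch is a guess because the source does not document what a failed run reports.

```javascript
// Hypothetical helper: turn the converter's JSON report into a one-line summary.
// Assumption: the report carries the fields shown in the documented example.
function summarizeResult(rawJson) {
  const result = JSON.parse(rawJson);
  if (!result.success) {
    // The shape of a failure report is not documented; this is a guess.
    return `Conversion failed for ${result.input}`;
  }
  const warnings = Array.isArray(result.warnings) ? result.warnings : [];
  const suffix = warnings.length > 0 ? ` (${warnings.length} warning(s))` : "";
  return `Wrote ${result.wordCount} words to ${result.output}${suffix}`;
}

// Sample report copied from the documented example, standing in for real stdout.
const sample = JSON.stringify({
  success: true,
  input: "/path/to/input.pdf",
  output: "/path/to/output.md",
  wordCount: 1523,
  warnings: ["Tables may not be accurately converted"],
});

console.log(summarizeResult(sample));
```

Surfacing the warning count is useful because a run can report `success: true` while still flagging content, such as tables, that needs manual review.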

Supported Elements

  • Text extraction from digital PDFs
  • Headings (detected by font size heuristics)
  • Paragraphs
  • Basic lists
  • Links (when embedded in PDF)
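
The source does not spell out the font-size heuristic used for headings. As an illustration only, the sketch below treats the most common font size as body text and maps larger sizes to heading levels. The `{ text, fontSize }` line shape, the 1.2 ratio threshold, and `toMarkdownLines` are all assumptions, not the skill's actual logic or the pdf-parse API.

```javascript
// Hypothetical sketch of heading detection by font size.
// Input: lines as { text, fontSize }; this shape is assumed for illustration.
function toMarkdownLines(lines) {
  // Count how often each font size occurs; the most frequent is body text.
  const counts = new Map();
  for (const { fontSize } of lines) {
    counts.set(fontSize, (counts.get(fontSize) || 0) + 1);
  }
  let bodySize = 0;
  let bestCount = 0;
  for (const [size, count] of counts) {
    if (count > bestCount) {
      bodySize = size;
      bestCount = count;
    }
  }

  // Sizes at least 20% larger than body text are heading candidates,
  // ranked largest first: largest -> "#", next -> "##", capped at "###".
  const headingSizes = [...counts.keys()]
    .filter((size) => size >= bodySize * 1.2)
    .sort((a, b) => b - a);

  return lines.map(({ text, fontSize }) => {
    const rank = headingSizes.indexOf(fontSize);
    if (rank === -1) return text;
    return `${"#".repeat(Math.min(rank + 1, 3))} ${text}`;
  });
}

const demo = toMarkdownLines([
  { text: "Annual Report", fontSize: 24 },
  { text: "Summary", fontSize: 16 },
  { text: "Revenue grew this year.", fontSize: 11 },
  { text: "Costs stayed flat.", fontSize: 11 },
]);
console.log(demo.join("\n"));
```

A heuristic of this kind can mislabel large pull quotes or captions as headings, so heading levels in the output are worth a quick review.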

Known Limitations

  • Tables: Very limited support; may not render correctly
  • Multi-column layouts: Text may interleave between columns
  • Scanned PDFs: NOT supported (requires OCR - see alternatives below)
  • Images: NOT extracted (PDF images are not included in output)
  • Complex formatting: May be simplified or lost
  • Password-protected PDFs: NOT supported

Alternatives for Unsupported Cases

For scanned PDFs (OCR needed):

  • Use scribe.js-ocr library (AGPL license)
  • Commercial OCR services (Google Cloud Vision, AWS Textract)

For complex tables:

  • Consider AI-based extraction (LLM post-processing)
  • Manual review and correction

For image extraction:

  • Use unpdf library with sharp for image extraction
  • Process images separately and reference in markdown

Troubleshooting

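Dependencies not found: Run npm install in the skill directory (.claude/skills/pdf-to-markdown).

Empty output: The PDF may be scanned or image-based, which requires OCR.

Garbled text: The PDF may use embedded fonts that the parser does not support.

Memory issues: Large PDFs may exceed Node's default heap size; pass the --max-old-space-size=4096 flag to node when running the script.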
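
For the memory case, the heap flag is a Node.js option, so it has to appear before the script path rather than among the script's own options. The sketch below builds such an argument list. `buildConvertArgs` is a hypothetical helper, and the commented-out `execFileSync` call assumes the skill is installed at the path used in Quick Start.

```javascript
// Hypothetical helper: build the argument list for running convert.cjs
// with a larger V8 heap. The heap flag must precede the script path.
function buildConvertArgs(file, { output, maxOldSpaceMb } = {}) {
  const args = [];
  if (maxOldSpaceMb) args.push(`--max-old-space-size=${maxOldSpaceMb}`);
  args.push(".claude/skills/pdf-to-markdown/scripts/convert.cjs", "--file", file);
  if (output) args.push("--output", output);
  return args;
}

const args = buildConvertArgs("./large.pdf", { maxOldSpaceMb: 4096 });
console.log(["node", ...args].join(" "));

// To actually run the conversion (requires the skill to be installed):
// const { execFileSync } = require("node:child_process");
// const stdout = execFileSync("node", args, { encoding: "utf8" });
```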

IMPORTANT Task Planning Notes

  • Always plan the work and break it into many small todo tasks
  • Always add a final review todo task that checks the completed work for any fixes or enhancements needed
