
videocut-clip-oral

Skill

from zrt-ai-lab/opencode-skills

What it does

Converts oral videos to transcripts, identifies speech errors, and generates review drafts with precise deletion tasks.


Part of

zrt-ai-lab/opencode-skills (16 items)


Installation

No install commands found in docs. Showing default command. Check GitHub for actual instructions.
Quick install with npx:
npx skills add zrt-ai-lab/opencode-skills --skill videocut-clip-oral
2 installs
Added Feb 4, 2026

Skill Details

SKILL.md

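Transcribes talking-head (oral) videos and identifies speech errors. Generates a review draft and a deletion task list. Trigger phrases: 剪口播 (cut talking-head video), 处理视频 (process video), 识别口误 (identify speech errors)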

Overview

# Clip Oral Video (剪口播)

> Transcription + speech-error/silence detection → generate a review draft

Quick Start

```

User: Help me cut this talking-head video

User: Process this video

```

Workflow

```

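  1. Transcribe with FunASR in 30 s segments (character-level timestamps)
  2. Identify speech errors (sentence-by-sentence check)
  3. Identify micro speech errors (VAD detects short fragments)
  4. Identify filler words (嗯/哎/诶, etc.)
  5. Identify silences (≥ 1 s)
  6. Generate the review draft (timestamp-driven)
  7. Output the deletion-task TodoList

[Wait for user confirmation] → after the user confirms, run /videocut:剪辑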

```

⚠️ Why 30 s segments

FunASR timestamps drift on long videos; transcribing in 30 s segments avoids this.
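A minimal sketch of the bookkeeping that 30 s segmentation implies: cut the timeline into fixed windows, transcribe each window separately, then shift each window's local timestamps back onto the global timeline. The helper names `segment_windows` and `to_global` and the token dict shape (`text`, `start`, `end` in seconds) are assumptions made for illustration; no FunASR call is shown, and the skill's actual implementation may differ.

```python
from __future__ import annotations

from typing import Iterator

SEGMENT_SECONDS = 30.0  # segment length named in the text


def segment_windows(duration: float, size: float = SEGMENT_SECONDS) -> Iterator[tuple[float, float]]:
    """Yield (start, end) windows that cover [0, duration] in fixed-size steps."""
    start = 0.0
    while start < duration:
        yield (start, min(start + size, duration))
        start += size


def to_global(tokens: list[dict], offset: float) -> list[dict]:
    """Shift segment-local timestamps by the segment's start time (hypothetical token shape)."""
    return [{**t, "start": t["start"] + offset, "end": t["end"] + offset} for t in tokens]


if __name__ == "__main__":
    for window in segment_windows(70.0):
        print(window)
    # A token found 1.5 s into the second window sits at 31.5 s globally.
    print(to_global([{"text": "好", "start": 1.5, "end": 1.75}], offset=30.0))
```

Because every window restarts at zero, any drift inside a window is bounded by the window length instead of accumulating over the whole video.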

Progress TodoList

Create at startup:

```

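  • [ ] Read "转录最佳实践" (Transcription Best Practices) → transcribe the video
  • [ ] Read "口误识别方法论" (Speech-Error Identification Methodology) → identify speech errors
  • [ ] Detect micro speech errors with VAD (short fragments < 0.5 s)
  • [ ] Scan for filler words (嗯/哎/诶, etc.)
  • [ ] Identify silences (≥ 1 s)
  • [ ] Generate the review draft
  • [ ] Output the deletion task list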

```
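The two thresholds in the checklist (fragments shorter than 0.5 s, silences of at least 1 s) can be illustrated with a small sketch. It assumes some VAD has already produced a sorted list of `(start, end)` speech intervals in seconds; the function name `classify_vad` and that input shape are hypothetical, and the skill's own detection logic may work differently.

```python
from __future__ import annotations

MICRO_MAX = 0.5    # fragments shorter than this are micro-error candidates
SILENCE_MIN = 1.0  # gaps at least this long count as silence


def classify_vad(speech: list[tuple[float, float]]) -> tuple[list, list]:
    """Split sorted VAD speech intervals into micro-fragment and silence candidates."""
    micro = [(s, e) for s, e in speech if e - s < MICRO_MAX]
    silences = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(speech, speech[1:])
        if next_start - prev_end >= SILENCE_MIN
    ]
    return micro, silences


if __name__ == "__main__":
    micro, silences = classify_vad([(0.0, 2.0), (2.25, 2.5), (4.0, 6.0)])
    print("micro fragments:", micro)
    print("silences:", silences)
```

This sketch only looks at gaps between speech intervals, so silence before the first interval or after the last one would need separate handling.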

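⚠️ Always read the methodology before executing

| Stage | Read first | Then do |
|------|------|--------|
| Transcription | tips/转录最佳实践.md | Call the ASR |
| Speech-error identification | tips/口误识别方法论.md | Analyze sentence by sentence |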

---

Core: timestamp-driven

Deletion task format

Every item must carry a precise timestamp (start-end)

```

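Speech errors (N items):

  • [ ] 1. (start-end) delete "erroneous text" → keep "correct text"

Filler words (N items):

  • [ ] 1. (prev_char_end-next_char_start) delete "嗯"  context: XX【嗯】YY

Silences (N items):

  • [ ] 1. (start-end) silence of X s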

```
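The filler-word entry uses a different range from the other two: it runs from the end of the previous character to the start of the next one, so the pause around the filler is removed with it. A minimal sketch of that range computation, assuming character-level tokens shaped as `{"text", "start", "end"}` dicts (a hypothetical shape, as is the `filler_ranges` name):

```python
from __future__ import annotations

FILLERS = {"嗯", "哎", "诶"}  # the fillers named in the text; extend as needed


def filler_ranges(tokens: list[dict]) -> list[tuple[float, float, str]]:
    """Return (start, end, text) deletion ranges for filler characters."""
    ranges = []
    for i, tok in enumerate(tokens):
        if tok["text"] not in FILLERS:
            continue
        # Cut from the end of the previous character to the start of the next;
        # fall back to the filler's own bounds at the edges of the transcript.
        start = tokens[i - 1]["end"] if i > 0 else tok["start"]
        end = tokens[i + 1]["start"] if i + 1 < len(tokens) else tok["end"]
        ranges.append((start, end, tok["text"]))
    return ranges


if __name__ == "__main__":
    tokens = [
        {"text": "我", "start": 0.0, "end": 0.25},
        {"text": "嗯", "start": 0.5, "end": 0.75},
        {"text": "们", "start": 1.0, "end": 1.25},
    ]
    print(filler_ranges(tokens))
```

Two fillers in a row would produce overlapping ranges under this sketch, so a real pipeline would likely merge adjacent ranges before cutting.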

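Speech error types

| Type | Example | Deletion strategy |
|------|------|----------|
| Repetition | 拉满新拉满 | Delete only the difference ("新") |
| Replacement | AI就是AI就会 | Delete the first complete version ("AI就是") |
| Stutter | 听会会 | Delete the first repeated character |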
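The stutter row is the simplest of the three to illustrate. The sketch below flags the first of every pair of identical adjacent characters and reports that character's own time span. The token shape and the `stutter_candidates` name are assumptions, and the results are only candidates: legitimate reduplication (for example 看看 or 谢谢) looks identical at this level and still needs the sentence-by-sentence review.

```python
from __future__ import annotations


def stutter_candidates(tokens: list[dict]) -> list[tuple[float, float, str]]:
    """Return (start, end, text) for the first of each pair of identical adjacent characters."""
    return [
        (a["start"], a["end"], a["text"])
        for a, b in zip(tokens, tokens[1:])
        if a["text"] == b["text"]
    ]


if __name__ == "__main__":
    tokens = [
        {"text": "听", "start": 0.0, "end": 0.25},
        {"text": "会", "start": 0.25, "end": 0.5},
        {"text": "会", "start": 0.75, "end": 1.0},
    ]
    print(stutter_candidates(tokens))
```

The repetition and replacement types need phrase-level comparison and are not covered by this sketch.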

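⚠️ Key rules

  1. Timestamp-driven: the review draft is annotated directly with timestamps, so the editing step no longer searches for text
  2. Token-by-token analysis: for "delete the earlier part, keep the later part" errors, look up the timestamp of every token
  3. Check the time span: if a speech error spans more than 2 s, it must contain a silence and needs to be split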
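Rule 3 can be sketched as follows. The 2 s limit comes from the rule itself; splitting at internal gaps of at least 1 s reuses the skill's silence threshold, which is an assumption about how the split is meant to be done. The token shape and the `split_long_span` name are hypothetical, and the token list is assumed to be non-empty.

```python
from __future__ import annotations

MAX_SPAN = 2.0     # threshold from rule 3
SILENCE_MIN = 1.0  # assumed: same silence threshold used elsewhere in the skill


def split_long_span(tokens: list[dict]) -> list[tuple[str, float, float]]:
    """Split one speech error's tokens into ("speech" | "silence", start, end) pieces."""
    start, end = tokens[0]["start"], tokens[-1]["end"]
    if end - start <= MAX_SPAN:
        return [("speech", start, end)]
    pieces, piece_start = [], start
    for a, b in zip(tokens, tokens[1:]):
        if b["start"] - a["end"] >= SILENCE_MIN:
            pieces.append(("speech", piece_start, a["end"]))
            pieces.append(("silence", a["end"], b["start"]))
            piece_start = b["start"]
    pieces.append(("speech", piece_start, end))
    return pieces


if __name__ == "__main__":
    tokens = [
        {"text": "A", "start": 0.0, "end": 0.25},
        {"text": "I", "start": 0.25, "end": 0.5},
        {"text": "就", "start": 2.0, "end": 2.25},
        {"text": "是", "start": 2.25, "end": 2.5},
    ]
    for piece in split_long_span(tokens):
        print(piece)
```

If a span exceeds 2 s but no internal gap reaches the threshold, the sketch returns it unsplit; such a span would be worth checking by hand.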

---

Output files

```

01-xxx-v1_transcript.json # transcription result (with timestamps)

01-xxx-v1_审查稿.md # speech-error review draft

```

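Display requirements

After generating the review draft, you must show it to the user

  1. Write the file 01-xxx-v1_审查稿.md
  2. Read it back and display its contents
  3. Wait for the user to confirm which items to delete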

---

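Methodology

See tips/口误识别方法论.md for details

  • Speech-error identification method (sentence-by-sentence check)
  • Precise handling of "delete the earlier part, keep the later part"
  • FunASR timestamp alignment rules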

More from this repository (10)

videocut-subtitle (Skill)

Generates and burns subtitles for videos by transcribing speech, correcting text, enabling user review, and embedding subtitles using FFmpeg.

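image-service (Skill)

Provides a multimodal image-processing service that supports text-to-image, image-to-image, image-to-text, long-image stitching, and other image-processing capabilities.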

searchnews (Skill)

mcp-builder (Skill)

Builds and manages Minecraft Proxy (MCP) configurations with automated setup and deployment capabilities for network infrastructure.

deep-research (Skill)

Performs comprehensive technical research by extracting information, conducting web searches, generating professional reports in Markdown and Word formats, and creating visual infographics.

log-analyzer (Skill)

Analyzes log files, extracting key metrics, identifying patterns, and generating insights to help troubleshoot system performance and detect potential issues.

uni-agent (Skill)

Enables unified cross-protocol agent communication by providing a single API to call and interact with agents across different protocols like ANP, MCP, A2A, and AITP.

smart-query (Skill)

Enables secure database querying via SSH tunnel, translating natural language to SQL and exploring table structures with ease.

csv-data-summarizer (Skill)

Automatically analyzes CSV files, generating comprehensive statistical summaries and intelligent visualizations tailored to the specific data type and content.

skill-creator (Skill)

Guides users through creating specialized skills that extend Claude's capabilities with domain expertise, workflows, and tool integrations.