tubescribe
High-risk skill for summarizing YouTube videos locally that spawns background sub-agents and runs local tooling. It instructs running python skills/tubescribe/scripts/setup.py and python3 skills/tubescribe/scripts/tubescribe.py, performs dynamic subprocess.run executions and network downloads (https://github.com/jgm/pandoc/releases/..., https://github.com/yt-dlp/yt-dlp/releases/latest/download) and reads/writes ~/.tubescribe/config.json.
TubeScribe 🎬
Turn any YouTube video into a polished document + audio summary.
Drop a YouTube link → get a beautiful transcript with speaker labels, key quotes, clickable timestamps, and an audio summary you can listen to on the go.
Features
- 🎯 Smart Speaker Detection — Automatically identifies participants
- 🔊 Audio Summaries — Listen to key points (MP3/WAV)
- 📝 Clickable Timestamps — Every quote links directly to that moment in the video
- 💬 YouTube Comments — Viewer sentiment analysis and best comments
- 📄 Transcript with summary and key quotes — Export as DOCX, HTML, or Markdown
- 📋 Queue Support — Send multiple links, they get processed in order
- 🚀 Non-Blocking Workflow — Conversation continues while video processes in background
100% Free & Local
- No subscription required
- No API keys needed
- No data leaves your computer
- No usage limits
Quick Start
Just send a YouTube URL to your agent. TubeScribe handles everything automatically.
Non-Blocking Processing
TubeScribe runs in the background:
- You send a YouTube link
- Agent replies: "🎬 TubeScribe is processing [title]..."
- You can keep chatting — conversation isn't blocked
- When done, you get notified with the results
No waiting, no freezing — just seamless async processing.
First-Time Setup
python skills/tubescribe/scripts/setup.py
Checks and installs: summarize CLI, pandoc, ffmpeg, TTS engine (mlx-audio on Apple Silicon, Kokoro PyTorch elsewhere)
Output Example
~/Documents/TubeScribe/
├── Interview With Expert.docx # Formatted document
└── Interview With Expert.mp3 # Audio summary
Document Structure
- Title + video info (channel, date, duration)
- Participants — who's speaking
- Summary — key points in 3-5 paragraphs
- Key Quotes — 5 best moments with clickable timestamps
- Viewer Sentiment — what commenters are saying
- Best Comments — top 5 comments by likes
- Full Transcript — merged paragraphs with speaker labels
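To illustrate the clickable timestamps used for Key Quotes: a link to a specific moment can be formed by adding a `t` offset (in seconds) to the standard watch URL. The sketch below is only an illustration; the function name `timestamp_link` and the Markdown link output are assumptions, not TubeScribe's actual code.

```python
def timestamp_link(video_id: str, seconds: int) -> str:
    """Return a Markdown link that opens the video at the given offset.

    Hypothetical helper for illustration only.
    """
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    # Show hours only when the offset is at least one hour long.
    if hours:
        label = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        label = f"{minutes:02d}:{secs:02d}"
    return f"[{label}](https://www.youtube.com/watch?v={video_id}&t={seconds}s)"
```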
Batch & Queue
Multiple videos at once
tubescribe url1 url2 url3
Queue management
tubescribe --queue-add "URL" # Add while processing
tubescribe --queue-status # Check queue
tubescribe --queue-next # Process next
tubescribe --queue-clear # Clear queue
Configuration
Config file: ~/.tubescribe/config.json
| Setting | Default | Options |
|---|---|---|
| document.format | docx | docx, html, md |
| audio.format | mp3 | mp3, wav |
| audio.tts_engine | mlx | mlx, kokoro, builtin |
| mlx_audio.voice_blend | {af_heart: 0.6, af_sky: 0.4} | any voice mix |
| output.folder | ~/Documents/TubeScribe | any path |
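As a rough sketch of how these defaults could be combined with a user's config file: the code below assumes that dotted setting names such as `audio.format` map to nested JSON objects (`{"audio": {"format": ...}}`). That layout, and the names `DEFAULTS` and `load_config`, are assumptions for illustration; check the actual config.json written by setup before relying on it.

```python
import copy
import json
from pathlib import Path

# Defaults from the table above, assuming a nested JSON layout (an assumption).
DEFAULTS = {
    "document": {"format": "docx"},
    "audio": {"format": "mp3", "tts_engine": "mlx"},
    "mlx_audio": {"voice_blend": {"af_heart": 0.6, "af_sky": 0.4}},
    "output": {"folder": "~/Documents/TubeScribe"},
}


def load_config(path: Path = Path.home() / ".tubescribe" / "config.json") -> dict:
    """Return the defaults with any per-section overrides from the config file."""
    config = copy.deepcopy(DEFAULTS)
    if path.exists():
        user = json.loads(path.read_text(encoding="utf-8"))
        for section, values in user.items():
            # Overrides replace individual settings; a voice blend is replaced whole.
            if isinstance(values, dict):
                config.setdefault(section, {}).update(values)
    return config
```

Settings the file does not mention keep their default values, so a config containing only `{"audio": {"format": "wav"}}` would still produce DOCX documents.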
Requirements
- Required:
  - summarize CLI (brew install steipete/tap/summarize)
- Optional:
  - pandoc — DOCX output (brew install pandoc)
  - ffmpeg — MP3 audio (brew install ffmpeg)
  - yt-dlp — YouTube comments (brew install yt-dlp)
  - mlx-audio — Fastest TTS on Apple Silicon (auto-installed via setup)
  - Kokoro TTS — PyTorch fallback for non-Apple-Silicon (auto-installed via setup)
yt-dlp Installation
TubeScribe checks these locations for yt-dlp (in order):
- System PATH (which yt-dlp)
- Homebrew: /opt/homebrew/bin/yt-dlp or /usr/local/bin/yt-dlp
- pip/pipx: ~/.local/bin/yt-dlp
- TubeScribe tools: ~/.openclaw/tools/yt-dlp/yt-dlp
If not found, setup will offer to download a standalone binary to the tools directory.
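The lookup order described above can be sketched as follows. This is an illustration rather than the code in the skill's scripts; the names `find_yt_dlp` and `CANDIDATES` are hypothetical.

```python
import os
import shutil
from pathlib import Path
from typing import Optional, Sequence

# Fallback locations, in the order listed above.
CANDIDATES = [
    Path("/opt/homebrew/bin/yt-dlp"),
    Path("/usr/local/bin/yt-dlp"),
    Path.home() / ".local" / "bin" / "yt-dlp",
    Path.home() / ".openclaw" / "tools" / "yt-dlp" / "yt-dlp",
]


def find_yt_dlp(candidates: Sequence[Path] = CANDIDATES) -> Optional[str]:
    """Return the first usable yt-dlp executable, or None if none is found."""
    # 1. System PATH, the equivalent of `which yt-dlp`.
    on_path = shutil.which("yt-dlp")
    if on_path:
        return on_path
    # 2. Fixed locations, checked in order.
    for candidate in candidates:
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    return None
```

A `None` result corresponds to the case where setup offers to download the standalone binary.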
Error Handling
Clear messages for common issues:
| Issue | Message |
|---|---|
| Private video | ❌ Video is private — can't access |
| No captions | ❌ No captions available for this video |
| Invalid URL | ❌ Not a valid YouTube URL |
| Age-restricted | ❌ Age-restricted video — can't access without login |
Security
Code Injection (Fixed in v1.1.0)
Earlier development versions had a vulnerability where video text could be injected into dynamically executed Python code. This was fixed by properly escaping all text with json.dumps().
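The following sketch shows why quoting with `json.dumps()` closes this kind of hole. It only parses the generated source with `ast` and never executes it. The helper name `embed_as_literal`, the sample title, and the `ensure_ascii=False` argument are illustrative assumptions, not taken from the TubeScribe code.

```python
import ast
import json


def embed_as_literal(text: str) -> str:
    """Quote untrusted text so it can be placed in generated Python source."""
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \u escapes.
    return json.dumps(text, ensure_ascii=False)


# A hostile video title that tries to break out of a string literal.
hostile = 'x"; import os; os.system("echo pwned"); y = "'

unsafe_source = f'title = "{hostile}"'                   # naive interpolation
safe_source = f"title = {embed_as_literal(hostile)}"     # escaped interpolation

# Parsing only (nothing is executed): count the statements in each version.
unsafe_statements = len(ast.parse(unsafe_source).body)   # injected statements appear
safe_statements = len(ast.parse(safe_source).body)       # a single assignment
recovered = ast.literal_eval(embed_as_literal(hostile))  # original text round-trips
```

With naive interpolation the title ends the string early and adds its own statements; with the escaped form the whole title stays inside one string literal.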
HTML Output (Fixed in v1.1.2+)
- XSS prevention: all text escaped before inline formatting
- Single-quote escaping added in v1.1.3
- Link double-encoding fixed in v1.1.3
Archive Extraction (Fixed in v1.1.3)
The setup script now prevents zip-slip path traversal when extracting archives during pandoc/yt-dlp installation.
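For readers unfamiliar with zip-slip, a typical guard resolves every member path and refuses to extract if any would land outside the destination directory. The sketch below is one common way to write such a check, not necessarily what scripts/setup.py does; `safe_extract` is a hypothetical name.

```python
import zipfile
from pathlib import Path


def safe_extract(archive: zipfile.ZipFile, dest: Path) -> None:
    """Extract the archive only if every member stays inside dest."""
    root = dest.resolve()
    for member in archive.namelist():
        target = (root / member).resolve()
        # Reject names such as "../evil" or "/etc/passwd" that escape root.
        if target != root and root not in target.parents:
            raise ValueError(f"unsafe path in archive: {member}")
    archive.extractall(root)
```

Python's `zipfile` already strips `..` components during extraction, so the explicit check mainly serves to reject a suspicious archive outright rather than silently rewriting its paths. Tar archives need their own check.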
Shell Commands
The skill uses subprocess to call external CLI tools (summarize, yt-dlp, pandoc, ffmpeg). YouTube URLs are validated and normalized before processing, and filenames are sanitized. However, as with any tool that processes external content, review the code if you have concerns.
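As an illustration of what URL validation and filename sanitizing can look like, here is a small sketch. The accepted hosts, the 11-character video ID pattern, and both function names are assumptions made for the example; the skill's own rules may be stricter or looser.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet.
VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")
WATCH_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def normalize_youtube_url(url: str) -> Optional[str]:
    """Return a canonical watch URL, or None if the input is not a YouTube link."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/")
    elif host in WATCH_HOSTS:
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    if not VIDEO_ID.fullmatch(video_id):
        return None
    # Rebuild the URL from the validated ID so no other input reaches the CLI.
    return f"https://www.youtube.com/watch?v={video_id}"


def sanitize_filename(title: str) -> str:
    """Drop path separators, reserved characters, and control characters."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned or "untitled"
```

Passing the normalized URL to `subprocess.run` as a list element (rather than through a shell string) keeps it from being interpreted as shell syntax.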
External Dependencies
The setup script downloads tools from official sources:
- pandoc — from Homebrew or official releases
- yt-dlp — from GitHub releases (yt-dlp/yt-dlp)
- mlx-audio — pip install from PyPI (Apple Silicon only, uses MLX framework)
- Kokoro TTS — pip install from PyPI (PyTorch, cross-platform fallback)
All sources are well-known and widely used. Review scripts/setup.py if you have concerns about supply chain security.
License
MIT
Made with 🦊 by Jackie & Matus