whisper-transcribe
⚠Review·Scanned 2/18/2026
This skill transcribes audio using the scripts/transcribe.sh shell wrapper around the whisper CLI to produce txt, srt, vtt, or json outputs. It runs local shell commands (#!/bin/bash, whisper "${args[@]}" "$file") and performs network actions for package/model downloads (pip install openai-whisper, first-run model downloads).
from clawhub.ai·v1ffe779·4.6 KB·0 installs
Scanned from 1.0.0 at 1ffe779 · Transparency log ↗
$ vett add clawhub.ai/josunlp/whisper-transcribe
Whisper Transcribe
Transcribe audio with scripts/transcribe.sh:
# Basic (auto-detect language, base model)
scripts/transcribe.sh recording.mp3
# German, small model, SRT subtitles
scripts/transcribe.sh --model small --language de --format srt lecture.wav
# Batch process, all formats
scripts/transcribe.sh --format all --output-dir ./transcripts/ *.mp3
# Word-level timestamps
scripts/transcribe.sh --timestamps interview.m4a
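The wrapper's source is not reproduced here, so the following is only a minimal sketch of how options like the ones above could be translated into arguments for the whisper CLI. The function name `build_whisper_args`, the `txt` and `.` defaults, and the exact flag mapping are assumptions for illustration. Only the final `whisper "${args[@]}" "$file"` call pattern comes from the skill description.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the actual scripts/transcribe.sh.
# Fills the global array `args` with whisper CLI arguments.
build_whisper_args() {
  # Assumed defaults: base model (per the table), txt output, current directory.
  local model="base" language="" format="txt" output_dir="." timestamps=0

  while [ $# -gt 0 ]; do
    case "$1" in
      --model)      model="$2"; shift 2 ;;
      --language)   language="$2"; shift 2 ;;
      --format)     format="$2"; shift 2 ;;
      --output-dir) output_dir="$2"; shift 2 ;;
      --timestamps) timestamps=1; shift ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done

  args=(--model "$model" --output_format "$format" --output_dir "$output_dir")

  # Leaving --language out lets whisper auto-detect the language.
  if [ -n "$language" ]; then
    args+=(--language "$language")
  fi

  # Word-level timestamps map to whisper's --word_timestamps option.
  if [ "$timestamps" -eq 1 ]; then
    args+=(--word_timestamps True)
  fi
  return 0
}
```

Under these assumptions, `build_whisper_args --model small --language de --format srt` followed by `whisper "${args[@]}" lecture.wav` would correspond to the German SRT example. Keeping the arguments in an array preserves file names and paths that contain spaces.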
Models
| Model | RAM | Speed | Accuracy | Best for |
|---|---|---|---|---|
| tiny | ~1GB | ⚡⚡⚡ | ★★ | Quick drafts, known language |
| base | ~1GB | ⚡⚡ | ★★★ | General use (default) |
| small | ~2GB | ⚡ | ★★★★ | Good accuracy |
| medium | ~5GB | 🐢 | ★★★★★ | High accuracy |
| large | ~10GB | 🐌 | ★★★★★ | Best accuracy (slow on Pi) |
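As a rough illustration of how the table could drive model choice, the sketch below picks the largest model whose approximate RAM figure fits into a given amount of memory. The helper names `pick_model` and `available_gb` are hypothetical, the thresholds are simply the table's approximate values, and `available_gb` assumes a Linux system with /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper: choose a model name from available memory in whole GB.
# Thresholds mirror the approximate RAM column of the table above.
pick_model() {
  local avail_gb="$1"
  if   [ "$avail_gb" -ge 10 ]; then echo large
  elif [ "$avail_gb" -ge 5 ];  then echo medium
  elif [ "$avail_gb" -ge 2 ];  then echo small
  elif [ "$avail_gb" -ge 1 ];  then echo base
  else echo tiny
  fi
}

# Linux-only assumption: MemAvailable is reported in kB in /proc/meminfo.
available_gb() {
  awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}
```

A call such as `scripts/transcribe.sh --model "$(pick_model "$(available_gb)")" recording.mp3` would then select a model automatically. Since the RAM figures are approximate, leaving some headroom (for example by subtracting 1 GB first) is probably wise on small devices.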
Output Formats
- txt — Plain text transcript
- srt — SubRip subtitles (for video)
- vtt — WebVTT subtitles
- json — Detailed JSON with timestamps and confidence
- all — Generate all formats at once
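To make the format list concrete, here is a small sketch that predicts which files a run should leave behind. It assumes the wrapper passes the format and output directory straight through to the whisper CLI, which names each output after the audio file's base name. The function `expected_outputs` is hypothetical, and `all` is expanded to the four formats listed above (the whisper CLI's own `all` setting may also write a tsv file).

```shell
#!/bin/bash
# Hypothetical helper: print the output paths expected for one input file.
# Usage: expected_outputs <audio-file> <format> [output-dir]
expected_outputs() {
  local file="$1" format="$2" output_dir="${3:-.}"
  local name formats ext

  name="$(basename "$file")"
  name="${name%.*}"            # strip the last extension: lecture.wav -> lecture

  case "$format" in
    all)              formats="txt srt vtt json" ;;
    txt|srt|vtt|json) formats="$format" ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac

  for ext in $formats; do
    # ${output_dir%/} drops one trailing slash so paths do not contain "//".
    echo "${output_dir%/}/$name.$ext"
  done
}
```

For example, `expected_outputs lecture.wav srt ./transcripts/` prints `./transcripts/lecture.srt`, which a batch script could check for before re-transcribing a file.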
Requirements
- whisper CLI (pip install openai-whisper)
- ffmpeg (for audio decoding)
- First run downloads the model (~150MB for base)
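A quick preflight check can confirm that both command-line tools are on the PATH before transcribing. The sketch below uses the standard `command -v` builtin. The function names `missing_tools` and `preflight` are hypothetical and not part of the skill.

```shell
#!/bin/bash
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report missing requirements and return non-zero if any are absent.
preflight() {
  local missing
  missing="$(missing_tools whisper ffmpeg)"
  if [ -n "$missing" ]; then
    echo "Missing required tools:" $missing >&2
    return 1
  fi
}
```

Running `preflight && scripts/transcribe.sh recording.mp3` would stop early with a readable message instead of failing partway through. The check does not cover the model download, which still needs network access on first use.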