whisper-transcribe

Review · Scanned 2/18/2026

This skill transcribes audio via scripts/transcribe.sh, a shell wrapper around the whisper CLI, and produces txt, srt, vtt, or json output. It runs local shell commands (#!/bin/bash, whisper "${args[@]}" "$file") and performs network actions for package and model downloads (pip install openai-whisper; models are fetched on first run).

from clawhub.ai · v1ffe779 · 4.6 KB · 0 installs
Scanned from 1.0.0 at 1ffe779
$ vett add clawhub.ai/josunlp/whisper-transcribe

Whisper Transcribe

Transcribe audio with scripts/transcribe.sh:

# Basic (auto-detect language, base model)
scripts/transcribe.sh recording.mp3

# German, small model, SRT subtitles
scripts/transcribe.sh --model small --language de --format srt lecture.wav

# Batch process, all formats
scripts/transcribe.sh --format all --output-dir ./transcripts/ *.mp3

# Word-level timestamps
scripts/transcribe.sh --timestamps interview.m4a
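Internally the wrapper just forwards flags to the whisper CLI. A minimal dry-run sketch of that flag handling (hypothetical — the real scripts/transcribe.sh may differ; the `transcribe` function and the trailing echo are illustrative):

```shell
#!/bin/bash
# Hypothetical sketch of transcribe.sh flag handling (dry run: echoes the
# whisper command instead of executing it).
transcribe() {
  local model=base language="" format=txt
  local args=()
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --model)    model="$2";    shift 2 ;;
      --language) language="$2"; shift 2 ;;
      --format)   format="$2";   shift 2 ;;
      *) break ;;                      # remaining arguments are audio files
    esac
  done
  args=(--model "$model" --output_format "$format")
  [[ -n $language ]] && args+=(--language "$language")
  for file in "$@"; do
    echo whisper "${args[@]}" "$file"  # the real script would execute, not echo
  done
}

transcribe --model small --language de --format srt lecture.wav
# → whisper --model small --output_format srt --language de lecture.wav
```

--model, --language, and --output_format are real openai-whisper flags; everything else above is a sketch.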

Models

Model    RAM     Speed   Accuracy   Best for
tiny     ~1GB    ⚡⚡⚡     ★★         Quick drafts, known language
base     ~1GB    ⚡⚡      ★★★        General use (default)
small    ~2GB            ★★★★       Good accuracy
medium   ~5GB    🐢      ★★★★★      High accuracy
large    ~10GB   🐌      ★★★★★      Best accuracy (slow on Pi)

Output Formats

  • txt — Plain text transcript
  • srt — SubRip subtitles (for video)
  • vtt — WebVTT subtitles
  • json — Detailed JSON with timestamps and confidence
  • all — Generate all formats at once
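The json format is the easiest to post-process. A small sketch of pulling per-segment timestamps out of it, assuming the file has a top-level "text" field plus a "segments" array with "start", "end", and "text" per entry (sample.json below is a hand-made stub, not real whisper output):

```shell
# Stub shaped like whisper's JSON output, for illustration only.
cat > sample.json <<'EOF'
{"text": " Hello world.", "segments": [{"id": 0, "start": 0.0, "end": 1.5, "text": " Hello world."}]}
EOF

# Print one "start  end  text" line per segment.
python3 -c "
import json
for s in json.load(open('sample.json'))['segments']:
    print('%.2f\t%.2f\t%s' % (s['start'], s['end'], s['text'].strip()))
"
```

The same loop works on any transcript the skill writes, e.g. transcripts/recording.json from the batch example above.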

Requirements

  • whisper CLI (pip install openai-whisper)
  • ffmpeg (for audio decoding)
  • First run downloads the model (~150MB for base)
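A quick way to sanity-check the prerequisites before first use — a minimal sketch, assuming both tools should be on PATH (the `require` helper is illustrative, not part of the skill):

```shell
# Report any missing prerequisite and hint at how to install it.
require() { command -v "$1" >/dev/null 2>&1 || { echo "missing: $1"; return 1; }; }

require whisper || echo "install with: pip install openai-whisper"
require ffmpeg  || echo "install ffmpeg via your system package manager"
```

If either check prints a missing line, install that tool before running the wrapper.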