whisper-transcribe

Review · Scanned 2/18/2026

This skill transcribes audio via scripts/transcribe.sh, a shell wrapper around the whisper CLI, and produces txt, srt, vtt, or json output. It runs local shell commands (#!/bin/bash, whisper "${args[@]}" "$file") and performs network actions for package and model downloads (pip install openai-whisper; models are fetched on first run).

from clawhub.ai · v1ffe779 · 4.6 KB · 0 installs
Scanned from 1.0.0 at 1ffe779
$ vett add clawhub.ai/josunlp/whisper-transcribe

Whisper Transcribe

Transcribe audio with scripts/transcribe.sh:

# Basic (auto-detect language, base model)
scripts/transcribe.sh recording.mp3

# German, small model, SRT subtitles
scripts/transcribe.sh --model small --language de --format srt lecture.wav

# Batch process, all formats
scripts/transcribe.sh --format all --output-dir ./transcripts/ *.mp3

# Word-level timestamps
scripts/transcribe.sh --timestamps interview.m4a
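Internally the wrapper just forwards flags to the whisper CLI. A minimal dry-run sketch of that flag handling (hypothetical — the real scripts/transcribe.sh may differ; the `transcribe` function and the trailing echo are illustrative):

```shell
#!/bin/bash
# Hypothetical sketch of transcribe.sh flag handling (dry run: echoes the
# whisper command instead of executing it).
transcribe() {
  local model=base language="" format=txt
  local args=()
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --model)    model="$2";    shift 2 ;;
      --language) language="$2"; shift 2 ;;
      --format)   format="$2";   shift 2 ;;
      *) break ;;                      # remaining arguments are audio files
    esac
  done
  args=(--model "$model" --output_format "$format")
  [[ -n $language ]] && args+=(--language "$language")
  for file in "$@"; do
    echo whisper "${args[@]}" "$file"  # the real script would execute, not echo
  done
}

transcribe --model small --language de --format srt lecture.wav
# → whisper --model small --output_format srt --language de lecture.wav
```

--model, --language, and --output_format are real openai-whisper flags; everything else above is a sketch.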

Models

Model    RAM     Speed   Accuracy   Best for
tiny     ~1GB    ⚡⚡⚡     ★★         Quick drafts, known language
base     ~1GB    ⚡⚡      ★★★        General use (default)
small    ~2GB            ★★★★       Good accuracy
medium   ~5GB    🐢      ★★★★★      High accuracy
large    ~10GB   🐌      ★★★★★      Best accuracy (slow on Pi)

Output Formats

  • txt — Plain text transcript
  • srt — SubRip subtitles (for video)
  • vtt — WebVTT subtitles
  • json — Detailed JSON with timestamps and confidence
  • all — Generate all formats at once
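The json format is the easiest to post-process. A small sketch of pulling per-segment timestamps out of it, assuming the file has a top-level "text" field plus a "segments" array with "start", "end", and "text" per entry (sample.json below is a hand-made stub, not real whisper output):

```shell
# Stub shaped like whisper's JSON output, for illustration only.
cat > sample.json <<'EOF'
{"text": " Hello world.", "segments": [{"id": 0, "start": 0.0, "end": 1.5, "text": " Hello world."}]}
EOF

# Print one "start  end  text" line per segment.
python3 -c "
import json
for s in json.load(open('sample.json'))['segments']:
    print('%.2f\t%.2f\t%s' % (s['start'], s['end'], s['text'].strip()))
"
```

The same loop works on any transcript the skill writes, e.g. transcripts/recording.json from the batch example above.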

Requirements

  • whisper CLI (pip install openai-whisper)
  • ffmpeg (for audio decoding)
  • First run downloads the model (~150MB for base)
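A quick way to sanity-check the prerequisites before first use — a minimal sketch, assuming both tools should be on PATH (the `require` helper is illustrative, not part of the skill):

```shell
# Report any missing prerequisite and hint at how to install it.
require() { command -v "$1" >/dev/null 2>&1 || { echo "missing: $1"; return 1; }; }

require whisper || echo "install with: pip install openai-whisper"
require ffmpeg  || echo "install ffmpeg via your system package manager"
```

If either check prints a missing line, install that tool before running the wrapper.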