piper-tts

Review · Scanned 2/17/2026

This skill provides local TTS using Piper ONNX, auto-downloading voice models and writing audio files. It reads Path.home() / '.openclaw' / '.env' and Path.home() / '.env', uses MATRIX_HOMESERVER and MATRIX_ACCESS_TOKEN, runs ffmpeg, and calls rhasspy/piper-voices and Matrix endpoints.

from clawhub.ai · v13c548b · 11.1 KB · 0 installs
Scanned from 1.0.0 at 13c548b · Transparency log ↗
$ vett add clawhub.ai/das-monki/piper-tts

Review findings below

Local TTS (Piper)

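Fast local text-to-speech using Piper ONNX voices. Synthesis runs on the local machine with no cloud TTS service; network access is needed only to download a voice model on first use and, optionally, to send audio to a Matrix room. Supports multiple languages and voice styles.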

Usage

# Default voice (en_US-amy-medium)
~/.openclaw/skills/piper-tts/scripts/piper-tts.py "Hello, how are you today?"

# Select a specific voice
~/.openclaw/skills/piper-tts/scripts/piper-tts.py "Guten Tag" -v de_DE-thorsten-medium

# Pipe text from stdin
echo "Read this aloud" | ~/.openclaw/skills/piper-tts/scripts/piper-tts.py -

# Custom output path and format
~/.openclaw/skills/piper-tts/scripts/piper-tts.py "Hello" -o greeting.mp3 -f mp3

# Adjust speaking rate and send to Matrix room
~/.openclaw/skills/piper-tts/scripts/piper-tts.py "Slow and steady" --rate 0.8 --room-id '!abc:matrix.org'

# List available downloaded voices
~/.openclaw/skills/piper-tts/scripts/piper-tts.py --list-voices

# Quiet mode (suppress progress)
~/.openclaw/skills/piper-tts/scripts/piper-tts.py "Hello" --quiet

Options

  • -v/--voice: Voice model name (default: en_US-amy-medium)
  • -o/--output: Output file path (default: auto-generated in /tmp)
  • -f/--format: Output format: wav, mp3, ogg (default: wav)
  • --rate: Speaking rate multiplier, 0.5-2.0 (default: 1.0)
  • --room-id: Matrix room ID to send audio to
  • --list-voices: List downloaded voice models
  • -q/--quiet: Suppress progress messages
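
The options above map naturally onto an argparse parser. The following is a minimal sketch of how such a command line could be declared, not the script's actual source: the function names `build_parser` and `rate_type` are hypothetical, and the range check on `--rate` is an assumption based on the documented 0.5-2.0 bounds.

```python
import argparse


def rate_type(value: str) -> float:
    """Parse --rate and enforce the documented 0.5-2.0 range (assumed behavior)."""
    rate = float(value)
    if not 0.5 <= rate <= 2.0:
        raise argparse.ArgumentTypeError("rate must be between 0.5 and 2.0")
    return rate


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(prog="piper-tts.py")
    # A lone "-" is accepted as a positional and signals "read text from stdin".
    parser.add_argument("text", nargs="?", help="text to speak, or '-' for stdin")
    parser.add_argument("-v", "--voice", default="en_US-amy-medium")
    parser.add_argument("-o", "--output", default=None)
    parser.add_argument("-f", "--format", choices=["wav", "mp3", "ogg"], default="wav")
    parser.add_argument("--rate", type=rate_type, default=1.0)
    parser.add_argument("--room-id", default=None)
    parser.add_argument("--list-voices", action="store_true")
    parser.add_argument("-q", "--quiet", action="store_true")
    return parser


args = build_parser().parse_args(
    ["Guten Tag", "-v", "de_DE-thorsten-medium", "--rate", "0.8"]
)
print(args.voice, args.format, args.rate)  # de_DE-thorsten-medium wav 0.8
```

Validating the rate inside the `type` callable means an out-of-range value is rejected with a usage error before any synthesis starts.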

Voices

Piper supports 900+ voices across 60+ languages. Voice models are downloaded automatically from the rhasspy/piper-voices repository on Hugging Face the first time a voice is used.
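
To make the auto-download step concrete, here is a rough sketch of how a voice name could be turned into download URLs. It assumes the rhasspy/piper-voices repository is laid out as `<language family>/<locale>/<name>/<quality>/`, and that each voice consists of an `.onnx` model plus an `.onnx.json` config. The base URL, the `main` revision, and the helper name `voice_urls` are assumptions for illustration; the skill's script may resolve files differently.

```python
# Assumed base URL and revision; not taken from the skill's source.
HF_BASE = "https://huggingface.co/rhasspy/piper-voices/resolve/main"


def voice_urls(voice: str) -> tuple[str, str]:
    """Return (model_url, config_url) for a name like 'en_US-amy-medium'."""
    # The locale is everything before the first '-', the quality everything
    # after the last '-'; whatever remains in the middle is the speaker name.
    locale, rest = voice.split("-", 1)
    name, quality = rest.rsplit("-", 1)
    family = locale.split("_")[0]  # 'en_US' -> 'en'
    stem = f"{HF_BASE}/{family}/{locale}/{name}/{quality}/{voice}"
    return f"{stem}.onnx", f"{stem}.onnx.json"


model_url, config_url = voice_urls("de_DE-thorsten-medium")
print(model_url)
print(config_url)
```

A downloader built on this would typically check a local cache directory first and fetch both files only when either is missing.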

Popular voices

Voice                       Language      Quality
en_US-amy-medium (default)  English (US)  Medium
en_US-lessac-high           English (US)  High
en_GB-alba-medium           English (UK)  Medium
de_DE-thorsten-medium       German        Medium
fr_FR-siwis-medium          French        Medium
es_ES-davefx-medium         Spanish       Medium

Benchmark

Voice Quality  Synthesis Time (100 words)  RTF
Medium         ~0.3s                       0.04x
High           ~0.8s                       0.10x
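
RTF (real-time factor) is conventionally the synthesis time divided by the duration of the audio produced, so values below 1.0 mean faster than real time. The page does not say how its figures were measured. The sketch below only shows one way to compute the ratio for a WAV file, using hypothetical helper names and illustrative numbers.

```python
import wave


def wav_duration(path_or_file) -> float:
    """Duration of a WAV file in seconds: frame count divided by sample rate."""
    with wave.open(path_or_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the resulting audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


# Illustrative numbers only: 0.5 s of compute for 10 s of speech.
print(real_time_factor(0.5, 10.0))  # 0.05
```

In practice the synthesis time would be taken with `time.perf_counter()` around the synthesis call, and the audio duration read from the file that was written.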

openclaw.json

{
  "tools": {
    "media": {
      "tts": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "~/.openclaw/skills/piper-tts/scripts/piper-tts.py",
            "args": ["--quiet", "-f", "ogg", "-o", "{{OutputPath}}", "{{Text}}"],
            "timeoutSeconds": 30
          }
        ]
      }
    }
  }
}
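
The `{{OutputPath}}` and `{{Text}}` placeholders in `args` are filled in by the host before the command runs. How OpenClaw does this internally is not shown on this page. The following is a minimal sketch of one plausible approach, with hypothetical function names (`expand_args`, `run_tts`), to illustrate what the config entry implies.

```python
import os
import re
import subprocess

# Matches {{Name}} placeholders such as {{Text}} or {{OutputPath}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand_args(args: list[str], values: dict[str, str]) -> list[str]:
    """Substitute known placeholders in one pass; unknown ones are left as-is."""
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))

    return [PLACEHOLDER.sub(substitute, arg) for arg in args]


def run_tts(model: dict, text: str, output_path: str) -> None:
    """Run one 'cli' model entry; raises on non-zero exit or timeout."""
    command = os.path.expanduser(model["command"])  # resolve the leading '~'
    argv = [command] + expand_args(
        model["args"], {"Text": text, "OutputPath": output_path}
    )
    # An argument list (no shell) keeps the text from being shell-interpreted.
    subprocess.run(argv, check=True, timeout=model["timeoutSeconds"])


template = ["--quiet", "-f", "ogg", "-o", "{{OutputPath}}", "{{Text}}"]
print(expand_args(template, {"Text": "Hello", "OutputPath": "/tmp/out.ogg"}))
```

Substituting in a single regex pass means placeholder-like text inside the spoken message is not expanded a second time.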