openai-tts

Verified·Scanned 2/18/2026

This skill converts text to speech using OpenAI's TTS API. It provides a CLI and a Python module with auto-chunking, voice selection, and multiple output formats. It requires the OPENAI_API_KEY environment variable; a typical invocation is python openai_tts.py "Hello world" -o output.mp3. The underlying speech endpoint is documented at https://platform.openai.com/docs/api-reference/audio/createSpeech.

from clawhub.ai·v300b0cf·19.2 KB·0 installs
Scanned from 1.0.1 at 300b0cf
$ vett add clawhub.ai/merend/openai-tts

OpenAI TTS Skill

Text-to-speech conversion using OpenAI's TTS API for generating high-quality, natural-sounding audio from text.

Installation

pip install openai pydub

# pydub needs ffmpeg for audio processing
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

Setup

Set your OpenAI API key:

export OPENAI_API_KEY="sk-..."
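
A quick way to confirm the key is visible to Python before generating audio is sketched below. The helper name get_api_key is hypothetical and not part of this skill; it only reads the OPENAI_API_KEY variable described above.

```python
import os


def get_api_key(env=os.environ):
    """Return the OpenAI API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; the skill itself may check the key differently.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running openai_tts.py"
        )
    return key
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.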

Usage

Command Line

# Basic usage
python openai_tts.py "Hello world" -o output.mp3

# From file
python openai_tts.py -f article.txt -o article.mp3

# With voice selection
python openai_tts.py "Your text" -o output.mp3 --voice nova

# High quality
python openai_tts.py "Your text" -o output.mp3 --model tts-1-hd

# Adjust speed (0.25 to 4.0)
python openai_tts.py "Your text" -o output.mp3 --speed 1.5

# Pipe input
echo "Hello world" | python openai_tts.py -o output.mp3

# Verbose mode
python openai_tts.py "Test" -o test.mp3 -v

# List available voices
python openai_tts.py --list-voices

As Module

from openai_tts import generate_tts

# Basic
generate_tts("Hello world", "output.mp3")

# With options
generate_tts(
    text="Your text here",
    output_path="output.mp3",
    voice="nova",
    model="tts-1-hd",
    response_format="mp3",
    speed=1.25,  # 0.25 to 4.0
    verbose=True
)
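
The skill's source is not shown here, so the following is only a sketch of how a single request of this kind could be made with the official openai Python SDK. The function name synthesize_once and the optional client parameter are assumptions for illustration; the parameter names passed to client.audio.speech.create mirror the options listed above, and the defaults follow the tables in this README.

```python
from pathlib import Path


def synthesize_once(text, output_path, voice="onyx", model="tts-1",
                    response_format="mp3", speed=1.0, client=None):
    """Send one text-to-speech request and write the audio bytes to output_path.

    Hypothetical sketch: handles a single request only (no chunking).
    """
    # The documented speed range is 0.25 to 4.0.
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    if client is None:
        # Imported lazily so the function can be tested with a stub client.
        from openai import OpenAI
        client = OpenAI()  # picks up OPENAI_API_KEY from the environment

    response = client.audio.speech.create(
        model=model,
        voice=voice,
        input=text,
        response_format=response_format,
        speed=speed,
    )

    path = Path(output_path)
    path.write_bytes(response.read())  # read() returns the raw audio bytes
    return path
```

Accepting a client argument keeps the network call replaceable, which is convenient for tests and for reusing one client across many chunks.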

Voices

| Voice | Type | Description |
| --- | --- | --- |
| alloy | Neutral | Balanced, versatile |
| echo | Male | Warm, conversational |
| fable | Neutral | Expressive, storytelling |
| onyx | Male | Deep, authoritative (default) |
| nova | Female | Friendly, upbeat |
| shimmer | Female | Clear, professional |

Models

| Model | Quality | Speed | Cost |
| --- | --- | --- | --- |
| tts-1 | Standard | Fast | $0.015/1K chars |
| tts-1-hd | High Definition | Slower | $0.030/1K chars |
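
Since pricing is per character, the cost of a job can be estimated up front. The sketch below uses the rates from the table above; the function name estimate_cost is hypothetical, and the rates should be checked against current OpenAI pricing before relying on them.

```python
# Rates in USD per 1,000 characters, taken from the table above.
PRICE_PER_1K_CHARS = {
    "tts-1": 0.015,
    "tts-1-hd": 0.030,
}


def estimate_cost(text, model="tts-1"):
    """Estimate the USD cost of synthesizing text with the given model."""
    if model not in PRICE_PER_1K_CHARS:
        raise ValueError(f"unknown model: {model}")
    return len(text) / 1000 * PRICE_PER_1K_CHARS[model]
```

For example, a 10,000-character article works out to about $0.15 with tts-1 and about $0.30 with tts-1-hd.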

Features

  • Auto-chunking: Automatically splits text longer than 4096 characters, the API's per-request input limit
  • Multiple formats: mp3, opus, aac, flac
  • 6 voices: Male, female, and neutral options
  • Pipe support: Read from stdin
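
The auto-chunking feature could work roughly as sketched below. This is not the skill's actual implementation, which is not shown here; the function name chunk_text and the sentence-boundary strategy are assumptions. It packs whole sentences into chunks of at most 4096 characters and hard-splits any single sentence that is longer than the limit.

```python
import re

MAX_CHARS = 4096  # chunk size limit stated in the feature list above


def chunk_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at fixed offsets.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent as its own request and the resulting audio segments joined in order, which is presumably where pydub and ffmpeg come in.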

Output Formats

| Format | Description |
| --- | --- |
| mp3 | Default, widely compatible |
| opus | Smaller file size, good quality |
| aac | Apple/iOS compatible |
| flac | Lossless, larger files |
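
When calling the module, the response_format value usually needs to match the output file's extension. A small helper like the one below could derive it from the path. The name format_from_path is hypothetical, and whether the skill itself infers the format this way is an assumption.

```python
from pathlib import Path

# Formats listed in the table above.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac")


def format_from_path(output_path, default="mp3"):
    """Pick a response format from the output file extension.

    Falls back to the default when the path has no extension.
    """
    ext = Path(output_path).suffix.lower().lstrip(".")
    if not ext:
        return default
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format '{ext}'; choose one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return ext
```

The result can be passed as response_format, for example generate_tts(text, path, response_format=format_from_path(path)).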

License

MIT