openai-tts
✓Verified·Scanned 2/18/2026
This skill converts text to speech with OpenAI's TTS API. It provides a CLI and a Python module with auto-chunking, voice selection, and multiple output formats. It requires the OPENAI_API_KEY environment variable, is run with commands such as python openai_tts.py "Hello world" -o output.mp3, and calls the OpenAI audio speech API documented at https://platform.openai.com/docs/api-reference/audio/createSpeech.
from clawhub.ai·v300b0cf·19.2 KB·0 installs
Scanned from 1.0.1 at 300b0cf
$ vett add clawhub.ai/merend/openai-tts
OpenAI TTS Skill
Text-to-speech conversion using OpenAI's TTS API for generating high-quality, natural-sounding audio from text.
Installation
pip install openai pydub
# ffmpeg is required by pydub for audio processing
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
Setup
Set your OpenAI API key:
export OPENAI_API_KEY="sk-..."
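Before running the script, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; the helper name `require_api_key` is hypothetical and is not part of the skill.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of openai_tts.py.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the script"
        )
    return key
```

Calling `require_api_key()` with no arguments reads from the real process environment.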
Usage
Command Line
# Basic usage
python openai_tts.py "Hello world" -o output.mp3
# From file
python openai_tts.py -f article.txt -o article.mp3
# With voice selection
python openai_tts.py "Your text" -o output.mp3 --voice nova
# High quality
python openai_tts.py "Your text" -o output.mp3 --model tts-1-hd
# Adjust speed (0.25 to 4.0)
python openai_tts.py "Your text" -o output.mp3 --speed 1.5
# Pipe input
echo "Hello world" | python openai_tts.py -o output.mp3
# Verbose mode
python openai_tts.py "Test" -o test.mp3 -v
# List available voices
python openai_tts.py --list-voices
As a Module
from openai_tts import generate_tts
# Basic
generate_tts("Hello world", "output.mp3")
# With options
generate_tts(
    text="Your text here",
    output_path="output.mp3",
    voice="nova",
    model="tts-1-hd",
    response_format="mp3",
    speed=1.25,  # 0.25 to 4.0
    verbose=True,
)
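For readers curious what a call like this might do under the hood, here is a rough sketch of validating the options and sending a single request with the official `openai` Python client. The names `build_request` and `synthesize` are hypothetical, the default model `tts-1` is an assumption, and the skill's real `generate_tts` may be implemented differently.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
MODELS = {"tts-1", "tts-1-hd"}


def build_request(text, voice="onyx", model="tts-1",
                  response_format="mp3", speed=1.0):
    """Validate options and return keyword arguments for the speech endpoint.

    Hypothetical helper; the default model is an assumption.
    """
    if not text:
        raise ValueError("text must not be empty")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "speed": speed,
    }


def synthesize(text, output_path, **options):
    """Send one request (text up to 4096 characters) and write the audio file."""
    # Imported lazily so build_request can be used without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    request = build_request(text, **options)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(output_path)
```

Keeping validation separate from the network call means bad options fail before any request is billed.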
Voices
| Voice | Type | Description |
|---|---|---|
| alloy | Neutral | Balanced, versatile |
| echo | Male | Warm, conversational |
| fable | Neutral | Expressive, storytelling |
| onyx | Male | Deep, authoritative (default) |
| nova | Female | Friendly, upbeat |
| shimmer | Female | Clear, professional |
Models
| Model | Quality | Speed | Cost |
|---|---|---|---|
| tts-1 | Standard | Fast | $0.015/1K chars |
| tts-1-hd | High Definition | Slower | $0.030/1K chars |
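The per-character prices in the table make it easy to estimate cost before converting a long document. A small sketch, assuming the prices above are current; `estimate_cost` is a hypothetical helper, not part of the skill.

```python
# Prices in USD per 1,000 input characters, copied from the table above.
PRICE_PER_1K_CHARS = {"tts-1": 0.015, "tts-1-hd": 0.030}


def estimate_cost(text, model="tts-1"):
    """Return the approximate cost in USD of converting `text` with `model`."""
    if model not in PRICE_PER_1K_CHARS:
        raise ValueError(f"unknown model: {model!r}")
    return len(text) / 1000 * PRICE_PER_1K_CHARS[model]
```

For example, a 10,000-character article works out to roughly $0.15 with tts-1 and $0.30 with tts-1-hd.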
Features
- Auto-chunking: Automatically splits text longer than 4096 characters, the API's per-request input limit, into smaller chunks
- Multiple formats: mp3, opus, aac, flac
- 6 voices: Male, female, and neutral options
- Pipe support: Read from stdin
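The auto-chunking feature could be approximated as follows. This is only a sketch of one plausible strategy (split at sentence boundaries, fall back to a hard cut for oversized sentences); `chunk_text` is a hypothetical name and the skill's actual splitting logic may differ.

```python
import re

MAX_CHARS = 4096  # per-request limit mentioned in the feature list


def chunk_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the current chunk
        else:
            chunks.append(current)  # close the full chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent as a separate request and the resulting audio segments joined, for example with pydub.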
Output Formats
| Format | Description |
|---|---|
| mp3 | Default, widely compatible |
| opus | Smaller file size, good quality |
| aac | Apple/iOS compatible |
| flac | Lossless, larger files |
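When calling the module from other code, one convenient pattern is to derive `response_format` from the output file's extension. The sketch below assumes only the four formats in the table are accepted; `format_from_path` is a hypothetical helper, and the skill itself may or may not infer the format this way.

```python
from pathlib import Path

# Formats listed in the table above.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac")


def format_from_path(output_path, default="mp3"):
    """Pick a response format from the file extension, falling back to `default`."""
    suffix = Path(output_path).suffix.lower().lstrip(".")
    if not suffix:
        return default
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format {suffix!r}; choose one of {SUPPORTED_FORMATS}"
        )
    return suffix
```

The result can be passed straight through, for example `generate_tts(text, path, response_format=format_from_path(path))`.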
License
MIT