High Risk: This skill has significant security concerns. Review the findings below before installing.

text-to-speech

Caution · Scanned 2/18/2026

High-risk skill that installs and runs a remote shell script and makes network calls to https://cli.inference.sh and https://inference.sh. Provides infsh CLI examples for TTS and uses sample files like input.json and speech.json.

from clawhub.ai · v6d92ecf · 3.7 KB · 0 installs
Scanned from 0.1.0 at 6d92ecf
$ vett add clawhub.ai/okaris/text-to-speech
Review security findings before installing

Text-to-Speech

Convert text to natural speech via inference.sh CLI.

Quick Start

# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'
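
The install line above pipes a remote script straight into sh, which is the behavior the scan summary at the top of this page flags. A more cautious sketch downloads the script to a file first so it can be read before anything runs. The function name fetch_infsh_installer and the file name infsh-install.sh are illustrative and not part of the CLI.

```shell
# Download the installer to a local file instead of piping it into sh.
# fetch_infsh_installer and infsh-install.sh are hypothetical names.
fetch_infsh_installer() {
  out="${1:-infsh-install.sh}"
  curl -fsSL https://cli.inference.sh -o "$out" || return 1
  echo "Saved installer to $out; review it before running."
}

# Typical use (commented out so nothing runs unreviewed):
#   fetch_infsh_installer
#   less infsh-install.sh
#   sh infsh-install.sh && infsh login
```

Defining the download as a function keeps the review step explicit: nothing is fetched or executed until it is called.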

Available Models

Model        App ID              Best For
DIA TTS      infsh/dia-tts       Conversational, expressive
Kokoro TTS   infsh/kokoro-tts    Fast, natural
Chatterbox   infsh/chatterbox    General purpose
Higgs Audio  infsh/higgs-audio   Emotional control
VibeVoice    infsh/vibevoice     Podcasts, long-form
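
When scripting, the table above can be captured as a small lookup so the App ID does not have to be hard-coded in every command. This is only a sketch: tts_app_for is a hypothetical helper, and the keywords are an illustrative reading of the "Best For" column.

```shell
# Map a use-case keyword to an App ID from the model table.
# tts_app_for is a hypothetical helper; the keywords are illustrative.
tts_app_for() {
  case "$1" in
    conversational|expressive) echo "infsh/dia-tts" ;;
    fast|natural)              echo "infsh/kokoro-tts" ;;
    emotional)                 echo "infsh/higgs-audio" ;;
    podcast|long-form)         echo "infsh/vibevoice" ;;
    *)                         echo "infsh/chatterbox" ;;  # general purpose
  esac
}

tts_app_for podcast
```

The result can then be passed to the run command, for example: infsh app run "$(tts_app_for podcast)" --input input.json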

Browse All Audio Apps

infsh app list --category audio

Examples

Basic Text-to-Speech

infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'
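
Inline JSON inside single quotes breaks as soon as the text contains an apostrophe, and unescaped double quotes produce invalid JSON. One way around this, sketched below, is to build the input from a shell variable. The helpers json_escape and make_tts_input are hypothetical, and the escaping only covers backslashes and double quotes in single-line text.

```shell
# Escape backslashes first, then double quotes (single-line text only).
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Wrap the escaped text in the {"text": ...} object used in the examples.
make_tts_input() {
  printf '{"text": "%s"}\n' "$(json_escape "$1")"
}

make_tts_input "I'm glad you're here."
```

The generated string can be passed where the inline JSON went before, for example: infsh app run infsh/kokoro-tts --input "$(make_tts_input "$text")"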

Conversational TTS with DIA

infsh app sample infsh/dia-tts --save input.json

# Edit input.json:
# {
#   "text": "Hey! How are you doing today? I'm really excited to share this with you.",
#   "voice": "conversational"
# }

infsh app run infsh/dia-tts --input input.json
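
The manual edit step above can also be scripted with a here-document, which is handy in automated runs. The "text" and "voice" fields are the ones shown in the commented example; the file written by infsh app sample remains the reference for the actual schema.

```shell
# Write the DIA input file non-interactively.
# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the JSON, so apostrophes and $ signs are written literally.
cat > input.json <<'EOF'
{
  "text": "Hey! How are you doing today? I'm really excited to share this with you.",
  "voice": "conversational"
}
EOF

# Then run as before:
#   infsh app run infsh/dia-tts --input input.json
```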

Long-form Audio (Podcasts)

infsh app sample infsh/vibevoice --save input.json

# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.json

Expressive Speech with Higgs

infsh app sample infsh/higgs-audio --save input.json

# Edit input.json:
# {
#   "text": "This is absolutely incredible!",
#   "emotion": "excited"
# }

infsh app run infsh/higgs-audio --input input.json

Use Cases

  • Voiceovers: Product demos, explainer videos
  • Audiobooks: Convert text to spoken word
  • Podcasts: Generate podcast episodes
  • Accessibility: Make content accessible
  • IVR: Phone system voice prompts
  • Video Narration: Add narration to videos

Combine with Video

Generate speech, then create a talking-head video:

# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json

# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'
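
Step 2 leaves the audio URL to be copied by hand from speech.json. The sketch below pulls the first http(s) URL out of that file without assuming any particular output schema, then builds the OmniHuman input from it. Both first_url and omnihuman_input are hypothetical helpers, and if speech.json contains more than one URL the first match may not be the audio file, so inspect the file before relying on this.

```shell
# Print the first http(s) URL found in a file.
# first_url is a hypothetical helper; it does not know the output schema.
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Build the OmniHuman input from an image URL and an audio URL.
omnihuman_input() {
  printf '{"image_url": "%s", "audio_url": "%s"}\n' "$1" "$2"
}

# Typical use, assuming speech.json from step 1 exists:
#   audio_url="$(first_url speech.json)"
#   infsh app run bytedance/omnihuman-1-5 \
#     --input "$(omnihuman_input "https://portrait.jpg" "$audio_url")"
```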

Related Skills

# Full platform skill (all 150+ apps)
npx skills add inference-sh/agent-skills@inference-sh

# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/agent-skills@ai-avatar-video

# AI music generation
npx skills add inference-sh/agent-skills@ai-music-generation

# Speech-to-text (transcription)
npx skills add inference-sh/agent-skills@speech-to-text

# Video generation
npx skills add inference-sh/agent-skills@ai-video-generation

Browse all apps: infsh app list

Documentation