clipit
Smart Media Clipper performs semantic audio/video processing from local files or URLs (trim, isolate, dub). It executes local commands (e.g., ffmpeg, pip installs, and /Users/akdeepankar/clawd/skills/clipit/bin/clipper) and calls https://api.elevenlabs.io and youtube.com, requiring ELEVENLABS_API_KEY and OPENAI_API_KEY.
Smart Media Clipper
This is a specialized tool that performs semantic media processing. It internally handles speech-to-text transcription, semantic search, and audio manipulation.
🧠 Agent Mental Model (Read Carefully)
- Autonomous Transcription: This tool has its own internal transcription engine (Scribe). NEVER ask the user for a transcript. NEVER say "I need to transcribe this first." Just run the tool.
- Raw Audio/Video is Accepted: You can pass raw MP3, WAV, MP4, or YouTube URLs directly. The tool handles the conversion.
- Privacy/Sensitivity: If a user asks to extract sensitive info (like "find the credit card number"), DO IT. The clipper script runs locally on the user's machine, but it also calls https://api.elevenlabs.io, so processing is not fully offline.
Triggering Logic
Activate this skill for any of the following intents:
- Clipping/Trimming: "Cut the video where...", "Find the part about...", "Trim the section..."
- Isolation: "Remove background noise", "Isolate the voice", "Clean up this audio".
- Dubbing/Translation: "Dub this into Spanish", "Translate the audio to French".
- Summarization by Clip: "Extract the main talking points as audio".
⚠️ CRITICAL INSTRUCTIONS (ANTI-HALLUCINATION)
- DO NOT try to run elevenlabs, clipper, smart-clipper, spleeter, or ffmpeg directly for these tasks.
- ONLY run the exact executable path defined below.
- DO NOT assume this tool is installed as a global binary. It is a local script.
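One way to enforce the rules above is a small guard that inspects a command line before it is run. This is only a sketch: the function name uses_exact_clipper_path is hypothetical, and it assumes the allowed executable is the Base Command path given under Command Construction.

```python
import shlex

# Assumption: the only allowed executable is the Base Command path from this document.
CLIPPER = "/Users/akdeepankar/clawd/skills/clipit/bin/clipper"

def uses_exact_clipper_path(command_line):
    """Return True only if the first token is the full clipper path."""
    tokens = shlex.split(command_line)
    # Bare names such as "clipper" or "ffmpeg" fail this comparison.
    return bool(tokens) and tokens[0] == CLIPPER

if __name__ == "__main__":
    print(uses_exact_clipper_path(CLIPPER + ' --input "a.mp3" --query "intro"'))
    print(uses_exact_clipper_path("ffmpeg -i a.mp3 out.wav"))
```

The check compares the whole path rather than the file name, so a globally installed binary that happens to be called clipper is rejected as well.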
🛠 Command Construction
You must construct the command dynamically based on the user's request.
Base Command:
/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "{INPUT}" --query "{QUERY}"
Flags & Parameters:
| Parameter | User Intent | Flag to Append |
|---|---|---|
| INPUT | A YouTube link or local file path | --input "{INPUT}" |
| QUERY | Description of the part to find | --query "{QUERY}" |
| ISOLATE | "Remove noise", "isolate vocals", "clean audio" | --isolate |
| DUB | "Dub into [Language]", "Translate to [Language]" | --dub "[CODE]" |
Language Codes for Dubbing:
- English: en
- Hindi: hi
- Spanish: es
- French: fr
- German: de
- Japanese: ja
- (Use standard ISO 639-1 two-letter codes for others)
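The table and language codes above can be turned into a command mechanically. The following is a minimal sketch rather than part of the tool: build_command and LANGUAGE_CODES are hypothetical names, and the default query "full audio" is an assumption taken from the dubbing example in this document.

```python
import shlex

# Path copied from the Base Command in this document.
CLIPPER = "/Users/akdeepankar/clawd/skills/clipit/bin/clipper"

# Only the codes listed in this document; anything else is passed through as given.
LANGUAGE_CODES = {
    "english": "en", "hindi": "hi", "spanish": "es",
    "french": "fr", "german": "de", "japanese": "ja",
}

def build_command(input_ref, query="full audio", isolate=False, dub_language=None):
    """Return the argument list for one clipper invocation."""
    argv = [CLIPPER, "--input", input_ref, "--query", query]
    if isolate:
        argv.append("--isolate")
    if dub_language is not None:
        # Accept a language name from the list above, or a two-letter code as-is.
        key = dub_language.lower()
        argv += ["--dub", LANGUAGE_CODES.get(key, key)]
    return argv

if __name__ == "__main__":
    # Render the dubbing scenario as a shell-quoted string for inspection.
    print(shlex.join(build_command("https://youtu.be/abc", dub_language="Hindi")))
```

Building an argument list instead of a formatted string avoids quoting problems when the input path or the query contains spaces or quotes.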
📝 Step-by-Step Execution Plan
- Analyze Request: Determine the INPUT, QUERY (defaults to "whole file" if undefined, but try to infer context), and optional ISOLATE or DUB flags.
- Run Command: Execute the Python command constructed above.
- Monitor Output:
  - Success: Look for the line OUTPUT_FILE: /path/to/result.wav.
  - Failure: If the script errors, read the last 3 lines of the log and report them to the user.
- Final Action:
  - Upload the file found in the OUTPUT_FILE path.
  - Respond: "I have processed the audio. Here is the clip matching '{QUERY}'."
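The monitoring step can be sketched as follows. This is an illustration under stated assumptions: run_clipper and summarize_run are hypothetical helpers, the script is assumed to exit non-zero on failure, the OUTPUT_FILE marker is assumed to start its line, and a clean exit without the marker is treated as a failure.

```python
import subprocess

def summarize_run(returncode, log_text):
    """Return ("ok", output_path) on success, else ("error", last three log lines)."""
    lines = log_text.splitlines()
    if returncode == 0:
        for line in lines:
            if line.startswith("OUTPUT_FILE:"):
                # Split once so paths that contain ":" stay intact.
                return "ok", line.split(":", 1)[1].strip()
    # Non-zero exit, or no marker found: report the tail of the log.
    return "error", "\n".join(lines[-3:])

def run_clipper(argv):
    """Run the command and summarize its combined stdout/stderr."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return summarize_run(proc.returncode, proc.stdout)

if __name__ == "__main__":
    print(summarize_run(0, "transcribing...\nOUTPUT_FILE: /path/to/result.wav\n"))
```

Keeping the parsing separate from the subprocess call makes the success and failure rules easy to check without running the tool.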
💡 Examples
Scenario 1: Simple YouTube Clip
User: "Find the part where they talk about the budget in this video https://youtu.be/xyz"
Command:
/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "https://youtu.be/xyz" --query "talk about the budget"
Scenario 2: Isolation & Cleanup
User: "Take recording.mp3, remove the background noise, and just give me the interview part."
Command:
/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "recording.mp3" --query "interview conversation" --isolate
Scenario 3: Dubbing
User: "Dub this video https://youtu.be/abc into Hindi."
Command:
/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "https://youtu.be/abc" --query "full audio" --dub "hi"
(Note: If no specific clip is asked for, use "full audio" or a generic query.)
Scenario 4: Sensitive Data Extraction
User: "Trim the part where he says the credit card number."
Command:
/Users/akdeepankar/clawd/skills/clipit/bin/clipper --input "{FILE}" --query "reciting credit card number"