Transcribe anything.
Let agents fix the rest.

Local Whisper transcription with agent-ready output. One command to transcribe any URL or file. Ships with a skill that auto-corrects Whisper mistakes.

Install
bun add -g @crafter/trx
Agent Skill
npx skills add crafter-station/trx -g
$ bun add -g @crafter/trx

$ trx init
 whisper-cli (1.8.4)
 yt-dlp (2026.02.04)
 ffmpeg (7.1)
  Select Whisper model:
  > small (~466 MB) — recommended
 Model downloaded
  Install agent skill? Yes
 trx is ready

$ trx "https://youtube.com/watch?v=dQw4w9WgXcQ"
  Downloading media...
  Cleaning audio...
  Transcribing with Whisper...
 Done: video.txt, video.srt
0 API Keys
99 Languages
5 Models
1 Command

Text is not enough.
Videos are signal.

There's a TikTok reel, a YouTube tutorial, an Instagram story with exactly what you need. But you can't search inside video. You can't pipe it to an agent. You can't grep it.

trx turns any video into text with one command. Whisper runs locally -- no API keys, no rate limits, no cost. The small model handles 99 languages on your CPU. And the agent skill fixes what Whisper gets wrong.

Free & local
99 languages
Agent self-correction
Any URL or file
Four commands. Zero config.
trx init
One command. Three deps. Model downloaded. Skill installed.
$ trx init --model small
 whisper-cli installed
 yt-dlp installed
 ffmpeg installed
 ggml-small.bin downloaded
 Agent skill installed
trx <url>
YouTube, TikTok, Twitter, Instagram. Paste a URL, get text.
$ trx "https://tiktok.com/@dev/video/123"
  Downloading media...
  Cleaning audio...
  Transcribing...
 dev-123.txt (1,247 words)
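The three progress lines above (download, clean, transcribe) map naturally onto the three dependencies that trx init installs. Here is a rough sketch of what such a pipeline could look like as plain command lines. The function name `planPipeline`, the intermediate file names, and the exact flag choices are assumptions for illustration, not trx's actual internals. The flags themselves are standard yt-dlp, ffmpeg, and whisper-cli options.

```typescript
// Hypothetical sketch: one command line per pipeline stage.
// This is NOT trx's real implementation, only an illustration of the idea.
type Command = string[];

function planPipeline(url: string, base: string, modelPath: string): Command[] {
  const raw = `${base}.wav`; // audio as extracted by yt-dlp
  const clean = `${base}.16k.wav`; // 16 kHz mono PCM, the input format Whisper expects
  return [
    // 1. Download the media and extract its audio track as WAV.
    ["yt-dlp", "-x", "--audio-format", "wav", "-o", `${base}.%(ext)s`, url],
    // 2. Resample to 16 kHz, mono, 16-bit PCM.
    ["ffmpeg", "-y", "-i", raw, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", clean],
    // 3. Transcribe locally, writing <base>.txt and <base>.srt.
    ["whisper-cli", "-m", modelPath, "-f", clean, "-l", "auto", "-otxt", "-osrt", "-of", base],
  ];
}

// Print the plan instead of running it, so the sketch has no side effects.
for (const cmd of planPipeline("https://tiktok.com/@dev/video/123", "dev-123", "ggml-small.bin")) {
  console.log(cmd.join(" "));
}
```

Returning the commands as arrays rather than shell strings keeps URLs and file names from being reinterpreted by a shell if the plan is later passed to a process spawner.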
trx doctor
Health check. JSON output. Every dep, every config, one glance.
$ trx doctor --output json
{
  "healthy": true,
  "dependencies": {
    "whisper-cli": { "installed": true },
    "ffmpeg": { "installed": true }
  }
}
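Because the report is JSON, an agent or script can act on it directly. A minimal sketch, assuming the report has only the shape shown in the sample above (real output may carry more fields); `DoctorReport` and `missingDependencies` are hypothetical names introduced here:

```typescript
// Shape inferred from the sample output; an assumption, not a published type.
interface DoctorReport {
  healthy: boolean;
  dependencies: Record<string, { installed: boolean }>;
}

// Return the names of dependencies the report marks as not installed.
function missingDependencies(raw: string): string[] {
  const report = JSON.parse(raw) as DoctorReport;
  return Object.entries(report.dependencies)
    .filter(([, dep]) => !dep.installed)
    .map(([name]) => name);
}

// Illustrative input: the sample report with ffmpeg flipped to missing.
const doctorSample = `{
  "healthy": false,
  "dependencies": {
    "whisper-cli": { "installed": true },
    "ffmpeg": { "installed": false }
  }
}`;

console.log(missingDependencies(doctorSample)); // lists "ffmpeg" only
```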
trx schema
Agents introspect what the CLI accepts. No docs needed.
$ trx schema transcribe
{
  "command": "transcribe",
  "flags": {
    "--language": { "default": "auto" },
    "--dry-run": { "type": "boolean" },
    "--fields": { "description": "..." }
  }
}
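One way an agent might use this: check the flags it plans to pass against the schema before running anything. The sketch below assumes the schema has the shape shown above; `CommandSchema`, `checkFlags`, and the `--verbose` flag in the example are hypothetical and used only for illustration.

```typescript
// Shapes inferred from the sample schema output; assumptions, not a published type.
interface FlagSpec {
  type?: string;
  default?: string;
  description?: string;
}
interface CommandSchema {
  command: string;
  flags: Record<string, FlagSpec>;
}

// Split requested flags into those the schema declares and those it does not.
function checkFlags(schema: CommandSchema, requested: string[]): { known: string[]; unknown: string[] } {
  const known: string[] = [];
  const unknown: string[] = [];
  for (const flag of requested) {
    // Own-property check so names like "toString" are not accepted by accident.
    const declared = Object.prototype.hasOwnProperty.call(schema.flags, flag);
    (declared ? known : unknown).push(flag);
  }
  return { known, unknown };
}

const transcribeSchema: CommandSchema = {
  command: "transcribe",
  flags: {
    "--language": { default: "auto" },
    "--dry-run": { type: "boolean" },
    "--fields": { description: "..." },
  },
};

// "--verbose" is a made-up flag that the sample schema does not declare.
console.log(checkFlags(transcribeSchema, ["--language", "--dry-run", "--verbose"]));
```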
Built for agents. Works for humans.
--output json
Auto-detect: table for humans, JSON when piped
--dry-run
Validate before executing. No surprises.
--fields text
Save tokens. Only get what you need.
trx schema
Runtime introspection. No docs needed.
validation
Reject control chars, path traversals, URL-encoded strings
SKILL.md
Agent post-processing. Fix Whisper mistakes automatically.
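The validation rule above (reject control chars, path traversals, URL-encoded strings) can be sketched as a small check on a file-path argument. This is an illustration only: `checkPathArg` is a hypothetical name, and applying all three rules to file paths rather than to URLs is an assumption about how such validation would be scoped.

```typescript
type Problem = "control-char" | "path-traversal" | "url-encoded";

// Return every rule the argument violates; an empty array means it passed.
function checkPathArg(arg: string): Problem[] {
  const problems: Problem[] = [];
  // ASCII control characters (0x00-0x1F) and DEL (0x7F).
  if (/[\u0000-\u001f\u007f]/.test(arg)) problems.push("control-char");
  // A ".." path segment, with either "/" or "\" as the separator.
  if (arg.split(/[\\/]/).some((segment) => segment === "..")) problems.push("path-traversal");
  // Percent-encoded bytes such as %2e, which can hide a traversal.
  if (/%[0-9a-fA-F]{2}/.test(arg)) problems.push("url-encoded");
  return problems;
}

console.log(checkPathArg("meeting.m4a")); // passes all three checks
console.log(checkPathArg("../../etc/passwd")); // flagged as path traversal
console.log(checkPathArg("a%2e%2e/b.mp3")); // flagged as URL-encoded
```

Returning a list of problems instead of a boolean lets the caller report every violation at once, which is handy when the caller is an agent that will retry with a corrected argument.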
What people transcribe with trx
YouTube tutorials
$ trx "https://youtube.com/watch?v=..."
TikTok reels
$ trx "https://tiktok.com/@user/video/..."
Podcast episodes
$ trx podcast.mp3 --language en
Instagram stories
$ trx "https://instagram.com/reel/..."
Meeting recordings
$ trx meeting.m4a --fields text
Twitter/X videos
$ trx "https://x.com/user/status/..."