Paratran audio transcription tool - high-performance, MLX-based ASR for Apple Silicon, about 30x faster than Whisper

Paratran Transcription by briansunter/paratran


Install command

npx skills add https://github.com/briansunter/paratran --skill 'Paratran Transcription'


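Paratran Transcription

Audio transcription for Apple Silicon using parakeet-mlx. Ranked #1 on the Open ASR Leaderboard and roughly 30x faster than Whisper via MLX.

Three interfaces are provided: CLI, REST API, and MCP server.

Setup

Quick run (no install)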

uvx paratran recording.wav

Persistent install

uv tool install paratran

From source

git clone https://github.com/briansunter/paratran.git
cd paratran
uv sync
uv run paratran recording.wav

CLI Transcription

# Transcribe to text (default)
paratran recording.wav

# Multiple files with verbose output
paratran -v file1.wav file2.mp3 file3.m4a

# Output as SRT subtitles
paratran --output-format srt recording.wav

# All formats (txt, json, srt, vtt) to a directory
paratran --output-format all --output-dir ./output recording.wav

# Beam search decoding
paratran --decoding beam recording.wav

# Custom model and cache directory
paratran --model mlx-community/parakeet-tdt-1.1b-v2 --cache-dir /path/to/models recording.wav

CLI Options

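Flag	Default	Description
`--model`	`mlx-community/parakeet-tdt-0.6b-v3`	Hugging Face model ID or local path
`--cache-dir`	Hugging Face default	Model cache directory
`--output-dir`	`.`	Output directory
`--output-format`	`txt`	`txt`, `json`, `srt`, `vtt`, or `all`
`--decoding`	`greedy`	`greedy` or `beam`
`--chunk-duration`	`120`	Chunk duration in seconds (0 to disable)
`--overlap-duration`	`15`	Overlap between chunks, in seconds
`--beam-size`	`5`	Beam size (beam decoding)
`--fp32`		Use FP32 precision instead of BF16
`-v`		Verbose output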

Environment variables: PARATRAN_MODEL, PARATRAN_MODEL_DIR.
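For scripted use, the flags from the CLI examples can be assembled into an argument list before the tool is invoked. The following is a minimal sketch, not part of Paratran itself: `build_paratran_command` and `transcribe_files` are hypothetical helper names, and only flags and values that appear in this document are used.

```python
import subprocess
from typing import Sequence

# Values taken from the CLI examples and options in this document.
OUTPUT_FORMATS = {"txt", "json", "srt", "vtt", "all"}
DECODING_MODES = {"greedy", "beam"}


def build_paratran_command(
    files: Sequence[str],
    output_format: str = "txt",
    output_dir: str = ".",
    decoding: str = "greedy",
    verbose: bool = False,
) -> list[str]:
    """Assemble a paratran argv list (hypothetical helper, not a Paratran API)."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    if decoding not in DECODING_MODES:
        raise ValueError(f"unsupported decoding mode: {decoding}")
    cmd = [
        "paratran",
        "--output-format", output_format,
        "--output-dir", output_dir,
        "--decoding", decoding,
    ]
    if verbose:
        cmd.append("-v")
    cmd.extend(files)
    return cmd


def transcribe_files(files: Sequence[str], **options) -> None:
    """Run the CLI; assumes paratran is installed and available on PATH."""
    subprocess.run(build_paratran_command(files, **options), check=True)
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before any audio is processed.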

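REST API Server

# Start the server
paratran serve

# Custom host, port, and model cache
paratran serve --host 127.0.0.1 --port 9000 --cache-dir /path/to/models

Endpoints

GET /health: returns the model name, status, and cache directory.

POST /transcribe: upload an audio file; returns the transcription as JSON.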

# Basic transcription
curl -X POST http://localhost:8000/transcribe -F "file=@recording.m4a"

# Beam search with sentence splitting
curl -X POST "http://localhost:8000/transcribe?decoding=beam&max_words=20" -F "file=@recording.m4a"

# Extract only the text
curl -s -X POST http://localhost:8000/transcribe -F "file=@audio.m4a" | jq -r '.text'

Query parameters: decoding, beam_size, length_penalty, patience, duration_reward, max_words, silence_gap, max_duration, chunk_duration, overlap_duration, fp32.
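When calling the endpoint from code rather than curl, it can help to build the query string from a checked set of parameter names. This is a sketch under stated assumptions: `build_transcribe_url` is a hypothetical helper, the base URL is the one used in the curl examples, and rendering booleans as lowercase `true`/`false` is an assumption about how the server parses them.

```python
from urllib.parse import urlencode

# Parameter names as listed for POST /transcribe in this document.
ALLOWED_PARAMS = {
    "decoding", "beam_size", "length_penalty", "patience", "duration_reward",
    "max_words", "silence_gap", "max_duration", "chunk_duration",
    "overlap_duration", "fp32",
}


def build_transcribe_url(base_url: str = "http://localhost:8000", **params) -> str:
    """Return the /transcribe URL with validated query parameters."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    # Assumption: booleans are sent as lowercase strings.
    normalized = {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in params.items()
    }
    url = base_url.rstrip("/") + "/transcribe"
    query = urlencode(normalized)
    return f"{url}?{query}" if query else url
```

For example, `build_transcribe_url(decoding="beam", max_words=20)` yields the same URL as the beam-search curl command.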

Response format

{
  "text": "The full transcription text.",
  "duration": 3.52,
  "processing_time": 0.176,
  "sentences": [
    {
      "text": "The full transcription text.",
      "start": 0.0,
      "end": 3.52,
      "tokens": [
        { "text": "The full", "start": 0.0, "end": 0.24 },
        { "text": " transcription text", "start": 0.24, "end": 0.8 }
      ]
    }
  ]
}

Interactive API documentation is available at http://localhost:8000/docs.
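The `duration` and `processing_time` fields of the response make it easy to check how much faster than real time a given transcription ran. The sketch below assumes only the response shape documented above; `summarize_transcription` and the `speed_factor` key are illustrative names, not part of the API.

```python
def summarize_transcription(payload: dict) -> dict:
    """Condense a /transcribe response into a few headline numbers."""
    duration = payload["duration"]                # audio length in seconds
    processing_time = payload["processing_time"]  # wall-clock seconds spent
    # Guard against a zero processing time in very short clips.
    speed = duration / processing_time if processing_time > 0 else float("inf")
    sentences = payload["sentences"]
    return {
        "text": payload["text"],
        "sentence_count": len(sentences),
        "token_count": sum(len(s["tokens"]) for s in sentences),
        "speed_factor": round(speed, 1),  # audio seconds per processing second
    }


# Sample payload mirroring the documented response shape.
sample = {
    "text": "The full transcription text.",
    "duration": 3.52,
    "processing_time": 0.176,
    "sentences": [
        {
            "text": "The full transcription text.",
            "start": 0.0,
            "end": 3.52,
            "tokens": [
                {"text": "The full", "start": 0.0, "end": 0.24},
                {"text": " transcription text", "start": 0.24, "end": 0.8},
            ],
        }
    ],
}

summary = summarize_transcription(sample)
print(summary["speed_factor"])
```

For the sample values, 3.52 seconds of audio processed in 0.176 seconds works out to a factor of 20.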

MCP Server

Paratran includes an MCP server, so Claude Code, Claude Desktop, or any other MCP client can transcribe audio files directly.

Claude Code

Add to .claude/settings.json:

{
  "mcpServers": {
    "paratran": {
      "command": "uvx",
      "args": ["--from", "paratran", "paratran-mcp"]
    }
  }
}

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "paratran": {
      "command": "uvx",
      "args": ["--from", "paratran", "paratran-mcp"]
    }
  }
}

Optionally, set PARATRAN_MODEL_DIR in the env block to customize the model cache location.
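To illustrate where the env block would sit, here is a small sketch that generates the configuration shown above with an optional PARATRAN_MODEL_DIR entry. `build_mcp_config` is a hypothetical helper, and placing `env` next to `command` and `args` inside the server entry is an assumption based on the usual MCP client configuration layout.

```python
import json
from typing import Optional


def build_mcp_config(model_dir: Optional[str] = None) -> dict:
    """Return an mcpServers config dict for Paratran."""
    server = {
        "command": "uvx",
        "args": ["--from", "paratran", "paratran-mcp"],
    }
    if model_dir is not None:
        # Assumption: env is a sibling of command/args in the server entry.
        server["env"] = {"PARATRAN_MODEL_DIR": model_dir}
    return {"mcpServers": {"paratran": server}}


# Print a config that points the model cache at a custom directory.
print(json.dumps(build_mcp_config("/path/to/models"), indent=2))
```

The printed JSON can be merged into either of the two configuration files named above.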

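MCP Tool

The transcribe tool accepts:

file_path (required): absolute path to the audio file
All transcription options: decoding, beam_size, length_penalty, patience, duration_reward, max_words, silence_gap, max_duration, chunk_duration, overlap_duration,

Returns a JSON string containing the full text, duration, processing time, and sentences with word-level timestamps.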
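Because the tool returns a JSON string rather than a parsed object, a client has to decode it before using the timestamps. The sketch below turns such a string into SRT cues; it assumes the string has the same `sentences`, `start`, `end`, and `text` fields as the REST response format, and `format_srt_timestamp` and `result_to_srt` are hypothetical helper names.

```python
import json


def format_srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def result_to_srt(result_json: str) -> str:
    """Convert a transcription result (JSON string) into SRT text."""
    result = json.loads(result_json)
    cues = []
    # One cue per sentence, numbered from 1 as SRT requires.
    for index, sentence in enumerate(result["sentences"], start=1):
        start = format_srt_timestamp(sentence["start"])
        end = format_srt_timestamp(sentence["end"])
        cues.append(f"{index}\n{start} --> {end}\n{sentence['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

This is only an illustration of consuming the sentence timestamps; it is not claimed to match the output of Paratran's own `--output-format srt`.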

