alicloud-ai-audio-asr by cinience/alicloud-skills
npx skills add https://github.com/cinience/alicloud-skills --skill alicloud-ai-audio-asr
Category: provider
mkdir -p output/alicloud-ai-audio-asr
python -m py_compile skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/alicloud-ai-audio-asr/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-audio-asr/validate.txt is generated.
Validation output is written under output/alicloud-ai-audio-asr/.
Use Qwen ASR to transcribe recorded (non-realtime) audio, covering synchronous calls for short audio and asynchronous jobs for long audio.
Use one of these exact model strings:
qwen3-asr-flash
qwen-audio-asr
qwen3-asr-flash-filetrans
Selection guidance:
Use qwen3-asr-flash or qwen-audio-asr for short or normal-length recordings (sync).
Use qwen3-asr-flash-filetrans for long-file transcription (async task workflow).
Create a virtual environment (the script uses only the Python standard library, so there are no SDK packages to install):
python3 -m venv .venv
. .venv/bin/activate
Set DASHSCOPE_API_KEY in environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
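The lookup order described above (environment variable first, then the credentials file) could be sketched as follows. This is only an illustration: the function name resolve_api_key is hypothetical, and the sketch assumes the credentials file is INI-formatted with the key stored in some section such as [default], which the source does not confirm.

```python
import configparser
import os
from pathlib import Path


def resolve_api_key(credentials_path=None, env=None):
    """Return the DashScope API key, or None if it cannot be found.

    Assumption: the credentials file is INI-formatted and the
    dashscope_api_key option may sit in any section (e.g. [default]).
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key  # the environment variable takes precedence

    path = Path(credentials_path or Path.home() / ".alibabacloud" / "credentials")
    if not path.is_file():
        return None

    # Disable interpolation so a '%' inside the key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None  # the file is not in the assumed INI layout

    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback="").strip()
        if value:
            return value
    return None
```

Calling resolve_api_key() with no arguments reads the real environment and the default file location; the two parameters exist only so the lookup can be exercised without touching either.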
Request fields:
audio (string, required): public URL or local file path.
model (string, optional): default qwen3-asr-flash.
language_hints (array, optional): e.g. zh, en.
sample_rate (number, optional)
vocabulary_id (string, optional)
disfluency_removal_enabled (bool, optional)
timestamp_granularities (array, optional): e.g. sentence.
async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans.
Response fields:
text (string): normalized transcript text.
task_id (string, optional): present for async submission.
status (string): SUCCEEDED or submission status.
raw (object): original API response.
Sync transcription (OpenAI-compatible protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'
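For readers who prefer Python over curl, the same synchronous request can be sketched with the standard library alone. The function names below are hypothetical, and extract_text assumes the transcript is returned at choices[0].message.content, as in the usual OpenAI-compatible response shape; check the raw response if that assumption does not hold.

```python
import json
import urllib.request

SYNC_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def build_sync_request(audio, api_key, model="qwen3-asr-flash"):
    """Build the same POST request as the curl example above."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": {"data": audio}}
                ],
            }
        ],
        "stream": False,
        "asr_options": {"enable_itn": False},
    }
    return urllib.request.Request(
        SYNC_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_text(response):
    """Return the transcript text, or "" if it is missing.

    Assumption: the text sits at choices[0].message.content.
    """
    choices = response.get("choices") or []
    if not choices:
        return ""
    content = choices[0].get("message", {}).get("content", "")
    return content if isinstance(content, str) else ""


def transcribe_sync(audio, api_key, timeout=60):
    """Send the request and return the extracted transcript (needs network)."""
    request = build_sync_request(audio, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_text(json.load(resp))
```

Splitting request construction from the network call keeps the payload easy to inspect or log before anything is sent.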
Async long-file transcription (DashScope protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'
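The asynchronous submission above can be mirrored in Python as a rough sketch. The helper names are hypothetical, and extract_task_id assumes the submission response carries the identifier at output.task_id, which the source does not show explicitly.

```python
import json
import urllib.request

SUBMIT_URL = "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription"


def build_submit_request(file_url, api_key, model="qwen3-asr-flash-filetrans"):
    """Build the async submission request shown in the curl example above."""
    payload = {"model": model, "input": {"file_url": file_url}}
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-DashScope-Async": "enable",  # asks the service to create a task
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_task_id(response):
    """Return the task id, or None if absent.

    Assumption: the id is located at output.task_id.
    """
    return (response.get("output") or {}).get("task_id")
```

The returned task id is what the polling endpoint expects in place of <task_id>.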
Poll task result:
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY"
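Polling usually means repeating the request above until the task reaches a final state. The loop below is a minimal sketch under stated assumptions: the status is read from output.task_status, and the set of terminal values other than SUCCEEDED (FAILED, CANCELED, UNKNOWN) is assumed rather than taken from the source. The function names and the interval and timeout defaults are illustrative.

```python
import json
import time
import urllib.request

TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
# Assumption: statuses that mean the task will not change any further.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def fetch_task(task_id, api_key, timeout=30):
    """GET the task record once (needs network)."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def wait_for_task(task_id, fetch, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Call fetch(task_id) until a terminal status or the timeout is reached.

    Assumption: the status is located at output.task_status.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(task_id)
        status = (result.get("output") or {}).get("task_status")
        if status in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

A typical call would be wait_for_task(task_id, lambda t: fetch_task(t, api_key)); passing the fetch function in makes the loop usable with a stub when no network is available.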
Use the bundled script for URL/local-file input and optional async polling:
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash \
  --language-hints zh,en \
  --print-response
Long-file mode:
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash-filetrans \
  --async \
  --wait
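When the audio is a local file rather than a public URL, this document mentions passing a data URI in input_audio.data. One way that conversion might look is sketched below; the function names are hypothetical, this is not necessarily how the bundled script does it, and the MIME type is simply guessed from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def audio_to_data_uri(path):
    """Encode a local audio file as a base64 data URI."""
    path = Path(path)
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None:
        mime = "application/octet-stream"  # fallback for unknown extensions
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def resolve_audio_input(audio):
    """Pass URLs and data URIs through; convert local paths to data URIs."""
    if audio.startswith(("http://", "https://", "data:")):
        return audio
    return audio_to_data_uri(audio)
```

Base64 makes the payload roughly a third larger than the file, so this approach suits short recordings better than long ones.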
Use input_audio.data (data URI) when a direct URL is unavailable.
Keep language_hints minimal to reduce recognition ambiguity.
Save transcripts under output/alicloud-ai-audio-asr/transcripts/.
Default output directory: output/alicloud-ai-audio-asr/transcripts/
Override the base directory with the OUTPUT_DIR environment variable.
References:
references/api_reference.md
references/sources.md
Related skill: skills/ai/audio/alicloud-ai-audio-tts-realtime/
Weekly Installs: 162
Repository: cinience/alicloud-skills
GitHub Stars: 340
First Seen: 10 days ago
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on: gemini-cli (161), github-copilot (161), codex (161), kimi-cli (161), amp (161), cline (161)