alicloud-ai-multimodal-qwen-vl by cinience/alicloud-skills
npx skills add https://github.com/cinience/alicloud-skills --skill alicloud-ai-multimodal-qwen-vl
Category: provider
mkdir -p output/alicloud-ai-multimodal-qwen-vl
python -m py_compile skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py && echo "py_compile_ok" > output/alicloud-ai-multimodal-qwen-vl/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-multimodal-qwen-vl/validate.txt is generated.
Output directory: output/alicloud-ai-multimodal-qwen-vl/.
Use Qwen VL models for image understanding tasks (image input, text output) via the DashScope compatible-mode API.
Install dependencies (recommended in a venv):
python3 -m venv .venv
. .venv/bin/activate
python -m pip install requests
Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
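The lookup order described above (environment variable first, then the credentials file) could be sketched as follows. This is not taken from the skill's own script: `resolve_api_key` is a hypothetical helper, and the INI-style layout of `~/.alibabacloud/credentials` is an assumption.

```python
import configparser
import os
from pathlib import Path


def resolve_api_key(env=None, credentials_path=None):
    """Return the DashScope API key, or None if it cannot be found."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key.strip()

    # Assumption: the credentials file is INI-style and stores
    # dashscope_api_key under some section, for example [default].
    path = Path(credentials_path or Path.home() / ".alibabacloud" / "credentials")
    if not path.is_file():
        return None
    parser = configparser.ConfigParser()
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None  # Not an INI file after all; treat as "no key found".

    candidates = [parser.defaults()] + [parser[name] for name in parser.sections()]
    for section in candidates:
        value = section.get("dashscope_api_key")
        if value:
            return value.strip()
    return None
```

Passing `env` and `credentials_path` explicitly keeps the helper easy to test without touching the real environment.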
Prefer the Qwen3 VL family:
- qwen3-vl-plus
- qwen3-vl-flash
When you need explicit "latest" routing or reproducible snapshots, use supported aliases/snapshots from the official model list, such as:
- qwen3-vl-plus-latest
- qwen3-vl-plus-2025-12-19
- qwen3-vl-flash-latest
Legacy names still seen in some workloads:
- qwen-vl-max-latest
- qwen-vl-plus-latest
Request fields:
- prompt (string, required): user question or instruction about the image.
- image (string, required): HTTPS URL, local path, or data: URL.
- model (string, optional): default qwen3-vl-plus.
- max_tokens (int, optional): default 512.
- temperature (float, optional): default 0.2.
- detail (string, optional): auto/low/high, default auto.
Response fields:
- text (string): primary model answer.
- model (string): model actually used.
- usage (object): token usage, if returned by the backend.
Using an image URL:
python skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py \
--request '{"prompt":"Summarize the main content in this image","image":"https://example.com/demo.jpg"}' \
--print-response
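The normalized response (text, model, usage) might be derived from the backend reply roughly as below. This is a sketch under the assumption that the compatible-mode endpoint returns an OpenAI-style chat-completions body with `choices[0].message.content`; `normalize_response` is a hypothetical name, not necessarily what `analyze_image.py` does.

```python
def normalize_response(raw, requested_model):
    """Map a chat-completions style response to {text, model, usage}."""
    choices = raw.get("choices") or []
    message = choices[0].get("message", {}) if choices else {}
    content = message.get("content") or ""
    # Some OpenAI-style backends return content as a list of typed parts.
    if isinstance(content, list):
        content = "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    return {
        "text": content,
        # Fall back to the requested model if the backend omits the field.
        "model": raw.get("model") or requested_model,
        "usage": raw.get("usage") or {},
    }
```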
Using a local image:
python skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py \
--request '{"prompt":"Extract key information from the image","image":"./samples/invoice.png","model":"qwen3-vl-plus"}' \
--print-response
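Since the image field accepts an HTTPS URL, a local path, or a data: URL, one plausible way to handle a local path is to inline the file as a base64 data: URL before sending it. The helper below is a hypothetical sketch of that idea, not the script's confirmed behavior.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_url(image):
    """Return a URL usable in an image_url part: pass URLs through, inline files."""
    if image.startswith(("https://", "data:")):
        return image

    # Anything else is treated as a local file path.
    path = Path(image)
    if not path.is_file():
        raise FileNotFoundError(f"image not found: {image}")
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type for: {image}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Inlining grows the request body by roughly a third, so large images may be better served from an HTTPS URL.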
Structured extraction (JSON mode):
python skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py \
--request '{"prompt":"Extract fields: title, amount, date","image":"./samples/invoice.png"}' \
--json-mode \
--print-response
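JSON mode returns JSON "when possible", so a caller may still want to parse the text field defensively. The sketch below assumes the model sometimes wraps its answer in a Markdown code fence; `parse_json_text` is a hypothetical helper.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_json_text(text):
    """Parse model output as JSON, tolerating a surrounding code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        # Drop both fence markers, then the optional language tag line.
        body = cleaned[len(FENCE):-len(FENCE)]
        first_newline = body.find("\n")
        cleaned = body[first_newline + 1:] if first_newline != -1 else body
    # Raises json.JSONDecodeError (a ValueError) if the text is not JSON.
    return json.loads(cleaned)
```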
Structured extraction (JSON Schema):
python skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py \
--request '{"prompt":"Extract invoice fields","image":"./samples/invoice.png"}' \
--schema skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/references/examples/invoice.schema.json \
--print-response
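After a schema-guided extraction it can be useful to check the result before trusting it. The following is a deliberately small sketch that only checks `required` and top-level primitive `type` keywords; it is not a full JSON Schema validator, and the schema shown is hypothetical (the contents of `invoice.schema.json` are not given here).

```python
TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(data, schema):
    """Return a list of problems; an empty list means the basic checks passed."""
    errors = []
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in data:
            continue
        type_name = spec.get("type")
        expected = TYPE_MAP.get(type_name)
        if expected is None:
            continue  # Unknown or absent type keyword: nothing to check.
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"{name}: expected {type_name}, got {type(value).__name__}")
    return errors


# Hypothetical schema, for illustration only.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "amount": {"type": "number"},
        "date": {"type": "string"},
    },
    "required": ["title", "amount", "date"],
}
```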
curl -sS https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model":"qwen3-vl-plus",
"messages":[
{
"role":"user",
"content":[
{"type":"image_url","image_url":{"url":"https://example.com/demo.jpg"}},
{"type":"text","text":"Describe this image and list executable actions"}
]
}
],
"max_tokens":512,
"temperature":0.2
}'
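For callers who prefer Python over curl, the same request could be assembled with the standard library. This is a sketch rather than the skill's `analyze_image.py`: `build_payload`, `build_request`, and `send` are hypothetical names, and the defaults mirror the values in the curl example.

```python
import json
import urllib.request

ENDPOINT = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def build_payload(prompt, image_url, model="qwen3-vl-plus",
                  max_tokens=512, temperature=0.2):
    """Build the chat-completions body with one image part and one text part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def build_request(payload, api_key):
    """Wrap the payload in an authenticated POST request."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(payload, api_key, timeout=60):
    """POST the payload and return the decoded JSON response (network call)."""
    with urllib.request.urlopen(build_request(payload, api_key), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping payload construction separate from the network call makes the request shape easy to inspect or unit-test offline.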
If --output is set, the JSON response is saved to that file.
Output directory: output/alicloud-ai-multimodal-qwen-vl/.
Smoke test:
python tests/ai/multimodal/alicloud-ai-multimodal-qwen-vl-test/scripts/smoke_test_qwen_vl.py \
--image ./tmp/vl_test_cat.png
| Error | Likely cause | Action |
|---|---|---|
| 401/403 | Missing or invalid key | Check DASHSCOPE_API_KEY and account permissions. |
| 400 | Invalid request schema or unsupported image source | Validate messages content and image URL/path format. |
| 429 | Rate limit | Retry with exponential backoff and lower concurrency. |
| 5xx | Temporary backend issue | Retry with backoff and idempotent request design. |
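The retry guidance in the table can be sketched as a small wrapper. The names here (`HTTPStatusError`, `call_with_retries`) are hypothetical, and the delay formula (base seconds times 2 to the power of the attempt number) is one common reading of "exponential backoff", assumed rather than taken from the script.

```python
import time


class HTTPStatusError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    """429 and 5xx are retried; 400/401/403 are caller errors and are not."""
    return status == 429 or 500 <= status <= 599


def call_with_retries(send, max_retries=2, retry_backoff_s=1.5, sleep=time.sleep):
    """Call send(); on a retryable failure wait base * 2**attempt and try again."""
    attempt = 0
    while True:
        try:
            return send()
        except HTTPStatusError as exc:
            if not is_retryable(exc.status) or attempt >= max_retries:
                raise
            sleep(retry_backoff_s * (2 ** attempt))
            attempt += 1
```

With `max_retries=2` and `retry_backoff_s=1.5`, the waits are 1.5 s and then 3.0 s before the final attempt. Lowering concurrency, as the table suggests for 429, is a separate concern that this wrapper does not address.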
… -latest.
References:
- references/sources.md
- references/api_reference.md
Additional optional request fields:
- json_mode (bool, optional): return a JSON-only response when possible.
- schema (object, optional): JSON Schema for structured extraction.
- max_retries (int, optional): retry count for 429/5xx, default 2.
- retry_backoff_s (float, optional): base seconds for exponential backoff, default 1.5.