triaging-issues by pytorch/pytorch
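npx skills add https://github.com/pytorch/pytorch --skill triaging-issues
Contains Hooks
This skill uses Claude hooks, which can execute code automatically in response to events. Review carefully before installing.
This skill helps triage GitHub issues by routing issues, applying labels, and leaving first-line responses.
Labels reference: See `labels.json` for the full catalog of 305 labels suitable for triage. ONLY apply labels that exist in this file. Do not invent or guess label names. This file excludes CI triggers, test configs, release notes, and deprecated labels.
PT2 triage guide: See `pt2-triage-rubric.md` for detailed labeling guidance when triaging PT2/`torch.compile` issues.
Response templates: See `templates.json` for standard response messages.
Use these GitHub MCP tools for triage: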
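| Tool | Purpose |
|---|---|
| `mcp__github__issue_read` | Get issue details, comments, and existing labels |
| `mcp__github__issue_write` | Apply labels or close issues |
| `mcp__github__add_issue_comment` | Add comment (only for redirecting questions) |
| `mcp__github__search_issues` | Find similar issues for context |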
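Never add labels in the following categories:
| Prefix/Category | Reason |
|---|---|
| Labels not in `labels.json` | Only apply labels that exist in the allowlist |
| `ciflow/*` | CI job triggers for PRs only |
| `test-config/*` | Test suite selectors for PRs only |
| `release notes: *` | Auto-assigned for release notes |
| `ci-*`, `ci:*` | CI infrastructure controls |
| `sev*` | Severity labels require human decision |
| `merge blocking` | Requires human decision |
| Any label containing "deprecated" | Obsolete |
| `oncall: releng` | Not a triage redirect target. Use `module: ci` instead |
If blocked: When a label is blocked by the hook, add ONLY `triage review` and stop. A human will handle it.
These rules are enforced by a PreToolUse hook that validates all labels against `labels.json`.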
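The label rules above can be sketched as a small validation function. This is only an illustration of the checks the table describes, not the actual PreToolUse hook: the `ALLOWLIST` contents, the function names, and the reason strings are hypothetical, and the real hook reads its allowlist from `labels.json`.

```python
import fnmatch

# Hypothetical stand-in for the label names loaded from labels.json.
ALLOWLIST = {"module: ci", "module: crash", "triage review", "triaged", "oncall: pt2"}

# Glob patterns and exact names taken from the table of forbidden labels.
FORBIDDEN_PATTERNS = ["ciflow/*", "test-config/*", "release notes: *", "ci-*", "ci:*", "sev*"]
FORBIDDEN_EXACT = {"merge blocking", "oncall: releng"}


def blocked_reason(label, allowlist=ALLOWLIST):
    """Return why a label must not be added, or None if it is allowed."""
    if label in FORBIDDEN_EXACT:
        return "requires a human decision or is not a redirect target"
    if "deprecated" in label.lower():
        return "obsolete"
    for pattern in FORBIDDEN_PATTERNS:
        if fnmatch.fnmatchcase(label, pattern):
            return f"matches forbidden pattern {pattern!r}"
    if label not in allowlist:
        return "not in labels.json"
    return None


def labels_to_apply(requested, allowlist=ALLOWLIST):
    """If any requested label is blocked, fall back to only 'triage review'."""
    if any(blocked_reason(label, allowlist) for label in requested):
        return ["triage review"]
    return list(requested)
```

Calling `labels_to_apply(["module: crash", "sev1"])` in this sketch yields only `triage review`, mirroring the "If blocked" rule.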
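If a human has already applied labels (especially `ci: sev`, severity labels, or priority labels), do NOT remove or replace them. Your job is to supplement, not override.
If an issue already has ANY `oncall:` label, SKIP IT entirely. Do not:
- Add `triaged`
That issue belongs to the sub-oncall team. They own their queue.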
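- … `redirect_to_forum` template from `templates.json`.
- … `request_more_info` template and stop.
Check if the issue body contains links to external files that users would need to download to reproduce.
Patterns to detect:
- Links to `.zip`, `.pt`, `.pth`, `.pkl`, `.safetensors`, `.onnx`, `.bin` files
Action:
- Replace the links with `[Link removed - external file downloads are not permitted for security reasons]`
- Add the `needs reproduction` label
- Use the `needs_reproduction` template from `templates.json` to request a self-contained reproduction
- Do not add `triaged`; wait for the user to provide a reproducible example
Also add `needs reproduction` when: …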
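Returning to the external-file check described above, the detection and replacement steps might look roughly like the following. This is a minimal sketch under stated assumptions: the regular expression, the helper name `scrub_file_links`, and the choice to match only `http(s)` URLs are illustrative and are not taken from the skill itself.

```python
import re

# Extensions and placeholder text come from the section above.
RISKY_EXTENSIONS = (".zip", ".pt", ".pth", ".pkl", ".safetensors", ".onnx", ".bin")
PLACEHOLDER = "[Link removed - external file downloads are not permitted for security reasons]"

# Match an http(s) URL whose path ends in one of the listed extensions,
# optionally followed by a query string or fragment.
_ext = "|".join(re.escape(e) for e in RISKY_EXTENSIONS)
FILE_LINK = re.compile(
    rf"https?://[^\s)\]>]+?(?:{_ext})(?:[?#][^\s)\]>]*)?(?=[\s)\]>]|$)",
    re.IGNORECASE,
)


def scrub_file_links(body):
    """Return (new_body, found), with file-download links replaced by the placeholder."""
    new_body, count = FILE_LINK.subn(PLACEHOLDER, body)
    return new_body, count > 0
```

When `found` is true, the remaining actions (adding `needs reproduction` and posting the `needs_reproduction` template) would follow; they are left out here because they depend on the MCP tools.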
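If the issue involves extremal values or numerical precision differences:
Patterns to detect:
- Values of `torch.finfo(dtype).max` or `torch.finfo(dtype).min`
IMPORTANT: avoid keyword-triggered mislabeling.
Label based on the root cause, not keywords that appear in the error or title. A keyword tells you what failed, not why.
- An `undefined symbol: ncclAlltoAll` error at `import torch` is a packaging issue (`module: binaries`), not a distributed training bug; the user never ran distributed code.
- `nan` in a parameter name or tolerance check is not `module: NaNs and Infs` unless the bug is actually about NaN propagation.
- A stack trace through `autograd` does not mean `module: autograd`; check whether the bug is in autograd itself or just on the call path.
- … `module: tests`, not `module: numerical-stability`.
Ask: "Where would the fix need to be made?" That determines the label.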
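Action:
- Add the `module: edge cases` label
- … `topic: fuzzer`
- Use the `numerical_accuracy` template from `templates.json` to link to the docs
If the issue belongs in another repo (vision/text/audio/RL/ExecuTorch/etc.), transfer the issue and STOP.
PT2 is NOT a redirect. `oncall: pt2` is not like the other oncall labels in Step 3. PT2 issues continue through Steps 4–7 for full triage: add `oncall: pt2`, then proceed to label with `module:` labels, mark `triaged`, etc.
See `pt2-triage-rubric.md` for detailed labeling decisions on which `module:` labels to apply.
CRITICAL: When redirecting issues to a non-PT2 oncall queue, apply exactly one `oncall: ...` label and STOP. Do NOT:
- Add `module:` labels
- Add `triaged`
The sub-oncall team will handle their own triage. Your job is only to route it to them.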
| Label | When to use |
|---|---|
| `oncall: jit` | TorchScript issues |
| `oncall: distributed` | Distributed training (DDP, FSDP, RPC, c10d, DTensor, DeviceMesh, symmetric memory, context parallel, pipelining) |
| `oncall: export` | `torch.export` issues |
| `oncall: quantization` | Quantization issues |
| `oncall: mobile` | Mobile (iOS/Android), excludes ExecuTorch |
| `oncall: profiler` | Profiler issues (CPU, GPU, Kineto) |
| `oncall: visualization` | TensorBoard integration |
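The routing rules around this table (skip issues that already carry an `oncall:` label, stop after a non-PT2 redirect, continue triage for PT2) can be summarized as a small decision function. This is a sketch for illustration only: the `Action` enum and `plan_routing` are hypothetical names, and choosing which queue an issue belongs to still requires reading the issue.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"                   # issue already routed; do nothing
    REDIRECT_AND_STOP = "redirect"  # add one oncall label, then stop
    CONTINUE = "continue"           # keep triaging (general queue or PT2)


# Non-PT2 redirect targets listed in the table above.
NON_PT2_ONCALLS = {
    "oncall: jit", "oncall: distributed", "oncall: export", "oncall: quantization",
    "oncall: mobile", "oncall: profiler", "oncall: visualization",
}


def plan_routing(existing_labels, chosen_oncall=None):
    """Return (action, labels_to_add) for an issue.

    existing_labels: labels already on the issue.
    chosen_oncall: the queue the triager picked after reading the issue, or None.
    """
    if any(label.startswith("oncall:") for label in existing_labels):
        return Action.SKIP, []
    if chosen_oncall is None:
        return Action.CONTINUE, []
    if chosen_oncall == "oncall: pt2":
        # PT2 is not a redirect: add the label and continue full triage.
        return Action.CONTINUE, ["oncall: pt2"]
    if chosen_oncall not in NON_PT2_ONCALLS:
        raise ValueError(f"not a redirect target: {chosen_oncall!r}")
    return Action.REDIRECT_AND_STOP, [chosen_oncall]
```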
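Common routing mistakes to avoid:
- MPS issues stay in the general queue with `module: mps`; do not route them to `oncall: mobile`.
- DTensor issues should always be routed to `oncall: distributed`, even if they don't mention DDP/FSDP.
- There is no `oncall: onnx`. Use `module: onnx` and keep the issue in the general queue.
- Do not use `oncall: releng`. Use `module: ci` for CI infrastructure issues.
- When `torch.compile` mishandles a distributed op (e.g., `dist.all_reduce`), the issue typically needs BOTH `oncall: pt2` and `oncall: distributed`, since the fix may span both codebases.
Note: `oncall: cpu inductor` is a sub-queue of PT2. For general triage, just use `oncall: pt2`.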
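Only if the issue stays in the general queue:
- Add `module: ...` labels based on the affected area
- Check the `labels.json` descriptions for guidance on when a specific label supersedes a general one (e.g., `module: sdpa` instead of `module: nn` for SDPA issues, `module: flex attention` instead of `module: nn` for flex attention).
- `feature`: wholly new functionality that does not exist today in any form
- `enhancement`: improvement to something that already works (e.g., adding a native backend kernel for an op that already runs via fallback/composite, performance optimization, better error messages). If the enhancement is about performance, also add `module: performance`.
- `function request`: a new function or new arguments/modes for an existing function
- … `enhancement`, not `feature`
Commonly missed labels (always check for these):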
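| Condition | Label |
|---|---|
| Segfault, illegal memory access, SIGSEGV | `module: crash` |
| Performance issue: regression, slowdown, or optimization request | `module: performance` |
| Issue on Windows | `module: windows` |
| Previously working feature now broken | `module: regression` |
| Broken docs/links that previously worked | `module: docs` + `module: regression` (NOT `enhancement`) |
| Issue about a test failing (not the underlying functionality) | `module: tests` |
| Backward pass / gradient computation bug | `module: autograd` (in addition to the op's module label) |
| `torch.linalg` ops or linear algebra ops (solve, svd, eig, inv, etc.) | `module: linear algebra` |
`has workaround`: Only add when the workaround is non-trivial and non-obvious. If the issue is "X doesn't work for non-contiguous tensors," calling `.contiguous()` is the tautological inverse of the bug, not a workaround. A real workaround is something like installing a specific package version, adding a synchronization point, inserting `gc.collect()`, or using a different API that isn't obviously implied by the bug description.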
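One way to make the checklist above harder to skip is to record the triager's conclusions as explicit fields and derive the labels from them. The sketch below assumes a hypothetical `Findings` record that is filled in after reading the issue; it deliberately does not scan the text for keywords.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Conclusions reached after reading the issue (hypothetical structure)."""
    crashes: bool = False            # segfault, illegal memory access, SIGSEGV
    performance: bool = False        # regression, slowdown, or optimization request
    on_windows: bool = False
    regression: bool = False         # previously working feature now broken
    broken_docs: bool = False        # docs/links that previously worked
    test_only_failure: bool = False  # the test fails, not the underlying functionality
    backward_bug: bool = False       # backward pass / gradient computation bug
    linear_algebra: bool = False     # torch.linalg or linear algebra ops


def commonly_missed_labels(f):
    """Map findings to the labels in the table, without duplicates."""
    labels = []

    def add(label):
        if label not in labels:
            labels.append(label)

    if f.crashes:
        add("module: crash")
    if f.performance:
        add("module: performance")
    if f.on_windows:
        add("module: windows")
    if f.regression:
        add("module: regression")
    if f.broken_docs:
        add("module: docs")
        add("module: regression")
    if f.test_only_failure:
        add("module: tests")
    if f.backward_bug:
        add("module: autograd")
    if f.linear_algebra:
        add("module: linear algebra")
    return labels
```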
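Label based on the actual bug, not keywords. Read the issue to understand what is actually broken. A bug about broadcasting that happens to mention "nan" in a parameter name is a frontend bug, not a NaN/Inf bug.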
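CRITICAL: If you believe an issue is high priority, you MUST:
- Add the `triage review` label and do not add `triaged`
Do NOT directly add `high priority` without human confirmation.
High priority criteria: …
The `bot-triaged` label is automatically applied by a post-hook after any issue mutation. You do not need to add it manually.
If not transferred/redirected and not flagged for review, add `triaged`.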
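The closing rules (no `triaged` after a transfer or redirect, `triage review` instead of `high priority`, `triaged` otherwise) reduce to a short decision. The function below is a hedged sketch with a hypothetical name and parameters; it never returns `high priority` or `bot-triaged`, because the former needs human confirmation and the latter is applied by the post-hook.

```python
def status_label(transferred_or_redirected, believed_high_priority=False,
                 flagged_for_review=False):
    """Return the status label to add at the end of triage, or None."""
    if transferred_or_redirected:
        # The receiving repo or oncall team owns the rest of the triage.
        return None
    if believed_high_priority or flagged_for_review:
        # A human decides on priority; do not add 'triaged' here.
        return "triage review"
    return "triaged"
```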
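DO NOT:
- Add `high priority` directly without human confirmation
DO:
- Add `triage review` for human attention
- Add `feature`, `enhancement`, or `function request` when confident
- Add the `triaged` label when classification is complete
Note: `bot-triaged` is automatically applied by a post-hook after any issue mutation.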
Weekly Installs
173
Repository
GitHub Stars
98.5K
First Seen
Jan 29, 2026
Security Audits
Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Warn
Installed on
opencode: 169
gemini-cli: 168
codex: 167
cursor: 166
github-copilot: 165
claude-code: 163