Diagnosing .NET Production Performance Issues: Tool Selection and Data Collection with dotnet-trace-collect

dotnet-trace-collect by dotnet/skills

Install command

npx skills add https://github.com/dotnet/skills --skill dotnet-trace-collect

.NET Trace Collect

This skill helps developers diagnose production performance issues by recommending the right diagnostic tools for their environment, guiding data collection, and suggesting analysis approaches. It does not analyze code for anti-patterns or perform the analysis itself.

When to Use

A developer needs to investigate a production performance issue (high CPU, memory leak, slow requests, excessive GC, networking errors, etc.)
Choosing the right diagnostic tool for a specific runtime, OS, or deployment topology
Setting up and running diagnostic tool commands for data collection
Understanding trade-offs between available tools (e.g. PerfView vs dotnet-trace)
Collecting diagnostics from containerized or Kubernetes workloads

When Not to Use

Reviewing source code for performance anti-patterns (use a code review skill instead)
Benchmarking during development (e.g. BenchmarkDotNet setup)
Analyzing collected trace or dump files (this skill recommends tools for analysis, but does not perform it)

Inputs

Input	Required	Description
Symptom	Yes	What the developer is observing (high CPU, memory growth, slow requests, hangs, excessive GC, HTTP 5xx errors, networking timeouts, connection failures, assembly loading failures, etc.)
Runtime	Yes	.NET Framework or modern .NET (and version, especially whether .NET 10+)
OS	Yes	Windows or Linux
Deployment	Yes	Non-container, container, or Kubernetes
Admin privileges	Recommended	Whether the developer has admin/root access on the target machine
Repro characteristics	Recommended	Whether the issue is easy to reproduce or requires a long time to manifest

Workflow

Step 1: Understand the environment

Determine or ask the developer to clarify:

Symptom: What they are observing (high CPU, memory leak, slow requests, hangs, excessive GC, HTTP 5xx errors, networking timeouts, connection failures, assembly loading failures, etc.)
Runtime: .NET Framework or modern .NET? If modern .NET, which version? (Especially whether .NET 10 or later.)
OS: Windows or Linux?
Deployment: Running directly on the host, in a container, or in Kubernetes?
Admin privileges: Do they have admin/root access on the target machine or container?
Repro characteristics: Does the issue reproduce quickly, or does it take a long time to manifest?
Workload context: Determine or ask the user whether you are running in the context of the workload (i.e., on the same machine or connected to the same environment where the issue is occurring). If so, you can run diagnostic commands directly on their behalf. If not, provide the commands as guidance for the user to run themselves.

Use this information to select the right tool in Step 2.

Step 2: Recommend diagnostic tools

Select tools based on the environment using the priority rules below. Once a tool is selected, load the corresponding reference file for detailed command-line usage.

Tool reference lookup

Environment	Reference file(s)
Windows + modern .NET + admin	`references/perfview.md`
Windows + modern .NET, no admin	`references/dotnet-trace-collect.md`
Windows + .NET Framework	`references/perfview.md`
Linux + .NET 10+ + root	`references/dotnet-trace-collect-linux.md`
Linux + pre-.NET 10	`references/dotnet-trace-collect.md`
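Linux + native stacks needed	`references/perfcollect.md`
Container/K8s (console access)	`references/dotnet-trace-collect.md` (or `dotnet-trace-collect-linux.md`)
Container/K8s (no console)	`references/dotnet-monitor.md`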

Quick decision matrix (first-pass triage)

Environment	Preferred tool	Fallback / Notes
Windows + modern .NET + admin	PerfView	If admin is unavailable, use `dotnet-trace`
Windows + .NET Framework + admin	PerfView	Without admin, there is no trace fallback; for hangs/memory leaks, provide dump commands directly (`procdump -ma` or Task Manager) since `dump-collect` does not support .NET Framework
Linux + .NET 10+ + root	`dotnet-trace collect-linux`	Use `dotnet-trace` if root or kernel prerequisites are not met
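Linux + pre-.NET 10	`dotnet-trace`	Add `perfcollect` when native stacks are needed (requires root)
Linux container/Kubernetes	Console tools if running in the workload context; `dotnet-monitor` if there is no console access	See the Linux Container / Kubernetes section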
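
The matrix above can be read as a small lookup. The Python sketch below is illustrative only: the function name `recommend_tool` and all of its parameters are hypothetical and not part of the skill. It covers first-pass triage and does not model kernel prerequisites or Windows containers.

```python
def recommend_tool(os_name: str, runtime: str, elevated: bool,
                   dotnet_major: int = 0, container: bool = False,
                   console_access: bool = True) -> str:
    """First-pass triage following the quick decision matrix.

    All parameter names are illustrative assumptions. `runtime` is either
    "framework" (.NET Framework) or "modern" (modern .NET).
    """
    os_name = os_name.lower()
    if os_name == "linux" and container:
        # Console tools when in the workload context, dotnet-monitor otherwise.
        return "console tools" if console_access else "dotnet-monitor"
    if os_name == "windows":
        if runtime == "framework":
            # There is no trace fallback for .NET Framework without admin.
            return "PerfView" if elevated else "dumps only (procdump -ma)"
        return "PerfView" if elevated else "dotnet-trace"
    if os_name == "linux":
        if dotnet_major >= 10 and elevated:
            return "dotnet-trace collect-linux"
        return "dotnet-trace"
    raise ValueError(f"unsupported OS: {os_name!r}")


print(recommend_tool("Windows", "modern", elevated=False))
print(recommend_tool("Linux", "modern", elevated=True, dotnet_major=10))
```

The result is only a starting point; the per-environment sections that follow refine it by symptom.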

Windows (non-container, modern .NET)

PerfView (preferred): produces richer ETW-based data; requires admin privileges. For slow requests, add /ThreadTime to capture thread-level wait and block detail.
dotnet-trace: fallback when admin privileges are not available.
For long-running repros: use PerfView with a /StopOn trigger that fires on the symptom you want to capture (e.g., /StopOnPerfCounter, /StopOnGCEvent, /StopOnException) and a circular buffer (/CircularMB + /BufferSizeMB). Critical: the stop trigger must fire on the interesting event, not the recovery. The circular buffer continuously overwrites old data, so if you trigger on recovery, the buffer may have already overwritten the interesting behavior by the time collection stops. Only add /StartOn if the start event is known to precede the stop event. For slow requests, do not include a stop trigger by default; let the user design one based on their specific scenario.
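
To see why the stop trigger has to fire on the symptom rather than the recovery, here is a toy Python simulation of a circular buffer. It does not model PerfView itself; the buffer capacity, event names, and timeline are invented for illustration.

```python
from collections import deque


def capture(events, stop_on, capacity):
    """Record events into a fixed-size ring and stop at the first `stop_on`."""
    ring = deque(maxlen=capacity)  # oldest entries are dropped when full
    for event in events:
        ring.append(event)
        if event == stop_on:
            break
    return list(ring)


# Hypothetical timeline: a CPU spike, a long quiet stretch, then recovery.
timeline = ["idle", "spike", "spike"] + ["idle"] * 5 + ["recovered"]

on_symptom = capture(timeline, stop_on="spike", capacity=4)
on_recovery = capture(timeline, stop_on="recovered", capacity=4)

print("spike" in on_symptom)   # True: the symptom is still in the buffer
print("spike" in on_recovery)  # False: it was overwritten before the stop
```

With a real trace the same effect applies at a larger scale: the longer the gap between symptom and stop, the less of the symptom survives in the buffer.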

Windows containers

PerfView — most Windows containers (including Kubernetes on Windows) use process-isolation by default. Collect from the host with /EnableEventsInContainers. After collection, you have two options:
- Analyze locally while the container is still running — PerfView can reach into the live container to resolve symbols, so you can open the trace immediately on the host machine.
- Analyze off-machine — before the container shuts down, copy the .etl.zip into the container and run PerfViewCollect merge /ImageIDsOnly inside it to embed symbol information. Then copy the merged trace out. Without this merge step, symbols for binaries inside the container will be unresolvable on other machines.

For the less common Hyper-V containers, collect inside the container directly. See references/perfview.md for detailed commands.

dotnet-monitor, dotnet-trace: use inside the container if the tools are installed in the image. For dumps, invoke the dump-collect skill.

Windows (.NET Framework)

PerfView — the primary diagnostic tool for .NET Framework on Windows. Requires admin.
Same trigger guidance for long repros: use /StopOn triggers that fire on the symptom (e.g., /StopOnPerfCounter, /StopOnGCEvent, /StopOnException) with /CircularMB + /BufferSizeMB.
Without admin: PerfView requires admin, and there are no alternative trace tools for .NET Framework. Process dumps can still be captured without admin; provide dump commands directly (e.g., procdump -ma <PID> or Task Manager) since the dump-collect skill does not support .NET Framework. Dumps can help diagnose hangs and memory leaks. However, for high CPU, slow requests, and excessive GC, there is no way to investigate on .NET Framework without admin access. Advise the user to obtain admin privileges.

Linux (non-container, .NET 10+)

dotnet-trace collect-linux (preferred) — uses perf_events for richer traces including native call stacks and kernel events. Captures machine-wide by default (no PID required). Requires root and kernel >= 6.4.
dotnet-trace — fallback when root privileges are not available or kernel requirements are not met. Managed stacks only.

Linux (non-container, pre-.NET 10)

dotnet-trace (preferred) — managed trace collection; no admin required.
perfcollect — when native call stacks are needed (requires admin/root).

Linux Container / Kubernetes

If running in the context of the workload (i.e., you have console access to the container), prefer console-based tools. These are easier to set up than dotnet-monitor, which requires authentication configuration and sidecar deployment:

dotnet-trace collect-linux (.NET 10+ with root) — produces the richest traces including native call stacks and kernel events.
dotnet-trace — inside the container if the tool is installed in the image. For dumps, invoke the dump-collect skill.
perfcollect — inside the container when native stacks are needed on pre-.NET 10 (requires SYS_ADMIN / --privileged).

If not running in the workload context (no console access), or if dotnet-monitor is already deployed:

dotnet-monitor — designed for containers; runs as a sidecar. No tools needed in the app container. Easiest option when console access is not available.

Memory dumps

When dumps are needed (memory leaks, hangs), do not provide dump collection commands directly for modern .NET; invoke the dump-collect skill instead. The dump-collect skill only supports modern .NET (.NET Core 3.0+). For .NET Framework, provide dump collection guidance directly (e.g., procdump -ma <PID> or Task Manager). This skill focuses on trace collection only.

Memory leaks

Capture two dumps as memory is increasing (e.g., one early, one after significant growth). Invoke the dump-collect skill for dump collection — do not provide dump commands directly. Diff the dumps in PerfView to see which objects have increased — this is the most effective way to identify what is leaking.
Without admin privileges: Two process dumps can give a sense of what's growing on the heap, but may not be enough to identify the root cause. If dumps aren't sufficient, reproduce the issue in an environment where admin privileges are available to collect richer data (traces).
Modern .NET on Linux (pre-.NET 10): Recommend two dump captures (invoke the dump-collect skill) for heap diff, plus dotnet-trace while memory is growing (for allocation tracking). No trigger is needed; capture during the growth period. Both together give the best picture.
Modern .NET 10+ on Linux with root: Recommend two dump captures (invoke the dump-collect skill) for heap diff, plus dotnet-trace collect-linux while memory is growing (richer data including native stacks). No trigger is needed.
.NET Framework: Recommend two dumps plus a PerfView trace while memory is growing to see what is being allocated. The dump-collect skill does not support .NET Framework, so provide dump commands directly (e.g., procdump -ma <PID> or right-click → Create Dump File in Task Manager). No trigger is needed; just capture the trace during the growth period. Do not wait for an OutOfMemoryException.
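
The idea behind diffing two dumps can be illustrated with a short Python sketch. The per-type instance counts below are invented; in practice they would come from a heap analysis tool, and this is not how PerfView implements its diff.

```python
from collections import Counter


def heap_growth(early: dict, late: dict, top: int = 3):
    """Return the types whose instance count grew most between two snapshots."""
    delta = Counter(late)
    delta.subtract(early)  # late - early; negative values are kept
    grown = [(name, n) for name, n in delta.items() if n > 0]
    return sorted(grown, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical per-type instance counts taken from two dumps.
early = {"System.String": 10_000, "MyApp.Order": 500, "System.Byte[]": 2_000}
late = {"System.String": 10_400, "MyApp.Order": 90_500, "System.Byte[]": 1_900}

for type_name, growth in heap_growth(early, late):
    print(f"{type_name}: +{growth}")
```

Types that shrink or stay flat drop out, which is why two snapshots taken well apart tend to make the leaking type stand out.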

Excessive GC

Excessive GC requires a trace to analyze GC events, pause times, and allocation patterns — a dump is not sufficient.

Windows (PerfView): Use PerfView collect /GCCollectOnly to capture GC events.
Linux (dotnet-trace): Use dotnet-trace collect -p <PID> --profile gc-verbose.
Linux .NET 10+ with root: Use dotnet-trace collect-linux --profile gc-verbose for richer data with native stacks.
Containers: dotnet-monitor can capture GC traces via its REST API (/trace?profile=gc-verbose).
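
For the dotnet-monitor case, the request URL can be assembled as sketched below. The helper name `trace_url` is hypothetical, and the base address `http://localhost:52323` is an assumption about how the sidecar is exposed. The `/trace` path and the `profile=gc-verbose` value are taken from the line above; `pid` and `durationSeconds` are further query parameters of the same endpoint.

```python
from urllib.parse import urlencode


def trace_url(base: str, **query) -> str:
    """Build a dotnet-monitor /trace URL, skipping parameters set to None."""
    params = {k: v for k, v in query.items() if v is not None}
    url = base.rstrip("/") + "/trace"
    return f"{url}?{urlencode(params)}" if params else url


# Assumed sidecar address; adjust to however dotnet-monitor is exposed.
BASE = "http://localhost:52323"

print(trace_url(BASE, profile="gc-verbose"))
print(trace_url(BASE, pid=1, durationSeconds=30))
```

The sketch only builds the URL. Sending the request still requires whatever authentication the dotnet-monitor deployment is configured with.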

Slow Requests

Slow requests require a thread time trace to see where threads are spending time — waiting on locks, I/O, external calls, etc. Use larger buffers since thread time traces generate more data. For ASP.NET Core applications, also enable Microsoft.AspNetCore.Hosting and Microsoft-AspNetCore-Server-Kestrel providers to get server-side request lifecycle timing (when requests arrive, how long they take to process).

Windows (PerfView): Use PerfView /ThreadTime collect /BufferSizeMB:1024 /CircularMB:2048. The /ThreadTime argument adds thread-level wait and block detail. For ASP.NET Core, add Kestrel providers: PerfView /ThreadTime collect /BufferSizeMB:1024 /CircularMB:2048 /Providers:*Microsoft.AspNetCore.Hosting,*Microsoft-AspNetCore-Server-Kestrel. Do not include a stop trigger by default; let the user design one based on their specific scenario.
Linux (dotnet-trace): dotnet-trace captures thread time data by default, so no special arguments are needed. Use dotnet-trace collect -p <PID>. For ASP.NET Core, add Kestrel providers: dotnet-trace collect -p <PID> --providers Microsoft.AspNetCore.Hosting,Microsoft-AspNetCore-Server-Kestrel.
Linux .NET 10+ with root: Use dotnet-trace collect-linux --profile thread-time for richer data with native stacks. For ASP.NET Core, add: --providers Microsoft.AspNetCore.Hosting,Microsoft-AspNetCore-Server-Kestrel.
Containers: dotnet-monitor can capture traces via its REST API (/trace?pid=<PID>&durationSeconds=30).
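
The PerfView command line above can also be assembled programmatically. The helper below is a hypothetical sketch that only builds the argument list and does not launch PerfView; the flags and provider names are the ones quoted in this section.

```python
def perfview_thread_time_args(providers=(), buffer_mb=1024, circular_mb=2048):
    """Build (but do not run) a PerfView thread-time collection command."""
    args = ["PerfView", "/ThreadTime", "collect",
            f"/BufferSizeMB:{buffer_mb}", f"/CircularMB:{circular_mb}"]
    if providers:
        # EventSource providers are passed by name with a leading '*'.
        args.append("/Providers:" + ",".join("*" + p for p in providers))
    return args


kestrel = ["Microsoft.AspNetCore.Hosting", "Microsoft-AspNetCore-Server-Kestrel"]
print(" ".join(perfview_thread_time_args(kestrel)))
```

Keeping the arguments as a list makes it straightforward to pass them to a process launcher without shell quoting problems.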

Hangs

Start with a trace to understand what threads are doing. Use the appropriate trace tool for the environment (PerfView with /ThreadTime on Windows, dotnet-trace on Linux, dotnet-trace collect-linux --profile thread-time on .NET 10+ Linux with root). The trace can reveal:
- Livelocks (threads spinning without forward progress) — threads appear busy but the application makes no progress.
- Thread starvation — the ThreadPool is exhausted and queued work items are not being processed. This can look like a deadlock but has a different root cause.
- Whether there is any forward progress at all — if some threads are making progress, the issue may be a bottleneck rather than a true hang.
If the trace does not explain the hang, the issue may be a true deadlock (threads waiting on each other in a cycle). In this case, invoke the dump-collect skill to collect a process dump; do not provide dump commands directly.
Analyze the dump with a debugger to inspect thread stacks and identify the lock cycle:
- Windows: Visual Studio or WinDbg with the SOS debugger extension.
- Linux: lldb with the SOS debugger extension.

Networking Issues

Networking issues (HTTP 5xx errors from downstream services, request timeouts, connection failures, DNS resolution failures, TLS handshake failures, connection pool exhaustion) require both a thread-time trace and networking event providers. The thread-time trace shows where threads are blocked (slow downstream calls, thread starvation), while the networking events show the request lifecycle — which requests failed, what status codes came back, how long DNS resolution and TLS handshakes took, and how long requests waited for a connection from the pool.

For .NET Framework, PerfView /ThreadTime already collects the relevant networking events (from the System.Net ETW provider), so no additional providers are needed.

For modern .NET, you must explicitly enable the System.Net.* EventSource providers:

Provider	What it covers
`System.Net.Http`	HttpClient/SocketsHttpHandler — request lifecycle, HTTP status codes, connection pool
`System.Net.NameResolution`	DNS lookups (start/stop, duration)
`System.Net.Security`	TLS/SSL handshakes (SslStream)
`System.Net.Sockets`	Low-level socket connect/disconnect

Key events from System.Net.Http: RequestStart (scheme, host, port, path), RequestStop (statusCode — -1 if no response was received), RequestFailed (exception message for timeouts, connection refused, etc.), RequestLeftQueue (time waiting for a connection from the pool — indicates connection pool exhaustion), ConnectionEstablished, ConnectionClosed.
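
As an illustration of how these events map to failure signals, the Python sketch below counts them in a list of event records. The record layout (dicts with `name` and `statusCode` keys) and the sample data are assumptions made for the example; real traces are inspected in an analysis tool such as PerfView.

```python
def summarize_http_events(events):
    """Count the failure signals described above in a list of event dicts."""
    summary = {"no_response": 0, "failed": 0, "queued": 0}
    for event in events:
        name = event["name"]
        if name == "RequestStop" and event.get("statusCode") == -1:
            summary["no_response"] += 1  # no response was received
        elif name == "RequestFailed":
            summary["failed"] += 1       # timeout, connection refused, etc.
        elif name == "RequestLeftQueue":
            summary["queued"] += 1       # waited for a pooled connection
    return summary


# Hypothetical event records for illustration only.
sample = [
    {"name": "RequestStart", "host": "api.example.test"},
    {"name": "RequestLeftQueue"},
    {"name": "RequestStop", "statusCode": 200},
    {"name": "RequestFailed"},
    {"name": "RequestStop", "statusCode": -1},
]
print(summarize_http_events(sample))
```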

Collect a thread-time trace with networking providers enabled (modern .NET only — .NET Framework needs only PerfView /ThreadTime):

Windows (PerfView): Use PerfView /ThreadTime collect /BufferSizeMB:1024 /CircularMB:2048 /Providers:*System.Net.Http,*System.Net.NameResolution,*System.Net.Security,*System.Net.Sockets. For .NET Framework, omit the /Providers flag because /ThreadTime already includes the networking events. The thread-time trace shows where threads are blocked while the networking events show what requests are failing and why.
Linux (dotnet-trace): dotnet-trace captures thread time data by default, but specifying --providers overrides the defaults so you must also include --profile: dotnet-trace collect -p <PID> --profile dotnet-common,dotnet-sampled-thread-time --providers System.Net.Http,System.Net.NameResolution,System.Net.Security,System.Net.Sockets.
Linux .NET 10+ with root: Use dotnet-trace collect-linux --profile dotnet-common,cpu-sampling,thread-time --providers System.Net.Http,System.Net.NameResolution,System.Net.Security,System.Net.Sockets.
Containers: dotnet-monitor can capture traces with custom providers via its REST API.
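
The note about --providers overriding the default profiles can be made concrete with a small command builder. This is a hypothetical Python helper that only assembles the argument list; the PID is a placeholder, and the flag names and values are the ones quoted in this section.

```python
NET_PROVIDERS = ["System.Net.Http", "System.Net.NameResolution",
                 "System.Net.Security", "System.Net.Sockets"]


def dotnet_trace_args(pid, providers=(), profiles=()):
    """Build (but do not run) a `dotnet-trace collect` argument list."""
    args = ["dotnet-trace", "collect", "-p", str(pid)]
    if profiles:
        args += ["--profile", ",".join(profiles)]
    if providers:
        args += ["--providers", ",".join(providers)]
    return args


cmd = dotnet_trace_args(
    1234,  # placeholder PID; verify the real one with `dotnet-trace ps`
    providers=NET_PROVIDERS,
    profiles=["dotnet-common", "dotnet-sampled-thread-time"],
)
print(" ".join(cmd))
```

Passing the profiles explicitly whenever providers are given keeps the thread-time data in the trace alongside the networking events.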

Assembly Loading Issues

For modern .NET, assembly loading issues (FileNotFoundException, FileLoadException, ReflectionTypeLoadException, version conflicts, duplicate assembly loads across AssemblyLoadContexts) require collecting assembly loader binder events from the Microsoft-Windows-DotNETRuntime provider with the Loader keyword (0x4). These events trace every step of the runtime's assembly resolution algorithm — which paths were probed, which AssemblyLoadContext handled the load, whether the load succeeded or failed, and why. For .NET Framework, the same provider and keyword work for ETW-based collection; additionally, the Fusion Log Viewer (fuslogvw.exe) can diagnose assembly binding failures without requiring a trace.

The provider specification is Microsoft-Windows-DotNETRuntime:0x4:4 (provider name, AssemblyLoader keyword, Informational verbosity).
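
The three parts of that specification can be pulled apart as shown in this Python sketch. The function `parse_provider_spec` is hypothetical and handles only the three-part `Name:Keywords:Level` form used here.

```python
def parse_provider_spec(spec: str):
    """Split 'Name:Keywords:Level' into (name, keyword mask, level)."""
    name, keywords, level = spec.split(":")
    return name, int(keywords, 16), int(level)


name, keywords, level = parse_provider_spec("Microsoft-Windows-DotNETRuntime:0x4:4")
print(name, hex(keywords), level)
print(bool(keywords & 0x4))  # True: the Loader keyword bit is set
```

Keywords are a bitmask, so several keywords can be combined by OR-ing their values into a single hexadecimal number.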

Windows (PerfView): A default PerfView trace already includes binder events; simply run PerfView collect with no extra providers. For a smaller trace file, use PerfView collect /ClrEvents:Default-Profile, which removes the most verbose default events while keeping the events necessary for diagnosing assembly loading issues.
Linux / cross-platform (dotnet-trace): Use dotnet-trace collect --clrevents assemblyloader -- <path-to-built-exe> to launch and trace the process, or dotnet-trace collect --clrevents assemblyloader -p <PID> to attach to a running process.
Linux .NET 10+ with root: Use dotnet-trace collect-linux --clrevents assemblyloader.
Containers: dotnet-monitor can capture traces with the loader provider via its REST API.

For short-lived processes that fail on startup (common with assembly loading issues), prefer the dotnet-trace launch form (-- <path-to-built-exe>) over attaching by PID, since the process may exit before you can attach.

Explain the trade-offs when recommending a tool. For example:

PerfView gives richer data but needs admin; runs on Windows including Windows containers.
dotnet-trace works cross-platform without admin but captures less system-level detail.
perfcollect captures native call stacks but needs admin/root.
dotnet-monitor is the best option for containers/K8s when console access is not available, but requires sidecar deployment and authentication configuration.

Step 3: Guide data collection

Provide the specific commands for the recommended tool. Load the appropriate reference file from the tool reference lookup table for detailed command-line examples.

Key guidance to include:

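Installation: How to install the tool if it is not already available (e.g. dotnet tool install -g dotnet-trace). When recommending multiple tools, provide installation and usage instructions for each one; do not mention a tool without showing how to install and use it.
PID discovery (required before any -p <PID> command): Verify the target process first (for example: dotnet-trace ps, curl <monitor-endpoint>/processes, or ps inside a container). If the app is expected to be PID 1 in a container, still verify before collecting.
Collection command: The exact command to run, including relevant providers, output format, and duration.
Container considerations:
- Collecting from inside the container: ensure the tool is installed in the image or use kubectl cp to copy it in.
- Collecting from outside the container: use dotnet-monitor as a sidecar with a shared diagnostic port (Unix domain socket in /tmp).
- Kubernetes: run dotnet-monitor as a sidecar container, or use kubectl debug to create an ephemeral debug container.
Long-running repros (Windows/PerfView): Show how to use the trigger arguments and circular buffer settings.
Output location: Where the collected file will be saved and how to copy it off the target machine for analysis.
Artifact handoff checklist: When handing a trace to someone else for analysis, include the runtime version, OS/kernel, container image tag or build SHA, PID/process name, UTC collection start/end timestamps, the exact command used, and the final artifact path.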
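
When a trace is handed to someone else for analysis, it helps to record the collection context next to the file. The Python sketch below shows one possible shape for such a record. The class name, field names, and all sample values are hypothetical placeholders, not a format defined by the skill.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class TraceHandoff:
    runtime_version: str
    os_kernel: str
    image_tag: str      # container image tag or build SHA
    pid: int
    process_name: str
    started_utc: str
    ended_utc: str
    command: str        # the exact collection command that was run
    artifact_path: str


def utc_now() -> str:
    """Current time as an ISO 8601 UTC timestamp."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


# Placeholder values for illustration only.
manifest = TraceHandoff(
    runtime_version="example-runtime-version",
    os_kernel="example-os-and-kernel",
    image_tag="myapp:example",
    pid=1234,
    process_name="MyApp",
    started_utc=utc_now(),
    ended_utc=utc_now(),
    command="dotnet-trace collect -p 1234",
    artifact_path="/tmp/trace.nettrace",
)
print(json.dumps(asdict(manifest), indent=2))
```

Recording timestamps in UTC avoids ambiguity when the target machine and the analyst are in different time zones.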

Step 4: Recommend analysis approach

After data is collected, recommend the appropriate tool for analysis. Do not perform the analysis — just point the developer to the right tool and documentation.

Collected Data	Analysis Tool	Notes
`.nettrace` file	PerfView (Windows), Speedscope (web)	PerfView gives the richest view on Windows
`.etl` / `.etl.zip` file	PerfView	ETW traces from PerfView or perfcollect
`perf.data.nl` from perfcollect	PerfView (Windows)	Copy the file to a Windows machine and open with PerfView

Validation

The recommended tool is compatible with the developer's runtime, OS, and deployment topology
The collection command runs without errors
The output file is generated in the expected location
The developer knows which analysis tool to use for the collected data

Common Pitfalls

Pitfall	Solution
Using `dotnet-trace` on .NET Framework	`dotnet-trace` only works with modern .NET (.NET Core 3.0+). Use PerfView for .NET Framework.
PerfView without admin privileges	PerfView requires admin for ETW tracing. Fall back to `dotnet-trace` if admin is not available.
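`perfcollect` in container without `SYS_ADMIN`	Containers drop `SYS_ADMIN` by default. Run with `--privileged` or add the `SYS_ADMIN` capability, or fall back to `dotnet-trace`.
Huge trace files from long repros	On Windows, use a PerfView `/StopOn` trigger that fires on the symptom you want to capture (e.g., `/StopOnPerfCounter`, `/StopOnGCEvent`, `/StopOnException`) with `/CircularMB` and `/BufferSizeMB`. Never trigger on recovery: the circular buffer continuously overwrites old data, so the interesting behavior may already be lost by the time collection stops.
Diagnostic port not reachable in a container	Mount `/tmp` as a shared volume between the app container and the `dotnet-monitor` sidecar for the diagnostic Unix domain socket.
Forgetting to install tools in the container image	Add `dotnet tool install` to your Dockerfile, or use `dotnet-monitor` as a sidecar to avoid modifying the app image.
Exposing `dotnet-monitor` with `--no-auth` in production	Keep authentication enabled, bind to localhost, and use `kubectl port-forward` for access. Use `--no-auth` only for short-lived, isolated debugging.
Collecting only a CPU/thread-time trace for networking issues	CPU and thread-time traces alone do not show HTTP status codes, DNS timing, or connection pool behavior. Add the networking providers (`System.Net.Http`, `System.Net.NameResolution`, `System.Net.Security`, `System.Net.Sockets`) alongside the thread-time trace.
Enabling all networking providers when only one is needed	Each networking provider adds overhead. If the problem is clearly HTTP-level (5xx status codes), `System.Net.Http` alone may be enough. Add the DNS, TLS, and socket providers when the root cause is unclear.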
