npx skills add https://github.com/jezweb/claude-skills --skill ux-audit
Dogfood web apps by browsing them as a real user would — with their goals, their patience, and their context. Goes beyond "does it work?" to "is it good?" by tracking emotional friction (trust, anxiety, confusion), counting click efficiency, testing resilience, and asking the ultimate question: "would I come back?" Uses Chrome MCP (for authenticated apps with your session) or Playwright for browser automation. Produces structured audit reports with findings ranked by impact.
Before starting any mode, detect available browser tools:
Chrome MCP (`mcp__claude-in-chrome__*`) — preferred for authenticated apps. Uses the user's logged-in Chrome session, so OAuth/cookies just work.
Playwright (`mcp__plugin_playwright_playwright__*`) — for public apps or parallel sessions.
If none are available, inform the user and suggest installing Chrome MCP or Playwright.
See references/browser-tools.md for tool-specific commands.
If the user didn't provide a URL, find one automatically. Prefer the deployed/live version — that's what real users see.
Check wrangler.jsonc for custom domains or routes:
grep -E '"pattern"|"custom_domain"' wrangler.jsonc 2>/dev/null
If found, use the production URL (e.g. https://app.example.com).
Check for deployed URL in CLAUDE.md, README, or package.json homepage field.
Fall back to local dev server — check if one is already running:
lsof -i :5173 -i :3000 -i :8787 -t 2>/dev/null
If running, use http://localhost:{port}.
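The fallback order above can be sketched as a small helper (the function and argument names are illustrative, not part of the skill):

```javascript
// Resolve the audit URL using the discovery order described above.
// Each candidate comes from one discovery step; the first truthy one wins.
function pickAuditUrl({ productionUrl, documentedUrl, localPort }) {
  if (productionUrl) return productionUrl;   // from wrangler.jsonc routes/custom domains
  if (documentedUrl) return documentedUrl;   // from CLAUDE.md, README, or package.json homepage
  if (localPort) return `http://localhost:${localPort}`; // from the lsof probe
  return null; // nothing found: ask the user for a URL
}
```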
Why live over local: The live site has real data, real auth, real network latency, real CDN behaviour, and real CORS/CSP policies. Testing locally misses deployment-specific issues (missing env vars, broken asset paths, CORS errors, slow API responses). The UX audit should test what the user actually experiences.
When local is better: The user explicitly says "test localhost", or the feature isn't deployed yet.
Control how thorough the audit is. Pass as an argument: /ux-audit quick, /ux-audit thorough, or default to standard.
| Depth | Duration | Autonomy | What it covers |
|---|---|---|---|
| quick | 5-10 min | Interactive | One user flow, happy path only. Spot check after a change. |
| standard | 20-40 min | Semi-autonomous | Full walkthrough + QA sweep of main pages. Default. |
| thorough | 1-3 hours | Fully autonomous | Multiple personas, all pages, all modes combined. Overnight mode. |
| exhaustive | 4-8+ hours | Fully autonomous | Every interactive element on every page. Every button clicked, every dialog opened, every form filled, every state triggered. Leave nothing untested. |
The exhaustive mode goes beyond thorough. Thorough tests workflows and pages. Exhaustive tests every single interactive element in the application.
For each page discovered:
Progress tracking: This mode generates a LOT of findings. Write findings to the report incrementally — don't hold everything in memory. Update docs/ux-audit-exhaustive-YYYY-MM-DD.md after each page is complete.
Element inventory format (per page):
/clients — 47 interactive elements
[x] "Add Client" button — opens modal ✓, form submits ✓, validation ✓
[x] Search input — filters correctly ✓, clear button works ✓, empty search ✓
[x] Sort dropdown — all 4 options work ✓, persists on navigation ✗ (BUG)
[x] Client row click — navigates to detail ✓
[x] Star button — toggles ✓, persists on refresh ✓
[ ] Pagination — next ✓, prev ✓, page numbers ✓, items per page ✗ (not tested - no data)
...
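A sketch of how such inventory lines could be generated, assuming a simple element-descriptor shape (a hypothetical helper, not part of the skill):

```javascript
// Turn one tested element into an inventory checklist line
// in the format shown above.
function inventoryLine({ name, checks, tested = true }) {
  const box = tested ? '[x]' : '[ ]';
  const results = checks
    .map(c => `${c.label} ${c.pass ? '✓' : '✗' + (c.note ? ` (${c.note})` : '')}`)
    .join(', ');
  return `${box} ${name} — ${results}`;
}
```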
The thorough mode is designed to run unattended. Kick it off at end of day, review the report in the morning. The user should NOT need to find issues themselves — this mode catches everything.
Mindset: Don't mechanically run through a checklist. Think about the real person who will use this app every day. What does their workday look like? How will they move through the system? Will they understand what they're looking at? Will the app teach them how to use it through its design, or will they be guessing? Read references/workflow-comprehension.md before starting.
On each page, inject JavaScript via the browser tool to programmatically detect layout issues:
// Detect elements overflowing their parent
document.querySelectorAll('*').forEach(el => {
const r = el.getBoundingClientRect();
const p = el.parentElement?.getBoundingClientRect();
if (p && (r.left < p.left - 1 || r.right > p.right + 1)) {
console.warn('OVERFLOW:', el.tagName, el.className, 'extends beyond parent');
}
});
// Detect text clipped by containers
document.querySelectorAll('h1,h2,h3,h4,p,span,a,button,label').forEach(el => {
if (el.scrollWidth > el.clientWidth + 2 || el.scrollHeight > el.clientHeight + 2) {
console.warn('CLIPPED:', el.tagName, el.textContent?.slice(0,50));
}
});
// Detect elements with zero or negative visibility
document.querySelectorAll('*').forEach(el => {
const s = getComputedStyle(el);
const r = el.getBoundingClientRect();
if (r.width > 0 && r.height > 0 && r.left + r.width < 0) {
console.warn('OFF-SCREEN LEFT:', el.tagName, el.className);
}
});
// Detect low contrast text (rough check)
document.querySelectorAll('h1,h2,h3,p,span,a,li,td,th,label,button').forEach(el => {
const s = getComputedStyle(el);
if (s.color === s.backgroundColor || s.opacity === '0') {
console.warn('INVISIBLE TEXT:', el.tagName, el.textContent?.slice(0,30));
}
});
Read console output after injection. Each warning is a potential finding to screenshot and investigate.
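One way to run those snippets and harvest their warnings, assuming Playwright's `page.on('console')` and `page.evaluate` APIs (a sketch; `collectLayoutWarnings` is a hypothetical name):

```javascript
// Prefixes emitted by the detection snippets above.
const LAYOUT_PREFIXES = ['OVERFLOW:', 'CLIPPED:', 'OFF-SCREEN LEFT:', 'INVISIBLE TEXT:'];

function isLayoutWarning(text) {
  return LAYOUT_PREFIXES.some(p => text.startsWith(p));
}

// detectionScript is the detection JS shown above, passed in as a string.
async function collectLayoutWarnings(page, detectionScript) {
  const warnings = [];
  const onConsole = msg => {
    if (msg.type() === 'warning' && isLayoutWarning(msg.text())) warnings.push(msg.text());
  };
  page.on('console', onConsole);
  await page.evaluate(detectionScript);
  page.off('console', onConsole);
  return warnings;
}
```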
For each page, resize the viewport through standard breakpoints and screenshot:
| Width | What it represents | Check for |
|---|---|---|
| 1280px | Desktop (standard) | Baseline layout, sidebar + content |
| 1024px | Small desktop / tablet landscape | Nav collapse point, grid reflow |
| 768px | Tablet portrait | Sidebar behaviour, stacked layout |
| 375px | Mobile | Everything stacked, touch targets, no horizontal scroll |
If the layout changes between breakpoints (sidebar collapses, grid reduces columns), screenshot the transition point too.
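The breakpoint sweep could be scripted roughly like this with Playwright's `setViewportSize` and `screenshot` (the viewport height and file-naming scheme are assumptions):

```javascript
// Standard breakpoints from the table above.
const BREAKPOINTS = [
  { width: 1280, label: 'desktop' },
  { width: 1024, label: 'small-desktop' },
  { width: 768,  label: 'tablet' },
  { width: 375,  label: 'mobile' },
];

// Resize through each breakpoint and capture a full-page screenshot.
async function screenshotBreakpoints(page, slug) {
  for (const { width, label } of BREAKPOINTS) {
    await page.setViewportSize({ width, height: 900 }); // height is arbitrary; the page scrolls
    await page.screenshot({
      path: `.jez/screenshots/ux-audit/${slug}-${width}-${label}.png`,
      fullPage: true,
    });
  }
}
```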
On each page, read the browser console for:
Critical: Visual browsing misses API failures that the UI hides. Data-fetching libraries (TanStack Query, SWR) swallow HTTP errors and show empty/loading states instead of error messages. A component showing "No results found" might actually be getting a 403 — but the UI looks normal.
Monitor network responses throughout the entire audit session. If using Playwright, attach a response listener before browsing starts:
// Collect all non-2xx responses for the whole session (Playwright's page.on('response')).
const networkErrors = [];
page.on('response', res => {
  if (res.status() < 200 || res.status() >= 300) {
    networkErrors.push({ url: res.url(), status: res.status(), method: res.request().method(), page: page.url() });
  }
});
If using Chrome MCP, use read_network_requests to check for failed API calls after each page visit.
What to collect: URL, HTTP status, method, and the page you were on when it happened.
Severity mapping:
| Status | Severity | What it usually means |
|---|---|---|
| 500+ | Critical | Server error — something is broken |
| 403 | High | Permission error OR route collision (static route shadowed by /:param) |
| 404 | Medium | Missing endpoint — may be a renamed/removed API route |
| 401 | Low | Expected for unauthenticated probes, but flag if it happens on authenticated pages |
| CORS error | High | API endpoint missing CORS headers — feature broken in production |
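The mapping above, expressed as a helper (hypothetical; non-2xx codes outside the table are left unmapped):

```javascript
// Map a failed response to the severity table above.
function severityFor(status, isCorsError = false) {
  if (isCorsError) return 'High';        // missing CORS headers: broken in production
  if (status >= 500) return 'Critical';  // server error
  if (status === 403) return 'High';     // permission error or route collision
  if (status === 404) return 'Medium';   // missing endpoint
  if (status === 401) return 'Low';      // escalate manually if seen on authenticated pages
  return null;                           // other non-2xx codes: judge case by case
}
```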
What this catches that visual browsing misses:
- Route collisions (`GET /api/boards/users` shadowed by `GET /api/boards/:boardId`)

Report format: Group by status code in a "Network Errors" section:
## Network Errors (detected during browsing)
### 403 Forbidden (2 endpoints)
- `GET /api/boards/users` on /app/boards/123 — likely route collision with /:boardId
- `POST /api/settings/theme` on /app/settings — permission check failing
### 500 Internal Server Error (1 endpoint)
- `GET /api/reports/summary` on /app/dashboard — server error
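A sketch of the grouping step that produces that section (hypothetical helper; the status-name map covers only the codes in the severity table):

```javascript
const STATUS_NAMES = { 403: 'Forbidden', 404: 'Not Found', 500: 'Internal Server Error' };

// Group collected errors by status code into the markdown section shown above.
function networkErrorsSection(errors) {
  const byStatus = new Map();
  for (const e of errors) {
    if (!byStatus.has(e.status)) byStatus.set(e.status, []);
    byStatus.get(e.status).push(e);
  }
  const lines = ['## Network Errors (detected during browsing)'];
  for (const [status, group] of [...byStatus.entries()].sort((a, b) => a[0] - b[0])) {
    const name = STATUS_NAMES[status] ? ` ${STATUS_NAMES[status]}` : '';
    lines.push(`### ${status}${name} (${group.length} endpoint${group.length === 1 ? '' : 's'})`);
    for (const e of group) lines.push(`- \`${e.method} ${e.url}\` on ${e.page} — ${e.note}`);
  }
  return lines.join('\n');
}
```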
| Action | quick | standard | thorough |
|---|---|---|---|
| Navigate pages | Just do it | Just do it | Just do it |
| Take screenshots | Key moments | Friction points | Every page + every issue |
| Fill forms with test data | Ask first | Ask first | Just do it (obviously fake data) |
| Click delete/destructive | Ask first | Ask first | Ask first (only exception) |
| Submit forms | Ask first | Brief confirmation | Just do it (test data only) |
| Write report file | Just do it | Brief confirmation | Just do it |
When: "ux walkthrough", "walk through the app", "is the app intuitive?", "ux audit", "dogfood this"
This is the highest-value mode. You are dogfooding the app — using it as a real user would, with their goals, their constraints, and their patience level. Not a mechanical checklist pass, but genuinely trying to get a job done.
Ask the user two questions:
If the user doesn't specify a persona, adopt a reasonable default: a non-technical person who is time-poor, mildly distracted, and using this app for the first time today.
Navigate to the app's entry point. From here, attempt the task with no prior knowledge of the UI. Adopt the persona's mindset:
At each screen, evaluate against the walkthrough checklist (see references/walkthrough-checklist.md). Key questions to hold in mind:
- Layout: Is all text fully visible? Nothing clipped or overlapping? Spacing consistent?
- Comprehension: Do I understand what this page is for and what I can do here? Do the labels make sense to a non-developer?
- Wayfinding: Do I know where I am in the app? Can I get back to where I came from? Does the nav show my location?
- Flow: Does this page connect naturally to the last one? Is the next step obvious without thinking?
- Trust: Do I feel confident this will do what I expect? Am I afraid I'll break something?
- Efficiency: How many clicks/steps is this taking? Is there a shorter path?
- Recovery: If I make a mistake right now, can I get back?
Track the effort required to complete the task:
After completing the main task, test what happens when things go wrong:
After completing (or failing) the task, reflect as the persona:
Take screenshots at friction points. Compile findings into a UX audit report. Write the report to docs/ux-audit-YYYY-MM-DD.md using the template from references/report-template.md.
Severity levels:
When: "qa test", "test all pages", "check everything works", "qa sweep"
Systematic mechanical testing of all pages and features.
Discover pages: Read the app's router config, sitemap, or manually navigate the sidebar/menu to find all routes.
Create a task list of areas to test (group by feature area)
For each page/feature:
- Page renders without errors
- Data displays correctly (tables, lists, details)
- Forms submit successfully (create)
- Records can be edited (update)
- Delete works with confirmation
- Validation triggers on bad input
- Empty states display correctly
- Error states are handled
Cross-cutting concerns:
- Dark mode: all elements visible, no contrast issues
- Mobile viewport (375px): layout not broken, touch targets large enough
- Search and filters: return correct results
- Notifications: appear and can be dismissed
Produce a QA sweep summary table:
| Page | Status | Issues |
|---|---|---|
| /patients | Pass | — |
| /patients/new | Fail | Form validation missing on email |
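The route-discovery step could be approximated for React Router-style configs with a rough regex (a sketch; real configs vary, so manually navigating the sidebar/menu remains the fallback):

```javascript
// Pull path strings out of router source text.
// Matches path: '/x', path="/x", and path: `/x`.
function extractRoutes(routerSource) {
  const paths = new Set();
  const re = /path\s*[:=]\s*["'`]([^"'`]+)["'`]/g;
  let m;
  while ((m = re.exec(routerSource)) !== null) paths.add(m[1]);
  return [...paths];
}
```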
Results go to docs/qa-sweep-YYYY-MM-DD.md.

When: "check [feature]", "test [page]", "verify [component] works"
Focused testing of a specific area.
| Scenario | Mode | Depth |
|---|---|---|
| Just changed a page, quick sanity check | Targeted Check | quick |
| After building a feature, before showing users | UX Walkthrough | standard |
| Before a release, verify nothing is broken | QA Sweep | standard |
| Quick check on a specific page after changes | Targeted Check | quick |
| Periodic UX health check | UX Walkthrough | standard |
| Client demo prep | QA Sweep + UX Walkthrough | standard |
| End-of-day comprehensive test, review in morning | All modes combined | thorough |
| Pre-launch full audit with evidence | All modes combined | thorough |
| Test literally everything before a client demo | Every element on every page | exhaustive |
| Weekend-long complete app certification | Every element, state, viewport, mode | exhaustive |
Skip this skill for : API-only services, CLI tools, unit/integration tests (use test frameworks), performance testing.
Default rules (standard depth). See "Autonomy by Depth" table above for quick/thorough overrides.
| When | Read |
|---|---|
| Before starting thorough mode — understand the user's world | references/workflow-comprehension.md |
| Evaluating each screen during walkthrough | references/walkthrough-checklist.md |
| Running the six scenario tests | references/scenario-tests.md |
| Writing the audit report | references/report-template.md |
| Browser tool commands and selection | references/browser-tools.md |
Weekly Installs: 653
GitHub Stars: 643
First Seen: Feb 18, 2026
Security Audits
Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Warn
Installed on
opencode: 601
gemini-cli: 596
codex: 596
github-copilot: 594
cursor: 584
kimi-cli: 579
Thorough mode output: screenshots in .jez/screenshots/ux-audit/ (numbered chronologically) and a report at docs/ux-audit-thorough-YYYY-MM-DD.md with issue counts by severity.