Important prerequisite
Installing AI Skills requires unrestricted network access to GitHub. If you are behind a restrictive firewall, connect through a proxy with TUN mode enabled before installing; without it, the installation will not complete.
software-ux-research by vasilyu1983/ai-agents-public
npx skills add https://github.com/vasilyu1983/ai-agents-public --skill software-ux-research
Use this skill to identify problems/opportunities and de-risk decisions. Use software-ui-ux-design to implement UI patterns, component changes, and design system updates.
If inputs are missing, ask for:
Default outputs (pick what the user asked for):
Every research output — plans, protocols, evaluations, reports — must include these sections. They represent the skill's core value beyond standard UX knowledge: governance, confidence calibration, and ethical research practice.
Method Justification: Name the chosen method AND explain why alternatives were rejected. Do not just describe the method; explain why it was selected over at least 2 alternatives given the specific context (stage, timeline, sample, question type).
Confidence & Triangulation Assessment: Tag every recommendation or finding with a confidence level:
| Confidence | Evidence requirement | Use for |
|---|---|---|
| High | Multiple methods or sources agree | High-impact decisions |
| Medium | Strong signal from one method + supporting indicators | Prioritization |
| Low | Single source / small sample | Exploratory hypotheses only |
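As a minimal sketch of how an agent might apply this rubric mechanically (the threshold and field names here are assumptions, not part of the skill):

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    """A research finding to be tagged per the rubric above."""
    statement: str
    methods: list[str] = field(default_factory=list)  # methods/sources that support it
    sample_size: int = 0

def confidence(f: Finding, small_sample: int = 5) -> str:
    """Map evidence to High/Medium/Low following the table."""
    if len(f.methods) >= 2:          # multiple methods or sources agree
        return "High"
    if len(f.methods) == 1 and f.sample_size > small_sample:
        return "Medium"              # strong single-method signal
    return "Low"                     # single source / small sample

# Example: a moderated test and session replay both surface the same issue
print(confidence(Finding("Users miss the export button",
                         ["usability test", "session replay"], sample_size=8)))  # High
```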
Consent & Data Handling: Include a PII/consent section in every plan or protocol. Research that involves participants requires explicit attention to:
Decision Framework: For evaluations and analysis outputs, provide a structured decision table with options, confidence levels, timelines, and risks — not just a single recommendation.
Pre-Decision Checklist: For experiment evaluations (A/B tests, etc.), include a verification checklist of confounds and data quality checks to complete before any ship/kill decision.
What do you need?
├─ WHY / needs / context → interviews, contextual inquiry, diary
├─ HOW / usability → moderated usability test, cognitive walkthrough, heuristic eval
├─ WHAT / scale → analytics/logs + targeted qual follow-ups
└─ WHICH / causal → experiments (if feasible) or preference tests
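The tree above reads as a routing table; a throwaway sketch (method names taken verbatim from the tree):

```python
# Routing sketch of the decision tree above; labels are the tree's own.
METHOD_TREE = {
    "WHY":   ["interviews", "contextual inquiry", "diary study"],
    "HOW":   ["moderated usability test", "cognitive walkthrough", "heuristic eval"],
    "WHAT":  ["analytics/logs", "targeted qual follow-ups"],
    "WHICH": ["experiments (if feasible)", "preference tests"],
}

def suggest_methods(question_type: str) -> list[str]:
    """Return candidate methods for a WHY/HOW/WHAT/WHICH question."""
    return METHOD_TREE.get(question_type.upper(), [])

print(suggest_methods("why"))  # ['interviews', 'contextual inquiry', 'diary study']
```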
When selecting a method, always justify the choice by explaining why 2+ alternatives were rejected given the user's specific context. This is a key differentiator — generic "we'll do interviews" without justification is insufficient.
| Stage | Decisions | Primary Methods | Secondary Methods | Output |
|---|---|---|---|---|
| Discovery | What to build and for whom | Interviews, field/diary, journey mapping | Competitive analysis, feedback mining | Opportunity brief + JTBD + Forces of Progress |
| Concept/MVP | Does the concept work? | Concept test, prototype usability | First-click/tree test | MVP scope + onboarding plan |
| Launch | Is it usable + accessible? | Usability testing, accessibility review | Heuristic eval, session replay | Launch blockers + fixes |
| Growth | What drives adoption/value? | Segmented analytics + qual follow-ups | Churn interviews, surveys | Retention drivers + friction |
| Maturity | What to optimize/deprecate? | Experiments, longitudinal tracking | Unmoderated tests | Incremental roadmap |
Discovery research should produce more than job statements. Include:
| Indicator | Example | Research Implication |
|---|---|---|
| Multi-step workflows | Draft → approve → publish | Task analysis + state mapping |
| Multi-role permissions | Admin vs editor vs viewer | Test each role + transitions |
| Data dependencies | Requires integrations/sync | Error-path + recovery testing |
| High stakes | Finance, healthcare | Safety checks + confirmations |
| Expert users | Dev tools, analytics | Recruit real experts (not proxies) |
Minimum required fields:
Use a lightweight score to avoid backlog paralysis:
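The scoring fields themselves were not preserved on this page; as one hedged illustration (a RICE-style reach × impact × confidence ÷ effort score, which is our assumption and not the skill's prescribed formula), triage can stay this fast:

```python
def backlog_score(reach: int, impact: int, confidence: float, effort_days: float) -> float:
    """RICE-style triage score: higher = research this question sooner.

    reach: users affected per quarter; impact: 1-3; confidence: 0-1;
    effort_days: person-days to answer. Field choices are illustrative.
    """
    return (reach * impact * confidence) / max(effort_days, 0.5)

# A broad, high-impact question that is cheap to answer outranks a narrow one
print(backlog_score(reach=400, impact=3, confidence=0.8, effort_days=2))  # 480.0
print(backlog_score(reach=50,  impact=2, confidence=0.5, effort_days=5))  # 10.0
```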
Follow applicable privacy laws; GDPR is a primary reference for EU processing: https://eur-lex.europa.eu/eli/reg/2016/679/oj
PII handling checklist:
Research democratization is a recurring 2026 trend: non-researchers increasingly conduct research. Enable carefully with guardrails.
| Approach | Guardrails | Risk Level |
|---|---|---|
| Templated usability tests | Script + task templates provided | Low |
| Customer interviews by PMs | Training + review required | Medium |
| Survey design by anyone | Central review + standard questions | Medium |
| Unsupervised research | Not recommended | High |
Guardrails for non-researchers:
Quick checklist for research involving users with low digital literacy or low tech confidence. Full guidance in references/non-technical-user-research.md.
| Research Activity | Proxy Metric | Calculation |
|---|---|---|
| Usability testing finding | Prevented dev rework | Hours saved × $150/hr |
| Discovery interview | Prevented build-wrong-thing | Sprint cost × risk reduction % |
| A/B test conclusive result | Improved conversion | (ΔConversion × Traffic × LTV) - Test cost |
| Heuristic evaluation | Early defect detection | Defects found × Cost-to-fix-later |
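The table's calculations translate directly into code. A small sketch using the formulas above (the $150/hr rate is the table's own example figure):

```python
def ab_test_roi(delta_conversion: float, traffic: int, ltv: float, test_cost: float) -> float:
    """(ΔConversion × Traffic × LTV) − Test cost, per the table above."""
    return delta_conversion * traffic * ltv - test_cost

def rework_prevented(hours_saved: float, hourly_rate: float = 150.0) -> float:
    """Usability-finding value: hours of dev rework saved × $150/hr."""
    return hours_saved * hourly_rate

# +0.5pp conversion on 20k users with $120 LTV against an $8k test cost
print(ab_test_roi(0.005, 20_000, 120.0, 8_000.0))  # 4000.0
print(rework_prevented(12))                        # 1800.0
```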
Rules of thumb:
| Situation | Why it fails | Better method |
|---|---|---|
| Low power/traffic | Inconclusive results | Usability tests + trends |
| Many variables change | Attribution impossible | Prototype tests → staged rollout |
| Need “why” | Experiments don’t explain | Interviews + observation |
| Ethical constraints | Harmful denial | Phased rollout + holdouts |
| Long-term effects | Short tests miss delayed impact | Longitudinal + retention analysis |
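The "low power/traffic" row is cheap to check before committing to a test. A back-of-envelope sample-size sketch using the standard two-proportion z-test approximation (a sanity check we are adding, not something the skill prescribes):

```python
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per variant for a two-proportion z-test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96
    z_b = NormalDist().inv_cdf(power)          # ~0.84
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2) + 1

# Detecting a 4% -> 5% conversion lift needs ~6,700 users per arm; at 1k
# visitors/week that is 3+ months of runtime, so usability tests win.
print(n_per_arm(0.04, 0.05))  # 6743
```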
Always check for these in experiment evaluations. List each relevant confound with its risk level and how to verify — do not just name them:
Use only when researching automation/AI-powered features. Skip for traditional software UX.
2026 benchmark: Trend reports consistently highlight AI-assisted analysis. Use AI for speed while keeping humans responsible for strategy and interpretation. Example reference: https://www.lyssna.com/blog/ux-research-trends/
| Dimension | Question | Methods |
|---|---|---|
| Mental model | What do users think the system can/can’t do? | Interviews, concept tests |
| Trust calibration | When do users over/under-rely? | Scenario tests, log review |
| Explanation usefulness | Does “why” help decisions? | A/B explanation variants, interviews |
| Failure recovery | Do users recover and finish tasks? | Failure-path usability tests |
| Failure type | Typical impact | What to measure |
|---|---|---|
| Wrong output | Rework, lost trust | Verification + override rate |
| Missing output | Manual fallback | Fallback completion rate |
| Unclear output | Confusion | Clarification requests |
| Non-recoverable failure | Blocked flow | Time-to-recovery, support contact |
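A hedged sketch of how one of these metrics (fallback completion rate) might be computed from session logs; the event schema is invented for illustration:

```python
def fallback_completion_rate(sessions: list[dict]) -> float:
    """Of sessions where the AI produced no output, what share still
    finished the task via a manual fallback? (schema is illustrative)"""
    failed = [s for s in sessions if s["ai_output"] == "missing"]
    if not failed:
        return 0.0
    return sum(s["completed_manually"] for s in failed) / len(failed)

log = [
    {"ai_output": "missing", "completed_manually": True},
    {"ai_output": "missing", "completed_manually": False},
    {"ai_output": "ok",      "completed_manually": False},
]
print(fallback_completion_rate(log))  # 0.5
```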
Trend reports frequently mention synthetic/AI participants. Use with clear boundaries. Example reference: https://www.lyssna.com/blog/ux-research-trends/
| Use Case | Appropriate? | Why |
|---|---|---|
| Early concept brainstorming | Supplement only | Generate edge cases, not validation |
| Scenario/edge case expansion | Yes | Broaden coverage before real testing |
| Moderator training/practice | Yes | Practice without participant burden |
| Hypothesis generation | Yes | Explore directions to test with real users |
| Validation/go-no-go decisions | Never | Cannot substitute lived experience |
| Usability findings as evidence | Never | Real behavior required |
| Quotes in reports | Never | Fabricated quotes damage credibility |
Critical rule: Synthetic outputs are hypotheses, not evidence. Always validate with real users before shipping.
Core Research Methods:
Demographic & Quantitative Research:
Competitive UX Analysis & Flow Patterns:
Research Operations & Methods:
Feedback Collection & Analysis:
Evaluative Iteration:
Data & Sources:
IMPORTANT: When designing UX flows for a specific domain, you MUST use WebSearch to find and suggest best-practice patterns from industry leaders.
| Domain | Industry Leaders to Check | Key Flows |
|---|---|---|
| Fintech/Banking | Wise, Revolut, Monzo, N26, Chime, Mercury | Onboarding/KYC, money transfer, card management, spend analytics |
| E-commerce | Shopify, Amazon, Stripe Checkout | Checkout, cart, product pages, returns |
| SaaS/B2B | Linear, Notion, Figma, Slack, Airtable | Onboarding, settings, collaboration, permissions |
| Developer Tools | Stripe, Vercel, GitHub, Supabase | Docs, API explorer, dashboard, CLI |
| Consumer Apps | Spotify, Airbnb, Uber, Instagram | Discovery, booking, feed, social |
| Healthcare | Oscar, One Medical, Calm, Headspace | Appointment booking, records, compliance flows |
| EdTech | Duolingo, Coursera, Khan Academy | Onboarding, progress, gamification |
When user specifies a domain, execute:
"[domain] UX best practices 2026""[leader company] [flow type] UX""[leader company] app review UX" site:mobbin.com OR site:pageflows.com"[domain] onboarding flow examples"After searching, provide:
DOMAIN: Fintech (Money Transfer)
BENCHMARKED: Wise, Revolut
WISE PATTERNS:
- Upfront fee transparency (shows exact fee before recipient input)
- Mid-transfer rate lock (shows countdown timer)
- Delivery time estimate per payment method
- Recipient validation (bank account check before send)
REVOLUT PATTERNS:
- Instant send to Revolut users (P2P first)
- Currency conversion preview with rate comparison
- Scheduled/recurring transfers prominent
APPLY TO YOUR FLOW:
1. Add fee transparency at step 1 (not step 3)
2. Show delivery estimate per payment rail
3. Consider rate lock feature for FX transfers
DIFFERENTIATION OPPORTUNITY:
- Neither shows historical rate chart—add "is now a good time?" context
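If an agent automates this step, the query templates above expand mechanically. A throwaway sketch (function and parameter names are ours, not the skill's):

```python
def benchmark_queries(domain: str, leaders: list[str], flow: str) -> list[str]:
    """Expand the search templates above for one domain and flow."""
    queries = [
        f"{domain} UX best practices 2026",
        f"{domain} onboarding flow examples",
    ]
    for leader in leaders:
        queries.append(f"{leader} {flow} UX")
        queries.append(f'"{leader}" app review UX site:mobbin.com OR site:pageflows.com')
    return queries

for q in benchmark_queries("fintech", ["Wise", "Revolut"], "money transfer"):
    print(q)
```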
IMPORTANT: When users ask recommendation questions about UX research, you MUST use WebSearch to check current trends before answering.
"UX research trends 2026""UX research tools best practices 2026""[Maze/Hotjar/UserTesting] comparison 2026""AI in UX research 2026"After searching, provide:
For prototype-parity polishing (fast iteration when product is "almost ideal"), see references/evaluative-research-loop.md. Covers: two-surface audit, drift classification (layout/density/control/content/state), friction-based prioritization, banner/loading guardrails, localization-readiness checks, and fast iteration cadence.
Weekly Installs: 71
Repository
GitHub Stars: 48
First Seen: Jan 23, 2026
Security Audits: Gen Agent Trust Hub (Pass) · Socket (Pass) · Snyk (Warn)
Installed on
codex: 54
opencode: 54
gemini-cli: 53
cursor: 53
github-copilot: 49
amp: 44