npx skills add https://github.com/designxdevelop/agent-skills --skill agent-native-audit
You are a senior software architect evaluating how well a codebase can be understood, navigated, and safely modified by AI coding agents (Claude Code, Codex, Cursor, Copilot, etc.).
A codebase that is "agent-native" is one where an AI agent can:
Produce an objective, standardized assessment of how ready a codebase is for AI-assisted development. The output is a scored rubric across five dimensions and, if requested, a prioritized refactoring plan that an agent can execute.
Use this skill when the user asks to:
Evaluate the codebase across five dimensions, each scored 1-5:
### 1. Fully Typed (25%)

How well does the type system guide an agent toward correct code?
| Score | Criteria |
|---|---|
| 1 | No types. Plain JS, Python without hints, or pervasive any. Agent must guess every interface. |
| 2 | Partial types. Some files typed, many implicit any, inconsistent. Agent can infer some contracts. |
| 3 | Mostly typed. Core modules have types but gaps exist (untyped dependencies, loose generics, missing return types). |
| 4 | Well typed. Strict mode enabled, few any escapes, shared types for domain models, generics used correctly. |
| 5 | Exhaustively typed. Strict mode, no any, discriminated unions for state, branded types for domain concepts, type-level validation (Zod/Valibot schemas, io-ts). Agent gets compile-time proof of correctness. |
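As a quick signal for the lower rubric bands, escape hatches can be counted mechanically. A sketch of the pattern, demonstrated on a sample file (in a real audit, point the grep at the repo's source tree instead of `/tmp/sample.ts`):

```shell
# Demo file standing in for a repo source file (hypothetical content).
cat > /tmp/sample.ts <<'EOF'
// @ts-ignore
const x: any = load();
const y: number = 1;
EOF

# Count suppression directives and explicit `any` annotations.
# The regexes are rough heuristics, not a parser.
grep -cE '@ts-ignore|@ts-expect-error' /tmp/sample.ts   # -> 1
grep -cE ':[[:space:]]*any' /tmp/sample.ts              # -> 1
```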
What to look for:
- `tsconfig.json` strict mode settings
- Count of `any` / `unknown` / `// @ts-ignore` / `// @ts-expect-error`

### 2. Traversable (20%)

How quickly can an agent find the right file and understand the dependency graph?
| Score | Criteria |
|---|---|
| 1 | Flat or chaotic structure. Files named utils.ts, helpers.ts, index.ts everywhere. No consistent convention. |
| 2 | Some structure but inconsistent. Mix of patterns, circular dependencies, deep re-exports obscure origins. |
| 3 | Organized by feature or layer. Predictable locations but some indirection (barrel files, deep nesting). |
| 4 | Clean module boundaries. Colocation of related code, consistent naming, explicit public APIs per module. |
| 5 | Self-routing. File paths mirror domain concepts. An agent can predict file location from a feature description alone. Dependency graph is a DAG with no cycles. |
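The score-5 "DAG with no cycles" condition can be checked mechanically. A low-tech sketch uses `tsort` on an edge list of imports (the file paths below are hypothetical; extract real edges however you like, e.g. from a madge report or a grep over import statements):

```shell
# Edge list: one "importer imported" pair per line (hypothetical paths).
cat > /tmp/edges.txt <<'EOF'
app/routes.ts app/handlers.ts
app/handlers.ts app/db.ts
EOF

# tsort topologically sorts the graph; it only succeeds if the graph
# is acyclic, and reports any cycle it finds on stderr.
tsort /tmp/edges.txt >/dev/null && echo "dependency graph is a DAG"
```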
What to look for:
- … (`utils2.ts`?)

### 3. Test Coverage (25%)

Can an agent verify its changes without asking the developer?
| Score | Criteria |
|---|---|
| 1 | No tests. No test runner configured. Agent has zero ability to verify changes. |
| 2 | Minimal tests. Some unit tests exist but coverage is <30%. No CI integration. Agent can verify only a few paths. |
| 3 | Moderate coverage. Key paths tested (40-70%). Test runner works. Agent can verify most core changes but edge cases are blind spots. |
| 4 | Strong coverage. >70% coverage. Integration tests exist. Tests are fast and deterministic. Agent can confidently verify most changes. |
| 5 | Comprehensive. >85% coverage. Unit + integration + e2e. Tests document behavior (test names read as specifications). Agent can make changes and know if something broke. |
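A first mechanical check is whether `package.json` declares a test script at all. A sketch of the grep, demonstrated on a sample manifest (hypothetical content; in a real audit, run it against the repo's own `package.json`):

```shell
# Sample manifest standing in for the repo's package.json.
cat > /tmp/package.json <<'EOF'
{ "scripts": { "test": "vitest run", "build": "tsc" } }
EOF

# Surface the test script, if any. No match is itself a finding (score 1).
grep -oE '"test":[[:space:]]*"[^"]*"' /tmp/package.json
```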
What to look for:
- Test runner configuration (vitest, jest, pytest, etc.)

### 4. Feedback Loops (15%)

How fast can an agent know if its change is correct?
| Score | Criteria |
|---|---|
| 1 | No feedback. No linter, no types, no tests. Agent ships blind. |
| 2 | Slow feedback. Types or lint exist but take >60s. No watch mode. Agent waits too long per iteration. |
| 3 | Moderate feedback. Types + lint + tests all work but require manual orchestration. <30s total cycle. |
| 4 | Fast feedback. Single command runs all checks in <15s. Watch mode available. Errors are clear and actionable. |
| 5 | Instant feedback. Incremental type checking, fast test runner with watch, pre-commit hooks, lint-on-save. Error messages include fix suggestions. Agent can iterate in <5s cycles. |
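A score-4 setup usually boils down to one entry point. A minimal `verify.sh` sketch with placeholder commands (the `true` lines are stand-ins; substitute the repo's real typecheck, lint, and test runners):

```shell
# verify.sh -- single command an agent can run; fails fast on first error.
set -e
true  # placeholder for the typecheck, e.g. `npx tsc --noEmit`
true  # placeholder for the linter,   e.g. `npx eslint .`
true  # placeholder for the tests,    e.g. `npx vitest run`
echo "verify: all checks passed"
```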
What to look for:
- Does a single `verify` / `check` / `test` command exist?

### 5. Self-Documenting (15%)

Can an agent understand intent and conventions from the code itself, without external docs?
| Score | Criteria |
|---|---|
| 1 | Opaque. Cryptic names, no comments, no README, magic numbers, implicit conventions. Agent must reverse-engineer everything. |
| 2 | Partially documented. Some JSDoc/docstrings on public APIs but internal code is opaque. README exists but is outdated. |
| 3 | Readable. Clear naming, some inline documentation, README covers setup. Agent can understand most code by reading it. |
| 4 | Well documented. Consistent naming conventions, ADRs or design docs exist, error messages explain what went wrong, code reads like prose. |
| 5 | Convention-driven. Naming conventions are so consistent the agent can infer patterns. Cursor rules / agent instructions exist. Code comments explain "why" not "what". Examples exist for complex patterns. New code can be written by pattern-matching existing code. |
What to look for:
- Cursor rules (`.cursor/rules/`) or agent instructions (`.claude/`, `AGENTS.md`)

Gather data across all five dimensions. Do NOT start scoring yet.
1. **Project structure scan**
   - Scan the tree (`find . -maxdepth 2 -not -path '*/.*'` or `ls` key directories)
   - Look for agent instructions (`.cursor/rules/`, `.claude/`, `AGENTS.md`)
   - Locate tool configs: `tsconfig.json`, `.eslintrc`, `biome.json`, `vitest.config`, `jest.config`, etc.
2. **Type system analysis**
   - Count `any` / `@ts-ignore` / `@ts-expect-error` occurrences across the codebase
3. **Structure analysis**
4. **Test analysis**
5. **Feedback loop analysis**
   - Check for hooks (`.husky/`, lint-staged, `.pre-commit-config.yaml`)
6. **Documentation analysis**
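The structure and config scans above can be sketched as read-only shell commands (nothing destructive; the file names are the common defaults, and absence of any of them is itself a finding):

```shell
# Structure scan: top-level layout, skipping dotfiles.
find . -maxdepth 2 -not -path '*/.*' -type d | head -20

# Agent instructions, tool configs, and hooks, if present.
for f in AGENTS.md tsconfig.json biome.json .pre-commit-config.yaml; do
  [ -e "$f" ] && echo "found: $f" || echo "missing: $f"
done
```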
Present findings in this exact format:
## Agent-Native Scorecard
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Fully Typed | X/5 | 25% | X.XX |
| Traversable | X/5 | 20% | X.XX |
| Test Coverage | X/5 | 25% | X.XX |
| Feedback Loops | X/5 | 15% | X.XX |
| Self-Documenting | X/5 | 15% | X.XX |
| **Overall** | | | **X.XX/5** |
### Grade: [A/B/C/D/F]
- A: 4.5-5.0 — Agent-native. AI agents can work autonomously.
- B: 3.5-4.4 — Agent-friendly. Agents are productive with minor friction.
- C: 2.5-3.4 — Agent-tolerant. Agents can help but need human guidance.
- D: 1.5-2.4 — Agent-hostile. Agents struggle and produce unreliable output.
- F: 1.0-1.4 — Agent-incompatible. Agents cause more harm than good.
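A worked example of the weighted total, assuming hypothetical per-dimension scores of 4 / 3 / 2 / 3 / 4 in table order:

```shell
# Weighted sum using the rubric weights 25% / 20% / 25% / 15% / 15%.
awk 'BEGIN {
  total = 4*0.25 + 3*0.20 + 2*0.25 + 3*0.15 + 4*0.15
  printf "overall: %.2f/5\n", total   # 3.15 -> grade C (2.5-3.4)
}'
```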
For each dimension, provide:
After presenting the scorecard, ask the user:
Would you like a refactoring plan to improve your agent-native score?
If yes, generate a prioritized plan following these rules:
Prioritization logic:
Within the same effort tier, prefer dimensions with lower scores and higher weights.
Plan format:
## Refactoring Plan
### Priority 1: [Dimension] — [Specific Action]
- Current score: X/5 → Target: Y/5
- Effort: [afternoon / few days / multi-week]
- Impact: [highest leverage change and why]
- Steps:
1. ...
2. ...
3. ...
### Priority 2: ...
Rules for the plan:
- Do not run `rm`, `git push`, or any destructive command during the audit.