Agentation 自主驾驶模式：网页设计批注自动化工具，实现智能评审与交互可视化

agentation-self-driving by benjitaylor/agentation

2,300 周安装量

3,100 GitHub Stars

GitHub

安装命令

npx skills add https://github.com/benjitaylor/agentation --skill agentation-self-driving

AI/机器学习开发设计

🇨🇳中文介绍

Agentation 自主驾驶模式

通过 Agentation 工具栏自动为网页添加设计批注，实现自主评审——在可见的带界面浏览器中运行，让用户能够实时观察智能体工作，就像观看自动驾驶汽车导航一样。

启动——始终保持界面可见

浏览器必须可见。切勿以无头模式运行。用户将观察你扫描、悬停、点击和批注的过程。

预检：首先验证 agent-browser 是否可用：

command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }

启动：首先尝试直接打开。仅当打开命令因会话过期错误而失败时，才关闭现有会话——这可以避免终止他人正在使用的浏览器：

# 尝试打开。如果失败（会话过期），先关闭再重试。
agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }

然后验证 Agentation 工具栏是否存在并展开它：

# 1. 检查页面上是否存在工具栏（data-feedback-toolbar 是根标记）
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
# 如果返回 "NOT FOUND"：此页面未安装 Agentation——停止并告知用户

# 2. 仅在折叠时展开（已展开时点击会折叠它）
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"

# 3. 验证：截取快照并查找工具栏控件
agent-browser snapshot -i
# 如果已展开：你将看到“阻止页面交互”复选框、颜色按钮（紫色、蓝色等）
# 如果已折叠：你只会看到小的切换按钮——重试步骤 2

广告位招租

在这里展示您的产品或服务

触达数万 AI 开发者，精准高效

联系我们

相关 Skills

find-skills 技能搜索工具 - Vercel Labs 开源智能体技能包管理器

733,500 周安装

Vercel React 最佳实践指南 | 58条Next.js性能优化规则与代码重构

252,100 周安装

Vercel Web界面规范检查工具 - 自动检测代码是否符合Web设计指南

202,600 周安装

agent-browser 浏览器自动化工具 - Vercel Labs 命令行网页操作与测试

133,200 周安装

@ref 兼容性：只有 click、fill、type、hover、focus、check、select、drag 支持 @ref 语法。命令 scrollintoview、get box 和 eval 不支持——它们期望 CSS 选择器。使用 eval 配合 querySelector 进行滚动和位置查找。

# 1. 截取交互式快照——识别目标元素并构建 CSS 选择器
agent-browser snapshot -i
# 示例：快照显示标题 "Point at bugs." [ref=e10]
# 推导 CSS 选择器：'h1'，或更具体：'h1:first-of-type'

# 2. 通过 eval 将元素滚动到视图中（不要使用 scrollintoview @ref——那会出错）
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"

# 3. 通过 eval 获取其边界框（不要使用 get box @ref——那也会出错）
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
# 返回："383,245,200,40"（解析为 x,y,width,height）

# 4. 将光标移动到元素中心，然后点击
#    centerX = x + width/2,  centerY = y + height/2
agent-browser mouse move <centerX> <centerY>
agent-browser mouse down left
agent-browser mouse up left

# 5. 获取批注对话框的引用——读取完整的快照输出
#    对话框引用出现在列表的底部，不要用 head/tail 截断
agent-browser snapshot -i
# 查找：文本框 "What should change?" 和 "Cancel" / "Add" 按钮

# 6. 输入评论——fill 和 click 确实支持 @ref
agent-browser fill @<textboxRef> "Your critique here"

# 7. 提交（填写文本后 Add 按钮会启用）
agent-browser click @<addRef>

快照行	CSS 选择器
`heading "Point at bugs." [ref=e10]`	`h1` 或 `h1:first-of-type`
`button "npm install agentation Copy" [ref=e15]`	`button:has(code)` 或通过 eval 按文本内容选择
`link "Star on GitHub" [ref=e28]`	`a[href*=github]`
`paragraph (long text...) [ref=e20]`	按区域定位：`section:nth-of-type(2) p`

通过 eval（scrollIntoView）滚动到目标区域
选择一个特定元素——标题、段落、按钮、区域容器
通过 eval（getBoundingClientRect）获取其边界框
执行坐标点击序列（mouse move → mouse down → mouse up）
读取完整的快照输出以查找底部的对话框引用
撰写评论（fill @ref）并提交（click @ref）
验证批注是否已添加（见下文）
移动到下一个区域

区域	关注点
首屏/折叠上方	标题层级、行动号召按钮位置、视觉分组
导航	标签样式、分类分组、视觉权重
演示/插图	清晰度、深度、动画可读性
内容区域	间距节奏、标注处理、排版层级
关键标语	是否有足够视觉强调的共鸣语句
行动号召按钮和页脚	转化权重、视觉分隔、最终行动

“浏览器未启动。请先调用 launch。”：来自先前运行的过期会话——运行 agent-browser close 2>/dev/null，然后重试 --headed open 命令
页面上未找到工具栏：未安装 Agentation——先运行 /agentation 进行设置
点击后无对话框：工具栏已折叠——使用状态感知的 eval 重新展开（先检查 [class*=expanded]），重试
目标元素错误：点击 Cancel，滚动到预期元素，使用正确坐标重试
Add 按钮保持禁用状态：未填写文本——重新截取快照并填写文本框
页面已导航：“阻止页面交互”已关闭——通过工具栏设置启用
批注计数未增加：提交失败——对话框可能仍处于打开状态，重新截取快照并检查
运行中途中断（Ctrl+C）：浏览器保持打开状态，并停留在中断时的状态。在开始新会话前运行 agent-browser close 进行清理

陷阱	发生情况	修复方法
`scrollintoview @ref`	崩溃：“解析 css 选择器时出现不支持的标记 @ref”	使用 `eval "document.querySelector('sel').scrollIntoView({block:'center'})"`
`get box @ref`	同样崩溃——`get box` 将引用解析为 CSS 选择器	使用 `eval "((r)=>r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('sel').getBoundingClientRect())"`
`eval` 中使用双感叹号	Bash 在命令运行前将双感叹号扩展为历史替换	改用 `expr !== null` 或 `expr ? true : false`
`eval` 中使用反斜杠转义引号	转义的内引号在不同 shell 中会中断	去掉引号：`[class*=toggleContent]` 适用于没有空格的简单值
`snapshot -i	head -50`	批注对话框引用（`textbox "What should change?"`、`Add`、`Cancel`）出现在快照的底部
`click @ref` 用于覆盖层元素	点击会穿透到真实的 DOM，绕过 Agentation 覆盖层	使用 `mouse move` → `mouse down left` → `mouse up left` 进行基于坐标的点击，覆盖层会拦截这些点击
`--headed open` 失败并显示“浏览器未启动”	先前运行的过期会话阻止新启动	运行 `agent-browser close 2>/dev/null`，然后重试打开命令

经验法则：@ref 适用于交互命令（click、fill、type、hover）。对于其他所有命令（eval、get、scrollintoview），在 eval 中通过 querySelector 使用 CSS 选择器。

🇺🇸English

Agentation Self-Driving Mode

Autonomously critique a web page by adding design annotations via the Agentation toolbar — in a visible headed browser so the user can watch the agent work in real time, like watching a self-driving car navigate.

Launch — Always Headed

The browser MUST be visible. Never run headless. The user watches you scan, hover, click, and annotate.

Preflight : Verify agent-browser is available before anything else:

command -v agent-browser >/dev/null || { echo "ERROR: agent-browser not found. Install the agent-browser skill first."; exit 1; }

Launch : Try opening directly first. Only close an existing session if the open command fails with a stale session error — this avoids killing a browser someone else is using:

# Try to open. If it fails (stale session), close first then retry.
agent-browser --headed open <url> 2>&1 || { agent-browser close 2>/dev/null; agent-browser --headed open <url>; }

Then verify the Agentation toolbar is present and expand it:

# 1. Check toolbar exists on the page (data-feedback-toolbar is the root marker)
agent-browser eval "document.querySelector('[data-feedback-toolbar]') ? 'toolbar found' : 'NOT FOUND'"
# If "NOT FOUND": Agentation is not installed on this page — stop and tell the user

# 2. Expand ONLY if collapsed (clicking when already expanded collapses it)
agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'already expanded' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanding')"

# 3. Verify: take a snapshot and look for toolbar controls
agent-browser snapshot -i
# If expanded: you'll see "Block page interactions" checkbox, color buttons (Purple, Blue, etc.)
# If collapsed: you'll only see the small toggle button — retry step 2

"Block page interactions" must be checked (default: on).

eval quoting rule : Always use [class*=toggleContent] (no quotes around the attribute value) in eval strings. Do not use double-bang in eval because bash treats it as history expansion. Do not use backslash-escaped inner quotes either, as they break unpredictably across shells.

Critical: How to Create Annotations

Standard element clicks (click @ref) do NOT trigger annotation dialogs. The Agentation overlay intercepts pointer events at the coordinate level. Use coordinate-based mouse events — this also makes the interaction visible in the browser as the cursor moves across the page.

@ref compatibility: Only click, fill, type, hover, focus, check, select, drag support @ref syntax. The commands scrollintoview, get box, and do NOT — they expect CSS selectors. Use with for scrolling and position lookup.

# 1. Take interactive snapshot — identify target element and build a CSS selector
agent-browser snapshot -i
# Example: snapshot shows  heading "Point at bugs." [ref=e10]
# Derive a CSS selector: 'h1', or more specific: 'h1:first-of-type'

# 2. Scroll the element into view via eval (NOT scrollintoview @ref — that breaks)
agent-browser eval "document.querySelector('h1').scrollIntoView({block:'center'})"

# 3. Get its bounding box via eval (NOT get box @ref — that also breaks)
agent-browser eval "((r) => r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('h1').getBoundingClientRect())"
# Returns: "383,245,200,40"  (parse these as x,y,width,height)

# 4. Move cursor to element center, then click
#    centerX = x + width/2,  centerY = y + height/2
agent-browser mouse move <centerX> <centerY>
agent-browser mouse down left
agent-browser mouse up left

# 5. Get the annotation dialog refs — read the FULL snapshot output
#    Dialog refs appear at the BOTTOM of the list, don't truncate with head/tail
agent-browser snapshot -i
# Look for: textbox "What should change?" and "Cancel" / "Add" buttons

# 6. Type critique — fill and click DO support @ref
agent-browser fill @<textboxRef> "Your critique here"

# 7. Submit (Add button enables after text is filled)
agent-browser click @<addRef>

If no dialog appears after clicking, the toolbar may have collapsed. Re-expand (only if collapsed) and retry:

agent-browser eval "document.querySelector('[data-feedback-toolbar][class*=expanded]') ? 'ok' : (document.querySelector('[class*=toggleContent]')?.click(), 'expanded')"

Building CSS selectors from snapshots

The snapshot shows element roles, names, and refs. Map them to CSS selectors:

Snapshot line	CSS selector
`heading "Point at bugs." [ref=e10]`	`h1` or `h1:first-of-type`
`button "npm install agentation Copy" [ref=e15]`	`button:has(code)` or by text content via eval
`link "Star on GitHub" [ref=e28]`	`a[href*=github]`
`paragraph (long text...) [ref=e20]`

When in doubt, use a broader selector and verify with eval:

agent-browser eval "document.querySelector('h2').textContent"

The Loop

Work top-to-bottom through the page. For each annotation:

Scroll to the target area via eval (scrollIntoView)
Pick a specific element — heading, paragraph, button, section container
Get its bounding box via eval (getBoundingClientRect)
Execute the coordinate-click sequence (mouse move → mouse down → mouse up)
Read the full snapshot output to find dialog refs at the bottom
Write the critique (fill @ref) and submit (click @ref)
Verify the annotation was added (see below)
Move to the next area

Verifying annotations

After submitting each annotation, confirm the count increased:

agent-browser eval "document.querySelectorAll('[data-annotation-marker]').length"
# Should return the expected count (1 after first, 2 after second, etc.)

If the count didn't increase, the submission failed silently — re-snapshot and check if the dialog is still open.

Aim for 5-8 annotations per page unless told otherwise.

What to Critique

Area	What to look for
Hero / above the fold	Headline hierarchy, CTA placement, visual grouping
Navigation	Label styling, category grouping, visual weight
Demo / illustrations	Clarity, depth, animation readability
Content sections	Spacing rhythm, callout treatments, typography hierarchy
Key taglines	Whether resonant lines get enough visual emphasis
CTAs and footer	Conversion weight, visual separation, final actions

Critique Style

2-3 sentences max per annotation:

Specific and actionable : "Stack the install command below the subheading at 16px" not "fix the layout"
1-2 concrete alternatives : Reference CSS values, layout patterns, or design systems
Name the principle : Visual hierarchy, Gestalt grouping, whitespace, emphasis, conversion design
Reference comparable products : "Like how Stripe/Linear/Vercel handles this"

Bad: "This section needs work" Good: "This bullet list reads like docs, not a showcase. Use a 3-column card grid with icons — similar to Stripe's guidelines pattern. Creates visual rhythm and scannability."

Install

The skill must be symlinked into ~/.claude/skills/ for Claude Code to discover it:

ln -s "$(pwd)/skills/agentation-self-driving" ~/.claude/skills/agentation-self-driving

Restart Claude Code after installing. Verify with /agentation-self-driving — if it loads the skill instructions, the symlink is working.

Troubleshooting

"Browser not launched. Call launch first." : Stale session from a previous run — run agent-browser close 2>/dev/null then retry the --headed open command
Toolbar not found on page : Agentation isn't installed — run /agentation to set it up first
No dialog after clicking : Toolbar collapsed — re-expand with the state-aware eval (check [class*=expanded] first), retry
Wrong element targeted : Click Cancel, scroll to intended element, retry with correct coordinates
Add button stays disabled : Text wasn't filled — re-snapshot and fill the textbox
Page navigated : "Block page interactions" is off — enable via toolbar settings
Annotation count didn't increase : Submission failed — dialog may still be open, re-snapshot and check
Interrupted mid-run (Ctrl+C) : The browser stays open with whatever state it was in. Run agent-browser close to clean up before starting a new session

agent-browser Pitfalls

These will silently break the workflow if you're not aware of them:

Pitfall	What happens	Fix
`scrollintoview @ref`	Crashes: "Unsupported token @ref while parsing css selector"	Use `eval "document.querySelector('sel').scrollIntoView({block:'center'})"`
`get box @ref`	Same crash — `get box` parses refs as CSS selectors	Use `eval "((r)=>r.x+','+r.y+','+r.width+','+r.height)(document.querySelector('sel').getBoundingClientRect())"`
`eval` with double-bang	Bash expands double-bang as history substitution before the command runs

Rule of thumb : @ref works for interaction commands (click, fill, type, hover). For everything else (eval, get, scrollintoview), use CSS selectors via querySelector in an eval.

Two-Session Workflow (Full Self-Driving)

With MCP connected (toolbar shows "MCP Connected"), annotations auto-send to any listening agent. This enables:

Session 1 (this skill): Watches the page, adds critique annotations in the visible browser
Session 2 : Runs agentation_watch_annotations in a loop, receives annotations, edits code to address each one

The user watches Session 1 drive through the page in the browser while Session 2 fixes issues in the codebase — fully autonomous design review and implementation.

Weekly Installs

1.3K

Repository

benjitaylor/agentation

GitHub Stars

2.8K

First Seen

Feb 15, 2026

Security Audits

Gen Agent Trust HubPass SocketFail SnykWarn

Installed on

codex1.2K

opencode1.2K

gemini-cli1.2K

github-copilot1.2K

claude-code1.1K

cursor1.1K

Agentation 自主驾驶模式：网页设计批注自动化工具，实现智能评审与交互可视化

🇨🇳中文介绍

Agentation 自主驾驶模式

启动——始终保持界面可见

相关 Skills

关键：如何创建批注

从快照构建 CSS 选择器

循环流程

验证批注

评审内容

评审风格

安装

故障排除

agent-browser 陷阱

双会话工作流（完全自主驾驶）