playwright-visual-testing by manutej/luxor-claude-marketplace
npx skills add https://github.com/manutej/luxor-claude-marketplace --skill playwright-visual-testing
A comprehensive skill for browser automation and visual testing using Playwright MCP server integration. This skill enables rapid UI testing, visual regression detection, automated browser interactions, and cross-browser validation for modern web applications.
Use this skill when:
Playwright provides reliable end-to-end testing for modern web apps:
browser_navigate
Navigate to a URL in the current page.
Parameters:
url: The URL to navigate to (required)
Example:
url: "https://example.com"
Best Practices:
browser_navigate_back
Navigate back to the previous page in history.
Parameters: None
Example:
// Navigate back after clicking a link
Use Cases:
browser_close
Close the current browser page.
Parameters: None
When to Use:
browser_resize
Resize the browser viewport.
Parameters:
width: Width in pixels (required)
height: Height in pixels (required)
Common Viewports:
// Mobile
width: 375, height: 667 // iPhone SE
width: 414, height: 896 // iPhone XR
// Tablet
width: 768, height: 1024 // iPad
// Desktop
width: 1280, height: 720 // HD
width: 1920, height: 1080 // Full HD
Example:
width: 375
height: 667
browser_snapshot
Capture an accessibility snapshot of the current page.
Parameters: None
Returns:
Why Use Snapshots:
Example Snapshot Structure:
heading "Welcome" [ref=123]
text "to our site"
button "Sign In" [ref=456]
textbox "Email" [ref=789]
value: ""
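Element refs have to be read out of the snapshot before any interaction. As a minimal sketch, assuming snapshot lines look exactly like the example structure above (`role "name" [ref=…]`), the refs could be collected like this. The `extract_refs` helper and its regular expression are illustrative assumptions, not part of the Playwright MCP server, and real snapshots may use a richer format.

```python
import re

# Matches lines shaped like:  button "Sign In" [ref=456]
# The role/name/ref layout is assumed from the example snapshot in this document.
REF_LINE = re.compile(
    r'^\s*(?P<role>[\w-]+)\s+"(?P<name>[^"]*)"\s+\[ref=(?P<ref>[^\]]+)\]'
)

def extract_refs(snapshot_text: str) -> dict:
    """Map (role, accessible name) to the ref string found on that line."""
    refs = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.match(line)
        if match:  # lines without a [ref=...] marker are skipped
            refs[(match["role"], match["name"])] = match["ref"]
    return refs

snapshot = '''heading "Welcome" [ref=123]
text "to our site"
button "Sign In" [ref=456]
textbox "Email" [ref=789]
value: ""'''

refs = extract_refs(snapshot)
print(refs[("button", "Sign In")])
```

Looking refs up by role and name, rather than hard-coding them, keeps a test from reusing a stale ref after the page changes.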
browser_take_screenshot
Take a screenshot of the current page or element.
Parameters:
filename: Output filename (optional, defaults to page-{timestamp}.png)
type: Image format - "png" or "jpeg" (default: png)
fullPage: Capture full scrollable page (default: false)
element: Human-readable element description (optional)
ref: Element reference from snapshot (optional, requires element)
Screenshot Types:
1. Viewport Screenshot:
filename: "homepage-viewport.png"
2. Full Page Screenshot:
filename: "homepage-full.png"
fullPage: true
3. Element Screenshot:
filename: "header.png"
element: "main header navigation"
ref: "123"
Best Practices:
browser_click
Click an element.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
button: "left", "right", or "middle" (default: left)
doubleClick: true for double-click (default: false)
modifiers: Array of modifier keys ["Alt", "Control", "ControlOrMeta", "Meta", "Shift"]
Examples:
1. Basic Click:
element: "Submit button"
ref: "456"
2. Right Click:
element: "Context menu trigger"
ref: "789"
button: "right"
3. Click with Modifier:
element: "Link to open in new tab"
ref: "123"
modifiers: ["ControlOrMeta"]
4. Double Click:
element: "Word to select"
ref: "321"
doubleClick: true
browser_type
Type text into an editable element.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
text: Text to type (required)
slowly: Type one character at a time (default: false)
submit: Press Enter after typing (default: false)
Examples:
1. Basic Text Entry:
element: "Email textbox"
ref: "123"
text: "user@example.com"
2. Search with Submit:
element: "Search field"
ref: "456"
text: "playwright testing"
submit: true
3. Character-by-Character (triggers key handlers):
element: "Auto-complete input"
ref: "789"
text: "New York"
slowly: true
browser_press_key
Press a keyboard key.
Parameters:
key: Key name or character (required)
Common Keys:
ArrowLeft, ArrowRight, ArrowUp, ArrowDown
Enter, Escape, Tab, Backspace, Delete
Home, End, PageUp, PageDown
F1-F12
Control, Alt, Shift, Meta
Examples:
// Navigation
key: "ArrowDown"
// Submit form
key: "Enter"
// Close dialog
key: "Escape"
// Tab through fields
key: "Tab"
browser_fill_form
Fill multiple form fields at once.
Parameters:
fields: Array of field objects (required)
- name: Human-readable field name
- type: "textbox", "checkbox", "radio", "combobox", "slider"
- ref: Element reference from snapshot
- value: Value to set (string, "true"/"false" for checkboxes)
Example:
fields: [
  {
    name: "Username",
    type: "textbox",
    ref: "123",
    value: "john_doe"
  },
  {
    name: "Password",
    type: "textbox",
    ref: "456",
    value: "secretpass123"
  },
  {
    name: "Remember me",
    type: "checkbox",
    ref: "789",
    value: "true"
  }
]
browser_select_option
Select one or more options from a dropdown.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
values: Array of values to select (required)
Example:
element: "Country dropdown"
ref: "123"
values: ["United States"]
Multi-select:
element: "Programming languages"
ref: "456"
values: ["JavaScript", "Python", "Go"]
browser_hover
Hover over an element.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
Use Cases:
Example:
element: "Help icon"
ref: "123"
browser_drag
Drag and drop between elements.
Parameters:
startElement: Source element description (required)
startRef: Source element reference (required)
endElement: Target element description (required)
endRef: Target element reference (required)
Example:
startElement: "Task card"
startRef: "123"
endElement: "Done column"
endRef: "456"
Use Cases:
browser_evaluate
Execute JavaScript in the page context.
Parameters:
function: JavaScript function as string (required)
element: Element description (optional)
ref: Element reference (optional, requires element)
Examples:
// Evaluate on the page
function: "() => { return document.title; }"
// Evaluate on a specific element
element: "Custom widget"
ref: "123"
function: "(element) => { return element.getAttribute('data-value'); }"
Common Use Cases:
// Get page title
function: "() => document.title"
// Scroll to bottom
function: "() => window.scrollTo(0, document.body.scrollHeight)"
// Get element dimensions
function: "(element) => { const rect = element.getBoundingClientRect(); return { width: rect.width, height: rect.height }; }"
// Set local storage
function: "() => localStorage.setItem('theme', 'dark')"
// Get computed style
function: "(element) => getComputedStyle(element).backgroundColor"
browser_file_upload
Upload files to a file input.
Parameters:
paths: Array of absolute file paths (optional)
- Omit it or pass an empty array to cancel the file chooser
Example:
paths: [
"/Users/user/Documents/resume.pdf",
"/Users/user/Photos/headshot.jpg"
]
Single File:
paths: ["/Users/user/Downloads/report.csv"]
Cancel Upload:
paths: []
browser_console_messages
Get console messages from the browser.
Parameters:
onlyErrors: Return only error messages (default: false)
Returns:
Examples:
// All console messages
onlyErrors: false
// Errors only
onlyErrors: true
Use Cases:
browser_network_requests
Get all network requests since page load.
Parameters: None
Returns:
Use Cases:
browser_handle_dialog
Respond to browser dialogs.
Parameters:
accept: Accept or dismiss dialog (required)
promptText: Text for prompt dialogs (optional)
Dialog Types:
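Examples:
// Accept an alert or confirm dialog
accept: true
// Dismiss a confirm dialog
accept: false
// Accept a prompt dialog and supply its text
accept: true
promptText: "John Doe"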
browser_wait_for
Wait for a condition before proceeding.
Parameters:
text: Wait for text to appear (optional)
textGone: Wait for text to disappear (optional)
time: Wait for specified seconds (optional)
Examples:
// Wait for text to appear
text: "Loading complete"
// Wait for text to disappear
textGone: "Loading..."
// Wait a fixed number of seconds
time: 2
Best Practices:
browser_tabs
Manage browser tabs.
Parameters:
action: "list", "new", "close", "select" (required)
index: Tab index for close/select (optional)
Actions:
1. List Tabs:
action: "list"
2. New Tab:
action: "new"
3. Close Tab:
action: "close"
index: 1 // Optional, closes current if omitted
4. Switch Tab:
action: "select"
index: 0
Use Cases:
browser_install
Install the browser specified in the config.
Parameters: None
When to Use:
Scenario: Verify homepage hasn't changed visually
1. Navigate to page
- Use browser_navigate with target URL
- Wait for page to load completely
2. Capture baseline
- Take full-page screenshot
- Use browser_snapshot for context
- Document visible elements
3. Make changes (if testing changes)
- Update code, deploy
- Clear cache
4. Capture new state
- Navigate to same URL
- Take identical screenshot
- Compare manually or with tools
5. Validate differences
- Expected changes present
- No unexpected regressions
- Document findings
Scenario: Test layout across devices
1. Define viewports
- Mobile: 375x667 (iPhone SE)
- Tablet: 768x1024 (iPad)
- Desktop: 1920x1080 (Full HD)
2. For each viewport:
a. Resize browser
- browser_resize with dimensions
b. Navigate to page
- browser_navigate to URL
c. Wait for layout
- browser_wait_for with condition
d. Capture snapshot
- browser_snapshot for structure
e. Take screenshot
- browser_take_screenshot with descriptive name
- Include viewport in filename
3. Compare layouts
- Verify responsive breakpoints
- Check element reflow
- Validate mobile navigation
- Ensure content accessibility
4. Document issues
- Screenshot any problems
- Note viewport where issue occurs
- Record expected vs actual behavior
Scenario: Test multi-step form submission
1. Navigate to form
- browser_navigate to form URL
- browser_snapshot to get field refs
2. Fill form fields
- Use browser_fill_form for batch entry
- Or individual browser_type for each field
- Include validation triggers
3. Test validation
- Submit with invalid data
- browser_snapshot to see errors
- Screenshot error states
- Verify error messages appear
4. Complete valid submission
- Fill all required fields
- browser_click submit button
- Wait for success message
- browser_wait_for confirmation text
5. Verify results
- Check success page
- Verify data submission
- Screenshot confirmation
- Check network requests
Scenario: Test individual component changes
1. Navigate to component page
- browser_navigate to page
- browser_snapshot for structure
2. Locate component
- Find element ref from snapshot
- Verify component is visible
3. Test states
a. Default state
- Take element screenshot
- Document initial appearance
b. Hover state
- browser_hover on element
- Take element screenshot
- Compare with default
c. Active/focused state
- browser_click on element
- Take element screenshot
- Verify visual feedback
d. Error state (if applicable)
- Trigger validation error
- Take element screenshot
- Verify error styling
4. Document state changes
- Compare screenshots
- Note expected behaviors
- Report any issues
Scenario: Verify consistency across browsers
1. Define browser matrix
- Chromium (Chrome/Edge)
- Firefox
- WebKit (Safari)
2. For each browser:
a. Configure browser
- Set in MCP server config
b. Run test suite
- Navigate to pages
- Capture snapshots
- Take screenshots
- Test interactions
c. Document results
- Save browser-specific screenshots
- Note rendering differences
- Log browser-specific bugs
3. Compare results
- Side-by-side screenshots
- Functionality differences
- Performance variations
- CSS rendering issues
4. Address discrepancies
- Fix critical cross-browser bugs
- Document acceptable differences
- Add browser-specific styles if needed
Scenario: Complete user workflow validation
1. Start journey
- browser_navigate to landing page
- browser_snapshot initial state
- Screenshot starting point
2. Authentication
- Navigate to login
- Fill credentials with browser_fill_form
- Submit form
- Wait for redirect
- Screenshot logged-in state
3. Main workflow steps
For each step:
- Take snapshot before action
- Perform user action
- Wait for completion
- Take screenshot after action
- Verify expected state
4. Complete transaction
- Submit final action
- Wait for confirmation
- Screenshot success state
- Verify completion message
5. Cleanup
- Logout if needed
- Screenshot final state
- Document journey results
Scenario: Verify semantic structure and accessibility
1. Navigate to page
- browser_navigate to URL
2. Capture accessibility snapshot
- browser_snapshot for semantic tree
- Review element roles
- Check heading hierarchy
- Verify labels and descriptions
3. Validate structure
- Proper heading levels (h1 → h2 → h3)
- Form inputs have labels
- Buttons have accessible names
- Interactive elements have roles
- ARIA attributes present
4. Test keyboard navigation
- browser_press_key "Tab"
- Snapshot after each tab
- Verify focus indicators
- Ensure logical tab order
- Test skip links
5. Test screen reader experience
- Review snapshot text content
- Verify alt text present
- Check ARIA live regions
- Validate semantic landmarks
- Ensure meaningful structure
6. Document findings
- Screenshot accessibility tree
- Note missing labels
- Report hierarchy issues
- Suggest improvements
1. Consistent Naming
{page}-{viewport}-{state}-{timestamp}.png
Examples:
homepage-desktop-default-1634567890.png
login-mobile-error-1634567891.png
checkout-tablet-success-1634567892.png
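A small helper keeps filenames consistent with the `{page}-{viewport}-{state}-{timestamp}.png` pattern. This is a sketch only: `screenshot_name` is a hypothetical helper, and it assumes the timestamp is a Unix time in whole seconds, as the examples above suggest.

```python
import time
from typing import Optional

def screenshot_name(page: str, viewport: str, state: str,
                    timestamp: Optional[int] = None) -> str:
    """Build a '{page}-{viewport}-{state}-{timestamp}.png' filename."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    return f"{page}-{viewport}-{state}-{timestamp}.png"

print(screenshot_name("homepage", "desktop", "default", 1634567890))
# homepage-desktop-default-1634567890.png
```

The result can be passed as the `filename` argument of browser_take_screenshot.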
2. Filename Organization
screenshots/
├── baselines/
│ ├── homepage-desktop.png
│ ├── homepage-mobile.png
│ └── homepage-tablet.png
├── current/
│ └── homepage-desktop-20251017.png
└── diffs/
└── homepage-desktop-diff-20251017.png
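The directory layout above can be derived from a page name, a viewport, and a date, so every test writes to predictable locations. The following is a sketch under that assumption; `paths_for` and `ROOT` are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath

# Root folder taken from the tree shown in this document.
ROOT = PurePosixPath("screenshots")

def paths_for(page: str, viewport: str, date: str) -> dict:
    """Return baseline, current, and diff paths for one page/viewport pair."""
    stem = f"{page}-{viewport}"
    return {
        "baseline": ROOT / "baselines" / f"{stem}.png",
        "current": ROOT / "current" / f"{stem}-{date}.png",
        "diff": ROOT / "diffs" / f"{stem}-diff-{date}.png",
    }

for kind, path in paths_for("homepage", "desktop", "20251017").items():
    print(kind, path)
```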
3. Full Page vs Viewport
Use Snapshots When:
Use Screenshots When:
Use Both When:
1. Prefer Condition-Based Waits
// Good
browser_wait_for with text: "Data loaded"
// Avoid
browser_wait_for with time: 5
2. Wait for Animations
// Wait for loading spinner to disappear
browser_wait_for with textGone: "Loading..."
3. Wait for Network Idle
// Check network requests after waiting
browser_network_requests to verify completion
4. Dynamic Content
// Wait for specific text before screenshot
browser_wait_for with text: "Results: 42 items"
1. Always Snapshot First
1. browser_snapshot
2. Find element ref in snapshot
3. Use ref for interaction
4. Never guess element references
2. Verify Element State
// Take snapshot to verify element exists
// Check element is visible and actionable
// Then perform interaction
3. Handle Dynamic Elements
// Wait for element to appear
browser_wait_for with text: "Submit"
// Then take fresh snapshot
browser_snapshot
// Get updated ref and interact
4. Error Recovery
// If interaction fails:
1. Take screenshot of current state
2. Capture console messages (browser_console_messages)
3. Check network requests (browser_network_requests)
4. Take new snapshot to see current state
1. Batch vs Individual Entry
// Batch for simple forms (faster)
browser_fill_form with all fields
// Individual for complex forms (better control)
browser_type for each field
browser_wait_for after each entry
Verify validation triggers
2. Validation Testing
// Test each validation rule
1. Enter invalid data
2. Attempt submission
3. Snapshot to see errors
4. Screenshot error messages
5. Correct data
6. Verify error clears
3. Multi-Step Forms
// Document each step
1. Fill step 1
2. Screenshot before submit
3. Click next
4. Wait for step 2
5. Snapshot new state
6. Repeat for each step
1. Verify API Calls
// After user action
browser_network_requests
// Verify expected endpoints called
// Check status codes
// Validate request/response data
2. Performance Testing
// Capture network timing
browser_network_requests
// Analyze:
- Request count
- Total transfer size
- Response times
- Failed requests
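The four metrics listed above can be computed once the network data is available as structured records. This document does not specify the output format of browser_network_requests, so the record shape used here (`url`, `status`, `bytes`, `duration_ms`) is purely an assumption for illustration, as is the `summarize_requests` helper.

```python
def summarize_requests(requests: list) -> dict:
    """Summarize request count, transfer size, slowest response, and failures.

    Each record is assumed to be a dict with the hypothetical keys
    'url', 'status', 'bytes', and 'duration_ms'.
    """
    failed = [r for r in requests if r["status"] >= 400]
    durations = [r["duration_ms"] for r in requests]
    return {
        "count": len(requests),
        "total_bytes": sum(r["bytes"] for r in requests),
        "slowest_ms": max(durations, default=0),  # 0 when nothing was requested
        "failed_urls": [r["url"] for r in failed],
    }

sample = [
    {"url": "/api/items", "status": 200, "bytes": 2048, "duration_ms": 120},
    {"url": "/api/user", "status": 500, "bytes": 64, "duration_ms": 900},
]
print(summarize_requests(sample))
```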
3. Debug Failed Requests
browser_network_requests
// Find failed requests
// Check error messages
// Screenshot current state
// Console messages for errors
Create reusable test patterns:
Visual Regression Test Template:
1. Navigate: browser_navigate to {URL}
2. Wait: browser_wait_for for {condition}
3. Baseline: browser_take_screenshot "baseline-{name}.png", fullPage: true
4. [Make changes]
5. Capture: browser_take_screenshot "current-{name}.png", fullPage: true
6. Compare: [Manual or automated comparison]
7. Document: Screenshot any differences
Responsive Test Template:
For viewport in [mobile, tablet, desktop]:
1. Resize: browser_resize to {viewport dimensions}
2. Navigate: browser_navigate to {URL}
3. Wait: browser_wait_for for stability
4. Snapshot: browser_snapshot
5. Screenshot: browser_take_screenshot "{page}-{viewport}.png"
6. Validate: Check layout integrity
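The responsive template can be expanded into an explicit list of tool calls, one group per viewport. This is only a sketch: the `(tool, arguments)` tuple representation, the `responsive_plan` function, and the `ready_text` parameter are assumptions made for illustration, while the tool names, argument names, and viewport sizes are the ones used in this document.

```python
# Viewport sizes as listed in the responsive testing workflow.
VIEWPORTS = {
    "mobile": (375, 667),
    "tablet": (768, 1024),
    "desktop": (1920, 1080),
}

def responsive_plan(page: str, url: str, ready_text: str) -> list:
    """Return the ordered (tool, arguments) steps for every viewport."""
    steps = []
    for name, (width, height) in VIEWPORTS.items():
        steps += [
            ("browser_resize", {"width": width, "height": height}),
            ("browser_navigate", {"url": url}),
            ("browser_wait_for", {"text": ready_text}),
            ("browser_snapshot", {}),
            ("browser_take_screenshot", {"filename": f"{page}-{name}.png"}),
        ]
    return steps

for tool, arguments in responsive_plan("homepage", "https://example.com", "Welcome"):
    print(tool, arguments)
```

Waiting on visible text rather than a fixed delay follows the wait-strategy advice given earlier; the layout validation in step 6 still needs a human or a comparison tool.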
Form Test Template:
1. Navigate: browser_navigate to {form URL}
2. Snapshot: browser_snapshot for refs
3. Fill: browser_fill_form with test data
4. Screenshot: "form-filled.png"
5. Submit: browser_click submit button
6. Wait: browser_wait_for for result
7. Verify: Snapshot and screenshot result
8. Check: browser_network_requests for submission
Organize screenshots systematically:
Project Structure:
tests/
  visual/
    baselines/   # Reference screenshots
    results/     # Current test screenshots
    diffs/       # Difference images
    reports/     # HTML reports with comparisons
Naming Convention:
{test-name}_{viewport}_{state}_{date}.png
Examples:
login_desktop_default_20251017.png
cart_mobile_empty_20251017.png
checkout_tablet_error_20251017.png
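Because this convention is underscore-delimited with the date last, a filename can be split back into its parts when building reports. A minimal sketch, assuming the date is always `YYYYMMDD` and that only the test name may itself contain underscores; `parse_screenshot_name` is a hypothetical helper.

```python
from datetime import datetime

def parse_screenshot_name(filename: str) -> dict:
    """Split '{test-name}_{viewport}_{state}_{date}.png' into its fields."""
    stem, _, extension = filename.rpartition(".")
    if extension != "png":
        raise ValueError(f"expected a .png file, got {filename!r}")
    # Split from the right so underscores inside the test name survive.
    test_name, viewport, state, date = stem.rsplit("_", 3)
    return {
        "test": test_name,
        "viewport": viewport,
        "state": state,
        "date": datetime.strptime(date, "%Y%m%d").date(),
    }

print(parse_screenshot_name("login_desktop_default_20251017.png"))
```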
Metadata File:
screenshot-metadata.json:
{
  "screenshot": "login_desktop_default_20251017.png",
  "timestamp": "2025-10-17T10:30:00Z",
  "url": "https://example.com/login",
  "viewport": {"width": 1920, "height": 1080},
  "browser": "chromium",
  "test": "login_flow",
  "passed": true
}
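A metadata entry like the one above could be assembled and serialized as follows. This is a sketch that reuses the keys from the example; the `metadata_record` helper is hypothetical, and writing the result to screenshot-metadata.json is left to the caller.

```python
import json

def metadata_record(screenshot, timestamp, url, width, height,
                    browser, test, passed) -> dict:
    """Assemble one metadata entry with the same keys as the example file."""
    return {
        "screenshot": screenshot,
        "timestamp": timestamp,
        "url": url,
        "viewport": {"width": width, "height": height},
        "browser": browser,
        "test": test,
        "passed": passed,
    }

record = metadata_record(
    screenshot="login_desktop_default_20251017.png",
    timestamp="2025-10-17T10:30:00Z",
    url="https://example.com/login",
    width=1920,
    height=1080,
    browser="chromium",
    test="login_flow",
    passed=True,
)
serialized = json.dumps(record, indent=2)
print(serialized)
```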
Test across browsers efficiently:
Browser Matrix:
- Chromium (latest)
- Firefox (latest)
- WebKit (latest)
Parallel Execution:
1. Define test suite
2. Configure each browser
3. Run tests in parallel
4. Collect results
5. Compare across browsers
6. Generate cross-browser report
Result Organization:
screenshots/
  chromium/
    homepage.png
    login.png
  firefox/
    homepage.png
    login.png
  webkit/
    homepage.png
    login.png
  comparison/
    homepage-browsers.html
    login-browsers.html
Automate visual comparison workflow:
1. Capture Baselines (one-time):
- Navigate to each page
- Take reference screenshots
- Store in baselines/
2. Run Visual Tests:
- Navigate to each page
- Take current screenshots
- Store in results/
3. Compare Images:
- Pixel-by-pixel comparison
- Highlight differences
- Generate diff images
- Calculate similarity score
4. Generate Report:
- List all comparisons
- Show side-by-side views
- Highlight failures
- Include metrics
5. Review and Update:
- Review failures
- Accept intentional changes
- Update baselines
- Fix regressions
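Step 3 (pixel-by-pixel comparison with a similarity score) can be illustrated in plain Python. The sketch below works on images already decoded into rows of `(r, g, b)` tuples; decoding PNG files into that shape is out of scope here and would need an imaging library. The `compare_pixels` function and its `tolerance` parameter are assumptions for illustration, not a feature of the Playwright MCP server.

```python
def compare_pixels(baseline, current, tolerance=0):
    """Compare two same-sized images given as rows of (r, g, b) tuples.

    Returns (similarity, mask): similarity is the fraction of unchanged
    pixels, and mask marks each changed pixel with True.
    """
    same_shape = len(baseline) == len(current) and all(
        len(a) == len(b) for a, b in zip(baseline, current)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    total = 0
    changed = 0
    mask = []
    for row_a, row_b in zip(baseline, current):
        mask_row = []
        for px_a, px_b in zip(row_a, row_b):
            # A pixel counts as changed if any channel differs by more than tolerance.
            differs = any(abs(a - b) > tolerance for a, b in zip(px_a, px_b))
            mask_row.append(differs)
            changed += differs
            total += 1
        mask.append(mask_row)

    similarity = 1.0 if total == 0 else 1 - changed / total
    return similarity, mask

base = [[(0, 0, 0), (255, 255, 255)], [(10, 10, 10), (20, 20, 20)]]
curr = [[(0, 0, 0), (250, 255, 255)], [(10, 10, 10), (20, 20, 20)]]
score, diff_mask = compare_pixels(base, curr)
print(score, diff_mask)
```

A small nonzero `tolerance` can absorb anti-aliasing noise, at the cost of missing very subtle color regressions.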
Test design system components:
Component Test Suite:
For each component:
1. Navigate to component page
2. Snapshot for structure
3. Test each variant:
- Default
- Hover
- Active
- Disabled
- Error
4. Screenshot each state
5. Verify accessibility
6. Check responsive behavior
Documentation Generation:
1. Capture all component states
2. Organize by component
3. Generate visual catalog
4. Include code examples
5. Document usage guidelines
Example:
components/
  Button/
    button-default.png
    button-hover.png
    button-active.png
    button-disabled.png
    button-error.png
  Input/
    input-default.png
    input-focus.png
    input-error.png
    input-disabled.png
Screenshot appears blank
Element not found for interaction
Browser not launching
Screenshot differs from expected
Form submission fails
Network requests not captured
Dialog not handled
1. Gather Information
1. browser_snapshot - See page structure
2. browser_take_screenshot - See visual state
3. browser_console_messages onlyErrors: true - Check errors
4. browser_network_requests - See network activity
2. Isolate Issue
1. Simplify test to minimum reproduction
2. Test in single browser
3. Disable dynamic content
4. Remove variable elements
5. Test step-by-step
3. Document Problem
1. Screenshot before issue
2. Screenshot at failure point
3. Capture console messages
4. Save network requests
5. Note expected vs actual
6. Include reproduction steps
Test homepage hasn't visually changed:
1. Navigate
browser_navigate
url: "https://example.com"
2. Wait for page load
browser_wait_for
textGone: "Loading..."
3. Capture baseline
browser_take_screenshot
filename: "homepage-baseline.png"
fullPage: true
4. [After code changes, repeat]
5. Capture current
browser_take_screenshot
filename: "homepage-current.png"
fullPage: true
6. Compare images manually or with tools
7. Document differences
Test login form functionality:
1. Navigate to login
browser_navigate
url: "https://example.com/login"
2. Get form structure
browser_snapshot
3. Fill form
browser_fill_form
fields: [
  {
    name: "Email",
    type: "textbox",
    ref: "123",
    value: "test@example.com"
  },
  {
    name: "Password",
    type: "textbox",
    ref: "456",
    value: "password123"
  }
]
4. Screenshot filled form
browser_take_screenshot
filename: "login-filled.png"
5. Submit
browser_click
element: "Sign In button"
ref: "789"
6. Wait for redirect
browser_wait_for
text: "Welcome back"
7. Screenshot success
browser_take_screenshot
filename: "login-success.png"
8. Verify network request
browser_network_requests
Test responsive layout:
Mobile:
1. Resize to mobile
browser_resize
width: 375
height: 667
2. Navigate
browser_navigate
url: "https://example.com"
3. Wait
browser_wait_for
time: 2
4. Screenshot
browser_take_screenshot
filename: "homepage-mobile.png"
fullPage: true
Tablet:
5. Resize to tablet
browser_resize
width: 768
height: 1024
6. Navigate
browser_navigate
url: "https://example.com"
7. Screenshot
browser_take_screenshot
filename: "homepage-tablet.png"
fullPage: true
Desktop:
8. Resize to desktop
browser_resize
width: 1920
height: 1080
9. Navigate
browser_navigate
url: "https://example.com"
10. Screenshot
browser_take_screenshot
filename: "homepage-desktop.png"
fullPage: true
Test button states:
1. Navigate to component library
browser_navigate
url: "https://example.com/components/button"
2. Get page structure
browser_snapshot
3. Default state
browser_take_screenshot
filename: "button-default.png"
element: "Primary button"
ref: "123"
4. Hover state
browser_hover
element: "Primary button"
ref: "123"
browser_take_screenshot
filename: "button-hover.png"
element: "Primary button"
ref: "123"
5. Active state
browser_click
element: "Primary button"
ref: "123"
browser_take_screenshot
filename: "button-active.png"
element: "Primary button"
ref: "123"
6. Snapshot for verification
browser_snapshot
Test complete checkout process:
1. Navigate to product
browser_navigate
url: "https://example.com/products/item-123"
2. Add to cart
browser_snapshot
browser_click
element: "Add to Cart button"
ref: "456"
browser_wait_for
text: "Added to cart"
3. Go to cart
browser_click
element: "Cart icon"
ref: "789"
browser_take_screenshot
filename: "cart-with-item.png"
4. Proceed to checkout
browser_click
element: "Checkout button"
ref: "101"
5. Fill shipping info
browser_snapshot
browser_fill_form
fields: [
{name: "Name", type: "textbox", ref: "111", value: "John Doe"},
{name: "Address", type: "textbox", ref: "222", value: "123 Main St"},
{name: "City", type: "textbox", ref: "333", value: "New York"},
{name: "Zip", type: "textbox", ref: "444", value: "10001"}
]
6. Screenshot checkout
browser_take_screenshot
filename: "checkout-filled.png"
fullPage: true
7. Complete order
browser_click
element: "Place Order button"
ref: "555"
browser_wait_for
text: "Order confirmed"
8. Screenshot confirmation
browser_take_screenshot
filename: "order-confirmed.png"
fullPage: true
9. Verify network requests
browser_network_requests
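A malformed `fields` list is an easy way to break the shipping step, so it can help to check the payload before calling browser_fill_form. The following is a hedged sketch: `validate_fields` is a hypothetical helper, and the set of accepted field types is an assumption that should be checked against the tool schema of the server in use.

```python
REQUIRED_KEYS = {"name", "type", "ref", "value"}

# Assumed set of field types; confirm against your server's tool schema.
KNOWN_TYPES = {"textbox", "checkbox", "radio", "combobox", "slider"}


def validate_fields(fields):
    """Return a list of problems; an empty list means the payload looks sane."""
    problems = []
    seen_refs = set()
    for index, field in enumerate(fields):
        missing = REQUIRED_KEYS - set(field)
        if missing:
            problems.append(f"field {index}: missing {sorted(missing)}")
            continue
        if field["type"] not in KNOWN_TYPES:
            problems.append(f"field {index}: unknown type {field['type']!r}")
        if field["ref"] in seen_refs:
            problems.append(f"field {index}: duplicate ref {field['ref']!r}")
        seen_refs.add(field["ref"])
    return problems
```

A duplicate `ref` usually means a stale snapshot was reused, which is why the check is included alongside the key and type checks.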
Test keyboard navigation and structure:
1. Navigate to page
browser_navigate
url: "https://example.com/form"
2. Capture semantic structure
browser_snapshot
3. Verify heading hierarchy
- Check h1 → h2 → h3 order
- Ensure single h1
- Verify logical structure
4. Test keyboard navigation
browser_press_key
key: "Tab"
browser_snapshot
browser_take_screenshot
filename: "focus-field-1.png"
5. Continue tabbing
browser_press_key
key: "Tab"
browser_snapshot
browser_take_screenshot
filename: "focus-field-2.png"
6. Verify all interactive elements reachable
- Buttons
- Links
- Form fields
- Custom widgets
7. Check ARIA labels
- Form labels present
- Button labels descriptive
- Error messages announced
- Status updates live
8. Screenshot the page for the accessibility report (the tree itself comes from browser_snapshot)
browser_take_screenshot
filename: "accessibility-structure.png"
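The heading checks in step 3 (single h1, no skipped levels) can be automated against the snapshot text. This sketch assumes headings appear on lines shaped like `heading "Title" [level=N]`; the `[level=N]` attribute is an assumption about the snapshot format, and `heading_problems` is a hypothetical helper.

```python
import re

# Assumed snapshot line shape: heading "Title" [level=N] ...
# Headings without a [level=N] attribute are ignored by this pattern.
HEADING = re.compile(r'heading\s+"(?P<title>[^"]*)".*?\[level=(?P<level>\d)\]')


def heading_levels(snapshot_text):
    """Return heading levels in document order."""
    return [int(m["level"]) for m in HEADING.finditer(snapshot_text)]


def heading_problems(snapshot_text):
    """Report a missing or repeated h1 and any skipped heading level."""
    levels = heading_levels(snapshot_text)
    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"heading level jumps from h{previous} to h{current}")
    return problems
```

Moving back up the hierarchy (for example h3 followed by h2) is allowed; only downward jumps that skip a level are flagged.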
Debug failed API calls:
1. Navigate to page
browser_navigate
url: "https://example.com/dashboard"
2. Wait for page
browser_wait_for
time: 3
3. Check console errors
browser_console_messages
onlyErrors: true
4. Check network requests
browser_network_requests
5. Find failed requests
- Status: 4xx or 5xx
- Timeout errors
- CORS issues
6. Screenshot error state
browser_take_screenshot
filename: "api-error-state.png"
7. Retry action
browser_click
element: "Refresh button"
ref: "123"
8. Monitor new requests
browser_network_requests
9. Document findings
- Failed endpoint
- Error message
- Request/response data
- Screenshot
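For step 5, failed requests can be grouped before writing up the findings. The sketch below assumes the requests have already been turned into dicts with a `url` key and an optional integer `status`; that record shape and the `classify_failures` name are illustrative assumptions, and a missing status stands in for timeouts or requests blocked by CORS.

```python
def classify_failures(requests):
    """Group failed requests by kind; successful requests are left out."""
    report = {"client_error": [], "server_error": [], "no_response": []}
    for request in requests:
        status = request.get("status")
        if status is None:
            # No response at all: timeout, blocked by CORS, connection refused.
            report["no_response"].append(request["url"])
        elif 400 <= status < 500:
            report["client_error"].append(request["url"])
        elif 500 <= status < 600:
            report["server_error"].append(request["url"])
    return report
```

Running the same function on the requests captured after the retry in step 8 shows whether the failure is persistent or transient.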
Test confirmation dialogs:
1. Navigate to page
browser_navigate
url: "https://example.com/settings"
2. Trigger delete action
browser_snapshot
browser_click
element: "Delete Account button"
ref: "123"
3. Handle confirmation
browser_handle_dialog
accept: false # Cancel first time
4. Verify still on page
browser_snapshot
5. Try again
browser_click
element: "Delete Account button"
ref: "123"
6. Accept this time
browser_handle_dialog
accept: true
7. Wait for result
browser_wait_for
text: "Account deleted"
8. Screenshot confirmation
browser_take_screenshot
filename: "account-deleted.png"
Test multi-tab workflow:
1. List current tabs
browser_tabs
action: "list"
2. Open link in new tab
browser_click
element: "Privacy Policy link"
ref: "123"
modifiers: ["ControlOrMeta"]
3. Switch to new tab
browser_tabs
action: "select"
index: 1
4. Screenshot new tab
browser_take_screenshot
filename: "privacy-policy.png"
5. Switch back
browser_tabs
action: "select"
index: 0
6. Close extra tab
browser_tabs
action: "close"
index: 1
7. Verify single tab
browser_tabs
action: "list"
Test loading animations:
1. Navigate to page
browser_navigate
url: "https://example.com/data-heavy"
2. Screenshot loading state
browser_take_screenshot
filename: "loading-spinner.png"
3. Wait for loading to complete
browser_wait_for
textGone: "Loading..."
4. Wait for animations
browser_wait_for
time: 1
5. Screenshot final state
browser_take_screenshot
filename: "content-loaded.png"
fullPage: true
6. Verify stability
browser_wait_for
time: 2
browser_take_screenshot
filename: "stable-state.png"
fullPage: true
7. Compare screenshots
- loading-spinner.png
- content-loaded.png
- stable-state.png
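The comparison in step 7 is not performed by any tool shown here, so it has to happen outside the browser session. As a crude first check, the sketch below tests whether two screenshot files are byte-for-byte identical using only the standard library. This is a stand-in and not a pixel diff: any encoder difference or one-pixel change makes the files differ, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def screenshots_identical(path_a, path_b):
    """True only if both files contain exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)
```

In the flow above, `content-loaded.png` and `stable-state.png` are the pair expected to match once animations have settled, while `loading-spinner.png` is expected to differ from both.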
Navigate:
browser_navigate url: "{URL}"
Snapshot:
browser_snapshot
Screenshot:
browser_take_screenshot filename: "{name}.png"
Full Page Screenshot:
browser_take_screenshot filename: "{name}.png", fullPage: true
Element Screenshot:
browser_take_screenshot filename: "{name}.png", element: "{description}", ref: "{ref}"
Click:
browser_click element: "{description}", ref: "{ref}"
Type:
browser_type element: "{description}", ref: "{ref}", text: "{text}"
Fill Form:
browser_fill_form fields: [{name, type, ref, value}, ...]
Wait:
browser_wait_for text: "{text}"
browser_wait_for textGone: "{text}"
browser_wait_for time: {seconds}
Resize:
browser_resize width: {width}, height: {height}
Console:
browser_console_messages onlyErrors: true
Network:
browser_network_requests
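When logging or documenting a run, it can be handy to print each call in the same one-line form as the reference above. This is a small illustrative sketch; `format_call` is a hypothetical helper, and it relies on `json.dumps` to quote strings and to render booleans in lowercase.

```python
import json


def format_call(tool, **arguments):
    """Render a tool call as: tool key: value, key: value."""
    if not arguments:
        return tool
    rendered = ", ".join(
        f"{key}: {json.dumps(value)}" for key, value in arguments.items()
    )
    return f"{tool} {rendered}"


if __name__ == "__main__":
    print(format_call("browser_resize", width=375, height=667))
    print(format_call("browser_take_screenshot", filename="home.png", fullPage=True))
```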
Mobile:
375 x 667 (iPhone SE)
390 x 844 (iPhone 12/13/14)
414 x 896 (iPhone 11 Pro Max)
360 x 640 (Android Small)
412 x 915 (Android Large)
Tablet:
768 x 1024 (iPad Portrait)
1024 x 768 (iPad Landscape)
810 x 1080 (Android Tablet)
Desktop:
1280 x 720 (HD)
1366 x 768 (Laptop)
1920 x 1080 (Full HD)
2560 x 1440 (QHD / 1440p)
3840 x 2160 (4K)
tests/
├── visual/
│ ├── baselines/
│ ├── results/
│ └── diffs/
├── e2e/
│ ├── auth/
│ ├── checkout/
│ └── navigation/
├── responsive/
│ ├── mobile/
│ ├── tablet/
│ └── desktop/
└── components/
├── buttons/
├── forms/
└── navigation/
reports/
├── visual-regression.html
├── cross-browser.html
└── accessibility.html
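With the layout above, each visual test has a baseline, a result, and a diff image under the same name. The sketch below derives the three paths from a test name; the `visual_paths` helper and the choice of `.png` for all three files are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Root of the visual test tree shown above.
VISUAL_ROOT = PurePosixPath("tests/visual")


def visual_paths(name):
    """Return the baseline, result, and diff paths for one screenshot name."""
    filename = f"{name}.png"
    return {
        "baseline": VISUAL_ROOT / "baselines" / filename,
        "result": VISUAL_ROOT / "results" / filename,
        "diff": VISUAL_ROOT / "diffs" / filename,
    }
```

Using the same file name in all three directories keeps the mapping between a fresh screenshot and its baseline trivial to compute.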
Skill Version: 1.0.0
Last Updated: October 2025
Skill Category: Browser Automation, Visual Testing, Quality Assurance
Compatible With: Playwright MCP Server, Chromium, Firefox, WebKit
First Seen: Jan 22, 2026