Browserbase CLI Browser Automation Guide: Local and Remote Mode Setup, Command Reference, and Workflows

browser by browserbase/skills

1,500 weekly installs

461 GitHub Stars

Install command

npx skills add https://github.com/browserbase/skills --skill browser

Browser Automation

Automate browser interactions using the browse CLI with Claude.

Setup check

Before running any browser commands, verify the CLI is available:

which browse || npm install -g @browserbasehq/browse-cli

Environment Selection (Local vs Remote)

The CLI automatically selects between local and remote browser environments based on available configuration:

Local mode (default)

Uses local Chrome — no API keys needed
Best for: development, simple pages, trusted sites with no bot protection

Remote mode (Browserbase)

Activated when BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID are set
Provides: anti-bot stealth, automatic CAPTCHA solving, residential proxies, session persistence
Use remote mode when: the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
Get credentials at https://browserbase.com/settings
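The activation rule above can be checked before starting a session. The following is a minimal sketch, not part of the CLI: expected_browse_env is a hypothetical helper name, and it only mirrors the documented condition that both variables must be set. The CLI makes its own decision, which browse env reports.

```shell
# Hypothetical helper (not provided by the browse CLI).
# Prints "remote" when both Browserbase variables are non-empty, else "local".
expected_browse_env() {
  if [ -n "${BROWSERBASE_API_KEY:-}" ] && [ -n "${BROWSERBASE_PROJECT_ID:-}" ]; then
    echo remote
  else
    echo local
  fi
}
```

Comparing its output with browse env is a quick way to notice a missing or misspelled variable.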

When to choose which

Simple browsing (docs, wikis, public APIs): local mode is fine
Protected sites (login walls, CAPTCHAs, anti-scraping): use remote mode
If local mode fails with bot detection or access denied: switch to remote mode

Commands

All commands work identically in both modes. The daemon auto-starts on first command.

Navigation

browse open <url>                        # Go to URL (alias: goto)
browse reload                            # Reload current page
browse back                              # Go back in history
browse forward                           # Go forward in history

Page state (prefer snapshot over screenshot)

browse snapshot                          # Get accessibility tree with element refs (fast, structured)
browse screenshot [path]                 # Take visual screenshot (slow, uses vision tokens)
browse get url                           # Get current URL
browse get title                         # Get page title
browse get text <selector>               # Get text content (use "body" for all text)
browse get html <selector>               # Get HTML content of element
browse get value <selector>              # Get form field value

Use browse snapshot as your default for understanding page state — it returns the accessibility tree with element refs you can use to interact. Only use browse screenshot when you need visual context (layout, images, debugging).

Interaction

browse click <ref>                       # Click element by ref from snapshot (e.g., @0-5)
browse type <text>                       # Type text into focused element
browse fill <selector> <value>           # Fill input and press Enter
browse select <selector> <values...>     # Select dropdown option(s)
browse press <key>                       # Press key (Enter, Tab, Escape, Cmd+A, etc.)
browse drag <fromX> <fromY> <toX> <toY>  # Drag from one point to another
browse scroll <x> <y> <deltaX> <deltaY> # Scroll at coordinates
browse highlight <selector>              # Highlight element on page
browse is visible <selector>             # Check if element is visible
browse is checked <selector>             # Check if element is checked
browse wait <type> [arg]                 # Wait for: load, selector, timeout

Session management

browse stop                              # Stop the browser daemon
browse status                            # Check daemon status (includes env)
browse env                               # Show current environment (local or remote)
browse env local                         # Switch to local Chrome
browse env remote                        # Switch to Browserbase (requires API keys)
browse pages                             # List all open tabs
browse tab_switch <index>                # Switch to tab by index
browse tab_close [index]                 # Close tab

Typical workflow

1. browse open <url>: navigate to the page
2. browse snapshot: read the accessibility tree to understand page structure and get element refs
3. browse click <ref> / browse type <text> / browse fill <selector> <value>: interact using refs from the snapshot
4. browse snapshot: confirm the action worked
5. Repeat steps 3-4 as needed
6. browse stop: close the browser when done

Quick Example

browse open https://example.com
browse snapshot                          # see page structure + element refs
browse click @0-5                        # click element with ref 0-5
browse get title
browse stop
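When the workflow runs inside a script, it helps to make sure browse stop runs even if an earlier step fails. This is a rough sketch under two assumptions that the text here does not confirm: that browse commands return a non-zero exit code on failure, and that calling browse stop after a failed open is harmless. The function name browse_snapshot_once is hypothetical.

```shell
# Hypothetical wrapper: open a URL, print one snapshot, and always stop the daemon.
# Assumes browse commands exit non-zero on failure (not confirmed by this document).
browse_snapshot_once() {
  # If navigation fails, clean up and report the failure.
  browse open "$1" || { browse stop; return 1; }
  browse snapshot
  rc=$?            # remember whether the snapshot succeeded
  browse stop      # clean up regardless of the snapshot result
  return "$rc"
}
```

Usage would look like browse_snapshot_once https://example.com, with the snapshot text going to standard output.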

Mode Comparison

Feature	Local	Browserbase
Speed	Faster	Slightly slower
Setup	Chrome required	API key required
Stealth mode	No	Yes (custom Chromium, anti-bot fingerprinting)
CAPTCHA solving	No	Yes (automatic reCAPTCHA/hCaptcha)
Residential proxies	No	Yes (201 countries, geo-targeting)
Session persistence	No	Yes (cookies/auth persist across sessions)
Best for	Development/simple pages	Protected sites, bot detection, production scraping

Best Practices

Always run browse open first before interacting
Use browse snapshot to check page state: it is fast and gives you element refs
Only take a screenshot when visual context is needed (layout checks, images, debugging)
Use refs from the snapshot to click or interact, e.g., browse click @0-5
Run browse stop when done to clean up the browser session

Troubleshooting

"No active page": Run browse stop, then check browse status. If it still says running, kill the zombie daemon with pkill -f "browse.*daemon", then retry browse open
Chrome not found: Install Chrome or use browse env remote
Action fails: Run browse snapshot to see available elements and their refs
Browserbase fails: Verify that the API key and project ID are set

Switching to Remote Mode

Switch to remote when you detect: CAPTCHAs (reCAPTCHA, hCaptcha, Turnstile), bot detection pages ("Checking your browser..."), HTTP 403/429, empty pages on sites that should have content, or the user asks for it.

Don't switch for simple sites (docs, wikis, public APIs, localhost).

browse env remote            # switch to Browserbase
browse env local             # switch back to local Chrome
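The switching signals listed above can be folded into one check. The sketch below is illustrative only: should_switch_to_remote is a hypothetical helper, the caller is assumed to supply the HTTP status code (this document does not list a browse command that reports it), and the page text is assumed to come from something like browse get text body. Call it only for sites that should have content, since it treats empty text as a signal.

```shell
# Hypothetical helper: exit status 0 means "switch to remote", 1 means "stay local".
# $1 = HTTP status code (obtained by the caller), $2 = visible page text
should_switch_to_remote() {
  status=$1
  text=$2
  # HTTP 403/429 usually indicate blocking or rate limiting.
  case "$status" in
    403|429) return 0 ;;
  esac
  # Empty page on a site that should have content.
  if [ -z "$text" ]; then
    return 0
  fi
  # Case-insensitive match on CAPTCHA vendors and the bot-check banner.
  if printf '%s' "$text" | grep -qiE 'recaptcha|hcaptcha|turnstile|checking your browser'; then
    return 0
  fi
  return 1
}
```

A caller might then run should_switch_to_remote "$code" "$body" && browse env remote, where code and body are the caller's own variables.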

The switch is sticky until you run browse stop or switch again. If API keys aren't set:

openclaw browserbase setup   # interactive — prompts for API key + project ID

For detailed examples, see EXAMPLES.md. For API reference, see REFERENCE.md.

Weekly Installs

139

Repository

browserbase/skills

GitHub Stars

452

First Seen

Feb 4, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Warn

Installed on

codex: 127
opencode: 122
gemini-cli: 121
github-copilot: 120
kimi-cli: 118
amp: 118
