Automating VS Code with agent-browser: UI Testing and Interaction via the Chrome DevTools Protocol

launch by microsoft/vscode

64 weekly installs

183,000 GitHub Stars

Install command

npx skills add https://github.com/microsoft/vscode --skill launch

Automated VS Code testing

VS Code Automation

Automate VS Code (Code OSS) using agent-browser. VS Code is built on Electron/Chromium and exposes a Chrome DevTools Protocol (CDP) port that agent-browser can connect to, enabling the same snapshot-interact workflow used for web pages.

Prerequisites

agent-browser must be installed. It's listed in devDependencies — run npm install in the repo root. Use npx agent-browser if it's not on your PATH, or install globally with npm install -g agent-browser.
For Code OSS (VS Code dev build): The repo must be built before launching. ./scripts/code.sh runs the build automatically if needed, or set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built.
CSS selectors are internal implementation details. Selectors like .interactive-input-part, .interactive-input-editor, and .part.auxiliarybar used in eval commands are VS Code internals that may change across versions. If they stop working, use agent-browser snapshot -i to re-discover the current DOM structure.

Core Workflow

1. Launch Code OSS with remote debugging enabled
2. Connect agent-browser to the CDP port
3. Snapshot to discover interactive elements
4. Interact using element refs
5. Re-snapshot after navigation or state changes

📸 Take screenshots for a paper trail. Use agent-browser screenshot <path> at key moments — after launch, before/after interactions, and when something goes wrong. Screenshots provide visual proof of what the UI looked like and are invaluable for debugging failures or documenting what was accomplished.

Save screenshots inside a timestamped subfolder so each run is isolated and nothing gets overwritten:

# Create a timestamped folder for this run's screenshots
SCREENSHOT_DIR="/tmp/code-oss-screenshots/$(date +%Y-%m-%dT%H-%M-%S)"
mkdir -p "$SCREENSHOT_DIR"

# Save a screenshot (path is a positional argument — use ./ or absolute paths)
# Bare filenames without ./ may be misinterpreted as CSS selectors
agent-browser screenshot "$SCREENSHOT_DIR/after-launch.png"

# Launch Code OSS with remote debugging
./scripts/code.sh --remote-debugging-port=9224

# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done

# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab

# Discover UI elements
agent-browser snapshot -i

# Focus the chat input (macOS)
agent-browser press Control+Meta+i

Connecting

# Connect to a specific port
agent-browser connect 9222

# Or use --cdp on each command
agent-browser --cdp 9222 snapshot -i

# Auto-discover a running Chromium-based app
agent-browser --auto-connect snapshot -i

After connect, all subsequent commands target the connected app without needing --cdp.

Tab Management

Electron apps often have multiple windows or webviews. Use tab commands to list and switch between them:

# List all available targets (windows, webviews, etc.)
agent-browser tab

# Switch to a specific tab by index
agent-browser tab 2

# Switch by URL pattern
agent-browser tab --url "*settings*"

Launching Code OSS (VS Code Dev Build)

The VS Code repository includes scripts/code.sh, which launches Code OSS from source. It passes all arguments through to the Electron binary, so --remote-debugging-port works directly:

cd <repo-root>  # the root of your VS Code checkout
./scripts/code.sh --remote-debugging-port=9224

Wait for the window to fully initialize, then connect:

# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done

# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab
agent-browser snapshot -i

Tips:

Set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built: VSCODE_SKIP_PRELAUNCH=1 ./scripts/code.sh --remote-debugging-port=9224 (from the repo root)
Code OSS uses the default user data directory. Unlike VS Code Insiders, you don't typically need --user-data-dir since there's usually only one Code OSS instance running.
If you see "Sent env to running instance. Terminating..." it means Code OSS is already running and forwarded your args to the existing instance. Quit Code OSS and relaunch with the flag, or use --user-data-dir=/tmp/code-oss-debug to force a new instance.
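
The one-line retry loop above always finishes with exit status 0, even when all five attempts fail, so a calling script cannot tell whether the connection succeeded. The following is a minimal sketch of a wrapper that reports failure. The function name connect_with_retry and the AGENT_BROWSER override variable are assumptions made for illustration and are not part of agent-browser; only `agent-browser connect <port>` comes from this guide.

```shell
# connect_with_retry PORT [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: retries `agent-browser connect PORT` and returns
# non-zero if no attempt succeeds.
# AGENT_BROWSER is an assumed override (e.g. AGENT_BROWSER="npx agent-browser"
# would NOT work here because it is invoked as a single command name).
connect_with_retry() {
  port=$1
  attempts=${2:-5}
  delay=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "${AGENT_BROWSER:-agent-browser}" connect "$port" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "could not connect to CDP port $port after $attempts attempts" >&2
  return 1
}

# Example usage:
#   connect_with_retry 9224 || exit 1
#   agent-browser tab
```

Returning a non-zero status lets later steps (tab, snapshot -i) be skipped when Code OSS never came up, instead of failing with less obvious errors.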

Launching the Sessions App (Agent Sessions Window)

The Sessions app is a separate workbench mode launched with the --sessions flag. It uses a dedicated user data directory to avoid conflicts with the main Code OSS instance.

cd <repo-root>  # the root of your VS Code checkout
./scripts/code.sh --sessions --remote-debugging-port=9224

Wait for the window to fully initialize, then connect:

# Wait for Sessions app to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done

# Verify you're connected to the right target (not about:blank)
agent-browser tab
agent-browser snapshot -i

Tips:

The --sessions flag launches the Agent Sessions workbench instead of the standard VS Code workbench.
Set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built.

Launching VS Code Extensions for Debugging

To debug a VS Code extension via agent-browser, launch VS Code Insiders with --extensionDevelopmentPath and --remote-debugging-port. Use --user-data-dir to avoid conflicting with an already-running instance.

# Build the extension first
cd <extension-repo-root>  # e.g., the root of your extension checkout
npm run compile

# Launch VS Code Insiders with the extension and CDP
code-insiders \
  --extensionDevelopmentPath="<extension-repo-root>" \
  --remote-debugging-port=9223 \
  --user-data-dir=/tmp/vscode-ext-debug

# Wait for VS Code to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9223 2>/dev/null && break || sleep 3; done

# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab
agent-browser snapshot -i

Key flags:

--extensionDevelopmentPath=<path> — loads your extension from source (must be compiled first)
--remote-debugging-port=9223 — enables CDP (use 9223 to avoid conflicts with other apps on 9222)
--user-data-dir=<path> — uses a separate profile so it starts a new process instead of sending to an existing VS Code instance

Without --user-data-dir, VS Code detects the running instance, forwards the args to it, and exits immediately. You'll see "Sent env to running instance. Terminating..." and CDP never starts.
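
A script can check for this situation by capturing the launcher's output and searching it for the forwarding message. This is a sketch under assumptions: the function name started_new_instance is hypothetical, and because this guide does not say whether the message is written to stdout or stderr, the usage example redirects both into one log file.

```shell
# started_new_instance LOG_FILE
# Hypothetical helper: returns 0 if the launch log does NOT contain the
# "Sent env to running instance" message quoted in this guide, 1 if it does.
started_new_instance() {
  log_file=$1
  if grep -q "Sent env to running instance" "$log_file"; then
    echo "launch was forwarded to an existing instance; CDP will not start" >&2
    return 1
  fi
  return 0
}

# Example usage (commented out so the block can be sourced safely):
#   log=$(mktemp)
#   code-insiders --remote-debugging-port=9223 \
#     --user-data-dir=/tmp/vscode-ext-debug >"$log" 2>&1 &
#   sleep 5   # assumed grace period; tune for your machine
#   started_new_instance "$log" || exit 1
```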

Restarting After Code Changes

After making changes to Code OSS source code, you must restart to pick up the new build. The workbench loads the compiled JavaScript at startup — changes are not hot-reloaded.

Restart Workflow

1. Rebuild the changed code
2. Kill the running Code OSS instance
3. Relaunch with the same flags

# 1. Ensure your build is up to date.
#    Normally you can skip a manual step here and let ./scripts/code.sh in step 3
#    trigger the build when needed (or run `npm run watch` in another terminal).

# 2. Kill the Code OSS instance listening on the debug port (if running)
pids=$(lsof -t -i :9224)
if [ -n "$pids" ]; then
	kill $pids
fi

# 3. Relaunch
./scripts/code.sh --remote-debugging-port=9224

# 4. Reconnect agent-browser
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
agent-browser tab
agent-browser snapshot -i

Tip: If you're iterating frequently, run npm run watch in a separate terminal so compilation happens automatically. You still need to kill and relaunch Code OSS to load the new build.

Interacting with Monaco Editor (Chat Input, Code Editors)

VS Code uses Monaco Editor for all text inputs including the Copilot Chat input. Monaco editors require specific agent-browser techniques — standard click, fill, and keyboard type commands may not work depending on the VS Code build.

The Universal Pattern: Focus via Keyboard Shortcut + `press`

This works on all VS Code builds (Code OSS, Insiders, stable):

# 1. Open and focus the chat input with the keyboard shortcut
# macOS:
agent-browser press Control+Meta+i
# Linux / Windows:
agent-browser press Control+Alt+i

# 2. Type using individual press commands
agent-browser press H
agent-browser press e
agent-browser press l
agent-browser press l
agent-browser press o
agent-browser press Space  # Use "Space" for spaces
agent-browser press w
agent-browser press o
agent-browser press r
agent-browser press l
agent-browser press d

# Verify text appeared (optional)
agent-browser eval '
(() => {
  const sidebar = document.querySelector(".part.auxiliarybar");
  const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
  return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'

# 3. Send the message (same on all platforms)
agent-browser press Enter

Chat focus shortcut by platform:

macOS: Ctrl+Cmd+I → agent-browser press Control+Meta+i
Linux: Ctrl+Alt+I → agent-browser press Control+Alt+i
Windows: Ctrl+Alt+I → agent-browser press Control+Alt+i

This shortcut focuses the chat input and sets document.activeElement to a DIV with class native-edit-context — VS Code's native text editing surface that correctly processes key events from agent-browser press.
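
Typing a longer message as individual press commands is repetitive, so the pattern above can be wrapped in two small functions. This is a sketch, not part of agent-browser: the names chat_focus_chord and press_text and the AGENT_BROWSER override variable are assumptions. Only letters and the "Space" key name appear in this guide, so the helper maps spaces to Space and passes every other character through unchanged; how `press` treats punctuation or non-ASCII characters is not covered here.

```shell
# Print the chat-focus chord for an OS name (defaults to `uname -s`).
# Darwin (macOS) uses Control+Meta+i; everything else uses Control+Alt+i,
# matching the per-platform list above.
chat_focus_chord() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) echo "Control+Meta+i" ;;
    *)      echo "Control+Alt+i" ;;
  esac
}

# Send TEXT one character at a time via `agent-browser press`.
press_text() {
  rest=$1
  while [ -n "$rest" ]; do
    tail=${rest#?}          # everything after the first character
    char=${rest%"$tail"}    # the first character itself
    rest=$tail
    case $char in
      " ") key=Space ;;     # the guide uses "Space" for spaces
      *)   key=$char ;;
    esac
    "${AGENT_BROWSER:-agent-browser}" press "$key" || return 1
  done
}

# Example usage:
#   agent-browser press "$(chat_focus_chord)"
#   press_text "Hello world"
#   agent-browser press Enter
```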

`type @ref` — Works on Some Builds

On VS Code Insiders (extension debug mode), type @ref handles focus and input in one step:

agent-browser snapshot -i
# Look for: textbox "The editor is not accessible..." [ref=e62]
agent-browser type @e62 "Hello from George!"

Tip: If type @ref silently drops text (the editor stays empty), the ref may be stale or the editor not yet ready. Re-snapshot to get a fresh ref and try again. You can verify text was entered using the snippet in "Verifying Text and Clearing" below.

However, type @ref silently fails on Code OSS — the command completes without error but no text appears. This also applies to keyboard type and keyboard inserttext. Always verify text appeared after typing, and fall back to the keyboard shortcut + press pattern if it didn't. The press-per-key approach works universally across all builds.

⚠️ Warning: keyboard type can hang indefinitely in some focus states (e.g., after JS mouse events). If it doesn't return within a few seconds, interrupt it and fall back to press for individual keystrokes.

Compatibility Matrix

Method	VS Code Insiders	Code OSS
`press` per key (after focus shortcut)	✅ Works	✅ Works
`type @ref`	✅ Works	❌ Silent fail
`keyboard type` (after focus)	✅ Works	❌ Silent fail
`keyboard inserttext` (after focus)	✅ Works	❌ Silent fail
`click @ref`	❌ Blocked by overlay	❌ Blocked by overlay
`fill @ref`	❌ Element not visible	❌ Element not visible

Fallback: Focus via JavaScript Mouse Events

If the keyboard shortcut doesn't work (e.g., chat panel isn't configured), you can focus the editor via JavaScript:

agent-browser eval '
(() => {
  const inputPart = document.querySelector(".interactive-input-part");
  const editor = inputPart.querySelector(".monaco-editor");
  const rect = editor.getBoundingClientRect();
  const x = rect.x + rect.width / 2;
  const y = rect.y + rect.height / 2;
  editor.dispatchEvent(new MouseEvent("mousedown", { bubbles: true, clientX: x, clientY: y }));
  editor.dispatchEvent(new MouseEvent("mouseup", { bubbles: true, clientX: x, clientY: y }));
  editor.dispatchEvent(new MouseEvent("click", { bubbles: true, clientX: x, clientY: y }));
  return "activeElement: " + document.activeElement?.className;
})()'

# Then use press for each character
agent-browser press H
agent-browser press e
# ...

Verifying Text and Clearing

# Verify text in the chat input
agent-browser eval '
(() => {
  const sidebar = document.querySelector(".part.auxiliarybar");
  const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
  return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'

# Clear the input (Select All + Backspace)
# macOS:
agent-browser press Meta+a
# Linux / Windows:
agent-browser press Control+a
# Then delete:
agent-browser press Backspace
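
Because some typing methods fail silently, it can help to turn the verification snippet into a check that a script can branch on. The sketch below assumes a hypothetical function name, chat_input_contains, and an assumed AGENT_BROWSER override variable. It uses a substring match because the exact output format of `agent-browser eval` (for example, whether strings are quoted) is not specified in this guide. It also replaces non-breaking spaces with regular spaces, on the assumption that Monaco may render spaces that way in its view lines.

```shell
# chat_input_contains EXPECTED
# Returns 0 if the chat input text contains EXPECTED, 1 if it does not,
# and 2 if the eval command itself fails.
chat_input_contains() {
  expected=$1
  actual=$("${AGENT_BROWSER:-agent-browser}" eval '
(() => {
  const sidebar = document.querySelector(".part.auxiliarybar");
  if (!sidebar) return "";
  const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
  return Array.from(viewLines)
    .map(vl => vl.textContent.replace(/\u00a0/g, " "))
    .join("|");
})()') || return 2
  case $actual in
    *"$expected"*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Example usage:
#   agent-browser type @e62 "Hello"
#   chat_input_contains "Hello" || echo "text missing; fall back to press" >&2
```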

Screenshot Tips for VS Code

On ultrawide monitors, the chat sidebar may be in the far-right corner of the CDP screenshot. Options:

Use agent-browser screenshot --full to capture the entire window
Use element screenshots: agent-browser screenshot ".part.auxiliarybar" sidebar.png
Use agent-browser screenshot --annotate to see labeled element positions
Maximize the sidebar first: click the "Maximize Secondary Side Bar" button

macOS: If agent-browser screenshot returns "Permission denied", your terminal needs Screen Recording permission. Grant it in System Settings → Privacy & Security → Screen Recording. As a fallback, use the eval verification snippet to confirm text was entered — this doesn't require screen permissions.
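
The timestamped-folder convention from the Core Workflow section can be combined with a running counter so that screenshots sort in the order they were taken. This is a sketch only: take_shot, SHOT_COUNT, and the AGENT_BROWSER override are hypothetical names. The path passed to `agent-browser screenshot` is absolute, following the guide's note that bare filenames may be misread as CSS selectors.

```shell
# One folder per run, named after the start time (same layout as above).
SCREENSHOT_DIR="${SCREENSHOT_DIR:-/tmp/code-oss-screenshots/$(date +%Y-%m-%dT%H-%M-%S)}"
SHOT_COUNT=0

# take_shot LABEL
# Saves a screenshot as NN-LABEL.png inside SCREENSHOT_DIR and prints the path.
take_shot() {
  label=$1
  mkdir -p "$SCREENSHOT_DIR" || return 1
  SHOT_COUNT=$((SHOT_COUNT + 1))
  path=$(printf '%s/%02d-%s.png' "$SCREENSHOT_DIR" "$SHOT_COUNT" "$label")
  "${AGENT_BROWSER:-agent-browser}" screenshot "$path" && echo "$path"
}

# Example usage:
#   take_shot after-launch
#   take_shot before-send
```

Call take_shot directly rather than inside $(...) if the counter should keep increasing, since a command substitution runs in a subshell and its increment is lost.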

Troubleshooting

"Connection refused" or "Cannot connect"

Make sure Code OSS was launched with --remote-debugging-port=NNNN
If Code OSS was already running, quit and relaunch with the flag
Check that the port isn't in use by another process:
- macOS / Linux: lsof -i :9224
- Windows: netstat -ano | findstr 9224
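
Another way to tell "nothing is listening" apart from "something else holds the port" is to query the port over HTTP. Chromium-based applications started with --remote-debugging-port generally serve a small JSON document at /json/version; the sketch below assumes Code OSS's Electron build does the same, which this guide does not confirm. The function name cdp_alive and the CURL override variable are hypothetical.

```shell
# cdp_alive PORT
# Returns 0 if an HTTP request to the CDP version endpoint succeeds.
# Assumes curl is installed; CURL is an assumed override for testing.
cdp_alive() {
  port=$1
  "${CURL:-curl}" --silent --fail --max-time 2 \
    "http://127.0.0.1:$port/json/version" >/dev/null
}

# Example usage:
#   if cdp_alive 9224; then
#     agent-browser connect 9224
#   else
#     echo "no CDP endpoint on 9224; was --remote-debugging-port passed?" >&2
#   fi
```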

Elements not appearing in snapshot

VS Code uses multiple webviews. Use agent-browser tab to list targets and switch to the right one
Use agent-browser snapshot -i -C to include cursor-interactive elements (divs with onclick handlers)

Cannot type in Monaco Editor inputs

Use agent-browser press for individual keystrokes after focusing the input. Focus the chat input with the keyboard shortcut (macOS: Ctrl+Cmd+I, Linux/Windows: Ctrl+Alt+I).
type @ref, keyboard type, and keyboard inserttext work on VS Code Insiders but silently fail on Code OSS — they complete without error but no text appears. The press-per-key approach works universally.
See the "Interacting with Monaco Editor" section above for the full compatibility matrix.

Cleanup

Always kill the Code OSS instance when you're done. Code OSS is a full Electron app that consumes significant memory (often 1–4 GB+). Leaving it running wastes resources and holds the CDP port.

# Disconnect agent-browser
agent-browser close

# Kill the Code OSS instance listening on the debug port (if running)
# macOS / Linux:
pids=$(lsof -t -i :9224)
if [ -n "$pids" ]; then
	kill $pids
fi

# Windows:
# taskkill /F /PID <PID>
# Or use Task Manager to end "Code - OSS"

Verify it's gone:

# Confirm no process is listening on the debug port
lsof -i :9224  # should return nothing
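
The kill and verify steps above can be combined so that a script waits until the port is actually released before relaunching. This is a minimal sketch for macOS and Linux, assuming lsof is available; the function name free_port and the LSOF override variable are hypothetical. Like the commands above, it targets every process that lsof reports for the port.

```shell
# free_port PORT [TIMEOUT_SECONDS]
# Sends the default kill signal to processes using PORT, then polls once per
# second until lsof reports nothing. Returns 1 if the port is still in use
# after the timeout.
free_port() {
  port=$1
  timeout=${2:-10}
  pids=$("${LSOF:-lsof}" -t -i ":$port" 2>/dev/null || true)
  if [ -z "$pids" ]; then
    return 0   # nothing to do
  fi
  # $pids is intentionally unquoted: lsof may print several PIDs.
  kill $pids 2>/dev/null || true
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if [ -z "$("${LSOF:-lsof}" -t -i ":$port" 2>/dev/null || true)" ]; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "port $port is still in use after ${timeout}s" >&2
  return 1
}

# Example usage:
#   agent-browser close
#   free_port 9224 || exit 1
```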

Weekly Installs: 64
Repository: microsoft/vscode
GitHub Stars: 183.0K
First Seen: Mar 5, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: github-copilot (64), amp (64), cline (64), codex (64), kimi-cli (64), gemini-cli (64)
