launch by microsoft/vscode
npx skills add https://github.com/microsoft/vscode --skill launch
Automate VS Code (Code OSS) using agent-browser. VS Code is built on Electron/Chromium and exposes a Chrome DevTools Protocol (CDP) port that agent-browser can connect to, enabling the same snapshot-interact workflow used for web pages.
- agent-browser must be installed. It's listed in devDependencies, so run npm install in the repo root. Use npx agent-browser if it's not on your PATH, or install it globally with npm install -g agent-browser.
- ./scripts/code.sh runs the build automatically if needed. Set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built.
- Selectors like .interactive-input-part, .interactive-input-editor, and .part.auxiliarybar used in eval commands are VS Code internals that may change across versions. If they stop working, use agent-browser snapshot -i to re-discover the current DOM structure.
📸 Take screenshots for a paper trail. Use agent-browser screenshot <path> at key moments: after launch, before and after interactions, and when something goes wrong. Screenshots provide visual proof of what the UI looked like and are invaluable for debugging failures or documenting what was accomplished.
Save screenshots inside a timestamped subfolder so each run is isolated and nothing gets overwritten:
# Create a timestamped folder for this run's screenshots
SCREENSHOT_DIR="/tmp/code-oss-screenshots/$(date +%Y-%m-%dT%H-%M-%S)"
mkdir -p "$SCREENSHOT_DIR"
# Save a screenshot (path is a positional argument; use ./ or absolute paths)
# Bare filenames without ./ may be misinterpreted as CSS selectors
agent-browser screenshot "$SCREENSHOT_DIR/after-launch.png"
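To keep one run's screenshots in order, the timestamped-folder idea can be wrapped in a small helper. This is a minimal sketch, not part of the skill: `shot`, `SHOT_INDEX`, and the `AGENT_BROWSER` override variable are hypothetical names introduced here. Only `agent-browser screenshot <path>` comes from the text above.

```shell
# Hypothetical helper: save numbered screenshots into one folder per run.
# AGENT_BROWSER is an assumed override, e.g. AGENT_BROWSER="npx agent-browser".
SCREENSHOT_DIR="${SCREENSHOT_DIR:-/tmp/code-oss-screenshots/$(date +%Y-%m-%dT%H-%M-%S)}"
SHOT_INDEX=0

shot() {
  # $1 is a short label such as "after-launch"
  SHOT_INDEX=$((SHOT_INDEX + 1))
  mkdir -p "$SCREENSHOT_DIR" || return 1
  # The absolute path avoids the bare-filename/CSS-selector ambiguity.
  ${AGENT_BROWSER:-agent-browser} screenshot \
    "$(printf '%s/%02d-%s.png' "$SCREENSHOT_DIR" "$SHOT_INDEX" "$1")"
}
```

Calling `shot after-launch` and later `shot after-typing` would then produce files that sort in the order the steps happened.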
# Launch Code OSS with remote debugging
./scripts/code.sh --remote-debugging-port=9224
# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab
# Discover UI elements
agent-browser snapshot -i
# Focus the chat input (macOS)
agent-browser press Control+Meta+i
# Connect to a specific port
agent-browser connect 9222
# Or use --cdp on each command
agent-browser --cdp 9222 snapshot -i
# Auto-discover a running Chromium-based app
agent-browser --auto-connect snapshot -i
After connect, all subsequent commands target the connected app without needing --cdp.
Electron apps often have multiple windows or webviews. Use tab commands to list and switch between them:
# List all available targets (windows, webviews, etc.)
agent-browser tab
# Switch to a specific tab by index
agent-browser tab 2
# Switch by URL pattern
agent-browser tab --url "*settings*"
The VS Code repository includes scripts/code.sh, which launches Code OSS from source. It passes all arguments through to the Electron binary, so --remote-debugging-port works directly:
cd <repo-root> # the root of your VS Code checkout
./scripts/code.sh --remote-debugging-port=9224
Wait for the window to fully initialize, then connect:
# Wait for Code OSS to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab
agent-browser snapshot -i
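The one-line retry loop above keeps going silently even when all five attempts fail. A minimal sketch of a reusable variant that reports failure is shown here. `connect_with_retry` and the `AGENT_BROWSER` override variable are hypothetical names, and the defaults of 5 attempts and a 3-second delay simply mirror the loop above.

```shell
# Hypothetical helper: retry `agent-browser connect PORT` and fail loudly.
# Usage: connect_with_retry PORT [ATTEMPTS] [DELAY_SECONDS]
connect_with_retry() {
  local port="$1" attempts="${2:-5}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if ${AGENT_BROWSER:-agent-browser} connect "$port" 2>/dev/null; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "could not connect to CDP port $port after $attempts attempts" >&2
  return 1
}
```

A script could then write `connect_with_retry 9224 || exit 1` before running `agent-browser tab`, so later commands never run against a missing connection.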
Tips:
- Set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built: VSCODE_SKIP_PRELAUNCH=1 ./scripts/code.sh --remote-debugging-port=9224 (from the repo root).
- You usually don't need --user-data-dir, since there's usually only one Code OSS instance running.
- Add --user-data-dir=/tmp/code-oss-debug to force a new instance.
The Sessions app is a separate workbench mode launched with the --sessions flag. It uses a dedicated user data directory to avoid conflicts with the main Code OSS instance.
cd <repo-root> # the root of your VS Code checkout
./scripts/code.sh --sessions --remote-debugging-port=9224
Wait for the window to fully initialize, then connect:
# Wait for Sessions app to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
# Verify you're connected to the right target (not about:blank)
agent-browser tab
agent-browser snapshot -i
Tips:
- The --sessions flag launches the Agent Sessions workbench instead of the standard VS Code workbench.
- Set VSCODE_SKIP_PRELAUNCH=1 to skip the compile step if you've already built.
To debug a VS Code extension via agent-browser, launch VS Code Insiders with --extensionDevelopmentPath and --remote-debugging-port. Use --user-data-dir to avoid conflicting with an already-running instance.
# Build the extension first
cd <extension-repo-root> # e.g., the root of your extension checkout
npm run compile
# Launch VS Code Insiders with the extension and CDP
code-insiders \
--extensionDevelopmentPath="<extension-repo-root>" \
--remote-debugging-port=9223 \
--user-data-dir=/tmp/vscode-ext-debug
# Wait for VS Code to start, retry until connected
for i in 1 2 3 4 5; do agent-browser connect 9223 2>/dev/null && break || sleep 3; done
# Verify you're connected to the right target (not about:blank)
# If `tab` shows the wrong target, run `agent-browser close` and reconnect
agent-browser tab
agent-browser snapshot -i
Key flags:
- --extensionDevelopmentPath=<path>: loads your extension from source (must be compiled first)
- --remote-debugging-port=9223: enables CDP (use 9223 to avoid conflicts with other apps on 9222)
- --user-data-dir=<path>: uses a separate profile so it starts a new process instead of sending to an existing VS Code instance
Without --user-data-dir, VS Code detects the running instance, forwards the args to it, and exits immediately. You'll see "Sent env to running instance. Terminating..." and CDP never starts.
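Because a separate profile directory is what forces a new process, one option is to create a throwaway directory for every debug run. The sketch below only prints the resulting command line and does not launch anything. `ext_debug_command` is a hypothetical helper name, and using `mktemp -d` for the profile is an assumption of this example rather than something the skill prescribes.

```shell
# Hypothetical helper: print (not run) a code-insiders command line that uses
# a fresh profile directory created with mktemp.
# Usage: ext_debug_command EXTENSION_PATH [PORT]
ext_debug_command() {
  local ext_path="$1" port="${2:-9223}" profile
  profile=$(mktemp -d "${TMPDIR:-/tmp}/vscode-ext-debug.XXXXXX") || return 1
  printf 'code-insiders --extensionDevelopmentPath="%s" --remote-debugging-port=%s --user-data-dir="%s"\n' \
    "$ext_path" "$port" "$profile"
}
```

Printing the command first makes it easy to review the port and profile path before running it by hand.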
After making changes to Code OSS source code, you must restart to pick up the new build. The workbench loads the compiled JavaScript at startup — changes are not hot-reloaded.
# 1. Ensure your build is up to date.
# Normally you can skip a manual step here and let ./scripts/code.sh in step 3
# trigger the build when needed (or run `npm run watch` in another terminal).
# 2. Kill the Code OSS instance listening on the debug port (if running)
pids=$(lsof -t -i :9224)
if [ -n "$pids" ]; then
kill $pids
fi
# 3. Relaunch
./scripts/code.sh --remote-debugging-port=9224
# 4. Reconnect agent-browser
for i in 1 2 3 4 5; do agent-browser connect 9224 2>/dev/null && break || sleep 3; done
agent-browser tab
agent-browser snapshot -i
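One caveat with `lsof -t -i :9224` is that it matches every process with a socket on that port, which can include clients that are merely connected to it. A minimal sketch of a narrower variant follows. It assumes the `-sTCP:LISTEN` filter of lsof is available on your system, and `stop_debug_listener` and the `LSOF` override variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper: terminate only the process LISTENING on a debug port.
# LSOF is an assumed override for the lsof binary.
stop_debug_listener() {
  local port="$1" pids
  # -sTCP:LISTEN restricts the match to listening sockets, so processes that
  # are only connected to the port are left alone.
  pids=$(${LSOF:-lsof} -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null) || pids=""
  if [ -z "$pids" ]; then
    echo "nothing listening on port $port"
    return 0
  fi
  # Intentionally unquoted: lsof may print several PIDs, one per line.
  kill $pids
}
```

In the restart sequence above, `stop_debug_listener 9224` would take the place of step 2.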
Tip: If you're iterating frequently, run npm run watch in a separate terminal so compilation happens automatically. You still need to kill and relaunch Code OSS to load the new build.
VS Code uses Monaco Editor for all text inputs including the Copilot Chat input. Monaco editors require specific agent-browser techniques — standard click, fill, and keyboard type commands may not work depending on the VS Code build.
Focusing the chat input with the keyboard shortcut and then typing with individual press commands works on all VS Code builds (Code OSS, Insiders, stable):
# 1. Open and focus the chat input with the keyboard shortcut
# macOS:
agent-browser press Control+Meta+i
# Linux / Windows:
agent-browser press Control+Alt+i
# 2. Type using individual press commands
agent-browser press H
agent-browser press e
agent-browser press l
agent-browser press l
agent-browser press o
agent-browser press Space # Use "Space" for spaces
agent-browser press w
agent-browser press o
agent-browser press r
agent-browser press l
agent-browser press d
# Verify text appeared (optional)
agent-browser eval '
(() => {
const sidebar = document.querySelector(".part.auxiliarybar");
const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'
# 3. Send the message (same on all platforms)
agent-browser press Enter
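Writing one `press` line per character gets tedious for longer messages. The per-key pattern above can be sketched as a loop. `type_with_press` and the `AGENT_BROWSER` override variable are hypothetical names. Only the `Space` key name comes from the text above, and every other character is passed to `press` unchanged, which is an assumption that may not hold for all punctuation.

```shell
# Hypothetical helper: send a string one key at a time via `press`.
# The chat input must already be focused.
type_with_press() {
  local rest="$1" ch key
  while [ -n "$rest" ]; do
    ch=${rest%"${rest#?}"}   # first character of the remaining text
    rest=${rest#?}           # drop that character
    if [ "$ch" = " " ]; then key=Space; else key=$ch; fi
    ${AGENT_BROWSER:-agent-browser} press "$key" || return 1
  done
}
```

After focusing the input, `type_with_press "Hello world"` would issue the same eleven `press` commands listed above.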
Chat focus shortcut by platform:
- macOS: Ctrl+Cmd+I → agent-browser press Control+Meta+i
- Linux: Ctrl+Alt+I → agent-browser press Control+Alt+i
- Windows: Ctrl+Alt+I → agent-browser press Control+Alt+i
This shortcut focuses the chat input and sets document.activeElement to a DIV with class native-edit-context, VS Code's native text editing surface, which correctly processes key events from agent-browser press.
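For scripts that run on more than one platform, the key combination can be chosen from the output of `uname -s`. This is a small sketch. `chat_focus_keys` is a hypothetical helper name, and treating every non-Darwin system as Linux or Windows is a simplifying assumption.

```shell
# Hypothetical helper: map a `uname -s` value to the chat focus key combo.
# With no argument it inspects the current system.
chat_focus_keys() {
  case "${1:-$(uname -s)}" in
    Darwin) echo "Control+Meta+i" ;;   # macOS: Ctrl+Cmd+I
    *)      echo "Control+Alt+i" ;;    # Linux / Windows: Ctrl+Alt+I
  esac
}
```

The result can be passed straight to the CLI, for example `agent-browser press "$(chat_focus_keys)"`.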
type @ref: works on some builds
On VS Code Insiders (extension debug mode), type @ref handles focus and input in one step:
agent-browser snapshot -i
# Look for: textbox "The editor is not accessible..." [ref=e62]
agent-browser type @e62 "Hello from George!"
Tip: If type @ref silently drops text (the editor stays empty), the ref may be stale or the editor not yet ready. Re-snapshot to get a fresh ref and try again. You can verify text was entered using the snippet in "Verifying Text and Clearing" below.
However, type @ref silently fails on Code OSS — the command completes without error but no text appears. This also applies to keyboard type and keyboard inserttext. Always verify text appeared after typing, and fall back to the keyboard shortcut + press pattern if it didn't. The press-per-key approach works universally across all builds.
⚠️ Warning: keyboard type can hang indefinitely in some focus states (e.g., after JS mouse events). If it doesn't return within a few seconds, interrupt it and fall back to press for individual keystrokes.
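The verify-then-fall-back advice can be sketched as a single function. Everything here is illustrative: `ab`, `read_chat_input`, `input_contains`, `type_verified`, and the `AGENT_BROWSER` override are hypothetical names. The sketch assumes the input starts empty, that `eval` prints the returned string, and that Monaco may render spaces as non-breaking spaces, so those are normalized before comparing.

```shell
# Hypothetical helpers. AGENT_BROWSER is an assumed override for the CLI name.
ab() { ${AGENT_BROWSER:-agent-browser} "$@"; }

# Read the chat input using the same internal selectors as the eval snippets.
read_chat_input() {
  ab eval '(() => {
    const lines = document.querySelectorAll(".part.auxiliarybar .interactive-input-editor .view-line");
    return Array.from(lines).map(l => l.textContent.replace(/\u00a0/g, " ")).join("|");
  })()'
}

# Succeeds when the chat input currently contains the expected text.
input_contains() {
  case "$(read_chat_input)" in
    *"$1"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try `type @ref`; if the text did not land, focus the input with the
# platform shortcut and retype it one key at a time.
type_verified() {
  local ref="$1" text="$2" rest ch focus
  ab type "$ref" "$text" || true
  if input_contains "$text"; then return 0; fi
  focus=Control+Alt+i
  if [ "$(uname -s)" = Darwin ]; then focus=Control+Meta+i; fi
  ab press "$focus" || return 1
  rest=$text
  while [ -n "$rest" ]; do
    ch=${rest%"${rest#?}"}
    rest=${rest#?}
    if [ "$ch" = " " ]; then ch=Space; fi
    ab press "$ch" || return 1
  done
  input_contains "$text"
}
```

A call such as `type_verified @e62 "Hello from George!"` returns non-zero if the text still cannot be confirmed, which is a reasonable point to take a screenshot and stop.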
| Method | VS Code Insiders | Code OSS |
|---|---|---|
| press per key (after focus shortcut) | ✅ Works | ✅ Works |
| type @ref | ✅ Works | ❌ Silent fail |
| keyboard type (after focus) | ✅ Works | ❌ Silent fail |
| keyboard inserttext (after focus) | ✅ Works | ❌ Silent fail |
| click @ref | ❌ Blocked by overlay | ❌ Blocked by overlay |
| fill @ref | ❌ Element not visible | ❌ Element not visible |
If the keyboard shortcut doesn't work (e.g., chat panel isn't configured), you can focus the editor via JavaScript:
agent-browser eval '
(() => {
const inputPart = document.querySelector(".interactive-input-part");
const editor = inputPart.querySelector(".monaco-editor");
const rect = editor.getBoundingClientRect();
const x = rect.x + rect.width / 2;
const y = rect.y + rect.height / 2;
editor.dispatchEvent(new MouseEvent("mousedown", { bubbles: true, clientX: x, clientY: y }));
editor.dispatchEvent(new MouseEvent("mouseup", { bubbles: true, clientX: x, clientY: y }));
editor.dispatchEvent(new MouseEvent("click", { bubbles: true, clientX: x, clientY: y }));
return "activeElement: " + document.activeElement?.className;
})()'
# Then use press for each character
agent-browser press H
agent-browser press e
# ...
Verifying Text and Clearing
# Verify text in the chat input
agent-browser eval '
(() => {
const sidebar = document.querySelector(".part.auxiliarybar");
const viewLines = sidebar.querySelectorAll(".interactive-input-editor .view-line");
return Array.from(viewLines).map(vl => vl.textContent).join("|");
})()'
# Clear the input (Select All + Backspace)
# macOS:
agent-browser press Meta+a
# Linux / Windows:
agent-browser press Control+a
# Then delete:
agent-browser press Backspace
On ultrawide monitors, the chat sidebar may be in the far-right corner of the CDP screenshot. Options:
- agent-browser screenshot --full to capture the entire window
- agent-browser screenshot ".part.auxiliarybar" ./sidebar.png to capture only the sidebar element
- agent-browser screenshot --annotate to see labeled element positions
macOS: If agent-browser screenshot returns "Permission denied", your terminal needs Screen Recording permission. Grant it in System Settings → Privacy & Security → Screen Recording. As a fallback, use the eval verification snippet to confirm text was entered; this doesn't require screen permissions.
- Make sure the app was launched with --remote-debugging-port=NNNN.
- Check that the port is listening: lsof -i :9224 (macOS / Linux) or netstat -ano | findstr 9224 (Windows).
- Use agent-browser tab to list targets and switch to the right one.
- Use agent-browser snapshot -i -C to include cursor-interactive elements (divs with onclick handlers).
- Use agent-browser press for individual keystrokes after focusing the input. Focus the chat input with the keyboard shortcut (macOS: Ctrl+Cmd+I, Linux/Windows: Ctrl+Alt+I).
- type @ref, keyboard type, and keyboard inserttext work on VS Code Insiders but silently fail on Code OSS: they complete without error but no text appears. The press-per-key approach works universally.
Always kill the Code OSS instance when you're done. Code OSS is a full Electron app that consumes significant memory (often 1–4 GB+). Leaving it running wastes resources and holds the CDP port.
# Disconnect agent-browser
agent-browser close
# Kill the Code OSS instance listening on the debug port (if running)
# macOS / Linux:
pids=$(lsof -t -i :9224)
if [ -n "$pids" ]; then
kill $pids
fi
# Windows:
# taskkill /F /PID <PID>
# Or use Task Manager to end "Code - OSS"
Verify it's gone:
# Confirm no process is listening on the debug port
lsof -i :9224 # should return nothing
Weekly Installs: 64
Repository: microsoft/vscode
GitHub Stars: 183.0K
First Seen: Mar 5, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: github-copilot (64), amp (64), cline (64), codex (64), kimi-cli (64), gemini-cli (64)