npx skills add https://github.com/thisnick/agent-rdp --skill agent-rdp
agent-rdp connect --host <ip> -u <user> -p <pass> --enable-win-automation
agent-rdp automate snapshot -i # See interactive elements
agent-rdp automate click "@e5" # Click button by ref
agent-rdp automate fill "@e7" "Hello" # Type into field
agent-rdp disconnect
1. agent-rdp connect --host <ip> -u <user> -p <pass> --enable-win-automation
2. agent-rdp automate snapshot -i (get accessibility tree with refs)
3. agent-rdp automate click @e5 or agent-rdp automate fill @e7 "text"

When elements are not in the snapshot with -i
Try without the -i flag. Some elements aren't marked as interactive but are still actionable:
agent-rdp automate snapshot # Full tree, no filtering
agent-rdp automate snapshot -d 5 # Limit depth if too large
Some UI elements (WebView content, certain dialogs, toast notifications) don't appear in the accessibility tree. Use OCR as a last resort:
1. agent-rdp screenshot -o screen.png
2. agent-rdp locate "Button Text"
3. agent-rdp mouse click <x> <y>

agent-rdp connect --host 192.168.1.100 -u Admin -p secret
agent-rdp connect --host 192.168.1.100 -u Admin --password-stdin # Read password from stdin
agent-rdp connect --host 192.168.1.100 --width 1920 --height 1080
agent-rdp connect --host 192.168.1.100 --drive /tmp/share:Share # Map local directory
agent-rdp disconnect
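The --password-stdin form above keeps the password out of the argument list and shell history. A minimal sketch of a wrapper follows. RDP_PASSWORD and AGENT_RDP_BIN are hypothetical names introduced only for this example, and whether agent-rdp strips a trailing newline from stdin is not stated here, so the sketch sends none.

```shell
#!/bin/sh
# Sketch only: RDP_PASSWORD and AGENT_RDP_BIN are illustrative names,
# not variables that agent-rdp itself is documented to read.
AGENT_RDP_BIN="${AGENT_RDP_BIN:-agent-rdp}"

rdp_connect_stdin() {
    # $1 = host, $2 = user; the password is taken from $RDP_PASSWORD
    if [ -z "${RDP_PASSWORD:-}" ]; then
        echo "RDP_PASSWORD is not set" >&2
        return 1
    fi
    # printf '%s' writes the password exactly, with no trailing newline.
    printf '%s' "$RDP_PASSWORD" |
        "$AGENT_RDP_BIN" connect --host "$1" -u "$2" --password-stdin
}

# Usage (not run here):
#   RDP_PASSWORD=... rdp_connect_stdin 192.168.1.100 Admin
```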
agent-rdp screenshot # Save to ./screenshot.png
agent-rdp screenshot -o desktop.png # Save to specific file
agent-rdp screenshot --format jpeg # JPEG format
agent-rdp mouse click 500 300 # Left click at (500, 300)
agent-rdp mouse right-click 500 300 # Right click
agent-rdp mouse double-click 500 300 # Double click
agent-rdp mouse move 100 200 # Move cursor
agent-rdp mouse drag 100 100 500 500 # Drag from (100,100) to (500,500)
agent-rdp keyboard type "Hello World" # Type text (supports Unicode)
agent-rdp keyboard press "ctrl+c" # Key combination
agent-rdp keyboard press "alt+tab" # Switch windows
agent-rdp keyboard press "ctrl+shift+esc" # Task manager
agent-rdp keyboard press "win+r" # Run dialog
agent-rdp keyboard press enter # Single key (use press, not key)
agent-rdp keyboard press escape
agent-rdp keyboard press f5
agent-rdp scroll up --amount 3 # Scroll up 3 notches
agent-rdp scroll down --amount 5 # Scroll down 5 notches
agent-rdp scroll left
agent-rdp scroll right
agent-rdp clipboard set "Text to paste" # Set clipboard (paste on Windows)
agent-rdp clipboard get # Get clipboard (after copy on Windows)
# Map at connect time
agent-rdp connect --host <ip> -u <user> -p <pass> --drive /local/path:DriveName
# List mapped drives
agent-rdp drive list
agent-rdp session list # List active sessions
agent-rdp session info # Current session info
agent-rdp --session work connect ... # Named session
agent-rdp --session work screenshot # Use named session
agent-rdp wait 2000 # Wait 2 seconds
agent-rdp locate "Cancel" # Find lines containing "Cancel"
agent-rdp locate "Save*" --pattern # Glob pattern matching
agent-rdp locate --all # Get all text on screen
agent-rdp locate "OK" --json # JSON output with coordinates
Returns text lines with bounding boxes and center coordinates for clicking:
Found 1 line(s) containing 'Cancel':
'Cancel' at (650, 420) size 45x14 - center: (672, 427)
To click the first match: agent-rdp mouse click 672 427
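Copying the center coordinates by hand can be scripted. The sketch below assumes the human-readable output keeps the "center: (X, Y)" suffix shown above; locate_center is a hypothetical helper, not part of agent-rdp. The --json output may be a sturdier source, but its schema is not documented here.

```shell
#!/bin/sh
# Read locate output on stdin and print "X Y" for the first line
# that ends with a "center: (X, Y)" suffix. Prints nothing if no line matches.
locate_center() {
    sed -n 's/.*center: (\([0-9][0-9]*\), *\([0-9][0-9]*\)).*/\1 \2/p' | head -n 1
}

# Usage (hypothetical wiring; not run here):
#   set -- $(agent-rdp locate "Cancel" | locate_center)
#   [ "$#" -eq 2 ] && agent-rdp mouse click "$1" "$2"
```

Because OCR coordinates can be imprecise, it is probably worth taking a screenshot after the click to confirm the result.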
# Connect with automation enabled
agent-rdp connect --host 192.168.1.100 -u Admin -p secret --enable-win-automation
# Snapshot - get accessibility tree (refs always included)
agent-rdp automate snapshot # Full desktop tree
agent-rdp automate snapshot -i # Interactive elements only
agent-rdp automate snapshot -c # Compact (remove empty elements)
agent-rdp automate snapshot -d 5 # Limit depth to 5 levels
agent-rdp automate snapshot -s "~*Notepad*" # Scope to a window/element
agent-rdp automate snapshot -f # Start from focused element
agent-rdp automate snapshot -i -c -d 3 # Combine options
# Pattern-based element operations (use selectors: @eN, #automationId, .className, or name)
agent-rdp automate click "#SaveButton" # Click button
agent-rdp automate click "@e5" # Click by ref number
agent-rdp automate click "@e5" -d # Double-click (for file list items)
agent-rdp automate select "@e10" # Select item (SelectionItemPattern)
agent-rdp automate select "@e5" --item "Option 1" # Select item by name in container
agent-rdp automate toggle "@e7" # Toggle checkbox (TogglePattern)
agent-rdp automate toggle "@e7" --state on # Set specific state
agent-rdp automate expand "@e3" # Expand menu/tree (ExpandCollapsePattern)
agent-rdp automate collapse "@e3" # Collapse menu/tree
agent-rdp automate context-menu "@e5" # Open context menu (Shift+F10)
agent-rdp automate focus <selector> # Focus element
agent-rdp automate get <selector> # Get element properties
# Text input
agent-rdp automate fill <selector> "text" # Clear and fill text (ValuePattern)
agent-rdp automate clear <selector> # Just clear
# Scrolling
agent-rdp automate scroll <selector> --direction down --amount 3
# Window operations
agent-rdp automate window list
agent-rdp automate window focus "~*Notepad*"
agent-rdp automate window maximize
agent-rdp automate window minimize
agent-rdp automate window restore
agent-rdp automate window close "~*Notepad*"
# Run commands/apps (best way to open apps)
agent-rdp automate run "notepad.exe" # Open Notepad
agent-rdp automate run "Start-Process ms-settings:" --wait # Open Settings
agent-rdp automate run "calc.exe" # Open Calculator
agent-rdp automate run "Get-Process" --wait --process-timeout 5000 # With 5s timeout
# Wait for element
agent-rdp automate wait-for <selector> --timeout 5000
agent-rdp automate wait-for <selector> --state visible
# Status
agent-rdp automate status
Selector syntax:
@e5 or @5 - Reference number from snapshot (e prefix recommended)
#SaveButton - Automation ID
.Edit - Win32 class name
~*pattern* - Name with wildcard
File - Element name (exact match)
Snapshot output format:
- Window "Notepad" [ref=e1, id=Notepad]
  - MenuBar "Application" [ref=e2]
    - MenuItem "File" [ref=e3]
  - Edit "Text Editor" [ref=e5, value="Hello"]
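Rather than hard-coding a ref such as @e5, a script can look it up in the snapshot text. This is a minimal sketch that assumes the line format shown above (role, quoted name, then a bracket containing ref=eN). snapshot_ref is a hypothetical helper, not an agent-rdp command.

```shell
#!/bin/sh
# Read snapshot text on stdin and print the selector (for example @e5)
# of the first element with the given role and exact name.
snapshot_ref() {
    # $1 = role (e.g. Edit), $2 = exact element name
    awk -v role="$1" -v name="$2" '
        {
            prefix = "- " role " \"" name "\" ["
            pos = index($0, prefix)          # literal match, no regex surprises
            if (pos == 0) next
            rest = substr($0, pos + length(prefix))
            if (match(rest, /ref=e[0-9]+/)) {
                # skip the 4 characters of "ref=" and keep "eN"
                print "@" substr(rest, RSTART + 4, RLENGTH - 4)
                exit
            }
        }'
}

# Usage (not run here):
#   ref=$(agent-rdp automate snapshot -i | snapshot_ref Edit "Text Editor")
#   [ -n "$ref" ] && agent-rdp automate fill "$ref" "Hello"
```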
Add --json for machine-readable output:
agent-rdp --json clipboard get
agent-rdp --json session info
agent-rdp --json automate snapshot
agent-rdp connect --host 192.168.1.100 -u Admin -p secret
agent-rdp wait 3000 # Wait for desktop
agent-rdp keyboard press "win+r" # Open Run dialog
agent-rdp wait 1000
agent-rdp keyboard type "powershell"
agent-rdp keyboard press enter
agent-rdp wait 2000 # Wait for PowerShell
agent-rdp keyboard type "Get-Process"
agent-rdp keyboard press enter
agent-rdp screenshot --output result.png
agent-rdp disconnect
# Connect with local directory mapped
agent-rdp connect --host 192.168.1.100 -u Admin -p secret --drive /tmp/transfer:Transfer
# On Windows, access files at \\tsclient\Transfer
agent-rdp keyboard press "win+r"
agent-rdp wait 500
agent-rdp keyboard type "\\\\tsclient\\Transfer"
agent-rdp keyboard press enter
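The four backslashes in the keyboard type command above are a shell quoting detail: inside double quotes each \\ collapses to a single backslash, so the remote side receives \\tsclient\Transfer. The short sketch below shows that the single-quoted form is equivalent; the variable names are illustrative only.

```shell
#!/bin/sh
# Inside double quotes the shell turns each \\ into one backslash.
# Inside single quotes nothing is interpreted, so no doubling is needed.
unc_double="\\\\tsclient\\Transfer"
unc_single='\\tsclient\Transfer'

# printf '%s' does not interpret backslashes in its arguments,
# so this shows the exact string a command would receive.
printf '%s\n' "$unc_double"   # prints \\tsclient\Transfer
```

Single quotes are usually the less error-prone choice for Windows paths passed from a POSIX shell.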
# Connect with automation enabled
agent-rdp connect --host 192.168.1.100 -u Admin -p secret --enable-win-automation
# Open Notepad
agent-rdp automate run "notepad.exe"
agent-rdp wait 2000
# Get accessibility snapshot (refs are always included)
agent-rdp automate snapshot -i # Interactive elements only
# Type text into the edit control (use ref from snapshot)
agent-rdp automate fill "@e5" "Hello from automation!"
# Use File menu to save - expand menu, then invoke menu item
agent-rdp automate expand "File" # Expand menu (ExpandCollapsePattern)
agent-rdp wait 500
agent-rdp automate click "Save As..." # Click menu item
# Wait for Save dialog
agent-rdp automate wait-for "#FileNameControlHost" --timeout 5000
# Fill filename and save
agent-rdp automate fill "#FileNameControlHost" "test.txt"
agent-rdp automate click "#1" # Click Save button
export AGENT_RDP_HOST=192.168.1.100
export AGENT_RDP_PORT=3389
export AGENT_RDP_USERNAME=Administrator
export AGENT_RDP_PASSWORD=secret
export AGENT_RDP_SESSION=default
agent-rdp connect # Uses env vars for connection
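Before relying on environment variables it can help to check that they are set. The sketch below treats AGENT_RDP_HOST, AGENT_RDP_USERNAME and AGENT_RDP_PASSWORD as required; that is an assumption, since the list above does not say which variables are mandatory. rdp_missing_env is a hypothetical helper.

```shell
#!/bin/sh
# Return 1 and list the names on stderr if any variable that this
# sketch assumes to be required is unset or empty.
rdp_missing_env() {
    missing=""
    for name in AGENT_RDP_HOST AGENT_RDP_USERNAME AGENT_RDP_PASSWORD; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing" >&2
        return 1
    fi
}

# Usage (not run here):
#   rdp_missing_env && agent-rdp connect
```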
# Enable streaming viewer on port 9224
agent-rdp --stream-port 9224 connect --host 192.168.1.100 -u Admin -p secret
# Open web viewer in browser
agent-rdp view --port 9224
# Or manually access WebSocket at ws://localhost:9224 (broadcasts JPEG frames)
Prefer automate fill over keyboard type when automation is enabled: it's lossless (no dropped characters) and faster.
Use automate run to launch apps directly:
agent-rdp automate run "notepad.exe"
agent-rdp automate run "calc.exe"
agent-rdp automate run "Start-Process ms-settings:" --wait # Settings
agent-rdp automate run "explorer.exe C:\\" # File Explorer
IMPORTANT: Read these limitations carefully before attempting automation tasks.
The Start menu cannot be accessed via automate snapshot. Use Win+R (Run dialog) or automate run to launch programs directly instead of navigating through the Start menu.
Some dialogs do not appear in automate snapshot output. Use the locate command (OCR) to find button text and mouse click to interact. This is unreliable but may work for simple Yes/No dialogs.
OCR (locate) is not highly reliable
The locate command uses OCR, which can misread characters, miss text entirely, or return imprecise coordinates. Try these in order:
1. Try automate snapshot (with and without the -i flag).
2. Use locate "text" to find the element via OCR.
3. Use the coordinates from the locate output with mouse click.
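The automation-first, OCR-second order described in the limitations can be sketched as a small wrapper. It simplifies the first step by attempting automate click directly instead of inspecting a snapshot. It also assumes that agent-rdp exits non-zero when a click fails and that locate prints the "center: (X, Y)" format documented for it; neither is confirmed here. click_with_fallback and AGENT_RDP_BIN are hypothetical names.

```shell
#!/bin/sh
# Sketch only: assumes a failed "automate click" returns a non-zero status.
AGENT_RDP_BIN="${AGENT_RDP_BIN:-agent-rdp}"

click_with_fallback() {
    # $1 = selector or element name for automate, $2 = visible text for OCR
    if "$AGENT_RDP_BIN" automate click "$1" >/dev/null 2>&1; then
        echo "clicked via automate"
        return 0
    fi
    # Fall back to OCR and take the center of the first matching line.
    coords=$("$AGENT_RDP_BIN" locate "$2" 2>/dev/null |
        sed -n 's/.*center: (\([0-9][0-9]*\), *\([0-9][0-9]*\)).*/\1 \2/p' | head -n 1)
    if [ -z "$coords" ]; then
        echo "not found: $2" >&2
        return 1
    fi
    # Intentionally unquoted so "X Y" splits into two arguments.
    "$AGENT_RDP_BIN" mouse click $coords >/dev/null && echo "clicked via OCR at $coords"
}

# Usage (not run here):
#   click_with_fallback "#SaveButton" "Save"
```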