web-scraper by liranudi/openclaw-web-scraper
npx skills add https://github.com/liranudi/openclaw-web-scraper --skill web-scraper
Four scripts, zero API keys. All output is JSON by default.
Dependencies: requests, beautifulsoup4, playwright (with Chromium). Optional: pdfplumber or PyPDF2 for PDF text extraction.
Install: pip install requests beautifulsoup4 playwright && playwright install chromium
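Before running the scripts it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the skill itself: the helper names `missing_packages` and `dependency_report` are hypothetical, and the check only looks for Python modules (note that beautifulsoup4 is imported as `bs4`). It does not verify that `playwright install chromium` has been run.

```python
from importlib.util import find_spec

# pip package name -> import module name.
# beautifulsoup4 installs the module "bs4".
REQUIRED = {"requests": "requests", "beautifulsoup4": "bs4", "playwright": "playwright"}
# Optional: either one is enough for PDF text extraction.
OPTIONAL_PDF = {"pdfplumber": "pdfplumber", "PyPDF2": "PyPDF2"}


def _module_found(module):
    """True if the top-level module can be located without importing it."""
    return find_spec(module) is not None


def missing_packages(packages, is_installed=_module_found):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in packages.items() if not is_installed(module)]


def dependency_report(is_installed=_module_found):
    """Summarize missing required packages and whether PDF extraction is possible."""
    missing_pdf = missing_packages(OPTIONAL_PDF, is_installed)
    return {
        "missing_required": missing_packages(REQUIRED, is_installed),
        # At least one of the two optional PDF packages is present.
        "pdf_extraction_available": len(missing_pdf) < len(OPTIONAL_PDF),
    }
```

Calling `dependency_report()` with no arguments checks the current Python environment; the `is_installed` parameter exists only so the logic can be exercised without installing anything.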
python3 scripts/google_search.py "query" --pages N --engine ENGINE
--engine: duckduckgo (default), brave, or google
Output: [{title, url, snippet}, ...]
python3 scripts/read_page.py "https://url" [--max-chars N] [--visible] [--format json|markdown|text] [--no-dismiss]
--format: json (default), markdown, or text
--no-dismiss: skip cookie-banner dismissal
python3 scripts/browser_session.py open "https://url" # Open + extract
python3 scripts/browser_session.py navigate "https://other" # Go to new URL
python3 scripts/browser_session.py extract [--format FMT] # Re-read page
python3 scripts/browser_session.py screenshot [path] [--full] # Save screenshot
python3 scripts/browser_session.py click "Submit" # Click by text/selector
python3 scripts/browser_session.py search "keyword" # Search text in page
python3 scripts/browser_session.py tab new "https://url" # Open new tab
python3 scripts/browser_session.py tab list # List all tabs
python3 scripts/browser_session.py tab switch 1 # Switch to tab index
python3 scripts/browser_session.py tab close [index] # Close tab
python3 scripts/browser_session.py dismiss-cookies # Manually dismiss cookies
python3 scripts/browser_session.py close # Close browser
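Because the scripts above print JSON by default, they can be driven from Python with `subprocess` and `json`. The sketch below is one possible wrapper, not something shipped with the skill. The helper names `build_command`, `run_json`, and `search_then_open` are hypothetical, and it assumes each script writes a single JSON document to stdout, exits non-zero on failure, and that the browser session persists between separate `browser_session.py` invocations. None of these assumptions is confirmed by the listing.

```python
import json
import subprocess
import sys


def build_command(script, *args):
    """Build the argv list for a script under scripts/, e.g. browser_session.py."""
    return [sys.executable, f"scripts/{script}", *map(str, args)]


def run_json(script, *args, runner=subprocess.run):
    """Run a script and decode its stdout as JSON.

    Assumption: the script prints one JSON document (its default format).
    check=True raises CalledProcessError if the script exits non-zero.
    """
    completed = runner(build_command(script, *args),
                       capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def search_then_open(query, runner=subprocess.run):
    """Search once, open the first hit in the browser session, then close it."""
    hits = run_json("google_search.py", query, "--pages", 1,
                    "--engine", "duckduckgo", runner=runner)
    if not hits:
        return None
    # Each hit is expected to look like {title, url, snippet}.
    page = run_json("browser_session.py", "open", hits[0]["url"], runner=runner)
    # The output of "close" is not parsed, since its shape is not documented.
    runner(build_command("browser_session.py", "close"), check=True)
    return page
```

The `runner` parameter defaults to `subprocess.run` and is there so the flow can be tested with a stand-in; for example, `search_then_open("playwright tutorial")` would run the real scripts from the repository root.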
python3 scripts/download_file.py "https://example.com/doc.pdf" [--output DIR] [--filename NAME]
Output: {status, path, filename, size_bytes, content_type, extracted_text}
Weekly Installs: 98
Repository: liranudi/openclaw-web-scraper
GitHub Stars: 1
First Seen: Mar 5, 2026
Security Audits: Gen Agent Trust Hub (Warn), Socket (Pass), Snyk (Warn)
Installed on: opencode (98), gemini-cli (98), codex (98), kimi-cli (98), github-copilot (98), amp (98)