web-fetch by 0xbigboss/claude-code
npx skills add https://github.com/0xbigboss/claude-code --skill web-fetch
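Fetch web content in this order:
1. Markdown-native endpoints (content-type: text/markdown): fetch directly with curl.
2. HTML pages: convert with html2markdown, using include and exclude selectors.
3. Fallback: run the bundled fetch.ts script with bun when selectors produce poor output.
Verify required tools before extracting: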
command -v curl >/dev/null || echo "curl is required"
command -v html2markdown >/dev/null || echo "html2markdown is required for HTML extraction"
command -v bun >/dev/null || echo "bun is required for fetch.ts fallback"
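The three checks above can be folded into one loop so that a wrapper script reports everything that is missing at once. This is a minimal sketch; the function name `missing_tools` is hypothetical and not part of the skill.

```shell
# Hypothetical helper: print the name of each required tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the tools this skill relies on and report any that are absent.
MISSING="$(missing_tools curl html2markdown bun)"
if [ -n "$MISSING" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "missing required tools:" $MISSING >&2
fi
```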
Install Bun dependencies for the bundled script:
cd ~/.claude/skills/web-fetch && bun install
Use this as the default flow for any URL:
URL="<url>"
CONTENT_TYPE="$(curl -sIL "$URL" | awk -F': ' 'tolower($1)=="content-type"{print tolower($2)}' | tr -d '\r' | tail -1)"
if echo "$CONTENT_TYPE" | grep -q "markdown"; then
  curl -sL "$URL"
else
  curl -sL "$URL" \
    | html2markdown \
      --include-selector "article,main,[role=main]" \
      --exclude-selector "nav,header,footer,script,style"
fi
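The `tail -1` in the flow above matters because `curl -sIL` prints one header block per redirect hop, and only the last block describes the page that is finally served. The sketch below isolates that step so it can be tried offline; the helper name `final_content_type` and the sample headers are made up for illustration.

```shell
# Hypothetical helper: read raw response headers on stdin and print the
# last Content-Type value, lowercased and without carriage returns.
final_content_type() {
  awk -F': ' 'tolower($1)=="content-type"{print tolower($2)}' | tr -d '\r' | tail -1
}

# Made-up headers: a redirect (HTML) followed by the final response (markdown).
printf 'HTTP/1.1 301 Moved Permanently\r\nContent-Type: text/html\r\nLocation: /docs.md\r\n\r\nHTTP/1.1 200 OK\r\nContent-Type: text/markdown; charset=utf-8\r\n\r\n' \
  | final_content_type
```

With these sample headers the helper prints the markdown content type of the final response, not the HTML type of the redirect.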
| Site | Include Selector | Exclude Selector |
|---|---|---|
| platform.claude.com | #content-container | - |
| docs.anthropic.com | #content-container | - |
| developer.mozilla.org | article | - |
| github.com (docs) | article | nav,.sidebar |
| Generic | article,main,[role=main] | nav,header,footer,script,style |
Example:
curl -sL "<url>" \
  | html2markdown \
    --include-selector "#content-container" \
    --exclude-selector "nav,header,footer"
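The site table can also be expressed as a lookup so that the selectors are chosen from the URL's host. This is only a sketch: the function name `pick_selectors` and the `INCLUDE`/`EXCLUDE` variables are hypothetical, and it applies the github.com row to every github.com URL even though the table scopes it to docs pages.

```shell
# Hypothetical helper: set INCLUDE and EXCLUDE for a URL, following the site table.
pick_selectors() {
  # Drop the scheme, then everything from the first "/" on, leaving the host.
  host="${1#*://}"
  host="${host%%/*}"
  case "$host" in
    platform.claude.com|docs.anthropic.com)
      INCLUDE="#content-container"; EXCLUDE="" ;;
    developer.mozilla.org)
      INCLUDE="article"; EXCLUDE="" ;;
    github.com)
      INCLUDE="article"; EXCLUDE="nav,.sidebar" ;;
    *)
      INCLUDE="article,main,[role=main]"; EXCLUDE="nav,header,footer,script,style" ;;
  esac
}

pick_selectors "https://developer.mozilla.org/en-US/docs/Web/API/fetch"
echo "include=$INCLUDE exclude=$EXCLUDE"

# Possible use with the real tools (kept as a comment; it needs network access).
# The exclude flag is passed only when EXCLUDE is non-empty:
#   curl -sL "$URL" | html2markdown --include-selector "$INCLUDE" \
#     ${EXCLUDE:+--exclude-selector "$EXCLUDE"}
```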
When a site isn't in the patterns list:
# Check what content containers exist
curl -sL "<url>" | grep -oE '<article[^>]*>|<main[^>]*>|id="[^"]*content[^"]*"' | head -10
# Test a selector
curl -sL "<url>" | html2markdown --include-selector "<selector>" | head -30
# Check line count
curl -sL "<url>" | html2markdown --include-selector "<selector>" | wc -l
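When a page has many candidate containers, counting the distinct matches can make the dominant one easier to spot than scanning the raw list. A small sketch, assuming the HTML has already been downloaded; `count_containers` is a hypothetical name and the sample markup is invented.

```shell
# Hypothetical helper: read HTML on stdin and count each distinct candidate
# container tag or content-like id, most frequent first.
count_containers() {
  grep -oE '<article[^>]*>|<main[^>]*>|id="[^"]*content[^"]*"' | sort | uniq -c | sort -rn
}

# Invented sample page: one <main> wrapper holding one <article>.
printf '%s\n' \
  '<nav id="top"></nav>' \
  '<main id="page-content"><article class="doc">text</article></main>' \
  | count_containers
```

In real use the input would come from `curl -sL "<url>"`.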
When selectors produce poor output, run the bundled parser:
bun ~/.claude/skills/web-fetch/fetch.ts "<url>"
If already in the skill directory:
bun fetch.ts "<url>"
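"Poor output" is not defined by the skill, so any automated fallback needs its own rule. The sketch below assumes one simple rule, fewer than five non-blank lines. Both the threshold and the name `is_poor_output` are arbitrary choices for illustration.

```shell
# Hypothetical check: succeed (exit 0) when stdin has fewer than MIN non-blank
# lines. The default of 5 is an arbitrary assumption, not a value from the skill.
is_poor_output() {
  min_lines="${1:-5}"
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count="$(grep -c '[^[:space:]]' || true)"
  [ "$count" -lt "$min_lines" ]
}

# Possible use with the real tools (kept as comments; it needs network access):
#   OUT="$(curl -sL "$URL" | html2markdown --include-selector "article,main,[role=main]")"
#   if printf '%s\n' "$OUT" | is_poor_output; then
#     bun ~/.claude/skills/web-fetch/fetch.ts "$URL"
#   else
#     printf '%s\n' "$OUT"
#   fi
```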
html2markdown options:
--include-selector "CSS" # Keep only matching elements
--exclude-selector "CSS" # Remove matching elements
--domain "https://..." # Convert relative links to absolute
Empty output with selectors: The page might be markdown-native. Check the headers first:
curl -sIL "<url>" | grep -i '^content-type:'
Wrong content selected: The site may have multiple article/main regions. List them to pick a narrower selector:
curl -sL "<url>" | grep -o '<article[^>]*>'
html2markdown not found: Install it, then retry selector-based extraction.
bun or script dependencies missing: Run `cd ~/.claude/skills/web-fetch && bun install`.
Missing code blocks: Check whether the site uses non-standard code formatting.
Client-rendered content: If the HTML contains only "Loading..." placeholders, the content is JS-rendered. Neither curl nor the Bun script can extract it; use browser-based tools.
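A rough way to spot the client-rendered case before switching tools is to measure how much visible text the raw HTML carries. The heuristic below is an assumption, not part of the skill: it strips simple inline scripts and all tags, then treats fewer than 200 remaining characters as a placeholder shell. The name `looks_client_rendered` and the threshold are hypothetical.

```shell
# Hypothetical heuristic: succeed (exit 0) when the HTML on stdin has fewer than
# MAX visible characters once tags are removed. Inline scripts whose bodies
# contain "<" are not stripped, so script-heavy pages can be misjudged.
looks_client_rendered() {
  max_chars="${1:-200}"
  text_chars="$(tr '\n' ' ' \
    | sed -e 's/<script[^>]*>[^<]*<\/script>//g' -e 's/<[^>]*>//g' \
    | tr -d '[:space:]' | wc -c | tr -d ' ')"
  [ "$text_chars" -lt "$max_chars" ]
}

# Invented sample: a typical single-page-app shell.
if printf '<html><body><div id="root">Loading...</div><script>var a=1;</script></body></html>' \
  | looks_client_rendered; then
  echo "likely JS-rendered; use a browser-based tool"
fi
```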
Weekly Installs: 336
Repository: 0xbigboss/claude-code
GitHub Stars: 36
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on: opencode (295), codex (286), gemini-cli (281), github-copilot (267), cursor (264), amp (241)