firecrawl by vm0-ai/vm0-skills
Install: npx skills add https://github.com/vm0-ai/vm0-skills --skill firecrawl
Use the Firecrawl API via direct curl calls to scrape websites and extract data for AI.
Official docs:
https://docs.firecrawl.dev/
Use this skill when you need to scrape a single page, crawl a whole site, map a site's URLs, search the web, or extract structured data with AI.

Set your API key first:
export FIRECRAWL_TOKEN="fc-your-api-key"
All examples below assume FIRECRAWL_TOKEN is set.
Base URL: https://api.firecrawl.dev/v1
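Every call below shares the same curl boilerplate. If you prefer, it can be wrapped once in a small helper function (`fc_post` is a name invented here, not part of the skill):

```shell
# fc_post: POST a JSON body to a Firecrawl v1 endpoint (hypothetical helper).
# Assumes FIRECRAWL_TOKEN is exported.
# Usage: fc_post scrape '{"url":"https://example.com"}'
fc_post() {
  local endpoint="$1" body="$2"
  curl -s -X POST "https://api.firecrawl.dev/v1/${endpoint}" \
    -H "Authorization: Bearer ${FIRECRAWL_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "${body}"
}
```

The remaining examples keep the explicit curl form so each one can be copied on its own.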
Extract content from a single webpage.
Basic scrape (markdown output). Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["markdown"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
Main content only, with a 30-second timeout. Write to /tmp/firecrawl_request.json:
{
"url": "https://docs.example.com/api",
"formats": ["markdown"],
"onlyMainContent": true,
"timeout": 30000
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.markdown'
HTML output. Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["html"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.html'
Screenshot. Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["screenshot"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.screenshot'
Scrape Parameters:
| Parameter | Type | Description |
|---|---|---|
url | string | URL to scrape (required) |
formats | array | markdown, html, rawHtml, screenshot, links |
onlyMainContent | boolean | Skip headers/footers |
timeout | number | Timeout in milliseconds |
Crawl all pages of a website (async operation).
Start a crawl. Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"limit": 50,
"maxDepth": 2
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
Response:
{
"success": true,
"id": "crawl-job-id-here"
}
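The job id can be pulled straight out of that response with jq. Here the sample payload above stands in for a live response:

```shell
# Parse the crawl-start response and keep the job id for later status checks.
RESPONSE='{"success": true, "id": "crawl-job-id-here"}'
JOB_ID="$(printf '%s' "$RESPONSE" | jq -r '.id')"
echo "$JOB_ID"   # -> crawl-job-id-here
```

In a live call, pipe the curl output into `jq -r '.id'` the same way.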
Check status. Replace <job-id> with the job ID returned from the crawl request:
curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq '{status, completed, total}'
Get the results once completed. Replace <job-id> with the actual job ID:
curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq '.data[] | {url: .metadata.url, title: .metadata.title}'
Crawl with path filters. Write to /tmp/firecrawl_request.json:
{
"url": "https://blog.example.com",
"limit": 20,
"maxDepth": 3,
"includePaths": ["/posts/*"],
"excludePaths": ["/admin/*", "/login"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
Crawl Parameters:
| Parameter | Type | Description |
|---|---|---|
url | string | Starting URL (required) |
limit | number | Max pages to crawl (default: 100) |
maxDepth | number | Max crawl depth (default: 3) |
includePaths | array | Paths to include (e.g., /blog/*) |
excludePaths | array | Paths to exclude |
Get all URLs from a website quickly.
Map a whole site. Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links[:10]'
Filter mapped URLs by keyword. Write to /tmp/firecrawl_request.json:
{
"url": "https://shop.example.com",
"search": "product",
"limit": 500
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links'
Map Parameters:
| Parameter | Type | Description |
|---|---|---|
url | string | Website URL (required) |
search | string | Filter URLs containing keyword |
limit | number | Max URLs to return (default: 1000) |
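A map response is just a `links` array, so jq can count and filter it locally. The payload here is a stand-in with the same shape as a real response:

```shell
# Count mapped URLs, then keep only the /blog/ ones.
RESPONSE='{"success": true, "links": ["https://example.com/", "https://example.com/blog/a", "https://example.com/blog/b"]}'
printf '%s' "$RESPONSE" | jq '.links | length'   # -> 3
printf '%s' "$RESPONSE" | jq -r '.links[] | select(contains("/blog/"))'
```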
Search the web and get full page content.
Basic search. Write to /tmp/firecrawl_request.json:
{
"query": "AI news 2024",
"limit": 5
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'
Search and scrape the results. Write to /tmp/firecrawl_request.json:
{
"query": "machine learning tutorials",
"limit": 3,
"scrapeOptions": {
"formats": ["markdown"]
}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, content: .markdown[:500]}'
Search Parameters:
| Parameter | Type | Description |
|---|---|---|
query | string | Search query (required) |
limit | number | Number of results (default: 10) |
scrapeOptions | object | Options for scraping results |
Extract structured data from pages using AI.
Prompt-only extraction. Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/product/123"],
"prompt": "Extract the product name, price, and description"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Extraction with a JSON schema. Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/product/123"],
"prompt": "Extract product information",
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"price": {"type": "number"},
"currency": {"type": "string"},
"inStock": {"type": "boolean"}
}
}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Extract from multiple URLs. Write to /tmp/firecrawl_request.json:
{
"urls": [
"https://example.com/product/1",
"https://example.com/product/2"
],
"prompt": "Extract product name and price"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Extract Parameters:
| Parameter | Type | Description |
|---|---|---|
urls | array | URLs to extract from (required) |
prompt | string | Description of data to extract (required) |
schema | object | JSON schema for structured output |
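Because every request body is hand-written JSON, a quick jq check before sending catches syntax errors (trailing commas, bad quotes) without spending an API call. This sketch assumes jq is installed:

```shell
# Write a request body, then verify it parses and has a non-empty "urls" array.
cat > /tmp/firecrawl_request.json <<'EOF'
{
  "urls": ["https://example.com/product/123"],
  "prompt": "Extract product information"
}
EOF
jq -e '.urls | length > 0' /tmp/firecrawl_request.json >/dev/null && echo "request ok"
```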
Example: save a documentation page as Markdown. Write to /tmp/firecrawl_request.json:
{
"url": "https://docs.python.org/3/tutorial/",
"formats": ["markdown"],
"onlyMainContent": true
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.data.markdown' > python-tutorial.md
Example: list a blog's post URLs. Write to /tmp/firecrawl_request.json:
{
"url": "https://blog.example.com",
"search": "post"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.links[]'
Example: research a topic with full content. Write to /tmp/firecrawl_request.json:
{
"query": "best practices REST API design 2024",
"limit": 5,
"scrapeOptions": {"formats": ["markdown"]}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'
Example: extract pricing tiers. Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/pricing"],
"prompt": "Extract all pricing tiers with name, price, and features"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Poll until the crawl finishes. Replace <job-id> with the actual job ID:
while true; do
  STATUS="$(curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq -r '.status')"
  echo "Status: $STATUS"
  # Stop on terminal states so a failed crawl does not loop forever.
  [ "$STATUS" = "completed" ] && break
  [ "$STATUS" = "failed" ] && break
  sleep 5
done
Example scrape response:
{
"success": true,
"data": {
"markdown": "# Page Title\n\nContent...",
"metadata": {
"title": "Page Title",
"description": "...",
"url": "https://..."
}
}
}
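Given a response with that shape, jq pulls out the usual fields. The sample below mirrors the structure shown above:

```shell
# Read the page title and the first markdown line from a scrape response.
RESPONSE='{"success": true, "data": {"markdown": "# Page Title\n\nContent...", "metadata": {"title": "Page Title", "url": "https://example.com"}}}'
printf '%s' "$RESPONSE" | jq -r '.data.metadata.title'        # -> Page Title
printf '%s' "$RESPONSE" | jq -r '.data.markdown' | head -n 1  # -> # Page Title
```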
Example crawl status response:
{
"success": true,
"status": "completed",
"completed": 50,
"total": 50,
"data": [...]
}
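The same fields drive a one-line progress summary, again using a sample payload with the shape shown above:

```shell
# Format a crawl status response as "status: completed/total pages".
RESPONSE='{"success": true, "status": "completed", "completed": 50, "total": 50}'
printf '%s' "$RESPONSE" | jq -r '"\(.status): \(.completed)/\(.total) pages"'
```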
Tips:
- Set conservative limit values to control API usage
- Use onlyMainContent: true for cleaner output
- Poll /crawl/{id} for crawl status
- Check the success field in responses