defuddle by joeseesun/defuddle-skill
npx skills add https://github.com/joeseesun/defuddle-skill --skill defuddle
Extract main article content from web pages, removing ads, sidebars, navigation, and other clutter. Output clean Markdown with metadata.
Before first use, check if defuddle is installed:
command -v defuddle >/dev/null 2>&1 || npm install -g defuddle jsdom
When user provides a URL, follow this workflow:
Always use both -m and -j flags to get markdown content with full metadata:
defuddle parse "<url>" -m -j
Show the user:
- title field
- author field
- wordCount field

If this is the first time using defuddle in this conversation, ask the user:
"Save to which directory? (e.g. ~/Documents, ~/Desktop, or a custom path)"
Remember the user's chosen directory for subsequent uses in the same conversation.
Write the file with frontmatter + full content:
---
title: {title}
author: {author}
source: {url}
date: {published or "Unknown"}
clipped: {today's date YYYY-MM-DD}
wordCount: {wordCount}
---
# {title}
{markdown content}
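As a rough illustration of the template above, the following shell function assembles the frontmatter and the Markdown body into one file. The function name `write_clipping` and its argument order are hypothetical and not part of defuddle. Values are written unquoted, exactly as in the template, so a title containing a colon may need YAML quoting in practice.

```shell
# Hypothetical helper (not part of defuddle).
# Usage: write_clipping DIR FILENAME TITLE AUTHOR URL PUBLISHED WORDCOUNT
# Reads the Markdown body from stdin, writes DIR/FILENAME, prints the path.
write_clipping() (
  dir=$1 file=$2 title=$3 author=$4 url=$5 published=${6:-Unknown} words=$7
  mkdir -p "$dir" || exit 1
  {
    printf '%s\n' '---'
    printf 'title: %s\n' "$title"
    printf 'author: %s\n' "$author"
    printf 'source: %s\n' "$url"
    printf 'date: %s\n' "$published"              # falls back to "Unknown"
    printf 'clipped: %s\n' "$(date +%Y-%m-%d)"    # today's date
    printf 'wordCount: %s\n' "$words"
    printf '%s\n' '---'
    printf '# %s\n' "$title"
    cat                                           # Markdown body from stdin
  } > "$dir/$file" || exit 1
  printf '%s\n' "$dir/$file"
)
```

The function prints the path it wrote, which is the value to report back to the user.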
File naming: Use the article title as the filename, sanitized for the filesystem:
The Shape of the Essay Field.md
Tell the user the file path where it was saved.
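The text above does not list exact sanitization rules, so the sketch below is one possible reading. It assumes that characters commonly rejected by filesystems (`/ \ : * ? " < > |`) become `-`, runs of whitespace collapse to a single space, and an empty title falls back to `untitled`. The function name `sanitize_filename` is hypothetical.

```shell
# Hypothetical helper: turn an article title into a Markdown filename.
# The replacement rules here are assumptions, not taken from the skill.
sanitize_filename() (
  name=$(printf '%s\n' "$1" | sed \
    -e 's#[/\\:*?"<>|]#-#g' \
    -e 's/[[:space:]][[:space:]]*/ /g' \
    -e 's/^ //' \
    -e 's/ $//')
  # Fall back to a fixed name when nothing usable remains.
  printf '%s.md\n' "${name:-untitled}"
)
```

A title without any of those characters passes through unchanged apart from the `.md` suffix, which matches the example filename given above.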
defuddle parse <source> [options]
Arguments:
<source> — URL (https://...) or local HTML file path

Options:
| Flag | Description |
|---|---|
| -m, --markdown | Convert content to Markdown |
| -j, --json | Output as JSON with full metadata |
| -o, --output <file> | Write to file instead of stdout |
| -p, --property <name> | Extract single property (title, description, domain, author, published, wordCount, content) |
| --debug | Verbose logging |
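As a small sketch of the `-p` flag from the table, the function below prints the three fields that the workflow shows to the user. It assumes `-p <name>` writes only the bare property value to stdout, which the table implies but does not state outright. The function name `show_summary` is hypothetical.

```shell
# Hypothetical helper: print title, author, and wordCount for a source.
# Assumes `defuddle parse <source> -p <name>` prints just the value.
show_summary() (
  for prop in title author wordCount; do
    printf '%s: %s\n' "$prop" "$(defuddle parse "$1" -p "$prop")"
  done
)
```

This form parses the source once per property, so for a remote URL a single `-m -j` call, as the workflow recommends, is likely cheaper. The sketch is mainly useful for quick checks against a local HTML file.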
When using -j, the response includes:
- title — Article title
- author — Author name
- published — Publication date
- description — Meta description
- content — Extracted Markdown (when -m used)
- domain — Source domain
- favicon — Favicon URL
- image — Featured image URL
- site — Site name
- wordCount — Word count
- parseTime — Processing time in ms

jsdom is required as a peer dependency.

Weekly Installs: 906
Repository: joeseesun/defuddle-skill
GitHub Stars: 85
First Seen: Mar 4, 2026
Security Audits: Gen Agent Trust Hub (Warn), Socket (Pass), Snyk (Warn)
Installed on: opencode (869), cursor (868), codex (868), gemini-cli (867), github-copilot (867), kimi-cli (866)