doublecheck by github/awesome-copilot
npx skills add https://github.com/github/awesome-copilot --skill doublecheck
Run a three-layer verification pipeline on AI-generated output. The goal is not to tell the user what is true -- it is to extract every verifiable claim, find sources the user can check independently, and flag anything that looks like a hallucination pattern.
Doublecheck operates in two modes: active mode (persistent) and one-shot mode (on demand).
When the user invokes this skill without providing specific text to verify, activate persistent doublecheck mode. Respond with:
Doublecheck is now active. I'll verify factual claims in my responses before presenting them. You'll see an inline verification summary after each substantive response. Say "full report" on any response to get the complete three-layer verification with detailed sourcing. Turn it off anytime by saying "turn off doublecheck."
Then follow ALL of the rules below for the remainder of the conversation:
Rule: Classify every response before sending it.
Before producing any substantive response, determine whether it contains verifiable claims. Classify the response:
| Response type | Contains verifiable claims? | Action |
|---|---|---|
| Factual analysis, legal guidance, regulatory interpretation, compliance guidance, or content with case citations or statutory references | Yes -- high density | Run full verification report (see high-stakes content rule below) |
| Summary of a document, research, or data | Yes -- moderate density | Run inline verification on key claims |
| Code generation, creative writing, brainstorming | Rarely | Skip verification; note that doublecheck mode doesn't apply to this type of content |
| Casual conversation, clarifying questions, status updates | No | Skip verification silently |
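The routing logic in the table above can be sketched as a small function. This is a minimal illustration, not part of the skill specification: the tag names and the idea of pre-tagging a response are assumptions made for demonstration.

```python
# Illustrative sketch of the classification rule. The tag vocabularies
# below are hypothetical heuristics, not defined by the skill itself.
HIGH_STAKES = {"legal", "regulatory", "compliance", "citation", "statute"}
SUMMARY = {"summary", "research", "data"}
NON_FACTUAL = {"code", "creative", "brainstorm"}

def classify_response(tags: set[str]) -> str:
    """Map a response's content tags to a verification action."""
    if tags & HIGH_STAKES:
        return "full_report"          # high density of verifiable claims
    if tags & SUMMARY:
        return "inline_verification"  # moderate density: check key claims
    if tags & NON_FACTUAL:
        return "skip_with_note"       # note that doublecheck doesn't apply
    return "skip_silently"            # casual conversation, status updates
```

The order of checks matters: a response that is both a summary and legal guidance escalates to the full report, matching the high-stakes rule below.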
Rule: Inline verification for active mode.
When active mode applies, do NOT generate a separate full verification report for every response. Instead, embed verification directly into your response using this pattern:
End the response with a **Verification** section. Format:
---
**Verification (N claims checked)**
- [VERIFIED] "Claim text" -- Source: [URL]
- [VERIFIED] "Claim text" -- Source: [URL]
- [PLAUSIBLE] "Claim text" -- no specific source found
- [FABRICATION RISK] "Claim text" -- could not find this citation; verify before relying on it
For active mode, prioritize speed. Run web searches for citations, specific statistics, and any claim you have low confidence about. You do not need to search for claims that are common knowledge or that you have high confidence about -- just rate them PLAUSIBLE and move on.
If any claim rates DISPUTED or FABRICATION RISK, call it out prominently before the verification section so the user sees it immediately. When auto-escalation applies (see below), place this callout at the top of the full report, before the summary table:
**Heads up:** I'm not confident about [specific claim]. I couldn't find a supporting source. You should verify this independently before relying on it.
Rule: Auto-escalate to full report for high-risk findings.
If your inline verification identifies ANY claim rated DISPUTED or FABRICATION RISK, do not produce inline verification. Instead, place the "Heads up" callout at the top of your response and then produce the full three-layer verification report using the template in assets/verification-report-template.md. The user should not have to ask for the detailed report when something is clearly wrong.
Rule: Full report for high-stakes content.
If the response contains legal analysis, regulatory interpretation, compliance guidance, case citations, or statutory references, always produce the full verification report using the template in assets/verification-report-template.md. Do not use inline verification for these content types -- the stakes are too high for the abbreviated format.
Rule: Discoverability footer for inline verification.
When producing inline verification (not a full report), always append this line at the end of the verification section:
_Say "full report" for detailed three-layer verification with sources._
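The inline format and discoverability footer above can be sketched as a small renderer. This is an assumption-laden illustration: it presumes each checked claim is available as a (rating, claim text, source note) triple, which is not something the skill itself prescribes.

```python
def render_verification(claims: list[tuple[str, str, str]]) -> str:
    """Render the inline Verification section plus the discoverability
    footer. Each claim is a (RATING, claim_text, source_note) triple --
    a hypothetical shape chosen for this sketch."""
    lines = ["---", f"**Verification ({len(claims)} claims checked)**"]
    for rating, text, note in claims:
        lines.append(f'- [{rating}] "{text}" -- {note}')
    # Always append the discoverability footer to inline verification.
    lines.append('_Say "full report" for detailed three-layer '
                 'verification with sources._')
    return "\n".join(lines)
```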
Rule: Offer full verification on request.
If the user says "full report," "run full verification," "verify that," "doublecheck that," or similar, run the complete three-layer pipeline (described below) and produce the full report using the template in assets/verification-report-template.md.
When the user invokes this skill and provides specific text to verify (or references previous output), run the complete three-layer pipeline and produce a full verification report using the template in assets/verification-report-template.md.
When the user says "turn off doublecheck," "stop doublecheck," or similar, respond with:
Doublecheck is now off. I'll respond normally without inline verification. You can reactivate it anytime.
Layer 1: Extraction and internal analysis.
Re-read the target text with a critical lens. Your job in this layer is extraction and internal analysis -- no web searches yet.
Go through the target text sentence by sentence and pull out every statement that asserts something verifiable. Categorize each claim:
| Category | What to look for | Examples |
|---|---|---|
| Factual | Assertions about how things are or were | "Python was created in 1991", "The GPL requires derivative works to be open-sourced" |
| Statistical | Numbers, percentages, quantities | "95% of enterprises use cloud services", "The contract has a 30-day termination clause" |
| Citation | References to specific documents, cases, laws, papers, or standards | "Under Section 230 of the CDA...", "In Mayo v. Prometheus (2012)..." |
| Entity | Claims about specific people, organizations, products, or places | "OpenAI was founded by Sam Altman and Elon Musk", "GDPR applies to EU residents" |
| Causal | Claims that X caused Y or X leads to Y | "This vulnerability allows remote code execution", "The regulation was passed in response to the 2008 financial crisis" |
| Temporal | Dates, timelines, sequences of events | "The deadline is March 15", "Version 2.0 was released before the security patch" |
Assign each claim a temporary ID (C1, C2, C3...) for tracking through subsequent layers.
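One way to represent an extracted claim so it can be tracked through the layers is a small record type. The field names and default rating here are illustrative choices, not part of the skill definition:

```python
from dataclasses import dataclass

# The six claim categories from the extraction table.
CATEGORIES = {"factual", "statistical", "citation",
              "entity", "causal", "temporal"}

@dataclass
class Claim:
    """One extracted claim, tracked by ID through all three layers."""
    id: str        # temporary ID, e.g. "C1"
    category: str  # one of CATEGORIES
    text: str      # the verbatim statement from the target text
    rating: str = "UNVERIFIED"  # updated by Layers 2 and 3

def assign_ids(claim_texts: list[str]) -> list[str]:
    """Assign sequential IDs C1, C2, C3... in extraction order."""
    return [f"C{i}" for i in range(1, len(claim_texts) + 1)]
```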
Review the extracted claims against each other:
Flag any internal contradictions immediately -- these don't need external verification to identify as problems.
For each claim, make an initial assessment based only on your own knowledge:
Record your initial confidence but do NOT report it as a finding yet. This is input for Layer 2, not output.
Layer 2: External evidence search.
For each extracted claim, search for external evidence. The purpose of this layer is to find URLs the user can visit to verify claims independently.
For each claim:
Formulate a search query that would surface the primary source. For citations, search for the exact title or case name. For statistics, search for the specific number and topic. For factual claims, search for the key entities and relationships.
Run the search using web_search. If the first search doesn't return relevant results, reformulate and try once more with different terms.
Evaluate what you find:
Record the result with the source URL. Always provide the URL even if you also summarize what the source says.
Prefer primary and authoritative sources:
Note when a source is secondary (news article, blog post, wiki page) vs. primary. The user can weigh accordingly.
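The search-and-reformulate loop above can be sketched as follows. The `web_search` parameter stands in for whatever search tool is available; its list-of-URLs return shape is an assumption made for this sketch.

```python
def verify_claim(claim_text: str, query: str, web_search) -> dict:
    """Layer 2 sketch: search once, reformulate once, record the URL.
    `web_search` is a hypothetical callable returning a list of URLs."""
    results = web_search(query)
    if not results:
        # First search came up empty: reformulate with different terms
        # (here, the exact claim text) and try once more.
        results = web_search(f'"{claim_text}"')
    if results:
        # Always record the URL so the user can check it independently.
        return {"claim": claim_text, "rating": "VERIFIED",
                "source": results[0]}
    return {"claim": claim_text, "rating": "UNVERIFIED", "source": None}
```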
Citations are the highest-risk category for hallucinations. For any claim that cites a specific case, statute, paper, standard, or document:
Layer 3: Adversarial review.
Switch your posture entirely. In Layers 1 and 2, you were trying to understand and verify the output. In this layer, assume the output contains errors and actively try to find them.
Check for these common patterns:
Fabricated citations -- The text cites a specific case, paper, or statute that you could not find in Layer 2. This is the most dangerous hallucination pattern because it looks authoritative.
Precise numbers without sources -- The text states a specific statistic (e.g., "78% of companies...") without indicating where the number comes from. Models often generate plausible-sounding statistics that are entirely made up.
Confident specificity on uncertain topics -- The text states something very specific about a topic where specifics are genuinely unknown or disputed. Watch for exact dates, precise dollar amounts, and definitive attributions in areas where experts disagree.
Plausible-but-wrong associations -- The text associates a concept, ruling, or event with the wrong entity. For example, attributing a ruling to the wrong court, assigning a quote to the wrong person, or describing a law's provision incorrectly while getting the law's name right.
Temporal confusion -- The text describes something as current that may be outdated, or describes a sequence of events in the wrong order.
Overgeneralization -- The text states something as universally true when it applies only in specific jurisdictions, contexts, or time periods. Common in legal and regulatory content.
Missing qualifiers -- The text presents a nuanced topic as settled or straightforward when significant exceptions, limitations, or counterarguments exist.
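As one concrete illustration of the "precise numbers without sources" pattern, a regex heuristic can pre-flag sentences that state a percentage with no nearby attribution. Both the pattern and the source-marker phrases are assumptions for demonstration; a real review would rely on judgment, not string matching alone.

```python
import re

# Hypothetical attribution cues -- illustrative only, not exhaustive.
SOURCE_MARKERS = ("according to", "source:", "reported by", "cited in")

def flag_unsourced_stat(sentence: str) -> bool:
    """True if the sentence states a precise percentage but carries
    no recognizable attribution cue."""
    has_stat = re.search(r"\b\d{1,3}(\.\d+)?%", sentence) is not None
    has_source = any(m in sentence.lower() for m in SOURCE_MARKERS)
    return has_stat and not has_source
```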
For each major claim that passed Layers 1 and 2, ask:
If you find any of these, flag them prominently in the report:
After completing all three layers, produce the report using the template in assets/verification-report-template.md.
Assign each claim a final rating:
| Rating | Meaning | What the user should do |
|---|---|---|
| VERIFIED | Supporting source found and linked | Spot-check the source link if the claim is critical to your work |
| PLAUSIBLE | Consistent with general knowledge, no specific source found | Treat as reasonable but unconfirmed; verify independently if relying on it for decisions |
| UNVERIFIED | Could not find supporting or contradicting evidence | Do not rely on this claim without independent verification |
| DISPUTED | Found contradicting evidence from a credible source | Review the contradicting source; this claim may be wrong |
| FABRICATION RISK | Matches hallucination patterns (e.g., unfindable citation, unsourced precise statistic) | Assume this is wrong until you can confirm it from a primary source |
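The rating scale above is ordered by severity, which makes the auto-escalation rule mechanical: escalate whenever the worst rating crosses the DISPUTED threshold. A minimal sketch, assuming ratings arrive as plain strings:

```python
# Ratings ordered from least to most severe, per the table above.
RATING_SEVERITY = ["VERIFIED", "PLAUSIBLE", "UNVERIFIED",
                   "DISPUTED", "FABRICATION RISK"]

def worst_rating(ratings: list[str]) -> str:
    """Return the most severe rating among the checked claims."""
    return max(ratings, key=RATING_SEVERITY.index)

def needs_full_report(ratings: list[str]) -> bool:
    """Auto-escalate when any claim is DISPUTED or FABRICATION RISK."""
    return worst_rating(ratings) in {"DISPUTED", "FABRICATION RISK"}
```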
Always include this at the end of the report:
Limitations of this verification:
- This tool accelerates human verification; it does not replace it.
- Web search results may not include the most recent information or paywalled sources.
- The adversarial review uses the same underlying model that may have produced the original output. It catches many issues but cannot catch all of them.
- A claim rated VERIFIED means a supporting source was found, not that the claim is definitely correct. Sources can be wrong too.
- Claims rated PLAUSIBLE may still be wrong. The absence of contradicting evidence is not proof of accuracy.
Legal content carries elevated hallucination risk because:
For legal content, give extra scrutiny to: case citations, statutory references, regulatory interpretations, and jurisdictional claims. Search legal databases when possible.
- Weekly installs: 473
- GitHub stars: 26.7K
- First seen: 12 days ago
- Security audits: Gen Agent Trust Hub (pass), Socket (pass), Snyk (warn)
- Installed on: gemini-cli (419), codex (417), opencode (409), github-copilot (406), cursor (406), kimi-cli (402)