citation-management by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill citation-management
Manage citations systematically throughout the research and writing process. This skill provides tools and strategies for searching academic databases (Google Scholar, PubMed), extracting accurate metadata from multiple sources (CrossRef, PubMed, arXiv), validating citation information, and generating properly formatted BibTeX entries.
Critical for maintaining citation accuracy, avoiding reference errors, and ensuring reproducible research. Integrates seamlessly with the literature-review skill for comprehensive research workflows.
Use this skill when:
When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.
If your document does not already contain schematics or diagrams:
For new documents: Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.
How to generate schematics:
python scripts/generate_schematic.py "your diagram description" -o figures/output.png
The AI will automatically:
When to add schematics:
For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.
Citation management follows a systematic process:
Goal: Find relevant papers using academic search engines.
Google Scholar provides the most comprehensive coverage across disciplines.
Basic Search:
# Search for papers on a topic
python scripts/search_google_scholar.py "CRISPR gene editing" \
--limit 50 \
--output results.json
# Search with year filter
python scripts/search_google_scholar.py "machine learning protein folding" \
--year-start 2020 \
--year-end 2024 \
--limit 100 \
--output ml_proteins.json
Advanced Search Strategies (see references/google_scholar_search.md):
"deep learning"
author:LeCun
intitle:"neural networks"
machine learning -survey

Best Practices:
PubMed specializes in biomedical and life sciences literature (35+ million citations).
Basic Search:
# Search PubMed
python scripts/search_pubmed.py "Alzheimer's disease treatment" \
--limit 100 \
--output alzheimers.json
# Search with MeSH terms and filters
python scripts/search_pubmed.py \
--query '"Alzheimer Disease"[MeSH] AND "Drug Therapy"[MeSH]' \
--date-start 2020 \
--date-end 2024 \
--publication-types "Clinical Trial,Review" \
--output alzheimers_trials.json
Advanced PubMed Queries (see references/pubmed_search.md):
"Diabetes Mellitus"[MeSH]
"cancer"[Title], "Smith J"[Author]
AND, OR, NOT
2020:2024[Publication Date]
"Review"[Publication Type]

Best Practices:
Goal: Convert paper identifiers (DOI, PMID, arXiv ID) to complete, accurate metadata.
For single DOIs, use the quick conversion tool:
# Convert single DOI
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2
# Convert multiple DOIs from a file
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib
# Different output formats
python scripts/doi_to_bibtex.py 10.1038/nature12345 --format json
For DOIs, PMIDs, arXiv IDs, or URLs:
# Extract from DOI
python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2
# Extract from PMID
python scripts/extract_metadata.py --pmid 34265844
# Extract from arXiv ID
python scripts/extract_metadata.py --arxiv 2103.14030
# Extract from URL
python scripts/extract_metadata.py --url "https://www.nature.com/articles/s41586-021-03819-2"
# Batch extraction from file (mixed identifiers)
python scripts/extract_metadata.py --input identifiers.txt --output citations.bib
Metadata Sources (see references/metadata_extraction.md):
CrossRef API: Primary source for DOIs
PubMed E-utilities: Biomedical literature
arXiv API: Preprints in physics, math, CS, and quantitative biology
DataCite API: Research datasets, software, and other resources

What Gets Extracted:
Goal: Generate clean, properly formatted BibTeX entries.
See references/bibtex_formatting.md for the complete guide.
Common Entry Types:
@article: Journal articles (most common)
@book: Books
@inproceedings: Conference papers
@incollection: Book chapters
@phdthesis: Dissertations
@misc: Preprints, software, datasets

Required Fields by Type:
@article{citationkey,
author = {Last1, First1 and Last2, First2},
title = {Article Title},
journal = {Journal Name},
year = {2024},
volume = {10},
number = {3},
pages = {123--145},
doi = {10.1234/example}
}
@inproceedings{citationkey,
author = {Last, First},
title = {Paper Title},
booktitle = {Conference Name},
year = {2024},
pages = {1--10}
}
@book{citationkey,
author = {Last, First},
title = {Book Title},
publisher = {Publisher Name},
year = {2024}
}
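The list above also includes @misc for preprints, software, and datasets; no template is shown for it, so here is an illustrative arXiv-style entry (all field values are placeholders, following the same pattern as the templates above):

```bibtex
@misc{citationkey,
  author        = {Last, First},
  title         = {Preprint Title},
  year          = {2024},
  eprint        = {2103.14030},
  archivePrefix = {arXiv},
  primaryClass  = {q-bio.BM},
  note          = {Preprint}
}
```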
Use the formatter to standardize BibTeX files:
# Format and clean BibTeX file
python scripts/format_bibtex.py references.bib \
--output formatted_references.bib
# Sort entries by citation key
python scripts/format_bibtex.py references.bib \
--sort key \
--output sorted_references.bib
# Sort by year (newest first)
python scripts/format_bibtex.py references.bib \
--sort year \
--descending \
--output sorted_references.bib
# Remove duplicates
python scripts/format_bibtex.py references.bib \
--deduplicate \
--output clean_references.bib
# Validate and report issues
python scripts/format_bibtex.py references.bib \
--validate \
--report validation_report.txt
Formatting Operations:
Goal: Verify all citations are accurate and complete.
# Validate BibTeX file
python scripts/validate_citations.py references.bib
# Validate and fix common issues
python scripts/validate_citations.py references.bib \
--auto-fix \
--output validated_references.bib
# Generate detailed validation report
python scripts/validate_citations.py references.bib \
--report validation_report.json \
--verbose
Validation Checks (see references/citation_validation.md):
DOI Verification:
Required Fields:
Data Consistency:
Duplicate Detection:
Format Compliance:
Validation Output:
{
"total_entries": 150,
"valid_entries": 145,
"errors": [
{
"citation_key": "Smith2023",
"error_type": "missing_field",
"field": "journal",
"severity": "high"
},
{
"citation_key": "Jones2022",
"error_type": "invalid_doi",
"doi": "10.1234/broken",
"severity": "high"
}
],
"warnings": [
{
"citation_key": "Brown2021",
"warning_type": "possible_duplicate",
"duplicate_of": "Brown2021a",
"severity": "medium"
}
]
}
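The DOI verification step above can be approximated with a small helper. This is an illustrative sketch, not the skill's actual implementation; a syntactic check like this catches malformed DOIs, while confirming a DOI resolves still requires a request to doi.org:

```python
import re

# A DOI is a registrant prefix "10.<4-9 digits>", a slash, and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def normalize_doi(raw: str) -> str:
    """Strip resolver prefixes and whitespace; DOIs are case-insensitive, so lowercase."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "https://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()

def is_wellformed_doi(raw: str) -> bool:
    """Syntactic check only; it does not confirm the DOI actually resolves."""
    return bool(DOI_PATTERN.match(normalize_doi(raw)))
```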
Complete workflow for creating a bibliography:
# 1. Search for papers on your topic
python scripts/search_pubmed.py \
'"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
--date-start 2020 \
--limit 200 \
--output crispr_papers.json
# 2. Extract DOIs from search results and convert to BibTeX
python scripts/extract_metadata.py \
--input crispr_papers.json \
--output crispr_refs.bib
# 3. Add specific papers by DOI
python scripts/doi_to_bibtex.py 10.1038/nature12345 >> crispr_refs.bib
python scripts/doi_to_bibtex.py 10.1126/science.abcd1234 >> crispr_refs.bib
# 4. Format and clean the BibTeX file
python scripts/format_bibtex.py crispr_refs.bib \
--deduplicate \
--sort year \
--descending \
--output references.bib
# 5. Validate all citations
python scripts/validate_citations.py references.bib \
--auto-fix \
--report validation.json \
--output final_references.bib
# 6. Review validation report and fix any remaining issues
cat validation.json
# 7. Use in your LaTeX document
# \bibliography{final_references}
This skill complements the literature-review skill:
Literature Review Skill → Systematic search and synthesis
Citation Management Skill → Technical citation handling
Combined Workflow:
1. literature-review for comprehensive multi-database search
2. citation-management to extract and validate all citations
3. literature-review to synthesize findings thematically
4. citation-management to verify final bibliography accuracy

# After completing literature review
# Verify all citations in the review document
python scripts/validate_citations.py my_review_references.bib --report review_validation.json
# Format for specific citation style if needed
python scripts/format_bibtex.py my_review_references.bib \
--style nature \
--output formatted_refs.bib
Finding Seminal Papers:
Advanced Operators (full list in references/google_scholar_search.md):
"exact phrase" # Exact phrase matching
author:lastname # Search by author
intitle:keyword # Search in title only
source:journal # Search specific journal
-exclude # Exclude terms
OR # Alternative terms
2020..2024 # Year range
Example Searches:
# Find recent reviews on a topic
"CRISPR" intitle:review 2023..2024
# Find papers by specific author on topic
author:Church "synthetic biology"
# Find highly cited foundational work
"deep learning" 2012..2015 sort:citations
# Exclude surveys and focus on methods
"protein folding" -survey -review intitle:method
Using MeSH Terms: MeSH (Medical Subject Headings) provides a controlled vocabulary for precise searching.
"Diabetes Mellitus, Type 2"[MeSH]

Field Tags:
[Title] # Search in title only
[Title/Abstract] # Search in title or abstract
[Author] # Search by author name
[Journal] # Search specific journal
[Publication Date] # Date range
[Publication Type] # Article type
[MeSH] # MeSH term
Building Complex Queries:
# Clinical trials on diabetes treatment published recently
"Diabetes Mellitus, Type 2"[MeSH] AND "Drug Therapy"[MeSH]
AND "Clinical Trial"[Publication Type] AND 2020:2024[Publication Date]
# Reviews on CRISPR in specific journal
"CRISPR-Cas Systems"[MeSH] AND "Nature"[Journal] AND "Review"[Publication Type]
# Specific author's recent work
"Smith AB"[Author] AND cancer[Title/Abstract] AND 2022:2024[Publication Date]
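Complex queries like these can also be composed programmatically. The field tags are real PubMed syntax, but the helper below is a hypothetical sketch, not part of the skill's scripts:

```python
def pubmed_query(mesh=(), title_abstract=(), authors=(), date_range=None, pub_types=()):
    """Compose a PubMed query string from field-tagged terms, joined with AND."""
    parts = []
    parts += [f'"{term}"[MeSH]' for term in mesh]
    parts += [f'{term}[Title/Abstract]' for term in title_abstract]
    parts += [f'"{name}"[Author]' for name in authors]
    if date_range:
        start, end = date_range
        parts.append(f'{start}:{end}[Publication Date]')
    parts += [f'"{ptype}"[Publication Type]' for ptype in pub_types]
    return " AND ".join(parts)
```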
E-utilities for Automation: The scripts use the NCBI E-utilities API for programmatic access:
See references/pubmed_search.md for complete API documentation.
Search Google Scholar and export results.
Features:
Usage:
# Basic search
python scripts/search_google_scholar.py "quantum computing"
# Advanced search with filters
python scripts/search_google_scholar.py "quantum computing" \
--year-start 2020 \
--year-end 2024 \
--limit 100 \
--sort-by citations \
--output quantum_papers.json
# Export directly to BibTeX
python scripts/search_google_scholar.py "machine learning" \
--limit 50 \
--format bibtex \
--output ml_papers.bib
Search PubMed using E-utilities API.
Features:
Usage:
# Simple keyword search
python scripts/search_pubmed.py "CRISPR gene editing"
# Complex query with filters
python scripts/search_pubmed.py \
--query '"CRISPR-Cas Systems"[MeSH] AND "therapeutic"[Title/Abstract]' \
--date-start 2020-01-01 \
--date-end 2024-12-31 \
--publication-types "Clinical Trial,Review" \
--limit 200 \
--output crispr_therapeutic.json
# Export to BibTeX
python scripts/search_pubmed.py "Alzheimer's disease" \
--limit 100 \
--format bibtex \
--output alzheimers.bib
Extract complete metadata from paper identifiers.
Features:
Usage:
# Single DOI
python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2
# Single PMID
python scripts/extract_metadata.py --pmid 34265844
# Single arXiv ID
python scripts/extract_metadata.py --arxiv 2103.14030
# From URL
python scripts/extract_metadata.py \
--url "https://www.nature.com/articles/s41586-021-03819-2"
# Batch processing (file with one identifier per line)
python scripts/extract_metadata.py \
--input paper_ids.txt \
--output references.bib
# Different output formats
python scripts/extract_metadata.py \
--doi 10.1038/nature12345 \
--format json # or bibtex, yaml
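Under the hood, metadata extraction amounts to mapping an API record onto BibTeX fields. The sketch below shows one plausible mapping for a CrossRef "message" object; it is a simplified illustration of what a script like extract_metadata.py might do, not its actual code:

```python
def crossref_to_bibtex(work: dict) -> str:
    """Map a CrossRef work record to an @article entry (simplified sketch)."""
    authors = " and ".join(
        f"{a['family']}, {a['given']}" for a in work.get("author", [])
    )
    first_author = work.get("author", [{}])[0].get("family", "unknown")
    year = work.get("issued", {}).get("date-parts", [[None]])[0][0]
    key = f"{first_author.lower()}{year}"
    fields = {
        "author": authors,
        "title": work.get("title", [""])[0],
        "journal": work.get("container-title", [""])[0],
        "year": year,
        "volume": work.get("volume", ""),
        "pages": work.get("page", ""),
        "doi": work.get("DOI", ""),
    }
    # Emit only non-empty fields, one per line, matching the templates above.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"
```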
Validate BibTeX entries for accuracy and completeness.
Features:
Usage:
# Basic validation
python scripts/validate_citations.py references.bib
# With auto-fix
python scripts/validate_citations.py references.bib \
--auto-fix \
--output fixed_references.bib
# Detailed validation report
python scripts/validate_citations.py references.bib \
--report validation_report.json \
--verbose
# Only check DOIs
python scripts/validate_citations.py references.bib \
--check-dois-only
Format and clean BibTeX files.
Features:
Usage:
# Basic formatting
python scripts/format_bibtex.py references.bib
# Sort by year (newest first)
python scripts/format_bibtex.py references.bib \
--sort year \
--descending \
--output sorted_refs.bib
# Remove duplicates
python scripts/format_bibtex.py references.bib \
--deduplicate \
--output clean_refs.bib
# Complete cleanup
python scripts/format_bibtex.py references.bib \
--deduplicate \
--sort year \
--validate \
--auto-fix \
--output final_refs.bib
Quick DOI to BibTeX conversion.
Features:
Usage:
# Single DOI
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2
# Multiple DOIs
python scripts/doi_to_bibtex.py \
10.1038/nature12345 \
10.1126/science.abc1234 \
10.1016/j.cell.2023.01.001
# From file (one DOI per line)
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib
# Copy to clipboard
python scripts/doi_to_bibtex.py 10.1038/nature12345 --clipboard
Start broad, then narrow:
Use multiple sources:
Leverage citations:
Document your searches:
Always use DOIs when available:
Verify extracted metadata:
Handle edge cases:
Maintain consistency:
Follow conventions:
Keep it clean:
Organize systematically:
Validate early and often:
Fix issues promptly:
Manual review for critical citations:
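"Follow conventions" usually means predictable citation keys such as FirstAuthorYearFirstWord. One possible sketch of such a key generator (the convention is common practice; the helper itself is hypothetical):

```python
import re

def make_citation_key(first_author_last: str, year: int, title: str) -> str:
    """Build a key like 'vaswani2017attention' from author, year, and the first significant title word."""
    stopwords = {"a", "an", "the", "on", "of", "for", "and", "in", "to"}
    words = re.findall(r"[a-z]+", title.lower())
    first = next((w for w in words if w not in stopwords), "untitled")
    author = re.sub(r"[^a-z]", "", first_author_last.lower())
    return f"{author}{year}{first}"
```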
Single source bias: Only using Google Scholar or PubMed
Accepting metadata blindly: Not verifying extracted information
Ignoring DOI errors: Broken or incorrect DOIs in the bibliography
Inconsistent formatting: Mixed citation key styles and formatting
Duplicate entries: Same paper cited multiple times under different keys
Missing required fields: Incomplete BibTeX entries
Outdated preprints: Citing a preprint when a published version exists
Special character issues: LaTeX compilation broken by unescaped characters
No validation before submission: Submitting with citation errors
Manual BibTeX entry: Typing entries by hand
* **Solution**: Always extract from metadata sources using scripts
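Duplicate entries can often be caught by comparing normalized titles. A rough sketch of the idea (not the skill's actual deduplication algorithm):

```python
import re
from collections import defaultdict

def normalize_title(title: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", re.sub(r"[^a-z0-9 ]", "", title.lower())).strip()

def find_duplicates(entries: dict) -> list:
    """entries: {citation_key: title}. Return groups of keys that share a normalized title."""
    groups = defaultdict(list)
    for key, title in entries.items():
        groups[normalize_title(title)].append(key)
    return [keys for keys in groups.values() if len(keys) > 1]
```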
# Step 1: Find key papers on your topic
python scripts/search_google_scholar.py "transformer neural networks" \
--year-start 2017 \
--limit 50 \
--output transformers_gs.json
python scripts/search_pubmed.py "deep learning medical imaging" \
--date-start 2020 \
--limit 50 \
--output medical_dl_pm.json
# Step 2: Extract metadata from search results
python scripts/extract_metadata.py \
--input transformers_gs.json \
--output transformers.bib
python scripts/extract_metadata.py \
--input medical_dl_pm.json \
--output medical.bib
# Step 3: Add specific papers you already know
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> specific.bib
python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> specific.bib
# Step 4: Combine all BibTeX files
cat transformers.bib medical.bib specific.bib > combined.bib
# Step 5: Format and deduplicate
python scripts/format_bibtex.py combined.bib \
--deduplicate \
--sort year \
--descending \
--output formatted.bib
# Step 6: Validate
python scripts/validate_citations.py formatted.bib \
--auto-fix \
--report validation.json \
--output final_references.bib
# Step 7: Review any issues
cat validation.json | grep -A 3 '"errors"'
# Step 8: Use in LaTeX
# \bibliography{final_references}
# You have a text file with DOIs (one per line)
# dois.txt contains:
# 10.1038/s41586-021-03819-2
# 10.1126/science.aam9317
# 10.1016/j.cell.2023.01.001
# Convert all to BibTeX
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib
# Validate the result
python scripts/validate_citations.py references.bib --verbose
# You have a messy BibTeX file from various sources
# Clean it up systematically
# Step 1: Format and standardize
python scripts/format_bibtex.py messy_references.bib \
--output step1_formatted.bib
# Step 2: Remove duplicates
python scripts/format_bibtex.py step1_formatted.bib \
--deduplicate \
--output step2_deduplicated.bib
# Step 3: Validate and auto-fix
python scripts/validate_citations.py step2_deduplicated.bib \
--auto-fix \
--output step3_validated.bib
# Step 4: Sort by year
python scripts/format_bibtex.py step3_validated.bib \
--sort year \
--descending \
--output clean_references.bib
# Step 5: Final validation report
python scripts/validate_citations.py clean_references.bib \
--report final_validation.json \
--verbose
# Review report
cat final_validation.json
# Find highly cited papers on a topic
python scripts/search_google_scholar.py "AlphaFold protein structure" \
--year-start 2020 \
--year-end 2024 \
--sort-by citations \
--limit 20 \
--output alphafold_seminal.json
# Extract the top 10 by citation count
# (script will have included citation counts in JSON)
# Convert to BibTeX
python scripts/extract_metadata.py \
--input alphafold_seminal.json \
--output alphafold_refs.bib
# The BibTeX file now contains the most influential papers
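Picking the top-cited records out of the exported JSON takes only a few lines, assuming each record carries a "citations" field (an assumption about the export schema, not a documented guarantee):

```python
import json

def top_cited(papers: list, n: int = 10) -> list:
    """Return the n records with the highest citation counts (missing counts sort last)."""
    return sorted(papers, key=lambda p: p.get("citations", 0), reverse=True)[:n]

# Typical use after a search export:
# with open("alphafold_seminal.json") as f:
#     seminal = top_cited(json.load(f), n=10)
```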
Citation Management provides the technical infrastructure for Literature Review:
Combined workflow:
Citation Management ensures accurate references for Scientific Writing:
Citation Management works with Venue Templates for submission-ready manuscripts:
References (in references/):
google_scholar_search.md: Complete Google Scholar search guide
pubmed_search.md: PubMed and E-utilities API documentation
metadata_extraction.md: Metadata sources and field requirements
citation_validation.md: Validation criteria and quality checks
bibtex_formatting.md: BibTeX entry types and formatting rules

Scripts (in scripts/):
search_google_scholar.py: Google Scholar search automation
search_pubmed.py: PubMed E-utilities API client
extract_metadata.py: Universal metadata extractor
validate_citations.py: Citation validation and verification
format_bibtex.py: BibTeX formatter and cleaner
doi_to_bibtex.py: Quick DOI to BibTeX converter

Assets (in assets/):
bibtex_template.bib: Example BibTeX entries for all types
citation_checklist.md: Quality assurance checklist

Search Engines:
Metadata APIs:
Tools and Validators:
Citation Styles:
# Core dependencies
pip install requests # HTTP requests for APIs
pip install bibtexparser # BibTeX parsing and formatting
pip install biopython # PubMed E-utilities access
# Optional (for Google Scholar)
pip install scholarly # Google Scholar API wrapper
# or
pip install selenium # For more robust Scholar scraping
# For advanced validation
pip install crossref-commons # Enhanced CrossRef API access
pip install pylatexenc # LaTeX special character handling
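pylatexenc handles LaTeX special characters in full; as a minimal stdlib-only fallback, the most common offenders can be escaped in a single pass (a hypothetical helper, not a replacement for pylatexenc):

```python
# Characters that commonly break LaTeX compilation inside BibTeX fields.
SPECIAL = {
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}", "\\": r"\textbackslash{}",
}

def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in one pass, so inserted braces are not re-escaped."""
    return "".join(SPECIAL.get(ch, ch) for ch in text)
```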
The citation-management skill provides:
Use this skill to maintain accurate, complete citations throughout your research and ensure publication-ready bibliographies.