hypogenic by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill hypogenic
Hypogenic provides automated hypothesis generation and testing using large language models to accelerate scientific discovery. The framework supports three approaches: HypoGeniC (data-driven hypothesis generation), HypoRefine (synergistic literature and data integration), and Union methods (mechanistic combination of literature and data-driven hypotheses).
Get started with Hypogenic in minutes:
# Install the package
uv pip install hypogenic
# Clone example datasets
git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
# Run basic hypothesis generation
hypogenic_generation --config ./data/your_task/config.yaml --method hypogenic --num_hypotheses 20
# Run inference on generated hypotheses
hypogenic_inference --config ./data/your_task/config.yaml --hypotheses output/hypotheses.json
Or use the Python API:
from hypogenic import BaseTask
# Create task with your configuration
task = BaseTask(config_path="./data/your_task/config.yaml")
# Generate hypotheses
task.generate_hypotheses(method="hypogenic", num_hypotheses=20)
# Run inference
results = task.inference(hypothesis_bank="./output/hypotheses.json")
Use this skill when working on:
- Automated Hypothesis Generation
- Literature Integration
- Performance Optimization
- Flexible Configuration
- Proven Results
Generate hypotheses solely from observational data through iterative refinement.
Process:
Best for: Exploratory research without existing literature, pattern discovery in novel datasets
Synergistically combine existing literature with empirical data through an agentic framework.
Process:
Best for: Research with established theoretical foundations, validating or extending existing theories
Mechanistically combine literature-only hypotheses with framework outputs.
Variants:
Best for: Comprehensive hypothesis coverage, eliminating redundancy while maintaining diverse perspectives
Install via pip:
uv pip install hypogenic
Optional dependencies:
Clone example datasets:
# For HypoGeniC examples
git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
# For HypoRefine/Union examples
git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data
Datasets must follow HuggingFace datasets format with specific naming conventions:
Required files:
- <TASK>_train.json: Training data
- <TASK>_val.json: Validation data
- <TASK>_test.json: Test data

Required keys in JSON:
- text_features_1 through text_features_n: lists of strings containing feature values
- label: a list of strings containing ground-truth labels

Example (headline click prediction):
{
  "headline_1": [
    "What Up, Comet? You Just Got *PROBED*",
    "Scientists Made a Breakthrough in Quantum Computing"
  ],
  "headline_2": [
    "Scientists Everywhere Were Holding Their Breath Today. Here's Why.",
    "New Quantum Computer Achieves Milestone"
  ],
  "label": [
    "Headline 2 has more clicks than Headline 1",
    "Headline 1 has more clicks than Headline 2"
  ]
}
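A dataset in this shape can be sanity-checked before running generation. The helper below is illustrative (not part of hypogenic) and assumes only the parallel-list format described above; the file name is made up:

```python
import json

def validate_dataset(path: str) -> None:
    """Check that every key maps to a list of strings and all lists share one length."""
    with open(path) as f:
        data = json.load(f)
    assert "label" in data, "dataset must contain a 'label' key"
    lengths = {key: len(values) for key, values in data.items()}
    assert len(set(lengths.values())) == 1, f"list lengths differ: {lengths}"
    for key, values in data.items():
        assert all(isinstance(v, str) for v in values), f"'{key}' must hold strings"

# Example with a one-row version of the headline data written to disk
sample = {
    "headline_1": ["What Up, Comet? You Just Got *PROBED*"],
    "headline_2": ["Scientists Everywhere Were Holding Their Breath Today."],
    "label": ["Headline 2 has more clicks than Headline 1"],
}
with open("headline_test.json", "w") as f:
    json.dump(sample, f)
validate_dataset("headline_test.json")  # raises AssertionError on malformed files
```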
Important notes:
- Labels must match the extract_label() function output format
- Text feature keys may use descriptive names (review_text, post_content, etc.)

Each task requires a config.yaml file specifying:
Required elements:
Template capabilities:
${text_features_1}, ${num_hypotheses})Configuration structure:
task_name: your_task_name
train_data_path: ./your_task_train.json
val_data_path: ./your_task_val.json
test_data_path: ./your_task_test.json
prompt_templates:
  # Extra keys for reusable prompt components
  observations: |
    Feature 1: ${text_features_1}
    Feature 2: ${text_features_2}
    Observation: ${label}
  # Required templates
  batched_generation:
    system: "Your system prompt here"
    user: "Your user prompt with ${num_hypotheses} placeholder"
  inference:
    system: "Your inference system prompt"
    user: "Your inference user prompt"
  # Optional templates for advanced features
  few_shot_baseline: {...}
  is_relevant: {...}
  adaptive_inference: {...}
  adaptive_selection: {...}
Refer to references/config_template.yaml for a complete example configuration.
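The `${...}` placeholders follow the syntax of Python's `string.Template`, so filling a user prompt can be sketched as below. The template text and values are illustrative, not hypogenic's internal code:

```python
from string import Template

# A toy user prompt in the same placeholder style as config.yaml templates
user_template = Template(
    "Please propose ${num_hypotheses} hypotheses.\n"
    "Feature 1: ${text_features_1}\n"
    "Observation: ${label}"
)

# substitute() raises KeyError if any placeholder is left unfilled
prompt = user_template.substitute(
    num_hypotheses=3,
    text_features_1="What Up, Comet? You Just Got *PROBED*",
    label="Headline 1 has more clicks than Headline 2",
)
print(prompt)
```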
To use literature-based hypothesis generation, you must preprocess PDF papers:
Step 1: Setup GROBID (first time only)
bash ./modules/setup_grobid.sh
Step 2: Add PDF files Place research papers in literature/YOUR_TASK_NAME/raw/
Step 3: Process PDFs
# Start GROBID service
bash ./modules/run_grobid.sh
# Process PDFs for your task
cd examples
python pdf_preprocess.py --task_name YOUR_TASK_NAME
This converts the PDFs to a structured format for hypothesis extraction. Automated literature search will be supported in future releases.
hypogenic_generation --help
Key parameters:
hypogenic_inference --help
Key parameters:
For programmatic control and custom workflows, use Hypogenic directly in your Python code:
from hypogenic import BaseTask
# Clone example datasets first
# git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
# Load your task with custom extract_label function
task = BaseTask(
    config_path="./data/your_task/config.yaml",
    extract_label=lambda text: extract_your_label(text)
)
# Generate hypotheses
task.generate_hypotheses(
    method="hypogenic",
    num_hypotheses=20,
    output_path="./output/hypotheses.json"
)
# Run inference
results = task.inference(
    hypothesis_bank="./output/hypotheses.json",
    test_data="./data/your_task/your_task_test.json"
)
# For literature-integrated approaches
# git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data
# Generate with HypoRefine
task.generate_hypotheses(
    method="hyporefine",
    num_hypotheses=15,
    literature_path="./literature/your_task/",
    output_path="./output/"
)
# This generates 3 hypothesis banks:
# - HypoRefine (integrated approach)
# - Literature-only hypotheses
# - Literature∪HypoRefine (union)
from examples.multi_hyp_inference import run_multi_hypothesis_inference
# Test multiple hypotheses simultaneously
results = run_multi_hypothesis_inference(
    config_path="./data/your_task/config.yaml",
    hypothesis_bank="./output/hypotheses.json",
    test_data="./data/your_task/your_task_test.json"
)
The extract_label() function is critical for parsing LLM outputs. Implement it based on your task:
def extract_label(llm_output: str) -> str:
    r"""Extract predicted label from LLM inference text.

    Default behavior: searches for the 'final answer:\s+(.*)' pattern.
    Customize for your domain-specific output format.
    """
    import re
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    if match:
        return match.group(1).strip()
    return llm_output.strip()
Important: Extracted labels must match the format of label values in your dataset for correct accuracy calculation.
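To see why format matching matters, here is a small, self-contained check of the default extraction pattern together with an exact-match accuracy computation; the sample outputs and labels are made up:

```python
import re

def extract_label(llm_output: str) -> str:
    """Default extraction: take everything after 'final answer:' (case-insensitive)."""
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    return match.group(1).strip() if match else llm_output.strip()

predictions = [
    extract_label("Reasoning... Final answer: Headline 1 has more clicks than Headline 2"),
    extract_label("Final answer: Headline 2 has more clicks than Headline 1"),
]
truths = [
    "Headline 1 has more clicks than Headline 2",
    "Headline 1 has more clicks than Headline 2",
]
# Accuracy is computed by exact string comparison, which is why the extracted
# label must match the dataset's 'label' format character for character.
accuracy = sum(p == t for p, t in zip(predictions, truths)) / len(truths)
print(accuracy)  # 0.5: one exact match out of two
```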
Scenario: Detecting AI-generated content without prior theoretical framework
Steps:
Prepare dataset with text samples and labels (human vs. AI-generated)
Create config.yaml with appropriate prompt templates
Run hypothesis generation:
hypogenic_generation --config config.yaml --method hypogenic --num_hypotheses 20
Run inference on test set:
hypogenic_inference --config config.yaml --hypotheses output/hypotheses.json --test_data data/test.json
Analyze results for patterns like formality, grammatical precision, and tone differences
Scenario: Deception detection in hotel reviews building on existing research
Steps:
Collect 10 relevant papers on linguistic deception cues
Prepare dataset with genuine and fraudulent reviews
Configure config.yaml with literature processing and data generation templates
Run HypoRefine:
hypogenic_generation --config config.yaml --method hyporefine --papers papers/ --num_hypotheses 15
Test hypotheses examining pronoun frequency, detail specificity, and other linguistic patterns
Compare literature-based and data-driven hypothesis performance
Scenario: Mental stress detection maximizing hypothesis diversity
Steps:
Generate literature hypotheses from mental health research papers
Generate data-driven hypotheses from social media posts
Run Union method to combine and deduplicate:
hypogenic_generation --config config.yaml --method union --literature_hypotheses lit_hyp.json
Inference captures both theoretical constructs (posting behavior changes) and data patterns (emotional language shifts)
Caching: Enable Redis caching to reduce API costs and computation time for repeated LLM calls
Parallel Processing: Leverage multiple workers for large-scale hypothesis generation and testing
Adaptive Refinement: Use challenging examples to iteratively improve hypothesis quality
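Hypogenic supports Redis caching for this, but the underlying idea can be sketched with a plain in-process dictionary keyed by a hash of the model name and prompt; `cached_llm_call` and `fake_llm` below are illustrative names, not hypogenic APIs:

```python
import hashlib
import json

_cache: dict[str, str] = {}  # stand-in for Redis: same idea, in process

def cached_llm_call(prompt: str, model: str, call_fn) -> str:
    """Return a cached response when the same (model, prompt) pair was seen before."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_fn(prompt)  # only pay for the API call on a cache miss
    return _cache[key]

calls = []
def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response to: {prompt}"

cached_llm_call("Generate 5 hypotheses", "gpt-4o", fake_llm)
cached_llm_call("Generate 5 hypotheses", "gpt-4o", fake_llm)
print(len(calls))  # 1: the second call is served from the cache
```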
Research using hypogenic has demonstrated:
Issue: Generated hypotheses are too generic Solution: Refine prompt templates in config.yaml to request more specific, testable hypotheses
Issue: Poor inference performance Solution: Ensure dataset has sufficient training examples, adjust hypothesis generation parameters, or increase number of hypotheses
Issue: Label extraction failures Solution: Implement custom extract_label() function for domain-specific output parsing
Issue: GROBID PDF processing fails Solution: Ensure GROBID service is running (bash ./modules/run_grobid.sh) and PDFs are valid research papers
To add a new task or dataset to Hypogenic:
Create three JSON files following the required format:
- your_task_train.json
- your_task_val.json
- your_task_test.json

Each file must have keys for text features (text_features_1, etc.) and label.
Define your task configuration with:
- Prompt templates with variable placeholders (e.g., ${text_features_1}, ${num_hypotheses})

Create a custom label extraction function that parses LLM outputs for your domain:
from hypogenic import BaseTask
def extract_my_label(llm_output: str) -> str:
    """Custom label extraction for your task.

    Must return labels in the same format as the dataset 'label' field.
    """
    # Example: extract from a specific format
    if "Final prediction:" in llm_output:
        return llm_output.split("Final prediction:")[-1].strip()
    # Fall back to the default pattern
    import re
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    return match.group(1).strip() if match else llm_output.strip()

# Use your custom task
task = BaseTask(
    config_path="./your_task/config.yaml",
    extract_label=extract_my_label
)
For HypoRefine/Union methods:
- Place research-paper PDFs in the literature/your_task_name/raw/ directory
- Process them with pdf_preprocess.py

Run hypothesis generation and inference using the CLI or Python API:
# CLI approach
hypogenic_generation --config your_task/config.yaml --method hypogenic --num_hypotheses 20
hypogenic_inference --config your_task/config.yaml --hypotheses output/hypotheses.json
# Or use Python API (see Python API Usage section)
Understanding the repository layout:
hypothesis-generation/
├── hypogenic/ # Core package code
├── hypogenic_cmd/ # CLI entry points
├── hypothesis_agent/ # HypoRefine agent framework
├── literature/ # Literature processing utilities
├── modules/ # GROBID and preprocessing modules
├── examples/ # Example scripts
│ ├── generation.py # Basic HypoGeniC generation
│ ├── union_generation.py # HypoRefine/Union generation
│ ├── inference.py # Single hypothesis inference
│ ├── multi_hyp_inference.py # Multiple hypothesis inference
│ └── pdf_preprocess.py # Literature PDF processing
├── data/ # Example datasets (clone separately)
├── tests/ # Unit tests
└── IO_prompting/ # Prompt templates and experiments
Key directories:
Liu, H., Huang, S., Hu, J., Zhou, Y., & Tan, C. (2025). HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation. arXiv preprint arXiv:2504.11524.
BibTeX:
@misc{liu2025hypobenchsystematicprincipledbenchmarking,
title={HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation},
author={Haokun Liu and Sicong Huang and Jingyu Hu and Yangqiaoyu Zhou and Chenhao Tan},
year={2025},
eprint={2504.11524},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2504.11524},
}
Liu, H., Zhou, Y., Li, M., Yuan, C., & Tan, C. (2024). Literature Meets Data: A Synergistic Approach to Hypothesis Generation. arXiv preprint arXiv:2410.17309.
BibTeX:
@misc{liu2024literaturemeetsdatasynergistic,
title={Literature Meets Data: A Synergistic Approach to Hypothesis Generation},
author={Haokun Liu and Yangqiaoyu Zhou and Mingxuan Li and Chenfei Yuan and Chenhao Tan},
year={2024},
eprint={2410.17309},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.17309},
}
Zhou, Y., Liu, H., Srivastava, T., Mei, H., & Tan, C. (2024). Hypothesis Generation with Large Language Models. In Proceedings of EMNLP Workshop of NLP for Science.
BibTeX:
@inproceedings{zhou2024hypothesisgenerationlargelanguage,
title={Hypothesis Generation with Large Language Models},
author={Yangqiaoyu Zhou and Haokun Liu and Tejes Srivastava and Hongyuan Mei and Chenhao Tan},
booktitle = {Proceedings of EMNLP Workshop of NLP for Science},
year={2024},
url={https://aclanthology.org/2024.nlp4science-1.10/},
}
Clone these repositories for ready-to-use examples:
# HypoGeniC examples (data-driven only)
git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
# HypoRefine/Union examples (literature + data)
git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data
For contributions or questions, visit the GitHub repository and check the issues page.
config_template.yaml - Complete example configuration file with all required prompt templates and parameters. This includes:
Scripts directory is available for:
Assets directory is available for: