Important prerequisite
Installing AI Skills requires a working proxy (VPN) connection with TUN mode enabled. This determines whether the installation completes successfully.
npx skills add https://github.com/nicepkg/ai-workflow --skill backtest-expert
Systematic approach to backtesting trading strategies based on professional methodology that prioritizes robustness over optimistic results.
Goal: Find strategies that "break the least", not strategies that "profit the most" on paper.
Principle: Add friction, stress test assumptions, and see what survives. If a strategy holds up under pessimistic conditions, it's more likely to work in live trading.
Use this skill when:
Define the edge in one sentence.
Example: "Stocks that gap up >3% on earnings and pull back to the previous day's close within the first hour provide a mean-reversion opportunity."
If you can't articulate the edge clearly, don't proceed to testing.
Define with complete specificity:
Critical: No subjective judgment allowed. Every decision must be rule-based and unambiguous.
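As a minimal sketch of what "rule-based and unambiguous" can look like, the example hypothesis (a gap up of more than 3% on earnings followed by a pullback to the previous close within the first hour) can be written as a function with no room for interpretation. The `EarningsDayBar` type, its field names, and `is_entry_signal` are hypothetical and chosen only for illustration; they are not part of the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarningsDayBar:
    """Hypothetical container for the three prices the rule needs."""
    prev_close: float       # previous session's closing price
    open_price: float       # opening price on the earnings day
    first_hour_low: float   # lowest price traded during the first hour


GAP_THRESHOLD = 0.03  # the ">3%" gap from the example hypothesis


def is_entry_signal(bar: EarningsDayBar) -> bool:
    """True when the stock gapped up more than 3% and then traded back
    down to (or through) the previous close within the first hour."""
    gap = (bar.open_price - bar.prev_close) / bar.prev_close
    gapped_up = gap > GAP_THRESHOLD
    pulled_back = bar.first_hour_low <= bar.prev_close
    return gapped_up and pulled_back


if __name__ == "__main__":
    print(is_entry_signal(EarningsDayBar(100.0, 104.0, 99.5)))   # gap and pullback
    print(is_entry_signal(EarningsDayBar(100.0, 102.0, 99.0)))   # gap too small
```

Every input is a number and every comparison is explicit, so two people running the same data get the same trades.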
Test over:
Examine initial results for basic viability. If fundamentally broken, iterate on hypothesis.
This is where 80% of testing time should be spent.
Parameter sensitivity:
Execution friction:
Time robustness:
Sample size:
Walk-forward analysis:
Warning signs:
Questions to answer:
Decision criteria:
Add friction everywhere:
Rationale: Strategies that survive pessimistic assumptions often outperform in live trading.
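One way to add friction is to charge every trade a per-side cost and watch how the average result degrades as that cost rises. The sketch below assumes simple (non-compounded) returns and a single combined cost figure per side covering commission and slippage. The function name, the trade returns, and the cost levels are all hypothetical.

```python
def average_net_return(gross_returns, cost_per_side):
    """Average trade return after subtracting a round-trip cost.

    cost_per_side is an assumed combined commission + slippage charge,
    expressed as a fraction of trade value and paid on entry and on exit.
    Subtracting it directly from the return is an approximation.
    """
    round_trip_cost = 2 * cost_per_side
    net = [r - round_trip_cost for r in gross_returns]
    return sum(net) / len(net)


if __name__ == "__main__":
    # Hypothetical gross returns for five trades.
    trades = [0.012, -0.008, 0.015, -0.004, 0.006]
    for cost in (0.0, 0.001, 0.002, 0.003):
        avg = average_net_return(trades, cost)
        print(f"cost per side {cost:.3%}: average net return {avg:.4%}")
```

With these made-up numbers the average return turns negative at the highest cost level, which is the kind of fragility the pessimistic assumptions are meant to expose.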
Look for parameter ranges where performance is stable, not optimal values that create performance spikes.
Good: Strategy profitable with stop loss anywhere from 1.5% to 3.0%
Bad: Strategy only works with stop loss at exactly 2.13%
Stable performance indicates genuine edge; narrow optima suggest curve-fitting.
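A rough way to tell a plateau from a spike is to sweep the parameter and measure what fraction of the tested values remain profitable. The sketch below is one possible check, not a method prescribed by the skill. The function names, the 80% cutoff, and the profit figures are hypothetical.

```python
def profitable_fraction(results):
    """Fraction of tested parameter values whose net profit is positive.

    results maps a parameter value (here, stop loss in percent) to the
    net profit of the backtest run with that value.
    """
    return sum(1 for profit in results.values() if profit > 0) / len(results)


def looks_like_plateau(results, min_fraction=0.8):
    """Assumed rule of thumb: most of the tested range must be profitable."""
    return profitable_fraction(results) >= min_fraction


if __name__ == "__main__":
    # Hypothetical sweep results: stop loss (%) -> net profit.
    robust = {1.5: 0.8, 2.0: 1.1, 2.5: 1.0, 3.0: 0.7}
    fragile = {1.5: -0.4, 2.0: -0.1, 2.13: 2.9, 2.5: -0.3, 3.0: -0.6}
    print("robust:", looks_like_plateau(robust))
    print("fragile:", looks_like_plateau(fragile))
```

The fragile sweep has the larger single best result, yet it fails the check because only one of its five settings is profitable.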
Wrong approach: Study hand-picked "market leaders" that worked
Right approach: Test every stock that met criteria, including those that failed
Selective examples create survivorship bias and overestimate strategy quality.
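The size of the distortion is easy to see with a small worked comparison. The returns below are invented for illustration: seven stocks met the entry criteria, and a researcher who studies only the two best-known winners sees a very different average from one who tests all seven.

```python
import statistics

# Hypothetical returns for every stock that met the entry criteria.
all_qualifiers = [0.30, 0.25, -0.10, -0.15, -0.20, 0.05, -0.05]

# The hand-picked "market leaders": only the cases that worked.
hand_picked = [r for r in all_qualifiers if r >= 0.25]

mean_all = statistics.mean(all_qualifiers)
mean_picked = statistics.mean(hand_picked)

if __name__ == "__main__":
    print(f"average over all qualifiers: {mean_all:.2%}")
    print(f"average over hand-picked winners: {mean_picked:.2%}")
```

The gap between the two averages comes entirely from which examples were included, not from the strategy itself.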
Intuition: Useful for generating hypotheses
Validation: Must be purely data-driven
Never let attachment to an idea influence interpretation of test results.
Recognize these patterns early to save time:
See references/failed_tests.md for detailed examples and diagnostic framework.
File: references/methodology.md
When to read: For detailed guidance on specific testing techniques.
Contents:
File: references/failed_tests.md
When to read: When a strategy fails tests, or when learning from past mistakes.
Contents:
Time allocation: Spend 20% of your time generating ideas and 80% trying to break them.
Context-free requirement: If a strategy requires "perfect context" to work, it's not robust enough for systematic trading.
Red flag: If backtest results look too good (>90% win rate, minimal drawdowns, perfect timing), audit carefully for look-ahead bias or data issues.
Tool limitations: Understand your backtesting platform's quirks (interpolation methods, handling of low liquidity, data alignment issues).
Statistical significance: Small edges require large sample sizes to prove. A 5% edge per trade needs 100+ trades to distinguish it from luck.
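The link between edge size and sample size can be sketched with a one-sample t-statistic on per-trade returns. The example rests on assumptions the text does not state: the "edge" is read as the mean return per trade, the per-trade standard deviation is taken to be 25%, and a t-statistic of about 2 is used as the bar for "distinguishable from luck". Under those assumptions the required count works out to 100 trades. The function names are hypothetical.

```python
import math
import statistics


def t_statistic(trade_returns):
    """t-statistic of the mean trade return against a true mean of zero."""
    n = len(trade_returns)
    mean = statistics.mean(trade_returns)
    stdev = statistics.stdev(trade_returns)  # sample standard deviation
    return mean / (stdev / math.sqrt(n))


def trades_needed(edge, per_trade_stdev, target_t=2.0):
    """Trades required for the expected t-statistic to reach target_t.

    Solves edge / (per_trade_stdev / sqrt(n)) = target_t for n.
    """
    return math.ceil((target_t * per_trade_stdev / edge) ** 2)


if __name__ == "__main__":
    # Assumed: 5% mean edge per trade, 25% standard deviation per trade.
    print(trades_needed(edge=0.05, per_trade_stdev=0.25))
    # Halving the edge quadruples the required sample.
    print(trades_needed(edge=0.025, per_trade_stdev=0.25))
```

Because the required count scales with the square of the ratio of volatility to edge, a noisier strategy or a smaller edge pushes the number well above 100.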
This skill focuses on systematic/quantitative backtesting where:
Discretionary traders study differently, so this skill may not apply to setups requiring subjective judgment.
Weekly Installs: 54
GitHub Stars: 142
First Seen: Jan 24, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (39), gemini-cli (33), claude-code (33), codex (32), cursor (31), github-copilot (26)