LLM Council by dair-ai/dair-academy-plugins
npx skills add https://github.com/dair-ai/dair-academy-plugins --skill 'LLM Council'
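This skill implements Karpathy's LLM Council concept, in which multiple open-weight LLMs deliberate on a query, powered entirely by Fireworks AI. The deliberation runs in three phases: each model answers the query individually, each model ranks the anonymized responses, and a Chairman model synthesizes the final answer.
All inference runs through Fireworks AI using open-weight models. The speed and pricing of Fireworks make it practical to run a multi-model deliberation that could be slow or expensive on other providers.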
Before running any phase, verify the Fireworks API key is set:
if [ -z "$FIREWORKS_API_KEY" ]; then
    echo "ERROR: FIREWORKS_API_KEY is not set."
    echo "Create a Fireworks AI account at: https://fireworks.ai/"
    echo "Then export it in your shell profile (~/.zshrc or ~/.bashrc):"
    echo ' export FIREWORKS_API_KEY="your_api_key_here"'
    exit 1
fi
echo "FIREWORKS_API_KEY is set."
Present these options to the user via AskUserQuestion (multiselect):
| Model | Fireworks ID | Provider |
|---|---|---|
| GLM 5 | accounts/fireworks/models/glm-5 | Z.ai |
| DeepSeek V3.1 | accounts/fireworks/models/deepseek-v3p1 | DeepSeek |
| DeepSeek V3.2 | accounts/fireworks/models/deepseek-v3p2 | DeepSeek |
| MiniMax M2.1 | accounts/fireworks/models/minimax-m2p1 | MiniMax |
| Kimi K2.5 | accounts/fireworks/models/kimi-k2p5 | Moonshot |
| Qwen3 235B | accounts/fireworks/models/qwen3-235b-a22b | Alibaba |
| Llama 4 Maverick | accounts/fireworks/models/llama4-maverick-instruct-basic | Meta |
Use AskUserQuestion to get the council members (multi-select) and the Chairman (single-select).
Note: AskUserQuestion supports at most 4 options per question. Since there are 7 models, either split model selection across two questions, or show 4 models and let the user name the rest through the "Other" option. A good default is to show 4 models in the first question and mention in the question text that the others are available via "Other". Rotate which 4 models are shown so that every model gets exposure.
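The rotation described in the note can be sketched as follows. This is only an illustration, assuming the rotation is driven by a simple integer offset; the helper name `split_for_question` and the `offset` parameter are hypothetical and not part of the skill.

```python
# Display names from the model table above.
ALL_MODELS = [
    "GLM 5", "DeepSeek V3.1", "DeepSeek V3.2", "MiniMax M2.1",
    "Kimi K2.5", "Qwen3 235B", "Llama 4 Maverick",
]

def split_for_question(models, offset=0, limit=4):
    """Return (shown, others): `limit` models starting at `offset`
    (wrapping around the list), and the remaining models."""
    if not models:
        return [], []
    limit = min(limit, len(models))
    start = offset % len(models)
    shown = [models[(start + i) % len(models)] for i in range(limit)]
    others = [m for m in models if m not in shown]
    return shown, others

# Hypothetical usage: a different offset per invocation rotates the options.
shown, others = split_for_question(ALL_MODELS, offset=2)
print("options:", shown)
print("mention under Other:", others)
```

The `shown` list would become the four `options`, and `others` would be named in the question text.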
Example AskUserQuestion for model selection (show 4, mention others):
question: "Which models should participate in the LLM Council? (Also available via Other: Llama 4 Maverick, Qwen3 235B, GLM 5)"
header: "Models"
multiSelect: true
options:
  - label: "DeepSeek V3.2"
    description: "DeepSeek's newest and most capable model"
  - label: "MiniMax M2.1"
    description: "MiniMax's strong open-weight model"
  - label: "Kimi K2.5"
    description: "Moonshot's strong open-weight model"
  - label: "DeepSeek V3.1"
    description: "DeepSeek's proven reasoning model"
Example AskUserQuestion for chairman:
question: "Which model should be the Chairman (synthesizes the final answer)?"
header: "Chairman"
multiSelect: false
options:
  - label: "DeepSeek V3.2 (Recommended)"
    description: "Newest DeepSeek, strong at comprehensive analysis"
  - label: "GLM 5"
    description: "Strong reasoning for synthesis"
  - label: "Kimi K2.5"
    description: "Strong at structured synthesis"
  - label: "MiniMax M2.1"
    description: "Strong open-weight model for synthesis"
Use this mapping to convert user selections to Fireworks model IDs:
MODEL_MAP = {
    "GLM 5": "accounts/fireworks/models/glm-5",
    "DeepSeek V3.1": "accounts/fireworks/models/deepseek-v3p1",
    "DeepSeek V3.2": "accounts/fireworks/models/deepseek-v3p2",
    "MiniMax M2.1": "accounts/fireworks/models/minimax-m2p1",
    "Kimi K2.5": "accounts/fireworks/models/kimi-k2p5",
    "Qwen3 235B": "accounts/fireworks/models/qwen3-235b-a22b",
    "Llama 4 Maverick": "accounts/fireworks/models/llama4-maverick-instruct-basic",
}
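As a rough sketch of how the mapping might be applied, the snippet below turns selected labels into the JSON array that the Phase 1 script reads from `MODELS`. The helper `resolve_model_ids` is hypothetical, and stripping the "(Recommended)" suffix is an assumption based on the chairman example above.

```python
import json

MODEL_MAP = {
    "GLM 5": "accounts/fireworks/models/glm-5",
    "DeepSeek V3.1": "accounts/fireworks/models/deepseek-v3p1",
    "DeepSeek V3.2": "accounts/fireworks/models/deepseek-v3p2",
    "MiniMax M2.1": "accounts/fireworks/models/minimax-m2p1",
    "Kimi K2.5": "accounts/fireworks/models/kimi-k2p5",
    "Qwen3 235B": "accounts/fireworks/models/qwen3-235b-a22b",
    "Llama 4 Maverick": "accounts/fireworks/models/llama4-maverick-instruct-basic",
}

def resolve_model_ids(labels):
    """Map option labels to Fireworks model IDs, dropping duplicates.

    Raises ValueError for a label that is not in MODEL_MAP, so a typo in an
    "Other" answer surfaces before any API call is made.
    """
    ids = []
    for label in labels:
        name = label.replace("(Recommended)", "").strip()
        if name not in MODEL_MAP:
            raise ValueError(f"Unknown model label: {label!r}")
        model_id = MODEL_MAP[name]
        if model_id not in ids:
            ids.append(model_id)
    return ids

# Hypothetical selection, serialized the way the MODELS variable expects.
selected = ["DeepSeek V3.2 (Recommended)", "Kimi K2.5"]
models_json = json.dumps(resolve_model_ids(selected))
print(models_json)
```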
After gathering input, run this script to get responses from all selected models in parallel:
# Export so the Python heredoc below can read these via os.environ
export QUERY="USER_QUERY_HERE"
export MODELS='["accounts/fireworks/models/glm-5", "accounts/fireworks/models/deepseek-v3p1"]'
python3 << 'PYEOF'
import os
import json
import requests
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
QUERY = os.environ.get("QUERY", "")
MODELS = json.loads(os.environ.get("MODELS", "[]"))
# ThreadPoolExecutor rejects max_workers=0, so stop early on an empty list
if not MODELS:
    raise SystemExit("ERROR: MODELS is empty; select at least one model.")

# Create session directory
timestamp = time.strftime("%Y%m%d-%H%M%S")
SESSION_DIR = f"/tmp/llm-council/{timestamp}"
os.makedirs(SESSION_DIR, exist_ok=True)

# Save config
config = {"query": QUERY, "models": MODELS, "timestamp": timestamp}
with open(f"{SESSION_DIR}/config.json", "w") as f:
    json.dump(config, f, indent=2)

def call_model(model_id, query):
    """Call a single model via Fireworks AI"""
    try:
        start = time.time()
        response = requests.post(
            API_URL,
            headers={
                "Authorization": f"Bearer {FIREWORKS_API_KEY}",
                "Content-Type": "application/json"
            },
            json={
                "model": model_id,
                "messages": [
                    {"role": "system", "content": "You are participating in an LLM council deliberation. Provide your best, most thoughtful response to the query. Be comprehensive but focused."},
                    {"role": "user", "content": query}
                ],
                "max_tokens": 4000,
                "temperature": 1
            },
            timeout=120
        )
        response.raise_for_status()
        elapsed = time.time() - start
        data = response.json()
        usage = data.get("usage", {})
        return {
            "success": True,
            "content": data["choices"][0]["message"]["content"],
            "model": model_id,
            "latency_seconds": round(elapsed, 2),
            "tokens": {
                "prompt": usage.get("prompt_tokens", 0),
                "completion": usage.get("completion_tokens", 0),
                "total": usage.get("total_tokens", 0)
            }
        }
    except Exception as e:
        # Record the failure instead of aborting the whole council
        return {
            "success": False,
            "content": f"[ERROR: {str(e)}]",
            "model": model_id,
            "latency_seconds": 0,
            "tokens": {"prompt": 0, "completion": 0, "total": 0}
        }

print(f"\n{'='*60}")
print("PHASE 1: Collecting Individual Responses")
print(f"{'='*60}")
print(f"Query: {QUERY[:200]}...")
print(f"Models: {', '.join([m.split('/')[-1] for m in MODELS])}")
print(f"Session: {SESSION_DIR}")
print()

# Parallel execution
results = {}
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
    futures = {executor.submit(call_model, m, QUERY): m for m in MODELS}
    for future in as_completed(futures):
        model = futures[future]
        result = future.result()
        results[model] = result
        status = "OK" if result["success"] else "FAILED"
        latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
        print(f" [{status}] {model.split('/')[-1]} ({latency})")

# Save raw results
with open(f"{SESSION_DIR}/phase1_responses.json", "w") as f:
    json.dump(results, f, indent=2)
print(f"\nPhase 1 complete. Results saved to: {SESSION_DIR}/phase1_responses.json")
print(f"SESSION_DIR={SESSION_DIR}")
PYEOF
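Phase 1 stores failed calls alongside successful ones, with the error string in `content`. Before ranking, it may be worth checking how many usable responses came back. A minimal sketch, assuming the `phase1_responses.json` shape written by the script above; the helper `successful_responses`, the two-response threshold, and the sample data are illustrative only.

```python
def successful_responses(results):
    """Keep only Phase 1 entries that succeeded and returned non-empty text."""
    return {
        model_id: r
        for model_id, r in results.items()
        if r.get("success") and (r.get("content") or "").strip()
    }

# Illustrative data in the same shape as phase1_responses.json.
example = {
    "accounts/fireworks/models/glm-5": {
        "success": True, "content": "An answer.", "latency_seconds": 3.1,
    },
    "accounts/fireworks/models/deepseek-v3p1": {
        "success": False, "content": "[ERROR: timeout]", "latency_seconds": 0,
    },
}

usable = successful_responses(example)
print(f"{len(usable)} of {len(example)} responses usable")
if len(usable) < 2:
    # Assumed threshold: ranking a single response is not informative.
    print("Fewer than two usable responses; consider re-running Phase 1.")
```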
Each model reviews and ranks the anonymized responses from Phase 1:
# Export so the Python heredoc below can read it via os.environ
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
python3 << 'PYEOF'
import os
import json
import requests
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
SESSION_DIR = os.environ.get("SESSION_DIR")

# Load Phase 1 results
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1_results = json.load(f)
QUERY = config["query"]
MODELS = config["models"]

# Create anonymized mapping
labels = ["A", "B", "C", "D", "E", "F", "G"][:len(MODELS)]
model_to_label = dict(zip(MODELS, labels))
label_to_model = {v: k for k, v in model_to_label.items()}

# Format anonymized responses
anonymized_responses = []
for model_id in MODELS:
    label = model_to_label[model_id]
    content = phase1_results[model_id]["content"]
    anonymized_responses.append(f"=== Response {label} ===\n{content}")
anonymized_text = "\n\n".join(anonymized_responses)

def get_rankings(model_id, query, anonymized, own_label):
    """Get rankings from a single model"""
    ranking_prompt = f"""You are evaluating responses from multiple AI models to this query:
QUERY: {query}
Here are the anonymized responses:
{anonymized}
Please rank these responses from BEST to WORST. For each ranking:
1. State the response letter (A, B, C, etc.)
2. Give a brief reason (1-2 sentences)
3. You may skip ranking your own response (labeled {own_label}) or rank it fairly
Format your response EXACTLY as:
RANKINGS:
1. [Letter] - [Brief reason]
2. [Letter] - [Brief reason]
3. [Letter] - [Brief reason]
..."""
    try:
        start = time.time()
        response = requests.post(
            API_URL,
            headers={
                "Authorization": f"Bearer {FIREWORKS_API_KEY}",
                "Content-Type": "application/json"
            },
            json={
                "model": model_id,
                "messages": [
                    {"role": "system", "content": f"You are ranking AI responses objectively. Your own response is labeled '{own_label}'."},
                    {"role": "user", "content": ranking_prompt}
                ],
                "max_tokens": 1000,
                "temperature": 1
            },
            timeout=90
        )
        response.raise_for_status()
        elapsed = time.time() - start
        return {
            "success": True,
            "content": response.json()["choices"][0]["message"]["content"],
            "model": model_id,
            "latency_seconds": round(elapsed, 2)
        }
    except Exception as e:
        # Record the failure instead of aborting the whole phase
        return {
            "success": False,
            "content": f"[ERROR: {str(e)}]",
            "model": model_id,
            "latency_seconds": 0
        }

print(f"\n{'='*60}")
print("PHASE 2: Cross-Model Ranking")
print(f"{'='*60}")
print(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()})}")
print()

# Collect rankings from all models in parallel
rankings = {}
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
    futures = {
        executor.submit(get_rankings, mid, QUERY, anonymized_text, model_to_label[mid]): mid
        for mid in MODELS
    }
    for future in as_completed(futures):
        model = futures[future]
        result = future.result()
        rankings[model] = result
        status = "OK" if result["success"] else "FAILED"
        latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
        print(f" [{status}] {model.split('/')[-1]} ({latency})")

# Save rankings
output = {
    "label_mapping": label_to_model,
    "model_to_label": model_to_label,
    "rankings": rankings
}
with open(f"{SESSION_DIR}/phase2_rankings.json", "w") as f:
    json.dump(output, f, indent=2)
print(f"\nPhase 2 complete. Rankings saved to: {SESSION_DIR}/phase2_rankings.json")
PYEOF
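The skill passes the raw ranking text straight to the Chairman. If a numeric summary is also wanted, the replies could be parsed and averaged. The sketch below is not part of the skill: the regex, the helper names, and the sample replies are assumptions, and it only handles replies that follow the "N. Letter - reason" format requested in the prompt.

```python
import re
from collections import defaultdict

# Matches lines such as "1. B - reason" or "1. [B] - reason" (labels A-G).
RANK_LINE = re.compile(r"^\s*(\d+)\.\s*\[?([A-G])\]?\s*-", re.MULTILINE)

def parse_ranking(text):
    """Return response letters in ranked order (best first)."""
    _, _, body = text.partition("RANKINGS:")
    body = body or text  # fall back to the whole reply if the marker is missing
    pairs = sorted(((int(n), letter) for n, letter in RANK_LINE.findall(body)),
                   key=lambda p: p[0])
    ordered = []
    for _, letter in pairs:
        if letter not in ordered:
            ordered.append(letter)
    return ordered

def average_positions(rankings):
    """Average 1-based position per letter across successful rankers.

    Returns (average, letter) tuples sorted best first (lower is better).
    """
    positions = defaultdict(list)
    for result in rankings.values():
        if not result.get("success"):
            continue
        for pos, letter in enumerate(parse_ranking(result["content"]), start=1):
            positions[letter].append(pos)
    return sorted((sum(p) / len(p), letter) for letter, p in positions.items())

# Illustrative replies in the shape stored under "rankings" in phase2_rankings.json.
sample = {
    "m1": {"success": True, "content": "RANKINGS:\n1. B - Clear and complete.\n2. A - Correct but thin.\n3. C - Off topic."},
    "m2": {"success": True, "content": "RANKINGS:\n1. [A] - Most accurate.\n2. [B] - Good.\n3. [C] - Vague."},
    "m3": {"success": False, "content": "[ERROR: timeout]"},
}
for avg, letter in average_positions(sample):
    print(f"{letter}: average position {avg:.2f}")
```

Models that skip their own response, as the prompt allows, simply contribute fewer positions, so averages are taken over however many rankers mentioned each letter.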
The Chairman model receives all responses and rankings, then produces the final synthesis:
# Export so the Python heredoc below can read these via os.environ
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
export CHAIRMAN_MODEL="accounts/fireworks/models/glm-5"
python3 << 'PYEOF'
import os
import json
import requests
import time

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
SESSION_DIR = os.environ.get("SESSION_DIR")
CHAIRMAN_MODEL = os.environ.get("CHAIRMAN_MODEL")

# Load all previous results
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1 = json.load(f)
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
    phase2 = json.load(f)
QUERY = config["query"]
label_to_model = phase2["label_mapping"]
model_to_label = phase2["model_to_label"]

# Format responses with model names revealed
responses_text = []
for model_id, result in phase1.items():
    label = model_to_label.get(model_id, "?")
    model_name = model_id.split("/")[-1]
    responses_text.append(f"=== {label}: {model_name} ===\n{result['content']}")

# Format rankings
rankings_text = []
for model_id, result in phase2["rankings"].items():
    model_name = model_id.split("/")[-1]
    rankings_text.append(f"[{model_name}'s Rankings]\n{result['content']}")

synthesis_prompt = f"""You are the Chairman of an LLM Council. Your task is to synthesize the best possible answer from multiple AI responses.
ORIGINAL QUERY:
{QUERY}
INDIVIDUAL RESPONSES:
{chr(10).join(responses_text)}
MODEL RANKINGS:
{chr(10).join(rankings_text)}
As Chairman, produce a FINAL SYNTHESIS that:
1. Incorporates the strongest elements from the best-ranked responses
2. Resolves any contradictions between responses
3. Addresses aspects that multiple models agreed on
4. Corrects any errors identified through cross-ranking
5. Provides the most complete, accurate, and helpful answer
Begin your synthesis:"""

print(f"\n{'='*60}")
print("PHASE 3: Chairman Synthesis")
print(f"{'='*60}")
print(f"Chairman: {CHAIRMAN_MODEL.split('/')[-1]}")
print()

try:
    start = time.time()
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {FIREWORKS_API_KEY}",
            "Content-Type": "application/json"
        },
        json={
            "model": CHAIRMAN_MODEL,
            "messages": [
                {"role": "system", "content": "You are the Chairman of an LLM Council. Synthesize multiple AI perspectives into a definitive, comprehensive response."},
                {"role": "user", "content": synthesis_prompt}
            ],
            "max_tokens": 4000,
            "temperature": 1
        },
        timeout=180
    )
    response.raise_for_status()
    elapsed = time.time() - start
    synthesis = response.json()["choices"][0]["message"]["content"]
    with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
        f.write(synthesis)
    print(f"Phase 3 complete ({elapsed:.2f}s). Synthesis saved to: {SESSION_DIR}/phase3_synthesis.txt")
except Exception as e:
    # Save the error text so the final display step still finds the file
    print(f"ERROR: {e}")
    synthesis = f"[ERROR: {str(e)}]"
    with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
        f.write(synthesis)

# Update config with chairman
config["chairman"] = CHAIRMAN_MODEL
with open(f"{SESSION_DIR}/config.json", "w") as f:
    json.dump(config, f, indent=2)
PYEOF
Read all saved files and display the complete council deliberation:
# Export so the Python heredoc below can read it via os.environ
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
python3 << 'PYEOF'
import os
import json

SESSION_DIR = os.environ.get("SESSION_DIR")

# Load all data
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1 = json.load(f)
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
    phase2 = json.load(f)
with open(f"{SESSION_DIR}/phase3_synthesis.txt") as f:
    synthesis = f.read()
model_to_label = phase2["model_to_label"]
label_to_model = phase2["label_mapping"]

# Resolve the chairman name once; splitting the "N/A" fallback on "/" would yield "A"
chairman = config.get("chairman")
chairman_name = chairman.split("/")[-1] if chairman else "N/A"

# Build formatted output
output = []
output.append("=" * 70)
output.append(" LLM COUNCIL DELIBERATION")
output.append(" Powered by Fireworks AI")
output.append("=" * 70)
output.append("")
output.append(f"QUERY: {config['query']}")
output.append(f"COUNCIL: {', '.join([m.split('/')[-1] for m in config['models']])}")
output.append(f"CHAIRMAN: {chairman_name}")
output.append("")

# Phase 1: Individual Responses
output.append("-" * 70)
output.append(" PHASE 1: INDIVIDUAL RESPONSES")
output.append("-" * 70)
output.append("")
for model_id, result in phase1.items():
    model_name = model_id.split("/")[-1]
    label = model_to_label.get(model_id, "?")
    latency = result.get("latency_seconds", "N/A")
    tokens = result.get("tokens", {})
    output.append(f"[{label}] {model_name} (latency: {latency}s, tokens: {tokens.get('total', 'N/A')})")
    output.append("-" * 40)
    output.append(result["content"])
    output.append("")

# Phase 2: Cross-Model Rankings
output.append("-" * 70)
output.append(" PHASE 2: CROSS-MODEL RANKINGS")
output.append("-" * 70)
output.append("")
output.append(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()}, indent=2)}")
output.append("")
for model_id, result in phase2["rankings"].items():
    model_name = model_id.split("/")[-1]
    output.append(f"[{model_name}'s Rankings]")
    output.append(result["content"])
    output.append("")

# Phase 3: Chairman Synthesis
output.append("-" * 70)
output.append(" PHASE 3: CHAIRMAN'S SYNTHESIS")
output.append("-" * 70)
output.append("")
output.append(f"[{chairman_name} - Chairman]")
output.append("")
output.append(synthesis)
output.append("")
output.append("=" * 70)
output.append(f"Session files: {SESSION_DIR}/")

# Save formatted output
final_output = "\n".join(output)
with open(f"{SESSION_DIR}/final_output.md", "w") as f:
    f.write(final_output)
print(final_output)
print(f"\nFull output saved to: {SESSION_DIR}/final_output.md")
PYEOF
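The later phases expect `SESSION_DIR` to point at the directory that Phase 1 created. When the `SESSION_DIR=...` line printed by Phase 1 is no longer at hand, the most recent session can be looked up. A small sketch, assuming sessions live under `/tmp/llm-council` and use the `%Y%m%d-%H%M%S` naming from Phase 1, so that name order matches creation order; the helper `latest_session` is hypothetical.

```python
from pathlib import Path

def latest_session(base="/tmp/llm-council"):
    """Return the newest session directory under `base`, or None if none exist.

    Session names are timestamps, so the lexicographically largest name
    is the most recent session.
    """
    root = Path(base)
    if not root.is_dir():
        return None
    sessions = sorted(p for p in root.iterdir() if p.is_dir())
    return sessions[-1] if sessions else None

session = latest_session()
if session is None:
    print("No council sessions found; run Phase 1 first.")
else:
    # Print a line that can be pasted into the shell before the later phases.
    print(f'export SESSION_DIR="{session}"')
```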
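Each run creates a unique session directory, /tmp/llm-council/{timestamp}/, which holds config.json, phase1_responses.json, phase2_rankings.json, phase3_synthesis.txt, and final_output.md.
To set up, create a Fireworks AI account at https://fireworks.ai/ and get your API key from the dashboard.
Export it in your shell profile:
export FIREWORKS_API_KEY="your_api_key_here"
Restart your terminal or run source ~/.zshrc (or source ~/.bashrc, depending on your shell).
Invoke this skill when you want multiple open-weight AI perspectives on a question.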