LLM Council Skill: Multi-Model AI Deliberation Tool Powered by Fireworks AI, with Open-Weight Models Deciding Collaboratively

LLM Council by dair-ai/dair-academy-plugins

85 GitHub Stars

Install command

npx skills add https://github.com/dair-ai/dair-academy-plugins --skill 'LLM Council'

LLM Council (Fireworks AI)

This skill implements Karpathy's LLM Council concept where multiple open-weight LLMs deliberate on a query, powered entirely by Fireworks AI:

Phase 1: All models respond to the query independently (in parallel)
Phase 2: Models rank each other's anonymized responses
Phase 3: A Chairman LLM synthesizes the final answer

All inference runs through Fireworks AI using open-weight models. The speed and pricing of Fireworks make it practical to run multi-model deliberation that could be slow or expensive on other providers.
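The three phases can be traced offline before any API key is involved. The following is a minimal sketch, not the skill's implementation: `run_council` and `fake_ask` are hypothetical names, the model calls are replaced by an injected stub, and the calls run sequentially rather than in parallel.

```python
from typing import Callable, Dict, List


def run_council(
    query: str,
    models: List[str],
    chairman: str,
    ask: Callable[[str, str], str],
) -> Dict[str, object]:
    """Run the three council phases using an injected ask(model, prompt) callable."""
    # Phase 1: every model answers the query independently.
    responses = {m: ask(m, query) for m in models}

    # Phase 2: each model ranks the anonymized responses (labels A, B, C, ...).
    labels = {m: chr(ord("A") + i) for i, m in enumerate(models)}
    anonymized = "\n\n".join(
        f"=== Response {labels[m]} ===\n{responses[m]}" for m in models
    )
    rankings = {
        m: ask(m, f"Rank these responses to: {query}\n\n{anonymized}")
        for m in models
    }

    # Phase 3: the chairman sees all responses and rankings, then synthesizes.
    synthesis_prompt = (
        f"QUERY: {query}\n\nRESPONSES:\n{anonymized}\n\n"
        "RANKINGS:\n" + "\n".join(rankings.values())
    )
    return {
        "responses": responses,
        "rankings": rankings,
        "synthesis": ask(chairman, synthesis_prompt),
    }


# Stub standing in for a real API call (an assumption for illustration only).
def fake_ask(model: str, prompt: str) -> str:
    return f"{model} saw {len(prompt)} chars"


result = run_council("What is 2 + 2?", ["model-a", "model-b"], "model-a", fake_ask)
print(result["synthesis"])
```

Passing the model call in as a parameter keeps the control flow visible and lets the same function be exercised without network access.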

CRITICAL RULES

ALWAYS use AskUserQuestion to let the user select council models (multiselect) and the Chairman model
ALWAYS save raw responses to files - never summarize or truncate API outputs
ALWAYS show full transparency - display all individual responses, all rankings, AND the final synthesis
NEVER skip the ranking phase - it is essential to the council deliberation process
Read from files for display - ensures content is shown unmodified
ALWAYS display the final output to the user after Phase 3 completes

Pre-flight Check

Before running any phase, verify the Fireworks API key is set:

if [ -z "$FIREWORKS_API_KEY" ]; then
  echo "ERROR: FIREWORKS_API_KEY is not set."
  echo "Create a Fireworks AI account at: https://fireworks.ai/"
  echo "Then export it in your shell profile (~/.zshrc or ~/.bashrc):"
  echo '  export FIREWORKS_API_KEY="your_api_key_here"'
  exit 1
fi
echo "FIREWORKS_API_KEY is set."

Available Models

Present these options to the user via AskUserQuestion (multiselect):

Model	Fireworks ID	Provider
GLM 5	accounts/fireworks/models/glm-5	Z.ai
DeepSeek V3.1	accounts/fireworks/models/deepseek-v3p1	DeepSeek
DeepSeek V3.2	accounts/fireworks/models/deepseek-v3p2	DeepSeek
MiniMax M2.1	accounts/fireworks/models/minimax-m2p1	MiniMax
Kimi K2.5	accounts/fireworks/models/kimi-k2p5	Moonshot
Qwen3 235B	accounts/fireworks/models/qwen3-235b-a22b	Alibaba
Llama 4 Maverick	accounts/fireworks/models/llama4-maverick-instruct-basic	Meta

Workflow

Step 1: Gather User Input

Use AskUserQuestion to get:

The query/question for the council (or accept it from the conversation)
Which models to include (multiselect, recommend 3-5 models)
Which model should be the Chairman (single select)

Note: AskUserQuestion supports max 4 options per question. Since there are 7 models, split model selection across two questions, or show the most popular 4 and let the user type "Other" for the rest. A good default is to show 4 models in the first question and note the others are available via "Other". Rotate which models are shown based on variety.

Example AskUserQuestion for model selection (show 4, mention others):

question: "Which models should participate in the LLM Council? (Also available via Other: Llama 4 Maverick, Qwen3 235B, GLM 5)"
header: "Models"
multiSelect: true
options:
  - label: "DeepSeek V3.2"
    description: "DeepSeek's newest and most capable model"
  - label: "MiniMax M2.1"
    description: "MiniMax's strong open-weight model"
  - label: "Kimi K2.5"
    description: "Moonshot's strong open-weight model"
  - label: "DeepSeek V3.1"
    description: "DeepSeek's proven reasoning model"

Example AskUserQuestion for chairman:

question: "Which model should be the Chairman (synthesizes the final answer)?"
header: "Chairman"
multiSelect: false
options:
  - label: "DeepSeek V3.2 (Recommended)"
    description: "Newest DeepSeek, strong at comprehensive analysis"
  - label: "GLM 5"
    description: "Strong reasoning for synthesis"
  - label: "Kimi K2.5"
    description: "Strong at structured synthesis"
  - label: "MiniMax M2.1"
    description: "Strong open-weight model for synthesis"
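The four-option limit noted in Step 1 means seven models need at least two questions. One way to produce the groups is sketched below; `split_options` is a hypothetical helper, and the grouping order is simply the order of the Available Models table rather than any popularity ranking.

```python
from typing import List

MAX_OPTIONS = 4  # per-question limit stated in the Step 1 note


def split_options(models: List[str], size: int = MAX_OPTIONS) -> List[List[str]]:
    """Split the model names into consecutive groups of at most `size`."""
    return [models[i:i + size] for i in range(0, len(models), size)]


ALL_MODELS = [
    "GLM 5",
    "DeepSeek V3.1",
    "DeepSeek V3.2",
    "MiniMax M2.1",
    "Kimi K2.5",
    "Qwen3 235B",
    "Llama 4 Maverick",
]

questions = split_options(ALL_MODELS)
for n, group in enumerate(questions, start=1):
    print(f"Question {n}: {group}")
```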

Model Name to ID Mapping

Use this mapping to convert user selections to Fireworks model IDs:

MODEL_MAP = {
    "GLM 5": "accounts/fireworks/models/glm-5",
    "DeepSeek V3.1": "accounts/fireworks/models/deepseek-v3p1",
    "DeepSeek V3.2": "accounts/fireworks/models/deepseek-v3p2",
    "MiniMax M2.1": "accounts/fireworks/models/minimax-m2p1",
    "Kimi K2.5": "accounts/fireworks/models/kimi-k2p5",
    "Qwen3 235B": "accounts/fireworks/models/qwen3-235b-a22b",
    "Llama 4 Maverick": "accounts/fireworks/models/llama4-maverick-instruct-basic",
}
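A small sketch of how the mapping might be applied: it strips the "(Recommended)" suffix used in the chairman example, rejects unknown names, and emits the JSON string that the Phase 1 script expects in `MODELS`. `resolve_models` is a hypothetical helper name, not part of the skill.

```python
import json
from typing import List

MODEL_MAP = {
    "GLM 5": "accounts/fireworks/models/glm-5",
    "DeepSeek V3.1": "accounts/fireworks/models/deepseek-v3p1",
    "DeepSeek V3.2": "accounts/fireworks/models/deepseek-v3p2",
    "MiniMax M2.1": "accounts/fireworks/models/minimax-m2p1",
    "Kimi K2.5": "accounts/fireworks/models/kimi-k2p5",
    "Qwen3 235B": "accounts/fireworks/models/qwen3-235b-a22b",
    "Llama 4 Maverick": "accounts/fireworks/models/llama4-maverick-instruct-basic",
}


def resolve_models(selections: List[str]) -> List[str]:
    """Convert option labels to Fireworks model IDs, preserving order."""
    ids: List[str] = []
    for label in selections:
        # Option labels may carry a "(Recommended)" suffix.
        name = label.replace("(Recommended)", "").strip()
        if name not in MODEL_MAP:
            raise ValueError(f"Unknown model selection: {label!r}")
        if MODEL_MAP[name] not in ids:  # ignore duplicate selections
            ids.append(MODEL_MAP[name])
    return ids


models_json = json.dumps(resolve_models(["DeepSeek V3.2 (Recommended)", "Kimi K2.5"]))
print(models_json)
```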

Step 2: Run Phase 1 - Individual Responses

After gathering input, run this script to get responses from all selected models in parallel:

# export so the Python child process can read these via os.environ
export QUERY="USER_QUERY_HERE"
export MODELS='["accounts/fireworks/models/glm-5", "accounts/fireworks/models/deepseek-v3p1"]'

python3 << 'PYEOF'
import os
import json
import requests
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

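QUERY = os.environ.get("QUERY", "")
MODELS = json.loads(os.environ.get("MODELS", "[]"))

# Fail early: an empty model list would otherwise crash ThreadPoolExecutor below
if not QUERY or not MODELS:
    raise SystemExit("ERROR: QUERY and MODELS must be set and exported before running Phase 1.")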

# Create session directory
timestamp = time.strftime("%Y%m%d-%H%M%S")
SESSION_DIR = f"/tmp/llm-council/{timestamp}"
os.makedirs(SESSION_DIR, exist_ok=True)

# Save config
config = {"query": QUERY, "models": MODELS, "timestamp": timestamp}
with open(f"{SESSION_DIR}/config.json", "w") as f:
    json.dump(config, f, indent=2)

def call_model(model_id, query):
    """Call a single model via Fireworks AI"""
    try:
        start = time.time()
        response = requests.post(
            API_URL,
            headers={
                "Authorization": f"Bearer {FIREWORKS_API_KEY}",
                "Content-Type": "application/json"
            },
            json={
                "model": model_id,
                "messages": [
                    {"role": "system", "content": "You are participating in an LLM council deliberation. Provide your best, most thoughtful response to the query. Be comprehensive but focused."},
                    {"role": "user", "content": query}
                ],
                "max_tokens": 4000,
                "temperature": 1
            },
            timeout=120
        )
        response.raise_for_status()
        elapsed = time.time() - start
        data = response.json()
        usage = data.get("usage", {})
        return {
            "success": True,
            "content": data["choices"][0]["message"]["content"],
            "model": model_id,
            "latency_seconds": round(elapsed, 2),
            "tokens": {
                "prompt": usage.get("prompt_tokens", 0),
                "completion": usage.get("completion_tokens", 0),
                "total": usage.get("total_tokens", 0)
            }
        }
    except Exception as e:
        return {
            "success": False,
            "content": f"[ERROR: {str(e)}]",
            "model": model_id,
            "latency_seconds": 0,
            "tokens": {"prompt": 0, "completion": 0, "total": 0}
        }

print(f"\n{'='*60}")
print("PHASE 1: Collecting Individual Responses")
print(f"{'='*60}")
print(f"Query: {QUERY[:200]}...")
print(f"Models: {', '.join([m.split('/')[-1] for m in MODELS])}")
print(f"Session: {SESSION_DIR}")
print()

# Parallel execution
results = {}
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
    futures = {executor.submit(call_model, m, QUERY): m for m in MODELS}
    for future in as_completed(futures):
        model = futures[future]
        result = future.result()
        results[model] = result
        status = "OK" if result["success"] else "FAILED"
        latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
        print(f"  [{status}] {model.split('/')[-1]} ({latency})")

# Save raw results
with open(f"{SESSION_DIR}/phase1_responses.json", "w") as f:
    json.dump(results, f, indent=2)

print(f"\nPhase 1 complete. Results saved to: {SESSION_DIR}/phase1_responses.json")
print(f"SESSION_DIR={SESSION_DIR}")
PYEOF
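Before moving on to Phase 2 it can help to check which models actually answered. The sketch below assumes the per-model record shape that `call_model` returns above; `summarize_phase1` is a hypothetical helper and the sample records are made up for illustration, not real API output.

```python
from typing import Dict, List, Optional


def summarize_phase1(results: Dict[str, dict]) -> dict:
    """Summarize successes, failures, token totals, and the slowest model."""
    ok: List[str] = [m for m, r in results.items() if r["success"]]
    failed: List[str] = [m for m, r in results.items() if not r["success"]]
    slowest: Optional[str] = max(
        ok, key=lambda m: results[m]["latency_seconds"], default=None
    )
    return {
        "succeeded": ok,
        "failed": failed,
        "total_tokens": sum(r["tokens"]["total"] for r in results.values()),
        "slowest": slowest,
    }


# Made-up sample in the same shape as phase1_responses.json entries.
sample = {
    "accounts/fireworks/models/glm-5": {
        "success": True,
        "content": "...",
        "model": "accounts/fireworks/models/glm-5",
        "latency_seconds": 3.2,
        "tokens": {"prompt": 40, "completion": 500, "total": 540},
    },
    "accounts/fireworks/models/deepseek-v3p1": {
        "success": False,
        "content": "[ERROR: timeout]",
        "model": "accounts/fireworks/models/deepseek-v3p1",
        "latency_seconds": 0,
        "tokens": {"prompt": 0, "completion": 0, "total": 0},
    },
}

summary = summarize_phase1(sample)
print(summary)
```

In a real session the `sample` dict would be replaced by the contents of `phase1_responses.json` loaded with `json.load`.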

Step 3: Run Phase 2 - Cross-Model Ranking

Each model reviews and ranks the anonymized responses from Phase 1:

# Replace TIMESTAMP_HERE with the SESSION_DIR value printed at the end of Phase 1
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"

python3 << 'PYEOF'
import os
import json
import requests
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
SESSION_DIR = os.environ.get("SESSION_DIR")

# Load Phase 1 results
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1_results = json.load(f)

QUERY = config["query"]
MODELS = config["models"]

# Create anonymized mapping
labels = ["A", "B", "C", "D", "E", "F", "G"][:len(MODELS)]
model_to_label = dict(zip(MODELS, labels))
label_to_model = {v: k for k, v in model_to_label.items()}

# Format anonymized responses
anonymized_responses = []
for model_id in MODELS:
    label = model_to_label[model_id]
    content = phase1_results[model_id]["content"]
    anonymized_responses.append(f"=== Response {label} ===\n{content}")

anonymized_text = "\n\n".join(anonymized_responses)

def get_rankings(model_id, query, anonymized, own_label):
    """Get rankings from a single model"""
    ranking_prompt = f"""You are evaluating responses from multiple AI models to this query:

QUERY: {query}

Here are the anonymized responses:

{anonymized}

Please rank these responses from BEST to WORST. For each ranking:
1. State the response letter (A, B, C, etc.)
2. Give a brief reason (1-2 sentences)
3. You may skip ranking your own response (labeled {own_label}) or rank it fairly

Format your response EXACTLY as:
RANKINGS:
1. [Letter] - [Brief reason]
2. [Letter] - [Brief reason]
3. [Letter] - [Brief reason]
..."""

    try:
        start = time.time()
        response = requests.post(
            API_URL,
            headers={
                "Authorization": f"Bearer {FIREWORKS_API_KEY}",
                "Content-Type": "application/json"
            },
            json={
                "model": model_id,
                "messages": [
                    {"role": "system", "content": f"You are ranking AI responses objectively. Your own response is labeled '{own_label}'."},
                    {"role": "user", "content": ranking_prompt}
                ],
                "max_tokens": 1000,
                "temperature": 1
            },
            timeout=90
        )
        response.raise_for_status()
        elapsed = time.time() - start
        return {
            "success": True,
            "content": response.json()["choices"][0]["message"]["content"],
            "model": model_id,
            "latency_seconds": round(elapsed, 2)
        }
    except Exception as e:
        return {
            "success": False,
            "content": f"[ERROR: {str(e)}]",
            "model": model_id,
            "latency_seconds": 0
        }

print(f"\n{'='*60}")
print("PHASE 2: Cross-Model Ranking")
print(f"{'='*60}")
print(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()})}")
print()

# Collect rankings from all models in parallel
rankings = {}
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
    futures = {
        executor.submit(get_rankings, mid, QUERY, anonymized_text, model_to_label[mid]): mid
        for mid in MODELS
    }
    for future in as_completed(futures):
        model = futures[future]
        result = future.result()
        rankings[model] = result
        status = "OK" if result["success"] else "FAILED"
        latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
        print(f"  [{status}] {model.split('/')[-1]} ({latency})")

# Save rankings
output = {
    "label_mapping": label_to_model,
    "model_to_label": model_to_label,
    "rankings": rankings
}
with open(f"{SESSION_DIR}/phase2_rankings.json", "w") as f:
    json.dump(output, f, indent=2)

print(f"\nPhase 2 complete. Rankings saved to: {SESSION_DIR}/phase2_rankings.json")
PYEOF
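The Phase 2 prompt asks for a fixed `RANKINGS:` format, which makes the raw ranking text reasonably easy to parse. The skill itself only stores and displays that text; the following is a minimal sketch, assuming the models follow the requested format, of how the rankings could be reduced to an average position per label. `parse_ranking` and `average_ranks` are hypothetical helper names, and the sample texts are invented.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Matches lines such as "1. B - reason" or "2. [A] - reason".
RANK_LINE = re.compile(r"^\s*(\d+)\.\s*\[?([A-G])\]?\s*-", re.MULTILINE)


def parse_ranking(text: str) -> List[str]:
    """Return response labels in the order they were ranked, best first."""
    body = text.split("RANKINGS:", 1)[-1]
    ordered: List[str] = []
    for _number, label in RANK_LINE.findall(body):
        if label not in ordered:  # keep only the first mention of a label
            ordered.append(label)
    return ordered


def average_ranks(rankings_by_model: Dict[str, str]) -> List[Tuple[str, float]]:
    """Average each label's position across all rankers (lower is better)."""
    positions = defaultdict(list)
    for text in rankings_by_model.values():
        for pos, label in enumerate(parse_ranking(text), start=1):
            positions[label].append(pos)
    averaged = [(label, sum(p) / len(p)) for label, p in positions.items()]
    return sorted(averaged, key=lambda item: (item[1], item[0]))


# Invented ranking texts in the requested format.
sample_rankings = {
    "model-x": "RANKINGS:\n1. B - Clear and correct\n2. A - Good but verbose\n3. C - Misses the point",
    "model-y": "RANKINGS:\n1. [A] - Most complete\n2. [B] - Solid\n3. [C] - Too brief",
}

aggregate = average_ranks(sample_rankings)
print(aggregate)
```

Averaging positions is only one possible aggregation; rankers that skip their own response will contribute fewer positions, which this sketch does not correct for.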

Step 4: Run Phase 3 - Chairman Synthesis

The Chairman model receives all responses and rankings, then produces the final synthesis:

# export both so the Python child process can read them via os.environ
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
export CHAIRMAN_MODEL="accounts/fireworks/models/glm-5"

python3 << 'PYEOF'
import os
import json
import requests
import time

FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
SESSION_DIR = os.environ.get("SESSION_DIR")
CHAIRMAN_MODEL = os.environ.get("CHAIRMAN_MODEL")

# Load all previous results
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1 = json.load(f)
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
    phase2 = json.load(f)

QUERY = config["query"]
label_to_model = phase2["label_mapping"]
model_to_label = phase2["model_to_label"]

# Format responses with model names revealed
responses_text = []
for model_id, result in phase1.items():
    label = model_to_label.get(model_id, "?")
    model_name = model_id.split("/")[-1]
    responses_text.append(f"=== {label}: {model_name} ===\n{result['content']}")

# Format rankings
rankings_text = []
for model_id, result in phase2["rankings"].items():
    model_name = model_id.split("/")[-1]
    rankings_text.append(f"[{model_name}'s Rankings]\n{result['content']}")

synthesis_prompt = f"""You are the Chairman of an LLM Council. Your task is to synthesize the best possible answer from multiple AI responses.

ORIGINAL QUERY:
{QUERY}

INDIVIDUAL RESPONSES:
{chr(10).join(responses_text)}

MODEL RANKINGS:
{chr(10).join(rankings_text)}

As Chairman, produce a FINAL SYNTHESIS that:
1. Incorporates the strongest elements from the best-ranked responses
2. Resolves any contradictions between responses
3. Addresses aspects that multiple models agreed on
4. Corrects any errors identified through cross-ranking
5. Provides the most complete, accurate, and helpful answer

Begin your synthesis:"""

print(f"\n{'='*60}")
print("PHASE 3: Chairman Synthesis")
print(f"{'='*60}")
print(f"Chairman: {CHAIRMAN_MODEL.split('/')[-1]}")
print()

try:
    start = time.time()
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {FIREWORKS_API_KEY}",
            "Content-Type": "application/json"
        },
        json={
            "model": CHAIRMAN_MODEL,
            "messages": [
                {"role": "system", "content": "You are the Chairman of an LLM Council. Synthesize multiple AI perspectives into a definitive, comprehensive response."},
                {"role": "user", "content": synthesis_prompt}
            ],
            "max_tokens": 4000,
            "temperature": 1
        },
        timeout=180
    )
    response.raise_for_status()
    elapsed = time.time() - start
    synthesis = response.json()["choices"][0]["message"]["content"]

    with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
        f.write(synthesis)

    print(f"Phase 3 complete ({elapsed:.2f}s). Synthesis saved to: {SESSION_DIR}/phase3_synthesis.txt")

except Exception as e:
    print(f"ERROR: {e}")
    synthesis = f"[ERROR: {str(e)}]"
    with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
        f.write(synthesis)

# Update config with chairman
config["chairman"] = CHAIRMAN_MODEL
with open(f"{SESSION_DIR}/config.json", "w") as f:
    json.dump(config, f, indent=2)
PYEOF

Step 5: Display Full Results

Read all saved files and display the complete council deliberation:

# export so the Python child process can read it via os.environ
export SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"

python3 << 'PYEOF'
import os
import json

SESSION_DIR = os.environ.get("SESSION_DIR")

# Load all data
with open(f"{SESSION_DIR}/config.json") as f:
    config = json.load(f)
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
    phase1 = json.load(f)
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
    phase2 = json.load(f)
with open(f"{SESSION_DIR}/phase3_synthesis.txt") as f:
    synthesis = f.read()

model_to_label = phase2["model_to_label"]
label_to_model = phase2["label_mapping"]

# Build formatted output
output = []
output.append("=" * 70)
output.append("                  LLM COUNCIL DELIBERATION")
output.append("                  Powered by Fireworks AI")
output.append("=" * 70)
output.append("")
output.append(f"QUERY: {config['query']}")
output.append(f"COUNCIL: {', '.join([m.split('/')[-1] for m in config['models']])}")
output.append(f"CHAIRMAN: {config.get('chairman', 'N/A').split('/')[-1]}")
output.append("")

# Phase 1: Individual Responses
output.append("-" * 70)
output.append("                 PHASE 1: INDIVIDUAL RESPONSES")
output.append("-" * 70)
output.append("")

for model_id, result in phase1.items():
    model_name = model_id.split("/")[-1]
    label = model_to_label.get(model_id, "?")
    latency = result.get("latency_seconds", "N/A")
    tokens = result.get("tokens", {})
    output.append(f"[{label}] {model_name} (latency: {latency}s, tokens: {tokens.get('total', 'N/A')})")
    output.append("-" * 40)
    output.append(result["content"])
    output.append("")

# Phase 2: Cross-Model Rankings
output.append("-" * 70)
output.append("                 PHASE 2: CROSS-MODEL RANKINGS")
output.append("-" * 70)
output.append("")
output.append(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()}, indent=2)}")
output.append("")

for model_id, result in phase2["rankings"].items():
    model_name = model_id.split("/")[-1]
    output.append(f"[{model_name}'s Rankings]")
    output.append(result["content"])
    output.append("")

# Phase 3: Chairman Synthesis
output.append("-" * 70)
output.append("                 PHASE 3: CHAIRMAN'S SYNTHESIS")
output.append("-" * 70)
output.append("")
chairman_name = config.get("chairman", "Chairman").split("/")[-1]
output.append(f"[{chairman_name} - Chairman]")
output.append("")
output.append(synthesis)
output.append("")
output.append("=" * 70)
output.append(f"Session files: {SESSION_DIR}/")

# Save formatted output
final_output = "\n".join(output)
with open(f"{SESSION_DIR}/final_output.md", "w") as f:
    f.write(final_output)

print(final_output)
print(f"\nFull output saved to: {SESSION_DIR}/final_output.md")
PYEOF

Important Notes

Session Directory: Each run creates a session in /tmp/llm-council/{timestamp}/ (the timestamp has one-second resolution)
Raw Data Preserved: All API responses are saved as-is to JSON files for full transparency
Cost: Fireworks pricing is per-token. More models and longer queries cost more. Check current pricing at https://fireworks.ai/pricing
Latency Tracking: Each API call tracks latency so you can see Fireworks' speed in action
Token Usage: Phase 1 responses include token counts for cost awareness
Rate Limits: If you hit rate limits, wait briefly and retry
Model Availability: Check https://app.fireworks.ai/ for current model status
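The "wait briefly and retry" advice for rate limits could be automated with exponential backoff. This is a sketch under stated assumptions, not something the skill's scripts do: `RateLimited` is a hypothetical marker exception that a caller would raise on an HTTP 429 response, and `with_backoff`, the attempt count, and the delays are illustrative choices.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker: raise this when the API answers HTTP 429."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")


# Demo with a stub that is rate limited twice, then succeeds.
delays: List[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited("429")
    return "ok"


value = with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(value, delays)
```

Injecting `sleep` keeps the demo instant; real use would leave the default `time.sleep` in place.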

Setup

Create a Fireworks AI account at https://fireworks.ai/ and grab your API key from the dashboard

Export it in your shell profile:

export FIREWORKS_API_KEY="your_api_key_here"

Restart your terminal or run source ~/.zshrc
Invoke this skill when you want multiple open-weight AI perspectives on a question
