senior-prompt-engineer by alirezarezvani/claude-skills
npx skills add https://github.com/alirezarezvani/claude-skills --skill senior-prompt-engineer
Prompt engineering patterns, LLM evaluation frameworks, and agentic system design.
# Analyze and optimize a prompt file
python scripts/prompt_optimizer.py prompts/my_prompt.txt --analyze
# Evaluate RAG retrieval quality
python scripts/rag_evaluator.py --contexts contexts.json --questions questions.json
# Visualize agent workflow from definition
python scripts/agent_orchestrator.py agent_config.yaml --visualize
Analyzes prompts for token efficiency, clarity, and structure. Generates optimized versions.
Input: Prompt text file or string
Output: Analysis report with optimization suggestions
Usage:
# Analyze a prompt file
python scripts/prompt_optimizer.py prompt.txt --analyze
# Output:
# Token count: 847
# Estimated cost: $0.0025 (GPT-4)
# Clarity score: 72/100
# Issues found:
# - Ambiguous instruction at line 3
# - Missing output format specification
# - Redundant context (lines 12-15 repeat lines 5-8)
# Suggestions:
# 1. Add explicit output format: "Respond in JSON with keys: ..."
# 2. Remove redundant context to save 89 tokens
# 3. Clarify "analyze" -> "list the top 3 issues with severity ratings"
# Generate optimized version
python scripts/prompt_optimizer.py prompt.txt --optimize --output optimized.txt
# Count tokens for cost estimation
python scripts/prompt_optimizer.py prompt.txt --tokens --model gpt-4
# Extract and manage few-shot examples
python scripts/prompt_optimizer.py prompt.txt --extract-examples --output examples.json
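The --tokens estimate above can be approximated without any tooling; a minimal sketch using the common ~4-characters-per-token heuristic (prompt_optimizer.py presumably uses an exact tokenizer, and the price here is an assumed example rate, not a quoted one):

```python
# Rough approximation of the --tokens step; the real script presumably
# uses an exact tokenizer rather than this characters/4 heuristic.
def estimate_tokens(text: str) -> int:
    """Approximate token count (~4 characters per token for English text)."""
    return max(1, len(text) // 4)

def estimate_cost(tokens: int, price_per_1k: float = 0.03) -> float:
    """Estimated input cost in USD; price_per_1k is an assumed example rate."""
    return tokens / 1000 * price_per_1k
```

For exact counts, a tokenizer matched to the target model (e.g. tiktoken for GPT-4) is the usual choice.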
Evaluates Retrieval-Augmented Generation quality by measuring context relevance and answer faithfulness.
Input: Retrieved contexts (JSON) and questions/answers
Output: Evaluation metrics and quality report
Usage:
# Evaluate retrieval quality
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json
# Output:
# === RAG Evaluation Report ===
# Questions evaluated: 50
#
# Retrieval Metrics:
# Context Relevance: 0.78 (target: >0.80)
# Retrieval Precision@5: 0.72
# Coverage: 0.85
#
# Generation Metrics:
# Answer Faithfulness: 0.91
# Groundedness: 0.88
#
# Issues Found:
# - 8 questions had no relevant context in top-5
# - 3 answers contained information not in context
#
# Recommendations:
# 1. Improve chunking strategy for technical documents
# 2. Add metadata filtering for date-sensitive queries
# Evaluate with custom metrics
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
--metrics relevance,faithfulness,coverage
# Export detailed results
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
--output report.json --verbose
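A metric like Retrieval Precision@5 in the report above reduces to a few lines once each question has relevance labels; an illustrative sketch (the evaluator's exact scoring may differ):

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved contexts that are relevant."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / len(top_k)

def mean_precision_at_k(runs, k=5):
    """Average precision@k over (retrieved, relevant) pairs, one per question."""
    scores = [precision_at_k(ret, rel, k) for ret, rel in runs]
    return sum(scores) / len(scores) if scores else 0.0
```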
Parses agent definitions and visualizes execution flows. Validates tool configurations.
Input: Agent configuration (YAML/JSON)
Output: Workflow visualization, validation report
Usage:
# Validate agent configuration
python scripts/agent_orchestrator.py agent.yaml --validate
# Output:
# === Agent Validation Report ===
# Agent: research_assistant
# Pattern: ReAct
#
# Tools (4 registered):
# [OK] web_search - API key configured
# [OK] calculator - No config needed
# [WARN] file_reader - Missing allowed_paths
# [OK] summarizer - Prompt template valid
#
# Flow Analysis:
# Max depth: 5 iterations
# Estimated tokens/run: 2,400-4,800
# Potential infinite loop: No
#
# Recommendations:
# 1. Add allowed_paths to file_reader for security
# 2. Consider adding early exit condition for simple queries
# Visualize agent workflow (ASCII)
python scripts/agent_orchestrator.py agent.yaml --visualize
# Output:
# ┌─────────────────────────────────────────┐
# │ research_assistant │
# │ (ReAct Pattern) │
# └─────────────────┬───────────────────────┘
# │
# ┌────────▼────────┐
# │ User Query │
# └────────┬────────┘
# │
# ┌────────▼────────┐
# │ Think │◄──────┐
# └────────┬────────┘ │
# │ │
# ┌────────▼────────┐ │
# │ Select Tool │ │
# └────────┬────────┘ │
# │ │
# ┌─────────────┼─────────────┐ │
# ▼ ▼ ▼ │
# [web_search] [calculator] [file_reader]
# │ │ │ │
# └─────────────┼─────────────┘ │
# │ │
# ┌────────▼────────┐ │
# │ Observe │───────┘
# └────────┬────────┘
# │
# ┌────────▼────────┐
# │ Final Answer │
# └─────────────────┘
# Export workflow as Mermaid diagram
python scripts/agent_orchestrator.py agent.yaml --visualize --format mermaid
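The --validate checks above amount to a walk over the tool list; a minimal sketch, where field names such as tools, requires_api_key, and allowed_paths are assumptions inferred from the sample report, not the actual agent_orchestrator.py schema:

```python
# Minimal validation sketch over a parsed agent config dict.
# Field names are assumptions based on the sample report above.
def validate_agent(config: dict) -> list[str]:
    """Return a list of '[ERROR] ...' / '[WARN] ...' messages."""
    messages = []
    if "name" not in config:
        messages.append("[ERROR] missing agent name")
    for tool in config.get("tools", []):
        name = tool.get("name", "<unnamed>")
        if tool.get("requires_api_key") and not tool.get("api_key"):
            messages.append(f"[WARN] {name} - API key not configured")
        if name == "file_reader" and "allowed_paths" not in tool:
            messages.append(f"[WARN] {name} - Missing allowed_paths")
    return messages
```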
Use when improving an existing prompt's performance or reducing token costs.
Step 1: Baseline current prompt
python scripts/prompt_optimizer.py current_prompt.txt --analyze --output baseline.json
Step 2: Identify issues. Review the analysis report for problems such as ambiguous instructions, redundant context, and missing output format specifications.
Step 3: Apply optimization patterns
| Issue | Pattern to Apply |
|---|---|
| Ambiguous output | Add explicit format specification |
| Too verbose | Extract to few-shot examples |
| Inconsistent results | Add role/persona framing |
| Missing edge cases | Add constraint boundaries |
Step 4: Generate optimized version
python scripts/prompt_optimizer.py current_prompt.txt --optimize --output optimized.txt
Step 5: Compare results
python scripts/prompt_optimizer.py optimized.txt --analyze --compare baseline.json
# Shows: token reduction, clarity improvement, issues resolved
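The comparison in Step 5 amounts to diffing two analysis reports; a sketch assuming the JSON reports expose token_count and clarity_score fields (field names are illustrative):

```python
def compare_reports(baseline: dict, optimized: dict) -> dict:
    """Summarize token reduction and clarity change between two analyses."""
    before = baseline["token_count"]
    after = optimized["token_count"]
    return {
        "token_reduction": before - after,
        "token_reduction_pct": round(100 * (before - after) / before, 1),
        "clarity_delta": optimized["clarity_score"] - baseline["clarity_score"],
    }
```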
Step 6: Validate with test cases. Run both prompts against your evaluation set and compare outputs.
Use when creating examples for in-context learning.
Step 1: Define the task clearly
Task: Extract product entities from customer reviews
Input: Review text
Output: JSON with {product_name, sentiment, features_mentioned}
Step 2: Select diverse examples (3-5 recommended)
| Example Type | Purpose |
|---|---|
| Simple case | Shows basic pattern |
| Edge case | Handles ambiguity |
| Complex case | Multiple entities |
| Negative case | What NOT to extract |
Step 3: Format consistently
Example 1:
Input: "Love my new iPhone 15, the camera is amazing!"
Output: {"product_name": "iPhone 15", "sentiment": "positive", "features_mentioned": ["camera"]}
Example 2:
Input: "The laptop was okay but battery life is terrible."
Output: {"product_name": "laptop", "sentiment": "mixed", "features_mentioned": ["battery life"]}
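One way to keep the formatting consistent is to assemble the prompt programmatically from a list of examples; an illustrative sketch (the data structure is an assumption, not a format the scripts require):

```python
import json

def build_few_shot_prompt(task: str, examples: list[dict], query: str) -> str:
    """Render the task, consistently formatted examples, then the new query."""
    parts = [task, ""]
    for i, ex in enumerate(examples, 1):
        parts.append(f"Example {i}:")
        parts.append(f'Input: "{ex["input"]}"')
        parts.append(f"Output: {json.dumps(ex['output'])}")
        parts.append("")
    parts.append(f'Input: "{query}"')
    parts.append("Output:")
    return "\n".join(parts)
```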
Step 4: Validate example quality
python scripts/prompt_optimizer.py prompt_with_examples.txt --validate-examples
# Checks: consistency, coverage, format alignment
Step 5: Test with held-out cases. Ensure the model generalizes beyond your examples.
Use when you need reliable JSON/XML/structured responses.
Step 1: Define schema
{
"type": "object",
"properties": {
"summary": {"type": "string", "maxLength": 200},
"sentiment": {"enum": ["positive", "negative", "neutral"]},
"confidence": {"type": "number", "minimum": 0, "maximum": 1}
},
"required": ["summary", "sentiment"]
}
Step 2: Include schema in prompt
Respond with JSON matching this schema:
- summary (string, max 200 chars): Brief summary of the content
- sentiment (enum): One of "positive", "negative", "neutral"
- confidence (number 0-1): Your confidence in the sentiment
Step 3: Add format enforcement
IMPORTANT: Respond ONLY with valid JSON. No markdown, no explanation.
Start your response with { and end with }.
Step 4: Validate outputs
python scripts/prompt_optimizer.py structured_prompt.txt --validate-schema schema.json
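The validation in Step 4 can also be applied at runtime to each model response; a hand-rolled sketch of the checks implied by the Step 1 schema (a library such as jsonschema can enforce the full specification instead):

```python
import json

def check_response(raw: str) -> list[str]:
    """Validate a model response against the Step 1 schema; return errors."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    errors = []
    for key in ("summary", "sentiment"):
        if key not in data:
            errors.append(f"missing required key: {key}")
    if len(data.get("summary", "")) > 200:
        errors.append("summary exceeds 200 characters")
    if data.get("sentiment") not in (None, "positive", "negative", "neutral"):
        errors.append("sentiment not in allowed enum")
    conf = data.get("confidence")
    if conf is not None and not 0 <= conf <= 1:
        errors.append("confidence outside [0, 1]")
    return errors
```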
| File | Contains | Load when user asks about |
|---|---|---|
| references/prompt_engineering_patterns.md | 10 prompt patterns with input/output examples | "which pattern?", "few-shot", "chain-of-thought", "role prompting" |
| references/llm_evaluation_frameworks.md | Evaluation metrics, scoring methods, A/B testing | "how to evaluate?", "measure quality", "compare prompts" |
| references/agentic_system_design.md | Agent architectures (ReAct, Plan-Execute, Tool Use) | "build agent", "tool calling", "multi-agent" |
| Pattern | When to Use | Example |
|---|---|---|
| Zero-shot | Simple, well-defined tasks | "Classify this email as spam or not spam" |
| Few-shot | Complex tasks, consistent format needed | Provide 3-5 examples before the task |
| Chain-of-Thought | Reasoning, math, multi-step logic | "Think step by step..." |
| Role Prompting | Expertise needed, specific perspective | "You are an expert tax accountant..." |
| Structured Output | Need parseable JSON/XML | Include schema + format enforcement |
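As a concrete illustration of the table, the scaffolding for a few of these patterns can be captured in one small helper (the wording of each scaffold is illustrative, not prescriptive):

```python
def apply_pattern(task: str, pattern: str) -> str:
    """Wrap a task with a prompt-pattern scaffold from the table above."""
    scaffolds = {
        "zero-shot": task,
        "chain-of-thought": f"{task}\n\nThink step by step before giving your final answer.",
        "role": f"You are an expert tax accountant.\n\n{task}",
    }
    return scaffolds[pattern]
```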
# Prompt Analysis
python scripts/prompt_optimizer.py prompt.txt --analyze # Full analysis
python scripts/prompt_optimizer.py prompt.txt --tokens # Token count only
python scripts/prompt_optimizer.py prompt.txt --optimize # Generate optimized version
# RAG Evaluation
python scripts/rag_evaluator.py --contexts ctx.json --questions q.json # Evaluate
python scripts/rag_evaluator.py --contexts ctx.json --compare baseline # Compare to baseline
# Agent Development
python scripts/agent_orchestrator.py agent.yaml --validate # Validate config
python scripts/agent_orchestrator.py agent.yaml --visualize # Show workflow
python scripts/agent_orchestrator.py agent.yaml --estimate-cost # Token estimation
Weekly Installs: 223
Repository
GitHub Stars: 2.8K
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Warn
Installed on:
claude-code: 193
opencode: 170
gemini-cli: 166
codex: 162
cursor: 148
github-copilot: 135