agentic-development by alinaqi/claude-bootstrap
npx skills add https://github.com/alinaqi/claude-bootstrap --skill agentic-development
Load with: base.md + llm-patterns.md + [language].md
For building autonomous AI agents that perform multi-step tasks with tools.
Sources: Claude Agent SDK | Anthropic Claude Code Best Practices | Pydantic AI | Google Gemini Agent Development | OpenAI Building Agents
| Language/Framework | Default | Why |
|---|---|---|
| Python | Pydantic AI | Type-safe, Pydantic validation, multi-model support, production-ready |
| Node.js / Next.js | Claude Agent SDK | Official Anthropic SDK, tool support, multi-agent, native streaming |
from pydantic_ai import Agent
from pydantic import BaseModel
class SearchResult(BaseModel):
title: str
url: str
summary: str
agent = Agent(
'claude-sonnet-4-20250514',
result_type=list[SearchResult],
system_prompt='You are a research assistant.',
)
# Type-safe result
result = await agent.run('Find articles about AI agents')
for item in result.data:
print(f"{item.title}: {item.url}")
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
// Define tools
const tools: Anthropic.Tool[] = [
{
name: "web_search",
description: "Search the web for information",
input_schema: {
type: "object",
properties: {
query: { type: "string", description: "Search query" },
},
required: ["query"],
},
},
];
// Agentic loop
async function runAgent(prompt: string) {
const messages: Anthropic.MessageParam[] = [
{ role: "user", content: prompt },
];
while (true) {
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 4096,
tools,
messages,
});
// Check for tool use
if (response.stop_reason === "tool_use") {
const toolUse = response.content.find((b) => b.type === "tool_use");
if (toolUse) {
const result = await executeTool(toolUse.name, toolUse.input);
messages.push({ role: "assistant", content: response.content });
messages.push({
role: "user",
content: [{ type: "tool_result", tool_use_id: toolUse.id, content: result }],
});
continue;
}
}
// Done - return final response
return response.content.find((b) => b.type === "text")?.text;
}
}
Plan first, act incrementally, verify always.
Agents that research and plan before executing consistently outperform those that jump straight to action. Break complex tasks into verifiable steps, use tools judiciously, and maintain clear state throughout execution.
┌─────────────────────────────────────────────────┐
│ AGENT │
├─────────────────────────────────────────────────┤
│ Model (Brain) │ LLM for reasoning & │
│ │ decision-making │
├─────────────────────┼───────────────────────────┤
│ Tools (Arms/Legs) │ APIs, functions, external │
│ │ systems for action │
├─────────────────────┼───────────────────────────┤
│ Instructions │ System prompts defining │
│ (Rules) │ behavior & boundaries │
└─────────────────────┴───────────────────────────┘
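The anatomy above maps naturally onto a plain configuration object. A minimal TypeScript sketch of that mapping; every name here is illustrative, not taken from any specific SDK:

```typescript
// Illustrative only: the three agent components as one config shape.
interface AgentConfig {
  model: string;                                   // the "brain": which LLM reasons and decides
  tools: { name: string; description: string }[];  // the "arms/legs": actions the agent may take
  instructions: string;                            // the "rules": system prompt and boundaries
}

const config: AgentConfig = {
  model: "claude-sonnet-4-20250514",
  tools: [{ name: "web_search", description: "Search the web for information" }],
  instructions: "You are a research assistant. Stay within scope.",
};
```

Keeping all three concerns in one declarative object makes it easy to audit what an agent can do before it runs.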
project/
├── src/
│ ├── agents/
│ │ ├── orchestrator.ts # Main agent coordinator
│ │ ├── specialized/ # Task-specific agents
│ │ │ ├── researcher.ts
│ │ │ ├── coder.ts
│ │ │ └── reviewer.ts
│ │ └── base.ts # Shared agent interface
│ ├── tools/
│ │ ├── definitions/ # Tool schemas
│ │ ├── implementations/ # Tool logic
│ │ └── registry.ts # Tool discovery
│ ├── prompts/
│ │ ├── system/ # Agent instructions
│ │ └── templates/ # Task templates
│ └── memory/
│ ├── conversation.ts # Short-term context
│ └── persistent.ts # Long-term storage
├── tests/
│ ├── agents/ # Agent behavior tests
│ ├── tools/ # Tool unit tests
│ └── evals/ # End-to-end evaluations
└── skills/ # Agent skills (Anthropic pattern)
├── skill-name/
│ ├── instructions.md
│ ├── scripts/
│ └── resources/
// Gather context before acting
async function explore(task: Task): Promise<Context> {
const relevantFiles = await agent.searchCodebase(task.query);
const existingPatterns = await agent.analyzePatterns(relevantFiles);
const dependencies = await agent.identifyDependencies(task);
return { relevantFiles, existingPatterns, dependencies };
}
// Plan explicitly before execution
async function plan(task: Task, context: Context): Promise<Plan> {
const prompt = `
Task: ${task.description}
Context: ${JSON.stringify(context)}
Create a step-by-step plan. For each step:
1. What action to take
2. What tools to use
3. How to verify success
4. What could go wrong
Output JSON with steps array.
`;
return await llmCall({ prompt, schema: PlanSchema });
}
// Execute with verification at each step
async function execute(plan: Plan): Promise<Result[]> {
const results: Result[] = [];
for (const step of plan.steps) {
// Execute single step
const result = await executeStep(step);
// Verify before continuing
if (!await verify(step, result)) {
// Self-correct or escalate
const corrected = await selfCorrect(step, result);
if (!corrected.success) {
return handleFailure(step, results);
}
}
results.push(result);
}
return results;
}
// Independent verification prevents overfitting
async function verify(step: Step, result: Result): Promise<boolean> {
// Run tests if available
if (step.testCommand) {
const testResult = await runCommand(step.testCommand);
if (!testResult.success) return false;
}
// Use LLM to verify against criteria
const verification = await llmCall({
prompt: `
Step: ${step.description}
Expected: ${step.successCriteria}
Actual: ${JSON.stringify(result)}
Does the result satisfy the success criteria?
Respond with { "passes": boolean, "reasoning": string }
`,
schema: VerificationSchema
});
return verification.passes;
}
// tools/definitions/file-operations.ts
import { z } from 'zod';
export const ReadFileTool = {
name: 'read_file',
description: 'Read contents of a file. Use before modifying any file.',
parameters: z.object({
path: z.string().describe('Absolute path to the file'),
startLine: z.number().optional().describe('Start line (1-indexed)'),
endLine: z.number().optional().describe('End line (1-indexed)'),
}),
// Risk level for guardrails (OpenAI pattern)
riskLevel: 'low' as const,
};
export const WriteFileTool = {
name: 'write_file',
description: 'Write content to a file. Always read first to understand context.',
parameters: z.object({
path: z.string().describe('Absolute path to the file'),
content: z.string().describe('Complete file content'),
}),
riskLevel: 'medium' as const,
// Require confirmation for high-risk operations
requiresConfirmation: true,
};
// tools/implementations/file-operations.ts
export async function readFile(
params: z.infer<typeof ReadFileTool.parameters>
): Promise<ToolResult> {
try {
const content = await fs.readFile(params.path, 'utf-8');
const lines = content.split('\n');
const start = (params.startLine ?? 1) - 1;
const end = params.endLine ?? lines.length;
return {
success: true,
data: lines.slice(start, end).join('\n'),
metadata: { totalLines: lines.length }
};
} catch (error) {
return {
success: false,
error: `Failed to read file: ${error.message}`
};
}
}
// Use platform-provided tools when available
const agent = createAgent({
tools: [
// Built-in tools (handled by platform)
{ type: 'web_search' },
{ type: 'code_interpreter' },
// Custom tools only when needed
{ type: 'function', function: customDatabaseTool },
],
});
Use one agent for most tasks. Multiple agents add complexity.
// Expose specialized agents as callable tools
const researchAgent = createAgent({
name: 'researcher',
instructions: 'You research topics and return structured findings.',
tools: [webSearchTool, documentReadTool],
});
const mainAgent = createAgent({
tools: [
{
type: 'function',
function: {
name: 'research_topic',
description: 'Delegate research to specialized agent',
parameters: ResearchQuerySchema,
handler: async (query) => researchAgent.run(query),
},
},
],
});
// One-way transfer between agents
const customerServiceAgent = createAgent({
tools: [
// Handoff to specialist when needed
{
name: 'transfer_to_billing',
description: 'Transfer to billing specialist for payment issues',
handler: async (context) => {
return { handoff: 'billing_agent', context };
},
},
],
});
// memory/conversation.ts
interface ConversationMemory {
messages: Message[];
maxTokens: number;
add(message: Message): void;
getContext(): Message[];
summarize(): Promise<string>;
}
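One way the ConversationMemory interface above might be realized is a sliding window that evicts the oldest messages once a token budget is exceeded. This is a minimal sketch under stated assumptions: the word-count heuristic stands in for a real tokenizer, and the class name is hypothetical.

```typescript
// Naive sliding-window memory: keeps recent messages within a token budget.
interface Message { role: "user" | "assistant"; content: string }

class SlidingWindowMemory {
  private messages: Message[] = [];
  constructor(private maxTokens: number) {}

  // Crude stand-in for a real tokenizer: count whitespace-separated words
  private estimateTokens(m: Message): number {
    return m.content.split(/\s+/).filter(Boolean).length;
  }

  add(message: Message): void {
    this.messages.push(message);
    // Evict oldest messages until the window fits the budget (keep at least one)
    while (this.totalTokens() > this.maxTokens && this.messages.length > 1) {
      this.messages.shift();
    }
  }

  totalTokens(): number {
    return this.messages.reduce((sum, m) => sum + this.estimateTokens(m), 0);
  }

  getContext(): Message[] {
    return [...this.messages];
  }
}
```

A production version would use the model's tokenizer and implement `summarize()` to compress evicted history instead of discarding it.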
// Maintain state across tool calls (Gemini pattern)
interface AgentState {
thoughtSignature?: string; // Encrypted reasoning state
conversationId: string; // For shared memory
currentPlan?: Plan;
completedSteps: Step[];
}
// memory/persistent.ts
interface PersistentMemory {
// Store learnings across sessions
store(key: string, value: any): Promise<void>;
retrieve(key: string): Promise<any>;
// Semantic search over past interactions
search(query: string, limit: number): Promise<Memory[]>;
}
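For illustration, the PersistentMemory interface above can be stubbed with a Map-backed store; the case-insensitive substring `search` is only a placeholder for real embedding-based semantic retrieval, and the class name is hypothetical.

```typescript
// Map-backed stand-in for PersistentMemory (no real semantic search).
class NaivePersistentMemory {
  private entries = new Map<string, string>();

  async store(key: string, value: string): Promise<void> {
    this.entries.set(key, value);
  }

  async retrieve(key: string): Promise<string | undefined> {
    return this.entries.get(key);
  }

  // Placeholder "semantic" search: case-insensitive substring match
  async search(query: string, limit: number): Promise<string[]> {
    const q = query.toLowerCase();
    return [...this.entries.values()]
      .filter((v) => v.toLowerCase().includes(q))
      .slice(0, limit);
  }
}
```

Swapping the Map for a vector store (and substring match for embedding similarity) upgrades this to the long-term memory the interface intends.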
// guards/index.ts
interface GuardrailConfig {
// Input validation
inputClassifier: (input: string) => Promise<SafetyResult>;
// Output validation
outputValidator: (output: string) => Promise<SafetyResult>;
// Tool risk assessment
toolRiskLevels: Record<string, 'low' | 'medium' | 'high'>;
// Actions requiring human approval
humanInTheLoop: string[];
}
async function executeWithGuardrails(
agent: Agent,
input: string,
config: GuardrailConfig
): Promise<Result> {
// 1. Check input safety
const inputCheck = await config.inputClassifier(input);
if (!inputCheck.safe) {
return { blocked: true, reason: inputCheck.reason };
}
// 2. Execute with tool monitoring
const result = await agent.run(input, {
beforeTool: async (tool, params) => {
const risk = config.toolRiskLevels[tool.name];
if (risk === 'high' || config.humanInTheLoop.includes(tool.name)) {
return await requestHumanApproval(tool, params);
}
return { approved: true };
},
});
// 3. Validate output
const outputCheck = await config.outputValidator(result.output);
if (!outputCheck.safe) {
return { blocked: true, reason: outputCheck.reason };
}
return result;
}
// Agent must stay within defined scope
const agentInstructions = `
You are a customer service agent for Acme Corp.
SCOPE BOUNDARIES (non-negotiable):
- Only answer questions about Acme products and services
- Never provide legal, medical, or financial advice
- Never access or modify data outside your authorized scope
- If a request is out of scope, politely decline and explain why
If you cannot complete a task within scope, notify the user
and request explicit approval before proceeding.
`;
| Task Complexity | Recommended Model | Notes |
|---|---|---|
| Simple, fast | gpt-5-mini, claude-haiku | Low latency |
| General purpose | gpt-4.1, claude-sonnet | Balanced |
| Complex reasoning | o4-mini, claude-opus | Higher accuracy |
| Deep planning | gpt-5 + reasoning, ultrathink | Maximum capability |
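The routing table can be encoded directly as a lookup. A sketch, where the model IDs are the table's own examples rather than an endorsement of specific versions:

```typescript
// Route a task to a model tier by declared complexity (tiers from the table above).
type Complexity = "simple" | "general" | "complex" | "deep";

const MODEL_BY_COMPLEXITY: Record<Complexity, string> = {
  simple: "claude-haiku",   // low latency
  general: "claude-sonnet", // balanced
  complex: "claude-opus",   // higher accuracy
  deep: "gpt-5",            // maximum capability
};

function pickModel(complexity: Complexity): string {
  return MODEL_BY_COMPLEXITY[complexity];
}
```

Centralizing the mapping means upgrading a tier is a one-line change rather than a hunt through call sites.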
// Use thinking_level for reasoning depth
const response = await gemini.generate({
model: 'gemini-3',
thinking_level: 'high', // For complex planning
temperature: 1.0, // Optimized for reasoning engine
});
// Preserve thought state across tool calls
const nextResponse = await gemini.generate({
thoughtSignature: response.thoughtSignature, // Required for function calling
// ... rest of params
});
// Trigger extended thinking with keywords
const thinkingLevels = {
'think': 'standard analysis',
'think hard': 'deeper reasoning',
'think harder': 'extensive analysis',
'ultrathink': 'maximum reasoning budget',
};
const prompt = `
Think hard about this problem before proposing a solution.
Task: ${task.description}
`;
describe('readFile tool', () => {
it('reads file content correctly', async () => {
const result = await readFile({ path: '/test/file.txt' });
expect(result.success).toBe(true);
expect(result.data).toContain('expected content');
});
});
describe('agent planning', () => {
it('creates plan before executing file modifications', async () => {
const trace = await agent.runWithTrace('Refactor the auth module');
// Verify planning happened first
const firstToolCall = trace.toolCalls[0];
expect(firstToolCall.name).toBe('read_file');
// Verify no writes without reads
const writeIndex = trace.toolCalls.findIndex(t => t.name === 'write_file');
const readIndex = trace.toolCalls.findIndex(t => t.name === 'read_file');
expect(readIndex).toBeLessThan(writeIndex);
});
});
// Run nightly, not in regular CI
describe('Agent Accuracy (Eval)', () => {
const testCases = loadTestCases('./evals/coding-tasks.json');
it.each(testCases)('completes $name correctly', async (testCase) => {
const result = await agent.run(testCase.input);
// Verify against expected outcomes
expect(result.filesModified).toEqual(testCase.expectedFiles);
expect(await runTests(testCase.testCommand)).toBe(true);
}, 120000);
});
project/
├── src/
│ ├── agents/
│ │ ├── __init__.py
│ │ ├── researcher.py # Research agent
│ │ ├── coder.py # Coding agent
│ │ └── orchestrator.py # Main coordinator
│ ├── tools/
│ │ ├── __init__.py
│ │ ├── web.py # Web search tools
│ │ ├── files.py # File operations
│ │ └── database.py # DB queries
│ ├── models/
│ │ ├── __init__.py
│ │ └── schemas.py # Pydantic models
│ └── deps.py # Dependencies
├── tests/
│ ├── test_agents.py
│ └── test_tools.py
└── pyproject.toml
from pydantic_ai import Agent, RunContext
from pydantic import BaseModel
from httpx import AsyncClient
class SearchResult(BaseModel):
title: str
url: str
snippet: str
class ResearchDeps(BaseModel):
http_client: AsyncClient
api_key: str
research_agent = Agent(
'claude-sonnet-4-20250514',
deps_type=ResearchDeps,
result_type=list[SearchResult],
system_prompt='You are a research assistant. Use tools to find information.',
)
@research_agent.tool
async def web_search(ctx: RunContext[ResearchDeps], query: str) -> list[dict]:
"""Search the web for information."""
response = await ctx.deps.http_client.get(
'https://api.search.com/search',
params={'q': query},
headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
)
return response.json()['results']
@research_agent.tool
async def read_webpage(ctx: RunContext[ResearchDeps], url: str) -> str:
"""Read and extract content from a webpage."""
response = await ctx.deps.http_client.get(url)
return response.text[:5000] # Truncate for context
# Usage
async def main():
async with AsyncClient() as client:
deps = ResearchDeps(http_client=client, api_key='...')
result = await research_agent.run(
'Find recent articles about LLM agents',
deps=deps,
)
for item in result.data:
print(f"- {item.title}")
from pydantic import BaseModel, Field
from pydantic_ai import Agent
class CodeReview(BaseModel):
summary: str = Field(description="Brief summary of the review")
issues: list[str] = Field(description="List of issues found")
suggestions: list[str] = Field(description="Improvement suggestions")
approval: bool = Field(description="Whether code is approved")
confidence: float = Field(ge=0, le=1, description="Confidence score")
review_agent = Agent(
'claude-sonnet-4-20250514',
result_type=CodeReview,
system_prompt='Review code for quality, security, and best practices.',
)
# Result is validated Pydantic model
result = await review_agent.run(f"Review this code:\n```python\n{code}\n```")
if result.data.approval:
print("Code approved!")
else:
for issue in result.data.issues:
print(f"Issue: {issue}")
from pydantic_ai import Agent
# Specialized agents
planner = Agent('claude-sonnet-4-20250514', system_prompt='Create detailed plans.')
executor = Agent('claude-sonnet-4-20250514', system_prompt='Execute tasks precisely.')
reviewer = Agent('claude-sonnet-4-20250514', system_prompt='Review and verify work.')
async def orchestrate(task: str):
# 1. Plan
plan = await planner.run(f"Create a plan for: {task}")
# 2. Execute each step
results = []
for step in plan.data.steps:
result = await executor.run(f"Execute: {step}")
results.append(result.data)
# 3. Review
review = await reviewer.run(
f"Review the results:\nTask: {task}\nResults: {results}"
)
return review.data
from pydantic_ai import Agent
agent = Agent('claude-sonnet-4-20250514')
async def stream_response(prompt: str):
async with agent.run_stream(prompt) as response:
async for chunk in response.stream():
print(chunk, end='', flush=True)
# Get final structured result
result = await response.get_data()
return result
import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
@pytest.fixture
def test_agent():
return Agent(
TestModel(), # Mock model for testing
result_type=str,
)
async def test_agent_response(test_agent):
result = await test_agent.run('Test prompt')
assert result.data is not None
# Test with specific responses
async def test_with_mock_response():
model = TestModel()
model.seed_response('Expected output')
agent = Agent(model)
result = await agent.run('Any prompt')
assert result.data == 'Expected output'
skills/
└── code-review/
├── instructions.md # How to perform code reviews
├── scripts/
│ └── run-linters.sh # Supporting scripts
└── resources/
└── checklist.md # Review checklist
# Code Review Skill
## When to Use
Activate this skill when asked to review code, PRs, or diffs.
## Process
1. Read the changed files completely
2. Run linters: `./scripts/run-linters.sh`
3. Check against resources/checklist.md
4. Provide structured feedback
## Output Format
- Summary (1-2 sentences)
- Issues found (severity: critical/major/minor)
- Suggestions for improvement
- Approval recommendation
async function loadSkill(skillName: string): Promise<Skill> {
const skillPath = `./skills/${skillName}`;
const instructions = await fs.readFile(`${skillPath}/instructions.md`, 'utf-8');
const scripts = await glob(`${skillPath}/scripts/*`);
const resources = await glob(`${skillPath}/resources/*`);
return {
name: skillName,
instructions,
scripts: scripts.map(s => ({ name: path.basename(s), path: s })),
resources: await Promise.all(resources.map(loadResource)),
};
}
"think" → Standard analysis
"think hard" → Deeper reasoning
"think harder" → Extensive analysis
"ultrathink" → Maximum reasoning
thinking_level: "high" | "low"
temperature: 1.0 (keep at 1.0 for reasoning)
thoughtSignature: <pass back for function calling>
Weekly Installs: 132
GitHub Stars: 530
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub: Pass | Socket: Warn | Snyk: Warn
Installed on: opencode (106), codex (101), gemini-cli (99), claude-code (96), cursor (88), github-copilot (80)