npx skills add https://github.com/datadog-labs/agent-skills --skill dd-logs
Search, process, and archive logs with cost awareness.
Datadog Pup (dd-pup/pup) should already be installed:
go install github.com/datadog-labs/pup@latest
pup auth login
# Basic search
pup logs search --query="status:error" --from="1h"
# With filters
pup logs search --query="service:api status:error" --from="1h" --limit 100
# JSON output
pup logs search --query="@http.status_code:>=500" --from="1h" --json
| Query | Meaning |
|---|---|
| error | Full-text search |
| status:error | Tag equals |
| @http.status_code:500 | Attribute equals |
| @http.status_code:>=400 | Numeric range |
| service:api AND env:prod | Boolean |
| @message:*timeout* | Wildcard |
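When building searches programmatically, simple tag filters compose with AND as shown above. A minimal helper sketch (hypothetical, not part of pup; it handles plain key:value tags only, not @-prefixed attribute keys):

```python
def dd_query(**filters: str) -> str:
    """Join key:value filters with AND, per the query syntax above."""
    return " AND ".join(f"{key}:{value}" for key, value in filters.items())

print(dd_query(service="api", env="prod", status="error"))
# service:api AND env:prod AND status:error
```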
Process logs before indexing:
# List pipelines
pup logs pipelines list
# Create pipeline (JSON)
pup logs pipelines create --json @pipeline.json
{
  "name": "API Logs",
  "filter": {"query": "service:api"},
  "processors": [
    {
      "type": "grok-parser",
      "name": "Parse nginx",
      "source": "message",
      "grok": {"match_rules": "nginx_access %{IPORHOST:client_ip} %{DATA:method} %{DATA:path} %{NUMBER:status}"}
    },
    {
      "type": "status-remapper",
      "name": "Set severity",
      "sources": ["level", "severity"]
    },
    {
      "type": "attribute-remapper",
      "name": "Remap user_id",
      "sources": ["user_id"],
      "target": "usr.id"
    }
  ]
}
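The grok rule above is roughly equivalent to a regex with named groups. A minimal Python sketch, assuming the simple four-token log format the rule targets (the patterns are loose approximations of %{IPORHOST}, %{DATA}, and %{NUMBER}, not faithful grok matchers):

```python
import re

# Approximate stand-ins for the grok matchers:
# %{IPORHOST} / %{DATA} -> non-whitespace runs, %{NUMBER} -> digits
NGINX_LINE = re.compile(
    r"(?P<client_ip>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)"
)

def parse_nginx(message: str):
    """Extract client_ip, method, path, status, as the grok parser would."""
    m = NGINX_LINE.match(message)
    return m.groupdict() if m else None

print(parse_nginx("203.0.113.7 GET /api/users 500"))
# {'client_ip': '203.0.113.7', 'method': 'GET', 'path': '/api/users', 'status': '500'}
```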
Index only what matters:
{
  "name": "Drop debug logs",
  "filter": {"query": "status:debug"},
  "is_enabled": true
}
# Find noisiest log sources
pup logs search --query="*" --from="1h" --json | jq 'group_by(.service) | map({service: .[0].service, count: length}) | sort_by(-.count)[:10]'
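If jq isn't available, the same top-N ranking can be sketched in Python. This assumes --json emits one JSON object per line, each with a service field (the exact pup output shape is an assumption):

```python
import json
from collections import Counter

def noisiest_services(json_lines: str, top: int = 10):
    """Count logs per service and return the top-N noisiest, like the jq pipeline."""
    counts = Counter(
        json.loads(line)["service"]
        for line in json_lines.splitlines()
        if line.strip()
    )
    return counts.most_common(top)

logs = "\n".join([
    '{"service": "api", "message": "err"}',
    '{"service": "api", "message": "err"}',
    '{"service": "web", "message": "ok"}',
])
print(noisiest_services(logs))
# [('api', 2), ('web', 1)]
```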
| Exclude | Query |
|---|---|
| Health checks | @http.url:"/health" OR @http.url:"/ready" |
| Debug logs | status:debug |
| Static assets | @http.url:*.css OR @http.url:*.js |
| Heartbeats | @message:*heartbeat* |
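To gauge what an exclusion filter is worth, a back-of-the-envelope estimate helps. The per-million-events price below is a placeholder, not Datadog's actual rate:

```python
def monthly_savings(events_per_day: float, excluded_fraction: float,
                    price_per_million: float = 1.70) -> float:
    """Estimate monthly indexing cost avoided by excluding a fraction of events.

    price_per_million is a placeholder rate; check your actual contract pricing.
    """
    excluded_per_month = events_per_day * 30 * excluded_fraction
    return excluded_per_month / 1_000_000 * price_per_million

# e.g. 50M events/day where health checks are 40% of volume
print(round(monthly_savings(50_000_000, 0.40), 2))
# 1020.0
```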
Store logs cheaply for compliance:
# List archives
pup logs archives list
# Archive config (S3 example)
{
  "name": "compliance-archive",
  "query": "*",
  "destination": {
    "type": "s3",
    "bucket": "my-logs-archive",
    "path": "/datadog"
  },
  "rehydration_tags": ["team:platform"]
}
# Rehydrate archived logs
pup logs rehydrate create \
--archive-id abc123 \
--from "2024-01-01T00:00:00Z" \
--to "2024-01-02T00:00:00Z" \
--query "service:api status:error"
Create metrics from logs (cheaper than indexing):
# Count errors per service
pup logs metrics create \
--name "api.errors.count" \
--query "service:api status:error" \
--group-by "endpoint"
⚠️ Cardinality warning: Group by bounded values only.
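The cardinality warning can be made concrete: each distinct group-by value becomes its own metric series, so a bounded tag like endpoint stays cheap while an unbounded one like user_id creates a series per user. A hypothetical illustration (not pup output):

```python
def series_count(logs: list, group_by: str) -> int:
    """Number of metric series a group-by would create: one per distinct value."""
    return len({log.get(group_by) for log in logs})

# 1000 users hitting 2 endpoints
logs = [{"endpoint": f"/api/{name}", "user_id": f"u{i}"}
        for i in range(1000) for name in ("users", "orders")]
print(series_count(logs, "endpoint"))  # bounded: 2 series
print(series_count(logs, "user_id"))  # unbounded: 1000 series
```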
{
  "type": "hash-remapper",
  "name": "Hash emails",
  "sources": ["email", "@user.email"]
}
# In your app - sanitize before sending
import re

def sanitize_log(message: str) -> str:
    # Remove credit card numbers
    message = re.sub(r'\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b', '[REDACTED]', message)
    # Remove SSNs
    message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[REDACTED]', message)
    return message
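The same pattern extends to other identifiers. For example, a sketch that redacts email addresses before the log leaves the process (the regex is deliberately simple and will miss unusual address forms):

```python
import re

EMAIL = re.compile(r'\b[\w.+-]+@[\w-]+\.[\w.-]+\b')

def sanitize_emails(message: str) -> str:
    """Replace anything email-shaped with a redaction marker."""
    return EMAIL.sub('[REDACTED]', message)

print(sanitize_emails("login failed for alice@example.com"))
# login failed for [REDACTED]
```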
| Problem | Fix |
|---|---|
| Logs not appearing | Check agent, pipeline filters |
| High costs | Add exclusion filters |
| Search slow | Narrow time range, use indexes |
| Missing attributes | Check grok parser |