npx skills add https://github.com/datadog-labs/agent-skills --skill dd-logs
Search, process, and archive logs with cost awareness.
Datadog Pup (dd-pup/pup) should already be installed:
go install github.com/datadog-labs/pup@latest
pup auth login
# Basic search
pup logs search --query="status:error" --from="1h"
# With filters
pup logs search --query="service:api status:error" --from="1h" --limit 100
# JSON output
pup logs search --query="@http.status_code:>=500" --from="1h" --json
| Query | Meaning |
|---|---|
| error | Full-text search |
| status:error | Tag equals |
| @http.status_code:500 | Attribute equals |
| @http.status_code:>=400 | Numeric range |
| service:api AND env:prod | Boolean |
| @message:*timeout* | Wildcard |
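When building searches programmatically, simple tag filters compose with AND as shown above. A minimal helper sketch (hypothetical, not part of pup; it handles plain key:value tags only, not @-prefixed attribute keys):

```python
def dd_query(**filters: str) -> str:
    """Join key:value filters with AND, per the query syntax above."""
    return " AND ".join(f"{key}:{value}" for key, value in filters.items())

print(dd_query(service="api", env="prod", status="error"))
# service:api AND env:prod AND status:error
```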
Process logs before indexing:
# List pipelines
pup logs pipelines list
# Create pipeline (JSON)
pup logs pipelines create --json @pipeline.json
{
  "name": "API Logs",
  "filter": {"query": "service:api"},
  "processors": [
    {
      "type": "grok-parser",
      "name": "Parse nginx",
      "source": "message",
      "grok": {"match_rules": "nginx_access %{IPORHOST:client_ip} %{DATA:method} %{DATA:path} %{NUMBER:status}"}
    },
    {
      "type": "status-remapper",
      "name": "Set severity",
      "sources": ["level", "severity"]
    },
    {
      "type": "attribute-remapper",
      "name": "Remap user_id",
      "sources": ["user_id"],
      "target": "usr.id"
    }
  ]
}
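The grok rule above is roughly equivalent to a regex with named groups. A minimal Python sketch, assuming the simple four-token log format the rule targets (the patterns are loose approximations of %{IPORHOST}, %{DATA}, and %{NUMBER}, not faithful grok matchers):

```python
import re

# Approximate stand-ins for the grok matchers:
# %{IPORHOST} / %{DATA} -> non-whitespace runs, %{NUMBER} -> digits
NGINX_LINE = re.compile(
    r"(?P<client_ip>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)"
)

def parse_nginx(message: str):
    """Extract client_ip, method, path, status, as the grok parser would."""
    m = NGINX_LINE.match(message)
    return m.groupdict() if m else None

print(parse_nginx("203.0.113.7 GET /api/users 500"))
# {'client_ip': '203.0.113.7', 'method': 'GET', 'path': '/api/users', 'status': '500'}
```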
Index only what matters:
{
  "name": "Drop debug logs",
  "filter": {"query": "status:debug"},
  "is_enabled": true
}
# Find noisiest log sources
pup logs search --query="*" --from="1h" --json | jq 'group_by(.service) | map({service: .[0].service, count: length}) | sort_by(-.count)[:10]'
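If jq isn't available, the same top-N ranking can be sketched in Python. This assumes --json emits one JSON object per line, each with a service field (the exact pup output shape is an assumption):

```python
import json
from collections import Counter

def noisiest_services(json_lines: str, top: int = 10):
    """Count logs per service and return the top-N noisiest, like the jq pipeline."""
    counts = Counter(
        json.loads(line)["service"]
        for line in json_lines.splitlines()
        if line.strip()
    )
    return counts.most_common(top)

logs = "\n".join([
    '{"service": "api", "message": "err"}',
    '{"service": "api", "message": "err"}',
    '{"service": "web", "message": "ok"}',
])
print(noisiest_services(logs))
# [('api', 2), ('web', 1)]
```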
| Exclude | Query |
|---|---|
| Health checks | @http.url:"/health" OR @http.url:"/ready" |
| Debug logs | status:debug |
| Static assets | @http.url:*.css OR @http.url:*.js |
| Heartbeats | @message:*heartbeat* |
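To gauge what an exclusion filter is worth, a back-of-the-envelope estimate helps. The per-million-events price below is a placeholder, not Datadog's actual rate:

```python
def monthly_savings(events_per_day: float, excluded_fraction: float,
                    price_per_million: float = 1.70) -> float:
    """Estimate monthly indexing cost avoided by excluding a fraction of events.

    price_per_million is a placeholder rate; check your actual contract pricing.
    """
    excluded_per_month = events_per_day * 30 * excluded_fraction
    return excluded_per_month / 1_000_000 * price_per_million

# e.g. 50M events/day where health checks are 40% of volume
print(round(monthly_savings(50_000_000, 0.40), 2))
# 1020.0
```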
Store logs cheaply for compliance:
# List archives
pup logs archives list
# Archive config (S3 example)
{
  "name": "compliance-archive",
  "query": "*",
  "destination": {
    "type": "s3",
    "bucket": "my-logs-archive",
    "path": "/datadog"
  },
  "rehydration_tags": ["team:platform"]
}
# Rehydrate archived logs
pup logs rehydrate create \
--archive-id abc123 \
--from "2024-01-01T00:00:00Z" \
--to "2024-01-02T00:00:00Z" \
--query "service:api status:error"
Create metrics from logs (cheaper than indexing):
# Count errors per service
pup logs metrics create \
--name "api.errors.count" \
--query "service:api status:error" \
--group-by "endpoint"
⚠️ Cardinality warning: Group by bounded values only.
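The cardinality warning can be made concrete: each distinct group-by value becomes its own metric series, so a bounded tag like endpoint stays cheap while an unbounded one like user_id creates a series per user. A hypothetical illustration (not pup output):

```python
def series_count(logs: list, group_by: str) -> int:
    """Number of metric series a group-by would create: one per distinct value."""
    return len({log.get(group_by) for log in logs})

# 1000 users hitting 2 endpoints
logs = [{"endpoint": f"/api/{name}", "user_id": f"u{i}"}
        for i in range(1000) for name in ("users", "orders")]
print(series_count(logs, "endpoint"))  # bounded: 2 series
print(series_count(logs, "user_id"))  # unbounded: 1000 series
```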
{
  "type": "hash-remapper",
  "name": "Hash emails",
  "sources": ["email", "@user.email"]
}
# In your app - sanitize before sending
import re

def sanitize_log(message: str) -> str:
    # Remove credit card numbers
    message = re.sub(r'\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b', '[REDACTED]', message)
    # Remove SSNs
    message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[REDACTED]', message)
    return message
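The same pattern extends to other identifiers. For example, a sketch that redacts email addresses before the log leaves the process (the regex is deliberately simple and will miss unusual address forms):

```python
import re

EMAIL = re.compile(r'\b[\w.+-]+@[\w-]+\.[\w.-]+\b')

def sanitize_emails(message: str) -> str:
    """Replace anything email-shaped with a redaction marker."""
    return EMAIL.sub('[REDACTED]', message)

print(sanitize_emails("login failed for alice@example.com"))
# login failed for [REDACTED]
```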
| Problem | Fix |
|---|---|
| Logs not appearing | Check agent, pipeline filters |
| High costs | Add exclusion filters |
| Search slow | Narrow time range, use indexes |
| Missing attributes | Check grok parser |