photo-content-recognition-curation-expert by erichowens/some_claude_skills
npx skills add https://github.com/erichowens/some_claude_skills --skill photo-content-recognition-curation-expert
Expert in photo content analysis and intelligent curation. Combines classical computer vision with modern deep learning for comprehensive photo analysis.
✅ Use for:
❌ NOT for:
event-detection-temporal-intelligence-expert
color-theory-palette-harmony-expert
clip-aware-embeddings
What do you need to recognize/filter?
│
├─ Duplicate photos? ─────────────────────────────── Perceptual Hashing
│ ├─ Exact duplicates? ──────────────────────────── dHash (fastest)
│ ├─ Brightness/contrast changes? ───────────────── pHash (DCT-based)
│ ├─ Heavy crops/compression? ───────────────────── DINOHash (2025 SOTA)
│ └─ Production system? ─────────────────────────── Hybrid (pHash → DINOHash)
│
├─ People in photos? ─────────────────────────────── Face Clustering
│ ├─ Known thresholds? ──────────────────────────── Apple-style Agglomerative
│ └─ Unknown data distribution? ─────────────────── HDBSCAN
│
├─ Pets/Animals? ─────────────────────────────────── Pet Recognition
│ ├─ Detection? ─────────────────────────────────── YOLOv8
│ └─ Individual clustering? ─────────────────────── CLIP + HDBSCAN
│
├─ Best from burst? ──────────────────────────────── Burst Selection
│ └─ Score: sharpness + face quality + aesthetics
│
└─ Filter junk? ──────────────────────────────────── Content Detection
  ├─ Screenshots? ───────────────────────────────── Multi-signal classifier
  └─ NSFW? ──────────────────────────────────────── Safety classifier
Problem: Camera bursts, re-saved images, and minor edits create near-duplicates.
Solution: Perceptual hashes generate similar values for visually similar images.
Method Comparison:
| Method | Speed | Robustness | Best For |
|---|---|---|---|
| dHash | Fastest | Low | Exact duplicates |
| pHash | Fast | Medium | Brightness/contrast changes |
| DINOHash | Slower | High | Heavy crops, compression |
| Hybrid | Medium | Very High | Production systems |
Hybrid Pipeline (2025 Best Practice):
Hamming Distance Thresholds:
→ Deep dive : references/perceptual-hashing.md
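As a rough illustration of how a perceptual hash and a Hamming distance fit together, here is a minimal dHash sketch in plain Python. It assumes the image has already been converted to grayscale and resized to 9 columns by 8 rows elsewhere (for example with Pillow); the toy grids and the helper names `dhash` and `hamming` are illustrative, not part of the skill. No thresholds are suggested here because the source does not list them.

```python
from typing import Sequence


def dhash(gray: Sequence[Sequence[int]]) -> int:
    """Difference hash of a grayscale grid with one more column than rows.

    Each bit records whether a pixel is brighter than its right-hand
    neighbour, so an 8-row by 9-column grid yields a 64-bit hash.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Toy 8x9 "images": a horizontal gradient, a uniformly brightened copy,
# and a mirrored copy.
gradient = [[col * 10 for col in range(9)] for _ in range(8)]
brighter = [[value + 40 for value in row] for row in gradient]
mirrored = [list(reversed(row)) for row in gradient]

same = hamming(dhash(gradient), dhash(brighter))      # brightness shift only
different = hamming(dhash(gradient), dhash(mirrored))  # structure reversed
```

Because dHash only compares neighbouring pixels, a uniform brightness shift leaves every bit unchanged, while mirroring the gradient flips all of them.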
Goal: Group photos by person without user labeling.
Apple Photos Strategy (2021-2025):
HDBSCAN Alternative:
Parameters:
| Setting | Agglomerative | HDBSCAN |
|---|---|---|
| Pass 1 threshold | 0.4 (cosine) | - |
| Pass 2 threshold | 0.6 (cosine) | - |
| Min cluster size | - | 3 photos |
| Metric | cosine | cosine |
→ Deep dive : references/face-clustering.md
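The two thresholds in the table can be read as a conservative pass followed by a relaxed one. The sketch below is a simplified stand-in, not Apple's algorithm and not true agglomerative clustering: it uses greedy centroid assignment in plain Python, and it assumes pass 1 groups individual faces at 0.4 and pass 2 merges the resulting groups at 0.6. The function names and the toy 3-D embeddings are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a group of embeddings."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def greedy_pass(groups: List[List[Vector]], threshold: float) -> List[List[Vector]]:
    """Add each group to the first cluster whose centroid is closer than
    `threshold`; otherwise start a new cluster."""
    clusters: List[List[Vector]] = []
    for group in groups:
        for cluster in clusters:
            if cosine_distance(centroid(cluster), centroid(group)) < threshold:
                cluster.extend(group)
                break
        else:
            clusters.append(list(group))
    return clusters


def two_pass_cluster(embeddings: List[Vector], tight: float = 0.4,
                     loose: float = 0.6) -> List[List[Vector]]:
    """Conservative pass on single faces, then a relaxed pass on the groups."""
    first = greedy_pass([[e] for e in embeddings], tight)
    return greedy_pass(first, loose)


# Toy embeddings: two frontal shots of one person, a profile shot of the
# same person (about 60 degrees away), and a different person.
faces = [[1.0, 0.0, 0.0], [0.99, 0.1, 0.0], [0.5, 0.0, 0.866], [0.0, 1.0, 0.0]]
after_pass_1 = [len(c) for c in greedy_pass([[f] for f in faces], 0.4)]
people = [len(c) for c in two_pass_cluster(faces)]
```

In this toy data the profile shot stays separate after the conservative pass and is only merged with the frontal shots by the relaxed pass, while the second person remains a separate cluster.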
Problem: Burst mode creates 10-50 nearly identical photos.
Multi-Criteria Scoring:
| Criterion | Weight | Measurement |
|---|---|---|
| Sharpness | 30% | Laplacian variance |
| Face Quality | 35% | Eyes open, smiling, face sharpness |
| Aesthetics | 20% | NIMA score |
| Position | 10% | Middle frames bonus |
| Exposure | 5% | Histogram clipping check |
Burst Detection: Photos within 0.5 seconds of each other.
→ Deep dive : references/content-detection.md
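The weights and the 0.5-second rule above can be combined roughly as follows. This is a sketch under two assumptions that the source does not state: each criterion has already been normalised to the range 0 to 1, and "within 0.5 seconds of each other" means the gap to the previous photo in capture order. The names `group_bursts` and `burst_score` are illustrative.

```python
from typing import Dict, List

# Weights copied from the scoring table; they sum to 1.0.
WEIGHTS = {
    "sharpness": 0.30,
    "face_quality": 0.35,
    "aesthetics": 0.20,
    "position": 0.10,
    "exposure": 0.05,
}


def group_bursts(timestamps: List[float], max_gap: float = 0.5) -> List[List[int]]:
    """Group photo indices whose capture time is within `max_gap` seconds
    of the previous photo (timestamps in seconds)."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    bursts: List[List[int]] = []
    for i in order:
        if bursts and timestamps[i] - timestamps[bursts[-1][-1]] <= max_gap:
            bursts[-1].append(i)
        else:
            bursts.append([i])
    return bursts


def burst_score(criteria: Dict[str, float]) -> float:
    """Weighted sum of per-criterion scores, each assumed to lie in [0, 1]."""
    return sum(weight * criteria[name] for name, weight in WEIGHTS.items())


bursts = group_bursts([0.0, 0.3, 0.6, 5.0, 5.2])

frames = [
    {"sharpness": 0.9, "face_quality": 0.4, "aesthetics": 0.6, "position": 1.0, "exposure": 1.0},
    {"sharpness": 0.7, "face_quality": 0.9, "aesthetics": 0.6, "position": 0.5, "exposure": 1.0},
]
best = max(range(len(frames)), key=lambda i: burst_score(frames[i]))
```

Here the second frame wins despite being less sharp, because face quality carries the largest weight.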
Multi-Signal Approach:
| Signal | Confidence | Description |
|---|---|---|
| UI elements | 0.85 | Status bars, buttons detected |
| Perfect rectangles | 0.75 | >5 UI buttons (90° angles) |
| High text | 0.70 | >25% text coverage (OCR) |
| No camera EXIF | 0.60 | Missing Make/Model/Lens |
| Device aspect | 0.60 | Exact phone screen ratio |
| Perfect sharpness | 0.50 | >2000 Laplacian variance |
Decision: Confidence >0.6 = screenshot
→ Deep dive : references/content-detection.md
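The table gives a confidence per signal and a decision threshold, but the source does not say how signals are combined. The sketch below assumes the simplest rule, taking the highest confidence among the signals that fired; a weighted or probabilistic combination would be equally plausible. The signal keys and function names are hypothetical, and the detectors that decide whether a signal fired are out of scope.

```python
from typing import Iterable

# Confidence values copied from the signal table.
SIGNAL_CONFIDENCE = {
    "ui_elements": 0.85,
    "perfect_rectangles": 0.75,
    "high_text": 0.70,
    "no_camera_exif": 0.60,
    "device_aspect": 0.60,
    "perfect_sharpness": 0.50,
}


def screenshot_confidence(fired: Iterable[str]) -> float:
    """Assumed combination rule: the strongest fired signal wins."""
    return max((SIGNAL_CONFIDENCE[name] for name in fired), default=0.0)


def is_screenshot(fired: Iterable[str], threshold: float = 0.6) -> bool:
    """Apply the decision rule 'confidence > 0.6 = screenshot'."""
    return screenshot_confidence(fired) > threshold


ui_hit = is_screenshot({"ui_elements", "no_camera_exif"})
weak_only = is_screenshot({"no_camera_exif", "device_aspect"})
```

Under this rule the 0.60 signals alone never cross the strict `> 0.6` threshold, so missing EXIF or a phone-shaped aspect ratio only matter alongside a stronger signal.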
Goal: Index 10K+ photos efficiently with caching.
Features Extracted:
Performance (10K photos, M1 MacBook Pro):
| Operation | Time |
|---|---|
| Perceptual hashing | 2 min |
| CLIP embeddings | 3 min (GPU) |
| Face detection | 4 min |
| Color palettes | 1 min |
| Aesthetic scoring | 2 min (GPU) |
| Clustering + dedup | 1 min |
| Total (first run) | ~13 min |
| Incremental | < 1 min |
→ Deep dive : references/photo-indexing.md
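One way the incremental case can stay far cheaper than a first run is to cache extracted features by file content, so only new or changed photos are reprocessed. The sketch below is an assumption about how such a cache could look, not the skill's implementation: `index_incrementally`, the JSON cache file, and the `extract` callback are all hypothetical, and the cache file is expected to live outside the photo directory.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used as the cache key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def index_incrementally(photo_dir: Path, cache_file: Path,
                        extract: Callable[[Path], dict]) -> Dict[str, dict]:
    """Return features per photo, calling `extract` only for uncached files.

    `extract` must return JSON-serialisable data. Files with identical
    bytes share one entry because the key is the content digest.
    """
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    index: Dict[str, dict] = {}
    for path in sorted(photo_dir.iterdir()):
        if not path.is_file():
            continue
        digest = file_digest(path)
        entry = cache.get(digest)
        if entry is None:
            entry = extract(path)  # expensive step: hashing, embeddings, faces
        index[digest] = entry
    cache_file.write_text(json.dumps(index))
    return index
```

Hashing full file contents costs a read of every photo on each run; keying on path, size, and modification time would be faster but less robust to renames.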
What it looks like:
distance = np.linalg.norm(embedding1 - embedding2) # WRONG
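Why it's wrong: Face identity is carried by the direction of the embedding, and the clustering thresholds in this guide (0.4 and 0.6) are cosine distances. Euclidean distance also depends on vector magnitude unless the embeddings are L2-normalized, and even then its values sit on a different scale, so cosine thresholds do not transfer.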
What to do instead:
from scipy.spatial.distance import cosine
distance = cosine(embedding1, embedding2) # Correct
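A small plain-Python check of how the two metrics behave may help here. The vectors are made up for illustration; the helper functions are written out by hand so the example has no dependencies, and `scipy.spatial.distance.cosine` computes the same cosine distance.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [0.6, 0.8]         # unit length
b = [0.8, 0.6]         # unit length
scaled_b = [4.0, 3.0]  # same direction as b, five times the magnitude

cos_ab = cosine_distance(a, b)
cos_scaled = cosine_distance(a, scaled_b)     # unchanged by scaling
euc_ab = euclidean_distance(a, b)
euc_scaled = euclidean_distance(a, scaled_b)  # much larger after scaling
```

For unit-length vectors the squared Euclidean distance equals twice the cosine distance, so the two metrics rank pairs identically but use different threshold scales; once magnitudes differ, only the cosine distance stays stable.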
What it looks like: Using same distance threshold for all face clusters.
Why it's wrong: Different people have varying intra-class variance (twins vs. diverse ages).
What to do instead: Use HDBSCAN for automatic threshold discovery, or two-pass clustering with conservative + relaxed passes.
What it looks like:
is_duplicate = np.allclose(img1, img2) # WRONG
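Why it's wrong: Re-saved JPEGs, crops, and brightness changes all alter pixel values, and a crop also changes the array shape, so an element-wise comparison rejects images that a viewer would call identical.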
What to do instead: Perceptual hashing (pHash or DINOHash) with Hamming distance.
What it looks like: Processing faces one photo at a time without batching.
Why it's wrong: It leaves the GPU underutilized and runs roughly 10x slower than batched inference.
What to do instead: Batch process images (batch_size=32) with GPU acceleration.
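A minimal batching helper might look like the following. It only splits the work into chunks of 32; loading the images and running the face detector on each batch is omitted because it depends on the model in use. The name `batched` and the file names are illustrative.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


paths = [f"photo_{i:04d}.jpg" for i in range(70)]
batch_sizes = [len(batch) for batch in batched(paths)]
```

Each yielded batch would then be stacked into a single tensor and passed through the detector in one forward pass, with the last batch allowed to be smaller.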
What it looks like:
for face in all_detected_faces:
    cluster(face) # No filtering
Why it's wrong: Low-confidence detections create noise clusters (hands, objects).
What to do instead: Filter by confidence (threshold 0.9 for faces).
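The filtering step can be as small as the sketch below. The `FaceDetection` record is a hypothetical container; real detectors expose the confidence under their own field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaceDetection:
    photo: str
    confidence: float  # detector score in [0, 1]


def confident_faces(detections: List[FaceDetection],
                    min_confidence: float = 0.9) -> List[FaceDetection]:
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d.confidence >= min_confidence]


detections = [
    FaceDetection("beach.jpg", 0.99),
    FaceDetection("beach.jpg", 0.42),  # likely a hand or background object
    FaceDetection("party.jpg", 0.93),
]
kept = [d.photo for d in confident_faces(detections)]
```

Only the filtered detections would then be embedded and passed to clustering.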
What it looks like: Assigning noise points to nearest cluster.
Why it's wrong: Solo appearances shouldn't pollute person clusters.
What to do instead: HDBSCAN/DBSCAN naturally identifies noise (label=-1). Keep noise separate.
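Keeping noise separate amounts to treating label -1 as its own bucket rather than a cluster. The sketch below assumes a list of labels such as those produced by a density-based clusterer; the labels and photo IDs shown are made up, and `split_noise` is an illustrative helper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def split_noise(photo_ids: Sequence[str],
                labels: Sequence[int]) -> Tuple[Dict[int, List[str]], List[str]]:
    """Group photo IDs by cluster label, keeping label -1 (noise) apart."""
    clusters: Dict[int, List[str]] = defaultdict(list)
    noise: List[str] = []
    for photo_id, label in zip(photo_ids, labels):
        if label == -1:
            noise.append(photo_id)
        else:
            clusters[label].append(photo_id)
    return dict(clusters), noise


photo_ids = ["p1", "p2", "p3", "p4", "p5"]
labels = [0, 0, -1, 1, -1]  # hypothetical clustering output
people, unassigned = split_noise(photo_ids, labels)
```

The unassigned faces can be shown to the user as singletons or revisited when more photos arrive, without being forced into an existing person cluster.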
from photo_curation import PhotoCurationPipeline
pipeline = PhotoCurationPipeline()
# Index photo library
index = pipeline.index_library('/path/to/photos')
# De-duplicate
duplicates = index.find_duplicates()
print(f"Found {len(duplicates)} duplicate groups")
# Cluster faces
face_clusters = index.cluster_faces()
print(f"Found {len(face_clusters)} people")
# Select best from bursts
best_photos = pipeline.select_best_from_bursts(index)
# Filter screenshots
real_photos = pipeline.filter_screenshots(index)
# Curate for collage
collage_photos = pipeline.curate_for_collage(index, target_count=100)
torch transformers facenet-pytorch ultralytics hdbscan opencv-python scipy numpy scikit-learn pillow pytesseract
Version: 2.0.0
Last Updated: November 2025