geospatial-data-pipeline by erichowens/some_claude_skills
npx skills add https://github.com/erichowens/some_claude_skills --skill geospatial-data-pipeline擅长大规模处理、优化和可视化地理空间数据。
✅ 适用于:
❌ 不适用于:
| 特性 | PostGIS | MongoDB |
|---|---|---|
| 空间索引 | GiST, SP-GiST | 2dsphere |
| 查询语言 | SQL + 空间函数 | 聚合管道 |
| 几何类型 | 20+ (完全支持 OGC) | 基本类型 (点、线、多边形) |
| 坐标系 |
广告位招租
在这里展示您的产品或服务
触达数万 AI 开发者,精准高效
| 通过 EPSG 支持 6000+ |
| 仅 WGS84 |
| 性能 (1000 万个点) | <100 毫秒 | <200 毫秒 |
| 最适合 | 复杂的空间分析 | 以文档为中心的应用程序 |
时间线:
新手思维:"我就把经纬度存成文本,很简单"
问题:无法使用空间索引,查询速度慢,没有验证。
错误做法:
// ❌ 字符串存储,没有空间特性
interface Location {
id: string;
name: string;
latitude: string; // "37.7749"
longitude: string; // "-122.4194"
}
// 线性扫描"附近"查询
async function findNearby(lat: string, lon: string): Promise<Location[]> {
const all = await db.locations.findAll();
return all.filter(loc => {
const distance = calculateDistance(
parseFloat(lat),
parseFloat(lon),
parseFloat(loc.latitude),
parseFloat(loc.longitude)
);
return distance < 5000; // 5km
});
}
为何错误:O(N) 线性扫描,没有空间索引,字符串解析开销。
正确做法:
// ✅ PostGIS GEOGRAPHY 类型配合空间索引
CREATE TABLE locations (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
location GEOGRAPHY(POINT, 4326) -- WGS84 坐标
);
-- 空间索引 (GiST)
CREATE INDEX idx_locations_geography ON locations USING GIST(location);
-- TypeScript 查询
async function findNearby(lat: number, lon: number, radiusMeters: number): Promise<Location[]> {
const query = `
SELECT id, name, ST_AsGeoJSON(location) as geojson
FROM locations
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography,
$3
)
ORDER BY location <-> ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography
LIMIT 100
`;
return db.query(query, [lon, lat, radiusMeters]); // <10ms with index
}
时间线背景:
GEOGRAPHY 类型自动处理地球曲率问题:邻近查询执行全表扫描。
错误做法:
-- ❌ 没有索引,顺序扫描
CREATE TABLE drone_images (
id SERIAL PRIMARY KEY,
image_url VARCHAR(255),
location GEOGRAPHY(POINT, 4326)
);
-- 此查询扫描所有行
SELECT * FROM drone_images
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
1000 -- 1km
);
EXPLAIN 输出:Seq Scan on drone_images (cost=0.00..1234.56 rows=1 width=123)
正确做法:
-- ✅ 为空间查询创建 GiST 索引
CREATE INDEX idx_drone_images_location ON drone_images USING GIST(location);
-- 同样的查询,现在使用索引
SELECT * FROM drone_images
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
1000
);
EXPLAIN 输出:Bitmap Index Scan on idx_drone_images_location (cost=4.30..78.30 rows=50 width=123)
性能影响:1000 万个点,5 公里半径查询
新手思维:"坐标只是数字,我可以混合使用"
问题:距离计算错误,地图要素错位。
错误做法:
// ❌ 混合 EPSG:4326 (WGS84) 和 EPSG:3857 (Web Mercator)
const userLocation = {
lat: 37.7749, // WGS84
lon: -122.4194
};
const droneImage = {
x: -13634876, // Web Mercator (EPSG:3857)
y: 4545684
};
// 比较苹果和橘子!
const distance = Math.sqrt(
Math.pow(userLocation.lon - droneImage.x, 2) +
Math.pow(userLocation.lat - droneImage.y, 2)
);
结果:距离计算完全错误(数百万个"单位")。
正确做法:
-- ✅ 转换到通用坐标系
SELECT ST_Distance(
ST_Transform(
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326), -- WGS84
3857 -- 转换到 Web Mercator
),
ST_SetSRID(ST_MakePoint(-13634876, 4545684), 3857) -- 已经是 Web Mercator
) AS distance_meters;
或者更好的做法:始终存储在一个系统中(WGS84),仅在显示时转换。
时间线:
问题:50MB 的 GeoJSON 文件导致浏览器崩溃。
错误做法:
// ❌ 将整个文件加载到内存中
const geoJson = await fetch('/drone-survey-data.geojson').then(r => r.json());
// 50MB 的 GeoJSON = 浏览器冻结
map.addSource('drone-data', {
type: 'geojson',
data: geoJson // 一次性加载所有 10,000 个多边形
});
正确做法 1:矢量瓦片(预分块)
// ✅ 作为矢量瓦片提供 (MBTiles 或 PMTiles)
map.addSource('drone-data', {
type: 'vector',
tiles: ['https://api.example.com/tiles/{z}/{x}/{y}.pbf'],
minzoom: 10,
maxzoom: 18
});
// 浏览器只加载可见的瓦片
正确做法 2:GeoJSON 简化 + 分块
# 简化几何图形(减少点数)
npm install -g @mapbox/geojson-precision
geojson-precision -p 5 input.geojson output.geojson
# 分割成瓦片
npm install -g geojson-vt
# 以编程方式生成瓦片 (参见 scripts/tile_generator.ts)
正确做法 3:服务器端过滤
// ✅ 只获取可见范围内的要素
async function fetchVisibleFeatures(bounds: Bounds): Promise<GeoJSON> {
const response = await fetch(
`/api/features?bbox=${bounds.west},${bounds.south},${bounds.east},${bounds.north}`
);
return response.json();
}
map.on('moveend', async () => {
const bounds = map.getBounds();
const geojson = await fetchVisibleFeatures(bounds);
map.getSource('dynamic-data').setData(geojson);
});
新手思维:"距离不就是勾股定理吗"
问题:在大尺度上不正确,在两极附近更糟。
错误做法:
// ❌ 平坦地球距离(错误!)
function distanceKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
const dx = lon2 - lon1;
const dy = lat2 - lat1;
return Math.sqrt(dx * dx + dy * dy) * 111.32; // 111.32 km/degree (错误)
}
// 示例:旧金山到纽约
const distance = distanceKm(37.7749, -122.4194, 40.7128, -74.0060);
// 返回:~55 公里 (错误!实际:~4,130 公里)
为何错误:地球是球体,不是平面。
正确做法 1:Haversine 公式(大圆距离)
// ✅ Haversine 公式(球形地球)
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
const R = 6371; // 地球半径,单位公里
const dLat = toRadians(lat2 - lat1);
const dLon = toRadians(lon2 - lon1);
const a =
Math.sin(dLat / 2) * Math.sin(dLat / 2) +
Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
Math.sin(dLon / 2) * Math.sin(dLon / 2);
const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
return R * c;
}
// 旧金山到纽约
const distance = haversineKm(37.7749, -122.4194, 40.7128, -74.0060);
// 返回:~4,130 公里 ✅
正确做法 2:PostGIS(自动处理曲率)
-- ✅ PostGIS ST_Distance 配合 GEOGRAPHY 类型
SELECT ST_Distance(
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
ST_SetSRID(ST_MakePoint(-74.0060, 40.7128), 4326)::geography
) / 1000 AS distance_km;
-- 返回:4130.137 公里 ✅
准确性对比:
| 方法 | 旧金山到纽约 | 误差 |
|---|---|---|
| 欧几里得(平坦) | 55 公里 | 98.7% 错误 |
| Haversine(球体) | 4,130 公里 | ✅ 正确 |
| PostGIS(椭球体) | 4,135 公里 | 最准确 |
□ PostGIS 扩展已安装且空间索引已创建
□ 所有坐标以一致的 SRID 存储(推荐:4326)
□ GeoJSON 文件已优化(<1MB)或作为矢量瓦片提供
□ 坐标转换使用 ST_Transform,而非手动计算
□ 距离计算使用 ST_Distance 配合 GEOGRAPHY 类型
□ 边界框查询使用 ST_MakeEnvelope + ST_Intersects
□ 大型几何图形已分块(每个要素不超过 100KB)
□ 地图瓦片已为常见缩放级别预生成
□ 为瓦片服务器配置了 CORS
□ 地理编码/反向地理编码端点设置了速率限制
| 场景 | 是否合适? |
|---|---|
| 无人机图像标注和搜索 | ✅ 是 - 处理勘测数据 |
| GPS 轨迹可视化 | ✅ 是 - 优化路径 |
| 查找最近的咖啡店 | ✅ 是 - 空间查询 |
| 管辖区边界查询 | ✅ 是 - 点面包含 |
| 简单的地址自动补全 | ❌ 否 - 使用 Mapbox/Google |
| 在页面嵌入静态地图 | ❌ 否 - 使用 Static API |
| 地理编码单个地址 | ❌ 否 - 使用地理编码 API |
/references/coordinate-systems.md - EPSG 代码、转换、Web Mercator 与 WGS84/references/postgis-guide.md - PostGIS 设置、空间索引、常见查询/references/geojson-optimization.md - 简化、分块、矢量瓦片scripts/geospatial_processor.ts - 处理无人机图像、GPS 轨迹、GeoJSON 验证scripts/tile_generator.ts - 从 GeoJSON 生成矢量瓦片 (MBTiles/PMTiles)此技能指导:地理空间数据 | PostGIS | GeoJSON | 地图瓦片 | 坐标系 | 无人机数据处理 | 空间查询
每周安装数
88
仓库
GitHub 星标数
78
首次出现
2026年1月24日
安全审计
安装于
gemini-cli78
opencode78
codex78
github-copilot74
cursor72
claude-code64
Expert in processing, optimizing, and visualizing geospatial data at scale.
✅ Use for :
❌ NOT for :
| Feature | PostGIS | MongoDB |
|---|---|---|
| Spatial indexes | GiST, SP-GiST | 2dsphere |
| Query language | SQL + spatial functions | Aggregation pipeline |
| Geometry types | 20+ (full OGC support) | Basic (Point, Line, Polygon) |
| Coordinate systems | 6000+ via EPSG | WGS84 only |
| Performance (10M points) | <100ms | <200ms |
| Best for | Complex spatial analysis | Document-centric apps |
Timeline :
Novice thinking : "I'll just store lat/lon as text, it's simple"
Problem : Can't use spatial indexes, queries are slow, no validation.
Wrong approach :
// ❌ String storage, no spatial features
interface Location {
id: string;
name: string;
latitude: string; // "37.7749"
longitude: string; // "-122.4194"
}
// Linear scan for "nearby" queries
async function findNearby(lat: string, lon: string): Promise<Location[]> {
const all = await db.locations.findAll();
return all.filter(loc => {
const distance = calculateDistance(
parseFloat(lat),
parseFloat(lon),
parseFloat(loc.latitude),
parseFloat(loc.longitude)
);
return distance < 5000; // 5km
});
}
Why wrong : O(N) linear scan, no spatial index, string parsing overhead.
Correct approach :
// ✅ PostGIS GEOGRAPHY type with spatial index
CREATE TABLE locations (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
location GEOGRAPHY(POINT, 4326) -- WGS84 coordinates
);
-- Spatial index (GiST)
CREATE INDEX idx_locations_geography ON locations USING GIST(location);
-- TypeScript query
async function findNearby(lat: number, lon: number, radiusMeters: number): Promise<Location[]> {
const query = `
SELECT id, name, ST_AsGeoJSON(location) as geojson
FROM locations
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography,
$3
)
ORDER BY location <-> ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography
LIMIT 100
`;
return db.query(query, [lon, lat, radiusMeters]); // <10ms with index
}
Timeline context :
GEOGRAPHY type handles Earth curvature automaticallyProblem : Proximity queries do full table scans.
Wrong approach :
-- ❌ No index, sequential scan
CREATE TABLE drone_images (
id SERIAL PRIMARY KEY,
image_url VARCHAR(255),
location GEOGRAPHY(POINT, 4326)
);
-- This query scans ALL rows
SELECT * FROM drone_images
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
1000 -- 1km
);
EXPLAIN output : Seq Scan on drone_images (cost=0.00..1234.56 rows=1 width=123)
Correct approach :
-- ✅ GiST index for spatial queries
CREATE INDEX idx_drone_images_location ON drone_images USING GIST(location);
-- Same query, now uses index
SELECT * FROM drone_images
WHERE ST_DWithin(
location,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
1000
);
EXPLAIN output : Bitmap Index Scan on idx_drone_images_location (cost=4.30..78.30 rows=50 width=123)
Performance impact : 10M points, 5km radius query
Novice thinking : "Coordinates are just numbers, I can mix them"
Problem : Incorrect distances, misaligned map features.
Wrong approach :
// ❌ Mixing EPSG:4326 (WGS84) and EPSG:3857 (Web Mercator)
const userLocation = {
lat: 37.7749, // WGS84
lon: -122.4194
};
const droneImage = {
x: -13634876, // Web Mercator (EPSG:3857)
y: 4545684
};
// Comparing apples to oranges!
const distance = Math.sqrt(
Math.pow(userLocation.lon - droneImage.x, 2) +
Math.pow(userLocation.lat - droneImage.y, 2)
);
Result : Wildly incorrect distance (millions of "units").
Correct approach :
-- ✅ Transform to common coordinate system
SELECT ST_Distance(
ST_Transform(
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326), -- WGS84
3857 -- Transform to Web Mercator
),
ST_SetSRID(ST_MakePoint(-13634876, 4545684), 3857) -- Already Web Mercator
) AS distance_meters;
Or better : Always store in one system (WGS84), transform on display only.
Timeline :
Problem : 50MB GeoJSON file crashes browser.
Wrong approach :
// ❌ Load entire file into memory
const geoJson = await fetch('/drone-survey-data.geojson').then(r => r.json());
// 50MB of GeoJSON = browser freeze
map.addSource('drone-data', {
type: 'geojson',
data: geoJson // All 10,000 polygons loaded at once
});
Correct approach 1 : Vector tiles (pre-chunked)
// ✅ Serve as vector tiles (MBTiles or PMTiles)
map.addSource('drone-data', {
type: 'vector',
tiles: ['https://api.example.com/tiles/{z}/{x}/{y}.pbf'],
minzoom: 10,
maxzoom: 18
});
// Browser only loads visible tiles
Correct approach 2 : GeoJSON simplification + chunking
# Simplify geometry (reduce points)
npm install -g @mapbox/geojson-precision
geojson-precision -p 5 input.geojson output.geojson
# Split into tiles
npm install -g geojson-vt
# Generate tiles programmatically (see scripts/tile_generator.ts)
Correct approach 3 : Server-side filtering
// ✅ Only fetch visible bounds
async function fetchVisibleFeatures(bounds: Bounds): Promise<GeoJSON> {
const response = await fetch(
`/api/features?bbox=${bounds.west},${bounds.south},${bounds.east},${bounds.north}`
);
return response.json();
}
map.on('moveend', async () => {
const bounds = map.getBounds();
const geojson = await fetchVisibleFeatures(bounds);
map.getSource('dynamic-data').setData(geojson);
});
Novice thinking : "Distance is just Pythagorean theorem"
Problem : Incorrect at scale, worse near poles.
Wrong approach :
// ❌ Flat Earth distance (wrong!)
function distanceKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
const dx = lon2 - lon1;
const dy = lat2 - lat1;
return Math.sqrt(dx * dx + dy * dy) * 111.32; // 111.32 km/degree (WRONG)
}
// Example: San Francisco to New York
const distance = distanceKm(37.7749, -122.4194, 40.7128, -74.0060);
// Returns: ~55 km (WRONG! Actual: ~4,130 km)
Why wrong : Earth is a sphere, not a flat plane.
Correct approach 1 : Haversine formula (great circle distance)
// ✅ Haversine formula (spherical Earth)
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
const R = 6371; // Earth radius in km
const dLat = toRadians(lat2 - lat1);
const dLon = toRadians(lon2 - lon1);
const a =
Math.sin(dLat / 2) * Math.sin(dLat / 2) +
Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
Math.sin(dLon / 2) * Math.sin(dLon / 2);
const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
return R * c;
}
// San Francisco to New York
const distance = haversineKm(37.7749, -122.4194, 40.7128, -74.0060);
// Returns: ~4,130 km ✅
Correct approach 2 : PostGIS (handles curvature automatically)
-- ✅ PostGIS ST_Distance with GEOGRAPHY
SELECT ST_Distance(
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography,
ST_SetSRID(ST_MakePoint(-74.0060, 40.7128), 4326)::geography
) / 1000 AS distance_km;
-- Returns: 4130.137 km ✅
Accuracy comparison :
| Method | SF to NYC | Error |
|---|---|---|
| Euclidean (flat) | 55 km | 98.7% wrong |
| Haversine (sphere) | 4,130 km | ✅ Correct |
| PostGIS (ellipsoid) | 4,135 km | Most accurate |
□ PostGIS extension installed and spatial indexes created
□ All coordinates stored in consistent SRID (recommend: 4326)
□ GeoJSON files optimized (<1MB) or served as vector tiles
□ Coordinate transformations use ST_Transform, not manual math
□ Distance calculations use ST_Distance with GEOGRAPHY type
□ Bounding box queries use ST_MakeEnvelope + ST_Intersects
□ Large geometries chunked (not >100KB per feature)
□ Map tiles pre-generated for common zoom levels
□ CORS configured for tile servers
□ Rate limiting on geocoding/reverse geocoding endpoints
| Scenario | Appropriate? |
|---|---|
| Drone imagery annotation and search | ✅ Yes - process survey data |
| GPS track visualization | ✅ Yes - optimize paths |
| Find nearest coffee shops | ✅ Yes - spatial queries |
| Jurisdiction boundary lookups | ✅ Yes - point-in-polygon |
| Simple address autocomplete | ❌ No - use Mapbox/Google |
| Embed static map on page | ❌ No - use Static API |
| Geocode single address | ❌ No - use geocoding API |
/references/coordinate-systems.md - EPSG codes, transformations, Web Mercator vs WGS84/references/postgis-guide.md - PostGIS setup, spatial indexes, common queries/references/geojson-optimization.md - Simplification, chunking, vector tilesscripts/geospatial_processor.ts - Process drone imagery, GPS tracks, GeoJSON validationscripts/tile_generator.ts - Generate vector tiles (MBTiles/PMTiles) from GeoJSONThis skill guides : Geospatial data | PostGIS | GeoJSON | Map tiles | Coordinate systems | Drone data processing | Spatial queries
Weekly Installs
88
Repository
GitHub Stars
78
First Seen
Jan 24, 2026
Security Audits
Gen Agent Trust HubPassSocketPassSnykPass
Installed on
gemini-cli78
opencode78
codex78
github-copilot74
cursor72
claude-code64
Supabase 使用指南:安全最佳实践、CLI 命令与 MCP 服务器配置
3,600 周安装
API设计模式最佳实践指南:RESTful原则、错误处理、分页与安全性
102 周安装
Groove Work Plan:AI辅助代码库分析与项目计划生成工具 | 自动化开发流程
119 周安装
Groove Git日志自动化工具 - 自动生成每日Git提交摘要和变更记录
119 周安装
自媒体自动发布工具 - 支持百家号、知乎、公众号等平台一键发布,提升内容分发效率
105 周安装
Outlook自动化指南:通过Rube MCP与Composio工具包实现邮件、日历、联系人管理
83 周安装
WhoDB数据库助手:简化数据库操作,支持SQL查询、模式探索与数据导出
93 周安装