metal-shader-expert by erichowens/some_claude_skills
npx skills add https://github.com/erichowens/some_claude_skills --skill metal-shader-expert
20+ years Weta/Pixar experience specializing in Metal shaders, real-time rendering, and creative visual effects. Expert in Apple's Tile-Based Deferred Rendering (TBDR) architecture.
Use for:
Do NOT use for:
| Topic | Novice | Expert |
|---|---|---|
| Data types | Uses `float` everywhere | Defaults to `half` (16-bit); uses `float` only when precision is needed |
| Specialization | Branches at runtime | Uses function constants for compile-time specialization |
| Memory | Puts everything in the `device` address space | Knows the `constant`/`device`/`threadgroup` tradeoffs |
| Architecture | Treats the GPU like a desktop GPU | Understands TBDR: tile memory is effectively free, bandwidth is expensive |
| Ray tracing | Uses intersection queries | Uses the intersector API (hardware-aligned) |
| Debugging | Print debugging | GPU capture, shader profiler, occupancy analysis |
| What it looks like | Why it's wrong |
|---|---|
| `float4 color`, `float3 normal` everywhere | Wastes registers, reduces occupancy, doubles bandwidth |
| Instead: default to `half`; upgrade to `float` only for positions, depth, and similar precision-sensitive values |
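
A minimal sketch of the `half`-by-default advice, assuming a simple diffuse fragment shader. The struct layout, the `lit_fragment` name, and the buffer index are hypothetical and chosen only for illustration.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical interpolants: only the position stays 32-bit.
struct VertexOut {
    float4 position [[position]];  // clip-space position needs float precision
    half3  normal;                 // unit vectors fit comfortably in 16 bits
    half4  color;                  // LDR color fits comfortably in 16 bits
};

fragment half4 lit_fragment(VertexOut in [[stage_in]],
                            constant float3 &lightDir [[buffer(0)]])
{
    half3 n = normalize(in.normal);
    half3 l = half3(normalize(lightDir));  // convert once, then stay in half
    half  ndotl = saturate(dot(n, l));
    return half4(in.color.rgb * ndotl, in.color.a);
}
```

The conversion from `float3` to `half3` happens once at the top, so the rest of the shading math runs at 16-bit width.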
| What it looks like | Why it's wrong |
|---|---|
| Treating an Apple GPU like an immediate-mode renderer | Tile memory reads are effectively free; memory bandwidth is not |
| Instead: use `[[color(n)]]` freely, prefer memoryless render targets, and avoid unnecessary store actions |
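
One way to read `[[color(n)]]` in practice is programmable blending, where the fragment function receives the current attachment value straight from tile memory. The sketch below assumes a single color attachment at index 0; `FragmentIn` and `tint_blend` are hypothetical names.

```metal
#include <metal_stdlib>
using namespace metal;

struct FragmentIn {
    float4 position [[position]];
    half4  tint;  // hypothetical per-fragment tint; alpha is the blend weight
};

// `dst` is the value already in color attachment 0 for this pixel.
// On a TBDR GPU it is read from on-chip tile memory, not from DRAM.
fragment half4 tint_blend(FragmentIn in [[stage_in]],
                          half4 dst [[color(0)]])
{
    half3 blended = mix(dst.rgb, in.tint.rgb, in.tint.a);
    return half4(blended, dst.a);
}
```

The memoryless and store-action parts of the advice live on the host side: an intermediate attachment that never needs to leave the tile can use the memoryless storage mode together with a "don't care" store action.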
| What it looks like | Why it's wrong |
|---|---|
| `if (material.useNormalMap)` checked on every fragment | Can cause divergent SIMD-groups (warps) and wastes ALU |
| Instead: function constants + pipeline specialization |
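
A sketch of what the function-constant version might look like. The constant index, the `shade` name, and the tangent-frame inputs are assumptions made for the example; the returned color simply visualizes the resulting normal.

```metal
#include <metal_stdlib>
using namespace metal;

// Resolved when the pipeline is built, not per fragment.
// Index 0 is an arbitrary choice for this sketch.
constant bool useNormalMap [[function_constant(0)]];

struct FragmentIn {
    float4 position [[position]];
    float2 uv;
    half3  normal;
    half3  tangent;
    half3  bitangent;
};

fragment half4 shade(FragmentIn in [[stage_in]],
                     // The texture argument only exists in variants where useNormalMap is true.
                     texture2d<half> normalMap [[texture(0), function_constant(useNormalMap)]],
                     sampler s [[sampler(0)]])
{
    half3 n = normalize(in.normal);
    if (useNormalMap) {  // compile-time constant: the untaken branch is removed
        half3 t = normalMap.sample(s, in.uv).xyz * 2.0h - 1.0h;
        n = normalize(t.x * in.tangent + t.y * in.bitangent + t.z * n);
    }
    return half4(n * 0.5h + 0.5h, 1.0h);  // visualize the normal as a color
}
```

On the host, each variant is created by filling an `MTLFunctionConstantValues` object and requesting the function with those values, which yields one specialized pipeline per material configuration.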
| What it looks like | Why it's wrong |
|---|---|
| Using the query-based API | Doesn't align with the hardware; less efficient grouping |
| Instead: use the intersector API with explicit result handling |
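
A rough sketch of the intersector pattern, assuming a shadow-ray pass over a triangle-only acceleration structure. The kernel name, buffer indices, and buffer layout are hypothetical, and the dispatch is assumed to launch exactly one thread per ray.

```metal
#include <metal_stdlib>
#include <metal_raytracing>
using namespace metal;
using namespace metal::raytracing;

kernel void shadow_rays(primitive_acceleration_structure accel [[buffer(0)]],
                        device const float3 *origins    [[buffer(1)]],  // 16-byte stride per element
                        device const float3 *directions [[buffer(2)]],
                        device float *visibility        [[buffer(3)]],
                        uint tid [[thread_position_in_grid]])
{
    ray r;
    r.origin = origins[tid];
    r.direction = directions[tid];
    r.min_distance = 0.001f;   // small offset to avoid self-intersection
    r.max_distance = INFINITY;

    intersector<triangle_data> isect;
    isect.accept_any_intersection(true);  // any hit is enough for a shadow test

    // Explicit result handling: inspect the returned struct rather than stepping a query.
    intersection_result<triangle_data> hit = isect.intersect(r, accel);
    visibility[tid] = (hit.type == intersection_type::none) ? 1.0f : 0.0f;
}
```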
| Era | Key Development |
|---|---|
| Pre-2020 | Metal 2.x, OpenGL migration, basic compute |
| 2020-2022 | Apple Silicon, unified memory, tile shaders critical |
| 2023-2024 | Metal 3, mesh shaders, ray tracing HW acceleration |
| 2025+ | Neural Engine + GPU cooperation, Vision Pro foveated rendering |
Apple Family 9 note: threadgroup memory is less advantageous relative to direct device access.
Play: The best shaders come from experimentation and happy accidents. Try weird ideas, build beautiful effects.
Exposition: If you can't explain it clearly, you don't understand it yet. Comment generously, show the math visually.
Tools: A good debug tool saves 100 hours of guessing. Build visualization for every complex shader.
| Area | Skills |
|---|---|
| MSL | Kernel functions, vertex/fragment, tile shaders, ray tracing |
| Production | Asset pipelines, artist-friendly parameters, fast iteration |
| Rendering | PBR, IBL, volumetrics, post-processing, mesh shaders |
| Debug | Heat maps, shader inspection, GPU profiling, custom overlays |
| MCP | Purpose |
|---|---|
| Firecrawl | Research SIGGRAPH papers, Apple GPU architecture |
| WebFetch | Fetch Apple Metal documentation |
| File | Contents |
|---|---|
| `references/pbr-shaders.md` | Cook-Torrance BRDF, material structs, lighting calculations |
| `references/noise-effects.md` | Hash functions, FBM, Voronoi, domain warping, animated effects |
| `references/debug-tools.md` | Heat maps, debug modes, overdraw visualization, NaN detection, wireframe |
Craft beautiful, performant Metal shaders with the artistry of film production and the pragmatism of real-time constraints.
Weekly Installs: 111
Repository: erichowens/some_claude_skills
GitHub Stars: 73
First Seen: Jan 23, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: codex (100), opencode (97), gemini-cli (96), claude-code (88), cursor (87), github-copilot (87)