arm-cortex-expert by sickn33/antigravity-awesome-skills
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill arm-cortex-expert
resources/implementation-playbook.md.
Target Platforms
Core Competencies
Advanced Topics
CRITICAL: ARM Cortex-M7 has weakly-ordered memory. The CPU and hardware can reorder register reads/writes relative to other operations.
Symptoms of Missing Barriers:
C/C++: Wrap register access with __DMB() (data memory barrier) before/after reads, __DSB() (data synchronization barrier) after writes. Create helper functions: mmio_read(), mmio_write(), mmio_modify().
Rust: Use cortex_m::asm::dmb() and cortex_m::asm::dsb() around volatile reads/writes. Create macros like safe_read_reg!(), safe_write_reg!(), safe_modify_reg!() that wrap HAL register access.
Why This Matters: The M7 reorders memory operations for performance. Without barriers, a register write may not have completed when the next instruction executes, and a read may return a stale value.
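A minimal C sketch of the mmio_read(), mmio_write() and mmio_modify() helpers named above. The MMIO_DMB()/MMIO_DSB() macros and the host fallback are assumptions made so the sketch builds off-target; in a real project the CMSIS __DMB()/__DSB() intrinsics would normally be used directly.

```c
#include <stdint.h>

/* Assumption: raw barrier instructions on 32-bit ARM, a compiler-level full
 * fence elsewhere so the helpers can be tried on a host machine. */
#if defined(__arm__)
#define MMIO_DMB() __asm volatile ("dmb" ::: "memory")
#define MMIO_DSB() __asm volatile ("dsb" ::: "memory")
#else
#define MMIO_DMB() __sync_synchronize()
#define MMIO_DSB() __sync_synchronize()
#endif

/* Read a register with a barrier on each side of the access. */
static inline uint32_t mmio_read(const volatile uint32_t *reg)
{
    MMIO_DMB();                 /* earlier accesses are ordered before the read */
    uint32_t value = *reg;
    MMIO_DMB();                 /* the read is ordered before later accesses */
    return value;
}

/* Write a register, then wait for the write to complete. */
static inline void mmio_write(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
    MMIO_DSB();
}

/* Read-modify-write: clear the bits in clear_mask, then set the bits in set_mask. */
static inline void mmio_modify(volatile uint32_t *reg, uint32_t clear_mask, uint32_t set_mask)
{
    uint32_t value = mmio_read(reg);
    value = (value & ~clear_mask) | set_mask;
    mmio_write(reg, value);
}
```

mmio_modify() is not atomic: if an ISR touches the same register, the call still needs a critical section around it.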
CRITICAL: ARM Cortex-M7 devices (Teensy 4.x, STM32 F7/H7) have data caches. DMA and CPU can see different data without cache maintenance.
Alignment Requirements (CRITICAL):
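As a rough illustration of the alignment concern, the sketch below rounds addresses and sizes to the 32-byte D-cache line of the Cortex-M7 and checks whether a DMA buffer covers whole lines. The helper names are hypothetical and not part of any vendor API.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u   /* Cortex-M7 D-cache line size in bytes */

/* Round an address down to the start of its cache line. */
static inline uintptr_t cache_align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
}

/* Round a byte count up to a whole number of cache lines. */
static inline size_t cache_align_up(size_t size)
{
    return (size + DCACHE_LINE_SIZE - 1u) & ~(size_t)(DCACHE_LINE_SIZE - 1u);
}

/* Nonzero when the buffer starts on a line boundary and spans whole lines,
 * so cache maintenance on it cannot touch neighbouring variables. */
static inline int dma_buffer_is_cache_safe(const void *buf, size_t size)
{
    return ((uintptr_t)buf % DCACHE_LINE_SIZE == 0u) && (size % DCACHE_LINE_SIZE == 0u);
}
```

A check like dma_buffer_is_cache_safe() is cheap enough to run in debug builds before every DMA setup.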
Memory Placement Strategies (Best to Worst):
DTCM/SRAM (Non-cacheable, fastest CPU access)
C/C++: __attribute__((section(".dtcm.bss"))) __attribute__((aligned(32))) static uint8_t buffer[512];
Rust: #[link_section = ".dtcm"] #[repr(C, align(32))] static mut BUFFER: [u8; 512] = [0; 512];
MPU-configured Non-cacheable regions - Configure OCRAM/SRAM regions as non-cacheable via MPU
Cache Maintenance (Last resort - slowest)
Clean (before DMA reads the buffer): arm_dcache_flush_delete() or cortex_m::cache::clean_dcache_by_range()
Invalidate (after DMA writes the buffer): arm_dcache_delete() or cortex_m::cache::invalidate_dcache_by_range()
Best practice: Validate MMIO addresses in debug builds using is_valid_mmio_address(addr), checking that addr is within valid peripheral ranges (e.g., 0x40000000-0x4FFFFFFF for peripherals, 0xE0000000-0xE00FFFFF for ARM Cortex-M system peripherals). Use #ifdef DEBUG guards and halt on invalid addresses.
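One possible shape for the is_valid_mmio_address() check, using only the two address ranges quoted above. The MMIO_ASSERT_VALID() macro name and the busy-loop halt are assumptions; a real build might prefer a breakpoint instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr lies in one of the two ranges given in the text. */
static inline bool is_valid_mmio_address(uintptr_t addr)
{
    bool in_peripherals = (addr >= 0x40000000u) && (addr <= 0x4FFFFFFFu);
    bool in_system      = (addr >= 0xE0000000u) && (addr <= 0xE00FFFFFu);
    return in_peripherals || in_system;
}

/* Hypothetical debug-only guard: spin forever on an invalid address so the
 * debugger stops at the offending call; compiles to nothing in release. */
#ifdef DEBUG
#define MMIO_ASSERT_VALID(addr) \
    do { if (!is_valid_mmio_address((uintptr_t)(addr))) { for (;;) { } } } while (0)
#else
#define MMIO_ASSERT_VALID(addr) ((void)0)
#endif
```

The ranges are device-specific in practice, so the bounds would usually come from the vendor memory map rather than fixed constants.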
Many status registers (especially i.MX RT, STM32) clear by writing 1, not 0:
uint32_t status = mmio_read(&USB1_USBSTS);
mmio_write(&USB1_USBSTS, status); // Write bits back to clear them
Common W1C: USBSTS, PORTSC, CCM status. Wrong: status &= ~bit does nothing on W1C registers.
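To make the W1C behaviour concrete, here is a small host-side model (hypothetical, not a real peripheral) in which every bit written as 1 is cleared. It contrasts writing the flag mask directly with the read-modify-write pattern.

```c
#include <stdint.h>

/* Model of a W1C register: bits written as 1 are cleared, bits written as 0
 * are left unchanged. */
static inline void w1c_model_write(uint32_t *reg_state, uint32_t written)
{
    *reg_state &= ~written;
}

/* Clear only the selected flags: write the mask itself. */
static inline void w1c_clear(uint32_t *reg_state, uint32_t mask)
{
    w1c_model_write(reg_state, mask);
}

/* The mistaken pattern: read, clear the bit in the copy, write the copy back. */
static inline void w1c_clear_wrong(uint32_t *reg_state, uint32_t mask)
{
    uint32_t status = *reg_state;
    status &= ~mask;
    w1c_model_write(reg_state, status);
}
```

In this model the mistaken pattern leaves the targeted flag set and, because the remaining 1 bits are written back, clears every other pending flag instead.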
⚠️ Voltage Tolerances:
Teensy 4.x: FlexSPI dedicated to Flash/PSRAM only • EEPROM emulated (limit writes <10Hz) • LPSPI max 30MHz • Never change CCM clocks while peripherals active
STM32 F7/H7: Clock domain config per peripheral • Fixed DMA stream/channel assignments • GPIO speed affects slew rate/power
nRF52: SAADC needs calibration after power-on • GPIOTE limited (8 channels) • Radio shares priority levels
SAMD: SERCOM needs careful pin muxing • GCLK routing critical • Limited DMA on M0+ variants
static mut
CORRECT Patterns:
static READY: AtomicBool = AtomicBool::new(false);
static STATE: Mutex<RefCell<Option<T>>> = Mutex::new(RefCell::new(None));
// Access: critical_section::with(|cs| STATE.borrow_ref_mut(cs))
WRONG: unsynchronized access to a static mut shared with an ISR is undefined behavior (data race).
Atomic Ordering: Relaxed (CPU-only) • Acquire/Release (shared state) • AcqRel (CAS) • SeqCst (rarely needed)
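The same Acquire/Release idea can be sketched with C11 <stdatomic.h> for readers working in C. The function and variable names are hypothetical, and the sketch assumes one producer (for example an ISR) and one consumer (the main loop).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t sample;            /* plain data published by the producer */
static atomic_bool sample_ready;   /* flag guarding `sample`, starts false */

/* Producer: write the data first, then publish it with a release store. */
static inline void publish_sample(uint32_t value)
{
    sample = value;
    atomic_store_explicit(&sample_ready, true, memory_order_release);
}

/* Consumer: the acquire load pairs with the release store, so once the flag
 * reads true the write to `sample` is visible. Returns true if data was taken. */
static inline bool take_sample(uint32_t *out)
{
    if (!atomic_load_explicit(&sample_ready, memory_order_acquire)) {
        return false;
    }
    *out = sample;
    atomic_store_explicit(&sample_ready, false, memory_order_relaxed);
    return true;
}
```

This is a single-slot mailbox: a second publish before the consumer runs overwrites the earlier sample.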
Platform-Specific Priority Levels:
Key Principles:
Configuration:
C/C++: NVIC_SetPriority(IRQn, priority) or HAL_NVIC_SetPriority()
Rust: NVIC::set_priority() or use PAC-specific functions
Purpose: Protect shared data from concurrent access by ISRs and main code.
C/C++:
__disable_irq(); /* critical section */ __enable_irq(); // Blocks all
// M3/M4/M7: Mask only lower-priority interrupts
uint32_t basepri = __get_BASEPRI();
__set_BASEPRI(priority_threshold << (8 - __NVIC_PRIO_BITS));
/* critical section */
__set_BASEPRI(basepri);
Rust: cortex_m::interrupt::free(|cs| { /* use cs token */ })
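The BASEPRI pattern above can be wrapped in a save/restore pair so critical sections nest safely. In this sketch PRIO_BITS is assumed to be 4 (the real value is __NVIC_PRIO_BITS from the device header), and get_basepri()/set_basepri() are host stand-ins for the CMSIS __get_BASEPRI()/__set_BASEPRI() intrinsics.

```c
#include <stdint.h>

#define PRIO_BITS 4u   /* assumption: 4 implemented priority bits */

/* Convert a logical priority (0 = highest) to the value BASEPRI expects:
 * the implemented bits occupy the top of the 8-bit field. */
static inline uint32_t basepri_encode(uint32_t priority)
{
    return (priority << (8u - PRIO_BITS)) & 0xFFu;
}

/* Host stand-ins so the flow can be exercised off-target. */
static uint32_t fake_basepri;
static inline uint32_t get_basepri(void)       { return fake_basepri; }
static inline void     set_basepri(uint32_t v) { fake_basepri = v; }

/* Mask interrupts at or below the threshold priority; returns the previous
 * BASEPRI so the caller can restore it. */
static inline uint32_t critical_enter(uint32_t priority_threshold)
{
    uint32_t saved = get_basepri();
    set_basepri(basepri_encode(priority_threshold));
    return saved;
}

/* Restore the BASEPRI value that was active before critical_enter(). */
static inline void critical_exit(uint32_t saved)
{
    set_basepri(saved);
}
```

Writing 0 to BASEPRI disables masking, so a threshold of 0 cannot be expressed this way; priority-0 interrupts can only be blocked with __disable_irq().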
Best Practices:
Common Causes:
Inspection Pattern (M3/M4/M7):
HFSR (HardFault Status Register) for fault type
CFSR (Configurable Fault Status Register) for detailed cause
MMFAR / BFAR for faulting address (if valid)
Stacked exception frame: R0-R3, R12, LR, PC, xPSR
Platform Limitations:
Debug Tip: Use a HardFault handler to capture the stack frame and print/log the registers before reset.
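A sketch of the data-handling half of such a handler: copying the eight stacked words and testing the status bits that say whether the fault address registers are usable. The type and function names are hypothetical; obtaining the faulting stack pointer (MSP or PSP) needs a small assembly shim that is not shown.

```c
#include <stdint.h>

/* The eight words the core pushes on exception entry, in stack order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Copy the stacked frame; `sp` is the stack pointer active at the fault. */
static inline fault_frame_t fault_frame_capture(const uint32_t *sp)
{
    fault_frame_t f = { sp[0], sp[1], sp[2], sp[3], sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* HFSR.FORCED (bit 30): a configurable fault escalated to HardFault,
 * so CFSR holds the detailed cause. */
static inline int hfsr_is_forced(uint32_t hfsr) { return (int)((hfsr >> 30) & 1u); }

/* CFSR.MMARVALID (bit 7) and CFSR.BFARVALID (bit 15) indicate whether
 * MMFAR and BFAR contain a valid faulting address. */
static inline int cfsr_mmfar_valid(uint32_t cfsr) { return (int)((cfsr >> 7) & 1u); }
static inline int cfsr_bfar_valid(uint32_t cfsr)  { return (int)((cfsr >> 15) & 1u); }
```

The stacked PC usually points at or just after the faulting instruction, which makes it the most useful single value to log.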
| Feature | M0/M0+ | M3 | M4/M4F | M7/M7F |
|---|---|---|---|---|
| Max Clock | ~50 MHz | ~100 MHz | ~180 MHz | ~600 MHz |
| ISA | Thumb-1 only | Thumb-2 | Thumb-2 + DSP | Thumb-2 + DSP |
| MPU | M0+ optional | Optional | Optional | Optional |
| FPU | No | No | M4F: single precision | M7F: single + double |
| Cache | No | No | No | I-cache + D-cache |
Lazy Stacking (Default on M4F/M7F): FPU context (S0-S15, FPSCR) saved only if ISR uses FPU. Reduces latency for non-FPU ISRs but creates variable timing.
Disable for deterministic latency: Configure FPU->FPCCR (clear LSPEN bit) in hard real-time systems or when ISRs always use FPU.
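Clearing LSPEN amounts to one bit operation on FPCCR. The sketch below keeps it as a pure function so it can be checked off-target; the helper name is hypothetical, and on hardware the result would be written back through FPU->FPCCR followed by __DSB() and __ISB().

```c
#include <stdint.h>

#define FPCCR_LSPEN (1u << 30)   /* lazy state preservation enable */
#define FPCCR_ASPEN (1u << 31)   /* automatic state preservation enable */

/* Return the FPCCR value with lazy stacking turned off and every other bit,
 * including ASPEN, left as it was. */
static inline uint32_t fpccr_without_lazy_stacking(uint32_t fpccr)
{
    return fpccr & ~FPCCR_LSPEN;
}

/* On target (CMSIS), the equivalent would be roughly:
 *   FPU->FPCCR = fpccr_without_lazy_stacking(FPU->FPCCR);
 *   __DSB(); __ISB();
 */
```

With ASPEN still set and LSPEN clear, the FPU context is pushed on every exception entry once the interrupted code has used the FPU, trading stack space and cycles for constant timing.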
MPU Guard Pages (Best): Configure no-access MPU region below stack. Triggers MemManage fault on M3/M4/M7. Limited on M0/M0+.
Canary Values (Portable): Magic value (e.g., 0xDEADBEEF) at stack bottom, check periodically.
Watchdog: Indirect detection via timeout, provides recovery. Best: MPU guard pages, else canary + watchdog.
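The canary approach fits in a few lines. This sketch assumes a descending stack whose lowest address is stack_bottom and guards four words; the word count and function names are illustrative choices.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY       0xDEADBEEFu
#define STACK_CANARY_WORDS 4u   /* assumption: guard the lowest four words */

/* Write the canary pattern at the lowest addresses of the stack region. */
static inline void stack_canary_init(uint32_t *stack_bottom)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        stack_bottom[i] = STACK_CANARY;
    }
}

/* Returns 1 while the pattern is intact, 0 once any guarded word changed. */
static inline int stack_canary_intact(const volatile uint32_t *stack_bottom)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        if (stack_bottom[i] != STACK_CANARY) {
            return 0;
        }
    }
    return 1;
}
```

Calling stack_canary_intact() from a periodic task or just before feeding the watchdog ties the two fallback mechanisms together: a corrupted canary stops the feed and forces a reset.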
Pattern: Create non-blocking SPI drivers with transaction-based read/write:
sensorReadRegister(0x0F) for WHO_AM_I
Platform-specific APIs:
Teensy 4.x: SPI.beginTransaction(SPISettings(speed, order, mode)) → SPI.transfer(data) → SPI.endTransaction()
STM32: HAL_SPI_Transmit() / HAL_SPI_Receive() or LL drivers
nRF52: nrfx_spi_xfer() or nrf_drv_spi_transfer()
SAMD: SERCOM_SPI_MODE_MASTER
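A portable sketch of a transaction-based register read that could sit on top of any of the platform APIs above. Everything here is hypothetical: the spi_xfer_fn hook, the sensor_bus_t type, the extra parameters on sensorReadRegister(), the 0x80 read flag (a common but not universal sensor convention) and the fake device with its made-up ID. The hook is synchronous for brevity; a non-blocking driver would start the transfer and finish in a completion callback.

```c
#include <stddef.h>
#include <stdint.h>

/* Transport hook: one full-duplex transaction of `len` bytes, chip select
 * handled inside the implementation. Returns 0 on success. */
typedef int (*spi_xfer_fn)(const uint8_t *tx, uint8_t *rx, size_t len, void *ctx);

typedef struct {
    spi_xfer_fn xfer;
    void *ctx;
} sensor_bus_t;

#define SENSOR_READ_FLAG 0x80u   /* assumption: address MSB marks a read */

/* Read one register: send the address byte, clock out one dummy byte. */
static inline int sensorReadRegister(const sensor_bus_t *bus, uint8_t reg, uint8_t *value)
{
    uint8_t tx[2] = { (uint8_t)(reg | SENSOR_READ_FLAG), 0x00u };
    uint8_t rx[2] = { 0u, 0u };
    int err = bus->xfer(tx, rx, sizeof tx, bus->ctx);
    if (err != 0) {
        return err;
    }
    *value = rx[1];   /* rx[0] is clocked in while the address goes out */
    return 0;
}

/* Host-side fake device for trying the sketch: answers 0x33 (made-up ID)
 * for register 0x0F and 0x00 for everything else. */
static int fake_xfer(const uint8_t *tx, uint8_t *rx, size_t len, void *ctx)
{
    (void)ctx;
    if (len != 2u) {
        return -1;
    }
    rx[0] = 0u;
    rx[1] = ((tx[0] & 0x7Fu) == 0x0Fu) ? 0x33u : 0x00u;
    return 0;
}
```

Keeping the platform call behind one function pointer means the sensor code stays identical across Teensy, STM32, nRF52 and SAMD builds.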
First Seen
Jan 28, 2026
| Feature | M0/M0+ | M3 | M4/M4F | M7/M7F |
|---|---|---|---|---|
| TCM | No | No | No | ITCM + DTCM |
| DWT | No | Yes | Yes | Yes |
| Fault Handling | Limited (HardFault only) | Full | Full | Full |