Hard Faults are the embedded developer’s equivalent of a kernel panic: sudden, catastrophic, and often cryptic. On ARM Cortex-M microcontrollers, they signal that something went very wrong, such as an invalid memory access, a trapped divide-by-zero, or a corrupted stack.
In this post, we’ll explore:
- What causes Hard Faults
- How to extract useful information from the fault handler
- How to decode the stack frame
- Tools and techniques to prevent them
💥 What Triggers a Hard Fault?
A Hard Fault is a type of exception that occurs when the processor encounters a condition it cannot handle. Common causes include:
- Dereferencing a null or invalid pointer
- Executing code from a non-executable region
- Stack overflows or corruption
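- Misaligned memory access (for instructions that require alignment, or for any access when unaligned trapping is enabled)
- Configurable faults (BusFault, MemManage, UsageFault) whose handlers are disabled, so they escalate to a Hard Fault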
🧰 Anatomy of a Hard Fault Handler
When a Hard Fault occurs, the processor pushes a stack frame onto the current stack (MSP or PSP), which includes:
- R0–R3, R12: General-purpose registers
- LR: Link register
- PC: Program counter at the time of the fault
- xPSR: Program status register
You can write a custom handler to extract this data:
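#include <stdint.h>
#include <stdio.h>

void hard_fault_handler_c(uint32_t *stacked_regs);

// Naked: no compiler-generated prologue, so LR and SP are untouched here.
// These instructions need Thumb-2 (Cortex-M3 and later).
__attribute__((naked)) void HardFault_Handler(void) {
    __asm volatile (
        "TST lr, #4 \n"               // EXC_RETURN bit 2: 0 = MSP, 1 = PSP
        "ITE EQ \n"
        "MRSEQ r0, MSP \n"
        "MRSNE r0, PSP \n"
        "B hard_fault_handler_c \n"   // r0 = pointer to the stacked frame
    );
}

void hard_fault_handler_c(uint32_t *stacked_regs) {
    uint32_t r0 = stacked_regs[0];
    uint32_t r1 = stacked_regs[1];
    uint32_t r2 = stacked_regs[2];
    uint32_t r3 = stacked_regs[3];
    uint32_t r12 = stacked_regs[4];
    uint32_t lr = stacked_regs[5];
    uint32_t pc = stacked_regs[6];
    uint32_t psr = stacked_regs[7];
    printf("Hard Fault!\n");
    printf("R0 = 0x%08lX R1 = 0x%08lX\n", (unsigned long)r0, (unsigned long)r1);
    printf("R2 = 0x%08lX R3 = 0x%08lX\n", (unsigned long)r2, (unsigned long)r3);
    printf("R12 = 0x%08lX\n", (unsigned long)r12);
    printf("PC = 0x%08lX\n", (unsigned long)pc);
    printf("LR = 0x%08lX\n", (unsigned long)lr);
    printf("xPSR = 0x%08lX\n", (unsigned long)psr);
    // Add more logging or store to non-volatile memory
    for (;;) { }  // Halt here so a debugger can inspect the state
}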
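The index-based access above can also be written with a struct that mirrors the eight stacked words. This is only a sketch: the names `exception_frame_t` and `frame_from_stack` are made up for illustration, and it assumes the basic frame without floating-point state (cores with an active FPU context stack a longer frame).

```c
#include <stdint.h>
#include <string.h>

/* Basic (non-FPU) exception frame, in the order the core stacks it. */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;    /* link register value at the time of the fault */
    uint32_t pc;    /* program counter at the time of the fault */
    uint32_t xpsr;  /* program status register */
} exception_frame_t;

/* Hypothetical helper: copy the eight stacked words into a typed frame.
 * memcpy avoids aliasing problems from casting the raw pointer. */
exception_frame_t frame_from_stack(const uint32_t *stacked_regs)
{
    exception_frame_t frame;
    memcpy(&frame, stacked_regs, sizeof frame);
    return frame;
}
```

Inside `hard_fault_handler_c`, a call such as `exception_frame_t f = frame_from_stack(stacked_regs);` would then let you refer to `f.pc` and `f.lr` by name.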
🔍 Decoding the Fault Address
Once you have the Program Counter (PC), you can map it back to the source code using your ELF file and addr2line:
arm-none-eabi-addr2line -e firmware.elf 0x08001234
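If the ELF file was built with debug information, this prints the file and line number of the stacked PC. For most faults that is the faulting instruction; for imprecise bus faults the PC may already be a few instructions past the offending access. If the PC is in a library or system call, you may need to inspect the call stack to trace back to your code.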
🧪 Diagnosing Common Fault Scenarios
🧷 Null Pointer Dereference
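If PC is 0x00000000 or another unexpectedly low address, the code most likely called through a null or corrupted function pointer. A null data pointer dereference looks different: PC points at the load or store instruction, and the faulting address is at or near zero. Keep in mind that on many Cortex-M parts address 0 is readable (it is usually flash or a vector table alias), so a null read may not fault at all.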
🌀 Stack Overflow
If the stack pointer (SP) is outside the valid RAM range, or if the fault occurs deep in a recursive call, suspect a stack overflow. Use a memory map to verify.
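One practical consequence: if SP itself is bad, reading the stacked registers inside the fault handler can fault again. A range check before touching the frame is a cheap guard. The sketch below assumes you pass in your RAM start and size from the linker script or map file; `frame_in_ram` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXCEPTION_FRAME_BYTES 32u  /* eight stacked words, basic frame */

/* Returns true if a full basic exception frame starting at frame_addr
 * lies inside [ram_start, ram_start + ram_size). */
bool frame_in_ram(uint32_t frame_addr, uint32_t ram_start, uint32_t ram_size)
{
    if (ram_size < EXCEPTION_FRAME_BYTES || frame_addr < ram_start) {
        return false;
    }
    /* Subtraction form avoids overflow near the top of the address space. */
    return (frame_addr - ram_start) <= (ram_size - EXCEPTION_FRAME_BYTES);
}
```

In a handler you might call it as `frame_in_ram((uint32_t)stacked_regs, 0x20000000u, 0x00020000u)`, where both numbers are placeholders for your part's actual RAM layout, and skip the register dump when it returns false.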
🧨 Invalid Memory Access
If the fault address (reported in BFAR or MMFAR when the corresponding valid bit is set) is in a peripheral or flash region, check for misconfigured pointers or DMA transfers.
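On ARMv7-M cores (Cortex-M3/M4/M7), the Configurable Fault Status Register (CFSR) records which of these scenarios occurred. Here is a rough sketch of a decoder. The bit positions follow the ARMv7-M architecture, but the function names and message strings are my own, and only a subset of the bits is covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected CFSR bits (register at 0xE000ED28 on ARMv7-M). */
#define CFSR_IACCVIOL    (1u << 0)   /* MemManage: instruction access */
#define CFSR_DACCVIOL    (1u << 1)   /* MemManage: data access */
#define CFSR_MMARVALID   (1u << 7)   /* MMFAR holds a valid address */
#define CFSR_IBUSERR     (1u << 8)   /* BusFault: instruction fetch */
#define CFSR_PRECISERR   (1u << 9)   /* BusFault: precise data access */
#define CFSR_IMPRECISERR (1u << 10)  /* BusFault: imprecise data access */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds a valid address */
#define CFSR_UNDEFINSTR  (1u << 16)  /* UsageFault: undefined instruction */
#define CFSR_INVSTATE    (1u << 17)  /* UsageFault: invalid EPSR state */
#define CFSR_UNALIGNED   (1u << 24)  /* UsageFault: unaligned access */
#define CFSR_DIVBYZERO   (1u << 25)  /* UsageFault: divide by zero */

/* Returns a short description of the first cause bit found. */
const char *describe_cfsr(uint32_t cfsr)
{
    if (cfsr & CFSR_DIVBYZERO)   return "UsageFault: divide by zero";
    if (cfsr & CFSR_UNALIGNED)   return "UsageFault: unaligned access";
    if (cfsr & CFSR_UNDEFINSTR)  return "UsageFault: undefined instruction";
    if (cfsr & CFSR_INVSTATE)    return "UsageFault: invalid state";
    if (cfsr & CFSR_PRECISERR)   return "BusFault: precise data access";
    if (cfsr & CFSR_IMPRECISERR) return "BusFault: imprecise data access";
    if (cfsr & CFSR_IBUSERR)     return "BusFault: instruction fetch";
    if (cfsr & CFSR_DACCVIOL)    return "MemManage: data access violation";
    if (cfsr & CFSR_IACCVIOL)    return "MemManage: instruction access violation";
    return "no decoded cause";
}

/* Writes the faulting address to *addr if the hardware captured one. */
bool fault_address(uint32_t cfsr, uint32_t mmfar, uint32_t bfar, uint32_t *addr)
{
    if (cfsr & CFSR_BFARVALID) { *addr = bfar;  return true; }
    if (cfsr & CFSR_MMARVALID) { *addr = mmfar; return true; }
    return false;
}
```

With CMSIS headers you would typically feed these from `SCB->CFSR`, `SCB->MMFAR`, and `SCB->BFAR`. An invalid-state UsageFault is also the usual signature of a call through a null function pointer, since address 0 has the Thumb bit clear.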
🛡️ Preventing Hard Faults
✅ Enable All Fault Handlers
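On Cortex-M3/M4/M7, the MemManage, BusFault, and UsageFault handlers are disabled at reset, so those faults escalate to a Hard Fault. Enable them explicitly by setting the MEMFAULTENA, BUSFAULTENA, and USGFAULTENA bits in the System Handler Control and State Register (SHCSR).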
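A minimal sketch of what that can look like, assuming an ARMv7-M core. The function takes register pointers as parameters (my choice, not a CMSIS convention) so the same code can be exercised on a host; `enable_configurable_faults` is a hypothetical name.

```c
#include <stdint.h>

/* SHCSR enable bits (System Handler Control and State Register). */
#define SHCSR_MEMFAULTENA (1u << 16)
#define SHCSR_BUSFAULTENA (1u << 17)
#define SHCSR_USGFAULTENA (1u << 18)

/* CCR trap bit (Configuration and Control Register). */
#define CCR_DIV_0_TRP     (1u << 4)

/* Enable the three configurable fault handlers and divide-by-zero trapping. */
void enable_configurable_faults(volatile uint32_t *shcsr, volatile uint32_t *ccr)
{
    *shcsr |= SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA;
    *ccr   |= CCR_DIV_0_TRP;  /* without this, integer divide by zero returns 0 */
}
```

On a target with CMSIS headers, the call would be along the lines of `enable_configurable_faults(&SCB->SHCSR, &SCB->CCR);` early in startup. You also need to provide `MemManage_Handler`, `BusFault_Handler`, and `UsageFault_Handler` so the vectors point somewhere useful.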
This gives you more granular fault information.
🧵 Use Stack Canaries
Insert known values at the limit of the stack (its lowest address, since the Cortex-M stack grows downward) and check them periodically to detect overflows.
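Here is one way that could look. The pattern value, the number of guard words, and the function names are all arbitrary choices for this sketch; `stack_limit` is assumed to point at the lowest address of the stack region, which you would normally get from a linker symbol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY_WORD  0xDEADBEEFu
#define STACK_CANARY_WORDS 4u

/* Fill the lowest words of the stack region with a known pattern. */
void stack_canary_init(uint32_t *stack_limit)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        stack_limit[i] = STACK_CANARY_WORD;
    }
}

/* Returns false if any guard word has been overwritten. */
bool stack_canary_intact(const uint32_t *stack_limit)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        if (stack_limit[i] != STACK_CANARY_WORD) {
            return false;
        }
    }
    return true;
}
```

Call `stack_canary_init` once at startup and `stack_canary_intact` from a periodic task or the idle loop. A canary only detects an overflow after the fact, and only if the overflow actually writes to the guard words, so treat it as a tripwire rather than a guarantee.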
🧰 Static Analysis
Use tools like Cppcheck, Coverity, or Clang Static Analyzer to catch pointer misuse and undefined behavior before runtime.
🧪 Unit Tests with Fault Injection
Simulate invalid memory access or corrupted pointers in test environments to validate fault handling logic.
🧭 Final Thoughts
Hard Faults can feel like black holes in your firmware—silent until they crash everything. But with the right tools and a methodical approach, you can extract meaningful insights from even the most cryptic crashes.
Pro tip: Always log the fault context to non-volatile memory or a debug port. It’s your black box recorder when things go wrong in the field.
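As a rough illustration of that black box idea, the sketch below packs the fault context into a small record guarded by a magic number and a checksum, so the next boot can tell a real record from random memory contents. The record layout, the magic value, and every function name here are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAULT_RECORD_MAGIC 0xFA17C0DEu  /* arbitrary marker value */

typedef struct {
    uint32_t magic;
    uint32_t pc;
    uint32_t lr;
    uint32_t psr;
    uint32_t cfsr;
    uint32_t checksum;  /* XOR of the five words above */
} fault_record_t;

uint32_t fault_record_checksum(const fault_record_t *rec)
{
    return rec->magic ^ rec->pc ^ rec->lr ^ rec->psr ^ rec->cfsr;
}

/* Fill the record from the fault handler. */
void fault_record_save(fault_record_t *rec, uint32_t pc, uint32_t lr,
                       uint32_t psr, uint32_t cfsr)
{
    rec->magic = FAULT_RECORD_MAGIC;
    rec->pc = pc;
    rec->lr = lr;
    rec->psr = psr;
    rec->cfsr = cfsr;
    rec->checksum = fault_record_checksum(rec);
}

/* Check on boot whether the record holds a fault from the previous run. */
bool fault_record_valid(const fault_record_t *rec)
{
    return rec->magic == FAULT_RECORD_MAGIC &&
           rec->checksum == fault_record_checksum(rec);
}
```

Where the record lives is up to your platform: a RAM section that startup code does not zero (assuming your linker script provides one), backup registers, or flash. After reporting a valid record on boot, clear its magic so the same fault is not reported twice.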