Embedded systems must be able to detect, recover from and report errors. This capability is critical during debugging and for quality control once product manufacturing has commenced. Advanced error handling is especially important for embedded systems, which are often manufactured in large unit volumes and must run mission-critical applications non-stop for extended periods of time.
This white paper describes Machine Check Architecture, a built-in capability that enables Intel processors to detect, report and attempt to recover from system errors observed by the CPU.
Machine check exceptions are often the only available clue during system failures, but diagnosing their cause can be a challenging and time-consuming process. When an Intel® architecture CPU detects a critical machine check exception and the error is not correctable, the processor resets the system to prevent the error situation from getting worse. Machine check exception (MCE) registers capture some of the error data observed by the CPU at the point of failure, and this information can help to diagnose the root cause of the error.
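To give a sense of what the captured error data looks like, the following is a minimal sketch of decoding one machine check status register value. It assumes the architectural IA32_MCi_STATUS bit layout documented in the Intel Software Developer's Manual. The function name decode_mci_status and the sample value are hypothetical and used only for illustration, and the sketch does not show how the raw value is read from the processor, which requires privileged access.

```python
# Flag bit positions in a 64-bit IA32_MCi_STATUS value
# (architectural layout; see the Intel SDM machine-check chapter).
FLAG_BITS = {
    "VAL": 63,    # register holds a valid error record
    "OVER": 62,   # an earlier error record was overwritten
    "UC": 61,     # the error was not corrected
    "EN": 60,     # reporting of this error was enabled
    "MISCV": 59,  # the companion MISC register holds valid data
    "ADDRV": 58,  # the companion ADDR register holds a valid address
    "PCC": 57,    # processor context may be corrupted
}


def decode_mci_status(status: int) -> dict:
    """Split a raw status value into named fields (hypothetical helper)."""
    fields = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["mca_error_code"] = status & 0xFFFF               # bits 15:0
    fields["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    return fields


if __name__ == "__main__":
    # Made-up sample value for illustration, not taken from real hardware.
    sample = 0xB200000000120005
    for name, value in decode_mci_status(sample).items():
        shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{name}: {shown}")
```

In a record like this, the combination of the UC and PCC flags is what distinguishes an uncorrectable error from a corrected one that was merely logged. The meaning of the two error code fields depends on the bank and the processor model, so they are left undecoded here.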
This white paper provides recommendations and a checklist for debugging machine check exceptions on embedded Intel architecture platforms, using the Intel® Core™ Duo and Intel® Core™2 Duo processors as examples.