EDAC/RAS features
Copyright (c) 2024-2025 HiSilicon Limited.
- Author: Shiju Jose <shiju.jose@huawei.com>
- License: The GNU Free Documentation License, Version 1.2 without Invariant Sections, Front-Cover Texts nor Back-Cover Texts. (dual licensed under the GPL v2)
- Written for: 6.15
Introduction
This document describes how the following EDAC/RAS components plug in and their high-level design:
- Scrub control 
- Error Check Scrub (ECS) control 
- ACPI RAS2 features 
- Post Package Repair (PPR) control 
- Memory Sparing Repair control 
The high-level design is illustrated in the following diagram:
     +-----------------------------------------------+
     |   Userspace - Rasdaemon                       |
     | +-------------+                               |
     | | RAS CXL mem |     +---------------+         |
     | |error handler|---->|               |         |
     | +-------------+     | RAS dynamic   |         |
     | +-------------+     | scrub, memory |         |
     | | RAS memory  |---->| repair control|         |
     | |error handler|     +----|----------+         |
     | +-------------+          |                    |
     +--------------------------|--------------------+
                                |
                                |
+-------------------------------|------------------------------+
|     Kernel EDAC extension for | controlling RAS Features     |
|+------------------------------|----------------------------+ |
|| EDAC Core          Sysfs EDAC| Bus                        | |
||   +--------------------------|---------------------------+| |
||   |/sys/bus/edac/devices/<dev>/scrubX/ |   | EDAC device || |
||   |/sys/bus/edac/devices/<dev>/ecsX/   |<->| EDAC MC     || |
||   |/sys/bus/edac/devices/<dev>/repairX |   | EDAC sysfs  || |
||   +---------------------------|--------------------------+| |
||                           EDAC|Bus                        | |
||                               |                           | |
||   +----------+ Get feature    |      Get feature          | |
||   |          | desc +---------|------+ desc +----------+  | |
||   |EDAC scrub|<-----| EDAC device    |      |          |  | |
||   +----------+      | driver- RAS    |----->| EDAC mem |  | |
||   +----------+      | feature control|      | repair   |  | |
||   |          |<-----|                |      +----------+  | |
||   |EDAC ECS  |      +---------|------+                    | |
||   +----------+    Register RAS|features                   | |
||         ______________________|_____________              | |
|+---------|---------------|------------------|--------------+ |
|  +-------|----+  +-------|-------+     +----|----------+     |
|  |            |  | CXL mem driver|     | Client driver |     |
|  | ACPI RAS2  |  | scrub, ECS,   |     | memory repair |     |
|  | driver     |  | sparing, PPR  |     | features      |     |
|  +-----|------+  +-------|-------+     +------|--------+     |
|        |                 |                    |              |
+--------|-----------------|--------------------|--------------+
         |                 |                    |
+--------|-----------------|--------------------|--------------+
|    +---|-----------------|--------------------|-------+      |
|    |                                                  |      |
|    |            Platform HW and Firmware              |      |
|    +--------------------------------------------------+      |
+--------------------------------------------------------------+
- EDAC feature components - Create feature-specific descriptors, for example the scrub, ECS and memory repair components in the above diagram.
- EDAC device driver for controlling RAS features - Gets each feature’s attribute descriptors from the EDAC RAS feature components, registers the device’s RAS features with the EDAC bus and exposes the feature control attributes via sysfs, for example under /sys/bus/edac/devices/<dev-name>/<feature>X/
- RAS dynamic feature controller - Userspace sample modules in rasdaemon for dynamic scrub/repair control, which issue scrubbing or repair when an excessive number of corrected memory errors is reported in a short span of time.
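As a rough userspace illustration of the last two items, the sketch below first lists the `<feature>X` directories (scrubX, ecsX, repairX) that a device exposes under the EDAC bus, then shows a sliding-window counter that signals when corrected errors exceed a threshold within a time window. This is not rasdaemon's implementation: the names `find_feature_dirs` and `CorrectedErrorTrigger`, and the threshold and window parameters, are hypothetical and chosen only for illustration. What to write into the control attributes once the trigger fires is left out, since the attributes are described in the feature-specific documents.

```python
import re
import time
from collections import deque
from pathlib import Path

# Feature directories follow the <feature>X pattern from the text above,
# where X is an instance number: scrubX, ecsX, repairX.
FEATURE_DIR = re.compile(r"^(scrub|ecs|repair)(\d+)$")


def find_feature_dirs(root="/sys/bus/edac/devices"):
    """Return {device name: sorted feature directory names} under root."""
    found = {}
    root = Path(root)
    if not root.is_dir():
        return found  # no EDAC bus visible on this system
    for dev in sorted(root.iterdir()):
        if not dev.is_dir():  # follows symlinks, as sysfs bus entries are links
            continue
        names = sorted(
            p.name for p in dev.iterdir()
            if p.is_dir() and FEATURE_DIR.match(p.name)
        )
        if names:
            found[dev.name] = names
    return found


class CorrectedErrorTrigger:
    """Hypothetical sliding-window policy: report True when more than
    `threshold` corrected errors were recorded within `window_s` seconds."""

    def __init__(self, threshold, window_s):
        self.threshold = threshold
        self.window_s = window_s
        self.events = deque()  # timestamps of recent corrected errors

    def record(self, now=None):
        """Record one corrected error; return True if action is warranted."""
        now = time.monotonic() if now is None else now
        self.events.append(now)
        # Drop timestamps that have aged out of the window.
        while self.events and now - self.events[0] > self.window_s:
            self.events.popleft()
        return len(self.events) > self.threshold


if __name__ == "__main__":
    for dev, features in find_feature_dirs().items():
        print(dev, features)
```

A controller built this way would call `record()` from its error handler and, when it returns True, request a scrub or repair through the attributes of one of the discovered directories. Keeping the policy separate from the sysfs access makes it possible to test the threshold logic without EDAC hardware.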
RAS features
- Memory Scrub 
Memory scrub features are documented in Scrub Control.
- Memory Repair 
Memory repair features are documented in EDAC Memory Repair Control.