HBM Read (Standard)
Overview
The hbm_read test establishes the baseline sequential read bandwidth of the GPU's memory subsystem.
Execution Mechanics
This kernel reads memory with volatile loads, splitting each fetch into 32-bit chunks.
- It uses a standard 4x unroll per thread.
- Architecture Fix: Single 128-bit vector loads on unaligned buffers can cause segmentation faults on specific AMD RDNA hardware. This kernel decomposes each fetch into smaller, safer 32-bit chunks to ensure cross-platform stability.
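The access pattern above can be illustrated with a minimal CPU-side C++ sketch. This is not the actual GPU kernel: the names `read_decomposed` and `hbm_read_sketch` are hypothetical, and reading "4x unroll" as four 128-bit elements per loop iteration is an assumption. The XOR into a sink value is only there to keep the loads observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: fetch one 128-bit element as four separate 32-bit
// volatile loads instead of a single 128-bit vector load. Each access only
// needs 4-byte alignment, and `volatile` stops the compiler from eliding
// or merging the loads.
inline std::uint32_t read_decomposed(const volatile std::uint32_t* p) {
    std::uint32_t a = p[0];
    std::uint32_t b = p[1];
    std::uint32_t c = p[2];
    std::uint32_t d = p[3];
    return a ^ b ^ c ^ d;  // fold into one value so the loads are consumed
}

// Hypothetical stand-in for the work of one thread: a sequential read loop,
// manually unrolled 4x (assumed here to mean four 128-bit elements, i.e.
// 16 32-bit words, per iteration).
std::uint32_t hbm_read_sketch(const std::uint32_t* buf, std::size_t word_count) {
    assert(word_count % 16 == 0);  // this sketch does not handle a tail
    const volatile std::uint32_t* p = buf;
    std::uint32_t sink = 0;
    for (std::size_t i = 0; i < word_count; i += 16) {
        sink ^= read_decomposed(p + i);
        sink ^= read_decomposed(p + i + 4);
        sink ^= read_decomposed(p + i + 8);
        sink ^= read_decomposed(p + i + 12);
    }
    return sink;
}
```

In a real GPU kernel the same idea would apply per work-item, with the loop stride set by the launch geometry rather than a simple sequential walk.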
Target Subsystems
- Primary Target: Sequential VRAM Read Bandwidth.
Failure Symptoms
Critical Failures
- Low Throughput: As with the write test, throughput well below the expected baseline can indicate memory controller degradation or ECC intervention.
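A low-throughput check could be sketched roughly as follows. The function names, the GB/s unit (1 GB = 1e9 bytes), and the 0.8 default threshold are all assumptions for illustration; the test's real pass/fail criteria are not stated here.

```cpp
#include <cassert>
#include <cstddef>

// Effective read bandwidth in GB/s, taking 1 GB = 1e9 bytes (assumed unit).
double read_bandwidth_gbps(std::size_t bytes_read, double elapsed_seconds) {
    assert(elapsed_seconds > 0.0);
    return static_cast<double>(bytes_read) / elapsed_seconds / 1e9;
}

// Hypothetical check: flag a run whose measured bandwidth falls below
// min_fraction of the expected baseline. The 0.8 default is illustrative
// only and should be tuned per device.
bool is_low_throughput(double measured_gbps, double baseline_gbps,
                       double min_fraction = 0.8) {
    return measured_gbps < baseline_gbps * min_fraction;
}
```

A flagged run says only that bandwidth is below expectation. Telling memory controller degradation apart from ECC intervention would need additional data, such as the device's ECC error counters.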