NVIDIA is urging users to enable System-Level Error-Correcting Code (ECC) as a mitigation against Rowhammer attacks targeting graphics processors equipped with GDDR6 memory.
The company reinforced this recommendation after new research demonstrated a successful Rowhammer attack on the NVIDIA A6000 GPU (graphics processing unit). Rowhammer is a hardware fault that can be triggered from software: because memory cells sit so close together, activity in one part of memory can disturb neighboring cells, and in this case the affected memory is the GPU's.
A Rowhammer exploit works by repeatedly accessing a chosen memory row, which can cause bits in nearby rows to flip from 0 to 1 or from 1 to 0, corrupting the data held in memory. The consequences of such an attack can range from data corruption to denial of service and even privilege escalation.
NVIDIA’s security notice references academic research from the University of Toronto that demonstrated this vulnerability. The researchers developed a technique called GPUHammer, which can induce bit flips in GPU memory banks. GDDR6 makes Rowhammer attacks harder because of its higher latency and faster refresh rates, but the research shows that exploiting the vulnerability remains feasible. More information about these attacks can be found in the researchers' published study.
NVIDIA recommends enabling System-Level ECC for several GPU families, particularly workstation and data center products used for large datasets and AI workloads. A complete list of affected products, including various models from the Ampere, Ada, Hopper, Blackwell, Turing, and Volta series, can be found in NVIDIA’s advisory.
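To illustrate why error correction helps against the single-bit flips that Rowhammer induces, here is a minimal Python sketch of a textbook Hamming(7,4) code. It is a toy stand-in chosen for brevity, not the code NVIDIA's System-Level ECC actually uses; the function names `hamming74_encode` and `hamming74_decode` are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if no flip was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the flipped bit.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1  # correct the single-bit error
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    stored = hamming74_encode([1, 0, 1, 1])
    corrupted = list(stored)
    corrupted[4] ^= 1  # simulate one Rowhammer-style bit flip
    recovered, position = hamming74_decode(corrupted)
    print("recovered data:", recovered, "flipped position:", position)
```

A code like this corrects any single flipped bit per codeword; two or more flips in the same codeword would be miscorrected, which is why production ECC schemes use stronger codes that can at least detect double-bit errors.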
Newer GPU models such as the Blackwell RTX 50 Series and Hopper Data Center products offer built-in on-die ECC protection, which requires no user intervention. To verify whether System-Level ECC is enabled, users can query the system's Baseboard Management Controller (BMC) through hardware interface software such as the Redfish API, or use the nvidia-smi command-line tool.
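As a rough sketch of the nvidia-smi route, the Python script below queries each GPU's current and pending ECC mode and flags devices where ECC is not enabled. It assumes `nvidia-smi` is on the PATH and that the driver supports the `ecc.mode.current` and `ecc.mode.pending` query fields; the helper names `parse_ecc_report` and `query_ecc_modes` are hypothetical.

```python
import csv
import io
import shutil
import subprocess

# Query fields are standard nvidia-smi --query-gpu properties; output is one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,ecc.mode.current,ecc.mode.pending",
    "--format=csv,noheader",
]


def parse_ecc_report(text):
    """Parse rows such as '0, NVIDIA RTX A6000, Disabled, Enabled' into dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != 4:
            continue  # skip blank or unexpected lines
        index, name, current, pending = (field.strip() for field in record)
        rows.append(
            {"index": index, "name": name, "current": current, "pending": pending}
        )
    return rows


def query_ecc_modes():
    """Run nvidia-smi and return the parsed ECC report."""
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found on PATH")
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_ecc_report(result.stdout)


if __name__ == "__main__":
    for gpu in query_ecc_modes():
        status = "OK" if gpu["current"] == "Enabled" else "ECC NOT ENABLED"
        print(f"GPU {gpu['index']} ({gpu['name']}): current={gpu['current']}, "
              f"pending={gpu['pending']} -> {status}")
```

GPUs that do not support ECC typically report `[N/A]` in these fields, so the script would flag them as not enabled. On supported devices, ECC can usually be turned on with `nvidia-smi -e 1`, with the change taking effect after a reboot or GPU reset; check NVIDIA's advisory for the procedure that applies to a given product.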
Rowhammer remains a significant security concern, especially in multi-tenant cloud environments. Although exploiting the vulnerability requires intricate execution and specific conditions, the associated risks, including potential data corruption, are serious enough to warrant user vigilance.