Redazione RHC : 14 July 2025 19:12
NVIDIA has reported a new vulnerability in its graphics processors, called GPUHammer. The attack, based on the well-known RowHammer technique, lets attackers corrupt other users’ data by exploiting the physical behavior of the video card’s DRAM. It is the first demonstration that a RowHammer attack can be mounted against a GPU rather than a traditional CPU. The researchers used an NVIDIA A6000 graphics card with GDDR6 memory and managed to flip individual bits in video memory, destroying data integrity without any direct access to the victim’s data.
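For readers unfamiliar with the RowHammer mechanism, the following toy Python simulation (purely illustrative, with made-up parameters; a real GPUHammer attack must also defeat the GPU’s caches and the memory controller’s mitigations) models the disturbance effect: repeatedly activating two “aggressor” rows leaks charge into the victim row between them until a bit flips.

```python
import random

ROWS, COLS, FLIP_THRESHOLD = 8, 16, 50_000   # hypothetical toy parameters

memory = [[1] * COLS for _ in range(ROWS)]   # victim data: all bits set
disturbance = [0] * ROWS                     # accumulated disturbance per row

def activate(row: int) -> None:
    """Model one row activation: neighboring rows accumulate disturbance and may flip."""
    for neighbor in (row - 1, row + 1):
        if 0 <= neighbor < ROWS:
            disturbance[neighbor] += 1
            if disturbance[neighbor] >= FLIP_THRESHOLD:
                memory[neighbor][random.randrange(COLS)] ^= 1  # a bit flips
                disturbance[neighbor] = 0

# Double-sided hammering: rows 3 and 5 are the aggressors, row 4 is the victim.
for _ in range(200_000):
    activate(3)
    activate(5)

print("victim row 4:", memory[4])  # some bits have silently flipped from 1 to 0
```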
Of particular concern is the fact that even a single bit flip can compromise the accuracy of an AI model: an ImageNet-trained model that previously achieved 80% accuracy dropped to less than 1% after the attack. This impact turns GPUHammer from a technical curiosity into a powerful tool for disrupting AI infrastructure, capable of silently altering internal model parameters or poisoning training data.
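To see why a single flip is so destructive, note that model weights are stored as floating-point numbers, where the exponent bits control the order of magnitude. The minimal Python sketch below (an illustration, not part of the GPUHammer research code) flips the most significant exponent bit of a float32 weight:

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of a float32 value and return the result."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped

weight = 0.0625                    # a plausible small neural-network weight
corrupted = flip_bit(weight, 30)   # bit 30 is the top exponent bit of a float32
print(weight, "->", corrupted)     # 0.0625 -> ~2.1e+37
```

Written into a live model’s weight tensor, a change of this magnitude propagates through every subsequent layer, which is consistent with the accuracy collapse reported above.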
Unlike CPUs, GPUs often lack built-in security mechanisms such as instruction-level access control or memory parity checking. This makes them more vulnerable to low-level attacks, especially in shared computing environments such as cloud platforms or virtual desktops, where a potential attacker can interfere with neighboring workloads without any direct access to them, creating cross-tenant risks.
Previous research, including the SpecHammer technique, combined the RowHammer and Spectre vulnerabilities to launch speculative-execution attacks. GPUHammer continues this trend, demonstrating that the attack works even in the presence of protection mechanisms such as Target Row Refresh (TRR), previously considered a reliable safeguard.
The consequences of such attacks are particularly dangerous for sectors with strict security and transparency requirements, such as healthcare, finance, and autonomous systems. Introducing uncontrolled bias into AI can violate regulations such as ISO/IEC 27001 or European AI legislation, especially when decisions are made on the basis of corrupted models. To reduce the risk, NVIDIA recommends enabling Error Correction Code (ECC) with the “nvidia-smi -e 1” command. You can check its status with “nvidia-smi -q | grep ECC”. In some cases it may be acceptable to enable ECC only on training nodes or for critical workloads. It is also worth monitoring system logs for ECC-corrected memory errors, which can provide early warning of an attack.
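For convenience, the two commands above can be wrapped in a short script. The sketch below is a minimal illustration, not an official NVIDIA tool; it assumes nvidia-smi is on the PATH, and enabling ECC requires administrator privileges plus a GPU reset (or reboot) to take effect.

```python
import subprocess

def ecc_status() -> str:
    """Return the ECC-related lines of `nvidia-smi -q` (like `nvidia-smi -q | grep ECC`)."""
    out = subprocess.run(["nvidia-smi", "-q"], capture_output=True, text=True, check=True)
    return "\n".join(line for line in out.stdout.splitlines() if "ECC" in line)

def enable_ecc(gpu_index: int = 0) -> None:
    """Enable ECC on one GPU (equivalent to `nvidia-smi -i 0 -e 1`); needs root."""
    subprocess.run(["nvidia-smi", "-i", str(gpu_index), "-e", "1"], check=True)

if __name__ == "__main__":
    print(ecc_status())
```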
It’s worth noting that enabling ECC reduces machine learning performance on the A6000 GPU by about 10% and cuts available memory by 6.25%, i.e. one sixteenth, since on GDDR6 the ECC check bits are stored in-band in the video memory itself rather than on dedicated chips. However, newer GPU models such as the H100 and RTX 5090 are not affected by this vulnerability, as they use on-die error correction.
Of further concern is a related recent development, called CrowHammer, presented by a team from NTT Social Informatics Laboratories and CentraleSupélec. In this case, the attack succeeded in recovering the private key of Falcon, the post-quantum signature algorithm selected for standardization by NIST. The researchers demonstrated that even a single well-targeted bit flip can lead to key recovery given several hundred million signatures, and that with more bit flips, even fewer signatures are needed.
Overall, this highlights the need to rethink how we secure AI models and the infrastructure they run on. Simple data-level protection is no longer sufficient: vulnerabilities arising at the hardware level, all the way down to the video-memory architecture, must be taken into account.