A recently disclosed set of security vulnerabilities in NVIDIA’s Triton Inference Server could allow attackers to take control of susceptible servers. Triton is widely used to run artificial intelligence models at scale on both Windows and Linux.
Wiz researchers Ronen Shustin and Nir Ohfeld reported that, when chained together, the vulnerabilities could allow a remote, unauthenticated attacker to achieve remote code execution (RCE) on an affected server. The flaws are tracked as CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, with CVSS scores of 8.1, 7.5, and 5.9, respectively.
The risks are significant. Beyond code execution, successful exploitation could lead to information disclosure, data tampering, and denial of service. Specifically, CVE-2025-23319 could lead to an out-of-bounds write, CVE-2025-23320 could allow an attacker to exceed the shared memory limit by sending a very large request, and CVE-2025-23334 could lead to an out-of-bounds read. For organizations that rely on Triton for AI/ML workloads, a compromised server could mean the theft of sensitive AI models and valuable data.
NVIDIA has addressed the vulnerabilities in version 25.07, and users are urged to apply the update promptly. There is no evidence to date that the flaws have been exploited in the wild, but administrators of Triton Inference Servers should confirm that their deployments are running the patched release or later.
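For administrators who deploy Triton from container images, one rough way to spot unpatched deployments is to compare the release in the image tag against 25.07. The Python sketch below assumes tags follow the `YY.MM` pattern (for example `nvcr.io/nvidia/tritonserver:25.06-py3`). The function names `parse_release` and `needs_update` are hypothetical helpers written for this illustration, not part of any NVIDIA tooling.

```python
import re

# Release that contains the fixes, per the report above.
PATCHED_RELEASE = (25, 7)


def parse_release(image_ref):
    """Return (year, month) from a tag such as ':25.06-py3', or None.

    Assumes the tag starts with a two-digit year and two-digit month,
    separated by a dot. Tags like ':latest' do not match.
    """
    match = re.search(r":(\d{2})\.(\d{2})(?:[.-]|$)", image_ref)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_update(image_ref):
    """True if the image's release is older than the patched release."""
    release = parse_release(image_ref)
    if release is None:
        # Refuse to guess: an unparseable tag must be checked by hand.
        raise ValueError(f"cannot determine release from {image_ref!r}")
    return release < PATCHED_RELEASE


if __name__ == "__main__":
    for ref in (
        "nvcr.io/nvidia/tritonserver:25.06-py3",
        "nvcr.io/nvidia/tritonserver:25.07-py3",
    ):
        status = "update needed" if needs_update(ref) else "patched release"
        print(f"{ref}: {status}")
```

A tag-based check only reflects how an image was labeled. Images that were rebuilt or retagged locally, or servers installed outside containers, need to be verified against NVIDIA's security bulletin directly.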