Researchers find widespread remote code execution risk in AI inference engines from unsafe ZMQ and pickle use


November 15, 2025

Cybersecurity researchers reported critical remote code execution vulnerabilities affecting several AI inference engines, including components from Meta, NVIDIA, Microsoft and open-source projects such as vLLM and SGLang. Oligo Security researcher Avi Lumelsky said the flaws all trace to an overlooked unsafe use of ZeroMQ and Python's pickle deserialization.

At the center of the issue is a pattern the researchers call ShadowMQ, in which insecure deserialization logic has been copied from project to project. The original vulnerability was reported in Meta's Llama framework: it used ZeroMQ's recv_pyobj() to accept pickled objects over a network-exposed socket, so an attacker able to send crafted data to that socket could trigger arbitrary code execution. Maintainers also addressed related issues in the pyzmq library.
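To illustrate why unpickling network data is dangerous, here is a minimal Python sketch. It is not taken from any of the affected projects. It uses only the standard library and calls pickle.loads() directly on the assumption that this is what happens to bytes received through recv_pyobj(). The Payload class and the decode_json_message() helper are hypothetical names, and the "attack" only evaluates a harmless arithmetic expression.

```python
import json
import pickle


class Payload:
    """Hypothetical attacker-controlled object (illustration only)."""

    # pickle calls __reduce__ when serializing. The callable and arguments
    # it returns are invoked by pickle.loads() on the receiving side.
    def __reduce__(self):
        # A real attacker would pick something like os.system here.
        # A harmless expression keeps the demo safe to run.
        return (eval, ("6 * 7",))


# Bytes an attacker could send to a socket that unpickles its input.
wire_bytes = pickle.dumps(Payload())

# Assumed equivalent of what the receiver does with those bytes.
# The sender-chosen callable runs before any application code sees the data.
result = pickle.loads(wire_bytes)


def decode_json_message(raw: bytes) -> dict:
    """Parse a data-only format: values are built, no callables are invoked."""
    message = json.loads(raw.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    return message
```

Here the receiver never calls eval itself; the call is encoded in the received bytes and runs as a side effect of deserialization. A data-only format such as JSON avoids that class of problem, although the parsed values still need validation.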

Oligo said the same unsafe pattern – pickle deserialization over unauthenticated ZMQ TCP sockets – recurred in multiple inference frameworks. The issues have been assigned identifiers including CVE-2025-30165 for vLLM (CVSS 8.0), CVE-2025-23254 for NVIDIA TensorRT-LLM (CVSS 8.8, fixed in version 0.18.2) and CVE-2025-60455 for Modular Max Server, which was addressed in a commit. vLLM maintainers have switched to the V1 engine by default, and SGLang has implemented fixes that have been described as incomplete, while Microsoft's Sarathi-Serve remained unpatched at the time of the disclosure.
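One general way to harden a message channel against this pattern is to authenticate each frame before parsing it and to parse only a data-only format. The Python sketch below shows that idea with a shared-key HMAC and JSON. It is an assumption-laden illustration, not the fix any of the named projects shipped: seal(), open_sealed() and the frame layout are hypothetical, key distribution is out of scope, and there is no replay protection. ZeroMQ also has its own authentication mechanisms, which this sketch does not use.

```python
import hashlib
import hmac
import json

TAG_SIZE = hashlib.sha256().digest_size  # 32-byte HMAC-SHA256 tag


def seal(key: bytes, message: dict) -> bytes:
    """Serialize a message as JSON and prepend an HMAC tag (hypothetical layout)."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def open_sealed(key: bytes, frame: bytes) -> dict:
    """Verify the tag first, then parse. Unauthenticated bytes are never parsed."""
    if len(frame) < TAG_SIZE:
        raise ValueError("frame too short")
    tag, body = frame[:TAG_SIZE], frame[TAG_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid leaking tag bytes.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    message = json.loads(body.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    return message
```

A sender would call seal(key, {"op": "generate"}) and transmit the resulting bytes as a single frame; the receiver passes the raw frame to open_sealed() and rejects anything that fails verification.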

Researchers traced how the vulnerability pattern propagated, saying it often arose from direct code reuse or copy-pasting. Vulnerable files show signs of having been adapted from other projects, and where one project borrowed logic from another, it repeated the same unsafe practice in a new codebase, the report said.

Because inference engines are often deployed as cluster nodes, a successful compromise of a single node could allow attackers to execute arbitrary code, escalate privileges, steal models or deploy persistent malicious payloads such as cryptocurrency miners, the researchers warned.

Separately, AI security platform Knostic reported techniques to compromise Cursor's built-in browser via JavaScript injection and by registering a rogue local Model Context Protocol server. The report said a malicious MCP server can replace browser pages to harvest credentials.

The Knostic report also demonstrated how a malicious extension or injected JavaScript can run inside the Node.js interpreter used by the IDE, inheriting full file-system privileges and the ability to modify or persist code and extensions, which the company said could turn the IDE into a malware distribution and exfiltration platform. The researchers recommended disabling Auto-Run features, vetting extensions, installing MCP servers only from trusted sources, using least-privilege API keys and auditing MCP server code for critical integrations.

