SECURITY WIN

vLLM Security Assessment

Identified 1 critical vulnerability involving unsafe torch.load() calls made without weights_only=True in the tensorizer module, potentially allowing arbitrary code execution through malicious checkpoint files. The fix was merged in PR #32045.
January 2026 · 4 min read
Faizan
vLLM, Remote Code Execution, Deserialization, Model Security

Executive Summary

This security assessment was conducted using Kolega.dev's automated security remediation platform, which combines traditional security scanning (SAST, SCA, secrets detection) with proprietary AI-powered deep code analysis. Our two-tier detection approach identified vulnerabilities that standard tools miss, including complex logic flaws and cross-service injection vectors.

Our analysis of the vLLM repository identified 1 vulnerability through Kolega.dev Deep Code Scan (Tier 2) that warrants attention.

Vulnerability Overview

ID | Title | PR/Ticket
V1 | Unsafe torch.load() without weights_only Verification | PR #32045

Responsible Disclosure Timeline

Kolega.dev follows responsible disclosure practices. We coordinated privately through vLLM's official security reporting channel.

January 7, 2026: Initial report sent to vLLM through GitHub Security

January 10, 2026: Response from vLLM confirming that 1 of the reported items was remediated and merged in PR #32045


Vulnerability Details

V1: Unsafe torch.load() without weights_only Verification

CWE: CWE-502 (Deserialization of Untrusted Data)
Location: vllm/model_executor/model_loader/tensorizer.py:763-765; vllm/model_executor/models/adapters.py:93

Description
While some torch.load() calls are configured with weights_only=True, the tensorizer.py module conditionally invokes torch.load() without this safeguard when handling non-safetensors files. This inconsistency introduces a potential arbitrary code execution risk if a malicious checkpoint file is loaded.

Evidence
torch.load() deserializes checkpoints with Python's pickle module. Without weights_only=True, unpickling can execute arbitrary Python code embedded in the checkpoint file. This is a well-known PyTorch security issue.
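
The mechanism behind this issue can be illustrated with the standard library alone, since an unrestricted torch.load() relies on Python's pickle format. The sketch below is not taken from the vLLM code base: the Payload class is a hypothetical stand-in for an object an attacker might embed in a checkpoint, and the harmless expression "6 * 7" takes the place of a real attack command.

```python
import pickle


class Payload:
    """Hypothetical stand-in for a malicious object inside a checkpoint."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" and replays that
        # call at load time. A real attacker would supply a shell command
        # or downloader here; "6 * 7" keeps the demonstration harmless.
        return (eval, ("6 * 7",))


# What the attacker ships: plain bytes that look like any other pickle.
blob = pickle.dumps(Payload())

# What the victim does: loading the bytes runs eval() immediately.
# No Payload object is reconstructed; the call's return value comes back.
result = pickle.loads(blob)
print(result)  # 42
```

The victim never calls any method on the loaded object. Deserialization itself is the point at which the attacker's code runs, which is why the safeguard has to be applied at load time.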

Impact
Remote Code Execution (RCE). An attacker distributing malicious .pt or .bin checkpoint files could achieve arbitrary code execution when the model is loaded.

Remediation

  1. Always use weights_only=True for torch.load() calls.

  2. If full checkpoint loading is required, verify the checkpoint's source and digital signature before loading.

  3. Implement a checkpoint validation layer that inspects checkpoint structure before loading.

  4. Consider switching to safetensors format exclusively.
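
Items 2 and 3 above can be sketched with the standard library. This is a minimal illustration under stated assumptions, not vLLM's or PyTorch's implementation: the names ALLOWED_GLOBALS, AllowlistUnpickler, digest_matches and load_checked are hypothetical, a SHA-256 digest obtained out of band stands in for a full digital signature check, and the input is a raw pickle byte string rather than a complete PyTorch checkpoint archive. The allow-list approach follows the "restricting globals" pattern from the Python pickle documentation.

```python
import hashlib
import hmac
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list: the only globals the loader may import.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def digest_matches(data: bytes, expected_sha256: str) -> bool:
    """Compare the SHA-256 of the bytes with a digest obtained out of band."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha256)


def load_checked(data: bytes, expected_sha256: str):
    """Verify the digest first, then unpickle with restricted globals."""
    if not digest_matches(data, expected_sha256):
        raise ValueError("checkpoint digest mismatch")
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical malicious object, as an attacker might embed it."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


good = pickle.dumps(OrderedDict(weight=[0.1, 0.2]))
bad = pickle.dumps(Payload())

# A benign state dict passes both checks.
state = load_checked(good, hashlib.sha256(good).hexdigest())

# The malicious pickle is rejected before eval() can be called.
try:
    load_checked(bad, hashlib.sha256(bad).hexdigest())
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The two checks cover different failure modes. The digest only proves the file is the one that was published, so a malicious file from a trusted-looking source still passes it, while the allow-list rejects unexpected callables regardless of where the file came from.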
