PyTorch Lightning | HiddenLayer

Products Impacted

Lightning AI’s pytorch-lightning.

CVSS Score: 7.8

AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE Categorization

CWE-502: Deserialization of Untrusted Data.

Details

The vulnerability stems from the convert_zero_checkpoint_to_fp32_state_dict function in lightning/pytorch/utilities/deepspeed.py:

def convert_zero_checkpoint_to_fp32_state_dict(
    checkpoint_dir: _PATH, output_file: _PATH, tag: str | None = None
) -> dict[str, Any]:
    """Convert ZeRO 2 or 3 checkpoint into a single fp32 consolidated ``state_dict`` file that can be loaded with
    ``torch.load(file)`` + ``load_state_dict()`` and used for training without DeepSpeed. It gets copied into the top
    level checkpoint dir, so the user can easily do the conversion at any point in the future. Once extracted, the
    weights don't require DeepSpeed and can be used in any application. Additionally the script has been modified to
    ensure we keep the lightning state inside the state dict for being able to run
    ``LightningModule.load_from_checkpoint('...')```.

    Args:
        checkpoint_dir: path to the desired checkpoint folder.
            (one that contains the tag-folder, like ``global_step14``)
        output_file: path to the pytorch fp32 state_dict output file (e.g. path/pytorch_model.bin)
        tag: checkpoint tag used as a unique identifier for checkpoint. If not provided will attempt
            to load tag in the file named ``latest`` in the checkpoint folder, e.g., ``global_step14``

    Examples::

        # Lightning deepspeed has saved a directory instead of a file
        convert_zero_checkpoint_to_fp32_state_dict(
            "lightning_logs/version_0/checkpoints/epoch=0-step=0.ckpt/",
            "lightning_model.pt"
        )

    """
...
    zero_stage = optim_state["optimizer_state_dict"]["zero_stage"]
    model_file = get_model_state_file(checkpoint_dir, zero_stage)
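    # model_file comes from the caller-supplied checkpoint directory, and
    # torch.load unpickles its contents.
    client_state = torch.load(model_file, map_location=CPU_DEVICE)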
...

The function converts a checkpoint into a single consolidated file. Unlike the other functions in this report, it takes a directory rather than a single file. The directory must contain a file named latest, which holds the name of a subdirectory containing a PyTorch file that follows the naming convention *_optim_states.pt. Loading this file yields a state that determines which model state file, also located in that subdirectory, is loaded next. That file is named either mp_rank_00_model_states.pt or zero_pp_rank_0_mp_rank_00_model_states.pt, and it is the file loaded in this exploit.

from lightning.pytorch.utilities.deepspeed import convert_zero_checkpoint_to_fp32_state_dict

checkpoint = "./checkpoint"
convert_zero_checkpoint_to_fp32_state_dict(checkpoint, "out.pt")
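
The directory layout described above can be sketched without loading anything. The helper below is hypothetical and is not Lightning's or DeepSpeed's code: it only reads the latest file and resolves paths. The mapping from ZeRO stage to model state file name, the tag name global_step14, and the optimizer file name are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def resolve_checkpoint_files(checkpoint_dir, zero_stage):
    """Hypothetical helper: resolve the files the conversion would read."""
    root = Path(checkpoint_dir)
    # "latest" holds the name of the tag subdirectory.
    tag = (root / "latest").read_text().strip()
    tag_dir = root / tag
    # Optimizer state files follow the *_optim_states.pt convention.
    optim_files = sorted(tag_dir.glob("*_optim_states.pt"))
    # Assumption: stage 3 uses the zero_pp_ prefixed name, lower stages do not.
    if zero_stage == 3:
        model_name = "zero_pp_rank_0_mp_rank_00_model_states.pt"
    else:
        model_name = "mp_rank_00_model_states.pt"
    return optim_files, tag_dir / model_name


# Build an empty placeholder layout in a temporary directory and resolve it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "checkpoint"
    tag_dir = root / "global_step14"
    tag_dir.mkdir(parents=True)
    (root / "latest").write_text("global_step14")
    (tag_dir / "zero_pp_rank_0_mp_rank_00_optim_states.pt").touch()
    (tag_dir / "mp_rank_00_model_states.pt").touch()

    optim_files, model_file = resolve_checkpoint_files(root, zero_stage=2)
    optim_names = [p.name for p in optim_files]
    model_name = model_file.name
    model_exists = model_file.exists()
```

Every file in this chain sits inside the directory the caller passes in, so whoever supplies the directory controls which files get loaded.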

The PyTorch file contains a data.pkl file, which is unpickled during loading. Pickle is an inherently unsafe format: loading a pickle can execute arbitrary code. If a user loads a compromised checkpoint, attacker-controlled code can run on their system.
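
A minimal, harmless sketch of why unpickling is unsafe, using only Python's standard pickle module rather than PyTorch. The Payload class and the restricted loader are illustrative names, not part of Lightning or of the exploit. The payload makes the unpickler call the built-in len, where an attacker would substitute a dangerous callable. The second half follows the "restricting globals" pattern from the Python documentation and refuses every global lookup.

```python
import io
import pickle


class Payload:
    # __reduce__ tells pickle which callable to invoke at load time.
    # Here it is the harmless built-in len; an attacker picks the callable.
    def __reduce__(self):
        return (len, ("attacker-chosen",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload object: it calls len("attacker-chosen")
# and returns whatever that call returns.
result = pickle.loads(blob)


class NoGlobalsUnpickler(pickle.Unpickler):
    # Refuse to resolve any global, so no callable can be looked up.
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Illustrative loader that rejects pickles referencing any global."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The call happens as a side effect of loading, before the caller ever inspects the result, which is why checking the loaded object afterwards offers no protection.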

Project URL

https://lightning.ai/docs/pytorch/stable/

https://github.com/Lightning-AI/pytorch-lightning

Researcher: Kasimir Schulz, Director, Security Research, HiddenLayer

