HiddenLayer is the leading provider of Security for AI. Its security platform helps enterprises safeguard the machine learning models behind their most important products. HiddenLayer is the only company to offer turnkey security for AI that does not add unnecessary complexity to models and does not require access to raw data and algorithms. Founded by a team with deep roots in security and ML, HiddenLayer aims to protect enterprise AI from inference, bypass, and extraction attacks, as well as model theft. The company is backed by a group of strategic investors, including M12, Microsoft’s Venture Fund, Moore Strategic Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures.
HiddenLayer’s Synaptic Adversarial Intelligence (SAI) team consists of multidisciplinary cybersecurity experts and data scientists dedicated to raising awareness about threats to machine learning and artificial intelligence systems. Our mission is to educate data scientists, MLDevOps teams, and cybersecurity practitioners on evaluating ML/AI vulnerabilities and risks, promoting more security-conscious implementations and deployments.
During our research, we identify numerous vulnerabilities within ML/AI projects. While our research blogs cover those that we consider to be most impactful, some affect only specific projects or use cases. We’ve therefore created this dedicated space to share all of our findings, enabling users within our community to keep updated on new vulnerabilities, including security issues that have not been assigned a CVE.
July 2024
SAI-ADV-2024-001: Model Deserialization Leads to Code Execution
TensorFlow Probability
An attacker can create a maliciously crafted HDF5 file by injecting a pickle object containing arbitrary code into the DistributionLambda layer of the model under the make_distribution_fn key, and share it with a victim. If the victim is using TensorFlow Probability v0.7 or later and loads the malicious model, the object will be deserialized and arbitrary code will execute on their system.
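The underlying mechanism here is generic to Python’s pickle protocol: deserializing untrusted data can invoke arbitrary callables via an object’s `__reduce__` method. A minimal, benign sketch (using `eval` on harmless arithmetic as a stand-in for attacker code):

```python
import pickle

class Payload:
    """Benign stand-in for a malicious pickle payload."""
    def __reduce__(self):
        # On unpickling, pickle calls the returned callable with these args.
        # A real exploit would return something like (os.system, ("...",)).
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the callable runs during deserialization
print(result)  # 42
```

This is why loading a pickled model from an untrusted source is equivalent to running untrusted code.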
June 2024
CVE-2024-37061: Remote Code Execution on Local System via MLproject YAML File
MLflow
An attacker can package an MLflow Project in which the main entry point command set in the MLproject file contains malicious code (or an operating-system-appropriate command), and share it with a victim. When the victim runs the project, the command is executed on their system.
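For illustration, an MLproject file is plain YAML whose entry point commands are executed verbatim by `mlflow run`. A hypothetical malicious project might look like the following (the appended `echo` is a benign stand-in for attacker code; the file and project names are invented):

```yaml
name: innocuous-project
entry_points:
  main:
    command: "python train.py; echo attacker-command-runs-here"
```

Anything after the command separator executes with the privileges of the user who ran the project.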
CVE-2024-37060: Pickle Load on Recipe Run Leading to Code Execution
MLflow
An attacker can create an MLflow Recipe containing a malicious pickle file and a Python file that calls BaseCard.load on it, and share it with a victim. When the victim runs mlflow run against the Recipe directory, the pickle file is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37059: Cloudpickle Load on PyTorch Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a PyTorch model file and log it to the MLflow tracking server via the API using the mlflow.pytorch.log_model function. When a victim user calls the mlflow.pytorch.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37058: Cloudpickle Load on Langchain Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object during the process of creating a LangChain model and log the model to the MLflow tracking server via the API using the mlflow.langchain.log_model function. When a victim user calls the mlflow.langchain.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37057: Cloudpickle Load on TensorFlow Keras Model Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a TensorFlow model file and log it to the MLflow tracking server via the API using the mlflow.tensorflow.log_model function. When a victim user calls the mlflow.tensorflow.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37056: Cloudpickle Load on LightGBM SciKit Learn Model Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a LightGBM scikit-learn model file and log it to the MLflow tracking server via the API using the mlflow.lightgbm.log_model function. When a victim user calls the mlflow.lightgbm.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37055: Pickle Load on Pmdarima Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a pmdarima model file and log it to the MLflow tracking server via the API using the mlflow.pmdarima.log_model function. When a victim user calls the mlflow.pmdarima.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37054: Cloudpickle Load on PyFunc Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a model file and log it to the MLflow tracking server via the API using the mlflow.pyfunc.log_model function. When a victim user calls the mlflow.pyfunc.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37053: Pickle Load on Sklearn Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a scikit-learn model file and log it to the MLflow tracking server via the API. When a victim user calls the mlflow.sklearn.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
CVE-2024-37052: Cloudpickle Load on Sklearn Model Load Leading to Code Execution
MLflow
An attacker can inject a malicious pickle object into a scikit-learn model file and log it to the MLflow tracking server via the API. When a victim user calls the mlflow.sklearn.load_model function on the model, the pickle object is deserialized on their system, running any arbitrary code it contains.
April 2024
CVE-2024-34073: Command Injection in CaptureDependency Function
AWS SageMaker
A command injection vulnerability exists in the capture_dependencies function of AWS SageMaker’s util file. If a user calls this utility function in their code, an attacker can run arbitrary commands on the system running that code by injecting a system command into the string passed to the function.
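The vulnerable pattern is generic: interpolating an untrusted string into a command run through the shell. A minimal sketch, with `echo` standing in for the real command and `install_requirements` as a hypothetical helper (not the SageMaker API):

```python
import subprocess

def install_requirements(pkg: str) -> str:
    # Vulnerable pattern: untrusted input interpolated into a shell string.
    # ("echo" stands in for the actual tool invocation.)
    proc = subprocess.run(f"echo installing {pkg}", shell=True,
                          capture_output=True, text=True)
    return proc.stdout

print(install_requirements("requests"))              # installing requests
print(install_requirements("requests; echo PWNED"))  # injected command also runs
```

Passing arguments as a list without `shell=True` (or quoting with `shlex.quote`) avoids the injection.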
CVE-2024-34072: Numpy defaults to allowing Pickle to be run when content type is NPY or NPZ
AWS SageMaker
An attacker can inject a malicious pickle object into a NumPy file and share it with a victim user. When the victim uses the NumpyDeserializer.deserialize function of the base_deserializers Python file to load it, the optional allow_pickle argument can be set to False and passed to np.load, which loads the file safely. However, the parameter defaulted to True, so unless the victim explicitly changed it, the malicious pickle object was loaded and executed.
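The safe behavior can be demonstrated with np.load directly: an object array can only be serialized via pickle, and `allow_pickle=False` makes NumPy refuse to deserialize it.

```python
import io
import numpy as np

buf = io.BytesIO()
# Object arrays force NumPy to fall back to pickle for serialization.
np.save(buf, np.array([{"key": "value"}], dtype=object))
buf.seek(0)

try:
    np.load(buf, allow_pickle=False)  # refuses to run the embedded pickle
except ValueError as exc:
    print("blocked:", exc)
```

Loading the same buffer with `allow_pickle=True` would deserialize the pickle data, which is why a True default is dangerous for untrusted files.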
CVE-2024-27322: R-bitrary Code Execution Through Deserialization Vulnerability
R
An attacker could leverage the R Data Serialization format to insert arbitrary code into an RDS-formatted file, or into an R package as an RDX or RDB component, which will be executed when the object is referenced or loaded with readRDS. This is possible because of the lazy evaluation process used in the unserialize function of the R programming language.
February 2024
CVE-2024-27319: Out of bounds read due to lack of string termination in assert
ONNX
An attacker can create a malicious ONNX model that fails an assert statement in such a way that an error string of 2048 characters or more is printed, and share it with a victim. When the victim tries to load the model, the resulting string is not properly terminated and leaks program memory.
CVE-2024-27318: Path sanitization bypass leading to arbitrary read
ONNX
An attacker can create a malicious ONNX model containing paths to externally located tensors and share it with a victim. When the victim tries to load the externally located tensors, a directory traversal can occur, resulting in an arbitrary file read on the victim’s system and information disclosure.
CVE-2024-24595: Credentials Stored in Plaintext in MongoDB Instance
ClearML
An attacker could retrieve ClearML user information and credentials using a tool such as mongosh if they have access to the server. This is because the open-source version of the ClearML Server MongoDB instance lacks access control and stores user information and credentials in plaintext.
CVE-2024-24594: Web Server Renders User HTML Leading to XSS
ClearML
An attacker can provide a URL rather than uploading an image to the Debug Samples tab of an Experiment. If the URL has the extension .html, the web server retrieves the HTML page, which is assumed to contain trusted data. The HTML is marked as safe and rendered on the page, resulting in arbitrary JavaScript running in any user’s browser when they view the samples tab.
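The standard mitigation for this class of bug is to escape untrusted markup before rendering rather than marking it as safe. In Python this can be sketched with the stdlib html module:

```python
import html

untrusted = '<script>alert(document.cookie)</script>'
# Escaping turns markup into inert text the browser displays verbatim.
print(html.escape(untrusted))
# &lt;script&gt;alert(document.cookie)&lt;/script&gt;
```

Once escaped, the payload renders as visible text instead of executing in the viewer’s browser.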
CVE-2024-24591: Path Traversal on File Download Leading to Arbitrary Write
ClearML
An attacker can upload or modify a dataset containing a link pointing to an arbitrary file and a target file path. When a user interacts with this dataset, such as when using the Dataset.squash method, the file is written to the target path on the user’s system.
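A common defense against this kind of traversal is to resolve the combined path and verify it still lies under the intended base directory before writing. A minimal sketch (safe_join is a hypothetical helper, not part of the ClearML API; the paths are illustrative):

```python
import os

def safe_join(base_dir: str, untrusted_path: str) -> str:
    """Join untrusted_path onto base_dir, rejecting escapes like '../'."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, untrusted_path))
    # realpath collapses '..' and symlinks, so a prefix check is meaningful.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"path escapes base directory: {untrusted_path!r}")
    return target

safe_join("/srv/datasets", "images/cat.png")      # accepted
# safe_join("/srv/datasets", "../../etc/passwd")  # raises ValueError
```

Checking the resolved path, rather than the raw string, also catches encoded or symlinked escape attempts.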
CVE-2024-24590: Pickle Load on Artifact Get Leading to Code Execution
ClearML
An attacker can create a pickle file containing arbitrary code and upload it as an artifact to a Project via the API. When a victim user calls the get method within the Artifact class to download and load a file into memory, the pickle file is deserialized on their system, running any arbitrary code it contains.