Cloudpickle and Pickle Load on Sklearn Model Load Leading to Code Execution
CVE Numbers
CVE-2024-37052
CVE-2024-37053
Summary
A deserialization vulnerability exists in the _load_model_from_local_file function in MLflow's sklearn/__init__.py file. An attacker can inject a malicious pickle object into a model file before uploading it. When a victim later loads the model, the object is deserialized and the attacker's code executes on the victim's machine.
Products Impacted
This vulnerability was introduced in version 1.1.0 of MLflow.
CVSS Score: 8.8
AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE Categorization
CWE-502: Deserialization of Untrusted Data.
Details
The vulnerable code is the _load_model_from_local_file function in MLflow's sklearn/__init__.py file, which runs whenever mlflow.sklearn.load_model is called.
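The shape of such a loader can be sketched as follows. This is a simplified stand-in, not MLflow's source: the function name load_model_sketch, the two format constants, and the file name model.pkl are all hypothetical, and both formats are modeled with the standard library's pickle.load on the assumption that cloudpickle reads files with the standard unpickler.

```python
import os
import pickle
import tempfile

# Format names as described in this advisory; the constants are illustrative.
FORMAT_PICKLE = "pickle"
FORMAT_CLOUDPICKLE = "cloudpickle"


def load_model_sketch(path, serialization_format=FORMAT_CLOUDPICKLE):
    """Hypothetical stand-in for the loader: the file's bytes are never validated."""
    if serialization_format not in (FORMAT_PICKLE, FORMAT_CLOUDPICKLE):
        raise ValueError(f"unsupported format: {serialization_format}")
    with open(path, "rb") as f:
        # Assumption: cloudpickle reads with the standard unpickler, so both
        # branches are modeled with pickle.load to keep the sketch stdlib-only.
        return pickle.load(f)


# Benign round trip: whatever bytes are in the file get deserialized as-is.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump({"coef": [0.1, 0.2]}, f)
    loaded = load_model_sketch(model_path, FORMAT_PICKLE)
```

The point of the sketch is that the loader trusts the file completely: nothing between opening the file and returning the object checks what the pickle stream will do.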
An attacker can exploit this by embedding in a model a pickle object that executes arbitrary code when it is deserialized. The attacker then calls the mlflow.sklearn.log_model() function to serialize the model and log it to the tracking server. By default, cloudpickle.load is used to deserialize the model when it is loaded. Setting the serialization format to ‘pickle’ when the model is logged forces the use of pickle.load() instead. In the proof-of-concept example, the pickle object is injected into the __init__ method of the ElasticNet class.
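Why deserialization alone is enough to run code can be shown with a harmless sketch that uses only the standard library. The Marker class and the hook attribute are hypothetical, a plain namespace stands in for the estimator, and the "payload" is a call to print rather than anything harmful. It relies on pickle's documented __reduce__ hook, which lets an object name a callable to be invoked at load time.

```python
import io
import pickle
from contextlib import redirect_stdout
from types import SimpleNamespace


class Marker:
    """Hypothetical payload: asks pickle to call print(...) when loaded."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes callable(*args) on load.
        return (print, ("side effect: code ran during unpickling",))


# Stand-in for an estimator whose constructor attached the payload as an attribute.
model = SimpleNamespace(alpha=1.0, l1_ratio=0.5, hook=Marker())

blob = pickle.dumps(model)  # nothing is printed at serialization time

captured = io.StringIO()
with redirect_stdout(captured):
    restored = pickle.loads(blob)  # print(...) is invoked here, during load

output = captured.getvalue()
```

After loading, restored.hook holds the return value of the call (None for print) rather than a Marker instance, while the ordinary attributes such as alpha survive intact. The loaded object therefore looks like a normal model even though a callable chosen by whoever produced the file has already run.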
When the model is loaded by the victim (example code snippet below), the arbitrary code is executed on their machine: