HiddenLayer, a Gartner-recognized Cool Vendor for AI Security, is the leading provider of Security for AI. Its security platform helps enterprises safeguard the machine learning models behind their most important products. HiddenLayer is the only company to offer turnkey security for AI that does not add unnecessary complexity to models and does not require access to raw data and algorithms. Founded by a team with deep roots in security and ML, HiddenLayer aims to protect enterprises’ AI from inference, bypass, and extraction attacks, and from model theft. The company is backed by a group of strategic investors, including M12 (Microsoft’s Venture Fund), Moore Strategic Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures.
February 23, 2024
Path sanitization bypass leading to arbitrary read
CVE Number
CVE-2024-27318
Summary
A path traversal vulnerability exists in the load_external_data_for_tensor function of the external_data_helper Python file. Exploitation requires the user to have downloaded and loaded a malicious model, and leads to an arbitrary file read. The vulnerability exists because the _sanitize_path function doesn’t properly sanitize the path.
Products Impacted
This vulnerability is present in ONNX v1.4.0 up to and including v1.15.0.
CVSS Score: 5.5
AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
CWE Categorization
CWE-22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
Details
The vulnerability exists within the onnx/external_data_helper.py file, in the load_external_data_for_tensor function. This is triggered when the onnx.external_data_helper._get_all_tensors function is called on a loaded model.
def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
    """
    Loads data from an external file for tensor.
    Ideally TensorProto should not hold any raw data but if it does it will be ignored.
    Arguments:
        tensor: a TensorProto object.
        base_dir: directory that contains the external data.
    """
    info = ExternalDataInfo(tensor)
    file_location = _sanitize_path(info.location)
    external_data_file_path = os.path.join(base_dir, file_location)
    with open(external_data_file_path, "rb") as data_file:
        if info.offset:
            data_file.seek(info.offset)
        if info.length:
            tensor.raw_data = data_file.read(info.length)
        else:
            tensor.raw_data = data_file.read()
An attacker can exploit this vulnerability by creating an ONNX model whose external tensors contain malicious paths meant to traverse out of the designated directory. However, as the code snippet above shows, there is an attempt to sanitize the supplied path information. This sanitization was introduced in response to CVE-2022-25882, the predecessor of this vulnerability, when the developers added a function to prevent path traversals in the external tensor loader.
def _sanitize_path(path: str) -> str:
    """Remove path components which would allow traversing up a directory tree from a base path.
    Note: This method is currently very basic and should be expanded.
    """
    return path.lstrip("/.")
The original patch blocked a large number of path traversals by stripping the “/” and “.” characters from the start of a path, which prevents an attacker from using leading absolute or relative path prefixes. However, nested path traversal attacks and absolute paths on Windows were not prevented. An attacker can perform a nested path traversal by first descending into a directory and then using relative components to escape it. This is a very plausible attack, because an attacker can ship the model together with a directory containing external tensors and therefore knows the path of that directory. The sanitization does not stop this style of attack because it only strips the offending characters at the start of a path.
When the user loads a malicious model with an external tensor pointing at external_data/../../secret, their system loads the data from that file into the model:
import onnx

model = onnx.load("model.onnx")
tensors = onnx.external_data_helper._get_all_tensors(model)
for tensor in tensors:
    print(tensor)
Once run, the printed tensor shows that the contents of the secret file (here, a super secret password) were read.
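One common way to close this class of bug is to resolve the final path and check that it still lies inside the base directory, rather than trying to strip dangerous characters. The following is a minimal sketch of that idea, not the patch ONNX shipped; resolve_inside and is_rejected are hypothetical names, and the temporary directory stands in for a model directory.

```python
import os
import tempfile


def resolve_inside(base_dir: str, location: str) -> str:
    """Return the real path of `location` under `base_dir`, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." components and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, location))
    # commonpath equals base only when candidate is base itself or lies below it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"external data path escapes base directory: {location!r}")
    return candidate


def is_rejected(base_dir: str, location: str) -> bool:
    # Hypothetical convenience wrapper used for the demonstration below.
    try:
        resolve_inside(base_dir, location)
    except ValueError:
        return True
    return False


base = tempfile.mkdtemp()  # stand-in for the model directory
print(resolve_inside(base, "weights.bin"))
for bad in ("external_data/../../secret", "../secret", "/etc/passwd"):
    print(bad, "rejected:", is_rejected(base, bad))
```

Checking the resolved path handles nested traversal and absolute paths with the same rule, which is why it tends to be more robust than prefix stripping.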
Out of bounds read due to lack of string termination in assert
CVE Number
CVE-2024-27319
Summary
When assert is called, the message is copied into a buffer and then printed. The copy can fill the whole buffer without adding a string terminator at the end, allowing an attacker to read some bytes from memory.
Products Impacted
This vulnerability is present in ONNX v1.1.0 up to and including v1.15.0.
CVSS Score: 3.3
AV:L/AC:L/PR:N/UI:R/S:U/C:L/I:N/A:N
CWE Categorization
CWE-125: Out-of-bounds Read
Details
The vulnerability exists within the onnx/common/assertions.cc file, in the barf function. This is triggered when any assert fails and the resulting string is 2048 characters or longer.
std::string barf(const char* fmt, ...) {
  char msg[2048];
  va_list args;
  va_start(args, fmt);
  // Although vsnprintf might have vulnerability issue while using format string with overflowed length,
  // it should be safe here to use fixed length for buffer "msg". No further checking is needed.
  vsnprintf(msg, 2048, fmt, args);
  va_end(args);
  return std::string(msg);
}
In the barf function, which is called by ONNX_ASSERT and ONNX_ASSERTM, a 2048-byte buffer is allocated for the generated string. However, the vsnprintf call copies 2048 bytes into the buffer before the buffer is turned into a string. This means that any string 2048 bytes or longer will not have a null terminator. When the string is created, the program continues reading past the end of the buffer and copies whatever program memory follows it until a string terminator is reached.
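The over-read described above can be illustrated with a toy model in Python. This does not call the ONNX code: memory is simulated as a flat byte string, read_c_string is a hypothetical stand-in for constructing a std::string from a char pointer, and the adjacent bytes are made up for the example.

```python
BUF_SIZE = 2048  # size of the msg buffer in barf


def read_c_string(memory: bytes, start: int) -> bytes:
    # Mimics building a string from a char*: read until the first NUL byte.
    end = memory.index(b"\x00", start)
    return memory[start:end]


# Simulated bytes that happen to sit right after the buffer in memory.
adjacent = b"ADJACENT-DATA\x00"

# Case described in the advisory: the buffer is completely filled, no terminator.
full_buffer = b"A" * BUF_SIZE
# Safe case: the last byte of the buffer is reserved for the terminator.
terminated_buffer = b"A" * (BUF_SIZE - 1) + b"\x00"

leaked = read_c_string(full_buffer + adjacent, 0)
safe = read_c_string(terminated_buffer + adjacent, 0)

print(len(leaked), leaked[BUF_SIZE:])  # the read runs past the buffer
print(len(safe))                       # the read stops inside the buffer
```

In this model the unterminated buffer yields a string longer than 2048 bytes that includes the neighboring data, while the terminated buffer yields exactly 2047 bytes.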