In our previous blog post on the Shadow Brokers leak (https://www.countercept.com/our-thinking/analyzing-the-doublepulsar-kernel-dll-injection-technique/), we looked at the advanced kernel mode DOUBLEPULSAR payload and its stealthy kernel-to-user-land DLL injection technique. However, DOUBLEPULSAR is very much a stealthy stager whose job is to load a more fully featured implant. The PEDDLECHEAP implant used with the DANDERSPRITZ GUI tool is such an implant. It runs in user land, and DOUBLEPULSAR is one of the primary methods for loading it. PEDDLECHEAP is highly modular in nature and is much more closely aligned in thinking to Metasploit’s Meterpreter.
Many of the modules are implemented as standalone DLLs that are sent across on demand to the target and loaded in order to execute their functionality. By default, DANDERSPRITZ runs a fairly extensive set of modules automatically when interacting with the PEDDLECHEAP implant, which results in many of these modules being loaded on first connect.
Additionally, there are various command and control (C2) options for DANDERSPRITZ that are more conventional in nature, as opposed to the clever use of unimplemented features of the SMB stack used by the SMB version of the DOUBLEPULSAR implant. A screenshot of some of these C2 options is shown below. For this testing we used option 19, which is a standard bind shell:
When first connecting/receiving a connection from a PEDDLECHEAP implant, there is an option to choose whether to load libraries from file or from memory. Shellcode related to this can be found in the egg files in the following location:
Load from File Library
A simple first glance at the strings in the file-based load library shellcode gives an immediate clue to its operation. The presence of functions such as GetTempFileNameW(), WriteFile() and LoadLibraryW() suggests it may be writing the DLL contents to a temporary file and then using a conventional LoadLibrary() call to load the DLL from disk.
This assumption is backed up when monitoring an injected process using the file load option as can be seen from the procmon output below:
Here we can clearly see an image has been loaded from a temporary file. It is interesting that this is one option that can be used as it is arguably much less stealthy than the clever kernel-to-user-land in-memory DLL load technique used by the DOUBLEPULSAR implant to initially stage the PEDDLECHEAP implant.
However, this is only one DLL load, and we know that a large number of DLLs are used for the different modules that are implemented. We also see many new threads being created in the process, and a quick look at their stack traces shows that they all come either from an unknown region of memory that corresponds to the initial DLL load from the kernel by DOUBLEPULSAR, or from within the address space of the temporary file that was loaded using LoadLibrary().
A quick analysis of the address space and the process environment block (PEB) using WinDBG shows that there is more occurring here than meets the eye:
Here we can see that the linked list of modules in the PEB contains, as well as all the standard DLLs and the temporary file that were loaded from disk, a range of loaded modules that do not have associated names. A quick comparison of the memory regions of these unknown modules to the conventionally loaded DLLs also shows a discrepancy:
Here we can see that the conventionally loaded DLL that was written as a temporary file to disk is of type MEM_IMAGE. However, one of the unknown modules has PAGE_EXECUTE_READWRITE permissions and is of type MEM_MAPPED. This brings us to the default option to load library in memory.
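As a rough illustration of that comparison, the Python sketch below flags regions that look like the unknown module above. The Region record and the helper names are hypothetical stand-ins for whatever a debugger or a VirtualQueryEx() walk would report; only the numeric constants are the standard Windows values.

```python
from dataclasses import dataclass

# Standard Windows memory constants (winnt.h).
MEM_PRIVATE = 0x20000
MEM_MAPPED = 0x40000
MEM_IMAGE = 0x1000000
PAGE_EXECUTE_READWRITE = 0x40


@dataclass
class Region:
    """Hypothetical record describing one region of a process address space."""
    base: int
    size: int
    mem_type: int
    protect: int


def is_suspicious(region: Region) -> bool:
    """True for RWX regions that are not backed by an image file on disk."""
    # The low byte holds the base protection; modifiers such as PAGE_GUARD sit above it.
    rwx = (region.protect & 0xFF) == PAGE_EXECUTE_READWRITE
    return rwx and region.mem_type != MEM_IMAGE


def suspicious_regions(regions):
    """Return only the regions that match the heuristic above."""
    return [r for r in regions if is_suspicious(r)]
```

A conventionally loaded DLL (MEM_IMAGE) is ignored by this heuristic, while a MEM_MAPPED region with PAGE_EXECUTE_READWRITE permissions is reported. Legitimate software such as JIT compilers can also create RWX memory, so a hit is a lead to investigate rather than proof of injection.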
Load from Memory Library
If we select the “Load from Memory Library” option then we do not see the DLL written to disk in a temporary file and then loaded via LoadLibrary(). However, we do still see very similar behavior, with a great many unknown modules listed in the PEB. If we take a closer look at the shellcode for the “Load from Memory Library” option then the reason for this will become clear. First though, a quick look at the strings contained gives some initial clues:
While we can still see references to LoadLibraryA(), we also see other references to lower-level API calls. Calls such as NtCreateSection(), NtMapViewOfSection() and RtlImageDirectoryEntryToData() are of particular interest given the behavior we saw in the target process address space. For example, let us first look at the call to NtCreateSection() within the shellcode:
Here we can see NtCreateSection() being dynamically resolved via GetProcAddress() and then called in order to create a section with PAGE_EXECUTE_READWRITE permissions and with a NULL value passed for the FileHandle parameter. The documentation (https://msdn.microsoft.com/en-us/library/windows/hardware/ff566428(v=vs.85).aspx) tells us that the section will then be backed by the paging file instead of an actual file on disk.
Soon after this, we then see a call to NtMapViewOfSection() in order to map the previously created section into the memory space of the current process. This would result in the MEM_MAPPED regions with PAGE_EXECUTE_READWRITE permissions that we saw in the address space in WinDBG earlier.
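The idea of a mapping with no file behind it can be shown with a loose, high-level analogy in Python. This sketch does not call NtCreateSection() or NtMapViewOfSection() and does not request execute permissions; it only uses the standard mmap module with a file descriptor of -1, which requests an anonymous mapping (on Windows this is, to our understanding, backed by the paging file). The function name is hypothetical.

```python
import mmap


def stage_in_anonymous_mapping(payload: bytes) -> mmap.mmap:
    """Copy payload into an anonymous mapping, i.e. one with no file on disk behind it."""
    if not payload:
        raise ValueError("payload must not be empty")
    # fileno -1 asks for anonymous memory rather than a view of an existing file.
    view = mmap.mmap(-1, len(payload))
    view[:] = payload
    return view
```

The bytes are then readable from the mapping without ever having been written to a named file, which is the property the shellcode relies on.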
However, the key question here is: how does it then load an arbitrary DLL into this mapped space purely in-memory, whilst still ensuring that it can run correctly, without using the file-backed LoadLibrary() API? The most important part of answering this question is to examine the later calls to RtlImageDirectoryEntryToData() (https://msdn.microsoft.com/en-us/library/windows/desktop/ms680148(v=vs.85).aspx):
This function is used to parse the PE structure of an image and obtain a directory entry. Interestingly, the MappedAsImage parameter is FALSE, which is also indicative of how this is being used against a memory-mapped data region instead of a MEM_IMAGE type region. In this first instance of it being called, we see the base relocation table being requested. This table is extremely important because a compiled DLL contains some hard-coded absolute addresses that assume a particular base address at runtime. If the DLL is loaded at a different base address, these will be wrong.
The base relocation table allows the Windows loader to fix up these addresses during loading. Seeing this low-level call by the shellcode to get this information is the first step to manually loading the DLL in-memory without the help of the higher-level LoadLibrary() API.
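To make the fix-up step concrete, here is a minimal Python sketch of how base relocations are applied in general. It is not a reconstruction of the PEDDLECHEAP shellcode. It assumes the image buffer is already laid out so that an RVA can be used directly as an offset, and the function name and arguments are hypothetical. The block layout (a page RVA, a block size, then 16-bit entries with a 4-bit type and a 12-bit offset) and the type values follow the PE format.

```python
import struct

# Relocation types defined by the PE format.
IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address


def apply_relocations(image: bytearray, reloc: bytes,
                      preferred_base: int, actual_base: int) -> None:
    """Patch every absolute address in image for its real load address."""
    delta = actual_base - preferred_base
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed or terminating block
        end = min(pos + block_size, len(reloc))
        for entry_pos in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            target = page_rva + offset
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
            elif kind == IMAGE_REL_BASED_DIR64:
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
            # IMAGE_REL_BASED_ABSOLUTE and unhandled types are skipped.
        pos += block_size
```

Every patched value simply moves by the difference between the address the DLL was compiled for and the address it actually ended up at, which is all the loader needs the relocation table for.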
The next call we see to this function then requests the import table. The output of this is used to resolve any dependent imports in the DLL, which are then eventually passed to LoadLibraryA() in order to conduct a conventional file-backed DLL load. In this case, PEDDLECHEAP only seems to do this as a result of having dependencies on “known-good” Windows libraries. This is backed up by procmon, as we can see further Windows DLLs being loaded from disk as PEDDLECHEAP functionality is used. Assuming there were dependencies that needed to be loaded, a final call to RtlImageDirectoryEntryToData() is eventually reached:
This final call is used to obtain the IAT directory. This is critical in the case of import dependencies, as the IAT needs to be populated in order to ensure calls to imported functions point to the correct place in memory.
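The import resolution described above can be sketched in the same spirit. The Python below walks a 32-bit import descriptor table and writes a resolved address into each IAT slot. It illustrates the general PE technique under stated assumptions rather than the shellcode's exact logic: RVAs are again assumed to equal buffer offsets, the function names are hypothetical, and the resolve callback stands in for the LoadLibraryA() and GetProcAddress() calls a real loader would make.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000  # high bit set means import by ordinal
DESCRIPTOR_SIZE = 20               # sizeof(IMAGE_IMPORT_DESCRIPTOR)


def read_cstring(image: bytearray, rva: int) -> str:
    """Read a NUL-terminated ASCII string starting at rva."""
    end = image.index(b"\x00", rva)
    return bytes(image[rva:end]).decode("ascii")


def populate_iat32(image: bytearray, import_dir_rva: int, resolve) -> None:
    """Fill each IAT slot with the address returned by resolve(dll, symbol)."""
    pos = import_dir_rva
    while True:
        lookup_rva, _stamp, _fwd, name_rva, iat_rva = struct.unpack_from(
            "<5I", image, pos)
        if name_rva == 0 and iat_rva == 0:
            break  # an all-zero descriptor terminates the table
        dll = read_cstring(image, name_rva)
        # Prefer the import lookup table; fall back to the IAT if it is absent.
        source = lookup_rva or iat_rva
        index = 0
        while True:
            (thunk,) = struct.unpack_from("<I", image, source + 4 * index)
            if thunk == 0:
                break
            if thunk & IMAGE_ORDINAL_FLAG32:
                symbol = thunk & 0xFFFF                    # ordinal import
            else:
                symbol = read_cstring(image, thunk + 2)    # skip the 2-byte hint
            struct.pack_into("<I", image, iat_rva + 4 * index,
                             resolve(dll, symbol) & 0xFFFFFFFF)
            index += 1
        pos += DESCRIPTOR_SIZE
```

Once every slot has been written, calls made through the IAT reach the real functions in the dependency DLLs, which is why this step cannot be skipped for any module that imports from standard Windows libraries.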
Essentially, PEDDLECHEAP seems to implement a minimal in-memory DLL loader by bypassing the LoadLibrary() API for the loading of “malicious” DLLs. It does this by making use of low-level API calls to do some of the heavy lifting, but only for the critical minimum set of functionality it needs to implement. In this case that is the application of relocation information, the loading of dependencies and the population of the IAT.
With this functionality alone, it should be possible to write PEDDLECHEAP modules in a high-level language, compile them to binary form and link them against standard Windows DLLs, while still ensuring that each DLL can be loaded purely in-memory without being written to disk.
The PEDDLECHEAP implant and its modules are much less stealthy than the DOUBLEPULSAR implant. However, given the extremely lightweight nature of the kernel-level DOUBLEPULSAR implant compared with the feature-rich PEDDLECHEAP implant, that is to be expected.
Whilst the memory-based loading option in DANDERSPRITZ avoids touching disk with any of the malicious DLLs, there is still a lot of suspicious activity in-memory including the following:
1) Suspicious threads running from regions of memory not corresponding to known loaded modules.
2) RWX code regions including full PE headers.
3) Loaded modules without a file-mapped backing.
4) Suspicious network connections for the C2 channel.
This is very similar in nature to many modern in-memory focused attack frameworks like Metasploit, Empire and Cobalt Strike and is really no stealthier than those techniques. Standard memory forensics techniques using tools such as Volatility or Rekall should be able to spot this, as well as any good EDR software with dedicated memory analysis capabilities.
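The first indicator in the list above is simple to express in code. The following Python sketch assumes that thread start addresses and module ranges have already been collected by some other tool; the function names and the data shapes are hypothetical and are not taken from Volatility, Rekall or any EDR product.

```python
def in_any_module(address: int, modules) -> bool:
    """True if address falls inside one of the (base, size) module ranges."""
    return any(base <= address < base + size for base, size in modules)


def orphan_threads(thread_starts: dict, modules) -> dict:
    """Return {thread_id: start_address} for threads outside every known module."""
    return {
        tid: start
        for tid, start in thread_starts.items()
        if not in_any_module(start, modules)
    }
```

In the PEDDLECHEAP case, the modules argument would hold only the file-backed modules, so threads started from the memory-loaded DLLs would be reported.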
However, one particularly interesting feature is the GANGSTERTHIEF module that shows the authors were very aware of these detection techniques themselves. In fact, GANGSTERTHIEF can actually be used to detect PEDDLECHEAP itself:
Above we can see a snippet of the output from the injected option that finds modules without a file matching on disk. In this case, it is showing all of its own modules that it has loaded within calc.exe, because this is the very technique it uses for its own injection.
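The same check is easy to approximate outside of DANDERSPRITZ. The Python sketch below takes a list of (base address, path) pairs, as might be read from the PEB module list, and reports entries that have no name or whose file cannot be found on disk. It is our own illustration of the idea, not GANGSTERTHIEF's implementation, and the function name and input format are hypothetical.

```python
import os


def unbacked_modules(modules, file_exists=os.path.isfile):
    """Return base addresses of modules with no name or no matching file on disk."""
    return [
        base
        for base, path in modules
        if not path or not file_exists(path)
    ]
```

The file_exists parameter defaults to os.path.isfile() and can be swapped out, which keeps the check usable against an offline list of known files as well as a live system.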
Additionally, the GUI process view can be used to find evidence of injection too. By right-clicking to request further process info on the infected calc.exe process, it is then possible to browse all the module information, which shows both the temporary file loaded from disk and all the unknown modules in the PEB that do not have a file-based backing.
Hopefully, this article has given a good overview of the technical approaches used by the PEDDLECHEAP implant in order to exist in memory and dynamically load and execute modules. In their current form, these generic techniques are relatively well known and should be detectable using modern endpoint detection and memory analysis approaches. If you’d like to know more about them, check out our whitepaper here (mwr.to/memoryanalysis).
However, the authors were clearly already aware of how to detect their own techniques, and so the question remains as to what stealthier methods they may have come up with since this implant was first written.