How to decompile any Python binary

Posted on 14 February 2018 by In Ming Loh

At Countercept we often encounter binary payloads that are generated from compiled Python. These are usually generated with tools such as py2exe or PyInstaller to create a Windows executable. A notable example was the Triton malware recently discovered by FireEye[1], which used this exact technique.

 

Due to the variety of payloads seen, we frequently relied on multiple decompilation scripts or manual intervention to obtain the source code. To speed this process up, we decided to create a single analysis script that can unpack and decompile both py2exe and PyInstaller files.

 

In this article we'll walk through how to generate Python binaries and how our script can be used to decompile them.

 

Python to Executable

To start off we're going to show you how payloads can be compiled in py2exe and PyInstaller.

 

To create a payload using py2exe:
  1. Install the py2exe package from http://www.py2exe.org/
  2. For the payload (in this case, we will name it hello.py), write a setup.py like the one in Figure 1. The option “bundle_files” with the value of 1 bundles everything, including the Python interpreter, into one exe.
  3. Once the script is ready, issue the command “python setup.py py2exe”. This creates the executable, as shown in Figure 2.

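from distutils.core import setup
import sys

import py2exe  # imported for its side effect: registers the py2exe command

# Lets the script also be run as plain "python setup.py"
sys.argv.append('py2exe')

setup(
    # bundle_files=1 packs everything, including the Python interpreter, into the exe
    options={'py2exe': {'bundle_files': 1}},
    # windows=[{'script': "hello.py"}],  # use instead of console for a GUI app
    console=[{'script': "hello.py"}],
    zipfile=None,  # no separate library.zip; keep everything inside the exe
)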

Figure 1

C:\Users\test\Desktop\test>python setup.py py2exe
running py2exe
*** searching for required modules ***
*** parsing results ***
*** finding dlls needed ***
*** create binaries ***
*** byte compile python files ***
*** copy extensions ***
*** copy dlls ***
copying C:\Python27\lib\site-packages\py2exe\run.exe -> C:\Users\test\Desktop\test\dist\hello.exe
Adding python27.dll as resource to C:\Users\test\Desktop\test\dist\hello.exe

Figure 2 

 

To create a payload using PyInstaller:
  1. Install PyInstaller using pip (pip install pyinstaller).
  2. Issue the command “pyinstaller --onefile hello.py” (a reminder that ‘hello.py’ is our payload). This bundles everything into one executable, as shown in Figure 3.

C:\Users\test\Desktop\test>pyinstaller --onefile hello.py
108 INFO: PyInstaller: 3.3.1
108 INFO: Python: 2.7.14
108 INFO: Platform: Windows-10-10.0.16299
………………………………
5967 INFO: checking EXE
5967 INFO: Building EXE because out00-EXE.toc is non existent
5982 INFO: Building EXE from out00-EXE.toc
5982 INFO: Appending archive to EXE C:\Users\test\Desktop\test\dist\hello.exe
6325 INFO: Building EXE from out00-EXE.toc completed successfully.

Figure 3

 

One Script to Rule Them All

A number of useful Python decompilation scripts already exist, including unpy2exe.py, pyinstxtractor.py and uncompyle6; however, each supports different options and file types. To speed up analysis, the Countercept team created a single script (available on GitHub) that can be used as a one-stop shop for decompilation, calling the other scripts as needed.

 

The script operates as follows:

 

  1. Once a binary is specified as input, the script automatically determines whether it was packed using py2exe or PyInstaller.
  2. After completing the check, it unpacks the binary using either unpy2exe.py or pyinstxtractor.py.
  3. If the script detects any encrypted bytecode, it asks whether it should proceed with the decryption process (Figure 7).
  4. Once everything is unpacked, it decompiles all the extracted Python bytecode using uncompyle6.
  5. Occasionally, the main Python file, which contains the main logic of the program, can’t be decompiled. Usually, this is because the bytecode is missing the magic bytes that identify the Python version. The “prepend” option in this script can be used to overcome this.
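
To make the first step more concrete, here is a minimal sketch of how a packer check could work. It is not taken from our script: it is a byte-search heuristic that assumes a PyInstaller executable contains the archive magic string "MEI\014\013\012\013\016" and that a py2exe executable carries a PE resource named PYTHONSCRIPT (resource names are stored as UTF-16LE). The function names are hypothetical, and a more careful check would parse the PE resource table rather than search raw bytes.

```python
from pathlib import Path

# Magic string that PyInstaller writes at the start of its archive cookie.
PYINSTALLER_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"

# py2exe keeps its payload in a PE resource named PYTHONSCRIPT.
# Named resources are stored as UTF-16LE strings (assumption: a raw byte
# search is good enough for a first guess).
PY2EXE_MARKER = "PYTHONSCRIPT".encode("utf-16-le")


def guess_packer(data: bytes) -> str:
    """Return 'pyinstaller', 'py2exe' or 'unknown' for the given file bytes."""
    if PYINSTALLER_MAGIC in data:
        return "pyinstaller"
    if PY2EXE_MARKER in data:
        return "py2exe"
    return "unknown"


def guess_packer_for_file(path: str) -> str:
    """Convenience wrapper (hypothetical helper) that reads the file first."""
    return guess_packer(Path(path).read_bytes())
```

A heuristic like this only decides which unpacker to try first; the unpacker itself still has to validate the archive.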

 

The available options are shown below:

test@test:python python_exe_unpack.py
[*] On Python 2.7
usage: python_exe_unpack.py [-h] [-i INPUT] [-o OUTPUT] [-p PREPEND]

This program will detect, unpack and decompile binary that is packed in either
py2exe or pyinstaller. (Use only one option either -i or -p)

optional arguments:
  -h, --help  show this help message and exit
  -i INPUT    exe that is packed using py2exe or pyinstaller (Use -o to
              specify the output directory)
  -o OUTPUT   folder to store your unpacked and decompiled code. (Otherwise
              will default to current working directory and inside the folder
              "unpacked")
  -p PREPEND  Option that prepend pyc without magic bytes. (Usually for
              pyinstaller main python file)

Figure 4

 

A Python binary can be decompiled by passing it to the script using the ‘-i’ argument as below. Figure 5 shows a py2exe example and Figure 6 shows a PyInstaller example:

test@test:python python_exe_unpack.py -i sample/malware_1.exe
[*] On Python 2.7
[*] This exe is packed using py2exe
[*] Unpacking the binary now

Figure 5

test@test:python python_exe_unpack.py -i sample/malware_2.exe
[*] On Python 2.7
[*] Processing sample/malware_42d5f609c0143ec808b45b247f2cbf8decce5bee0572a30c2437ecb6bf8b37b4
[*] Pyinstaller version: 2.0
[*] This exe is packed using pyinstaller
[*] Unpacking the binary now
[*] Python version: 26
[*] Length of package: 7346701 bytes
[*] Found 66 files in CArchive
[*] Beginning extraction...please standby
[!] Warning: The script is running in a different python version than the one used to build the executable
    Run this script in Python26 to prevent extraction errors(if any) during unmarshalling
[*] Found 423 files in PYZ archive
[*] Successfully extracted pyinstaller exe.

Figure 6
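
The "Pyinstaller version", "Python version" and "Length of package" lines in Figure 6 come from a small trailer (the "cookie") that PyInstaller appends to the executable. The sketch below shows one way to read it. It assumes the layout used by pyinstxtractor for the PyInstaller versions seen here: a 24-byte cookie for PyInstaller 2.0 and an 88-byte cookie (with the Python library name added) for 2.1+, both big-endian. The function and field names are our own, and newer PyInstaller releases may differ.

```python
import struct

MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"

# Assumed layouts: magic, package length, TOC offset, TOC length, Python version
COOKIE_V20 = struct.Struct("!8siiii")      # 24 bytes (PyInstaller 2.0)
COOKIE_V21 = struct.Struct("!8siiii64s")   # 88 bytes (2.1+, adds library name)


def read_cookie(data: bytes) -> dict:
    """Locate the last occurrence of the magic and decode the cookie after it."""
    pos = data.rfind(MAGIC)
    if pos < 0:
        raise ValueError("PyInstaller magic not found")
    tail = data[pos:]

    if len(tail) >= COOKIE_V21.size:
        _, pkg_len, toc_off, toc_len, pyver, pylib = COOKIE_V21.unpack(
            tail[:COOKIE_V21.size])
        return {
            "pyinstaller": "2.1+",
            "package_length": pkg_len,
            "toc_offset": toc_off,
            "toc_length": toc_len,
            "python_version": pyver,  # e.g. 27 for Python 2.7
            "python_lib": pylib.split(b"\x00", 1)[0].decode("ascii", "replace"),
        }

    if len(tail) >= COOKIE_V20.size:
        _, pkg_len, toc_off, toc_len, pyver = COOKIE_V20.unpack(
            tail[:COOKIE_V20.size])
        return {
            "pyinstaller": "2.0",
            "package_length": pkg_len,
            "toc_offset": toc_off,
            "toc_length": toc_len,
            "python_version": pyver,
        }

    raise ValueError("cookie is truncated")
```

The Python version field matters because the extracted bytecode should ideally be unmarshalled by the same Python version that built the executable, which is what the warning in Figure 6 is about.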

 

PyInstaller has an option to encrypt the Python bytecode bundled with the exe (usually the other modules required by the main Python file). As Figure 7 shows, once the script detects encrypted Python bytecode, it asks whether to decrypt it with the key that it retrieved from the exe itself.

test@test:python python_exe_unpack.py -i sample/malware_3.exe
[*] On Python 2.7
[*] Processing sample/hello-pyinstaller-encrypted.exe
[*] Pyinstaller version: 2.1+
[*] This exe is packed using pyinstaller
[*] Unpacking the binary now
[*] Python version: 27
[*] Length of package: 3210322 bytes
[*] Found 20 files in CArchive
[*] Beginning extraction...please standby
[*] Found 196 files in PYZ archive
[!] Error: Failed to decompress heapq, probably encrypted. Extracting as is.
[!] Error: Failed to decompress encodings.cp932, probably encrypted. Extracting as is.
[!] Error: Failed to decompress encodings.johab, probably encrypted. Extracting as is.
[!] Error: Failed to decompress functools, probably encrypted. Extracting as is.
[!] Error: Failed to decompress random, probably encrypted. Extracting as is.
..........................................
[!] Error: Failed to decompress encodings.cp950, probably encrypted. Extracting as is.
[*] Successfully extracted pyinstaller exe.
[*] Encrypted pyc file is found. Decrypt it? [y/n]y
decompiled 194 files: 0 okay, 2 failed
[+] Binary unpacked successfully

Figure 7

 

Challenges with Python bytecode

Currently, the Python bytecode file produced by unpy2exe or pyinstxtractor might be incomplete, in which case uncompyle6 cannot recognize it and recover the plain Python source code. The cause is a missing Python bytecode version number. We therefore included a prepend option, which adds the version number to the file so that it can be decompiled. As Figure 8 shows, uncompyle6 returns an error when we try to decompile the extracted file directly. Once we use the prepend option (Figure 9), the Python source code is decompiled successfully.

test@test: uncompyle6 unpacked/malware_3.exe/archive.py 
Traceback (most recent call last):
  ……………………….
ImportError: File name: 'unpacked/malware_3.exe/__pycache__/archive.cpython-35.pyc' doesn't exist

Figure 8

test@test:python python_exe_unpack.py -p unpacked/malware_3.exe/archive
[*] On Python 2.7
[+] Magic bytes is already appeneded.

# Successfully decompiled file
[+] Successfully decompiled.

Figure 9 
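
For readers curious what "prepending" means in practice, here is a minimal sketch for Python 2.7 bytecode. It is an illustration rather than the code from our script. It assumes the Python 2.7 .pyc layout: a 4-byte magic (the version number 62211 as a little-endian 16-bit value followed by "\r\n") and a 4-byte timestamp, followed by the marshalled code object. Later Python versions use longer headers, so this sketch does not apply to them as is. The function names are hypothetical.

```python
import struct

# Bytecode version number for Python 2.7, as reported by uncompyle6.
PY27_MAGIC_NUMBER = 62211


def pyc_magic(magic_number: int = PY27_MAGIC_NUMBER) -> bytes:
    """4-byte magic: little-endian version number followed by CR LF."""
    return struct.pack("<H", magic_number) + b"\r\n"


def prepend_header(raw: bytes, magic_number: int = PY27_MAGIC_NUMBER) -> bytes:
    """Return raw bytecode with a Python 2.x style header in front of it.

    If the data already starts with the magic, it is returned unchanged.
    The timestamp is set to zero because the original value is unknown.
    """
    magic = pyc_magic(magic_number)
    if raw.startswith(magic):
        return raw
    timestamp = struct.pack("<I", 0)
    return magic + timestamp + raw
```

The output can then be written to a file with a .pyc extension and passed to uncompyle6.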

 

Real World Example

To demonstrate how the script can be used in the real world, we will test it against a Triton malware sample.

 

  1. Download the sample using the hash found in [1]. SHA1: dc81f383624955e0c0441734f9f1dabfe03f373c
  2. Run the script with the sample as the input (Figure 10).
  3. Open the output in a text editor (Figure 11): the Python code has been retrieved and decompiled successfully.

test@test: python python_exe_unpack.py -i sample/triton_sample
[*] On Python 2.7
[*] This exe is packed using py2exe
[*] Unpacking the binary now

# Successfully decompiled file

Figure 10

# uncompyle6 version 2.11.5
# Python bytecode 2.7 (62211)
# Decompiled from: Python 2.7.12 (default, Nov 20 2017, 18:23:56) 
# [GCC 5.4.0 20160609]
# Embedded file name: script_test.py
# Compiled at: 2017-12-24 08:05:33
import TsHi
import sh
import struct
import time
import sys

def PresetStatusField(TsApi, value):
    if len(value) != 4:
        return -1
    script_code = '\x80\x00@<\x00\x00b\x80@\x00\x80<@ \x03|\x1c\x00\x82@\x04\x00b\x80`\x00\x80<@ \x03|\x0c\x00\x82@\x18\x00B8\x1c\x00\x00H\x80\x00\x80<\x00\x01\x84`@ \x02|\x18\x00\x80@\x04\x00B8\xc4\xff\xffK' + value[2:4] + '\x80<' + value[0:2] + '\x84`\x00\x00\x82\x90\xff\xff`8\x02\x00\x00D'
    AppendResult = TsApi.SafeAppendProgramMod(script_code)
    if not AppendResult:
        return -1
    cp_info = TsApi.GetCpStatus()
    status = cp_info[40:44]
    if status != value:
        return 0
    return 1
………………………………………..

Figure 11 

 

And problem solved! We’ve hopefully demonstrated how you can now decompile Python binaries packed with py2exe or PyInstaller, which should save time for your own hunt teams. This project is currently on GitHub and we welcome contributions.

 

 

 

[1] https://www.fireeye.com/blog/threat-research/2017/12/attackers-deploy-new-ics-attack-framework-triton.html