In the last part of this blog article series I took an in-depth look at the packer of a QBot sample and unpacked it. This post is mostly about cracking the string encryption of that sample. I am also using the Triton DBA Framework[0] to assist my analysis, because I really wanted to learn how to write tools with it.
I am reversing the function that is responsible for decrypting most of the strings in the unpacked binary. First I identify the relevant function and obtain a trace with a debugger. Afterwards I parse the trace and feed the information into Triton to assist my analysis. Finally I update the IDA disassembly with a small Python script that adds the information I just obtained.
1 – Identifying the string decryption function
Again, I am jumping right into the unpacked sample. If you are interested in how the packer works, I recommend reading the first part of this series[1].
Looking at the disassembly in IDA there are many library calls that give us a hint of what this sample is doing, but the strings are still encrypted. I identified an interesting pattern where each time the function sub_406481 gets called, an immediate value is moved into the EAX register. The routine is also used pretty often and the return value is always a decrypted string.
So I decided to look deeper into how this decryption routine works.
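To illustrate the pattern outside of IDA, here is a minimal sketch that scans raw code bytes for a `mov eax, imm32` placed directly in front of a `call` to the decryption routine. It rests on assumptions that are not confirmed above: that the `mov` immediately precedes the call, that both use their plain 32-bit encodings (`B8` and `E8`), and that the name `find_decrypt_calls` and the load address in the usage note are made up for the example.

```python
import struct

def find_decrypt_calls(code: bytes, base: int, target: int):
    """Return (call_address, immediate) pairs for `mov eax, imm32; call target`.

    Assumption: the mov sits directly before the call. Call sites where the
    immediate is prepared differently are not found by this scan.
    """
    hits = []
    # mov eax, imm32 = B8 + 4 bytes, call rel32 = E8 + 4 bytes -> 10 bytes total
    for i in range(len(code) - 9):
        if code[i] != 0xB8 or code[i + 5] != 0xE8:
            continue
        (imm,) = struct.unpack_from("<I", code, i + 1)
        (rel,) = struct.unpack_from("<i", code, i + 6)
        call_addr = base + i + 5
        # rel32 is relative to the address of the instruction after the call
        dest = (call_addr + 5 + rel) & 0xFFFFFFFF
        if dest == target:
            hits.append((call_addr, imm))
    return hits
```

Called as `find_decrypt_calls(text_bytes, 0x401000, 0x406481)`, where `text_bytes` and the base address are placeholders for the dumped code section, it returns the address of each matching call together with the immediate passed in EAX.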
2 – Obtaining a trace
By setting a breakpoint at the start and at the end of the function, I set the boundaries of what I am interested in. Next we trace all instructions in between and write the result into a file. I used x64dbg's tracing functionality for this.
x64dbg has its own trace file format[2]. Each file consists of a magic header, a JSON header and a sequence of binary trace blocks. These binary trace blocks record memory accesses, register changes and pretty much everything else that happened during the execution.
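As a rough sketch of what reading the two headers could look like: the code below assumes, based on my reading of the format description[2], that the magic is the four ASCII bytes `TRAC`, followed by a 4-byte little-endian length and then the JSON blob. Treat that layout and the function name `read_trace_header` as assumptions. This is not the parser from the trace viewer, and it does not touch the binary trace blocks at all.

```python
import io
import json
import struct

def read_trace_header(stream):
    """Read the magic and the JSON header; leave `stream` at the first trace block.

    Assumed layout: b"TRAC", then a little-endian uint32 with the JSON length,
    then the JSON blob itself.
    """
    magic = stream.read(4)
    if magic != b"TRAC":
        raise ValueError("not an x64dbg trace file (bad magic)")
    raw_len = stream.read(4)
    if len(raw_len) != 4:
        raise ValueError("truncated file: missing JSON length")
    (json_len,) = struct.unpack("<I", raw_len)
    blob = stream.read(json_len)
    if len(blob) != json_len:
        raise ValueError("truncated file: incomplete JSON header")
    return json.loads(blob.decode("utf-8"))
```

With a real trace this would be called as `read_trace_header(open(path, "rb"))`. Everything after the returned header is the binary block stream, which is the part worth borrowing from an existing parser.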
Parsing this file format is the next task. For educational purposes we could write our own parser, or we can reuse an existing one.
An execution trace viewer[3] for x64dbg traces developed by Teemu Laurila can be adjusted to meet our needs. The tool itself is very useful, but I only need the parsing algorithm. We just extract the part where the trace file is parsed and use it for our own purposes. The adjusted and extracted code can be found on my GitHub page[4].
3 – Using Triton for analysis assistance
Triton[5] is a Dynamic Binary Analysis Framework developed by Jonathan Salwan and offers a wide variety of tools that can be used to analyse binary behaviour. In this case I mainly use it to help me understand the trace. It also comes in handy that the trace contains all register changes as well as all memory accesses made during the decryption.
The routine itself allocates heap memory twice, and both chunks are filled during decryption. The second one holds the decrypted string. In addition, two memory addresses in the .text area are accessed and used for XOR operations.
First we identify which memory areas we are actually interested in:
Bytes from the .text area are used for XORing as well as for filling the first heap chunk. The thread stack is also used in the trace, but it is not as interesting as the other three memory areas.
The Triton script I wrote gives me verbose output each time a memory area I am interested in is accessed, including the offset from the start of the corresponding memory area. This makes static analysis way easier, because I can use this additional information when reading the disassembly in IDA.
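The bookkeeping behind such output is simple and can be sketched without Triton. The snippet below only shows how an accessed address could be mapped to a named region plus offset. The `Region` type, the `describe_access` helper and any addresses passed to them are hypothetical, and feeding it from Triton's load and store accesses is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """A memory area worth reporting, e.g. a heap chunk or a blob in .text."""
    name: str
    start: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size

def describe_access(regions, addr: int, kind: str, size: int):
    """Return a line such as 'read 1 byte(s) at heap_1+0x10', or None.

    `kind` is a free-form label ("read" or "write"). Accesses outside all
    watched regions (the thread stack, for example) return None.
    """
    for region in regions:
        if region.contains(addr):
            return f"{kind} {size} byte(s) at {region.name}+0x{addr - region.start:x}"
    return None
```

Printing the returned line for every memory access in the trace, and skipping the `None` results, yields the kind of region-relative log described above.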
I also uploaded the Python script for decryption to my GitHub page[6]. Feel free to take a look at it if you are interested in how it works. I did not upload the Triton script, because it was very basic. If you want to try writing tools with Triton, I recommend checking out its official GitHub page, which I've already mentioned in this article.
5 – Writing an IDA python script for string decryption
Now that we have the algorithm to decrypt the binary’s strings, we also want to embed this into IDA. I wrote a little script to loop over all instructions and search for string decryption calls. If such a call is found, we comment the decrypted string next to the call.
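A hedged sketch of what such a loop could look like in IDAPython follows. It is not the script I used: it assumes the `mov eax, imm` sits directly before the call, and `decrypt` is a hypothetical callable that takes the immediate and returns the decrypted bytes. `format_comment` and `annotate_calls` are likewise names made up for this example.

```python
DECRYPT_EA = 0x406481  # sub_406481, the string decryption routine

def format_comment(data: bytes) -> str:
    """Render decrypted bytes as a printable one-line comment."""
    text = data.split(b"\x00", 1)[0].decode("latin-1")
    return "decrypted: " + "".join(c if c.isprintable() else "." for c in text)

def annotate_calls(decrypt, decrypt_ea=DECRYPT_EA):
    """Comment every call to the routine with its decrypted string.

    `decrypt` is a hypothetical callable: immediate (int) -> bytes.
    Returns the number of call sites that were annotated.
    """
    # Imported here so format_comment stays usable outside of IDA.
    import idautils
    import idc

    count = 0
    for call_ea in idautils.CodeRefsTo(decrypt_ea, 0):
        prev_ea = idc.prev_head(call_ea)
        is_mov_eax_imm = (
            idc.print_insn_mnem(prev_ea) == "mov"
            and idc.print_operand(prev_ea, 0) == "eax"
            and idc.get_operand_type(prev_ea, 1) == idc.o_imm
        )
        if not is_mov_eax_imm:
            continue  # immediate is set up some other way; leave it alone
        imm = idc.get_operand_value(prev_ea, 1)
        idc.set_cmt(call_ea, format_comment(decrypt(imm)), 0)
        count += 1
    return count
```

Walking the code references to the routine instead of every instruction is a design choice that keeps the loop short. Call sites that do not match the simple `mov`/`call` shape are skipped rather than guessed at.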
That's basically it for decrypting most of the strings. Some are still missing, mostly the ones where the immediate value needed for decryption is determined dynamically.
I hope that next time I look into the sample I don't get stuck again at the second function already ;-). If there are any questions, feel free to ask me on Twitter.