Vendors as well as developers try to protect their products from reverse engineers for multiple reasons. On the one hand they want to protect their intellectual property; on the other hand they might just want to keep blackhats from finding vulnerabilities in their software. In some cases, they will use one of the many commercial solutions out there, which implement some neat tricks to defend software against reverse engineering.
One of those commercial protectors is called Armadillo and it implemented a technology which is known as nanomites. Today I want to show you how this technology works and how it could be implemented for binaries targeting Linux.
Debugging on Linux
Under the hood, debugging on Linux is implemented with the ptrace system call. ptrace is able to control and observe the execution of another process as well as modify its memory and registers: everything we need to debug software.
A system call is a service request from a user-space program to the kernel, so system calls serve as the link between applications and the operating system.
If you take a look at some of the standard functions we usually use, for example to print strings to standard output, you can see that most of them rely on these system calls to communicate with the kernel.
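To make that a bit more tangible, here is a minimal sketch that sends the same string to standard output twice: once through the libc wrapper write() and once through the generic syscall() entry point. The two function names are hypothetical helpers I made up for this illustration. Both end up in the same write system call.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: print via the libc wrapper around write(2). */
long print_with_wrapper(const char *msg) {
    return (long)write(STDOUT_FILENO, msg, strlen(msg));
}

/* Hypothetical helper: issue the same request through syscall(2),
 * naming the system call number explicitly. */
long print_with_raw_syscall(const char *msg) {
    return syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));
}
```

Both helpers return the number of bytes written, or -1 on error. Running a program that calls them under strace should show two write(1, ...) lines.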
The full list of requests is documented in the ptrace manual page (man 2 ptrace).
Only one debugger at a time
It’s important to mention that only one debugger can be attached to a process at a time. This is true for Windows as well as Linux. On Linux, we can verify that by compiling a simple C program and trying to attach to it with two GDB instances.
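The same rule can be observed without GDB. The following is a minimal sketch, not a GDB replacement: a forked child asks to be traced with PTRACE_TRACEME and then asks a second time. Since the process already has a tracer after the first call, the second request should fail with EPERM. The function name and the exit codes are assumptions of mine for this illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper.
 * Returns 1 if the second trace request was rejected with EPERM,
 * 0 if it was unexpectedly accepted, -1 if the experiment could not run. */
int second_trace_request_rejected(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* First request: our parent becomes the tracer. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(2); /* already traced, or ptrace is not permitted here */

        /* Second request: the tracer slot is taken now. */
        errno = 0;
        long second = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        _exit((second == -1 && errno == EPERM) ? 0 : 1);
    }

    /* Parent: collect the child's verdict. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```

On a system where ptrace is restricted (for example inside some sandboxes) even the first request can fail, in which case the helper returns -1.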
Introducing Nanomites
There is a really cool writeup[1] by someone who solved a nanomite CTF challenge targeting the MIPS architecture with symbolic execution. The author summed up nicely what nanomites are: less a single technology and more a recipe for protecting software efficiently. Here is the recipe:
- Two processes have to exist, a father takes care of the son
- The father has to attach to the son with debug APIs (on Linux: ptrace)
- Both processes have to communicate with each other during execution
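The recipe can be sketched in a few lines of C. This is only a bare skeleton under my own assumptions: the function name, the use of SIGSTOP as the first handshake and the exit code 7 are all made up for the illustration and are not taken from Armadillo or from the challenge.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the son's exit code, or -1 on error. */
int spawn_traced_son(void) {
    /* Step 1: two processes exist. */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Step 2: the son lets the father become its tracer ... */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        /* ... and stops so the father can take over. */
        raise(SIGSTOP);
        _exit(7);
    }

    /* Step 3: the father is notified about the stop. This is the point
     * where it could inspect or modify the son before resuming it. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        return -1;

    /* Wait for the son to finish and report its exit code. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A real protector would of course do something meaningful between the stop and PTRACE_CONT; here the father only resumes the son.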
Let’s check how this is implemented in a CTF challenge:
First, fork() is called, resulting in the “son” subprocess being spawned. The “father” takes the right branch, waiting for the child process to change its state via waitpid. The left branch is taken by the child.
The child process checks whether an external debugger is attached by calling ptrace with PTRACE_TRACEME, and exits if the call fails:
19011 ptrace(PTRACE_TRACEME, 0, NULL, NULL) = -1 EPERM (Operation not permitted)
19011 write(1, "So you want to trace me?!\n", 26) = 26
Otherwise it takes the right branch and prepares a signal handler for the SIGFPE signal. Next, the handled signal is triggered on purpose by a division by zero. This wakes up the waitpid call I mentioned before, and the father process now continues its execution. Here is the strace output of the father process:
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGFPE}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGFPE, si_utime=0, si_stime=0} ---
ptrace(PTRACE_CONT, 19029, NULL, SIGFPE) = 0
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
ptrace(PTRACE_GETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_SETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_CONT, 19029, NULL, SIG_0) = 0
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
ptrace(PTRACE_GETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_SETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_CONT, 19029, NULL, SIG_0) = 0
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
ptrace(PTRACE_GETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_SETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_CONT, 19029, NULL, SIG_0) = 0
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSEGV}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
ptrace(PTRACE_GETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_SETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_CONT, 19029, NULL, SIG_0) = 0
wait4(19029, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], 0, NULL) = 19029
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=19029, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
ptrace(PTRACE_GETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_SETREGS, 19029, NULL, 0x7ffed41a8ec0) = 0
ptrace(PTRACE_CONT, 19029, NULL, SIG_0) = 0
wait4(19029, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 19029
The father process continues to communicate with the child process and modifies its registers by calling ptrace with the PTRACE_GETREGS and PTRACE_SETREGS requests. This hinders reverse engineering, since we really need to understand what the communication between these two processes looks like.
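To give a feeling for this kind of conversation, here is a minimal sketch of a father loop that behaves similarly to the trace above: a SIGFPE stop is passed back to the son so its handler runs, and on a SIGTRAP stop the father rewrites a register before resuming. Everything specific is an assumption of mine and not taken from the challenge: x86-64 Linux, raise(SIGFPE) as a stand-in for the real division by zero (which is undefined behaviour in C), an int3 instruction as the trap, and the magic value 42 placed in rax.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

#if !defined(__x86_64__)
#error "This sketch assumes x86-64 Linux (int3 and the rax register)."
#endif

static volatile sig_atomic_t fpe_handled = 0;

static void on_sigfpe(int signo) {
    (void)signo;
    fpe_handled = 1;
}

/* Son: hands control to the father twice, then reports what it saw. */
static void son(void) {
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
        _exit(100); /* somebody else is already tracing us */

    struct sigaction sa = {0};
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, NULL);

    /* Stand-in for the division by zero. */
    raise(SIGFPE);

    /* Trap into the father; it is expected to overwrite rax. */
    long value = 0;
    __asm__ volatile("int3" : "+a"(value) : : "memory");

    _exit((fpe_handled && value == 42) ? 0 : 1);
}

/* Hypothetical helper: returns the son's exit code, or -1 on error. */
int run_father_and_son(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        son();

    for (;;) {
        int status = 0;
        if (waitpid(pid, &status, 0) != pid)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (!WIFSTOPPED(status))
            return -1; /* the son was killed by a signal */

        long inject = 0; /* signal to deliver on resume, 0 = none */
        switch (WSTOPSIG(status)) {
        case SIGFPE:
            inject = SIGFPE; /* let the son's own handler run */
            break;
        case SIGTRAP: {
            struct user_regs_struct regs;
            if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
                return -1;
            regs.rax = 42; /* the "message" from father to son */
            if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) == -1)
                return -1;
            break; /* swallow the SIGTRAP */
        }
        default:
            inject = WSTOPSIG(status); /* pass anything else through */
            break;
        }
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)inject) == -1)
            return -1;
    }
}
```

Without the father, the son cannot produce the right result: the value it needs only ever exists in the father's code. That is exactly what makes this pattern annoying to analyse with a single debugger.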
Final words
The screenshots are from a CTF challenge you can find here[2]. I have already solved one of those challenges which introduce nanomites. I will not share the solution here, because this would destroy all the fun :-). If you need any tips on solving them, you can write to me on Twitter.