A Peek into a Reverse Engineering Challenge

In this post, I will step through a “crackme,” a program written for the sole purpose of letting people practice bypassing software security mechanisms. A number of tools can be used for this kind of work, including a debugger and a hex editor. For my debugger, I will be using Ollydbg, which runs on Windows and is free to download.

The crackme that I will be walking through is called FAKE.exe. If you’re following along, download FAKE.exe along with Ollydbg.

First, I will execute FAKE.exe to see what the program looks like and what it does:

[Screenshot 1]

Not very interesting, but clicking on the “Register” menu item reveals a registration dialog:

[Screenshot 2]

I typed “123456789” into the box and clicked “Enter Serial” and was presented with the following:

[Screenshot 3]

Let’s open up Ollydbg and use it to see if we can make “123456789” a correct serial number!

In Ollydbg, go to File->Open and select FAKE.exe. The file will load and you will see a window like the following:

[Screenshot 4]

To find where the program checks the serial number, we will search for the strings referenced by the program. The code that decides whether the serial number is correct should be close to the place where the phrase “That serial is incorrect” is used. To search for the referenced strings in Ollydbg, right-click the main section and choose “Search for->All referenced text strings” from the context menu.

A new window will appear with the referenced strings. We are searching for the “That serial is incorrect” string.

[Screenshot 5]

We can see that this string is easily found in this small executable. Double-click that string to jump to the code that uses it, which is close to the checking portion of the program. There we can also see that the “That serial is correct!!!!!” string is very close. How can we manipulate the program flow to see this message?

[Screenshot 6]

If we follow the small arrows next to the low-level instructions, we can see that the arrow originates at a JNE instruction, as highlighted in green below:

[Screenshot 7]

JNE stands for “Jump if not equal.” It is used in conjunction with the CMP instruction right before it. CMP compares two values and sets the zero flag if they are equal. JNE looks at the zero flag and, if it is not set, takes the jump. Otherwise the jump is not taken and execution continues with the next instruction.
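The interplay between CMP and JNE can be sketched in a few lines of Python. This is only a toy model of the zero flag, not how the CPU or Ollydbg works internally, and the function names `cmp_zero_flag` and `jne_taken` are made up for illustration.

```python
# Toy model of CMP followed by JNE. A real CPU updates several flags;
# only the zero flag (ZF) matters for JNE, so that is all we track here.

def cmp_zero_flag(a: int, b: int) -> int:
    """Return the zero flag after comparing a and b (1 if equal, else 0)."""
    # CMP computes a - b, throws the result away, and keeps the flags.
    return 1 if (a - b) & 0xFFFFFFFF == 0 else 0

def jne_taken(zero_flag: int) -> bool:
    """JNE jumps only when the zero flag is clear."""
    return zero_flag == 0

zf = cmp_zero_flag(0x1234, 0x5678)   # the values differ, so ZF is 0
print("jump taken:", jne_taken(zf))  # the "incorrect serial" path

zf = 1                               # what flipping Z in the debugger does
print("jump taken:", jne_taken(zf))  # execution falls through instead
```

Forcing the flag to 1 by hand is exactly the trick used in the rest of this walkthrough: the comparison still fails, but the jump that depends on it no longer fires.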

Since the JNE instruction jumps to the code that shows the “That serial is incorrect” string, we do not want this jump to happen. So double-click each of the three CMP instructions to set breakpoints (the address of the instruction will turn red) and run the program by clicking the “play” icon in the toolbar.

[Screenshot 10]

The program will run. Click the “Register” menu item, type “123456789” into the serial number box, and click “Enter Serial.” The program will pause at the first CMP instruction. Pay attention to the CPU registers on the right-hand side, in particular the Z flag. This is the zero flag.

[Screenshot 8]

Click on the Debug->Step Over menu item to advance one CPU instruction. You will notice that the arrow next to the JNE instruction is red. This means the jump will be taken. Double-click the Z flag on the right to change it from 0 to 1. The arrow will turn black, which means the jump will no longer be taken.

[Screenshot 9]

Click the “play” icon again to resume execution. The program will pause at the next CMP instruction. Step over the CMP instruction to the JNE instruction and change the Z flag to 1 just like before. Repeat for the third breakpoint. When you click the play icon for the third time, you will see this in the registration dialog:

[Screenshot 11]

You win!

Keep in mind that this is an incredibly simple registration dialog. Actual programs are much more difficult. Enjoy this peek into what reverse code engineering is like! If you’re interested in learning more about Ollydbg, you can watch the following video:

Anti-Reverse Code Engineering

Anti-reverse code engineering is a set of techniques that software developers use to prevent people (called “reversers”) from reverse code engineering their software. Developers want to prevent this because their software may contain proprietary algorithms they want to protect, or cryptographic mechanisms that need to be a “black box” that prevents anyone from looking inside.

Software developers use a tool called a debugger that allows them to step through their software and track down bugs that exist in their programs and also inspect program behavior at run time. Debuggers also allow software developers to change variables and other values that exist in memory during the execution of their programs. During software development, developers compile their code in “debug” mode, that is, the compiled programs have extra data included in the executable for the debugger to use. Before a software developer ships their programs, they compile their code in a “release” mode that does not include all the useful debugging metadata that assist with software debugging.

Reversers also use debuggers when reverse code engineering software. Reversers are using these debuggers on “released” software rather than “debug” software. When reversers are using these debuggers, they do not get the very helpful data for the debugger, so figuring out how a program is behaving is a little more difficult. It is still entirely possible for a reverser to find these proprietary algorithms using a debugger on “released” software. The central processing unit, or CPU, of a computer executes low-level instructions of a program. It is impossible to execute encrypted low-level instructions because the CPU will not understand them if they are encrypted. Thus, if software developers use encrypted programs, the decryption portion cannot also be encrypted, which implies that the low-level instructions for the decryption functionality can be read and understood by reversers. Thus reversers will be able to decrypt the encrypted program, making the efforts of the software developer fruitless.

To combat this, software developers have developed techniques to detect whether a program is running in a debugger. There are various technical ways of accomplishing this, but the idea is the same: if a program can detect that a debugger is attached to it, it can quit and refuse to execute, which keeps the reverser from using the debugger. Software developer Tyler Shields gave a talk on this very subject, and at the very beginning of his talk he notes that these techniques are no silver bullet and will only stop a beginning or intermediate reverse code engineer.
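As a rough illustration of debugger detection, here is a minimal Python sketch. It assumes three checks that I picked for the example: the Windows API call `IsDebuggerPresent`, the `TracerPid` field in `/proc/self/status` on Linux, and `sys.gettrace()` as a fallback that only notices Python-level debuggers. The function name `debugger_attached` is hypothetical, and real protections are usually written in native code and use many more tricks.

```python
import ctypes
import sys

def debugger_attached() -> bool:
    """Best-effort check for a debugger attached to the current process."""
    if sys.platform == "win32":
        # IsDebuggerPresent is exported by kernel32; nonzero means "debugged".
        return bool(ctypes.windll.kernel32.IsDebuggerPresent())
    if sys.platform.startswith("linux"):
        try:
            with open("/proc/self/status") as status:
                for line in status:
                    # TracerPid is the PID of the tracing process, or 0.
                    if line.startswith("TracerPid:"):
                        return int(line.split()[1]) != 0
        except OSError:
            pass
    # Fallback: only detects a Python-level tracer such as pdb.
    return sys.gettrace() is not None

if debugger_attached():
    print("Debugger detected; a protected program would quit here.")
else:
    print("No debugger detected.")
```

A reverser can defeat a check like this the same way as any other check: find the call in the debugger and force its result, which is why these techniques only slow people down.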

Another way of preventing reverse code engineering is to obfuscate or encrypt the executable. How does this work? Essentially the software developer will compile their program as normal, then encrypt it. They will then write another program that takes the original encrypted program and decrypts it and executes it at run time. This technique makes analyzing the program very difficult when it is not being executed and if this technique is used with debugger detection, the reverser will have a very difficult time accomplishing their goal.
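The encrypt-then-decrypt-at-run-time idea can be sketched as follows. This is a toy under loud assumptions: a repeating-key XOR stands in for real encryption, a line of Python source stands in for the compiled program, and the key and the name `xor_bytes` are made up for the example.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# Stand-in for the compiled program; here it is just Python source text.
payload = b"print('hello from the protected program')"
key = b"not-a-real-key"   # hypothetical key, embedded in the loader

# Done once, before shipping: only the encrypted blob is distributed.
encrypted = xor_bytes(payload, key)

# Done by the loader at run time: decrypt in memory, then execute.
decrypted = xor_bytes(encrypted, key)
exec(decrypted.decode())
```

Note that the loader itself, including the key, has to stay readable for the CPU (or here, the interpreter) to run it, which is the weakness described earlier.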

In the world of computing, anything people try to hide in software or hardware can be seen by anyone determined enough to work through the layers of security put in place. Nothing is completely secure, but software developers can make it very difficult.

Debuggers, Not just a Developer’s Tool

Traditionally, software developers debugged their programs using “print” statements. Essentially, developers sprinkle these print statements throughout their code to print the values of variables at various stages of the program’s execution. This enables them to verify that the values of the variables are reasonable and as expected. This is a very useful technique in debugging software, but it is very time consuming. Not only does a developer have to place all these print statements in their code, but once they are ready to ship their programs, they must remove the debugging code. The solution to this problem is another piece of software called a debugger.

A debugger is a tool that a software developer uses to find and troubleshoot bugs in the software they create. Many debuggers exist: some are open source and free to use, and others are proprietary and cost money. These tools attach to running programs and can inspect low-level details such as the values of the registers in the central processing unit (CPU), the values in random access memory (RAM), and the current low-level instruction being executed. Not only can a programmer inspect these values, they can also modify them at run time. This makes a debugger an incredibly valuable tool for debugging software. If a developer’s program crashes, the debugger gives a report of what caused the crash to help the developer track down the error.

The GNU Project Debugger, commonly known as GDB, is a very powerful debugger used by millions of developers around the world. It is free and open source. There are many tutorials on how to use it all over the Internet, including some videos:

While debuggers are very useful for software developers, they are essential for reverse code engineers. Reverse code engineers are able to attach a debugger to software and pause and manipulate execution of the program at will. This is incredibly powerful for a reverse code engineer because they are able to change the program flow. For example, take a piece of software that has a registration dialog. The dialog expects a user to type in their username and serial number (which they must pay for) and the software runs an algorithm on the username. If the output of that algorithm matches what is typed in the serial number box, then the user must have paid for the software and it becomes registered.

A reverse code engineer can find where this algorithm is run in the debugger. When the software checks to see if the output of the algorithm and the serial number match, the reverse code engineer can pause execution there and manipulate the values of these variables. Then the reverse code engineer can ensure that these values match, or just always make the program say “yes, they match,” forcing registration of the program. Once the reverse code engineer has found where this check is, he or she can write a small program that will patch this software in order to “crack” it. All thanks to a useful tool called a debugger.
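To give a feel for what such a small patching program does, here is a hedged Python sketch. It assumes the check ends in the two-byte short form of JNE (opcode 0x75 followed by a one-byte offset) and replaces it with two NOP instructions (0x90) so the jump never happens. The sample bytes and the offset are made up; a real patcher would read the executable from disk and use an offset found in the debugger.

```python
JNE_SHORT = 0x75   # opcode of the two-byte short form of JNE/JNZ
NOP = 0x90         # opcode of NOP (do nothing)

def nop_out_jne(image: bytes, offset: int) -> bytes:
    """Return a copy of image with the short JNE at offset replaced by NOPs."""
    if image[offset] != JNE_SHORT:
        raise ValueError(f"no short JNE at offset {offset:#x}")
    patched = bytearray(image)
    patched[offset:offset + 2] = bytes([NOP, NOP])
    return bytes(patched)

# Made-up machine code: CMP EAX, EBX ; JNE +5 ; MOV EAX, 1
code = bytes([0x3B, 0xC3, 0x75, 0x05, 0xB8, 0x01, 0x00, 0x00, 0x00])
patched = nop_out_jne(code, 2)
print(code.hex(" "))
print(patched.hex(" "))
```

Replacing the jump with instructions of the same total length matters, because it keeps every other address in the program unchanged.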

Reverse Code Engineering for Fun and Profit

Reverse code engineering is the art of studying a binary software program, binary file format, or binary networking protocol to determine how it works at a technical level. This involves inspecting the very low-level technical details of the program and converting that low-level code into high-level code. There are numerous reasons a person would want to do this: the original source code to a program may be lost, the documentation for a programming library may not match the library’s actual behavior, or there may be technical “locks” to break.

This craft is usually practiced on the Windows platform (and Mac OS X), since these are the platforms where most software is proprietary. Far less reverse engineering is done on Linux applications because the majority of the applications for a Linux computer are open source (as is Linux itself!). If a person has access to the original source code, then there is no need to inspect the low-level code to determine what the program does.

Many people reverse engineer software programs purely for fun. There are programs developed for the sole purpose of being reverse engineered. In the video below, you can see YouTube user “XyliboxFrance” add additional functionality to a program using reverse engineering. These sorts of programs are called “InjectMes,” programs developed just so reverse engineers can add features to them:

There is a faction of people that look down upon reverse code engineering. Many of them want reverse engineering to be made illegal. These people feel that the only purpose of reverse code engineering is to steal corporate trade secrets or break software protection mechanisms. The only reason a person will reverse code engineer, in their eyes, is to steal.

I will not go into the debate of software piracy, but I will say that there are people out there that do exactly that: steal software. Many companies offer trial versions of their software with an option of buying a serial number or license key in order to unlock all the restrictions the trial software imposed. There are reverse engineers, called “crackers,” that willfully break the software protection mechanisms for trial software so they can steal it and distribute their “crack” to others.

In the reverse code engineering community, programmers have created programs called “crackmes,” which are programs that typically have “name” and “serial number” fields (much like a dialog for a registration box for trial software). The goal is to reverse code engineer the program, and discover the algorithm to generate a valid name and serial, then “register” the software using the cracked credentials. There is often a secondary goal of creating a key generator, affectionately known as a “keygen,” which is another program that accepts a name for the input, and gives a valid serial number for the output. There is a website called crackmes.de that offers many crackmes and other reverse code engineering programs for a person to practice and hone their reverse code engineering skills.
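Here is a minimal sketch of how a name/serial check and its keygen relate. The serial algorithm below is entirely made up for illustration (a small rolling checksum of the name); every crackme uses its own, and discovering it is the actual challenge.

```python
def serial_for(name: str) -> str:
    """Hypothetical serial algorithm: a small rolling checksum of the name."""
    total = 0
    for ch in name.upper():
        total = (total * 31 + ord(ch)) % 1_000_000
    return f"{total:06d}"

def is_registered(name: str, serial: str) -> bool:
    """The check a crackme would perform on its two dialog fields."""
    return serial == serial_for(name)

# The "keygen": once the algorithm is understood, it is just serial_for().
name = "brad"
serial = serial_for(name)
print(name, serial, is_registered(name, serial))
```

The point is that the program has to contain the algorithm in order to check the serial, so anyone who can read the low-level code can reproduce it.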

One of These is Not Like the Other

NYU-Poly hosts an annual Capture the Flag (CTF) competition called CSAW (Cyber Security Awareness Week). They have a qualification competition and then a final competition for the top few teams from the qualification round. Our team, the Whitehatters Computer Security Club at the University of South Florida (WCSC for short), competed in the qualification CTF.

There were a number of categories in this particular CTF, including trivia, reconnaissance, web, reversing, exploitation, forensics, and networking. Several people on my team looked at the forensics challenges, but one of them eluded all of us. The title of the challenge was “One of These is Not Like the Other” and it consisted of a simple PNG image. The original image is shown below:

CSAW 2012 Qualification - Forensics 200

At first glance, we all assumed some sort of steganography, since that is the practice of hiding messages inside of images or audio files. After a bit of steganography analysis on the image, I concluded the actual picture was irrelevant and was intended to be a red herring.

A common tool used in CTF challenges is called strings. Running strings on this picture generates output that contains many interesting fields of the format:

key{FIRSTNAME LASTNAME}

(The full output can be viewed here: https://gist.githubusercontent.com/billymeter/c09747733d5810953e49/raw/58681a0fa59e51048e12e9d2234439053f29a1d2/gistfile1.txt)

To defeat the challenge, a key, or flag, must be submitted to the scoreboard. There are plenty of keys in this file, but which one is the correct one? Running the following command:

brad@bt[~/Desktop]
[15:22]: strings version1.png | grep key{ | wc
     500    1001   10490

shows that there are 500 keys in this file! Which one is the correct one?
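For anyone without strings handy, the same count can be approximated in Python. This is a rough sketch that assumes strings' usual behavior of reporting runs of at least four printable ASCII characters; the function names and the tiny sample data are made up, and the real input would be the bytes of version1.png.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes, roughly like strings."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

def count_keys(data: bytes) -> int:
    """Count the extracted strings that contain 'key{'."""
    return sum("key{" in s for s in ascii_strings(data))

# Tiny made-up sample instead of the real version1.png.
sample = (b"\x89PNG\r\n\x1a\n\x00\x00tEXtcomment\x00key{jane doe}"
          b"\xff\x00key{john roe}\x01")
print(count_keys(sample))
```

To run it against the challenge file, read the file in binary mode and pass the bytes to `count_keys`.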

I searched for some sort of pattern in the names themselves, but never found one. Thinking back to the challenge category, forensics, prompted me to actually research the technical file format for a PNG file.

Reviewing the PNG technical specification, found here: http://www.libpng.org/pub/png/spec/iso/index-object.html, shows that PNG files are composed of data structures called chunks. Opening the image file in a hex editor, we can see that all of these keys embedded in the file are tEXt chunks, as shown below:

[Screenshot: hex editor view of the tEXt chunks]

Rather than part of the image, or one of the names in the tEXt chunks, being “not like the others,” perhaps the structure of one of these tEXt chunks is not like the others. If one of the chunks in this PNG file is not formatted properly, then surely there is a tool to help us find the bad chunk.

Luckily, there is. The particular tool that I used is called pngcheck. It can be found here: http://www.libpng.org/pub/png/apps/pngcheck.html. This tool will scan and make sure that a PNG image is formatted properly. Running pngcheck on the image:

brad@bt[~/Desktop]
[16:10]: pngcheck version1.png
version1.png  CRC error in chunk tEXt (computed 5005ed3c, expected 26594131)
ERROR: version1.png

shows that there is an error with a tEXt chunk! But which one is it? We will check the help documentation for pngcheck:

brad@bt[~/Desktop]
[16:11]: pngcheck
PNGcheck, version 2.3.0 of 7 July 2007,
   by Alexander Lehmann, Andreas Dilger and Greg Roelofs.

Test PNG, JNG or MNG image files for corruption, and print size/type info.

Usage:  pngcheck [-7cfpqtv] file.{png|jng|mng} [file2.{png|jng|mng} [...]]
   or:  ... | pngcheck [-7cfpqstvx]
   or:  pngcheck [-7cfpqstvx] file-containing-PNGs...

Options:
   -7  print contents of tEXt chunks, escape chars >=128 (for 7-bit terminals)
   -c  colorize output (for ANSI terminals)
   -f  force continuation even after major errors
   -p  print contents of PLTE, tRNS, hIST, sPLT and PPLT (can be used with -q)
   -q  test quietly (output only errors)
   -s  search for PNGs within another file
   -t  print contents of tEXt chunks (can be used with -q)
   -v  test verbosely (print most chunk data)
   -x  search for PNGs within another file and extract them when found

Note:  MNG support is more informational than conformance-oriented.

If we run pngcheck with the -7 flag, then we should see where the bad tEXt chunk is.

brad@bt[~/Desktop]
[16:11]: pngcheck -7 version1.png
File: version1.png (1443898 bytes)
XML:com.adobe.xmp:
    (no translated keyword, 393 bytes of UTF-8 text)
comment:
    key{rodney danielle}
comment:
    key{matthieu blayne}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<< SNIP >>>>>>>>>>>>>>>>>>>>>>>>>>>>
comment:
    key{nguyen willie}
comment:
    key{takeuchi gregory}
version1.png  CRC error in chunk tEXt (computed 5005ed3c, expected 26594131)
ERROR: version1.png

pngcheck reported the CRC error right after printing key{takeuchi gregory}, so that is the tEXt chunk that was bad. Submitting “takeuchi gregory” to the scoreboard solved the challenge.
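The same check can be done without pngcheck. Per the PNG specification, each chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC-32 computed over the type and data. The sketch below walks the chunks and reports any whose stored CRC does not match. The function names are my own, and the demo runs on a small made-up byte string rather than the real challenge file.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def bad_crc_chunks(png: bytes):
    """Yield (type, data) for every chunk whose stored CRC does not match."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        if pos + 12 + length > len(png):
            break  # truncated chunk; stop rather than guess
        data = png[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        # The CRC covers the chunk type and data, but not the length field.
        if zlib.crc32(ctype + data) != stored:
            yield ctype.decode("latin-1"), data
        pos += 12 + length

def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build a well-formed chunk (helper for the demo below)."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

# Demo on a made-up byte string, not the real challenge file.
good = make_chunk(b"tEXt", b"comment\x00key{jane doe}")
bad = bytearray(make_chunk(b"tEXt", b"comment\x00key{john roe}"))
bad[-1] ^= 0xFF   # corrupt the stored CRC of the second chunk
demo = PNG_SIGNATURE + good + bytes(bad) + make_chunk(b"IEND", b"")

for ctype, data in bad_crc_chunks(demo):
    print(ctype, data)
```

Reading version1.png in binary mode and passing its bytes to `bad_crc_chunks` should single out the same tEXt chunk that pngcheck complained about.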