Reversing a custom VM obfuscator
by Peter Stolz
This is my first writeup of a CTF challenge. It covers protector, a medium-difficulty reversing challenge from the KITCTF.
Challenge description:
To ensure the best protection of our proprietary software used in all of our slot machines, we developed the ultimate protector. It works way better than all those boring protectors like VMProtect or Themida. Right???
If we open the binary in Ghidra, it cannot reconstruct any functions, so we are dealing with some kind of binary obfuscation.
According to strace, the binary only uses the syscalls read, write and exit.
Looking at the sections of the binary, e.g. with Ghidra, we see there are:
asdf: RW
asdf2: RWX
Running checksec, we can see that the binary is not position-independent (no PIE), so ASLR does not move its sections, which are mapped at 0x500000 and 0x600000.
Disassembling the entry point, we see that the program starts by decrypting the first 32 bytes of code at 0x600000 before it jumps there.
00401010 4d 31 2c 24 XOR qword ptr [R12]=>LAB_00600000,R13
00401014 49 83 c4 08 ADD R12,0x8
00401018 4d 31 2c 24 XOR qword ptr [R12]=>LAB_00600008,R13
0040101c 49 83 c4 08 ADD R12,0x8
00401020 4d 31 2c 24 XOR qword ptr [R12]=>DAT_00600010,R13 = 07B34B0C4A63E7D9h
00401024 49 83 c4 08 ADD R12,0x8
00401028 4d 31 2c 24 XOR qword ptr [R12]=>DAT_00600018,R13 = CB7F87ABA5115890h
0040102c 49 83 ec 18 SUB R12,0x18
00401030 41 ff e4 JMP R12=>LAB_00600000
The jump destination at 0x600000, after the entry code has decrypted it:
0x600000: xor rsi,0xffffffff9cce4d23
0x600007: nop
0x600008: nop
0x600009: nop
0x60000a: nop
0x60000b: nop
0x60000c: nop
0x60000d: nop
0x60000e: nop
0x60000f: nop
0x600010: movabs r15,0x401033
0x60001a: jmp r15
Before decryption, the same bytes disassemble to nonsense:
0x600000: fcomp st(1)
0x600002: cmps BYTE PTR ds:[rsi],BYTE PTR es:[rdi]
0x600003: jns 0x600006
0x600005: test DWORD PTR [rdi],ebp
0x600007: xchg edi,eax
0x600008: add al,cl
0x60000a: ror dl,0xdc
0x60000d: (bad) [rbx]
0x60000f: xchg edi,eax
0x600010: (bad)
0x600012: movsxd ecx,DWORD PTR [rdx+0xc]
0x600015: rex.WXB mov r11b,0x7
Each instruction block is 32 bytes long. Above is the disassembly of the first code block. It looks like one instruction padded with NOPs followed by a jump back to the main code section.
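The initial key in r13 is not visible in the listings above, but the two listings together are enough to recover it with a known-plaintext XOR. The following is a minimal sketch using only the values shown above: the encrypted qwords that Ghidra displays at 0x600010 and 0x600018, and the encoding of `movabs r15, 0x401033` (49 bf followed by the 64-bit immediate). The variable names are made up for illustration.

```python
import struct

# Encrypted qwords as annotated by Ghidra in the entry listing
enc_10 = 0x07B34B0C4A63E7D9  # DAT_00600010
enc_18 = 0xCB7F87ABA5115890  # DAT_00600018

# First 8 decrypted bytes at 0x600010: movabs r15, 0x401033 (49 bf + low 6 bytes of the imm64)
plain_10 = struct.unpack("<Q", bytes.fromhex("49bf331040000000"))[0]

# ciphertext = plaintext ^ key, therefore key = ciphertext ^ plaintext
key = enc_10 ^ plain_10
print(hex(key))  # 0x7b34b4c5a505890

# Cross-check: decrypting the next qword must give the rest of the trampoline
tail = struct.pack("<Q", enc_18 ^ key)
print(tail.hex())  # 000041ffe7cccccc -> end of imm64, jmp r15 (41 ff e7), 0xcc padding
```

The second print doubles as a sanity check: the recovered key turns the qword at 0x600018 into the tail of the immediate, the `jmp r15` and three padding bytes.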
Let’s look at the jump destination 0x401033 from above:
00401033 4d 89 ee MOV R14,R13
00401036 4d 33 34 24 XOR R14,qword ptr [R12]
0040103a 49 83 c4 08 ADD R12,0x8
0040103e 4d 33 34 24 XOR R14,qword ptr [R12]
00401042 49 83 c4 08 ADD R12,0x8
00401046 4d 33 34 24 XOR R14,qword ptr [R12]
0040104a 49 83 c4 08 ADD R12,0x8
0040104e 4d 33 34 24 XOR R14,qword ptr [R12]
00401052 49 83 ec 18 SUB R12,0x18
00401056 4d 31 2c 24 XOR qword ptr [R12],R13
0040105a 49 83 c4 08 ADD R12,0x8
0040105e 4d 31 2c 24 XOR qword ptr [R12],R13
00401062 49 83 c4 08 ADD R12,0x8
00401066 4d 31 2c 24 XOR qword ptr [R12],R13
0040106a 49 83 c4 08 ADD R12,0x8
0040106e 4d 31 2c 24 XOR qword ptr [R12],R13
00401072 49 83 c4 08 ADD R12,0x8
00401076 4d 89 f5 MOV R13,R14
00401079 eb 95 JMP entry::decryptPage
The first few instructions XOR the four qwords of the code block that was just executed into a copy of the old decryption key.
Afterwards (from 0x401052 on) the program re-encrypts the old block with the old key. Hence we can't simply dump the decrypted code from memory, as only one block is decrypted at any time.
The last instruction before the jump sets the encryption key to the result of the first XORs.
As this is all the code in the .text section, we can infer the meaning of the registers:
r12 = The offset into the decrypted page (Program Counter)
r13 = Streaming key
r14 = Next Streaming key
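To make the register roles concrete, here is a small Python model of one trip through the dispatcher. It is only a sketch of the logic read off the listings above: `mem` stands in for the RWX section, `pc` for r12 (as an offset into `mem` rather than an absolute address), and `key` for r13. The function names are hypothetical, and the actual execution of the decrypted block is left out.

```python
import struct

BLOCK = 32  # bytes per VM block, i.e. four little-endian qwords


def xor_block(block: bytes, key: int) -> bytes:
    """XOR each of the four qwords of a 32-byte block with key."""
    qwords = struct.unpack("<4Q", block)
    return struct.pack("<4Q", *(q ^ key for q in qwords))


def vm_step(mem: bytearray, pc: int, key: int):
    """Model one block: decrypt, (execute), derive next key, re-encrypt.

    Returns the plaintext block, the next pc and the next key.
    """
    # entry::decryptPage: decrypt the current block in place with r13
    plain = xor_block(bytes(mem[pc:pc + BLOCK]), key)
    mem[pc:pc + BLOCK] = plain

    # 0x401033..0x40104e: r14 = r13 ^ q0 ^ q1 ^ q2 ^ q3 over the decrypted block
    next_key = key
    for q in struct.unpack("<4Q", plain):
        next_key ^= q

    # 0x401056..0x40106e: re-encrypt the block with the old key
    mem[pc:pc + BLOCK] = xor_block(plain, key)

    # 0x401072 and 0x401076: r12 now points at the next block, r13 = r14
    return plain, pc + BLOCK, next_key
```

Calling `vm_step` repeatedly, starting with the initial key, walks the blocks in order and leaves `mem` exactly as encrypted as it was before, which matches the observation that only one block is ever readable at a time.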
With pwntools we can reconstruct the encryption and decrypt the code section:
from pwn import *
from binascii import hexlify as hx

context.binary = e = ELF('./protector')

# initial value of r13
stream_key = 0x7b34b4c5a505890
instructions = []

# blocks are decrypted in linear order;
# super annoying if they use loops in their code
for pc in range(0x600000, 0x88387f, 32):
    code = e.read(pc, 32)
    instruction = b""
    new_key = stream_key
    for i in range(0, 32, 8):
        # decrypt one qword with the current key
        instruction += p64(u64(code[i:i+8]) ^ stream_key)
        # fold the qword into the key for the next block
        new_key ^= u64(code[i:i+8])
    # print(hx(instruction))
    instructions.append(instruction)
    stream_key = new_key

print(disasm(b"".join(instructions)))
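One detail of the script may look wrong at first: the dispatcher folds the decrypted qwords into the key, while the script folds in the encrypted ones. Both give the same result. Each ciphertext qword is C_i = P_i ^ K, so C_0 ^ C_1 ^ C_2 ^ C_3 contains K four times and the key cancels out, leaving P_0 ^ P_1 ^ P_2 ^ P_3. A minimal check of this, using an arbitrary stand-in block rather than data from the binary (the helper name `fold` is made up):

```python
import struct


def fold(key: int, block: bytes) -> int:
    """XOR the four little-endian qwords of a 32-byte block into key."""
    for q in struct.unpack("<4Q", block):
        key ^= q
    return key


key = 0x07B34B4C5A505890
plain = bytes(range(32))  # arbitrary stand-in for a decrypted block

# Encrypt the block the way the protector does: every qword XORed with the key
cipher = struct.pack("<4Q", *(q ^ key for q in struct.unpack("<4Q", plain)))

# The key occurs an even number of times in the ciphertext fold, so it cancels
same = fold(key, plain) == fold(key, cipher)
print(same)  # True
```

This only works because a block holds an even number of qwords. With an odd count the script would have to fold in the decrypted qwords instead.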
With a bunch of greps we removed the padding and the jumps back to the main section, so that only the real instructions remain. We then put them back into a binary with pwntools’s ELF library.
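The same cleanup can also be done on the raw bytes instead of grepping the disassembly text. The sketch below is not the method used here, just an alternative under two assumptions: every block ends with the same trampoline as the first one (`movabs r15, 0x401033; jmp r15`), and the bytes between the real instruction and the trampoline are 0x90 nops. The names `TRAMPOLINE` and `strip_block` are hypothetical.

```python
# movabs r15, 0x401033 ; jmp r15 (encoding taken from the first decrypted block)
TRAMPOLINE = bytes.fromhex("49bf3310400000000000" "41ffe7")


def strip_block(block: bytes) -> bytes:
    """Return the real instruction bytes of one decrypted 32-byte block."""
    idx = block.find(TRAMPOLINE)
    if idx < 0:
        # Unexpected layout: keep the block untouched for manual inspection
        return block
    # Drop the trampoline and everything after it, then the nop padding.
    # Caveat: this would also eat a genuine trailing 0x90 byte of an instruction.
    return block[:idx].rstrip(b"\x90")


# First block: xor rsi, 0xffffffff9cce4d23; nine nops; trampoline; 0xcc padding
# (the 0xcc bytes are what XORing the listed qword at 0x600018 with the key yields)
first_block = bytes.fromhex(
    "4881f6234dce9c" + "90" * 9 + "49bf3310400000000000" + "41ffe7" + "cc" * 3
)
print(strip_block(first_block).hex())  # 4881f6234dce9c
```

Because of the caveat in the comment, the output is worth comparing against the disassembly before it is written back into a binary.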
We then threw angr at that binary, mapping the asdf data section into the address space as well.
After a few minutes the flag popped out of this script:
import angr, claripy, monkeyhex
from pwn import *

e = ELF("protector")
filename = "protector_decrypted"
proj = angr.Project(filename)

init_state = proj.factory.entry_state(
    #add_options = {angr.options.ZERO_FILL_UNCONSTRAINED_REGISTERS}
)
# map the asdf data section of the original binary into the state
init_state.memory.store(0x500000, e.read(0x500000, 0x40a), 0x40a)
simgr = proj.factory.simgr(init_state)

def instate(string):
    # predicate: has this state written `string` to stdout?
    return lambda s: string.encode() in s.posix.dumps(1)

info("Starting angr: (Exploit takes around 2:30 minutes)")
simgr.explore(find=instate('yes'), avoid=instate('no'))

if simgr.found:
    info("YOU ROCK" + "!" * 10)
    state = simgr.found[0]
    # the stdin contents that lead to 'yes' are the flag
    info(state.posix.dumps(0).decode())
In conclusion I learnt to use pwntools a bit better and it was a super fun challenge.
tags: rev - linux - ghidra - pwntools - angr