Shellcode, Linux System Calls, and Format String Vulnerabilities

What is Shellcode?

Shellcode is machine level code that can be executed on a computer at runtime. Shellcode gets its name from being a payload that can generate a shell, but we can use shellcode read, create, and delete files. In yesterday’s example, we had a win function, but in the real world we don’t usually have a win function.

Running Shellcode in C

We are going to take some premade shellcode, and run it in a C program.

Writing Shellcode by Hand

We haven’t looked at writing assembly at all so far in this course, but many times you can’t use premade shellcode because of input restrictions (certain characters can’t be used, certain calls can’t be made by a program, or there is a size limit).

To develop our own assembly programs, we will use nasm and ld to turn our compiled assembly programs into executables. For example if we had assembly in hello.asm we could compile and execute it using

nasm elf -f hello.asm #creates an object file hello.o 
ld -m elf_i386 -s -o hello hello.o #links the object file to create the executable
./hello

Here are some template files: Makefile and template.asm

section .text: 
	global _start ;;tells the linker where start is

_start:  
	xor ebx, ebx
	push ebx
	push 0x0a21 ;;push '!'
	push 0x646c726f 
	push 0x57202c6f ;;push 'o, W' 
	push 0x6c6c6548 ;; push 'Hell' in little-endian
	mov ebx, 1 ;; the file descript, stdout in this case
	mov ecx, esp ;; address to read from
	mov edx, 15  ;; len of message
	mov eax, 4 ;; sys_call number 
	int 0x80 ;; call to kernel
	
	xor ebx, ebx
 
	mov eax, 1 ;; sys_call number (exit) 
	int 0x80 ;; call to kernel

What are those int 0x80 calls??

An int in assembly is an interrupt and the 0x80 refers to the linux kernel. Other system calls we can perform are open, read, write, and exec. One thing to notice is that arguments are passed differently to system calls than normal functions in x86. Normally these arguments are passed via the stack, but in system calls they are passed via the general purpose registers.

Turning Assembly into Shellcode

If we have a compiled executable, we can convert it’s output to shellcode with

objdump -D hello |grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

If there are any \x00’s in your output, rewriting your assembly is necessary to remove them. Strings are null-terminated in C, so your string will no longer be read after the null-byte. One strategy is to not use full registers for full operations. Let’s do that with the Hello World from above.

section .text: 
	global _start ;;tells the linker where start is

_start: 
	sub esp, 0x10 ;; make room on the stack for "Hello, World!"
	mov dword [esp], 0x6c6c6548 ;; push 'Hell' in little-endian
	mov dword [esp + 4], 0x57202c6f ;;push 'o, W' 
	mov dword [esp + 8], 0x646c726f ;;push 'orld'
	mov dword [esp + 12], 0x0a21 ;;push '!\n'
	mov bl, 1 ;; the file descriptor, stdout in this case
	mov ecx, esp ;; address to read from
	mov dl, 14  ;; len of message
	mov al, 4 ;; sys_call number 
	int 0x80 ;; call to kernel
	
	mov al, 1 ;; sys_call number (exit) 
	int 0x80 ;; call to kernel

Now our shellcode looks like: "\x83\xec\x10\xc7\x04\x24\x48\x65\x6c\xc7\x44\x24\x04\x6f\x2c\x57\xc7\x44\x24\x08\x6f\x72\x64\x66\xc7\x44\x24\x0c\x21\xb3\x01\x89\xe1\xb2\x0e\xb0\x04\xcd\x80\xb0\x01\xcd\x80"

Using pwntools for Shellcode

The pwntools library has a shellcraft module that is really useful. To get shellcode that runs /bin/bash we can run

from pwn import * 
shellcode = asm(pwnlib.shellcraft.i386.linux.sh())

Exploiting a Buffer Overflow with Shellcode

Injecting shellcode and getting it run by the binary is done by:

Injecting the shellcode into a buffer.
Overwriting the return address to a value on the stack, either at the start of the buffer, or somewhere within the buffer.

This is fairly easy if we have a large space to write to, and we have the address of our buffer.

Let’s try this challenge: main nc 52.15.140.126 5003

What happens if we aren’t given the buffer address?

Chances are, the binary will still be compiled with ASLR, but we might be able to leak the address of the binary with another attack, like a format string vulnerability.

Format String Attack

Here is the safe use of format strings

Here is an unsafe use of format strings This compiles and works … as long as the user doesn’t pass any format specifiers.

The reason for this vulnerability can be seen through the cdecl calling conventions. printf accepts an arbitrary number of arguments based on the format specifier (the first argument of printf). If the user passes more format specifiers than there are arguments, printf starts walking up the stack and printing arguments from the stack.

Let’s see how we can exploit this here.

echooo echooo.c

ELEG 467/667

Pentesting & CTF's