Program hijacking

Code injection

Paul Krzyzanowski

October 23, 2024

Program Hijacking

Program hijacking refers to various techniques that can be used to take control of a program and have it do something other than what it was intended to do. One class of techniques uses code injection, in which an adversary manages to add code to the program and change the program’s execution flow to run that code.

The best-known set of attacks is based on buffer overflow. Buffer overflow is the condition where a programmer allocates a chunk of memory (for example, an array of characters) but neglects to check the size of that buffer when moving data into it. Data will spill over into adjacent memory and overwrite whatever is in that memory.

Languages such as C, C++, and assembler are susceptible to buffer overflows because they have no built-in means of testing array bounds. Hence, the compiler cannot generate code to validate that data only goes into the allocated buffer. For example, when you copy a string using strcpy(char *dest, const char *src), you pass the function only source and destination pointers. The strcpy function has no idea how big either buffer is.

Stack-based overflows

When a process runs, the operating system’s program loader allocates a region for the executable code and static data (called the text and data segments), a region for the stack, and a region for the heap (used for dynamic memory allocation, such as by malloc).

Just before a program calls a function, it pushes the function’s parameters onto the stack. When the call is made, the return address gets pushed on the stack. On entry to the function that was called, the function pushes the current frame pointer (a register in the CPU) on the stack, which forms a linked list to the previous frame pointer and provides an easy way to revert the stack to where it was before making the function call. The frame pointer register is then set to the current top of the stack.

The function then adjusts the stack pointer to make room to store local variables, which live on the stack. This region for the function’s local data is called the stack frame. Ensuring that the stack pointer is always pointing to the top of the stack enables the function to get interrupts or call other functions without overwriting anything useful on the stack. The compiler generates code to reference parameters and local variables as offsets from the current frame pointer register.

Before a function returns, the compiler generates code to:

- Adjust the stack back to point to where it was before the stack expanded to make room for local variables. This is done by copying the frame pointer to the stack pointer.

- Restore the previous frame pointer by popping it off the stack (so that local variables for the previous function can be referenced properly).

- Return from the function. Once the previous frame pointer has been popped off the stack, the stack pointer points to a location on the stack that holds the return address.
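The call and return sequence described above can be sketched with a toy model. This is only an illustration, not how any particular compiler or CPU implements it: the struct machine type, its word-sized stack slots, and the function names are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy machine (hypothetical): a stack of words that grows toward lower indices. */
struct machine {
    uintptr_t stack[STACK_WORDS];
    size_t sp;  /* index of the current top of the stack */
    size_t fp;  /* index of the slot holding the caller's saved frame pointer */
};

void machine_init(struct machine *m) {
    m->sp = STACK_WORDS;  /* empty stack: sp is one past the bottom */
    m->fp = STACK_WORDS;
}

void push(struct machine *m, uintptr_t value) { m->stack[--m->sp] = value; }

uintptr_t pop(struct machine *m) { return m->stack[m->sp++]; }

/* Call plus function entry: push the return address, save the old frame
 * pointer, point the frame pointer at the new top, then reserve locals. */
void enter_function(struct machine *m, uintptr_t return_addr, size_t n_locals) {
    push(m, return_addr);
    push(m, (uintptr_t)m->fp);  /* links this frame to the caller's frame */
    m->fp = m->sp;
    m->sp -= n_locals;          /* room for local variables */
}

/* Function exit: the three steps listed above. Returns the return address. */
uintptr_t leave_function(struct machine *m) {
    m->sp = m->fp;              /* discard the local variables */
    m->fp = (size_t)pop(m);     /* restore the caller's frame pointer */
    return pop(m);              /* sp now points at the return address */
}
```

In this model, a caller would push its parameters, call enter_function, and later get back the same return address from leave_function with sp and fp restored to their earlier values.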

Simple stack overflows

Local variables are allocated on the stack and the stack grows downward in memory. Hence, the top of the stack is in lower memory than the start, or bottom, of the stack. If a buffer (e.g., char buf[128]) is defined as a local variable, it will reside on the stack and its address will be the start of the buffer (lower memory). As the buffer gets filled up, its contents will be written to higher and higher memory addresses. If the buffer overflows, data will be written further down the stack (in higher memory), overwriting the contents of any other variables that were allocated for that function and eventually overwriting the saved frame pointer and the saved return address.

When this happens and the function tries to return, the return address that is read from the stack will contain garbage data, usually a memory address that is not mapped into the program’s memory. As such, the program will crash when the function returns and tries to execute code at that invalid address. This is an availability attack: if we can exploit the fact that a program does not check the bounds of a buffer and overflow that buffer, we can cause the program to crash.

Subverting control flow through a stack overflow

Buffer overflow can be used in a more malicious manner. The buffer itself can be filled with bytes of valid machine code. If the attacker knows the exact size of the buffer, she can write just the right number of bytes to write a new return address into the very same region of memory on the stack that held the return address to the parent function. This new return address points to the start of the buffer that contains the injected code. When the function returns, it will “return” to the new code in the buffer and execute the code at that location.
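The overwrite can be modeled safely with a struct that stands in for a stack frame. This is a sketch under stated assumptions: struct frame_model, unchecked_copy, and build_payload are hypothetical names, and a real frame layout depends on the compiler and ABI. The copy is clamped to the size of the model so that the demonstration itself never writes outside its own object.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one stack frame, lowest address first. */
struct frame_model {
    char      buf[16];      /* local buffer */
    uintptr_t saved_fp;     /* saved frame pointer */
    uintptr_t return_addr;  /* saved return address */
};

/* Copies len bytes over the frame starting at buf[0] without checking
 * sizeof buf. Clamped to the model so the demo stays well defined. */
void unchecked_copy(struct frame_model *f, const unsigned char *input, size_t len) {
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy((unsigned char *)f, input, len);
}

/* Builds an input that fills the frame with filler bytes and places
 * new_return exactly where return_addr lives. out must have room for
 * sizeof(struct frame_model) bytes. Returns the payload length. */
size_t build_payload(unsigned char *out, uintptr_t new_return) {
    memset(out, 0x90, sizeof(struct frame_model));
    memcpy(out + offsetof(struct frame_model, return_addr),
           &new_return, sizeof new_return);
    return sizeof(struct frame_model);
}
```

An input no longer than buf leaves saved_fp and return_addr untouched. A longer input of arbitrary bytes turns return_addr into garbage (the crash case), while an input from build_payload sets it to a value of the attacker’s choosing.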

Off-by-one stack overflows

As we saw, buffer overflow occurs because of programming bugs: the programmer neglected to make sure that the data written to a buffer does not overflow. This often occurs because the programmer used old, unsafe functions that do not allow the programmer to specify limits. Common functions include:

- strcpy(char *dest, const char *src)

- strcat(char *dest, const char *src)

- sprintf(char *dest, const char *format, ...)

Each of these functions has a safer counterpart that accepts a count parameter so that the function will never copy more than count bytes:

- strncpy(char *dest, const char *src, size_t count)

- strncat(char *dest, const char *src, size_t count)

- snprintf(char *dest, size_t count, const char *format, ...)
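As a minimal sketch of how a bounded function might be used, the helper below wraps snprintf and reports truncation. The name copy_string and its return convention are assumptions for this example. It relies on two standard behaviors: snprintf always null-terminates when the size is nonzero, and it returns the length the full output would have had. Note that strncpy, by contrast, does not null-terminate the destination when the source is count bytes or longer, so the caller has to handle that case.

```c
#include <stddef.h>
#include <stdio.h>

/* Copies src into dest, which holds dest_size bytes including the terminator.
 * Returns 0 on success, or -1 if src did not fit (dest then holds a
 * truncated, null-terminated prefix) or if an argument is invalid. */
int copy_string(char *dest, size_t dest_size, const char *src) {
    if (dest == NULL || src == NULL || dest_size == 0)
        return -1;
    int needed = snprintf(dest, dest_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dest_size)
        return -1;  /* output was truncated, or an encoding error occurred */
    return 0;
}
```

Typical use is copy_string(buf, sizeof buf, input), with the caller rejecting the input when -1 is returned rather than silently continuing with truncated data.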

You’d think this would put an end to buffer overflow problems. However, programmers may miscount or they may choose to write their own functions that do not check array bounds correctly. A common error is an off-by-one error. For example, a programmer may declare a buffer as:

char buf[128];

and then copy data into it with:

for (i=0; i <= 128; i++)
    buf[i] = stuff[i];

The programmer inadvertently used a <= comparison instead of <.
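One way to avoid this class of mistake is to derive the loop bound from the buffer size instead of repeating a literal. The sketch below is one possible form; the function name bounded_copy and its parameters are assumptions for this example.

```c
#include <stddef.h>

/* Copies at most dest_size bytes from src into dest and returns the number
 * of bytes copied. The strict < comparison means the last index written is
 * n - 1, so the write never goes past the end of dest. */
size_t bounded_copy(char *dest, size_t dest_size, const char *src, size_t src_len) {
    size_t n = src_len < dest_size ? src_len : dest_size;
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
    return n;
}
```

With char buf[128], the call bounded_copy(buf, sizeof buf, stuff, stuff_len) keeps the bound tied to the declaration, so changing the array size does not leave a stale constant in the loop.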

With off-by-one bounds checking, malicious input cannot reach the return address on the stack: the copy operation stops one byte past the end of the buffer.

However, if the buffer is the first variable that is allocated on the stack, an off-by-one error can overwrite one byte of the saved frame pointer.

The potential for damage depends greatly on the value of that saved frame pointer and how the compiler generates code for managing the stack. In the worst case, the saved frame pointer could be changed to a value that is 255 bytes lower in memory. Even though the saved frame pointer is modified, the function will still return normally. However, upon returning, the function’s epilogue pops the frame pointer from the stack to restore the calling function’s frame pointer, and that saved value was corrupted by the overflow. The calling function now runs with a modified frame pointer.

Recall that references to a function’s variables and parameters are expressed as offsets from the current frame pointer. Any references the calling function makes to its local variables may now be references to data in the buffer. Moreover, when the calling function itself returns, it will set its stack pointer to this buffer area and return to an address that the attacker defined.
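The single-byte overwrite can be reproduced against a model of the frame. This is a sketch with assumptions: struct small_frame and off_by_one_copy are hypothetical, and the model assumes the saved frame pointer sits directly after the buffer with no padding. On a little-endian machine the byte that gets overwritten is the least significant byte of the saved frame pointer, which is why the value can shift by at most 255.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame: the buffer is the first local, so the saved frame
 * pointer is assumed to sit directly above it in memory. */
struct small_frame {
    char      buf[128];
    uintptr_t saved_fp;
};

/* Reproduces the buggy loop against the model. Index 128 is one past the
 * end of buf and lands on the first byte of saved_fp. Writing through an
 * unsigned char pointer to the whole struct keeps the demo well defined.
 * stuff must hold at least 129 bytes. */
void off_by_one_copy(struct small_frame *f, const unsigned char *stuff) {
    unsigned char *raw = (unsigned char *)f;
    for (size_t i = 0; i <= 128; i++)  /* bug: should be i < 128 */
        raw[i] = stuff[i];
}
```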

Heap overflows

Not all data is allocated on the stack: only local variables. Global and static variables are placed in a region of memory right above the executable program. Dynamically allocated memory (e.g., via new or malloc) comes from an area called the heap. In either case, since this memory is not the stack, it does not contain return addresses, so there is no ability for a buffer overflow attack to overwrite return addresses.

We aren’t totally safe, however. A buffer overflow will cause data to spill over into higher memory addresses above the buffer that may contain other variables. If the attacker knows the order in which variables are allocated, they could be overwritten. While these overwrites will not change a return address, they can change things such as filenames, lookup tables, or linked lists. Some programs make extensive use of function pointers, which may be stored in global variables or in dynamically-allocated structures such as linked lists on a heap. If a buffer overflow can overwrite a function pointer then it can change the execution of the program: when that function is called, control will be transferred to a location of the attacker’s choosing.
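A function pointer overwrite can be illustrated with a structure that holds a buffer followed by a function pointer. The sketch below is a model, not an exploit: struct record, its layout, and the function names are assumptions, and the copy is clamped to the structure so the demonstration stays inside one object.

```c
#include <stddef.h>
#include <string.h>

typedef int (*handler_fn)(void);

/* Hypothetical record: a buffer followed by a function pointer. */
struct record {
    char       name[16];
    handler_fn handler;
};

int legitimate_handler(void) { return 1; }
int attacker_chosen(void)    { return 2; }

/* Copies len bytes into the record starting at name[0] without checking
 * sizeof name. Clamped to the struct so the demo stays well defined. */
void set_name_unchecked(struct record *r, const unsigned char *input, size_t len) {
    if (len > sizeof *r)
        len = sizeof *r;
    memcpy((unsigned char *)r, input, len);
}
```

If the input carries the bytes of another function’s address at the offset of handler, a later call through r->handler transfers control to that function instead of the intended one.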

If we aren’t sure of the exact address at which execution will start, we can fill a buffer with a bunch (e.g., thousands) of NOP (no operation) instructions as part of the injected code. If the processor “returns” anywhere in that region of memory, it will happily execute these NOP instructions until it eventually reaches the injected code. This is called a NOP slide, or a landing zone.

Format string attacks with printf

The printf family of functions is commonly used in C and C++ to create formatted output. They accept a format string that defines what will be printed, with % characters representing formatting directives for parameters. For example,

printf("value = %05d\n", v);

will print a string such as

value = 01234

if the value of v is 1234.

Reading arbitrary memory

Occasionally, programs will use a format string that could be modified. For instance, the format string may be a local variable that is a pointer to a string. This local variable may be overwritten by a buffer overflow attack to point to a different string. It is also common, although improper, for a programmer to use printf(s) to print a fixed string s. If s is a string that is generated by the attacker, it may contain unexpected formatting directives.

Note that printf takes a variable number of arguments and matches each % directive in the format string with a parameter. If there are not enough parameters passed to printf, the function does not know that: it assumes they are on the stack and will happily read whatever value is on the stack where it thinks the parameter should be. This gives an attacker the ability to read arbitrarily deep into the stack. For example, with a format string such as:

printf("%08x\n%08x\n%08x\n%08x\n");

printf will expect four parameters, all of which are missing. It will instead read the next four values that are on the top of the stack and print each of those integers as an 8-character-long hexadecimal value prefixed with leading zeros ("%08x\n").
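The usual way to avoid this problem is to keep the format string constant and pass untrusted text only as an argument. The helper below is a minimal sketch; the name format_untrusted is an assumption for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats untrusted text into out. The format string is a constant and the
 * untrusted data is only ever an argument, so any % characters it contains
 * are copied literally instead of being interpreted as directives.
 * Returns the snprintf result (the length of the full output). */
int format_untrusted(char *out, size_t out_size, const char *untrusted) {
    return snprintf(out, out_size, "%s", untrusted);
}
```

The same idea applies to direct output: printf("%s", s) or fputs(s, stdout) instead of printf(s). Compilers can often flag the risky form; gcc, for example, has a -Wformat-security warning.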

Writing arbitrary memory

The printf function also contains a somewhat obscure formatting directive: %n. Unlike other % directives that expect to read a parameter and format it, %n instead writes to the address corresponding to that parameter. It writes the number of characters that it has output thus far. For example,

printf("paul%n says hi", &printbytes);

will store the number 4 (strlen("paul")) into the variable printbytes. An attacker who can change the format specifier may be able to write to arbitrary memory. Each % directive to print a variable will cause printf to look for the next variable in the next slot in the stack. Hence, format directives such as %x, %lx, %llx will cause printf to skip over the length of an int, long, or long long and get the next variable from the following location on the stack. Thus, just like reading the stack, we can skip through any number of bytes on the stack until we get to the address where we want to modify a value. At that point, we insert a %n directive in the format string, which will modify that address on the stack with the number of bytes that were output.

We can precisely control the value written by specifying how many bytes are output as part of the format string. For example, a format of %.55000x tells printf to output a value that will fill 55,000 characters. Using formats like that for output values, we can change the count that will be written with %n. Remember, we don’t care what printf actually prints; we just want to force the byte count to be a value we care about, such as the address of a function we want to call.
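Both properties of %n, storing the running count and letting padding control that count, can be observed with a fixed format string. This is a sketch of the mechanism only: the function names are assumptions, the output goes to a local buffer via snprintf rather than to the terminal, and some C libraries disable or restrict %n, in which case this code will not behave as described.

```c
#include <stdio.h>

/* Returns the count that %n stores after the characters "paul". */
int count_after_paul(void) {
    char out[32];
    int printbytes = -1;
    snprintf(out, sizeof out, "paul%n says hi", &printbytes);
    return printbytes;
}

/* Pads a single hexadecimal value to `width` digits so that %n stores
 * `width`. Returns -1 if width is zero, negative, or too large for the
 * local buffer. */
int count_after_padding(int width) {
    char out[128];
    int count = -1;
    if (width <= 0 || width >= (int)sizeof out)
        return -1;
    snprintf(out, sizeof out, "%.*x%n", width, 1u, &count);
    return count;
}
```

Here count_after_padding(40) plays the role of a directive such as %.40x: the printed digits do not matter, only the count that ends up in the target variable.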

If you’d like to read more about printf vulnerabilities, there’s a great writeup from Syracuse University.

Defense against hijacking attacks

Better programming and the dangers of making assumptions

Hijacking attacks are the result of sloppy programming: a lack of bounds checking that results in overflows. They can be eliminated if the programmer never uses unsafe functions (e.g., use strncpy instead of strcpy) and is careful about off-by-one errors.

A programmer can use a technique called fuzzing to locate buffer overflow problems. Fuzzing is generally an automated technique that attempts to provide long, unexpected data as input to programs. Wherever a string can be provided by the user, the tester enters extremely long strings with well-defined patterns (e.g., “$$$$…”). If the app crashes because a buffer overflow destroyed a return address on the stack, the programmer can then load the core dump into a debugger, identify where the program crashed, and search for a substring of the entered pattern (“$$$”) to identify which buffer was affected.
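A variation on the repeated-character pattern is a pattern whose position can be recovered from a short fragment, which also reveals how far into the input the overwritten value was. The sketch below is one simple way to do that and is not the method of any particular fuzzing tool; fill_pattern and pattern_offset are hypothetical names, and the letter arithmetic assumes an ASCII-compatible character set.

```c
#include <stddef.h>
#include <string.h>

/* Fills out[0..len-1] with three-character groups Aa0, Aa1, ..., Aa9, Ab0,
 * and so on. No terminator is written. */
void fill_pattern(char *out, size_t len) {
    for (size_t i = 0; i < len; i++) {
        size_t group = i / 3;
        switch (i % 3) {
        case 0:  out[i] = (char)('A' + (group / 260) % 26); break;
        case 1:  out[i] = (char)('a' + (group / 10) % 26);  break;
        default: out[i] = (char)('0' + group % 10);         break;
        }
    }
}

/* Returns the offset of the first occurrence of fragment in the pattern,
 * or (size_t)-1 if it does not occur. */
size_t pattern_offset(const char *pattern, size_t len,
                      const char *fragment, size_t frag_len) {
    if (frag_len == 0 || frag_len > len)
        return (size_t)-1;
    for (size_t i = 0; i + frag_len <= len; i++)
        if (memcmp(pattern + i, fragment, frag_len) == 0)
            return i;
    return (size_t)-1;
}
```

If the debugger shows that the corrupted return address holds the bytes of some fragment of the pattern, pattern_offset gives the distance from the start of the input to the bytes that landed on the return address.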

Buffer overflows can be avoided by using languages with stronger type checking and array bounds checking. Languages such as Java, C#, and Python check array bounds. C and C++ do not. However, it is sometimes difficult to avoid using C or C++: they are the basis of most operating system code, compilers, libraries, high-performance services, and embedded systems.

Tight specification of requirements, coding to those requirements, and constructing tests based on those requirements can help avoid buffer overflow bugs. If input lengths are specified as requirements, they are more likely to be coded and checked. Documentation should be explicit, such as “user names longer than 32 bytes must be rejected.”

Programmers often make implicit assumptions they never check for because it seems inconceivable that a user would ever do something dramatically incorrect. For instance, allocating a 1024-byte buffer for a name and not checking the input length may feel acceptable because no name would ever be longer than that. But without safeguards, nothing stops an attacker from entering a string that is thousands of characters long and possibly wreaking havoc on the program. Input requirements have real-world consequences of their own. For instance, consider these falsehoods programmers believe about names.

Even data that may not be entered by a user might cause problems. In 2022, some older Honda and Acura cars produced between 2006 and 2014 began to report the year as 2002 because the programmers decided to store the date as a 32-bit signed integer in the form yymmddHHMM and the year 2022 caused an integer overflow, making the time number negative. In this case, the bug was an annoyance, but it is easy to imagine cases where this could have serious side-effects, such as payments or authentication credentials not being accepted. Complementing the falsehoods about names, there’s also this list of falsehoods programmers believe about time.
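The arithmetic behind this overflow is easy to check. The largest 32-bit signed integer is 2,147,483,647, while the first minute of 2022 in yymmddHHMM form is 2,201,010,000. The sketch below packs the fields with 64-bit arithmetic and tests whether the result fits; the function names are assumptions for this example and say nothing about how the actual firmware was written.

```c
#include <stdint.h>

/* Packs a timestamp as the decimal number yymmddHHMM using 64-bit
 * arithmetic so that the computation itself cannot overflow. */
int64_t pack_yymmddhhmm(int yy, int mm, int dd, int hh, int min) {
    return ((((int64_t)yy * 100 + mm) * 100 + dd) * 100 + hh) * 100 + min;
}

/* Returns 1 if v is representable as a 32-bit signed integer, else 0. */
int fits_in_int32(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

The last minute of 2021 packs to 2,112,312,359, which fits. The first minute of 2022 does not, and the same bit pattern read as a 32-bit two’s-complement value is 2,201,010,000 - 4,294,967,296 = -2,093,957,296.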

Data Execution Prevention (DEP)

Buffer overflows affect data areas: either the stack, heap, or static data areas. There is usually no reason for those regions to contain executable code. Hence, it makes sense for the operating system to set the processor’s memory management unit (MMU) to turn off execute permission for memory pages in those regions. This is known as data execution prevention, or DEP.

This was impossible with early Intel or AMD processors: their MMU did not support enabling or disabling execute permissions. All memory could contain executable code. That changed in 2004, when Intel and AMD finally added an NX (no-execute) bit to their MMU’s page tables. On Intel architectures, this was called the Execute Disable Bit (XD). Operating system support followed. Windows, Linux, and macOS all currently support DEP.

DEP cannot always be used. Some environments, such as some LISP interpreters, do need execution enabled in their stack, and some environments need executable code in their heap section (to support dynamic loading, patching, or just-in-time compilation). DEP also does not guard against data modification attacks, such as heap-based overflows or some printf attacks.

DEP attacks

Attackers came up with some clever solutions to defeat DEP. The first of these is called return-to-libc. Buffer overflows still allow us to corrupt the stack. We just cannot execute code on the stack. However, there is already a lot of code sitting in the program and the libraries it uses. Instead of adding code into the buffer, the attacker merely overflows a buffer to create a new return address and parameter list on the stack. When the function returns, it switches control to the new return address. This return address will be an address in the standard C library (libc), which contains functions such as printf, system, and wrappers for system calls. All that an attacker often needs to do is to push parameters that point to a string in the buffer that contains a command to execute and then “return” to the libc system function, whose function is to execute a parameter as a shell command.

A more sophisticated variant of return-to-libc is Return Oriented Programming (ROP).

Return-oriented programming generalizes return-to-libc and realizes that execution can branch to any arbitrary point in any function in any loaded library.

When jumping to some point in the middle of a function, the function will execute a series of instructions and eventually return. An attacker will overflow the stack with data that now tells this function where to “return”. The return can jump to yet another arbitrary point in another library. When that returns, it can – once again – be directed to an address chosen by the intruder that has been placed further down the stack, along with frame pointers, local variables, and parameters.

There are lots and lots of return instructions across all the functions within all the libraries normally used by programs. Each of these tail ends of a function is called a gadget. It has been demonstrated that using carefully chosen gadgets allows an attacker to push a string of return addresses that will enable the execution of arbitrary algorithms. To make life easier for the attacker, tools have been created that search through libraries and identify useful gadgets. A ROP compiler then allows the attacker to program operations using these gadgets.

Address Space Layout Randomization

Stack overflow attacks require knowing and injecting an address that will be used as a target when a function returns. ROP also requires knowing the addresses of all the entry points of gadgets. Address Space Layout Randomization (ASLR) is a technique that was developed to have the operating system’s program loader pick random starting points for the executable program, static data, heap, stack, and shared libraries. Since code and data reside in different locations each time the program runs, the attacker is not able to program buffer overflows with useful known addresses. For ASLR to work, the program and all libraries must be compiled to use position independent code (PIC), which uses relative offsets instead of absolute memory addresses.

Stack canaries

A stack canary is a compiler technique to ensure that a function will not be allowed to return if a buffer overflow took place that may have clobbered the return address.

At the start of a function, the compiler adds code to generate a random integer (the canary) and push it onto the stack before allocating space for the function’s local variables (the entire region of the stack used by a local function is called a frame).

The canary sits between the return address and these variables. If there is a buffer overflow in a local variable that tries to change the return address, that overflow will have to clobber the value of the canary.

The compiler generates code to have the function check that the canary has a valid value before returning (a copy of the original value is saved in a register or protected memory location). If the value of the canary is not the original value, then a buffer overflow occurred and the return address has likely been altered. In this case, the compiler-generated code forces the program to exit rather than risk returning.
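The check can be sketched against a model of a frame with the canary placed between the buffer and the return address. Everything here is an assumption made for illustration: struct guarded_frame, guarded_copy, and the idea of passing the canary value as a parameter. Real compilers emit the placement and the check themselves and abort the process on a mismatch rather than returning an error code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame: the canary sits between the local buffer and the
 * saved return address. */
struct guarded_frame {
    char      buf[16];
    uintptr_t canary;
    uintptr_t return_addr;
};

/* Models function entry, an unchecked copy into buf, and the check made
 * before returning. Returns 0 if the canary is intact, or -1 if it was
 * clobbered, which is where a real program would be forced to exit. */
int guarded_copy(struct guarded_frame *f, uintptr_t canary_value,
                 const unsigned char *input, size_t len) {
    f->canary = canary_value;                 /* entry: place the canary */
    if (len > sizeof *f)
        len = sizeof *f;                      /* keep the demo inside the model */
    memcpy((unsigned char *)f, input, len);   /* unchecked copy into buf */
    return f->canary == canary_value ? 0 : -1;
}
```

An input that fits in buf leaves the canary intact. Any input long enough to reach return_addr must pass over the canary first, so the check fails before the corrupted return address is ever used.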

However, you may still have a buffer overflow that does not change the value of the canary or the return address. Consider a function that has a local array (buffer) and various other local variables. They are all allocated on the stack within the same stack frame. If the array is in lower memory than some of the other variables, then a buffer overflow can overwrite the contents of those variables and the function continues executing.

Depending on the code, that can alter the way the function works. For instance, suppose the function allocates space for an integer followed by an array. An overflow in the array can change the value of the integer in higher memory. The canary won’t detect this. Even if the overflow happened to clobber the return address as well, the canary check only takes place when the function is about to return. Meanwhile, the overflow may already have changed other variables and, with them, the behavior of the function.

Stack canaries cannot fix this problem in general. However, the compiler (which creates the code to generate them and check them) can take steps to ensure that a buffer overflow cannot overwrite non-array variables, such as integers and floats. By allocating arrays on the stack first (in higher memory) and then scalar variables, the compiler can make sure that a buffer overflow in an array will not change the value of scalar variables. One array overflowing into another is still a risk, but it is most often the scalar variables that contain values that define the control flow of a function. This approach was developed by IBM (ProPolice) and was later incorporated into the gcc compiler.

Intel Control-Flow Enforcement Technology (CET)

In 2020, Intel introduced a set of features called Control-Flow Enforcement Technology (CET). This adds two new protection mechanisms to the processor.

Overflowing the buffer of a local variable can modify the return address of a function. This is because local variables share the same stack as return addresses.

With CET, Intel introduces a secondary stack, called a shadow stack, that stores only return addresses. An extra protection attribute in the processor’s memory management unit (MMU) protects it from tampering by disallowing the use of regular store instructions to modify its contents. Writes to this stack are restricted only to control flow transfer instructions: calls, system calls, and interrupts will push their return address on the normal stack as before but also push the return address into this shadow stack. Buffer overflows on the main stack cannot touch the shadow stack and thus cannot change the control flow.

When the processor executes a return instruction, instead of just popping the return address from the stack and jumping there, the processor compares the return address on the stack with the return address on the shadow stack. If both addresses match, that indicates that the return address on the stack was not modified. Both addresses can then be popped off their respective stacks and execution can continue normally. If the addresses differ between the stacks, that indicates that a buffer overflow modified the return address on the main stack. The processor generates a fault (interrupt #21, Control Flow Protection Fault), notifying the operating system of this violation and enabling it to kill the process.
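The compare-on-return behavior can be sketched in software. This is a toy model only: struct cpu_model and its functions are hypothetical, and in real CET hardware the shadow stack is protected by the MMU and the comparison is done by the processor, not by program code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEPTH 32

/* Toy model: the data stack can be modified by ordinary stores, while the
 * shadow stack is written only by model_call. */
struct cpu_model {
    uintptr_t data_stack[DEPTH];
    uintptr_t shadow_stack[DEPTH];
    size_t    sp;   /* number of entries on the data stack */
    size_t    ssp;  /* number of entries on the shadow stack */
};

void model_reset(struct cpu_model *c) {
    c->sp = 0;
    c->ssp = 0;
}

/* A call pushes the return address on both stacks. Returns -1 if full. */
int model_call(struct cpu_model *c, uintptr_t return_addr) {
    if (c->sp >= DEPTH || c->ssp >= DEPTH)
        return -1;
    c->data_stack[c->sp++] = return_addr;
    c->shadow_stack[c->ssp++] = return_addr;
    return 0;
}

/* A return pops both stacks and compares. On a match it stores the address
 * in *target and returns 0. On a mismatch (or empty stack) it returns -1,
 * standing in for the control flow protection fault. */
int model_ret(struct cpu_model *c, uintptr_t *target) {
    if (c->sp == 0 || c->ssp == 0)
        return -1;
    uintptr_t from_stack  = c->data_stack[--c->sp];
    uintptr_t from_shadow = c->shadow_stack[--c->ssp];
    if (from_stack != from_shadow)
        return -1;
    *target = from_stack;
    return 0;
}
```

Overwriting an entry in data_stack between model_call and model_ret models a buffer overflow on the main stack; the mismatch is detected at return time.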

The second feature Intel introduced is called indirect branch tracking. This was designed to restrict a program’s ability to use jump tables.

A jump table is a table of memory locations to which a program can branch. It is commonly used for operations such as implementing switch statements and lookup tables. Jump tables are lists of indirect branches. When implemented in C, they are a list of function pointers.

An attack related to Return Oriented Programming is Jump Oriented Programming, or JOP. JOP uses sequences of code that end in indirect jumps or calls. Like ROP, these sequences are used to cause unintended program behavior. Also like ROP, they cause a program to jump to some chosen point within a function.

Indirect branch tracking adds a new ENDBRANCH instruction (encoded as ENDBR32 or ENDBR64) that allows a programmer to specify valid targets for indirect jumps or calls in a program. An indirect branch must land on an ENDBRANCH instruction. If a hijacked program forces a branch to some other location, it will not hit an ENDBRANCH instruction and the processor will treat it as a violation.
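The rule that an indirect branch must land on a marker can be sketched with a toy instruction stream. The opcodes and the function name below are invented for this illustration and do not correspond to real instruction encodings; the real check is performed by the processor.

```c
#include <stddef.h>

/* Toy opcodes (hypothetical). OP_ENDBR marks a valid indirect branch target. */
enum opcode { OP_ENDBR, OP_WORK, OP_RET };

/* Returns 0 if an indirect branch to index `target` is allowed, or -1 if it
 * would be treated as a violation: the target is outside the code or the
 * instruction at the target is not the landing marker. */
int indirect_branch_allowed(const enum opcode *code, size_t code_len, size_t target) {
    if (target >= code_len)
        return -1;
    return code[target] == OP_ENDBR ? 0 : -1;
}
```

In this model, a branch to the start of a function (where the marker sits) is allowed, while a branch into the middle of the function, the kind of target a JOP gadget needs, is rejected.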