TL;DR
Shellcode is machine code that can be injected into another process and executed there; "C shellcode" here means shellcode produced by compiling C. It’s often used in exploits to gain control of a system. This guide explains how it works, how to create simple shellcode, and basic techniques for testing it.
What is Shellcode?
Shellcode is a small piece of code, typically written in assembly language (often derived from C), designed to be injected into a vulnerable process. When executed, this code performs malicious actions – like spawning a shell, creating a backdoor, or modifying system files. The ‘shell’ part refers to the common goal of opening a command shell on the target machine.
Why use C (or rather, why compile from C)?
Although shellcode ultimately ends up as raw machine code and is traditionally written in assembly language, compiling from C offers several advantages:
- Portability: C code can be more easily adapted to different architectures.
- Readability: C is easier to read and understand than raw assembly (though the final shellcode isn’t readable!).
- Development Speed: Writing complex logic in C is faster than directly in assembly.
The key is that you compile your C code into machine code, then extract that machine code as a byte array to be used as shellcode.
Creating Simple Shellcode (Example: Executing /bin/sh)
- Write the C Code: This example spawns a shell by calling system(). Save it as shell.c.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main() {
        /* Path of the shell to start */
        char *command = "/bin/sh";
        system(command);
        return 0;
    }

- Compile the Code: Use GCC to compile the code into an executable, then disassemble it. We’ll use a few flags for this:
  - -fno-stack-protector: Disables stack protection (canaries), whose extra checks can interfere with shellcode extraction.
  - -z execstack: Allows execution of code on the stack (necessary for injection). Warning: This is a security risk and should only be used in controlled environments.
  - -m32 or -m64: Specifies the architecture (32-bit or 64-bit). Match this to your target system!

    gcc -fno-stack-protector -z execstack -m32 shell.c -o shell
    objdump -d -M intel -j .text shell

- Extract the Shellcode: The objdump command prints the disassembly of the .text section, with the machine code bytes in hexadecimal next to each instruction (-M intel selects the Intel syntax used here). Find the instructions you need and copy their hexadecimal bytes; this is your shellcode.

  Example (32-bit):

    b8 5f 00 00 00    mov eax, 0x5f
    b9 76 00 00 00    mov ecx, 0x76
    ... (more bytes) ...

  You’ll need to copy all the hexadecimal byte pairs from the output. Be aware that code compiled this way is not self-contained: main reaches system() through the dynamic linker and reads "/bin/sh" from a separate data section, so its bytes only work inside the original executable. Usable shellcode has to be position-independent and free of such external references.
- Represent as a Byte Array: In your exploit code (e.g., Python script), represent the shellcode as a byte array, with each byte written as a \x escape.

    shellcode = b"\xb8\x5f\x00\x00\x00\xb9\x76\x00\x00\x00..."
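The extraction step can also be scripted. The sketch below is only an illustration: it assumes the usual GNU objdump layout (address, tab, hex bytes, tab, instruction), and the function name bytes_from_objdump and the two-line sample listing are invented for this example, reusing the bytes from the 32-bit example above.

```python
import re

# One disassembly line in the assumed GNU objdump layout, for example:
#   " 8049000:\tb8 5f 00 00 00       \tmov    eax,0x5f"
# Group 1 captures the hex byte column only.
_LINE = re.compile(r"^\s*[0-9a-fA-F]+:\t((?:[0-9a-fA-F]{2}(?:[ \t]|$))+)")


def bytes_from_objdump(listing: str) -> bytes:
    """Collect the machine-code bytes from objdump -d style text."""
    out = bytearray()
    for line in listing.splitlines():
        match = _LINE.match(line)
        if match:  # section headers, symbol labels, and blank lines are skipped
            out += bytes.fromhex("".join(match.group(1).split()))
    return bytes(out)


# Hypothetical listing (addresses are made up for illustration).
SAMPLE = (
    "08049000 <main>:\n"
    " 8049000:\tb8 5f 00 00 00       \tmov    eax,0x5f\n"
    " 8049005:\tb9 76 00 00 00       \tmov    ecx,0x76\n"
)

if __name__ == "__main__":
    code = bytes_from_objdump(SAMPLE)
    # Print the bytes as \x escapes, ready to paste into a bytes literal.
    print("".join(f"\\x{b:02x}" for b in code))
```

Parsing the byte column rather than copying it by hand avoids dropped or transposed bytes, which are otherwise easy to miss in a long listing.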
Testing Shellcode
- Simple Test Program: Create a C program to execute the shellcode directly.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main() {
        unsigned char shellcode[] = { /* Your shellcode bytes here */ };
        /* Treat the byte array as a function and call it. */
        int (*func)() = (int (*)())shellcode;
        func();
        return 0;
    }

- Compile and Run: Compile the test program and run it. If your shellcode is correct, you should see the expected behavior (e.g., a shell prompt). Remember to compile with -fno-stack-protector -z execstack, and with the same -m32 or -m64 setting the shellcode was built for.
- Debugging: Use a debugger (like GDB) to step through the execution of your test program and verify that the shellcode is being executed correctly.
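Filling in the array of the test program by hand is tedious for anything longer than a few bytes. As a small convenience, the sketch below renders a Python bytes object as a C array initializer. The helper name to_c_initializer and its per_line parameter are assumptions made for this example, and the sample bytes are two x86 NOP instructions used as a placeholder, not a working payload.

```python
def to_c_initializer(code: bytes, per_line: int = 8) -> str:
    """Render bytes as a brace-enclosed C initializer, per_line bytes per row."""
    rows = []
    for start in range(0, len(code), per_line):
        chunk = code[start:start + per_line]
        rows.append("    " + ", ".join(f"0x{b:02x}" for b in chunk))
    return "{\n" + ",\n".join(rows) + "\n}"


if __name__ == "__main__":
    # Placeholder bytes: 0x90 is the x86 NOP instruction.
    placeholder = b"\x90\x90"
    print("unsigned char shellcode[] = " + to_c_initializer(placeholder) + ";")
```

The printed declaration can be pasted over the placeholder array in the test program.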
Important Considerations
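- Null Bytes: Shellcode often contains null bytes (\x00). C string functions such as strcpy() stop at the first null byte, so a payload delivered through them is cut short, making injection difficult. Common workarounds include rewriting instructions so that they encode without null bytes (for example, xor eax, eax instead of mov eax, 0) and encoding the payload. Staging the payload in an environment variable is another frequently mentioned delivery technique, although environment strings are themselves null-terminated.
- Address Space Layout Randomization (ASLR): ASLR randomizes the base address of libraries and other memory regions, making it harder to predict where shellcode needs to be placed.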
- Data Execution Prevention (DEP) / NX Bit: DEP/NX prevents code execution from data sections of memory. Bypassing this often involves techniques like Return-Oriented Programming (ROP).
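The null-byte consideration is easy to check mechanically before testing. The following is a minimal sketch, with a hypothetical helper named find_bad_bytes, that reports the offsets of any unwanted byte values in a payload. The two sample encodings are the standard 32-bit x86 forms of mov eax, 0 and xor eax, eax.

```python
from typing import List


def find_bad_bytes(code: bytes, bad: bytes = b"\x00") -> List[int]:
    """Return the offsets in code whose byte value appears in bad."""
    return [offset for offset, value in enumerate(code) if value in bad]


if __name__ == "__main__":
    mov_eax_zero = bytes.fromhex("b800000000")  # mov eax, 0
    xor_eax_eax = bytes.fromhex("31c0")         # xor eax, eax
    print(find_bad_bytes(mov_eax_zero))  # [1, 2, 3, 4]
    print(find_bad_bytes(xor_eax_eax))   # []
```

Passing a longer bad set, such as b"\x00\x0a", extends the same check to other bytes that a particular input path might treat specially, for instance a newline.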