Understanding C Shellcode

G5 Cyber Security

2 months ago

TL;DR

C shellcode is machine code, here compiled from C, that can be injected into another process and executed there. It’s often used in exploits to gain control of a system. This guide explains how it works, how to create simple shellcode, and basic techniques for testing it.

What is Shellcode?

Shellcode is a small piece of machine code, typically written in assembly language or compiled from C, designed to be injected into a vulnerable process. When executed, it performs actions chosen by the attacker, such as spawning a shell, creating a backdoor, or modifying system files. The ‘shell’ part refers to the common goal of opening a command shell on the target machine.

Why use C (or rather, why compile from C)?

Although shellcode ultimately consists of machine instructions, which are usually written by hand in assembly, compiling from C offers several advantages:

Portability: C code can be more easily adapted to different architectures.
Readability: C is easier to read and understand than raw assembly (though the final shellcode isn’t readable!).
Development Speed: Writing complex logic in C is faster than directly in assembly.

The key is that you compile your C code into machine code, then extract that machine code as a byte array to be used as shellcode.

Creating Simple Shellcode (Example: Executing /bin/sh)

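Write the C Code: This example spawns a shell. Note that it calls system() from libc and keeps the "/bin/sh" string in a separate data section, so the bytes of main alone are not self-contained. Working shellcode typically makes system calls directly and carries its own data.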

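#include <stdlib.h>  /* system() */

int main(void) {
  /* The string literal is stored in the binary's read-only data, not inside main. */
  const char *command = "/bin/sh";
  /* system() is a libc function that runs the command through the shell. */
  system(command);
  return 0;
}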

Compile the Code: Use GCC to compile the code into an executable, and then extract the machine code. We’ll use a few flags for this:
- -fno-stack-protector: Disables stack canaries, whose setup and check instructions would otherwise be added to the compiled function.
- -z execstack: Marks the program's stack as executable, which the test program later in this guide relies on. Warning: This is a security risk and should only be used in controlled environments.
- -m32 or -m64: Specifies the architecture (32-bit or 64-bit) – match this to your target system!
```
gcc -fno-stack-protector -z execstack -m32 shell.c -o shell
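objdump -d -M intel -j .text shell
```
Extract the Shellcode: The objdump command prints the disassembly of the .text section (the -M intel option selects the Intel syntax used in the example below). Each line shows an address, the machine code bytes in hexadecimal, and the corresponding instruction. The hexadecimal bytes of the function you want are your shellcode.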
Example (32-bit):
```
b8 5f 00 00 00                mov eax, 0x5f
b9 76 00 00 00                mov ecx, 0x76
... (more bytes) ...
```
Copy every hexadecimal byte pair, in order, and leave out the addresses and instruction mnemonics.
Represent as a Byte Array: In your exploit code (e.g., Python script), represent the shellcode as a byte array.
```
shellcode = b"\xb8\x5f\x00\x00\x00\xb9\x76\x00\x00\x00..."
```
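Copying bytes by hand is error-prone, so the extraction and formatting steps above can be sketched in Python. This is a minimal sketch, not a definitive tool: the helper names (`extract_bytes`, `to_python_literal`, `to_c_array`) are hypothetical, and it assumes the tab-separated layout that `objdump -d` normally prints (address, hex bytes, instruction). The sample text reuses the two illustrative instructions from the example above, with made-up addresses.

```python
# Sample disassembly in the usual objdump layout: "addr:<TAB>hex bytes<TAB>instruction".
# The addresses and the <main> label are illustrative assumptions.
SAMPLE = (
    "08049000 <main>:\n"
    " 8049000:\tb8 5f 00 00 00       \tmov    eax,0x5f\n"
    " 8049005:\tb9 76 00 00 00       \tmov    ecx,0x76\n"
)


def extract_bytes(disassembly: str) -> bytes:
    """Collect the machine code bytes from objdump-style disassembly text."""
    out = bytearray()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines have an "addr:" field followed by the hex bytes;
        # section headers, labels, and blank lines do not, so skip them.
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        out += bytes.fromhex(fields[1])  # fromhex ignores the padding spaces
    return bytes(out)


def to_python_literal(code: bytes) -> str:
    """Format bytes as a Python bytes literal with \\x escapes."""
    return 'b"' + "".join(f"\\x{b:02x}" for b in code) + '"'


def to_c_array(code: bytes) -> str:
    """Format bytes as a C array initializer."""
    return "{ " + ", ".join(f"0x{b:02x}" for b in code) + " }"


if __name__ == "__main__":
    code = extract_bytes(SAMPLE)
    print(to_python_literal(code))
    print(to_c_array(code))
```

The first form can be pasted into a Python exploit script, and the second into the initializer of an `unsigned char` array in a C test program.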

Testing Shellcode

Simple Test Program: Create a C program to execute the shellcode directly.

int main(void) {
  /* Local array, so the bytes live on the stack (hence -z execstack). */
  unsigned char shellcode[] = { /* Your shellcode bytes here */ };
  /* Treat the start of the array as a function and call it. */
  int (*func)(void) = (int (*)(void))shellcode;
  func();
  return 0;
}

Compile and Run: Compile the test program and run it. If your shellcode is correct, you should see the expected behavior (e.g., a shell prompt).
Remember to compile with -fno-stack-protector -z execstack, and with the same -m32 or -m64 setting the shellcode was built for.
Debugging: Use a debugger (like GDB) to step through the execution of your test program and verify that the shellcode is being executed correctly.

Important Considerations

Null Bytes: Compiled code often contains null bytes (\x00). String functions such as strcpy stop copying at the first null byte, so shellcode delivered through a string gets truncated. The usual remedy is to rewrite instructions so that no null bytes appear (for example, xor eax, eax instead of mov eax, 0) or to run the shellcode through an encoder.
Address Space Layout Randomization (ASLR): ASLR randomizes the base addresses of the stack, heap, libraries, and other memory regions, making it harder to predict where injected shellcode will end up.
Data Execution Prevention (DEP) / NX Bit: DEP/NX marks data regions such as the stack and heap as non-executable, so injected bytes cannot be run directly. Bypassing this often involves techniques like Return-Oriented Programming (ROP).
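The null-byte consideration is easy to check mechanically. The following Python sketch is one hedged way to do it; the helper names `null_offsets` and `survives_strcpy` are hypothetical, and `survives_strcpy` only models the truncation a strcpy-style copy would cause, without calling any C code. The two byte strings are the standard 32-bit x86 encodings of `mov eax, 0` and `xor eax, eax`.

```python
def null_offsets(code: bytes) -> list:
    """Return the offset of every 0x00 byte in code."""
    return [i for i, b in enumerate(code) if b == 0]


def survives_strcpy(code: bytes) -> bytes:
    """Model a strcpy-style copy: keep only the bytes before the first null."""
    return code.split(b"\x00", 1)[0]


# Two ways to zero the eax register on 32-bit x86.
MOV_EAX_0 = bytes.fromhex("b800000000")  # mov eax, 0   (immediate is four null bytes)
XOR_EAX_EAX = bytes.fromhex("31c0")      # xor eax, eax (no null bytes)

if __name__ == "__main__":
    for name, code in (("mov eax, 0", MOV_EAX_0), ("xor eax, eax", XOR_EAX_EAX)):
        kept = len(survives_strcpy(code))
        print(f"{name}: nulls at {null_offsets(code)}, {kept} of {len(code)} bytes survive")
```

Running a check like this on extracted bytes before testing shows immediately whether a string-based copy would cut the shellcode short.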