Polyglot Files: Beyond GIFAR

TL;DR

Several polyglot file types exist beyond the well-known GIFAR example. A polyglot file is valid in more than one file format at the same time, and such files are used both to evade malware detection and to test parsing libraries. This guide lists some prominent examples and explains how they work.

What are Polyglot Files?

Polyglot files exploit slack in file format specifications. Many formats tolerate extra bytes before or after their own data, or contain comment fields and unused regions that their parser skips. By placing the structures of a second format in those regions, a single file can satisfy the rules of several formats, and each program interprets it as the format it expects.

Known Polyglot Files

  1. GIFAR (GIF + JAR): The classic example. It is a valid GIF image and also a valid JAR (Java Archive), which is itself a ZIP archive.
  2. PNGAR (PNG + Archive): Similar to GIFAR, this file is both a PNG image and a ZIP archive.
  3. JPEGAR (JPEG + Archive): A JPEG image that’s also a ZIP archive.
  4. PDFAR (PDF + Archive): A PDF document containing a valid ZIP archive.
  5. HTMLAR (HTML + Archive): An HTML file with an embedded ZIP archive.
  6. SWFAR (SWF + Archive): A Flash SWF file that’s also a ZIP archive.
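  7. ZIP-PNG: A ZIP archive containing a valid PNG image as its only entry. Unlike PNGAR, the file as a whole is valid only as a ZIP, and the PNG becomes readable only after extraction, so this is a container rather than a polyglot in the strict sense.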
  8. ELF-PNG (Executable and Linkable Format + PNG): An executable file that also contains a valid PNG image.

How do they work?

The core principle is to find overlapping or compatible sections in the different formats. For example:

  • ZIP archive appended to an image: Image parsers read from the start of the file and stop at the end of the image data. ZIP readers instead locate the archive through the End of Central Directory record near the end of the file. A ZIP archive appended after the image data therefore leaves the file valid in both formats.
  • Valid code within comments: Some formats have comment fields or unused regions that their own parser skips. Code or data belonging to a second format can be embedded there, where a parser for that second format will still find it.
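
As a rough illustration of the first technique, the sketch below appends a ZIP archive to a tiny GIF using Python's standard zipfile module. The 43-byte placeholder GIF, the function name make_gif_zip_polyglot, and the member name hello.txt are assumptions made for this example, not part of any established tool.

```python
import io
import zipfile

# A minimal 1x1 GIF89a image (43 bytes), used here only as a placeholder.
MINIMAL_GIF = bytes.fromhex(
    "474946383961"            # "GIF89a" signature
    "01000100800000"          # 1x1 logical screen, 2-entry global colour table
    "000000ffffff"            # colour table: black, white
    "21f9040100000000"        # graphic control extension
    "2c000000000100010000"    # image descriptor for a 1x1 image
    "0202440100"              # LZW-compressed pixel data
    "3b"                      # GIF trailer
)


def make_gif_zip_polyglot(gif_bytes: bytes, members: dict) -> bytes:
    """Return gif_bytes followed by a ZIP archive holding `members`."""
    buf = io.BytesIO()
    buf.write(gif_bytes)
    # Mode "a" on data that is not yet a ZIP appends a new archive after
    # the existing bytes, so the stored offsets count from the file start.
    with zipfile.ZipFile(buf, "a") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()


polyglot = make_gif_zip_polyglot(MINIMAL_GIF, {"hello.txt": b"hi"})
```

The result still begins with the GIF signature, so an image viewer sees a GIF, while zipfile.ZipFile(io.BytesIO(polyglot)) can list and read hello.txt.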

Detecting Polyglot Files

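  1. File Signature Analysis: Check the file header (first few bytes) to identify potential formats. Tools like file on Linux/macOS are useful, although file typically reports only the first format it matches, so a single result does not rule out a polyglot.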
    file your_file.gifar
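  2. Entropy Analysis: Measure entropy across regions of the file. An abrupt change between regions can hint at appended content, but compressed images are already high-entropy, so treat this as a weak signal on its own.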
  3. Format-Specific Parsers: Attempt to parse the file with multiple parsers (e.g., a GIF parser and a ZIP parser). If both succeed, it’s likely a polyglot.
    identify your_file.pngar # ImageMagick
  4. Hex Editor Inspection: Manually examine the file in a hex editor to look for overlapping headers and data structures.
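
The signature and parser checks above can be combined in a few lines of Python. This is a minimal sketch, assuming only the leading signatures listed in the table below (which is far from complete) and using the standard zipfile module as the ZIP parser; the names LEADING_MAGIC, detect_formats and looks_like_polyglot are hypothetical.

```python
import io
import zipfile

# Leading signatures for a few of the formats named in this article.
# Illustrative only; real detectors use much larger signature tables.
LEADING_MAGIC = {
    "GIF": (b"GIF87a", b"GIF89a"),
    "PNG": (b"\x89PNG\r\n\x1a\n",),
    "JPEG": (b"\xff\xd8\xff",),
    "PDF": (b"%PDF-",),
    "ELF": (b"\x7fELF",),
}


def detect_formats(data: bytes) -> list:
    """Return the names of every format the data appears to satisfy."""
    found = [name for name, sigs in LEADING_MAGIC.items()
             if data.startswith(sigs)]
    # ZIP is recognised from the end of the file, so test it separately.
    if zipfile.is_zipfile(io.BytesIO(data)):
        found.append("ZIP")
    return found


def looks_like_polyglot(data: bytes) -> bool:
    """True when more than one format check succeeds."""
    return len(detect_formats(data)) > 1
```

A signature match is weaker evidence than a full parse, so a file flagged here is a candidate for closer inspection rather than a confirmed polyglot.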

Practical Example: GIFAR

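GIFAR works because the two parsers look at opposite ends of the file. A GIF decoder reads from the first byte and stops at the end of the image data, ignoring anything that follows. A ZIP (and therefore JAR) reader finds the archive by locating the End of Central Directory record near the end of the file, so the format tolerates arbitrary data placed before the archive. The file therefore consists of a valid GIF header and image data, followed by the ZIP archive structure.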
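
To make the ZIP side of this concrete, the sketch below locates the End of Central Directory record by hand and reports how many bytes precede the first ZIP entry, which in a GIFAR is the size of the GIF portion. It is a simplified reading of the ZIP layout: ZIP64 archives and a signature that happens to occur inside the archive comment are not handled, and the function name leading_non_zip_bytes is hypothetical.

```python
import struct
from typing import Optional

EOCD_SIGNATURE = b"PK\x05\x06"   # End of Central Directory record
CDIR_SIGNATURE = b"PK\x01\x02"   # central directory file header
EOCD = struct.Struct("<4sHHHHIIH")  # fixed 22-byte part of the record


def leading_non_zip_bytes(data: bytes) -> Optional[int]:
    """Return the number of bytes before the first ZIP entry, or None."""
    eocd_pos = data.rfind(EOCD_SIGNATURE)
    if eocd_pos < 0 or eocd_pos + EOCD.size > len(data):
        return None
    (_sig, _disk, _cd_disk, _n_disk, n_total,
     cd_size, cd_offset, _comment_len) = EOCD.unpack_from(data, eocd_pos)
    # If data was prepended without fixing the stored offsets, every
    # offset is off by this amount.
    shift = eocd_pos - cd_size - cd_offset
    pos = cd_offset + shift
    first_local = None
    for _ in range(n_total):
        if data[pos:pos + 4] != CDIR_SIGNATURE:
            return None
        name_len, extra_len, note_len = struct.unpack_from("<HHH", data, pos + 28)
        (local_offset,) = struct.unpack_from("<I", data, pos + 42)
        if first_local is None or local_offset < first_local:
            first_local = local_offset
        pos += 46 + name_len + extra_len + note_len
    if first_local is None:
        return None
    return first_local + shift
```

For an ordinary ZIP the function returns 0; for an image with an archive appended it returns the length of the image portion.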

Why are Polyglot Files Used?

  • Malware Evasion: To bypass signature-based detection systems.
  • Testing Parsing Libraries: To identify vulnerabilities in software that handles multiple file formats.
  • Obfuscation: To hide malicious code within seemingly harmless files.
