MD5 Sum in File: Is it Possible?

TL;DR

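Yes, a file can carry an MD5 sum of its own content, provided the hash is computed over everything except the embedded hash itself (appending a hash changes the file, so the hash of the whole file will no longer match). However, this doesn’t prove the file hasn’t been tampered with unless you also have a separate, trusted source for the original MD5 sum to compare against.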

How to Do It

  1. Calculate the MD5 Sum: Use a command-line tool or programming language function to generate the MD5 hash of your file.
    • Linux/macOS:
      md5sum filename

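      This will output something like a1b2c3d4e5f678901234567890abcdef filename. The long hexadecimal string is the MD5 sum. On macOS, if md5sum is not installed, md5 filename reports the same hash in a slightly different layout.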

    • Windows (PowerShell):
      Get-FileHash filename -Algorithm MD5 | Format-List

      Look for the Hash property in the output.

    • Python:
      import hashlib
      # Open in binary mode so the hash covers the exact bytes on disk.
      with open('filename', 'rb') as f:
          md5_hash = hashlib.md5(f.read()).hexdigest()
      print(md5_hash)
  2. Append the MD5 Sum to the File: Add the calculated MD5 sum to the end of your file. You can do this using a text editor or command-line tools.
    • Command Line (Linux/macOS):
      echo "$(md5sum filename | awk '{print $1}')" >> filename

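      This appends the MD5 sum of the original content to the file as a new last line. The hash is computed over the original content before the new line is written. The file should already end with a newline so that the hash lands on a line of its own.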

    • Text Editor: Open the file in a text editor and manually add a newline character followed by the MD5 sum at the very end of the file. Make sure there are no extra spaces or characters before or after the hash.
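  3. Verify the Integrity (Important): To check if the file has been altered, recalculate the MD5 sum of everything except the appended last line and compare it to the hash stored on that line. Hashing the whole file, appended line included, will never match the stored value.
    • Linux/macOS:
      sed '$d' filename | md5sum | awk '{print $1}'
      tail -n 1 filename

      The first command hashes the file without its last line; the second prints the stored hash. The two values should be identical.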

    • Windows (PowerShell):
      Get-FileHash filename -Algorithm MD5 | Select-Object Hash

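      Get-FileHash always hashes the whole file, so run it on a copy from which the appended hash line has been removed, then compare the result to that line. Get-FileHash prints the hash in uppercase, so ignore case when comparing.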
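
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names embed_md5 and verify_embedded_md5 are hypothetical, the hash is assumed to be stored as the final line of the file (32 hex characters plus a newline), and the digest is assumed to cover everything before that line.

```python
import hashlib

HASH_LEN = 32  # an MD5 hex digest is always 32 characters long


def embed_md5(data: bytes) -> bytes:
    """Return data with the MD5 hex digest of the payload appended as a final line."""
    # Assumption: a trailing newline is added when missing, so the hash
    # sits on its own line. The digest covers the payload including it.
    if not data.endswith(b"\n"):
        data += b"\n"
    digest = hashlib.md5(data).hexdigest().encode("ascii")
    return data + digest + b"\n"


def verify_embedded_md5(blob: bytes) -> bool:
    """Compare the trailing hash line with the MD5 of everything before it."""
    trailer_len = HASH_LEN + 1  # digest plus its newline
    if len(blob) < trailer_len or not blob.endswith(b"\n"):
        return False
    payload = blob[:-trailer_len]
    stored = blob[-trailer_len:-1]
    return hashlib.md5(payload).hexdigest().encode("ascii") == stored


if __name__ == "__main__":
    blob = embed_md5(b"hello world\n")
    print(verify_embedded_md5(blob))                               # intact copy
    print(verify_embedded_md5(blob.replace(b"hello", b"jello")))  # altered payload
```

Working on bytes rather than text lines avoids accidental changes to line endings or encoding, which would alter the hash.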

Important Considerations

  • Security Risk: Storing the MD5 sum *inside* the file is not a secure way to verify integrity. An attacker could modify both the file and its embedded MD5 sum, making it appear valid.
  • Trusted Source Required: You need a separate, trusted source for the original MD5 sum (e.g., from a website, another file, or a secure database) to reliably detect tampering.
  • File Format Compatibility: This method works best with text-based files. For binary files, appending the MD5 sum might corrupt the file if it’s not handled correctly. Consider storing the hash in a separate metadata section of the file format (if available).
  • Alternative Hash Algorithms: MD5 is cryptographically broken: practical collision attacks let an attacker craft two different files with the same MD5 sum. Use stronger algorithms like SHA-256 or SHA-3 for better security.
    • Linux/macOS (SHA-256):
      sha256sum filename
    • Windows (PowerShell – SHA-256):
      Get-FileHash filename -Algorithm SHA256 | Format-List
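
To make the first two considerations concrete, the sketch below shows an attacker replacing both the payload and the embedded hash. A check that relies only on the embedded hash accepts the forged copy, while a digest obtained from a separate trusted source still catches the change. It uses SHA-256 rather than MD5, in line with the last point. The helper names (embed_digest, embedded_check, trusted_check) and the sample byte strings are hypothetical, and the digest is assumed to be stored as the final line of the file.

```python
import hashlib

TRAILER_LEN = 65  # 64 hex characters of SHA-256 plus a newline


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def embed_digest(payload: bytes) -> bytes:
    """Append the SHA-256 hex digest of payload as a final line."""
    return payload + sha256_hex(payload).encode("ascii") + b"\n"


def embedded_check(blob: bytes) -> bool:
    """Verify a blob using only the digest stored inside it."""
    payload, stored = blob[:-TRAILER_LEN], blob[-TRAILER_LEN:-1]
    return sha256_hex(payload).encode("ascii") == stored


def trusted_check(blob: bytes, trusted_digest: str) -> bool:
    """Verify a blob against a digest obtained from a separate trusted source."""
    return sha256_hex(blob[:-TRAILER_LEN]) == trusted_digest


if __name__ == "__main__":
    original = b"pay alice 10\n"
    # Assumption: this digest reaches the verifier over a trusted channel.
    published = sha256_hex(original)

    genuine = embed_digest(original)
    # The attacker swaps the payload and simply recomputes the embedded digest.
    forged = embed_digest(b"pay mallory 9999\n")

    print(embedded_check(forged))            # True: the forgery is self-consistent
    print(trusted_check(forged, published))  # False: the trusted digest exposes it
    print(trusted_check(genuine, published)) # True
```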