Blog | G5 Cyber Security

Message Hashing & Data Appending

TL;DR

No, you generally cannot append data to a message without changing its hash. A hash is like a fingerprint – even a tiny change in the message alters the fingerprint completely. However, techniques such as Message Authentication Codes (MACs) let you send a verification tag alongside the message, so the receiver can still check its integrity *after* that information has been appended.

Understanding Hashing

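Hashing algorithms take an input (your message) and produce a fixed-size output (the hash). Common hashing algorithms include SHA-256 and MD5 (though MD5 is now considered insecure for many purposes). The key properties are that the same input always produces the same hash, that even a small change to the input produces a very different hash, and that it is computationally infeasible to recover the input from the hash or to find two different inputs with the same hash.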

Because of these properties, hashes are used for verifying data integrity.
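Two of these properties, a fixed output size and determinism, are easy to check directly. The sketch below is only an illustration: it hashes inputs of very different lengths with SHA-256 and compares the results. The helper name sha256_hex is made up for this example.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed-size output: SHA-256 always yields 256 bits (64 hex characters),
# whether the input is one byte or a million bytes.
print(len(short_digest), len(long_digest))  # prints: 64 64

# Deterministic: hashing the same input again gives the same digest.
print(sha256_hex(b"a") == short_digest)  # prints: True

# Different inputs give different digests.
print(short_digest != long_digest)  # prints: True
```

Because the digest length never depends on the input length, a hash can be stored or transmitted in a fixed-width field regardless of how large the message is.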

Why Appending Data Changes the Hash

The hashing algorithm processes every bit of the input message. If you add even a single character or byte to the message, the entire calculation changes, resulting in a different hash value.

Example (Python)

import hashlib

message = "This is my original message."
original_hash = hashlib.sha256(message.encode()).hexdigest()
print(f"Original Hash: {original_hash}")

message_with_append = message + " I've added some extra data."
new_hash = hashlib.sha256(message_with_append.encode()).hexdigest()
print(f"Hash after appending: {new_hash}")

You’ll see that original_hash and new_hash are completely different.
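To put a number on "completely different", one option is to count how many of the 256 output bits differ between the two digests. This is a small sketch rather than a formal test; the helper bit_difference is a name invented for this example. For a well-behaved hash function, roughly half of the bits are expected to change.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 bit wherever the two bytes disagree.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = "This is my original message."
appended = message + " I've added some extra data."

# Use raw digests (bytes) rather than hex strings so bits can be compared.
original_digest = hashlib.sha256(message.encode()).digest()
appended_digest = hashlib.sha256(appended.encode()).digest()

changed = bit_difference(original_digest, appended_digest)
print(f"{changed} of {len(original_digest) * 8} bits differ")
```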

How to Append Data *and* Verify Integrity

If you need to append data while still ensuring integrity, use a Message Authentication Code (MAC). A MAC uses a secret key along with the message to generate a tag. This tag can be used to verify that the message hasn’t been tampered with.

Steps for Using a MAC

  1. Choose a MAC algorithm: HMAC is a common and secure choice.
  2. Share a secret key: This key must be known only to the sender and receiver.
  3. Calculate the MAC: The sender calculates the MAC of the original message using the secret key.
  4. Append the MAC to the message: Send both the message *and* the MAC tag.
  5. Verify on receipt: The receiver recalculates the MAC using the received message and the shared secret key. If the calculated MAC matches the received MAC, the message is authentic and hasn’t been altered.

Example (Python – HMAC)

import hashlib
import hmac
hash_algorithm = hashlib.sha256
secret_key = b'MySecretKey'
message = "This is my original message."

# Calculate the MAC
mac = hmac.new(secret_key, message.encode(), hash_algorithm).hexdigest()
print(f"MAC: {mac}")

# Append the MAC to the message
message_with_mac = message + "|" + mac  # Use a delimiter like '|' to separate message and MAC

# Verification (Receiver side)
received_message_and_mac = message_with_mac
try:
    # Split on the last '|' only, so a '|' inside the message itself doesn't break parsing
    received_message, received_mac = received_message_and_mac.rsplit("|", 1)
    calculated_mac = hmac.new(secret_key, received_message.encode(), hash_algorithm).hexdigest()
    # compare_digest compares in constant time, which avoids leaking information through timing
    if hmac.compare_digest(calculated_mac.encode(), received_mac.encode()):
        print("Message is authentic!")
    else:
        print("Message has been tampered with!")
except ValueError:
    print("Invalid message format.")
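The example above only exercises the success path. The sketch below wraps the same HMAC steps in two helper functions (make_tag and verify_tag, names chosen for this illustration) and shows what happens when data is appended. Without a recomputed tag the check fails, while a sender who holds the key can append data and issue a fresh tag. The key is a placeholder for demonstration only.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for message, as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it to the received one in constant time."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected.encode(), tag.encode())

key = b"demo-key-not-for-production"  # placeholder; never hardcode real keys
original = b"This is my original message."
tag = make_tag(key, original)

# Untouched message with its own tag: accepted.
print(verify_tag(key, original, tag))  # prints: True

# Data appended but the old tag reused: rejected.
extended = original + b" I've added some extra data."
print(verify_tag(key, extended, tag))  # prints: False

# The key holder appends data and computes a fresh tag: accepted.
new_tag = make_tag(key, extended)
print(verify_tag(key, extended, new_tag))  # prints: True
```

The point of the design is that only a party holding the secret key can produce a tag that verifies, so appended data is accepted only when it comes with a tag computed by the key holder.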

Important: Always use a strong, randomly generated secret key. Never hardcode keys directly into your code for production systems.
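One way to follow this advice, sketched under the assumption that the key is supplied through an environment variable, is to generate the key once with Python's secrets module and load it at runtime. The variable name MAC_SECRET_KEY and the helpers generate_key and load_key are hypothetical; a real deployment might use a dedicated secrets manager instead.

```python
import os
import secrets

def generate_key(num_bytes: int = 32) -> bytes:
    """Return a cryptographically strong random key (32 bytes = 256 bits)."""
    return secrets.token_bytes(num_bytes)

def load_key(env_var: str = "MAC_SECRET_KEY") -> bytes:
    """Read a hex-encoded key from the environment; fail loudly if it is missing."""
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"{env_var} is not set")
    return bytes.fromhex(value)

# One-time setup: generate a key and store its hex form outside the code.
new_key = generate_key()
print(len(new_key))  # prints: 32

# Demo only: set the variable in-process so the example runs on its own.
# In practice it would be set by the deployment environment.
os.environ["MAC_SECRET_KEY"] = new_key.hex()
secret_key = load_key()
print(secret_key == new_key)  # prints: True
```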

Digital Signatures

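If you also need non-repudiation (proving who sent the message), consider using digital signatures instead of MACs. Digital signatures use asymmetric cryptography (public/private key pairs): the sender signs with a private key, and anyone holding the matching public key can verify the signature without being able to forge one.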
