GCM Output Formats: A Practical Guide

G5 Cyber Security

2 months ago

TL;DR

There isn’t *one* standard format for storing GCM (Galois/Counter Mode) output, but several common approaches exist. This guide covers the most practical options and how to implement them, focusing on ensuring data integrity and usability.

Understanding GCM Output

GCM produces two main outputs: ciphertext and an authentication tag (sometimes called a MAC – Message Authentication Code). The tag lets the recipient verify that the ciphertext hasn’t been tampered with. Both must be stored intact, because decryption fails if the tag is missing or if either value has changed.

Solution Guide

Choose a Storage Method:
- Database: Ideal for structured data, allowing easy querying and management.
- Filesystem: Suitable for larger binary blobs or when database access isn’t practical. Consider encryption at rest if storing sensitive data on disk.
- Key-Value Stores: Good for simple key-based retrieval of ciphertext/tag pairs.
Data Format Options:
- Concatenation: The simplest approach – just join the ciphertext and tag together.
```
ciphertext + tag
```
  This is easy to implement, but the reader must know the tag length (or the ciphertext length) to split the two apart again.
- JSON: A human-readable format, good for debugging and interoperability.
```
{
 "ciphertext": "...",
 "tag": "..."
}
```
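  JSON cannot carry raw bytes, so both values must be encoded as text (Base64 or hex). That encoding adds size overhead on top of the JSON structure itself.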
- Protocol Buffers (protobuf): A more efficient binary format than JSON, especially for large datasets. Requires defining a schema.
- ASN.1/DER: Common in cryptographic applications, providing a structured and standardised way to represent data. More complex to implement directly.
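As a rough illustration of the concatenation option, the sketch below packs a record into one blob and splits it again. It is only one possible layout: it assumes a 12-byte IV and a 16-byte tag, and it also prepends the IV (which decryption needs) so the blob is self-contained. The names `pack_record` and `unpack_record` are hypothetical, and the byte strings stand in for real GCM output.

```python
IV_LEN = 12   # assumption: 96-bit IV
TAG_LEN = 16  # assumption: full-length 128-bit tag


def pack_record(iv: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Join the parts into one blob laid out as iv || ciphertext || tag."""
    if len(iv) != IV_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected IV or tag length")
    return iv + ciphertext + tag


def unpack_record(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a blob produced by pack_record back into (iv, ciphertext, tag)."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("blob too short to contain an IV and a tag")
    iv = blob[:IV_LEN]
    ciphertext = blob[IV_LEN:-TAG_LEN]  # everything between IV and tag
    tag = blob[-TAG_LEN:]
    return iv, ciphertext, tag
```

Because both fixed-size fields sit at known offsets, no length prefix is needed; the ciphertext is whatever remains in the middle.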
Storing the IV (Initialisation Vector):
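The IV is required for decryption. *Always* store it alongside the ciphertext and tag. The IV is not secret, so it can be stored unencrypted, but it must never be reused with the same key: a repeated IV undermines both the confidentiality and the integrity guarantees of GCM.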
- Database: Add a separate column for the IV.
- Filesystem/Key-Value Stores: Store as part of the metadata associated with the ciphertext file or key. Consider using a filename convention that includes an IV identifier.
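The database option might look something like the following sketch, which uses Python's built-in `sqlite3` module. The table name `gcm_records`, the column names, and the helper functions are all hypothetical. The `UNIQUE` constraint on the IV column assumes every row is encrypted under the same key; with several keys, uniqueness would need to be enforced per key instead.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one column each for IV, ciphertext, tag."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS gcm_records (
            id         INTEGER PRIMARY KEY,
            iv         BLOB NOT NULL UNIQUE,  -- rejects a repeated IV
            ciphertext BLOB NOT NULL,
            tag        BLOB NOT NULL
        )
        """
    )
    return conn


def save_record(conn, iv: bytes, ciphertext: bytes, tag: bytes) -> int:
    """Insert one record and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO gcm_records (iv, ciphertext, tag) VALUES (?, ?, ?)",
            (iv, ciphertext, tag),
        )
    return cur.lastrowid


def load_record(conn, record_id: int):
    """Return (iv, ciphertext, tag) for the given row id."""
    row = conn.execute(
        "SELECT iv, ciphertext, tag FROM gcm_records WHERE id = ?",
        (record_id,),
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return row
```

Inserting a second row with an IV that is already present raises `sqlite3.IntegrityError`, which gives a cheap last-line check against accidental IV reuse. It does not replace generating IVs correctly in the first place.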

Example Implementation (Python) – JSON:

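This shows how to store GCM output in JSON format, using `AESGCM` from the `cryptography` package. `AESGCM.encrypt` returns the ciphertext with the 16-byte tag appended, so the code splits the two before storing them. The key must be 16, 24, or 32 bytes, for example from `AESGCM.generate_key(bit_length=256)`.

import base64
import json
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

TAG_LEN = 16  # AESGCM appends a 16-byte tag to the ciphertext

def _b64(raw):
    # JSON strings cannot hold raw bytes, so encode them as Base64 text
    return base64.b64encode(raw).decode("ascii")

def encrypt_and_store(data, key, filename="encrypted_data.json"):
    iv = os.urandom(12)  # 96-bit IV; must never repeat under the same key
    sealed = AESGCM(key).encrypt(iv, data.encode(), None)
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    storage_data = {
        "ciphertext": _b64(ciphertext),
        "tag": _b64(tag),
        "iv": _b64(iv),
    }
    with open(filename, "w") as outfile:
        json.dump(storage_data, outfile)

def load_and_decrypt(filename, key):
    with open(filename, "r") as infile:
        storage_data = json.load(infile)
    ciphertext = base64.b64decode(storage_data["ciphertext"])
    tag = base64.b64decode(storage_data["tag"])
    iv = base64.b64decode(storage_data["iv"])
    # decrypt() expects ciphertext || tag and raises InvalidTag on tampering
    decrypted_data = AESGCM(key).decrypt(iv, ciphertext + tag, None)
    return decrypted_data.decode()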

Length Considerations:
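The GCM tag length is chosen when encrypting and then stays fixed for that scheme (typically 16 bytes). The ciphertext is the same length as the plaintext, because GCM adds no padding.
- Database: Use data types that accommodate variable ciphertext lengths (e.g., `BLOB` for raw bytes, or `TEXT` if the values are Base64- or hex-encoded).
- Concatenation: With a fixed tag length, the tag is simply the last bytes of the blob, so no length prefix is needed. Store the ciphertext length separately only if the tag length can vary or several records share one blob.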
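To make the size arithmetic concrete, here is a small sketch for estimating how much space a record needs. It assumes a 12-byte IV, a 16-byte tag, and standard padded Base64 for any text encoding; the function names are hypothetical.

```python
IV_LEN = 12   # assumption: 96-bit IV
TAG_LEN = 16  # assumption: full-length 128-bit tag


def raw_record_size(plaintext_len: int) -> int:
    """Bytes needed for iv + ciphertext + tag stored as raw binary."""
    # GCM adds no padding, so the ciphertext is as long as the plaintext.
    return IV_LEN + plaintext_len + TAG_LEN


def base64_size(n: int) -> int:
    """Characters needed to hold n bytes as standard padded Base64."""
    # Every 3 input bytes become 4 output characters, rounded up.
    return 4 * ((n + 2) // 3)
```

For a 100-byte plaintext, `raw_record_size(100)` gives 128 bytes, while Base64-encoding the ciphertext alone takes `base64_size(100)`, which is 136 characters. That difference is worth budgeting for when sizing a `TEXT` column or a JSON payload.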
Security Best Practices:
- Key Management: Protect your encryption key rigorously! Use a Hardware Security Module (HSM) or secure key management system.
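- Authentication: Never use decrypted data unless the tag check has passed. Most libraries verify the tag as part of decryption and raise an error on a mismatch; treat that error as evidence of tampering or corruption rather than ignoring it.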
- Regular Audits: Review your storage and retrieval processes regularly for vulnerabilities.