Blog | G5 Cyber Security

Code Verification: Proof of Source

TL;DR

It’s very difficult to cryptographically prove a running interpreted program is exactly the same as published source code. While perfect proof isn’t usually possible, you can get strong evidence using techniques like hashing and digital signatures combined with careful build processes and dependency management.

How it Works: The Challenges

Interpreted languages (Python, JavaScript, Ruby, etc.) differ from compiled ones. A compiler produces a fixed executable from the source, so there is a single artifact to hash. An interpreted program is loaded and executed by an interpreter at run time, and what actually runs also depends on the interpreter version, the installed libraries and the environment.

Because of this flexibility, a simple hash comparison of the source file is unreliable on its own.
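To make the problem concrete, here is a minimal Python sketch. The source string, the circle_area function and the load helper are all made up for illustration. It runs byte-identical source twice, once normally and once with a substituted math module, so the hash is the same in both cases while the behaviour differs.

```python
import builtins
import hashlib
import types

# Hypothetical "published" source; a hash comparison only covers these bytes.
PUBLISHED_SOURCE = b"import math\n\ndef circle_area(r):\n    return math.pi * r * r\n"


def sha256_hex(data):
    """Return the SHA-256 fingerprint of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def load(source, substituted_modules=None):
    """Execute source in a fresh namespace, optionally swapping some imports.

    substituted_modules stands in for anything that changes what an import
    resolves to at run time, such as a different sys.path or a patched library.
    """
    substituted_modules = substituted_modules or {}
    real_import = builtins.__import__

    def patched_import(name, *args, **kwargs):
        if name in substituted_modules:
            return substituted_modules[name]
        return real_import(name, *args, **kwargs)

    # Give the executed code its own builtins with the patched import hook.
    sandbox_builtins = dict(vars(builtins), __import__=patched_import)
    namespace = {"__builtins__": sandbox_builtins}
    exec(compile(source, "<published>", "exec"), namespace)
    return namespace


honest = load(PUBLISHED_SOURCE)
tampered = load(PUBLISHED_SOURCE, {"math": types.SimpleNamespace(pi=3)})

# Same bytes, same hash, different behaviour.
print(sha256_hex(PUBLISHED_SOURCE))
print(honest["circle_area"](1))    # uses the real math.pi
print(tampered["circle_area"](1))  # uses the substituted value
```

The point of the sketch is only that a matching source hash says nothing about the libraries and interpreter the source runs against, which is why the steps below also cover dependencies and build output.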

Steps to Verify Code Integrity

  1. Hashing the Source Code: Create a cryptographic hash of your original source code.
    sha256sum my_script.py

    This gives you a ‘fingerprint’ of the file: any change to its contents, however small, produces a different hash.

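  2. Dependency Management: Use a dependency manager (e.g., pip for Python, npm for JavaScript) to lock every library your code uses, including indirect dependencies, to an exact version.
    • Python: Create a requirements.txt file with pinned versions (pip freeze lists everything currently installed in this format):
      requests==2.28.1
    • JavaScript: Commit package-lock.json or yarn.lock so the exact dependency tree is recorded, and install from the lock file (for npm, npm ci does this).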
  3. Build Process (if applicable): If you use a build step (e.g., bundling JavaScript), hash the *output* of that process, not just the source.
    sha256sum dist/bundle.js
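  4. Digital Signatures: Sign your source code and dependency lock file with a private key.
    • A valid signature shows that the files were signed by the holder of that key and have not been modified since. Tools like GPG can be used for this.
    • Example (GPG detached signature, which writes my_script.py.asc next to the file and can later be checked with gpg --verify my_script.py.asc my_script.py):
      gpg --armor --detach-sign my_script.py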
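  5. Runtime Hashing: At runtime, recalculate the hash of the source code and compare it to the original.

    This is tricky in interpreted languages because the program needs access to its own source files at runtime, which is not always the case (for example, when it is shipped as bytecode or as a bundle). A check that lives inside the program can also be removed by whoever modified the program, so treat it as evidence rather than proof.

    import hashlib
    from pathlib import Path

    # Hash the file this code is running from, rather than a relative
    # path that depends on the current working directory.
    source_code = Path(__file__).resolve().read_bytes()
    hash_value = hashlib.sha256(source_code).hexdigest()
    print(hash_value)

    Compare this hash to the original hash you calculated in step 1.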

  6. Code Integrity Checks: Implement checks within your application to verify dependencies and code hashes.
    • If a check fails, refuse to run or display a clear warning.
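Steps 1, 5 and 6 can be combined into a manifest: record a hash for every source file at release time, then recompute and compare at startup. The following is a rough sketch under stated assumptions, not a definitive implementation. The function names, the file patterns and the idea of storing the manifest as a plain dictionary are all illustrative choices, and the manifest itself would still need to be protected, for example by signing it as in step 4.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, patterns=("*.py", "requirements.txt")):
    """Map each matching file (path relative to root) to its SHA-256 hash."""
    manifest = {}
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            if path.is_file():
                manifest[path.relative_to(root).as_posix()] = sha256_file(path)
    return manifest


def verify_manifest(root, manifest, patterns=("*.py", "requirements.txt")):
    """Return a list of problems; an empty list means everything matched."""
    problems = []
    for rel_path, expected in manifest.items():
        path = root / rel_path
        if not path.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_file(path) != expected:
            problems.append(f"modified: {rel_path}")
    # Files that match the patterns but were never recorded are suspicious too.
    current = build_manifest(root, patterns)
    for rel_path in sorted(current.keys() - manifest.keys()):
        problems.append(f"unexpected: {rel_path}")
    return problems


def refuse_to_run_if_tampered(root, manifest):
    """Stop the program with a clear message when any check fails."""
    problems = verify_manifest(root, manifest)
    if problems:
        raise SystemExit("Integrity check failed: " + "; ".join(problems))
```

At release time you would call build_manifest on the project directory and publish the result next to the signed files. At startup, refuse_to_run_if_tampered implements the "refuse to run" behaviour from step 6. The same caveat as in step 5 applies: someone who can edit the source can also edit or remove this check.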

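For the dependency half of steps 2 and 6, a program can compare what is actually installed against the pinned versions. This sketch uses importlib.metadata from the standard library (Python 3.8 or later). The parse_pins and check_pins helpers are hypothetical names, and the parser assumes plain name==version lines only, with no environment markers, extras or --hash options.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Extract {name: version} from 'name==version' lines.

    Blank lines, comments and entries that are not pinned with '==' are skipped.
    """
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def check_pins(pins, installed_version=metadata.version):
    """Return a list of mismatches between pinned and installed versions.

    installed_version is a parameter so the lookup can be replaced in tests.
    """
    problems = []
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (pinned {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name}: installed {found}, pinned {wanted}")
    return problems
```

A typical call would be check_pins(parse_pins(open("requirements.txt").read())), refusing to start if the list is not empty. Note that this compares version strings only. It does not prove that the installed files are the ones that were published, which is what pip's hash-checking mode (pip install --require-hashes -r requirements.txt) is for at install time.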
Important Considerations
