Blog | G5 Cyber Security

PDF Signatures & DSIG Core

TL;DR

Yes, you can describe PDF certificate-based signatures using W3C’s XML Signature Syntax and Processing specification (XMLDSig, referred to here as DSIG Core), but it requires understanding how PDFs store signature information and mapping that to the DSIG model. It’s not a one-to-one translation: PDF signatures are typically wrapped in a CMS (PKCS #7) container rather than XML. You’ll likely need a library or tool to extract the relevant data.

Understanding PDF Signatures

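PDF signatures aren’t bare signature values appended to a document. Each one is a structure (the signature dictionary) that records which bytes of the file were signed, the signature container itself, and the handler and encoding needed to interpret it.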

These components are embedded within the PDF file itself.
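To make the "which bytes were signed" part concrete, here is a minimal sketch of hashing the spans listed in a signature dictionary’s /ByteRange array. The function name `digest_byte_range` is hypothetical, and the sketch assumes you have already read the array (offset/length pairs) and know the hash algorithm; it does not parse the PDF or verify anything.

```python
import hashlib


def digest_byte_range(pdf_bytes: bytes, byte_range, algorithm: str = "sha256") -> bytes:
    """Hash the byte spans listed in a /ByteRange array of offset/length pairs."""
    if len(byte_range) % 2 != 0:
        raise ValueError("/ByteRange must contain offset/length pairs")
    hasher = hashlib.new(algorithm)
    for i in range(0, len(byte_range), 2):
        offset, length = int(byte_range[i]), int(byte_range[i + 1])
        hasher.update(pdf_bytes[offset:offset + length])
    return hasher.digest()


if __name__ == "__main__":
    # Synthetic stand-in for a PDF: the middle part plays the role of the
    # /Contents gap, which is excluded from the signed bytes.
    fake_pdf = b"HEAD" + b"<SIGNATURE>" + b"TAIL"
    print(digest_byte_range(fake_pdf, [0, 4, 15, 4]).hex())
```

For a PKCS #7-based signature, the value computed this way is what you would expect to find as the message digest inside the signature container.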

Mapping to DSIG Core

W3C’s DSIG Core provides a standard way to represent digital signatures in XML. Here’s how you can map PDF signature data:

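  1. Identify the Signed Data: Determine what part of the PDF was actually signed. This is specified by the /ByteRange entry of the signature dictionary, an array of offset/length pairs that normally covers the whole file except the gap holding the signature value in /Contents.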
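  2. Extract the Digest: The /Contents entry of the signature dictionary does not hold a bare hash. For the common PKCS #7-based subfilters it holds a DER-encoded CMS container, and the digest of the signed bytes sits inside it as the messageDigest signed attribute. Libraries like PyPDF2 or pdfminer.six can help you get at the container; parsing it requires an ASN.1/CMS library.
    from PyPDF2 import PdfReader
    reader = PdfReader("your_pdf.pdf")
    # get_fields() is keyed by field name (no leading slash); "Signature1" is a placeholder
    signature = reader.get_fields()["Signature1"]["/V"]  # /V is the signature dictionary
    # /Contents is the DER-encoded signature container, usually padded with trailing zero bytes
    container = signature["/Contents"].original_bytes
    print(len(container), container[:16].hex())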
    
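  3. Extract Certificate Information: Retrieve the certificate chain from the PDF. This will give you the signer’s public key, which is essential for verification. For PKCS #7 and CAdES signatures the certificates are embedded in the CMS container in /Contents; the /Cert entry is only used with the adbe.x509.rsa_sha1 subfilter.
    from PyPDF2 import PdfReader
    reader = PdfReader("your_pdf.pdf")
    signature = reader.get_fields()["Signature1"]["/V"]  # "Signature1" is a placeholder field name
    # /Cert is typically absent (None) unless /SubFilter is /adbe.x509.rsa_sha1
    certificates = signature.get("/Cert")
    print(signature.get("/SubFilter"), certificates)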
    
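  4. Determine the Digest Algorithm: Find out which hashing algorithm was used (e.g., SHA-256). For PKCS #7-based signatures this is recorded inside the CMS container (the digestAlgorithm of the SignerInfo), not as a plain entry in the signature dictionary. The /Filter and /SubFilter entries only name the signature handler and the container format, which tells you how to parse /Contents.
    from PyPDF2 import PdfReader
    reader = PdfReader("your_pdf.pdf")
    signature = reader.get_fields()["Signature1"]["/V"]  # "Signature1" is a placeholder field name
    handler = signature.get("/Filter")      # signature handler, e.g. /Adobe.PPKLite (not a hash name)
    encoding = signature.get("/SubFilter")  # container format, e.g. /adbe.pkcs7.detached
    print(handler, encoding)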
    
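  5. Create a DSIG Core Representation: Use an XML signature library (e.g., xmlsec in Python) or a plain XML library to express the extracted values as DSIG elements such as DigestMethod, DigestValue, and X509Certificate.

    Producing a verifiable XML signature involves canonicalising the SignedInfo element (Canonical XML), computing the digest of the referenced data with the identified algorithm, and signing with the signer’s private key; the public key is only used for verification. Without that private key, the result describes the PDF signature in DSIG terms but does not validate as an XML signature itself.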
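As a rough illustration of step 5, the sketch below assembles a descriptive ds:Signature element from values you have already extracted. It uses the standard library’s ElementTree rather than xmlsec, because it only builds XML and does not sign anything. The helper name `describe_pdf_signature` is hypothetical, and the choice of SHA-256 and RSA-SHA256 algorithm URIs is an assumption about the PDF being described; swap in whatever the CMS container actually reports.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

DS_NS = "http://www.w3.org/2000/09/xmldsig#"
ET.register_namespace("ds", DS_NS)

# Algorithm identifiers from the XML Signature and XML Encryption specifications.
C14N_URI = "http://www.w3.org/TR/2001/REC-xml-c14n-20010315"
SHA256_URI = "http://www.w3.org/2001/04/xmlenc#sha256"
RSA_SHA256_URI = "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"


def describe_pdf_signature(digest: bytes, cert_der: bytes, reference_uri: str) -> ET.Element:
    """Build a descriptive ds:Signature element (hypothetical helper, not verifiable)."""

    def sub(parent, tag, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{DS_NS}}}{tag}", attrs)
        if text is not None:
            element.text = text
        return element

    signature = ET.Element(f"{{{DS_NS}}}Signature")
    signed_info = sub(signature, "SignedInfo")
    sub(signed_info, "CanonicalizationMethod", Algorithm=C14N_URI)
    sub(signed_info, "SignatureMethod", Algorithm=RSA_SHA256_URI)  # assumed algorithm
    reference = sub(signed_info, "Reference", URI=reference_uri)
    sub(reference, "DigestMethod", Algorithm=SHA256_URI)  # assumed algorithm
    sub(reference, "DigestValue", base64.b64encode(digest).decode("ascii"))
    # Left empty on purpose: the PDF's CMS signature value is not an XML signature value.
    sub(signature, "SignatureValue", "")
    key_info = sub(signature, "KeyInfo")
    x509_data = sub(key_info, "X509Data")
    sub(x509_data, "X509Certificate", base64.b64encode(cert_der).decode("ascii"))
    return signature


if __name__ == "__main__":
    demo_digest = hashlib.sha256(b"signed bytes of the PDF").digest()
    demo_cert = b"placeholder, not a real DER certificate"
    element = describe_pdf_signature(demo_digest, demo_cert, "your_pdf.pdf")
    print(ET.tostring(element, encoding="unicode"))
```

The output is a description only. A DSIG verifier would reject it, since the SignatureValue is empty and the digest covers PDF byte ranges rather than a canonicalised XML reference.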

Tools & Libraries

Important Considerations
