Blog | G5 Cyber Security

Webpage Download Tracking

TL;DR

Yes, to a degree. A webpage can sometimes tell when you download its source code or save it as a web archive (for example with your browser’s ‘Save Page As…’). It does this by watching for specific events and behaviours, such as keyboard shortcuts and patterns in network requests. However, there are ways to reduce this tracking.

How Webpages Track Downloads

  1. JavaScript Events: Scripts on the page can listen for keyboard shortcuts such as Ctrl+S and report them to the server. Browsers expose no dedicated ‘page saved’ event, so a save made through the browser menu is not visible to scripts in this way.
  2. Network Requests: When you save a page, your browser may request its resources again (HTML, CSS, images, etc.). The server logs these requests, and a burst of repeated requests from one client can be correlated to infer a download attempt.
  3. Web Archive Services: Saving a page locally as a web archive keeps the copy on your own machine. Submitting it to a third-party service (like Archive.org) is different: the service fetches the page itself and keeps a record of the saved page and its origin.
  4. Content Integrity Checks: Some websites embed hidden code or unique identifiers within their HTML that are checked when the page is loaded. If a saved copy is opened later and this code is missing, or it notices that the page is being loaded from a local file, it can indicate that the page was downloaded.
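
The network-request method above can be sketched roughly as follows. This is a minimal illustration, not how any particular site does it: the log excerpt, the IP addresses and the function names sample_log and requests_per_client are all made up, and the log is assumed to be in Common Log Format. It only groups requests by client, which is the first step of any correlation.

```shell
#!/bin/sh
# Hypothetical access-log excerpt in Common Log Format.
# The IPs, paths and timestamps are invented for illustration.
sample_log() {
cat <<'EOF'
203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /article.html HTTP/1.1" 200 5120
203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /style.css HTTP/1.1" 200 840
203.0.113.5 - - [10/Oct/2024:13:55:37 +0000] "GET /logo.png HTTP/1.1" 200 20480
198.51.100.7 - - [10/Oct/2024:13:56:02 +0000] "GET /article.html HTTP/1.1" 200 5120
EOF
}

# Read log lines on stdin and print "<client> <request count>",
# with the busiest client first.
requests_per_client() {
  awk '{ count[$1]++ } END { for (ip in count) print ip, count[ip] }' |
    sort -k2,2nr
}

sample_log | requests_per_client
```

An ordinary page view produces much the same pattern as a save, so a count like this is a weak signal on its own. A server would need extra context, such as the same client re-requesting resources it fetched moments earlier, before treating it as a likely download.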

How to Reduce Tracking

  1. Disable JavaScript (Use with Caution): Disabling JavaScript prevents script-based tracking methods, but it may break website functionality and does not hide the requests your browser sends to the server.
  2. Browser Extensions: Use privacy-focused browser extensions like uBlock Origin or Privacy Badger to block trackers and scripts. These can often prevent download tracking code from running.
  3. Save as Text Only: Instead of ‘Save Page As…’, view the page source and copy it into a plain text editor. The scripts and other tracking elements are still present in the text, but a text editor never runs them or loads the resources they point to.
    • Right-click on the webpage and select ‘View Page Source’.
    • Select all the content in the source code window (Ctrl+A or Cmd+A).
    • Copy the content (Ctrl+C or Cmd+C).
    • Paste it into a plain text editor like Notepad (Windows) or TextEdit in plain text mode (Mac).
  4. Use Command-Line Tools: Use tools like wget or curl to download the page content. These tools give you more control over what is downloaded, and they never execute JavaScript. In the command below, <URL> is a placeholder for the address of the page. Note that --no-check-certificate turns off TLS certificate verification, so leave it out unless you specifically need it.
    wget -q --no-check-certificate <URL> -O filename.html
  5. Incognito/Private Browsing: While not a perfect solution, an incognito or private browsing window discards cookies and site data when it closes, which limits the amount of tracking data associated with your session. It does not hide your IP address.
  6. VPN and Tor: Using a VPN (Virtual Private Network) or the Tor network can mask your IP address and make it harder to link your downloads to you.
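
To illustrate why fetching only the raw HTML gives you more control, the sketch below lists the resources a saved page refers to. A browser would normally request each of these, while a plain text copy or a single curl or wget download does not. The page source, the tracker.example host and the function names sample_page and list_referenced_urls are hypothetical, and matching attributes with grep is a rough approximation rather than a real HTML parser.

```shell
#!/bin/sh
# Hypothetical page source. In practice this would be a file you saved,
# for example with: curl -sS -o page.html "$url"
sample_page() {
cat <<'EOF'
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="https://tracker.example/t.js"></script>
</head><body>
<img src="https://tracker.example/pixel.gif?id=abc123" width="1" height="1">
<p>Article text</p>
</body></html>
EOF
}

# Read HTML on stdin and print every src="..." or href="..." value,
# one per line. Only double-quoted attributes are matched.
list_referenced_urls() {
  grep -oE '(src|href)="[^"]*"' | sed -E 's/^(src|href)="//; s/"$//'
}

sample_page | list_referenced_urls
```

Reviewing a list like this before opening a saved copy in a browser shows which third-party hosts the page would contact, including any URL that carries a unique identifier.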

Important Considerations
