Bypassing htmlspecialchars() XSS

TL;DR

htmlspecialchars() isn’t a perfect defence against Cross-Site Scripting (XSS). This guide shows how to bypass it using different character encodings and techniques, focusing on scenarios where event handlers are blocked. We’ll cover HTML entity encoding, Unicode escapes, and context-dependent exploitation.

Understanding the Problem

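htmlspecialchars() converts the characters that have special meaning in HTML (&, <, >, " and, when ENT_QUOTES is set, ') into their HTML entities. ENT_QUOTES has been part of the default flags only since PHP 8.1; earlier versions default to ENT_COMPAT, which leaves single quotes alone. Even with the right flags, the function only makes a value safe as element content or inside a quoted attribute. It doesn’t protect against all XSS vectors, especially when the output lands in another context or is combined with incorrect character encoding or specific browser behaviours.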
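As a quick illustration of which characters are actually converted, the sketch below runs one string through htmlspecialchars() with two flag sets. The payload string and the variable names are made up for this example.

```php
<?php
// Illustrative input containing all five special characters.
$payload = "<a href='x' title=\"y\">Tom & Jerry</a>";

// ENT_QUOTES encodes both quote styles (part of the default flags since PHP 8.1).
$strict = htmlspecialchars($payload, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
echo $strict, PHP_EOL;
// &lt;a href=&#039;x&#039; title=&quot;y&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

// ENT_COMPAT (the default before PHP 8.1) leaves single quotes untouched.
$compat = htmlspecialchars($payload, ENT_COMPAT, 'UTF-8');
echo $compat, PHP_EOL;
// &lt;a href='x' title=&quot;y&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Passing the flags and the encoding explicitly, as in the first call, avoids depending on the defaults of whichever PHP version runs the code.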

Bypassing htmlspecialchars()

  1. HTML Entity Encoding
    • Sometimes, double encoding can bypass filters. For example, if the filter encodes & to &amp;, you might try encoding it again: &amp;amp;. This only helps when a later step decodes the value one more time than it was encoded.
    • However, this is less common with modern implementations of htmlspecialchars().
  2. Unicode Escapes
    • htmlspecialchars() leaves backslashes and escape sequences untouched. You can try using Unicode escapes for characters that are normally filtered.
      \u003cscript\u003ealert(1);\u003c/script\u003e
    • The browser does not decode these escapes in HTML itself. This works only where the value is later parsed as JavaScript, for example inside a string literal that a script then writes into the page.
  3. Context-Dependent Exploitation
    • The most effective bypasses depend on where the user input is placed in the HTML.
      1. Attribute Context: If the input goes into an HTML attribute, you can try breaking out of the attribute using quotes and other characters. Double quotes are encoded by default, so this needs an unquoted attribute or a single-quoted attribute escaped without ENT_QUOTES. The goal is markup such as:
        <img src="x" onerror="alert(1)">
      2. Comment Context: If the input is placed within an HTML comment, you can try closing the comment tag. htmlspecialchars() encodes the > in -->, so this only works where the input reaches the comment unescaped.
        --> <script>alert(1);</script>
      3. Script Tag Context: If the input goes into a script tag, you might be able to inject JavaScript directly. This is less common if htmlspecialchars() is applied correctly within the script tag itself.
        <script>alert(1);
  4. Using Different Character Sets
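    • htmlspecialchars() assumes the encoding given in its third argument (UTF-8 by default in current PHP versions). If the page is served in a different character set, you might be able to inject byte sequences that the function leaves alone but that the browser interprets differently.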
      <img src="javascript:alert(1)">
  5. Exploiting Browser Quirks
    • Some older browsers have quirks that allow XSS even with htmlspecialchars(). This is less common now, but it's worth considering if you're targeting a specific browser version.
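The context-dependent cases above can be sketched in a few lines. This is a minimal illustration, not a complete catalogue: esc() and safe_url() are hypothetical helpers written for this example, and the payloads are the usual proof-of-concept strings.

```php
<?php
// Hypothetical helper for this sketch: the escaping call the guide discusses.
function esc(string $value, int $flags = ENT_QUOTES | ENT_SUBSTITUTE): string
{
    return htmlspecialchars($value, $flags, 'UTF-8');
}

// 1. Single-quoted attribute escaped with ENT_COMPAT: the quote survives,
//    so the payload closes the attribute and adds an event handler.
$html1 = "<input value='" . esc("x' onmouseover='alert(1)", ENT_COMPAT) . "'>";
// <input value='x' onmouseover='alert(1)'>

// 2. Unquoted attribute: no special characters are needed at all,
//    a space is enough to start a new attribute.
$html2 = "<input value=" . esc("x onmouseover=alert(1)") . ">";
// <input value=x onmouseover=alert(1)>

// 3. URL attribute: the javascript: scheme contains nothing to encode.
$html3 = '<a href="' . esc("javascript:alert(1)") . '">link</a>';
// <a href="javascript:alert(1)">link</a>

// One possible mitigation for case 3: allow only http(s) URLs, then escape.
function safe_url(string $url): string
{
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? esc($url) : '#';
}

echo $html1, PHP_EOL, $html2, PHP_EOL, $html3, PHP_EOL;
echo safe_url('javascript:alert(1)'), PHP_EOL;          // #
echo safe_url('https://example.com/?a=1&b=2'), PHP_EOL; // https://example.com/?a=1&amp;b=2
```

In all three cases the escaping function did exactly what it is documented to do; the problem is that the surrounding markup put the value in a context where those five characters are not the only dangerous ones.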

Example Scenario

Let’s say the vulnerable code looks like this (PHP example):

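<?php
// Read the raw query parameter (empty string if it is missing).
$input = $_GET['param'] ?? '';
// Encode HTML special characters using the default flags.
$safe_input = htmlspecialchars($input);
// Element-content context: the escaped value sits between <p> tags.
echo "<p>" . $safe_input . "</p>";
?>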

If you input <script>alert(1)</script>, htmlspecialchars() will convert it to &lt;script&gt;alert(1)&lt;/script&gt;. However, if the context allows (e.g., within a script tag or by exploiting attribute context), you might still be able to inject JavaScript.
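To make the script-context case concrete, here is a hedged sketch of what happens when the same escaped value is placed inside a JavaScript string instead of a paragraph. The variable names and the inline script are assumptions made for illustration; the original scenario only shows the paragraph output.

```php
<?php
// Hypothetical attacker input; in the scenario above it would arrive via $_GET['param'].
// Single quotes keep the backslashes literal in PHP.
$input = '\u003cimg src=x onerror=alert(1)\u003e';

// The input contains none of the characters htmlspecialchars() encodes,
// so it passes through unchanged.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Unsafe: the JavaScript parser decodes \u003c and \u003e back to < and >,
// and innerHTML then parses the result as markup.
$unsafe = "<script>var name = '" . $escaped . "'; document.body.innerHTML = name;</script>";

// Safer sketch: json_encode() produces a JavaScript string literal and escapes
// the backslashes, so the escapes stay literal text; textContent avoids HTML parsing.
$literal = json_encode($input, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
$safer = "<script>var name = " . $literal . "; document.body.textContent = name;</script>";

echo $unsafe, PHP_EOL, $safer, PHP_EOL;
```

The design point is that each output context needs its own encoder: HTML entity encoding for markup, a JavaScript string encoder for script blocks.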

Prevention

  • Content Security Policy (CSP): Implement a strong CSP to restrict the sources from which scripts can be loaded.
  • Input Validation: Validate user input on the server-side and only allow expected characters.
  • Output Encoding: Use output encoding consistently, not just htmlspecialchars(). Consider using a templating engine that automatically escapes variables based on context.
  • Regular Security Audits: Regularly audit your code for XSS vulnerabilities.
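As one possible way to apply the CSP advice, the sketch below sends a nonce-based policy from PHP. The exact policy string is an assumption chosen for illustration and would need adjusting to the scripts a real site loads.

```php
<?php
// Generate a fresh, unpredictable nonce for each response.
$nonce = base64_encode(random_bytes(16));

// Hypothetical policy for this sketch: only same-origin scripts and inline
// scripts carrying the nonce may run.
$policy = "default-src 'self'; script-src 'self' 'nonce-{$nonce}'; object-src 'none'; base-uri 'none'";
header('Content-Security-Policy: ' . $policy);
?>
<script nonce="<?= htmlspecialchars($nonce, ENT_QUOTES, 'UTF-8') ?>">
  console.log('allowed inline script');
</script>
```

With a policy like this, an injected inline script or event handler without the nonce is refused by the browser, which limits the damage when an escaping mistake slips through.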