Regex & Input Risks: A Security Guide

TL;DR

Letting users provide both the regular expression and the input to be matched against it is dangerous. The main risk is a Denial of Service (DoS) attack: a crafted pattern and input can keep a CPU core busy for minutes or longer. Arbitrary code execution is not a typical outcome, but it becomes possible if the regex engine lets patterns embed code or has a memory-safety bug. This guide explains why and how to fix it.

Why it’s risky

Regular expressions (regex) are powerful tools for pattern matching. However, in backtracking engines such as Python’s re module, some patterns take time that grows exponentially with the length of the input when the input almost matches but ultimately fails. Exploiting this is called ‘ReDoS’ (Regular Expression Denial of Service). When an attacker controls both the regex and the input, they can craft a combination that causes your server to hang or exhaust its CPU.

How attackers exploit this

Imagine you have a function like this (example in Python):

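import re

def match_string(regex, input_string):
  try:
    # re.search returns a match object or None, so test the result explicitly
    return re.search(regex, input_string) is not None
  except re.error:
    # The user-supplied pattern did not compile
    return False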

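An attacker could supply a regex like ^(a+)+$ and an input such as 30 "a" characters followed by "!". Both look harmless, but the trailing "!" means the match can never succeed, so the engine backtracks through every way of splitting the run of a's between the inner and outer +. There are roughly 2^n such splits for n a's, so each extra character about doubles the processing time.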
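
The growth is easy to observe on a small scale. The sketch below times the pattern from the example against inputs of increasing length. The helper name time_match and the chosen lengths are illustrative assumptions, and absolute timings will vary by machine and Python version.

```python
import re
import time

PATTERN = re.compile(r"^(a+)+$")

def time_match(n):
    """Return the seconds spent trying to match n 'a' characters followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = PATTERN.search(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' makes a match impossible
    return elapsed

if __name__ == "__main__":
    # Keep n small: each extra 'a' roughly doubles the running time.
    for n in (10, 14, 18, 22):
        print(f"n={n:2d}  {time_match(n):.4f}s")
```

Pushing n much beyond the values shown is not recommended on a machine you care about, since the run time keeps doubling.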

How to protect yourself

Avoid letting users supply regex directly whenever possible. Offer pre-defined patterns or a limited set of options that you control instead.
If you *must* accept a user-supplied regex, validate it rigorously:

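Limit complexity: Restrict the length of the regex string and of the input it is matched against. This reduces the attack surface but is not sufficient on its own: ^(a+)+$ is only seven characters long.
Disallow risky constructs: Nested quantifiers such as (a+)+, alternations whose branches overlap such as (a|aa)+, and backreferences (\1) are common causes of ReDoS. Reject user-supplied regex that contains them. Possessive quantifiers (++, *+) and atomic groups are not a cause: they suppress backtracking and act as a mitigation where the engine supports them.
Character restrictions: Limit the characters allowed in the regex to a safe set. Avoid allowing metacharacters like [, ], ^, $ without careful escaping and validation. If users only need to search for literal text, escape the whole string (re.escape in Python) so that no metacharacter is interpreted.
Timeouts: Set a maximum execution time for the regex matching operation. This stops a single slow match from tying up a worker indefinitely.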
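
Some of the checks in the list above could be combined roughly as follows. This is a minimal sketch, not a complete defence: the limits, the function names validate_user_regex and literal_search, and the two detection patterns are assumptions made for illustration, and the nested-quantifier check is a heuristic that will miss some dangerous patterns and reject some safe ones.

```python
import re

MAX_PATTERN_LENGTH = 64   # assumed limit; tune for your application
MAX_INPUT_LENGTH = 1024   # assumed limit on the text being searched

# Heuristic only: a group that contains + or * and is itself quantified,
# for example (a+)+ or (\d*)*. It does not catch every dangerous pattern.
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")
# Numbered (\1) and named ((?P=name)) backreferences.
BACKREFERENCE = re.compile(r"\\[1-9]|\(\?P=")

def validate_user_regex(pattern):
    """Return a compiled pattern, or raise ValueError if it is rejected."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern too long")
    if NESTED_QUANTIFIER.search(pattern):
        raise ValueError("nested quantifiers are not allowed")
    if BACKREFERENCE.search(pattern):
        raise ValueError("backreferences are not allowed")
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid pattern: {exc}") from exc

def literal_search(needle, haystack):
    """Treat user text as a literal string rather than as a regex."""
    return re.search(re.escape(needle), haystack[:MAX_INPUT_LENGTH]) is not None
```

Because the heuristics are incomplete, a validator like this is best paired with a timeout rather than used in place of one.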

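Use a safe regex engine or library: Some regex engines are more resistant to ReDoS than others. Engines built on finite automata, such as RE2 or Rust’s regex crate, guarantee matching time linear in the input length, at the cost of dropping backreferences and lookaround.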
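
The first recommendation above, pre-defined patterns, can be as simple as a lookup table: the user chooses a pattern by name and never sends regex syntax to the server. The catalogue entries and the function name match_named below are hypothetical examples, not part of any library.

```python
import re

# Hypothetical catalogue: the names and patterns are examples only.
ALLOWED_PATTERNS = {
    "digits": re.compile(r"[0-9]+"),
    "hex_color": re.compile(r"#[0-9a-fA-F]{6}"),
    "slug": re.compile(r"[a-z0-9-]{1,64}"),
}

def match_named(pattern_name, input_string):
    """Match the input against a server-defined pattern chosen by name."""
    pattern = ALLOWED_PATTERNS.get(pattern_name)
    if pattern is None:
        raise KeyError(f"unknown pattern: {pattern_name!r}")
    # fullmatch requires the whole input to match, not just a substring
    return pattern.fullmatch(input_string) is not None
```

Since every pattern is written and reviewed by you, each one can be checked for ReDoS once, ahead of time.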

Example Timeout Implementation (Python)

Here’s how you can add a timeout to the Python example:

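import re
import signal

def match_string(regex, input_string, timeout=1):
  # Unix only: SIGALRM does not exist on Windows, and signal.signal()
  # may only be called from the main thread.
  def handler(signum, frame):
    raise TimeoutError("Regex execution timed out")

  previous = signal.signal(signal.SIGALRM, handler)
  signal.alarm(timeout) # Set the alarm for 'timeout' whole seconds

  try:
    # Return the actual match result instead of always returning True
    return re.search(regex, input_string) is not None
  except TimeoutError:
    print("Regex timed out!")
    return False
  except re.error:
    return False # The pattern did not compile
  finally:
    signal.alarm(0) # Disable the alarm
    signal.signal(signal.SIGALRM, previous) # Restore the previous handler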

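This code sets a 1-second timeout for the regex execution. If the match takes longer than that, the handler raises a TimeoutError, which the function catches and reports as no match. The approach relies on SIGALRM, so it works only on Unix-like systems and only in the main thread; elsewhere, run the match in a separate process that can be terminated.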

Testing

ReDoS testing tools: Use a ReDoS checker (search for ‘redos tester’) to check whether the regex patterns you allow are vulnerable, and try them against various inputs.
Fuzzing: Generate random regex and input combinations, run them under a timeout, and record the cases that are slow.
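
For a quick local check, a targeted probe can stand in for a full fuzzer. The sketch below is an assumption-laden illustration, not a substitute for a dedicated tool: it feeds a pattern a growing run of one filler character followed by a suffix that should prevent a match, and reports the first length at which a single match attempt exceeds a time threshold. The function name first_slow_length and its default values are invented for this example.

```python
import re
import time

def first_slow_length(pattern, filler="a", suffix="!", max_len=26, threshold=0.05):
    """Return the smallest tested length whose match attempt takes longer than
    'threshold' seconds, or None if every length up to max_len stays below it."""
    compiled = re.compile(pattern)
    for n in range(1, max_len + 1):
        text = filler * n + suffix
        start = time.perf_counter()
        compiled.search(text)
        if time.perf_counter() - start > threshold:
            # Stop at the first slow case so the probe itself cannot run for long
            return n
    return None

if __name__ == "__main__":
    for candidate in (r"^(a+)+$", r"^a+$"):
        print(candidate, "->", first_slow_length(candidate))
```

A result of None only means this particular input shape did not trigger slow matching; other fillers and suffixes may behave differently.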
