TL;DR
Using the HTTP referrer header can help block simple spam bots by identifying requests that don’t come from legitimate web pages. It’s not foolproof, but it adds a useful layer of protection.
How to Block Spam Bots with HTTP Referrer Checks
- Understand the HTTP Referrer Header: The HTTP referrer (or referer) header tells a server where the user came from before requesting a page. For example, if someone clicks a link on example.com to visit your site, the referrer header will usually contain example.com.
- Bots often don’t send a referrer header or send an invalid one.
- Users can disable referrer sending in their browser settings.
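To make the header's contents concrete, here is a minimal sketch in Python (chosen only for illustration; the function name `referrer_host` is hypothetical). It extracts the host name from a referrer value and returns `None` when the header is missing, empty, or not a usable http(s) URL, which are the cases a simple bot tends to produce.

```python
from typing import Optional
from urllib.parse import urlsplit


def referrer_host(header_value: Optional[str]) -> Optional[str]:
    """Return the lowercase host from a Referer header, or None if unusable."""
    if not header_value or not header_value.strip():
        return None  # header missing or empty
    try:
        parts = urlsplit(header_value.strip())
    except ValueError:
        return None  # could not be parsed as a URL at all
    if parts.scheme not in ("http", "https"):
        return None  # e.g. a bare word or a non-web scheme
    return parts.hostname  # lowercased by urlsplit; None if no host is present


print(referrer_host("https://example.com/some/page"))
print(referrer_host(""))
print(referrer_host("not a url"))
```

A server-side check can then compare the returned host against the domains it expects.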
- Server-Side Implementation: You need to implement the check on your web server. The method varies depending on your server software.
- Apache (.htaccess): Add rules to your .htaccess file. Comments sit on their own lines, because Apache does not allow a comment on the same line as a directive.

      RewriteEngine On
      # Block requests with no referrer
      RewriteCond %{HTTP_REFERER} ^$
      RewriteRule .* - [F,L]
      # Block requests from other domains (replace yourdomain.com)
      RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com/
      RewriteRule .* - [F,L]

- Nginx: Configure the check in your Nginx configuration file.

      server {
          ...
          valid_referers yourdomain.com;
          if ($invalid_referer) {
              return 403;  # Block invalid referrers
          }
          ...
      }

- PHP: Use the $_SERVER['HTTP_REFERER'] variable.
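For an application-level check like the PHP option, the logic can be sketched as follows. This is an illustrative example in Python rather than PHP, written as WSGI middleware. The names `ALLOWED_HOSTS`, `referrer_allowed`, `ReferrerGate`, and `hello_app` are assumptions made for the sketch, and `yourdomain.com` is the same placeholder used above.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"yourdomain.com", "www.yourdomain.com"}  # placeholder domain


def referrer_allowed(header_value):
    """True only if the Referer is an http(s) URL on an allowed host."""
    try:
        parts = urlsplit(header_value or "")
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


class ReferrerGate:
    """WSGI middleware: answer 403 unless the Referer passes the check."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the Referer header under the key HTTP_REFERER,
        # the same key PHP uses in $_SERVER.
        if not referrer_allowed(environ.get("HTTP_REFERER")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = ReferrerGate(hello_app)
```

Like the server rules above, this sketch rejects requests that carry no referrer at all, so it also turns away direct visits.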
- Whitelisting: Allow legitimate referrers.
- Your own domain is essential.
- If you use a CDN, whitelist its domains.
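- Consider allowing search engine crawlers (e.g., Googlebot) if needed. Crawlers typically send no referrer, so they have to be recognized another way, such as by user agent. Be cautious, as this can also allow abuse: a user agent is as easy to fake as a referrer.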
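One way to express such an allowlist is sketched below in Python. The CDN domain `cdn.example.net` and the names `ALLOWED_EXACT`, `ALLOWED_SUFFIXES`, and `host_is_allowed` are hypothetical. The sketch matches whole host names or dot-prefixed suffixes, so a lookalike such as notyourdomain.com does not slip through the way it would with a plain substring test.

```python
ALLOWED_EXACT = {"yourdomain.com", "www.yourdomain.com"}  # your own site
ALLOWED_SUFFIXES = (".cdn.example.net",)  # hypothetical CDN; matches its subdomains


def host_is_allowed(host: str) -> bool:
    """Check a referrer host against exact names and dot-prefixed suffixes."""
    host = host.lower().rstrip(".")  # normalize case and a trailing root dot
    if host in ALLOWED_EXACT:
        return True
    # Each suffix starts with a dot, so only true subdomains match.
    # The bare host "cdn.example.net" would need its own exact entry.
    return host.endswith(ALLOWED_SUFFIXES)


for candidate in ("yourdomain.com", "assets.cdn.example.net", "notyourdomain.com"):
    print(candidate, host_is_allowed(candidate))
```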
- Testing: Thoroughly test your implementation.
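- Use a tool like curl to simulate requests with and without referrers:

      # No referrer header (curl sends none by default)
      curl -v https://yourdomain.com
      # Referrer from your own site (should be allowed)
      curl -H "Referer: https://yourdomain.com/" https://yourdomain.com
      # Referrer from another site
      curl -H "Referer: http://badsite.com" https://yourdomain.com
      # Referrer header present but empty (the trailing semicolon makes curl send an empty value)
      curl -H "Referer;" https://yourdomain.com

- Check your server logs for blocked requests.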
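Checking the logs for blocked requests can be partly automated. The sketch below, in Python, assumes the access log uses the common "combined" format, in which the status code follows the quoted request line and the referrer is the next quoted field. The sample lines are invented for illustration, and `blocked_by_referrer` is a hypothetical helper name.

```python
import re
from collections import Counter

# Quoted request line, status code, response size, quoted referrer.
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def blocked_by_referrer(lines):
    """Count 403 responses per referrer value ("-" means none was sent)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") == "403":
            counts[match.group("referrer")] += 1
    return counts


# Invented sample lines in combined log format.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /contact HTTP/1.1" 403 199 "http://badsite.com/" "curl/8.0"',
    '203.0.113.5 - - [10/Oct/2024:13:55:40 +0000] "GET /contact HTTP/1.1" 403 199 "-" "curl/8.0"',
    '198.51.100.7 - - [10/Oct/2024:13:56:02 +0000] "GET /contact HTTP/1.1" 200 5120 "https://yourdomain.com/" "Mozilla/5.0"',
]

for referrer, count in blocked_by_referrer(sample).most_common():
    print(count, referrer)
```

A sudden rise in blocked requests with a legitimate-looking referrer is a hint that the rules are producing false positives.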
- Limitations and Considerations:
- Browser Settings: Users can disable referrer sending, so legitimate users might be blocked.
- Proxy Servers: Proxies may strip or modify the referrer header.
- Bots Can Spoof: Sophisticated bots can spoof the referrer header.
- False Positives: Be careful not to block too many legitimate requests.
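To show how little effort spoofing takes, the following Python sketch builds a request with a forged referrer using only the standard library. The URL path `/contact` is a hypothetical example, and the request is only constructed and inspected here, not sent.

```python
import urllib.request

# Any HTTP client can set the Referer header to an arbitrary value.
request = urllib.request.Request(
    "https://yourdomain.com/contact",  # hypothetical form URL
    headers={"Referer": "https://yourdomain.com/"},  # forged: no page was visited
)

# Inspect the header the client would send; nothing goes over the network.
print(request.get_header("Referer"))
```

A server that relies on the referrer alone cannot tell this request apart from one made by a real visitor.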
This method is best used as one layer in a multi-layered approach to spam bot prevention, alongside other techniques like CAPTCHAs and rate limiting.