I’m addicted to investigating server log files. Recently I stumbled upon some very unusual requests: POSTs to the homepage. Sure enough, there’s no form on the homepage that could have produced them.
The requests look like this:
5.75.71.6 - - [17/Sep/2017:12:27:28 +0000] "POST / HTTP/1.1" 200 5953 "-" "-" "5.75.71.6" "some-funny-hostname" sn="www.example.com" rt=0.241 ua="unix:/var/run/php-fpm/php-fpm-example.com.sock" us="200" ut="0.237" ul="20755" cs=-
The IP addresses behind these requests originated from Iran or China every single time.
OK, let’s investigate what data they are actually posting to us. I adjusted the main script of the website in question to log the relevant data to a separate log file. At the top of index.php, I added:
// Log the details of any POST to the homepage to a separate file.
if ($_SERVER['REQUEST_METHOD'] === 'POST' && $_SERVER['REQUEST_URI'] === '/') {
    $fp = fopen(__DIR__ . '/request.log', 'a');
    if ($fp !== false) {
        fwrite($fp, print_r($_SERVER, true));   // headers and server variables
        fwrite($fp, print_r($_REQUEST, true));  // submitted parameters, if any
        fclose($fp);
    }
}
Investigating the resulting request.log file, I found that they are in fact not posting any data. However, the HTTP Host header (as was already obvious from the nginx log) is always some funny, spammy website name.
Now it became obvious to me what is going on:
- Those are spam bots
- They submit the advertised website’s name in the Host header in order to appear in your Google Analytics reports
- This will happen only with the website that is set as default on your server (nginx: default_server directive, or the first one listed)
- They submit the requests as POST in order to bypass any caching. POST requests are typically not cached, so they create unnecessary load on the server besides spamming your Google Analytics!
So even if you don’t have Google Analytics in the first place, those spam bots will put extra strain on your server, which is quite concerning. Let’s put those bots to rest with some simple nginx configuration.
The ultimate fix here would be simply to make nginx drop requests to hostnames (websites) that you know are not yours.
One implementation of that fix would involve creating a “catch-all” default server in nginx like this:
server {
    listen 80 default_server;
    listen 443 ssl default_server;
    ssl_certificate dummy.crt; ...
    server_name _; # some invalid name that won't match anything
    return 444;
}
Whenever someone requests a hostname that is not defined anywhere in your nginx configuration, this server block is used, and nginx silently drops the request. That is exactly what the special status code 444 does: it is unique to nginx and closes the connection without sending a response.
While this would definitely work, I’m not a fan of breaking things while fixing them. What would break, you might ask?
The fix above would break SSL for browsers that do not support SNI. Because they talk directly to the site we have defined as default_server when initiating a connection, they would simply get dropped connections.
So what we need here is different. We need to drop requests whose host name is irrelevant to our main site’s configuration, but only after the SSL connection has already been established.
Easy proper fix:
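server {
    listen 80 default_server;
    listen 443 ssl default_server;
    ssl_certificate example.com.crt; ...
    server_name example.com;
    # Drop requests whose Host header does not contain our domain.
    # Note: the regex is unanchored, so any Host containing "example.com" passes.
    if ($http_host !~ "example\.com") {
        return 444;
    }
    # ... rest of the site configuration
}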
As easy as that:
- SNI still works
- Irrelevant requests are simply discarded by nginx
- Your server is happy and has less load
Further improvements to the config would be:
- Use the nginx map directive to match hostnames without regex
- Add a Fail2ban configuration to pick up on those discarded requests and block the IP addresses
All of this is already implemented in the Citrus server stack. Interested? We can install and configure your super fast web server!
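The map-based matching mentioned above could look roughly like the following. This is a minimal sketch, not the Citrus implementation: the variable name $reject_host is made up for illustration, and the list of allowed names (example.com and www.example.com) is an assumption you would replace with your own hostnames. Keep in mind that $http_host holds the Host header exactly as the client sent it, so a request carrying an explicit port (for example example.com:8080) would not match these entries.

```nginx
# Place the map in the http {} context, outside any server {} block.
# $reject_host is a hypothetical variable name used only in this sketch.
map $http_host $reject_host {
    default          1;   # any Host not listed below gets rejected
    example.com      0;
    www.example.com  0;
}

server {
    listen 80 default_server;
    listen 443 ssl default_server;
    server_name example.com;
    # ssl_certificate and the rest of the site configuration stay as before

    # An nginx "if" is false for an empty string or "0", true otherwise.
    if ($reject_host) {
        return 444;   # close the connection without sending a response
    }
}
```

Compared to the regex check, the map does exact string lookups, so a hostname that merely contains example.com is still rejected.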