NGINX honeypot – the easiest and fastest way to block bots!

NGINX Honeypot

The Internet is not a safe place these days. Hosting a public website means exposing it to attacks from malicious bots, which, at best, will cause extra CPU and I/O load on your server.

If your web server is NGINX, you may be rightfully tempted to use third-party WAF modules to counter the bad guys. One such module is nginx-module-security; another is NAXSI.

But what if I told you that there’s a trick that would allow your NGINX to easily filter out 99% of the bots out there, without third-party modules? Read on to find out how.

Know your average enemy (bot)

; tldr #1 – Evil bots try to upload

Suppose that you have a WordPress blog and, sure enough, the bad guys are checking whether they can find a weak spot. They do this by trying different upload endpoints of various plugins. As an example, one of the bots was requesting a whole series of upload scripts that belong to different plugins.

Those plugins most likely do not even exist on your website!

So what we can do is ban any IP that attempts to access such a resource. Honeypot resources are either known exploit targets that do not exist on our site, or existing locations that are not meant to be accessed by genuine users.

We will now add an NGINX honeypot that will work in a simple and effective way: when a malicious bot requests a known, yet non-existent upload location, NGINX will immediately ban their IP.

Let’s dive into implementation details for this kind of ban.

Pre-requisites

RHEL 7 based system, e.g. CentOS 7, and EPEL repository:

yum -y install epel-release

Setup FirewallD

We are going to create two FirewallD IP sets.
The two IP sets’ names are honeypot4 and honeypot6, for IPv4 and IPv6 addresses, respectively.

Any IP address placed in either of the two sets should be banned by the server firewall. To achieve this, we add both IP sets as sources of FirewallD’s drop zone.

firewall-cmd --permanent --new-ipset=honeypot4 --type=hash:ip --option=maxelem=1000000 --option=family=inet --option=hashsize=4096
firewall-cmd --permanent --new-ipset=honeypot6 --type=hash:ip --option=maxelem=1000000 --option=family=inet6 --option=hashsize=4096
firewall-cmd --permanent --zone=drop --add-source=ipset:honeypot4 
firewall-cmd --permanent --zone=drop --add-source=ipset:honeypot6 
firewall-cmd --reload

Next, we’ll teach our NGINX when and how to add IP addresses to those IP sets.

Setup NGINX honeypot locations

Our “trap” locations in NGINX will forward requests to a special FastCGI daemon powered by fcgiwrap.

Each of those honeypot locations will include exactly the same directives, which pass the request to the block-ip.cgi CGI script that we’ll create later.

To make things clean, we will include those directives from a separate file:

/etc/nginx/includes/honeypot.conf

    fastcgi_intercept_errors off;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME /usr/local/libexec/block-ip.cgi;
    fastcgi_pass unix:/run/fcgiwrap/fcgiwrap-nginx.sock;
    keepalive_timeout 0;

Add locations for some known plugin endpoints that do not really exist for your website:

/etc/nginx/sites-available/example.conf

location = /wp-content/plugins/ungallery/source_vuln.php {
    include includes/honeypot.conf;
}

location = /wp-content/plugins/barclaycart/uploadify/uploadify.php {
    include includes/honeypot.conf;
}

location = /wp-content/plugins/barclaycart/uploadify/settings_auto.php {
    include includes/honeypot.conf;
}

location = /wp-content/plugins/hd-webplayer/playlist.php {
    include includes/honeypot.conf;
}

location = /wp-content/plugins/cherry-plugin/admin/import-export/upload.php {
    include includes/honeypot.conf;
}

location = /wp-content/plugins/viral-optins/api/uploader/file-uploader.php {
    include includes/honeypot.conf;
}

Using exact location matching via = ensures that matching is fast and has priority over your existing \.php location.

Another approach would be regex matching. It may look somewhat cleaner as you can simply list all the plugins you don’t have in a single spot:

location ~ ^/wp-content/plugins/(ungallery|barclaycart|viral-optins|hd-webplayer)/ {
    include includes/honeypot.conf;
}

location ~* ^/phpmyadmin/ {
    include includes/honeypot.conf;
}

But hey, is there any good plugin out there that requires direct access to its PHP files under wp-content? None, of course.
So our honeypot locations can be greatly simplified to catch anyone trying to access PHP files in wp-content.
Furthermore, let’s add some locations that fall under the rule “existing locations that are not meant to be accessed by genuine users”.
Our final configuration snippet becomes:

location ~ ^/wp-content/.*\.php$ {
    include includes/honeypot.conf;
}

location = /wp-config.php { 
    include includes/honeypot.conf;
}

location = /wp-admin/install.php {
    include includes/honeypot.conf;
}

location ~* ^/phpmyadmin/ {
    include includes/honeypot.conf;
}

TIP: speaking of phpMyAdmin, don’t be someone who installs it, eh?

Regex matching is prone to configuration errors because the position of these directives relative to other regex locations matters. Be sure to place your honeypot regexes before your existing \.php location.
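To see why the order matters, here is a minimal Python sketch of how a location is picked. It is a simplified model and not NGINX's actual code: it only covers exact (=) and regex (~, ~*) locations and deliberately ignores prefix locations and ^~. The function name select_location and the tuple layout are made up for illustration.

```python
import re


def select_location(uri, locations):
    """Simplified model of NGINX location selection.

    `locations` is a list of (modifier, pattern) tuples in config order.
    Only "=", "~" and "~*" are modelled (an assumption of this sketch).
    """
    # Exact matches are checked first and end the search immediately.
    for modifier, pattern in locations:
        if modifier == "=" and uri == pattern:
            return (modifier, pattern)
    # Regex locations are tried in config order; the first hit wins.
    for modifier, pattern in locations:
        if modifier == "~" and re.search(pattern, uri):
            return (modifier, pattern)
        if modifier == "~*" and re.search(pattern, uri, re.IGNORECASE):
            return (modifier, pattern)
    return None


honeypot = ("~", r"^/wp-content/.*\.php$")
php = ("~", r"\.php$")
uri = "/wp-content/plugins/ungallery/source_vuln.php"

# Honeypot regex listed before the generic PHP location
print(select_location(uri, [honeypot, php]))
# Generic PHP location listed first: the honeypot is never reached
print(select_location(uri, [php, honeypot]))
# An exact location wins regardless of where it is placed
print(select_location("/wp-config.php", [php, ("=", "/wp-config.php")]))
```

In this model, the second call returns the generic \.php location, which is the misconfiguration the note above warns about.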

Install and configure fcgiwrap

fcgiwrap gives NGINX the ability to launch the bash script that bans IPs in the firewall. Technically, you could use whatever scripting engine your website already uses (PHP, in the case of WordPress). But for efficiency reasons, let’s avoid launching the PHP interpreter for banning.

yum -y install fcgiwrap

fcgiwrap conveniently ships with a service file that allows us to launch multiple instances under different users. E.g. the fcgiwrap@nginx.socket unit will launch the fcgiwrap service as the nginx user, listening on a UNIX socket.

To find the UNIX socket’s path easily, you can run:

systemctl status fcgiwrap@nginx.socket

This would reveal: /run/fcgiwrap/fcgiwrap-nginx.sock.

Ensure that the service is enabled and running:

systemctl start fcgiwrap@nginx.socket
systemctl enable fcgiwrap@nginx.socket

Create the CGI script

Let’s create a CGI script at /usr/local/libexec/block-ip.cgi.
The script will call the actual bash script for banning an IP, using sudo:

#!/bin/bash

echo "Status: 410 Gone"
echo "Content-type: text/plain"
echo "Connection: close"
echo

echo "Bye bye, $REMOTE_ADDR!"
sudo /usr/local/sbin/block-ip.sh

exit 0

Any time NGINX calls this script, it launches sudo /usr/local/sbin/block-ip.sh and passes along REMOTE_ADDR.

Make the CGI script executable:

chmod 0755 /usr/local/libexec/block-ip.cgi

Note on Keep-Alive

NGINX, like any other web server, supports keepalive connections.
Simply blocking an IP in the firewall is not sufficient, because it affects only future connections.
If bots are smart enough to use Keep-Alive (which is easy to implement), they can still make malicious requests over the initially established connection.

That’s where keepalive_timeout 0; in our NGINX honeypot include comes in handy. It instructs NGINX to close the connection with the malicious client after the blocking CGI script completes.

We also explicitly instruct the malicious client to close the connection via the Connection: close header.

Create the bash script to block an IP

Let’s create the bash script at /usr/local/sbin/block-ip.sh. It will be launched by NGINX via sudo and will block an IP address.

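#!/bin/bash

# Use REMOTE_ADDR from the environment; fall back to the first argument
if [[ -z "${REMOTE_ADDR}" ]]; then
    if [[ -z "$1" ]]; then
        echo "REMOTE_ADDR not set!"
        exit 1
    else
        REMOTE_ADDR=$1
    fi
fi

# Put a space-separated list of trusted IP addresses here, so that you don't lock yourself out when testing the honeypot! : )
TRUSTED_IPS=(1.2.3.4 2.3.4.5 127.0.0.1)
# -x matches whole lines and -F takes the address literally, so dots are not regex wildcards
if printf '%s\n' "${TRUSTED_IPS[@]}" | grep -q -x -F -- "$REMOTE_ADDR"; then
    echo "Trusted IP"
    exit 0
fi

# Pick the IP set by address family: "digit.digit" means IPv4, a colon followed by a hex digit means IPv6
if [[ "$REMOTE_ADDR" != "${REMOTE_ADDR#*[0-9].[0-9]}" ]]; then
    /sbin/ipset add honeypot4 "${REMOTE_ADDR}"
    # Delete connection tracking entries for this source address
    /sbin/conntrack -D -s "${REMOTE_ADDR}"
elif [[ "$REMOTE_ADDR" != "${REMOTE_ADDR#*:[0-9a-fA-F]}" ]]; then
    /sbin/ipset add honeypot6 "${REMOTE_ADDR}"
    /sbin/conntrack -D -s "${REMOTE_ADDR}"
else
    echo "Unrecognized IP format '$REMOTE_ADDR'"
fi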
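
For illustration only, the decision logic of the script can be sketched in Python. This is not part of the setup, just a model of the same checks. It swaps the shell pattern test for the standard ipaddress module, which validates the address strictly. The function name choose_ipset is hypothetical, and the trusted addresses are the same placeholders as in the script.

```python
import ipaddress

# Same placeholder addresses as in the bash script
TRUSTED_IPS = {"1.2.3.4", "2.3.4.5", "127.0.0.1"}


def choose_ipset(remote_addr):
    """Return the IP set an address would be added to, or None if it is not banned."""
    # Trusted addresses are compared literally, never as a pattern.
    if remote_addr in TRUSTED_IPS:
        return None
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IPv4 or IPv6 literal: nothing to ban.
        return None
    return "honeypot4" if ip.version == 4 else "honeypot6"


for addr in ("203.0.113.7", "2001:db8::1", "127.0.0.1", "not-an-ip"):
    print(addr, "->", choose_ipset(addr))
```

Note the literal comparison for trusted addresses: if they were matched as regular expressions, a dot would stand for any character, so an entry like 1.2.3.4 would also cover an address such as 1.2.314.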

We could also use firewall-cmd to add entries to the IP sets, but we want to avoid the overhead of starting the Python interpreter. On a 1 CPU VPS, firewall-cmd --ipset=honeypot4 --add-entry=... took 0m0.494s to run, while the pure binary ipset add honeypot4 ... took only 0m0.002s!

That said, if you don’t care about the millisecond differences between different ways of blocking, the fds program offers the cleanest approach.

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum -y install fds

With fds, our script becomes:

#!/bin/bash

# Use REMOTE_ADDR from the environment; fall back to the first argument
if [[ -z "${REMOTE_ADDR}" ]]; then
    if [[ -z "$1" ]]; then
        echo "REMOTE_ADDR not set!"
        exit 1
    else
        REMOTE_ADDR=$1
    fi
fi

# Put a space-separated list of trusted IP addresses here, so that you don't lock yourself out when testing the honeypot! : )
TRUSTED_IPS=(1.2.3.4 2.3.4.5 127.0.0.1)
# -x matches whole lines and -F takes the address literally, so dots are not regex wildcards
if printf '%s\n' "${TRUSTED_IPS[@]}" | grep -q -x -F -- "$REMOTE_ADDR"; then
    echo "Trusted IP"
    exit 0
fi

fds block "${REMOTE_ADDR}" --ipset honeypot

sudo!!

What else is missing? NGINX runs as a non-privileged user and can’t sudo /usr/local/sbin/block-ip.sh yet. We want to allow the nginx user to run the script as the superuser. Moreover, for security reasons, we will allow only the REMOTE_ADDR environment variable to be passed when launching the script.

So create the sudoers configuration by running sudo visudo -f /etc/sudoers.d/nginx-block-ip.
This command will open the system editor (likely vim). Simply paste in:

Defaults!/usr/local/sbin/block-ip.sh env_keep=REMOTE_ADDR
nginx ALL=(ALL) NOPASSWD: /usr/local/sbin/block-ip.sh

Then save and close the editor, e.g. by typing :wq in vim, or Ctrl+X in nano.

That’s it! Restart NGINX and you’re good to go: NGINX will start banning bots for you. But there’s even more you can do… 🙂

Host header vulnerability protection

The honeypot approach is great when you want to protect your server from the hostname injection vulnerability.

The Host header vulnerability happens when a server or a website trusts the Host header in incoming requests without checking if it’s safe. This header is supposed to tell the server which website the visitor wants to see. But if bad guys change the Host header to something malicious, they can trick the server into sending visitors to a fake or dangerous place, steal information, or cause other security problems. It’s like sending a letter with the wrong address on purpose and then intercepting it to cause mischief.

; tldr #2 – Evil bots don’t even know your domain name!

Most “current wave” bots only know your IP, because they scan public IPv4 ranges, iterating over one IP after another in search of a victim.
This is also where the Host header vulnerability comes into play: they just put a random or intentionally incorrect Host header in their requests.

Those bots will share these common characteristics:

So you can greatly reduce the load from those bots by blocking any client that does not provide a valid hostname. Obviously, valid hostnames are simply the domains that you host on your server; any other domain is an invalid hostname.

In /etc/nginx/nginx.conf, set up a map listing all your website domain names:

map $http_host $default_host_match {
    getpagespeed.com 1;
    www.getpagespeed.com 1;
    default 0;
}

In the server block of each of your websites, add a honeypot for bad hostnames:

error_page 410 = @honeypot;
if ($default_host_match = 0) {
    return 410;
}
location @honeypot {
    include includes/honeypot.conf;
}

With the configuration above, NGINX will check if the requested domain is in the list specified in nginx.conf. If there is no match, a 410 status code is returned, which is handled by the @honeypot named location, which in turn launches our bash script for banning.
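
The flow can be modelled with a few lines of Python. This is a rough sketch rather than a reimplementation of NGINX: it assumes that map compares plain strings case-insensitively and that $http_host carries the Host header exactly as the client sent it. The function names default_host_match and handle are made up for illustration.

```python
# Hostnames listed in the map block
VALID_HOSTS = {"getpagespeed.com", "www.getpagespeed.com"}


def default_host_match(http_host):
    """Model of the map block: 1 for a listed hostname, 0 (the default) otherwise."""
    return 1 if (http_host or "").lower() in VALID_HOSTS else 0


def handle(http_host):
    """Return which part of the configuration would handle the request."""
    if default_host_match(http_host) == 0:
        # return 410 is routed by error_page 410 = @honeypot
        return "@honeypot"
    return "normal processing"


for host in ("www.getpagespeed.com", "203.0.113.10", "", "getpagespeed.com:8080"):
    print(repr(host), "->", handle(host))
```

Under these assumptions, a Host header that includes a port would not match the listed names either, so it may be worth checking your access logs for such requests before enabling the ban.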

Caveats

In our important honeypot location ^/wp-content/.*\.php$, which denies access to all PHP files under /wp-content, there is a slight chance that you have a bad WordPress plugin that relies on direct access to its PHP files there. Such plugins should be reported and dealt with. But you certainly don’t want to block valid users from your website.

As an extra precaution, you may want to temporarily use return 411; in this location and monitor your traffic with a script:

import os
import re

# Prepare the input first:
# grep 411 logs/access.log | grep wp-content > analyze.log

# Document root that the requested URIs are relative to
DOCROOT = "httpdocs"

# Read the entire contents of the filtered log
with open('analyze.log', 'r') as f:
    log = f.read()

# Extract the requested URI from each logged request line
uris = re.findall(r'"(?:GET|POST) (?P<uri>\S*)', log)
for uri in uris:
    # Strip the query string
    uri = uri.split('?')[0]
    file_path = DOCROOT + uri
    # Report only files that actually exist on disk
    if os.path.exists(file_path):
        print(file_path + " exists and was requested!")

If the script prints nothing, it means that no actual PHP files of your WordPress plugins are being accessed directly. If it does print something, those plugins should be removed or replaced.
As a last resort, you can whitelist those files in the honeypot location. For example, if you must have /wp-content/plugins/wp-invoice/lib/gateways/js/wpi_gateways.js.php executed, then instead of ^/wp-content/.*\.php$ use ^/wp-content/(?:(?!plugins/wp-invoice/lib/gateways/js/wpi_gateways\.js\.php).)*\.php$.
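
Before putting a whitelist regex like this into production, it may help to try it against a few URIs. The sketch below does that with Python's re module, on the assumption that its lookahead syntax behaves the same as the PCRE library used by NGINX for this pattern. The sample URIs are illustrative.

```python
import re

# The original catch-all honeypot pattern
PLAIN = re.compile(r"^/wp-content/.*\.php$")
# The same pattern with one file excluded via a negative lookahead
WHITELISTED = re.compile(
    r"^/wp-content/(?:(?!plugins/wp-invoice/lib/gateways/js/wpi_gateways\.js\.php).)*\.php$"
)

allowed = "/wp-content/plugins/wp-invoice/lib/gateways/js/wpi_gateways.js.php"
trapped = "/wp-content/plugins/ungallery/source_vuln.php"

for uri in (allowed, trapped):
    # True means the request would land in the honeypot location
    print(uri, bool(PLAIN.search(uri)), bool(WHITELISTED.search(uri)))
```

The whitelisted file should match the plain pattern but not the modified one, while every other PHP file under /wp-content still matches both.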

How this compares to anything

Of course, you should not use this approach alone. There is never “enough security”, and you should also use Fail2ban, Malware Detect, and ModSecurity.

However, we can see how the honeypot approach can complement the mentioned tools and offset their disadvantages.

For example:

So other tools may be either slow or “too late”. They are certainly useful, but we can do better with the additional security layer provided by the NGINX honeypot approach.
