Cloudflare is a major global CDN and DNS service. We have blogged about it before, in our Cloudbleed and Varnish post.
Building your own Varnish-powered CDN is not a trivial task and, given that Cloudbleed was a rare incident for Cloudflare, you might well prefer to use their services.
But what are the challenges of putting the Cloudflare CDN in front of your NGINX-powered website?
NGINX and Cloudflare
When you put Cloudflare in front of your website, all the end visitors will connect to your website indirectly, via Cloudflare, and your server will only see Cloudflare IP addresses instead of the real remote visitors’ IP addresses.
To ensure that NGINX “sees” real visitor IP addresses, Cloudflare passes that data in a special HTTP header, X-Forwarded-For. So you can teach NGINX to use that header’s value as the client IP address:
real_ip_header X-Forwarded-For;
However, the challenge is to ensure that this header cannot be spoofed: its value should be trusted only when the request comes from a Cloudflare network.
If anyone knows your server’s IP address, they can connect to it directly and supply an arbitrary value for X-Forwarded-For, thus disguising themselves as someone else.
Cloudflare publishes its connecting IP addresses for IPv4 and IPv6 here.
To set up NGINX with Cloudflare, take those published IP ranges and add them to your NGINX configuration using the realip module’s set_real_ip_from directive:
set_real_ip_from 103.21.244.0/22;
set_real_ip_from 103.22.200.0/22;
set_real_ip_from 103.31.4.0/22;
set_real_ip_from 104.16.0.0/12;
... long list of networks follows
By doing this, we tell NGINX that if a request comes from any of those networks that belong to Cloudflare, it should replace the client IP address with the one sent in the X-Forwarded-For header.
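To make the trust rule concrete, here is a minimal Python sketch of the decision described above. It is a simplified model, not NGINX’s actual implementation: the function name client_ip is made up for illustration, only two of the networks listed above are included, and the example addresses come from documentation ranges.

```python
import ipaddress

# Two of the Cloudflare networks listed above; a real configuration lists all of them.
TRUSTED = [ipaddress.ip_network(n) for n in ("103.21.244.0/22", "104.16.0.0/12")]


def client_ip(peer: str, x_forwarded_for: str) -> str:
    """Return the address a realip-style rewrite would report (simplified model)."""
    peer_addr = ipaddress.ip_address(peer)
    if not any(peer_addr in net for net in TRUSTED):
        # Untrusted peer (a direct connection): the header is ignored entirely.
        return peer
    # Trusted peer: use the last address in the header, if there is one.
    parts = [p.strip() for p in x_forwarded_for.split(",") if p.strip()]
    return parts[-1] if parts else peer


# Request relayed by Cloudflare: the header value is used.
print(client_ip("104.16.1.1", "203.0.113.7"))    # 203.0.113.7
# Direct request with a forged header: the connecting address is kept.
print(client_ip("198.51.100.9", "203.0.113.7"))  # 198.51.100.9
```

The second call is the spoofing case: because the connecting address is outside the trusted networks, the forged header has no effect.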
Now you can see where this gets tedious. Whenever Cloudflare updates or expands its networks, you have to update your NGINX configuration manually and add the new networks to it.
But no more 🙂
Instead of entering those IP addresses by hand and checking them for updates yourself, here’s an automated solution using our always up-to-date packages for CentOS 7.
Install Cloudflare IPs list for NGINX
yum install https://extras.getpagespeed.com/release-latest.rpm
yum install nginx-cloudflare-ips-v4 nginx-cloudflare-ips-v6
Configure NGINX to trust Cloudflare IP addresses
Simply put the code below in e.g. /etc/nginx/conf.d/cloudflare.conf (or directly in the http { } configuration):
include /etc/nginx/cloudflare/realip-from-ipv4.conf;
include /etc/nginx/cloudflare/realip-from-ipv6.conf;
real_ip_header X-Forwarded-For;
real_ip_recursive on;
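What real_ip_recursive on changes can be sketched in Python as follows. With recursive search enabled, the realip module picks the last non-trusted address in the header instead of simply the last one. This is a simplified model under stated assumptions: the function name last_untrusted and the sample addresses are hypothetical, and invalid addresses are not handled (they would raise ValueError here).

```python
import ipaddress


def last_untrusted(peer, x_forwarded_for, trusted_cidrs):
    """Walk X-Forwarded-For right to left, skipping trusted proxy addresses."""
    nets = [ipaddress.ip_network(c) for c in trusted_cidrs]

    def trusted(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in nets)

    if not trusted(peer):
        return peer  # header is ignored for untrusted peers
    chain = [p.strip() for p in x_forwarded_for.split(",") if p.strip()]
    result = peer
    for addr in reversed(chain):
        result = addr
        if not trusted(addr):
            break  # first non-trusted address from the right is the client
    return result


cloudflare = ["104.16.0.0/12"]
# A trusted address at the end of the chain is skipped over.
print(last_untrusted("104.16.1.1", "203.0.113.7, 104.16.2.2", cloudflare))  # 203.0.113.7
# A forged entry further left is never reached.
print(last_untrusted("104.16.1.1", "192.0.2.66, 203.0.113.7", cloudflare))  # 203.0.113.7
```

If every address in the chain is trusted, this model falls back to the leftmost one.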
And that’s about it. Should Cloudflare update their IP addresses again, simply running yum update nginx-cloudflare-ips-* will fetch the latest IP list and magically reload the NGINX configuration for you. About it, but not quite. Why should we have to run yum update manually? 😀
Auto-update Cloudflare IP lists
Following our auto-upgrade approach for the Amplify package, we can do the same for selected packages from our repository.
Assuming that you have already set up yum-cron for automatic updates of Amplify, here are the remaining pieces for auto-updating the Cloudflare IP lists as well.
In a new file, /etc/yum.auto.repos.d/getpagespeed.repo, specify the noarch subrepository; includepkgs ensures that only the specified packages are auto-updated:
[getpagespeed-extras-noarch]
name=GetPageSpeed packages for Enterprise Linux - noarch
baseurl=https://extras.getpagespeed.com/redhat/7/noarch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-GETPAGESPEED
includepkgs=nginx-cloudflare-ips-*
About the nginx-cloudflare-ips-* packages
How do we ensure that they are always up-to-date? All thanks to our awesome repository that is smart enough to automatically build upstream stuff.
Specifically for these packages, our system checks the .txt resources published by Cloudflare, runs sanity checks to confirm that they contain network lists and not some other data, templates that data into NGINX configuration, and automatically packages a new release if the lists differ from those of 24 hours earlier.
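The check, validate, template, and compare steps can be illustrated with a short Python sketch. This is not the actual build system behind the packages; the function names (parse_networks, render, needs_release) are hypothetical, and fetching the list is left out so that the example stays self-contained.

```python
import ipaddress


def parse_networks(text):
    """Sanity check: every non-empty line must be a valid CIDR network."""
    nets = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Raises ValueError on anything that is not a network, e.g. an HTML error page.
        nets.append(ipaddress.ip_network(line))
    if not nets:
        raise ValueError("empty network list")
    return nets


def render(nets):
    """Template the networks into realip directives, one per line."""
    return "".join(f"set_real_ip_from {net};\n" for net in nets)


def needs_release(new_conf, previous_conf):
    """A new package is only worth building if the rendered config changed."""
    return new_conf != previous_conf


published = "103.21.244.0/22\n103.22.200.0/22\n"
conf = render(parse_networks(published))
print(conf, end="")
print(needs_release(conf, conf))  # False: nothing changed since the last run
```

Validating with ipaddress.ip_network before templating means a garbled download fails loudly instead of producing a broken NGINX configuration.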