Ever seen the snippet below for Varnish virtual hosts and wondered how you're going to manage a dozen websites with a dozen if statements in your VCL file?
if (! req.http.Host) {
    error 404 "Need a host header";
}
set req.http.Host = regsub(req.http.Host, "^www\.", "");
set req.http.Host = regsub(req.http.Host, ":80$", "");
if (req.http.Host == "something.com") {
    include "/etc/varnish/site-something.com.vcl";
} elsif (req.http.Host == "somethingelse.com") {
    include "/etc/varnish/site-somethingelse.com.vcl";
}
Varnish is great, but it really lacks documentation and tutorials on setting up virtual hosts the right way.
Varnish Virtual Hosts
Why do we need virtual hosts in Varnish at all? It's a caching server. Out of the box, it applies the same caching policy no matter which domain name a request carries. It simply passes the request along to the backend server or, if the response is already in the Varnish cache, serves it directly without talking to Nginx or Apache.
But we do need virtual hosts in Varnish, because different sites use different technologies, different login pages and, most importantly, different cookie names. Cookies are the primary reason the need for Varnish virtual hosts exists: we want to filter different cookies for different sites.
In general, we need Varnish to distinguish between the sites so that it can adjust its caching policy for each specific website.
There is no built-in way to do this, and there likely never will be. However, once you understand how VCL works, you can define your virtual hosts much the way you love to do it in Nginx: through sites-available and sites-enabled directories. So let's go.
How Varnish VCL works
Before we proceed to implementing Varnish virtual hosts, let’s review the most important thing about VCL – how include files work.
When you start with a new Varnish installation, you begin coding in default.vcl. However, you have to realize one thing: Varnish has another file built in, with very basic default VCL rules. Let's call it builtin.vcl. Varnish appends the routines from builtin.vcl to the ones in our default.vcl, so they run after the ones in our VCL file.
The two files may define the same routines, e.g. vcl_recv in both files, and these routines would both run on every request, in this order:
- first, default.vcl
- last, builtin.vcl
So when the same routine is defined in several files, the definitions stack up, and the one from the file included last is called last.
If we include another file, say my.vcl, below the vcl_recv in default.vcl, and define vcl_recv in there as well, Varnish will run them in this order:
- vcl_recv from default.vcl
- vcl_recv from my.vcl
- vcl_recv from builtin.vcl
How is including multiple files useful?
To keep things flexible, Varnish stops as soon as a routine executes a return(...) statement: the definitions of the same routine that come later, in included files or in builtin.vcl, are not called.
It means that we can prevent Varnish default behavior (found in builtin VCL) by running specific logic on the same routine, and we can extend things further using include files.
So if vcl_recv in default.vcl executed return(...), then Varnish would only run:
- vcl_recv from default.vcl
Varnish Virtual Hosts strategy
So here's the strategy we should start with when we code our VCL for multiple hosts. Let's walk through it using that same routine, vcl_recv, which is the most important one, since it commonly holds the rules for filtering cookies or setting backend hints.
We assume you're using CentOS/RHEL-based paths; adjust accordingly for Debian-derived systems.
First, create a directory holding your virtual hosts:
mkdir /etc/varnish/sites-enabled
Suppose we have a site, a.example.com, which is a WordPress blog with comments disabled. We want it to ignore all cookies except in the login and admin areas (/wp-login and /wp-admin). Let's create the virtual host file:
nano /etc/varnish/sites-enabled/a.example.com.vcl
And paste in:
sub vcl_recv {
    if (req.http.host == "a.example.com") {
        # ignore all cookies on a WP site without comments (except for admin areas)
        if (req.url !~ "^/wp-(login|admin)") {
            unset req.http.cookie;
        }
    }
}
Now, another website of ours, b.example.com, is quite different. It's a Trac ticketing website, and it runs as a standalone Python app on a different port!
nano /etc/varnish/sites-enabled/b.example.com.vcl
And paste in:
backend trac {
    .host = "127.0.0.1";
    .port = "3050";
}
sub vcl_recv {
    if (req.http.host == "b.example.com") {
        set req.backend_hint = trac;
    }
}
Another website of ours, c.example.com, runs WordPress with the WooCommerce plugin. We don't want to cache WooCommerce pages there. So we run:
nano /etc/varnish/sites-enabled/c.example.com.vcl
And paste in:
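sub vcl_recv {
    if (req.http.host == "c.example.com") {
        # never cache cart, account and checkout pages, or add-to-cart requests
        # (the "?" is escaped so that it matches a literal question mark)
        if (req.url ~ "/(cart|my-account|checkout|addons)|\?add-to-cart=") {
            return (pass);
        }
    }
}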
We use Google Analytics tracking on every website, so let's create handling for all the hosts in the file /etc/varnish/catch-all.vcl with the following:
sub vcl_recv {
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
}
Next, we want to put everything together.
Update default.vcl in the following way:
vcl 4.0;
...
sub vcl_recv {
    ....
    # Normalize the header, remove the www and port
    set req.http.host = regsub(req.http.host, "^www\.", "");
    # anchored with $ so that only a trailing port is stripped
    set req.http.host = regsub(req.http.host, ":[0-9]+$", "");
}
...
# at the very bottom:
include "all-vhosts.vcl";
include "catch-all.vcl";
Create the all-vhosts.vcl file. It should contain:
include "sites-enabled/a.example.com.vcl";
include "sites-enabled/b.example.com.vcl";
include "sites-enabled/c.example.com.vcl";
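With more than a handful of sites, the include list can be generated instead of maintained by hand. The following is only a sketch: the function name generate_vhost_includes is made up for illustration, and it assumes every file in the directory that ends in .vcl should be included, in the alphabetical order the shell expands them.

```shell
#!/bin/sh
# Print one VCL include line per *.vcl file found in the given directory.
# The "sites-enabled/" prefix matches the relative paths used in all-vhosts.vcl.
generate_vhost_includes() {
    dir="$1"
    for f in "$dir"/*.vcl; do
        # Skip the unexpanded pattern when the directory has no .vcl files.
        [ -e "$f" ] || continue
        printf 'include "sites-enabled/%s";\n' "$(basename "$f")"
    done
}
```

It could be used as generate_vhost_includes /etc/varnish/sites-enabled > /etc/varnish/all-vhosts.vcl. Keep in mind that the order of the includes is the order in which the vcl_recv routines of the virtual hosts run.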
Now we can reload Varnish by running service varnish reload. Varnish will handle each website in its own specific way. Our main VCL file will not be cluttered with dozens of if statements, and we can always disable the special handling for a site by commenting out its include in the all-vhosts.vcl file and reloading again.
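Since a typo in any included file breaks the whole configuration, it can help to compile the VCL before reloading. Here is a minimal sketch, assuming varnishd is on the PATH and that default.vcl lives in /etc/varnish; the function name check_and_reload is hypothetical. It relies on varnishd -C -f, which compiles the given VCL file and exits without starting the daemon.

```shell
#!/bin/sh
# Compile the VCL first and reload Varnish only if compilation succeeds.
check_and_reload() {
    vcl="${1:-/etc/varnish/default.vcl}"
    if compile_log=$(varnishd -C -f "$vcl" 2>&1); then
        service varnish reload
    else
        # Show the compiler messages so the broken include can be found.
        printf '%s\n' "$compile_log" >&2
        return 1
    fi
}
```

Calling check_and_reload with no arguments checks the default file; a different path can be passed as the first argument.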
The basic rules of placing VCL logic this way are the following:
- vcl_recv() in default.vcl should contain things like normalising headers. It is crucial that this procedure does not call a return(...) statement.
- vcl_recv() in virtual host files like sites-enabled/a.example.com.vcl should contain filtering that is specific to this domain and may optionally call return(...) to halt further processing or filtering. It may also contain backend hints or rules to skip the cache for specific URLs.
- vcl_recv() in catch-all.vcl should contain only very common filtering, e.g. Google Analytics cookies or anything else that is common to all the sites.
You can start with the following sample configuration. Feel free to fork or send pull requests.
Photofolio
When creating the different .conf files, e.g.
nano /etc/varnish/sites-enabled/example3.com.conf
you do not create .vcl files but .conf files. Then in all-hosts.vcl you refer to .vcl files. I guess you meant to create the .conf files as .vcl files?
Danila Vershinin
Yes, you’re right. Thanks for noticing.
I’ve updated the post with the fixes and also added link to sample configuration repository at GitHub.
TheWriter
Thanks for this! A lot of conflicting stuff when Googling and this seems the most helpful. I have a few questions, if I may.
Update default.vcl in the following way:
Am I to replace everything in my default.vcl with what you've provided? What about my
backend default {
stuff?
I use W3TC (WordPress) to control the purging of Varnish and it requires some additional configuration for the default.vcl. Is this compatible? If you're willing to look, it's in «/wp-content/plugins/w3-total-cache/ini/varnish-sample-config.vcl»
I’ve added everything you’ve provided along with ^^ these two and so far I think it’s working… the site hasn’t crashed at least Lol.
So I'm using ServerPilot.io for my stack and the way they recommend, because I use SSL, it becomes nginx >> varnish >> apache >> php-fpm
But in their notes, perhaps, doing multiple vhost may be built in? https://serverpilot.io/community/articles/how-to-install-and-configure-varnish.html
Sorry for ALL these questions, I'm just looking for some insight. I've not been happy with my Redis configuration, my fastcgi config, etc… when my site gets pounded I see my server load and memory usage still climb to levels that I feel, with either or both of those configured, it shouldn't reach.
I think Varnish is the answer.
Thank you!
Danila Vershinin
You would overwrite everything in your default.vcl but, of course, keep the lines relevant to your configuration (backend definitions).
I do not think ServerPilot is worth it. I have a post about them here.
For invalidation of cache: you would be better off integrating the purge logic from the DreamHost VCL collection here. Just include it from your default.vcl file. It has superior logic that is more friendly to CPU, compared to the one in the W3TC sample.
MisUszatek
For some reason default.vcl shows an error on reload. Is it possible that a Let's Encrypt certificate can cause it? Any hint on how to set this up for Google Cloud / Debian / SSL / multiple domains on one instance via virtual hosts? Thank you
Danila Vershinin
What do you get when running service varnish status?
syscall0
Hello Danila, do you think your approach is still valid? Newer Varnish versions introduced support for multiple VCLs using labels: https://varnish-cache.org/docs/5.1/users-guide/vcl-separate.html
What do you think? Thanks
Danila Vershinin
Yes, it’s still valid. Moreover, I find it is cleaner than their suggested VCL.
The vcl.load command allows you to load and label VCL files. I believe this has been there for a while.
Then, instead of including domain-specific logic in the main VCL, the labeled VCLs are included in vcl_recv, via return (vcl(domain_label)); (this is new).
It is maybe cool because you can reload the configuration for just a single piece of Varnish logic (e.g. a website), as opposed to a complete VCL reload.
But it looks like you’ll have to write some custom startup scripts to make sure that all the files are loaded when you start Varnish.
And the ifs are not going anywhere: with many domains, you'll end up with a huge vcl_recv in the main file.
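For reference, such a startup script could look roughly like the sketch below. It is untested against a live Varnish and makes several assumptions: the function name load_labeled_vhosts and the vcl_/label_ naming scheme are invented for illustration, names are derived from the file names with every non-alphanumeric character turned into an underscore, and the main VCL that uses return (vcl(...)) is loaded separately after the labels exist. The only CLI commands used are vcl.load and vcl.label.

```shell
#!/bin/sh
# VARNISHADM may be set to "echo" to preview the commands without running them.
VARNISHADM="${VARNISHADM:-varnishadm}"

# Load every *.vcl file in a directory and attach a label to each one.
load_labeled_vhosts() {
    dir="$1"
    for f in "$dir"/*.vcl; do
        # Skip the unexpanded pattern when the directory has no .vcl files.
        [ -e "$f" ] || continue
        # Derive an identifier-safe name: a.example.com -> a_example_com
        name=$(basename "$f" .vcl | sed 's/[^A-Za-z0-9]/_/g')
        $VARNISHADM vcl.load "vcl_$name" "$f" || return 1
        $VARNISHADM vcl.label "label_$name" "vcl_$name" || return 1
    done
}
```

Running it with VARNISHADM=echo first prints the commands it would send, which is a cheap way to check the generated names.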