NGINX: try_files is evil too

The benefits of try_files

NGINX has many useful directives that allow you to set up websites in a clean and consistent way.
try_files is one of those handy directives. It lets you set up a website with SEO-friendly URLs.

Most websites follow the front-controller rewrite pattern.
The requests for pretty SEO URLs are routed through a bootstrap file of your PHP framework, e.g. /index.php.

Let’s see the typical config for this:

index index.html index.php;
location / {
    # This is cool because no PHP is touched for static content.
    # include "$is_args$args" so non-default permalinks don't break when a query string is used
    try_files $uri $uri/ /index.php$is_args$args;
}
location ~ \.php$ {
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}

The comment makes it very clear: the major win of try_files is serving static files without touching PHP-FPM.
In other words, only NGINX is involved in serving any static files, which is cool indeed.
Can it get cooler though?

The usefulness of the try_files directive rests entirely on the assumption that you don’t know where all your static files are located.
So simply dropping this configuration into a new NGINX setup makes most websites just work.

And every static file that exists on the file system is served directly by NGINX, which is a great performance benefit.

Performance penalty of try_files

Assuming too much is never a good thing. try_files comes with a performance penalty: file existence checks.
Such checks may seem negligible, but as your traffic grows, you will want to reduce disk operations to improve response latency.

try_files evaluates its arguments from left to right, while doing file existence checks against them, except the very last one.
The last argument specifies a URI or a named location that will satisfy a request if none of the filenames in preceding arguments exist.

With the above configuration, for a URI of /some-pretty-foo, NGINX will actually run stat system calls against these files/directories:

  • /some-pretty-foo
  • /some-pretty-foo/

That is 2 filesystem locations checked for existence on every single request to that URI.
All in vain, because the request is internally redirected to /index.php and served by PHP-FPM regardless.
The more arguments you pass to try_files, the greater the performance impact.

You can see that the index directive has a performance penalty too.

If the directory /some-pretty-foo/ actually exists, additional file checks influenced by the index directive will take place:

  • /some-pretty-foo/index.html
  • /some-pretty-foo/index.php

So the worst case is 4 stat calls.

Each additional entry in the index directive’s file list may cause one more file check, so a list of N files can add up to N checks.

Changing to try_files $uri /index.php?$args; is usually a safe thing to do and this alone saves 1 to 3 stat calls in our example.
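As a minimal sketch, the trimmed-down location from the first example might look like the block below. It reuses the directive exactly as quoted above. Note the assumption behind "usually safe": a request for an existing directory without a trailing slash now falls through to /index.php, so check that nothing on your site relies on that.

```nginx
index index.html index.php;
location / {
    # "$uri/" is dropped: no directory check and no index file lookups
    try_files $uri /index.php?$args;
}
```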

Of course, the performance impact of try_files is most obvious with slower disks.
But even with SSDs, the use of the try_files directive adds some delay.

You can improve try_files performance by caching information about the non-existence of files and directories, using open_file_cache together with open_file_cache_errors.
This may be a good solution overall, although it doesn’t really solve the initial issue of unnecessarily checking file existence.
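A hedged sketch of such a cache is shown below. The directives are standard NGINX ones and can be placed at the http, server, or location level, but the numbers are illustrative assumptions, not tuned recommendations.

```nginx
# Illustrative values only; tune for your own traffic and number of files
open_file_cache          max=10000 inactive=60s;
# Re-validate cached entries after this long
open_file_cache_valid    30s;
open_file_cache_min_uses 2;
# Cache failed lookups too, which is what helps with try_files misses
open_file_cache_errors   on;
```

Keep in mind that with cached lookup errors, a newly created file may not be picked up until its cache entry is re-validated.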

Living without try_files. Going faster and more secure

Knowing why try_files is evil, we don’t need to fight fire with fire by making yet another assumption.
The performance-friendly configuration without try_files builds upon a simple fact:

As long as you use a well-structured framework/CMS, you do know where all static files are located.

And indeed, that is the case for the majority of frameworks. A few notable examples:

  • WordPress has a defined location for static files, /wp-content/.
  • Magento stores static files in just a handful of directories, e.g. /media/ and /static/.

Sure enough, a few static files are commonly stored in the website root, of which /robots.txt and /sitemap.xml are the primary representatives.

Setting those up for serving by NGINX is a no-brainer.
Simply add a location where your static files are stored, so NGINX will continue to shine by serving them directly.
At the same time, SEO-friendly URLs are going to be delivered through PHP-FPM.

Let’s see how our config can be rewritten without try_files:

index index.html index.php;
location / {
    fastcgi_param SCRIPT_FILENAME $document_root/index.php;
    include fastcgi_params;
    # override SCRIPT_NAME which was set in fastcgi_params
    fastcgi_param SCRIPT_NAME /index.php;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}
location /wp-content/ { }

So far, we have made most of the requests go through PHP-FPM. They are routed through /index.php.
But we also added the empty block for /wp-content/ which will ensure that most of the static files are served by NGINX directly.
There is no fastcgi_pass directive in this location, so there is no PHP-FPM routing there.

But let’s not forget that wp-content does not hold static files alone.
There are also plugin PHP files, e.g. wp-content/plugins/foo/foo.php.

As a rule that applies to all frameworks, you should completely deny the execution of interpreted files that live inside your static files directory, for added security.
In some exceptional cases, you would whitelist specific PHP scripts there for execution, but this is quite uncommon.

So adding security to the performance:

location ~ ^/wp-content/.*\.php$ {
    deny all;
}
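For the exceptional case where one script must stay executable, a sketch along these lines could work. The script path is the hypothetical example from the text, and the explicit SCRIPT_FILENAME line is an assumption for setups where fastcgi_params does not already define it.

```nginx
# Hypothetical exception: allow exactly one plugin script to run.
# An exact-match location takes precedence over the regex "deny all" location.
location = /wp-content/plugins/foo/foo.php {
    include fastcgi_params;
    # Assumption: set explicitly in case fastcgi_params does not define it
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}
```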

Tip: read here for a more in-depth secure WordPress NGINX configuration.

Finally, we need to ensure that any static file types which live outside wp-content are served by NGINX directly as well.
You can combine this with better browser cacheability by supplying far-future Expires headers:

location ~* ^.+\.(xml|txt|css|js|7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip|bmp|csv|doc|docx|gif|jpeg|jpg|less|odt|pdf|png|ppt|pptx|rtf|svgz|swf|webp|woff|woff2|xls|xlsx)$ {
    expires max;
}

The list of file types does not have to be so lengthy, as you only want to list the file types which are located outside your main static directory.
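Where only a few well-known files live in the website root, exact-match locations are another option worth considering. The sketch below assumes the two root files mentioned earlier; it is an alternative to listing their extensions, not part of the original configuration.

```nginx
# Exact-match locations are checked before any regex location,
# and the empty blocks mean NGINX serves these files directly.
location = /robots.txt  { }
location = /sitemap.xml { }
```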

So our final config may look like this:

index index.html index.php;

location / {
    fastcgi_param SCRIPT_FILENAME $document_root/index.php;
    include fastcgi_params;
    # override SCRIPT_NAME which was set in fastcgi_params
    fastcgi_param SCRIPT_NAME /index.php;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}

location ~ ^/wp-content/.*\.php$ {
    deny all;
}

location ~ \.php$ {
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}

location /wp-content/ { 
    expires max;
}

location ~* ^.+\.(gif|jpeg|jpg|png|xml|txt|css|js)$ {
    expires max;
}

Listing every static file type is important so that NGINX serves those files directly.
Instead of guessing which file extensions to put in the last location block, you can use a simple Python script to find all the file extensions, minus the ones used by PHP.

Save the contents of the script to e.g. ~/.local/bin/generate-filetypes-location and make it executable:

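#!/usr/bin/env python3
"""Print the opening line of an NGINX location block that matches every
non-PHP file extension found under the current directory."""

import collections
import itertools
import os

root = os.getcwd()
# Flatten the per-directory file lists produced by os.walk into one iterable
files = itertools.chain.from_iterable(
    files for _, _, files in os.walk(root)
)
# Count files per extension, e.g. {'.png': 120, '.css': 14}
counter = collections.Counter(
    os.path.splitext(file_)[1] for file_ in files
)

out = set()
for ft in counter:
    # Skip suffixes that are unlikely to be real file types
    if '-' in ft or '_' in ft or '(' in ft:
        continue
    ft = ft.lower().lstrip('.')
    # Exclude empty extensions and the ones handled by PHP
    if ft and ft not in ('php', 'phtml'):
        out.add(ft)

# Anchor the regex with "$" so that only the end of the URI is matched
print(r'location ~* \.(' + '|'.join(sorted(out)) + r')$ {')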

Change to the webroot directory of your website (defined by the root directive in NGINX), and run the script ~/.local/bin/generate-filetypes-location.

This emits the opening clause of the static files location block, for copy-pasting to NGINX configuration.

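Finally, as a fail-safe, you may want to add another location that ensures any request containing a dot is served by NGINX directly.
Such requests are typically not something you want to route through PHP-FPM.
Place it after all other regex locations, including location ~ \.php$, because NGINX uses the first regex location that matches: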

location ~ \. { }
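Since regex locations are tried in the order they appear in the configuration, the position of this fail-safe matters. The sketch below reuses the blocks from the final config to illustrate one workable order. If the catch-all came before the \.php$ location, .php requests would be handled as static files instead of being passed to PHP-FPM.

```nginx
# 1. Most selective regex first: never execute PHP under wp-content
location ~ ^/wp-content/.*\.php$ {
    deny all;
}

# 2. Remaining PHP scripts go to PHP-FPM
location ~ \.php$ {
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php-fpm/example.com.sock;
}

# 3. Known static file types, with far-future expiry
location ~* ^.+\.(gif|jpeg|jpg|png|xml|txt|css|js)$ {
    expires max;
}

# 4. Fail-safe last: anything else containing a dot is served by NGINX
location ~ \. { }
```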

By listing the known static files directory, as well as adding a location for all the possible static files, we have brought the chance of serving static files through PHP to practically none.
So there, we have eliminated try_files from the config, and the performance impact it brings is gone.

This may not be an approach for the faint-hearted, as you might have to revisit the configuration any time you add another file type.

But saving time is not a good excuse for keeping try_files around with its drawbacks.

By having try_files, you sacrifice performance for the small convenience of not fully configuring your website.

  1. Bohan Yang

    excellent article. try_files is NOT evil but just don’t ABUSE it.

  2. A. Couriel

    Thanks for the article. I have a question — will this also work for WordPress “pretty permalinks”?

    • Danila Vershinin

      Yes, of course. “Pretty permalinks” is merely how WordPress refers to its SEO URLs implementation. Similarly to other frameworks, pretty permalinks route /some/pretty/url through /index.php.

  3. Alex

    nice article, interesting ideas …
    but I think that location ^/wp-content/.*.php$ will not match any php file from wp-content, but the former location ~ .php$ will take precedence …

    • Danila Vershinin

      You are right. Thank you for pointing this out. It also did not have a ~ specifier, and indeed the topmost regex location will win.
      So a more selective regex should be at the top.
      Updated with corrections.

  4. Andreas Kirbach

    Are you sure that nginx does actually stat /some-pretty-foo/index.html and /some-pretty-foo/index.php after stat on /some-pretty-foo/?
    IMHO that wouldn’t make sense: if the directory does not exist, files below that directory can’t exist.

    As far as I can see from nginx debug log it only does check file /some-pretty-foo and directory /some-pretty-foo/.

    • Danila Vershinin

      Thanks for the notice. I had an incorrect statement and updated the article. Indeed, those won’t be checked unless the directory in question exists.

  5. Pearson

    Very useful!!! no bugs!!! I’ve applied it on my nginx right now!

  6. Pearson

    Hi, when I changed the Nginx config to what you offered in the article, the back-end pages, e.g. …/wp-admin/… php, can’t be accessed; it showed File not found. Would you help me with that? The front-end page shows correctly. Thank you!

    • Danila Vershinin

      Do you have the location ~ \.php$ { ... } exactly as per the article text?

  7. Joshua

    What does fastcgi_param SCRIPT_NAME /index.php; in the location / block actually do? It seems to work fine with or without it.

    • Danila Vershinin

      It sets the environment variable SCRIPT_NAME to point to the actual file that processes PHP requests. It does not define which file will process them, but rather tells PHP scripts what the entry point is. It is not essential for every CMS, but for some, like Mautic, it is an important addition. Such CMSs rely heavily on SCRIPT_NAME to route requests properly.
