Grav: Advanced configuration for nginx

There are a few small changes you can make to your configuration that will make your website faster and a little more secure.

Security enhancements

The original configuration will keep you safe, but it is always good to see what else can be done.

Giving less information to attackers

By default, the nginx.conf will return "403 Forbidden" errors for system files that are not supposed to be accessed directly. However, there is a better alternative:

server {
    ...

    ## Begin - Index
    # for subfolders, simply adjust:
    # `location /subfolder {`
    # and the rewrite to use `/subfolder/index.php`
    location / {
        try_files $uri $uri/ @index;
    }

    location @index {
        # the first parameter deliberately never matches, so every
        # request is internally redirected to index.php
        try_files /nonexistent /index.php?_url=$uri&$query_string;
    }
    ## End - Index

    ## Begin - Security
    # set error handler for these to the @index location
    error_page 418 = @index;
    # deny all direct access for these folders
    location ~* /(\.git|cache|bin|logs|backup|tests)/.*$ { return 418; }
    # deny running scripts inside core system folders
    location ~* /(system|vendor)/.*\.(txt|xml|md|html|yaml|yml|php|pl|py|cgi|twig|sh|bat)$ { return 418; }
    # deny running scripts inside user folder
    location ~* /user/.*\.(txt|md|yaml|yml|php|pl|py|cgi|twig|sh|bat)$ { return 418; }
    # deny access to specific files in the root folder
    location ~ /(LICENSE\.txt|composer\.lock|composer\.json|nginx\.conf|web\.config|htaccess\.txt|\.htaccess) { return 418; }
    ## End - Security
    ...
}
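
If you want to sanity-check which request paths the blocked-folders pattern will catch, you can approximate it locally with grep -E. This is a rough sketch: nginx matches with PCRE against the normalized URI, but for this pattern the ERE behaviour is the same, and the example URIs are made up:

```shell
# Approximate the "deny all direct access" location regex with grep -E.
pattern='/(\.git|cache|bin|logs|backup|tests)/.*$'

check() {
    if printf '%s\n' "$1" | grep -Eq "$pattern"; then
        echo "$1 -> 418 (blocked)"
    else
        echo "$1 -> passed through"
    fi
}

check /cache/doc.pdf      # -> 418 (blocked)
check /logs/error.log     # -> 418 (blocked)
check /images/cache.png   # -> passed through (no /cache/ path segment)
```

Note that the pattern only matches whole path segments: /images/cache.png slips through because there is no /cache/ directory in the path.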

What happens here is the following:

Normally, we route all non-existing files to Grav. Returning a status code from nginx itself, however, produces a different kind of error page than one routed through Grav. That tells an attacker that those files are special and actually exist, and moreover that you explicitly don't want them read. They will only try harder.

Rerouting such requests to Grav is better, because Grav will handle them like any other non-existing file.

No direct access to other .php-files

In the example nginx.conf, all requests to files ending in .php are sent to the PHP handler. This is not necessary: Grav only uses /index.php to route requests; every other location is handled internally.

Some vulnerabilities in CMSs specifically target plug-ins, themes or other third-party libraries. There is no reason to keep direct access to them open (unless you run Grav alongside other software in the same webroot!).

So, alternatively, route only /index.php to the PHP handler, and block the rest:

server {
    ...
    ## Begin - PHP
    location = /index.php {
        # Choose either a socket or TCP/IP address
        fastcgi_pass unix:/var/run/php7.0-fpm.sock;
        # fastcgi_pass 127.0.0.1:9000;

        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
    ## End - PHP

    ## Begin - Security
    ...
    # deny access to other .php-scripts
    location ~ \.php$ { return 418; }
    ...
    ## End - Security
    ...
}

With this, any .php file that is not /index.php will be rerouted to and handled by /index.php.

Performance

nginx is a very capable webserver, and it is also very capable of doing advanced caching.

Do not check for the existence of directories

Many, many examples use this line:

try_files $uri $uri/ @index;

Few people actually realise what it does, which is:

  1. Check for the existence of the file.
  2. If it does not exist, try and see if there's a directory with that name.
  3. Otherwise, fall back to @index (/index.php).

But in reality, you only need $uri/ (step 2) if you are linking to a directory that has its own index.php or index.html. Since you are using Grav, the only directory you visit is /, and that already falls through to /index.php in step 3.

If you're not going to use it, don't keep it around; dropping it saves an extra filesystem stat():

try_files $uri @index;

Caching filesystem metadata of files

The open file cache caches metadata about files: whether they exist, what permissions they have, whether they are readable. This can help a little on local filesystems, especially those not backed by SSDs. It shines most in environments with network storage (like NFS).

Keep in mind that this goes outside of the server{}-block, directly into the http{}-context:

open_file_cache                 max=10000 inactive=5m;
open_file_cache_valid           1m;
open_file_cache_min_uses        1;
open_file_cache_errors          on;

server {
    ...
}

In this example, we told nginx to:

  1. Cache metadata for up to 10,000 files, dropping entries that have not been accessed for 5 minutes (max=10000 inactive=5m).
  2. Re-validate a cached entry after 1 minute (open_file_cache_valid).
  3. Cache a file's metadata from its very first access (open_file_cache_min_uses 1).
  4. Also cache lookup errors, such as "file not found" (open_file_cache_errors on).

Precompressing resources

In the example nginx.conf, we enable GZip-compression. While excellent, it also means that the output is compressed on a per-request basis. This in turn means that every request costs extra CPU cycles and latency (you have to wait for the compression to finish).

There is an alternative in nginx, which is gzip_static. gzip_static will make nginx look for the same file, but with a .gz extension. So if I have main.css, it will try and see if there is already a main.css.gz present, and send that instead.

To enable this, use:

server {
    ...

    ## Begin - Index
    ...

    location / {
        try_files $uri @index;

        location /assets {
            gzip_static on;
        }
    }

    ...
    ## End - Index
    ...
}

By using a nested location /assets, you will not incur extra filesystem stat()s for the rest of Grav. If the asset pipeline is enabled, the /assets location will contain the pipelined/minified assets.

nginx does not automatically compress the files for you. You will have to do this yourself.

To compress the files (on UNIX-based systems), you can do the following:

cd assets
for asset in *.css *.js; do [ -f "$asset" ] && gzip -kN9 "$asset"; done

Note that these .gz files are automatically deleted when you clear your (asset) cache, and you will have to redo this after new resource files are created. There is an outstanding Pull Request to allow automatic precompression of assets upon creation of these files.
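
To convince yourself the precompressed copies are in place and worthwhile, here is a throwaway demonstration (the file contents and the /tmp path are made up; in practice you run the loop above against Grav's pipelined assets):

```shell
# Build a dummy asset, precompress it the same way, and compare sizes.
mkdir -p /tmp/assets-demo && cd /tmp/assets-demo
for i in $(seq 200); do echo 'body { color: #333; font-size: 14px; }'; done > main.css

# -k keeps the original (needed for clients that don't accept gzip),
# -N stores the name/timestamp, -9 is best compression, -f overwrites
# a stale .gz from an earlier run.
gzip -kfN9 main.css
ls -l main.css main.css.gz
```

The .gz copy should come out dramatically smaller; repetitive CSS/JS compresses very well.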

Enable FastCGI caching

DO NOT USE THIS KIND OF CACHING IF YOU HAVE DYNAMIC PAGE CONTENT

The following example is safe to use with authentication and the Admin interface, because the cache key includes the session. It also works with dynamic page content, but your dynamic content will not be dynamic anymore (it is aggressively and statically cached).

nginx has caching for FastCGI, and in the example below we will leverage this. Note that, by default, nginx will not cache a response that sets a cookie or similar dynamic-content headers; the fastcgi_ignore_headers directive below overrides that behaviour.

First, we need to create a map{} in the main http{} context (so outside and before the server{} block) to convert our optional session cookie into a unique identifier for the cache. That way, users with different sessions do not share the same cached resource: each gets their own copy, cached for their own session:

# This is to have caching enabled when site sessions are turned on.
# It makes FastCGI caching safe to use with authenticated content.
map $http_cookie $sessionkey {
    default '';
    ~grav-site-(?<hash>[0-9a-f]+)=(?<sessionid>[^\;]+) $hash$sessionid;
}

server {
    ...
}
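
To see what that map actually produces, here is a quick shell emulation of the regular expression (the cookie hash and session id below are made up):

```shell
# Hypothetical Cookie header as a Grav session might send it.
cookie='grav-site-1a2b3c=abc123xyz; theme=dark'

# Emulate the map: capture the hash from the cookie name and the
# session id from its value, then concatenate them like $hash$sessionid.
sessionkey=$(printf '%s' "$cookie" | sed -E 's/.*grav-site-([0-9a-f]+)=([^;]+).*/\1\2/')
echo "$sessionkey"    # -> 1a2b3cabc123xyz
```

Two users with different session ids therefore produce different $sessionkey values, and so different cache entries.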

If you renamed your site session (the session.name setting) in your Grav config, update the name in the example above! The regular expression combines the unique identifier in the cookie name with the session id in the cookie value into a new variable, $sessionkey, which we will later use in the fastcgi_cache_key setting. Next, define a cache zone in the same context, right under it:

fastcgi_cache_path      /path/to/cache/on/disk          levels=1:2
                        keys_zone=fastcgi:10m           max_size=200m
                        inactive=60m                    use_temp_path=off;

You should change the path to wherever you want the cache stored on disk. What the directive does is:

  1. Set the path. You can use this to cache on disks that might be faster. (Tip: Use /dev/shm/something on Linux systems to use a RAM-disk instead!).
  2. Define the directory hierarchy structure of 1 character, then 2 characters. (i.e. /path/to/cache/on/disk/c/29/...)
  3. Give the cache zone a name ("fastcgi" here) and an initial size (10 MB).
  4. Allow a maximum of 200 MB to be cached.
  5. Delete a resource from the cache if it hasn't been used for 60 minutes.
  6. Do not base the cache path off of fastcgi_temp_path.
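
For the curious, nginx names each cache file after the MD5 hash of the cache key, and with levels=1:2 the last hex character of that hash becomes the first directory level and the two preceding characters the second. A sketch of where a given key would land (the key and cache path are made up):

```shell
key='httpsGETwww.example.com/en'
hash=$(printf '%s' "$key" | md5sum | awk '{print $1}')

level1=${hash: -1}       # last character of the hash
level2=${hash: -3:2}     # the two characters before it
echo "/path/to/cache/on/disk/$level1/$level2/$hash"
```

This matches the /c/29/... shape shown in step 2 above.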

Next, you can use this cache zone in your configuration:

server {
    ...
    ## Begin - PHP
    location = /index.php {
        ## Begin - FastCGI caching
        fastcgi_cache           fastcgi;
        fastcgi_cache_key       "$scheme$request_method$host$request_uri$sessionkey";
        fastcgi_cache_valid     200 30m;
        fastcgi_cache_valid     404 5m;
        fastcgi_cache_valid     any 1m;
        fastcgi_ignore_headers  "Cache-Control"
                                "Expires"
                                "Set-Cookie";

        fastcgi_cache_use_stale error
                                timeout
                                updating
                                http_429
                                http_500
                                http_503;

        fastcgi_cache_background_update on;
        ## End - FastCGI caching

        # Choose either a socket or TCP/IP address
        fastcgi_pass unix:/var/run/php7.0-fpm.sock;
        # fastcgi_pass 127.0.0.1:9000;

        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
    ## End - PHP
}

Now, what is happening here is the following:

  1. fastcgi_cache selects the "fastcgi" zone we defined earlier.
  2. fastcgi_cache_key builds the key from the scheme, request method, host, URI and our $sessionkey, so every session gets its own copy.
  3. fastcgi_cache_valid caches successful responses for 30 minutes, 404s for 5 minutes, and everything else for 1 minute.
  4. fastcgi_ignore_headers stops Cache-Control, Expires and Set-Cookie headers from the backend from disabling the cache.
  5. fastcgi_cache_use_stale serves a stale copy if the backend errors out, times out, is busy (429/500/503), or is still updating the entry.
  6. fastcgi_cache_background_update refreshes an expired entry in a background subrequest while serving the stale copy.

Please note, once again, that this caches dynamic content aggressively. The session key keeps each user's cached copy separate, but for security reasons it is still not recommended to use this caching with authenticated content such as the Admin panel; the next section shows how to disable the cache in those cases.

If you wish to test if your cache is working, you can add a simple header (remove on production):

add_header X-Cache "$upstream_cache_status - $scheme$request_method$host$request_uri$sessionkey";

You can view the header with cURL (or any other tool):

$ curl -I https://www.example.com/
...
X-Cache: MISS - httpsHEADwww.example.com/en - 7b2fdab20bb2fbi85l61eounburtlxavo
...
$ curl -I https://www.example.com/
...
X-Cache: HIT - httpsHEADwww.example.com/en - 7b2fdab20bb2fbi85l61eounburtlxavo
...
$

Again, you might want to disable the header after confirming it works.

If you feel the need to purge your cache entirely:

  1. Stop nginx.
  2. Remove the directory.
  3. Start nginx.

The cache is automatically updated and pruned in the background, so you shouldn't need to do so.

More fine-grained control over (not) caching

If you use the admin interface, you might want to disable caching globally once the admin cookie has been set. Note that an admin cookie is set as soon as you visit the /admin URL, and anyone can go there by default: you don't even need to log in to disable caching.

You can use the following example as a basis. Put the map{}s outside your server{} block, and the fastcgi_* directives with the rest of the caching directives:

# This is used by fastcgi_cache_bypass and fastcgi_no_cache.
# If you don't want certain URIs cached, add them here with a value of 1.
map $request_uri $no_cache1 {
    default                 0;
    ~^/(../|)admin          1;
}

# This is used by fastcgi_cache_bypass and fastcgi_no_cache.
# To disable caching based on cookie names, add them here with a value of 1.
map $http_cookie $no_cache2 {
    default 0;
    ~grav-site-([0-9a-f]+)-admin=([^\;]+) 1;
}
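
The ~^/(../|)admin pattern deserves a quick check: the (../|) alternative allows an optional two-letter language prefix such as /en/. Approximated with grep -E (the URIs are made up):

```shell
pattern='^/(../|)admin'

bypass() {
    if printf '%s\n' "$1" | grep -Eq "$pattern"; then
        echo "$1 -> cache bypassed"
    else
        echo "$1 -> served from cache"
    fi
}

bypass /admin          # -> cache bypassed
bypass /en/admin       # -> cache bypassed (language prefix)
bypass /page/admin     # -> served from cache
```

Because the pattern has no end anchor, anything starting with /admin (or /xx/admin) also bypasses the cache, which is the intended behaviour here.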

server {
    ...
    location = /index.php {
        ...
        fastcgi_cache_bypass $no_cache1 $no_cache2;
        fastcgi_no_cache     $no_cache1 $no_cache2;
        ...
    }
}