How to keep track of file downloads on a web server using PHP

Background

Keeping track of file downloads is quite tricky. On the one hand, you have tools such as Google Analytics that give you everything and the kitchen sink. But have you ever tried to find out just how many times a file has been downloaded using Google’s tool? Please leave a comment below if you have figured this out.

Other common analytics programs, such as Webalizer, which is provided with many control panels, will give you a slightly saner view of popular files. But if your downloads do not feature in the top lists, you’re back to having no data at your disposal.

This article explains how to keep track of file downloads and, at the same time, the IP addresses of the users downloading the files. Most of the code was found on Stack Overflow, but it was modified to make the application slightly more generic.

Methodology

The methodology in this article uses server rewrite rules to intercept specific locations and then redirect them to a PHP script. Both the Apache and NGINX web servers have rewrite functionality, and this is where the fun begins. On NGINX the rewrite rule looks something like this:

location /downloads/ {
  # The full stop is escaped so it only matches a literal dot before the extension.
  rewrite ^/downloads/(.*)\.(rar|zip|pdf)$ /tracker/download.php?file=$1.$2;
}

Let’s break down this rewrite rule.

For starters, you might already have a couple of rewrite rules, so it’s important to add this one as an additional rule. The location specifies that the rewrite rule must only kick in when /downloads/ is accessed. One can derive from this that all our downloadable files will be stored there.

Next, a block delimited by { and } specifies exactly what must happen when this rule is hit.

At this point we specify that any hit on this location must be rewritten if it matches ANYTHING, then a full stop, then RAR or ZIP or PDF, up TO THE END OF THE LINE (the $ sign). The pattern is a regular expression, and its two bracketed groups, (.*) and (rar|zip|pdf), each capture one parameter. Two sets of brackets means two parameters, available afterwards as $1 and $2.

From here we send the request to the destination, which is /tracker/download.php.

A URL parameter, ?file, is appended, and its value joins the two parameters obtained through the regular expression with a full stop between them.

Here is an example. The user tries to download:

https://mysite.com/downloads/a-very-cool-blog.pdf

This will be rewritten to:

/tracker/download.php?file=a-very-cool-blog.pdf
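
If you want to experiment with the pattern before touching the server configuration, the mapping can be imitated in PHP. This is only a rough sketch: the function name rewriteDownloadPath is made up for illustration, and the pattern escapes the full stop (\.) so that it matches only a literal dot before the extension.

```php
<?php
// Hypothetical helper that imitates the rewrite rule in PHP, for experimenting only.
function rewriteDownloadPath(string $uri): ?string
{
    // Two capture groups: the file name and the extension.
    if (preg_match('#^/downloads/(.*)\.(rar|zip|pdf)$#', $uri, $m) !== 1) {
        return null; // no match, so the rule would not apply to this URI
    }
    // Join the two captured parameters with a full stop, as $1.$2 does.
    return '/tracker/download.php?file=' . $m[1] . '.' . $m[2];
}

var_dump(rewriteDownloadPath('/downloads/a-very-cool-blog.pdf'));
var_dump(rewriteDownloadPath('/downloads/readme.txt'));
```

The first call yields the rewritten path from the example above, and the second yields null because .txt is not one of the listed extensions.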

From here the PHP script takes over and processes the file.

The gist of the PHP script is:

  • Set a base directory
  • Determine the filename
  • Determine the filename relative to the base
  • Get the user’s IP address
  • Insert the filename and IP address into a database
  • Set various content headers
  • Send the file back to the browser
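
The steps above can be sketched roughly as follows. This is not the original script, only a minimal illustration of the same flow. The base directory, the SQLite database path, and the downloads table with its filename, ip_address and downloaded_at columns are all assumptions, and the sketch adds a realpath() check so that a request cannot escape the base directory.

```php
<?php
// Hypothetical base directory; adjust to wherever your downloads live.
const BASE_DIR = '/var/www/downloads';

/**
 * Resolve the requested name to a real file inside $baseDir.
 * Returns null if the file is missing or lies outside the base directory.
 */
function resolveDownload(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }
    $path = realpath($base . DIRECTORY_SEPARATOR . $requested);
    // realpath() collapses "../", so a prefix check blocks path traversal.
    if ($path === false || !is_file($path)
        || strncmp($path, $base . DIRECTORY_SEPARATOR, strlen($base) + 1) !== 0) {
        return null;
    }
    return $path;
}

/** Record one download; the table and column names are assumptions. */
function logDownload(PDO $db, string $file, string $ip): void
{
    $stmt = $db->prepare(
        'INSERT INTO downloads (filename, ip_address, downloaded_at) VALUES (?, ?, ?)'
    );
    $stmt->execute([$file, $ip, date('Y-m-d H:i:s')]);
}

// Only handle a request when served by the web server, not from the command line.
if (PHP_SAPI !== 'cli' && isset($_GET['file'])) {
    $path = resolveDownload(BASE_DIR, (string) $_GET['file']);
    if ($path === null) {
        http_response_code(404);
        exit;
    }
    // Filename relative to the base directory, plus the caller's IP address.
    $relative = substr($path, strlen((string) realpath(BASE_DIR)) + 1);
    $ip = $_SERVER['REMOTE_ADDR'] ?? 'unknown';

    // Hypothetical SQLite file; any PDO driver would be used the same way.
    $db = new PDO('sqlite:/var/lib/tracker/downloads.sqlite');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    logDownload($db, $relative, $ip);

    // Content headers, then the file itself.
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="'
        . str_replace('"', '', basename($path)) . '"');
    header('Content-Length: ' . (string) filesize($path));
    readfile($path);
}
```

Note that REMOTE_ADDR is the address of whatever connects to the web server, so behind a proxy or load balancer it may not be the visitor’s own address.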

In summary, every hit to the web server is intercepted and checked against the downloads directory. If the requested file matches the regular expression, the request is sent to the script, which records it in a database and sends the file back to the browser.
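
Once downloads are being recorded, answering the original question of how many times a file has been downloaded is a single query. The sketch below is an illustration under assumptions: it invents a downloads table with filename, ip_address and downloaded_at columns, and it uses an in-memory SQLite database (which needs the pdo_sqlite extension) with made-up sample rows purely for the demonstration.

```php
<?php
/** Return download totals per file, most downloaded first. */
function downloadCounts(PDO $db): array
{
    $sql = 'SELECT filename,
                   COUNT(*) AS total,
                   COUNT(DISTINCT ip_address) AS unique_ips
            FROM downloads
            GROUP BY filename
            ORDER BY total DESC, filename ASC';
    return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data in an in-memory database; the schema is an assumption.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE downloads (filename TEXT, ip_address TEXT, downloaded_at TEXT)');

$insert = $db->prepare('INSERT INTO downloads VALUES (?, ?, ?)');
$sample = [
    ['a.pdf', '203.0.113.5'],
    ['a.pdf', '203.0.113.9'],
    ['b.zip', '203.0.113.5'],
];
foreach ($sample as [$file, $ip]) {
    $insert->execute([$file, $ip, date('Y-m-d H:i:s')]);
}

// Print one line per file with its total and number of distinct IP addresses.
foreach (downloadCounts($db) as $row) {
    echo $row['filename'], ': ', $row['total'], ' downloads from ',
        $row['unique_ips'], " IP addresses\n";
}
```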

Quite the mouthful, and here is the script if you want to see more.

Please leave us a comment if you have any questions about this script.
