A Closer Look at Piwik: A Google Analytics Alternative

Lukas White

A recent SitePoint article by Jacco Blankenspoor introduced a number of alternatives to the ubiquitous Google Analytics. Among them was Piwik — a self-hosted, open-source package that Jacco described as a “serious contender”. In this article, I’m going to look at Piwik in more detail, to assess that claim.

Why use Piwik?

It’s worth starting out by comparing Piwik to the undisputed market leader, Google Analytics.

Google has been at the centre of a number of privacy concerns recently, which has left many people uneasy about what it does with their data. With Piwik there are no such concerns, because you own the data. You can also be sure that the data won’t be shared with third parties, particularly advertisers, unless you choose to share it.

Piwik is extremely customisable; the dashboard — Piwik’s default view — is made up of a number of “widgets”, which can be swapped in or out, or rearranged to your preference.

Piwik's Dashboard

You can modify Piwik’s appearance as much as you wish — from uploading a logo to brand it, to tweaking the colour scheme — right through to building your own theme from scratch.

Because Piwik is self-hosted, you can also specify the domain it sits on.

Piwik has most of the features you’d expect from this kind of service including real-time analytics, visitor maps, goals, campaign tracking, referrer information, and a JavaScript tracking API.

For those of you in the European Union, another key advantage is that it conforms to the controversial EU Cookie Law, so if GA is the only reason you’re currently displaying a cookie notice, switching to Piwik means you probably won’t have to (disclaimer: this is an extremely grey area, so you should probably do the necessary research and consult a lawyer locally).

Piwik is extensible; there’s even a Marketplace dedicated to third-party plugins (accessible from the settings page), or you could write your own. And because it’s open-source, you could even modify the core.

Installation

Piwik is a self-hosted web application, so you’ll need to upload it to a server somewhere. It doesn’t matter where you put it, though it might make sense to put it on a subdomain, e.g. analytics.example.com.

Download Piwik and upload it to your server. Alternatively, use the command line:

wget http://builds.piwik.org/latest.zip
unzip latest.zip
cp -R piwik /var/www/analytics.example.com

A number of directories need to be writeable by the web server. The paths below are relative to the directory you installed Piwik in, not the system-wide /tmp:

/tmp
/tmp/assets/
/tmp/cache/
/tmp/logs/
/tmp/tcpdf/
/tmp/templates_c/

Next, you’ll need to create a MySQL database for it. The database user used by Piwik needs permissions to read, write, and modify the schema.

Once your virtual host is set up, navigate to it in your browser to begin the setup wizard.

The wizard will first list the dependencies and check whether they’re all met. It’ll ask you for the database connection details, and then ask you to create a “super user” account. Next, it’ll ask you to add your first website. At this point you’ll need to give the site a name, provide its URL, specify the city (for calculating the time zone) and whether it’s an e-commerce site. You can add more websites later.

Next, it will generate and display the JavaScript tracking code for you; if you’re using a plugin you may not need this as-is, but otherwise you can simply paste it into your website’s page templates. Once you’ve finished the installation, it’ll ask you to sign in with the super user account you just created.
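
For orientation, the generated snippet typically looks something like the sketch below. Treat it as an approximation rather than the wizard’s exact output: the tracker URL //analytics.example.com/ and the site ID 1 are placeholders, and the typeof document guard is an addition so the sketch can also be loaded outside a browser. Always copy the code that Piwik actually gives you.

```javascript
// Sketch of a Piwik asynchronous tracking snippet (placeholder URL and site ID).
var _paq = _paq || [];
_paq.push(['trackPageView']);       // record the initial page view
_paq.push(['enableLinkTracking']);  // track outlinks and downloads
(function () {
  var u = '//analytics.example.com/';          // assumed Piwik location
  _paq.push(['setTrackerUrl', u + 'piwik.php']);
  _paq.push(['setSiteId', 1]);                 // assumed site ID
  // Guard added for this sketch: only inject the loader in a browser.
  if (typeof document === 'undefined') { return; }
  var d = document,
      g = d.createElement('script'),
      s = d.getElementsByTagName('script')[0];
  g.async = true;
  g.src = u + 'piwik.js';
  s.parentNode.insertBefore(g, s);
})();
```

The commands are queued in the _paq array and only acted on once piwik.js has loaded, which is why the loader script can be fetched asynchronously.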

Next up, we’ll start looking at some of the functional areas of Piwik.

Goals

Goals are used to measure the effectiveness of a website with respect to business objectives. Some examples of goals you might track from your visitors:

  • Registering on the site
  • Making a purchase
  • Subscribing to a newsletter
  • Making an inquiry using the contact form
  • Downloading a brochure or other document

To add a goal, go to “Goals” then choose “Add a new goal”. Start by giving the goal a name, for example “Downloaded brochure”. Next, you can set the goal to be triggered based on the user visiting a page, downloading a file, or clicking a link. It’s common to trigger goals when people land on confirmation pages, for example alongside the thank-you message that appears when you’ve completed a contact form.

Next, decide whether this is a “one-time” goal, or whether it makes sense for the goal to be triggered multiple times. This really depends on your business objectives and the goal itself; subscribing to a newsletter is probably a one-time goal, whereas making a purchase would be multiple.

Consider using regular expressions instead of exact URLs. An exact match on http://example.com/newsletter/thanks would miss https://example.com/newsletter/thanks and http://www.example.com/newsletter/thanks (notice the “https” and “www”), whereas the pattern (.*)/newsletter/thanks matches all three.
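
If you want to sanity-check a pattern before saving the goal, a quick test like the one below can help. Piwik evaluates goal patterns on the server, so JavaScript’s regular expression engine is only a stand-in here, and the variable names are purely illustrative.

```javascript
// The pattern suggested above, checked against the three example URLs.
const pattern = new RegExp('(.*)/newsletter/thanks');
const urls = [
  'http://example.com/newsletter/thanks',
  'https://example.com/newsletter/thanks',
  'http://www.example.com/newsletter/thanks'
];

// One boolean per URL: true when the pattern matches.
const matches = urls.map(function (url) { return pattern.test(url); });
console.log(matches);
```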

You may wish to add the Goals widget to the dashboard. Browse to the dashboard, click the “Widgets & Dashboard” button, hover over “Goals” then click “Goals Overview”. The widget should appear on your dashboard, and you can now drag it to your preferred position on the page.

Adding a Widget to Piwik's Dashboard

Tracking Campaigns

A vital part of any marketing campaign is measuring its effectiveness. When your campaign is designed to direct people to your site, you can use a tracking code to record it.

To do this with Piwik, first incorporate a parameter into the URL of the site in question, for example when linking to the site in your promotional emails or in a tweet. You can call the parameter whatever you want, and there’s no set format for the value, as long as it will make sense to you later. Here are a few examples:

http://www.example.com?campaign=email_march_2014
http://www.example.com?source=twitter
http://www.example.com?fb_post_id=123456789

You can also put the tracking code in the URL fragment (the part after the #), e.g.

http://www.example.com#campaign=email_march_2014

The advantage of this approach is that you won’t be penalised by search engines for having duplicate URLs.
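
If you generate a lot of campaign links, a small helper can keep them consistent. The sketch below is one possible approach; buildCampaignUrl and its arguments are hypothetical names, not part of Piwik. Note that the URL class normalises the address, so a trailing slash appears after the host.

```javascript
// Hypothetical helper: append a campaign parameter to a landing-page URL.
// Pass useFragment = true to place it after the # instead of the ?.
function buildCampaignUrl(baseUrl, name, value, useFragment) {
  const url = new URL(baseUrl);
  if (useFragment) {
    url.hash = name + '=' + encodeURIComponent(value);
  } else {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

const emailLink = buildCampaignUrl('http://www.example.com', 'campaign', 'email_march_2014', false);
const hashLink = buildCampaignUrl('http://www.example.com', 'campaign', 'email_march_2014', true);
console.log(emailLink);
console.log(hashLink);
```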

To view your campaign stats, in Piwik, click Referrers in the main navigation, then Campaigns. You can also add a widget to your dashboard: Widgets & Dashboard -> Referrers -> Campaigns.

Geolocation

By default, Piwik tries to guess where visitors are based on their language settings. This is a flawed approach, however, that’s pretty likely to distort your geo-based analytics data.

Piwik tracks the geographic location of your site's visitors

To improve the quality of this information, download the GeoLite City database file from MaxMind, then copy it to Piwik’s misc directory.

To do that on the command line:

wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
gunzip GeoLiteCity.dat.gz
cp GeoLiteCity.dat /var/www/analytics.example.com/misc/

Then, go to Settings -> Geolocation and select “GeoIP (Php)”. On that page you can also see that approach’s conclusions about where you are, compared to the default method based on your language settings. Although not fool-proof, you’ll probably find that the GeoIP approach is considerably more accurate.

For even better geolocation, you can purchase premium geo databases, and you can even set up Piwik to update the database automatically.

Websites and Advanced Website Settings

If you select Settings (top-right) then Websites (on the left hand side, under “Manage”), you’ll get a list of the websites you’ve configured, and you can add more from there.

If you click Edit next to a website, you also get access to some additional settings, several of which are also available as global settings. Here are a few of them.

Excluded IPs

You can exclude certain IP addresses from being tracked. This is useful when you or another analytics user — for example, your client — have a static IP address or range, and you wish to exclude their visits from reports.
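
To illustrate the idea of excluding an address or a range, here is a minimal sketch of wildcard matching. It is not Piwik’s implementation; isExcluded and the wildcard format shown are assumptions for the example, so check the settings screen for the formats Piwik actually accepts.

```javascript
// Hypothetical matcher: does an IPv4 address match any excluded pattern?
// A "*" in a pattern stands for any single octet, e.g. "192.168.1.*".
function isExcluded(ip, patterns) {
  const octets = ip.split('.');
  return patterns.some(function (pattern) {
    const parts = pattern.split('.');
    return parts.length === octets.length &&
      parts.every(function (part, i) {
        return part === '*' || part === octets[i];
      });
  });
}

// Example list: one static address and one range (documentation addresses).
const excluded = ['203.0.113.7', '192.168.1.*'];
console.log(isExcluded('192.168.1.42', excluded));
console.log(isExcluded('198.51.100.1', excluded));
```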

Excluded Parameters

Some URL parameters, such as session IDs, are irrelevant to tracking. Many, such as phpsessid and sessionid, are stripped out automatically, but if the system that powers a particular site uses something else, you can set that here.
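
The effect of excluding a parameter can be pictured with the sketch below, which strips named parameters so that two visits to the same page are not counted as different URLs. This only illustrates the idea; stripParameters is a hypothetical helper and sid is an assumed custom session parameter.

```javascript
// Hypothetical illustration: remove ignorable parameters from a page URL.
function stripParameters(pageUrl, excludedNames) {
  const url = new URL(pageUrl);
  excludedNames.forEach(function (name) {
    url.searchParams.delete(name);
  });
  return url.toString();
}

const cleaned = stripParameters(
  'http://www.example.com/products?id=42&sid=abc123',
  ['sid']
);
console.log(cleaned);
```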

You can also track searches on the site itself. Simply specify the search parameter (e.g. q, query or search) and Piwik will pick it up. (To find out what parameter your site uses, either inspect the URL after a search, or check the name attribute of the search box.)
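
If you’re not sure which parameter carries the search term, you can check a results-page URL with something like the following. The helper name and the example URL are assumptions for illustration only.

```javascript
// Hypothetical helper: read the search keyword from a results-page URL,
// trying each candidate parameter name in turn.
function extractSearchKeyword(pageUrl, parameterNames) {
  const params = new URL(pageUrl).searchParams;
  for (const name of parameterNames) {
    const value = params.get(name);
    if (value) { return value; }
  }
  return null; // none of the candidate parameters was present
}

const keyword = extractSearchKeyword(
  'http://www.example.com/search?q=blue+widgets',
  ['q', 'query', 'search']
);
console.log(keyword);
```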

Users

You can add as many user accounts as you wish, and control their access to websites on an individual basis. Go to Settings -> Users to add a new user, edit existing users, or manage their permissions.

Note: in case you’re wondering, the token_auth field is used for API calls.
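
As a rough sketch of what such a call looks like, the snippet below builds a Reporting API URL for a visits summary. The host, site ID and token are placeholders, and buildReportUrl is a hypothetical helper; check the API documentation of your Piwik version for the methods it supports.

```javascript
// Sketch: build a Reporting API request URL.
// Host, site ID and token are placeholders.
function buildReportUrl(piwikBase, idSite, tokenAuth) {
  const url = new URL('index.php', piwikBase);
  url.search = new URLSearchParams({
    module: 'API',
    method: 'VisitsSummary.get',
    idSite: String(idSite),
    period: 'day',
    date: 'today',
    format: 'JSON',
    token_auth: tokenAuth
  }).toString();
  return url.toString();
}

const reportUrl = buildReportUrl('http://analytics.example.com/', 1, 'YOUR_TOKEN_AUTH');
console.log(reportUrl);
```

Because the token grants access to your reports, keep it out of client-side code and public repositories.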

JavaScript Applications

If your application uses Ajax page transitions, or if you’re using a front-end framework such as AngularJS, Ember, or Backbone, you’ll probably need the JavaScript API. As well as being able to log page views programmatically, you can also pass custom variables.

For example, to track a page view on a JavaScript event:

<a href="About" onclick="_paq.push(['trackPageView', 'About']);">About</a>

Or using jQuery:

$('nav a').click(function(){
  _paq.push(['trackPageView', $(this).data('title')]);
});
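
In a single-page application, the URL and title reported with each page view usually need updating as well. One way to do that, sketched below, is to queue setCustomUrl and setDocumentTitle before trackPageView. The trackVirtualPageView wrapper is a hypothetical name, and the queue is passed in as an argument only so the helper can be tried without the tracker loaded; in a real page you would pass _paq.

```javascript
// Sketch: report a "virtual" page view after an Ajax page transition.
// `queue` is normally the global _paq array.
function trackVirtualPageView(queue, path, title) {
  queue.push(['setCustomUrl', path]);       // URL to record for this view
  queue.push(['setDocumentTitle', title]);  // title to record for this view
  queue.push(['trackPageView']);
}

// Stand-in for _paq so the helper can run on its own.
const queue = [];
trackVirtualPageView(queue, '/about', 'About');
console.log(queue);
```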

To pass a custom variable:

_paq.push(['setCustomVariable', 1, 'VisitorType', 'Member']);

You can also use JavaScript to track a goal:

_paq.push(['trackGoal', 1]);

You can also track a goal along with a specific revenue. For example, on an order confirmation page you might do this, injecting the relevant information into the page using PHP:

_paq.push(['trackGoal', 1, <?php echo $order->getTotal(); ?>]);
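
If the order total might be missing or malformed by the time it reaches the page, you may prefer to validate it before sending. The wrapper below is a minimal sketch of that idea; trackGoalWithRevenue is a hypothetical helper, and the queue argument stands in for _paq.

```javascript
// Hypothetical wrapper: only send revenue when it is a finite number.
function trackGoalWithRevenue(queue, goalId, revenue) {
  // Accept numeric strings such as "49.99" as well as numbers.
  const amount = typeof revenue === 'string' ? parseFloat(revenue) : revenue;
  if (typeof amount === 'number' && Number.isFinite(amount)) {
    queue.push(['trackGoal', goalId, amount]);
  } else {
    queue.push(['trackGoal', goalId]); // fall back to a plain conversion
  }
}

// Stand-in for _paq so the helper can run on its own.
const goalQueue = [];
trackGoalWithRevenue(goalQueue, 1, '49.99');
trackGoalWithRevenue(goalQueue, 1, 'n/a');
console.log(goalQueue);
```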

For analytics.js users, there’s a fork which supports Piwik on GitHub.

Integrations

There are a number of integrations available for Piwik, including Drupal, WordPress, Magento, Silex, and Laravel, as well as client and tracking APIs for PHP.

Mobile

Piwik offers mobile apps for iPhone and Android.

Further Information

There are a number of videos on YouTube that walk you through several aspects of Piwik.

Disadvantages of Piwik

Being self-hosted introduces its own disadvantages. It takes time, effort, and technical knowledge to set up Piwik. It uses server resources, and it’s up to you to manage any downtime.

Inevitably there’s also a learning curve, so switching clients over could cause some friction. Of course, there’s nothing to stop you running it alongside your current solution while you and your users familiarise yourselves with it.

Summary

Google Analytics has some serious competition from Piwik, the self-hosted web analytics solution. For me, the benefits of having control over my or my clients’ own data — coupled with some very real concerns about Google’s privacy policies — far outweigh the drawbacks of self-hosting.

However, this is a subjective judgement, so I’ll leave you to decide for yourself. I’d love to hear what you think in the comments.