April 23, 2018

Privacy with Google Analytics | Field Reports

Google Analytics has a lot of configuration options, which is why webmasters love it. For the purposes of user privacy, however, there are just two configuration options to pay attention to, the “IP Anonymization” option and the “Display Features” option.

IP Anonymization says to Google Analytics, “please don’t remember the exact IP address of my users.” According to Google, enabling this mode masks the least significant bits of the user’s IP address before the IP address is used or saved. Since many users can be identified by their IP address, this makes it much harder for anyone to reconstruct the search history for a given IP address. But remember, Google is still sent the full IP address, and we have to trust that Google will obscure it as advertised and not save it in some log somewhere. Even with the masked IP address, it may still be possible to identify a user, particularly if a library serves a small number of geographically dispersed users.
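To make “masks the least significant bits” concrete, here is a minimal sketch of the kind of masking Google's help pages describe for IPv4, where the last octet is set to zero. The function name `maskIPv4` is made up for illustration, and this is not Google's code; the real masking happens on Google's servers, where we can't see it.

```javascript
// Illustrative only: mimics the masking Google describes, not Google's actual code.
function maskIPv4(ip) {
  const octets = ip.split('.');
  const valid = octets.length === 4 &&
    octets.every(o => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) {
    throw new Error('Not a dotted-quad IPv4 address: ' + ip);
  }
  octets[3] = '0'; // zero the least significant 8 bits
  return octets.join('.');
}

console.log(maskIPv4('203.0.113.77')); // 203.0.113.0
console.log(maskIPv4('203.0.113.12')); // 203.0.113.0
```

Two users on the same 256-address block become indistinguishable after masking. A library whose few remote users each sit on a different block gets much less protection, because each masked address still points to one person.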

“Display Features” says to Google that you don’t care about user privacy, and it’s OK to track your users all to hell so that you can get access to “demographic” information. To understand what’s happening, it’s important to understand the difference between “first-party” and “third-party” cookies and how they implicate privacy differently.

Out of the box, Google Analytics uses “first-party” cookies to track users. So if you deploy Google Analytics on your “library.example.edu” server, the tracking cookie will be attached to the library.example.edu hostname. Google Analytics will have considerable difficulty connecting user number 1234 on the library.example.edu domain with user number 5678 on the “sci-hub.info” domain, because the user ids are chosen randomly for each hostname. But if you turn on Display Features, Google will connect the two user ids via a third-party tracking cookie from its Doubleclick advertising service. This enables both you and Google to know more about your users. Anyone with access to Google’s data will be able to connect the catalog searches saved for user number 1234 to that user’s searches on any website that uses Google advertising or any site that has Display Features turned on.
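The difference is easier to see in a toy model. The sketch below is not how Google implements anything; `makeBrowser`, `cookieFor`, and `pageView` are hypothetical names, and ids come from a counter rather than a random generator so the output is predictable. It only shows why one cookie shared across sites is enough to join ids that would otherwise be unrelated.

```javascript
// Toy model, not Google's implementation.
function makeBrowser() {
  return { cookies: new Map() }; // cookie jar keyed by the domain that set the cookie
}

let nextId = 1000;
function cookieFor(browser, domain) {
  if (!browser.cookies.has(domain)) {
    browser.cookies.set(domain, String(nextId += 1)); // a new id per domain
  }
  return browser.cookies.get(domain);
}

// Simulate one page view and return what an analytics server could log.
function pageView(browser, siteHost, displayFeatures) {
  const hit = { site: siteHost, firstPartyId: cookieFor(browser, siteHost) };
  if (displayFeatures) {
    // The same doubleclick.net cookie is sent whichever site is being visited.
    hit.thirdPartyId = cookieFor(browser, 'doubleclick.net');
  }
  return hit;
}

const browser = makeBrowser();
const libraryHit = pageView(browser, 'library.example.edu', true);
const otherHit = pageView(browser, 'sci-hub.info', true);
console.log(libraryHit.firstPartyId === otherHit.firstPartyId); // false
console.log(libraryHit.thirdPartyId === otherHit.thirdPartyId); // true
```

With `displayFeatures` set to `false`, the two hits share nothing, which is the out-of-the-box situation described above.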

IP Anonymization and Display Features can be configured in three ways, depending on how Google Analytics is deployed. The instructions here apply to the “Universal Analytics” script. You can tell a site uses Universal Analytics because its pages load a JavaScript file named “analytics.js.” An older “classic” version of Google Analytics uses a script named “ga.js”; its configuration is similar to that of Universal. More complex websites may use Google Tag Manager to deploy and configure Google Analytics.
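If you'd rather not read through page source, a rough check can be scripted. This is a sketch under a simplifying assumption: it matches on file name only, so any unrelated script that happens to be called “analytics.js” or “ga.js” would be misreported. The function name `analyticsFlavor` is made up for this example.

```javascript
// Rough check: classify a page by the file names of the scripts it loads.
function analyticsFlavor(scriptUrls) {
  // Strip query strings and fragments, then keep the last path segment.
  const files = scriptUrls.map(u => u.split(/[?#]/)[0].split('/').pop());
  if (files.includes('analytics.js')) return 'universal';
  if (files.includes('ga.js')) return 'classic';
  return 'none found';
}

// In a browser console, on the page you want to inspect:
// analyticsFlavor(Array.from(document.scripts, s => s.src));
```

A result of “none found” does not prove the site is free of Google Analytics, since a tag manager can load the script after the page has finished rendering.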

Google Analytics is usually deployed on a web page by inserting a script element that looks like this:

<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-XXXXX-Y', 'auto');
ga('send', 'pageview');
</script>

IP Anonymization and Display Features are turned on with extra lines in the script:

ga('create', 'UA-XXXXX-Y', 'auto');
ga('require', 'displayfeatures'); // starts tracking users across sites
ga('set', 'anonymizeIp', true); // makes it harder to identify the user from logs
ga('send', 'pageview');
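For a library, the more privacy-friendly variant keeps the anonymizeIp line and leaves out the displayfeatures line. The sketch below builds that command list in a helper (the name `privacyLeaningCommands` is hypothetical) so it can be inspected before anything is sent. The `allowAdFeatures` field comes from Google's analytics.js documentation rather than from this column, so treat it as an assumption and check it against the current docs before relying on it.

```javascript
// Build the ga() commands for a configuration that avoids cross-site tracking.
function privacyLeaningCommands(trackingId) {
  return [
    ['create', trackingId, 'auto'],
    ['set', 'anonymizeIp', true],      // mask the IP address before it is stored
    ['set', 'allowAdFeatures', false], // assumption: overrides ad features enabled elsewhere
    ['send', 'pageview'],
    // Deliberately no ['require', 'displayfeatures'] entry.
  ];
}

// On a page where the standard snippet has already defined ga():
if (typeof ga === 'function') {
  privacyLeaningCommands('UA-XXXXX-Y').forEach(cmd => ga(...cmd));
}
```

The settings have to be queued before the pageview is sent, which is why the 'send' command comes last.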

The Google Analytics Admin allows you to turn on cross-site user tracking, though the privacy impact of what you’re doing is not made clear. In the “Data Collection” item of the Tracking info pane, look at the toggle switches for “Remarketing” and “Advertising Reporting Features.” If these are switched to “ON,” then you’ve enabled cross-site tracking, and your users can expect no privacy.

Turning on IP anonymization is not quite as easy as turning on cross-site tracking. You have to add it explicitly in your script or turn it on in Google Tag Manager (where you won’t find it unless you know to look for it!).

To check if cross-site tracking has been turned on in your institution’s Google Analytics, use the procedures I described in my article on how to check if your library is leaking catalog searches to Amazon (ow.ly/pfas30iIEvP). First, clear the cookies for your website, then load your site and look at the “Sources” tab in Chrome developer tools. If there’s a resource from “stats.g.doubleclick.net,” then your website is asking Google to track your users across sites. If your institution is a library, you should not be telling Google to do this.
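The same check can be run from the developer tools console instead of by eye. This is a sketch, not a complete audit: the function name `crossSiteTrackingRequests` is made up, and it assumes the browser's Resource Timing list (`performance.getEntriesByType('resource')`) still holds every request the page made, which may not be true on pages that load a very large number of resources.

```javascript
// Return the URLs in a list that point at doubleclick.net or one of its subdomains.
function crossSiteTrackingRequests(urls) {
  return urls.filter(u => {
    try {
      const host = new URL(u).hostname;
      return host === 'doubleclick.net' || host.endsWith('.doubleclick.net');
    } catch (e) {
      return false; // skip anything that is not an absolute URL
    }
  });
}

// In the console, after clearing cookies and reloading the page:
// crossSiteTrackingRequests(
//   performance.getEntriesByType('resource').map(entry => entry.name)
// );
```

An empty array on a freshly loaded page suggests cross-site tracking is off. Any URL from stats.g.doubleclick.net in the result means it is on.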

Bottom line: if you use Google Analytics, always remember that Google is fundamentally an advertising company and it will seldom guide you toward protecting your users’ privacy.


Eric Hellman is the founder of Openly Informatics and ebook nonprofit Unglue.it. This column was excerpted and reprinted with permission from his blog, go-to-hellman.blogspot.com

This article was published in Library Journal.


