Web technology is ever-evolving, and with it grows the number of ways websites and advertisers can track web users and reduce their online privacy. Browser fingerprinting is one such method: it can uniquely identify visitors to a website by exploiting the configurability of operating systems and modern web browsers.

Compared with cookies, which are frequently cleared by users or blocked by browsers, or IP addresses, which can change or correspond to multiple users, browser fingerprints are very difficult to block, change, hide, or modify. A user who naïvely alters parts of a browser fingerprint will only make the fingerprint more unique, therefore making themselves easier to track.

How common is browser fingerprinting? Is browser fingerprinting an effective method of tracking users? And what steps should privacy-conscious users take to avoid being tracked by browser fingerprinting systems?

To determine how widespread browser fingerprinting is, JavaScript files from the Alexa top 1,000 sites were scanned for 23 different browser fingerprinting indicators. To measure the effectiveness of browser fingerprinting, the browser information of approximately 310,000 web users was collected and analyzed. Finally, these results were used to find ways in which users can blend in so they are less easily tracked by browser fingerprinting systems.

Measuring Prevalence

Sites Using 14+ Browser Fingerprinting Indicators
paypal.com apple.com go.com cnn.com
shutterstock.com telegraph.co.uk wsj.com ikea.com
nba.com dell.com reuters.com macys.com
accuweather.com latimes.com toysrus.com cbssports.com
gamespot.com kickstarter.com staples.com howstuffworks.com
pcmag.com people.com zazzle.com jcpenney.com

To determine how widespread browser fingerprinting is, JavaScript files from the top 1,000 sites’ front pages were scanned for 23 different browser fingerprinting indicators (see the sample browser fingerprint for examples, along with the navigator and screen objects).
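A scan of this kind can be sketched as a simple text search, assuming each site's JavaScript has already been downloaded as a string. The indicator list below is a hypothetical subset taken from the property names in the sample fingerprint, not the full list of 23 used in the study, and the function name is illustrative.

```javascript
// Hypothetical subset of indicators; the study's full list of 23 is not reproduced here.
const INDICATORS = [
  "appCodeName", "appName", "appVersion", "buildID", "doNotTrack",
  "language", "mimeTypes", "platform", "plugins", "productSub",
  "userAgent", "getTimezoneOffset",
];

// Return the distinct indicators that appear as whole identifiers in the source text.
function findIndicators(source, indicators = INDICATORS) {
  return indicators.filter((name) => new RegExp(`\\b${name}\\b`).test(source));
}

// Stand-in script text; a real scan would read each downloaded file instead.
const sampleScript = `
  var ua = navigator.userAgent;
  var tz = new Date().getTimezoneOffset();
  var n = navigator.plugins.length;
`;

// Logs the three matches: plugins, userAgent, getTimezoneOffset
console.log(findIndicators(sampleScript));
```

A match of this kind is purely lexical: it shows that an indicator is referenced in a file, not how the value is used afterwards.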

Most sites on the list did not contain JavaScript files with any of the 23 fingerprinting identifiers. However, a handful of sites used many more of these indicators than what would be reasonably required to tailor the site to users’ browsers. There are legitimate uses for these indicators, or browsers would not provide them. Sometimes, being able to determine a user’s operating system, screen size, or plugins could help a site figure out what kind of content should be provided. However, using a large number of these indicators on one page is suspicious.

Furthermore, it is difficult to determine from JavaScript alone how the indicators are used on a site’s servers. Google Analytics, for example, uses several of these indicators, but they are used to provide browser statistics rather than to make it easier to track individual users. However, collecting data that is personally identifying in any way is worrying, especially when many sites use far more of these browser fingerprinting indicators than Google Analytics does.

Measuring Effectiveness

A browser fingerprinting system attempts to uniquely identify a user by combining as much browser and operating system configuration data as it can. To test whether this is a viable method of tracking users, a browser fingerprinting system was created to collect browser data from a large number of web users. Unlike browser fingerprinting that may be used on a commercial site, this system simply stored browser data rather than analyzing it on the fly. This allowed browser statistics to be compared after data collection in order to simulate the effectiveness of a browser fingerprint based on those statistics.

The browser fingerprinting system had a client side, where most of the data were collected, and a server side, where additional tracking data were collected and the fingerprinting data were stored. The client side tracked 19 different browser variables using JavaScript (see “Sample Browser Fingerprint” below). Most of these variables were part of the window.navigator object, which stores information about a user’s browser such as user agent, language, whether cookies are enabled, and the browser build version. The window.navigator object also includes a list of plugins and corresponding MIME types used in the browser. Because a user’s plugins and MIME types tend not to change frequently and can vary substantially between users, they were included as a component in the fingerprint.
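Plugin and MIME type lists are array-like objects rather than plain strings, so they need to be flattened before they can be stored or compared. The following is a minimal sketch of one way to do that, not the study's own code. It assumes entries shaped like those in navigator.plugins (name, filename, description) and navigator.mimeTypes (type, suffixes); the function names and stand-in data are illustrative.

```javascript
// Turn an array-like of plugins (shaped like navigator.plugins) into a stable string.
function serializePlugins(plugins) {
  return Array.from(plugins, (p) => `${p.name}|${p.filename}|${p.description}`)
    .sort()
    .join(";");
}

// Same idea for an array-like shaped like navigator.mimeTypes.
function serializeMimeTypes(mimeTypes) {
  return Array.from(mimeTypes, (m) => `${m.type}|${m.suffixes}`)
    .sort()
    .join(";");
}

// Made-up stand-in data so the sketch runs outside a browser.
// In a page, pass navigator.plugins and navigator.mimeTypes instead.
const fakePlugins = [
  { name: "Example Plugin B", filename: "b.plugin", description: "Version 2.1" },
  { name: "Example Plugin A", filename: "a.plugin", description: "Version 1.0" },
];
console.log(serializePlugins(fakePlugins));
```

Sorting makes the result independent of enumeration order. That is a design choice of this sketch; a system that treats plugin order as an extra signal would skip the sort.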

A browser’s window.screen object provides the dimensions and color depth of the user’s display. As with plugins and MIME types, screen dimensions differ enough between users to be a useful tracking tool, but tend not to change for a given user. For this reason, window.screen data seemed worthy of inclusion in the system.

Finally, new Date().getTimezoneOffset() was used to determine the user’s time zone, which also seemed to have some identification value.
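Taken together, the navigator, screen, and time zone values can be gathered into a single record. The sketch below assumes a small, hypothetical selection of fields rather than the study's exact 19 variables, and it takes the navigator-like and screen-like objects as parameters so that it also runs outside a browser.

```javascript
// Hypothetical field selections, not the study's exact list of 19 variables.
const NAVIGATOR_FIELDS = ["userAgent", "language", "platform", "cookieEnabled", "buildID", "doNotTrack"];
const SCREEN_FIELDS = ["width", "height", "colorDepth", "pixelDepth"];

// Copy the selected fields into one flat record; missing values are stored as null.
function collectFingerprint(nav, scr, now = new Date()) {
  const record = {};
  for (const field of NAVIGATOR_FIELDS) record[field] = nav[field] ?? null;
  for (const field of SCREEN_FIELDS) record[`screen.${field}`] = scr[field] ?? null;
  // Minutes between UTC and local time, e.g. 300 for UTC-5.
  record.timezoneOffset = now.getTimezoneOffset();
  return record;
}

// Stand-in objects; in a page this would be collectFingerprint(window.navigator, window.screen).
const fakeNavigator = { userAgent: "ExampleBrowser/1.0", language: "en-US", platform: "MacIntel", cookieEnabled: true };
const fakeScreen = { width: 1440, height: 900, colorDepth: 24, pixelDepth: 24 };
console.log(collectFingerprint(fakeNavigator, fakeScreen));
```

Storing missing values as null rather than dropping them keeps every record the same shape, which makes later comparison simpler.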

These variables were collected via JavaScript on page load and sent back to the fingerprinting server via an AJAX POST request. However, before sending the fingerprint to the server, the client-side script stored two random 128-bit numbers, one as a cookie and one in HTML5 local storage. Local storage can be used to store information much as cookies can, but it is not as limited in size because it remains on the user’s computer instead of being passed around in HTTP requests. Because it is less likely that a user will clear both local storage and their cookies, this increases the system’s ability to keep track of users. These tracking IDs and the user’s IP address are not part of the fingerprinting system itself. Although including them in the browser fingerprint would likely increase the fingerprint’s accuracy, they were left out, as this would make it impossible to evaluate the browser fingerprinting system against a set of users believed to be unique.
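The tracking IDs described above could be produced along the following lines. This is a sketch under stated assumptions, not the study's code: the key names are hypothetical, the storage object only needs getItem and setItem (as window.localStorage provides), and the random source is passed in as a function so the example runs outside a browser.

```javascript
// Build a 128-bit identifier as 32 hex characters. `fillRandom` must fill a
// Uint8Array with random bytes; in a browser, crypto.getRandomValues does this.
function randomId128(fillRandom) {
  const bytes = new Uint8Array(16);
  fillRandom(bytes);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Reuse an ID already held in a Storage-like object, otherwise create and save one.
function ensureStoredId(storage, key, fillRandom) {
  let id = storage.getItem(key);
  if (id === null) {
    id = randomId128(fillRandom);
    storage.setItem(key, id);
  }
  return id;
}

// Format a cookie string suitable for assignment to document.cookie.
function cookieString(key, id, maxAgeSeconds) {
  return `${key}=${id}; max-age=${maxAgeSeconds}; path=/`;
}

// In-memory stand-ins for demonstration only; Math.random is not a secure source.
const memory = new Map();
const fakeStorage = {
  getItem: (k) => (memory.has(k) ? memory.get(k) : null),
  setItem: (k, v) => memory.set(k, String(v)),
};
const demoFill = (bytes) => {
  for (let i = 0; i < bytes.length; i++) bytes[i] = Math.floor(Math.random() * 256);
};

console.log(ensureStoredId(fakeStorage, "fp_storage_id", demoFill));
console.log(cookieString("fp_cookie_id", randomId128(demoFill), 31536000));
```

In a page, the stand-ins would be replaced by window.localStorage and a wrapper such as (bytes) => crypto.getRandomValues(bytes), and the cookie string would be assigned to document.cookie.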

Avoiding Browser Fingerprinting

According to the data collected from web users, 90.5% of visitors could be tracked uniquely using browser fingerprinting alone. This is, of course, probably much lower than for cookies. But unlike with cookies, it is not easy to block or change individual components of a browser fingerprinting system. In fact, the more a user tries to change the values of a fingerprint component, the more unique they become in the browser fingerprinting system. Missing values were very uncommon within the dataset, and would instantly make a user more trackable rather than less.
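A uniqueness figure like this can be computed by counting how many stored fingerprints are shared by no other record. The sketch below assumes one record per user and uses invented toy data, not the study's dataset; the function names are illustrative.

```javascript
// Canonical key for a fingerprint record: field names are sorted so that
// property order does not affect the comparison.
function fingerprintKey(record) {
  return JSON.stringify(Object.keys(record).sort().map((k) => [k, record[k]]));
}

// Fraction of records whose fingerprint is shared by no other record.
function uniqueFraction(records) {
  const counts = new Map();
  for (const r of records) {
    const key = fingerprintKey(r);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let unique = 0;
  for (const n of counts.values()) if (n === 1) unique += 1;
  return records.length === 0 ? 0 : unique / records.length;
}

// Invented toy data: two users share a fingerprint, two do not.
const toyRecords = [
  { platform: "MacIntel", language: "en-US", width: 1440 },
  { language: "en-US", platform: "MacIntel", width: 1440 },
  { platform: "Win32", language: "en-US", width: 1366 },
  { platform: "Win32", language: "de-DE", width: 1366 },
];
console.log(uniqueFraction(toyRecords)); // 0.5
```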

Rather than trying to hide by changing their browser information, users could instead try to blend in. By using as normal a browser configuration as possible, a user might be able to avoid being uniquely tracked. Most users use some version of Mac OS X or Windows (80%), use a WebKit-based browser (74%), and don’t install any plugins (56%). The most common screen resolution is 1920 × 1024, although this only accounts for 6% of all web users in the dataset. However, even though a large number of people run OS X, this group can be further segmented by operating system version number. Similarly, even if users don’t install plugins, their default plugin set is specific to the browser they use and includes version numbers. Finally, even when a user has the most common screen dimensions, there are still so many other configurations that this does not help them remain anonymous.
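The segmentation effect described here can be illustrated by comparing the share of users holding each common value on its own with the share holding all of them at once. The sketch below uses invented toy data and illustrative function names; it is not drawn from the study's dataset.

```javascript
// Share of records holding each value of one field, most common first.
function valueShares(records, field) {
  const counts = new Map();
  for (const r of records) {
    const v = String(r[field]);
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([value, n]) => ({ value, share: n / records.length }))
    .sort((a, b) => b.share - a.share);
}

// Share of records that match every field of the given profile.
function matchingShare(records, profile) {
  const fields = Object.keys(profile);
  const hits = records.filter((r) => fields.every((f) => r[f] === profile[f]));
  return records.length === 0 ? 0 : hits.length / records.length;
}

// Invented toy data for illustration.
const toyUsers = [
  { os: "OS X", resolution: "1440x900" },
  { os: "OS X", resolution: "1280x800" },
  { os: "Windows", resolution: "1440x900" },
  { os: "OS X", resolution: "1440x900" },
];
console.log(valueShares(toyUsers, "os")[0]); // { value: 'OS X', share: 0.75 }
console.log(matchingShare(toyUsers, { os: "OS X", resolution: "1440x900" })); // 0.5
```

Each added field can only shrink the matching group, which is why holding several individually common values does not guarantee a large crowd to hide in.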

This means that even if a user spoofs their browser information by making it appear as common as possible, there is still a chance they can be tracked because very few people are the average web user. A further issue with spoofing is that any mistakes will result in a browser fingerprint that is entirely unique (e.g., a 1920 × 1024 screen on an iPhone). Unless users are very aware of which browser configurations are most common, they risk making themselves more unique with spoofing rather than less.

One potential solution would be to use an iPhone to browse the web. Even though individually, iPhone fingerprint values are not very common, the most common web user is an iPhone user. However, using an iPhone to browse the web (or pretending to) is not very practical as users will be forced to use mobile-formatted sites. Additionally, using an iPhone means users have less control over other privacy options, such as cookie management and using VPNs and proxies. As cookies and IP addresses are still the dominant way of tracking users, using an iPhone to try to stay private online is probably a mistake.

Conclusion

After examining a wide range of both client-side and server-side browser fingerprinting techniques, it is clear that individual techniques vary in effectiveness. However, when all variables are used together to form a unique identifier, it is possible to track 90.5% of users. The fact that browser fingerprinting can perform so well on what appears to be non-personally identifiable information is cause for concern.

Additionally, an increasing number of popular websites have begun using browser fingerprinting techniques to track visitors. While there are steps users can take to limit the amount of browser and operating system information they reveal to sites, the best way to prevent browser fingerprinting would be to increase browser privacy controls to allow users to control what browser information sites can access.

Sample Browser Fingerprint

Client-Side
appCodeName Mozilla
appName Netscape
appVersion 5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36
buildID 20140327113732
doNotTrack yes
language en-US
mimeTypes (JavaScript MIME type object)
onLine true
platform MacIntel
plugins (JavaScript plugin object)
product Gecko
productSub 20030107
userAgent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36