We commonly hear that ‘Big Brother’ is watching you in the context of digital and analog surveillance such as Facebook advertising, street cameras, E-ZPass highway tracking or content sniffing by internet service providers. But it’s not only Big Brother; there are a lot of smaller, less obvious “Little Brothers” as well, that wittingly or unwittingly funnel data, including personally identifiable information (PII), to massive databases. Unfortunately, libraries (and related organizations) are a part of this surveillance environment. In the following we’ll break down two example library organization websites, both from the American Library Association (ALA): the Choose Privacy Week website of ALA’s Office for Intellectual Freedom (ChoosePrivacyWeek.org) and ALA’s umbrella site (ala.org).
Before we dive too deeply, let’s review some basics about the data streams generated by a visit to a website. When you visit a website, your browser software – Chrome, Firefox, Safari, et cetera – sends a request containing your IP address, the address of the web page you want, and a whole bunch of other information. If the website supports “SSL”, most of that information is encrypted. If it does not support “SSL,” it is not encrypted and network providers are free to see everything sent or received.
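The encrypted/unencrypted distinction is visible in the address itself. As a minimal sketch (the function name `is_encrypted` and the example.org addresses are our own illustrations, not anything used by the sites discussed here), this is how a script might decide whether a URL would be fetched over an encrypted connection:

```python
from urllib.parse import urlsplit


def is_encrypted(url: str) -> bool:
    """Return True only when the URL's scheme is https."""
    # urlsplit normalizes the scheme to lowercase, so "HTTPS://" also matches.
    return urlsplit(url).scheme == "https"


# Hypothetical addresses, used only to show both outcomes.
for url in ("https://example.org/", "http://example.org/"):
    label = "encrypted in transit" if is_encrypted(url) else "NOT encrypted"
    print(f"{url}: {label}")
```

Note that this only looks at the scheme. It says nothing about how well the encryption is configured, or about what the page does with your data once it arrives.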
Without SSL, bad actors who share the network can insert code or other content into the web page you receive. The easiest way to see if a site has a valid SSL certificate is to look at the protocol identifier of a URL. If it’s ‘HTTPS’, that traffic is encrypted; if it’s ‘HTTP’, DO NOT SEND any personally identifiable information (PII), as there is no guarantee that traffic is being protected. If you’re curious about the quality of a site’s encryption, you can check its “Qualys report”, offered by SSL Labs, which checks the website’s configuration and assigns a letter grade. ALA.org gets a B; ChoosePrivacyWeek.org gets an A. The good news is that even ALA.org’s B is an acceptable grade. The bad news is that the B grade is for “https://www.ala.org/”, whose response is reproduced here in its entirety:
You don’t have to check SSL Labs to see the difference. You can recognize ChoosePrivacyWeek.org as a “secure” connection by looking for the lock badge in your browser; click on that badge for more info. Here’s what this looks like in Chrome:
Don’t assume that your privacy is protected just because a site has a lock badge, because the web is designed to spew data about you in many ways. Remember that “whole bunch of other information” we glossed over above? Included in that “other information” are “cookies”, which allow web servers to keep track of your browsing session. It’s almost impossible to use the web these days without sending these cookies. But many websites include third-party services that track your session as well. These are more insidious, because they give you an identifier that joins your activity across multiple websites. The combination of data from thousands of websites often gives away your identity, which then can be used in ways you have no control over.
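To make the “joining” concrete, here is a toy sketch of what a third-party service can do with the identifier it receives from every site that embeds it. Everything in it is hypothetical: the identifiers, the `.example` site names and the function `build_profiles` are made up for illustration, and real trackers are far more elaborate.

```python
from collections import defaultdict


def build_profiles(observed):
    """Group embedding sites by the third-party identifier sent with each request."""
    profiles = defaultdict(list)
    for tracker_id, site in observed:
        # Record each site once per identifier, in the order first seen.
        if site not in profiles[tracker_id]:
            profiles[tracker_id].append(site)
    return dict(profiles)


# Hypothetical requests as seen by ONE third-party service:
# (identifier from its cookie, site that embedded the service)
observed = [
    ("id-123", "library.example"),
    ("id-123", "news.example"),
    ("id-456", "library.example"),
    ("id-123", "health.example"),
]

for tracker_id, sites in build_profiles(observed).items():
    print(tracker_id, "visited", sites)
```

No single site in this example knows much about “id-123”, but the service that sees all three visits does.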
Privacy Badger is a browser extension created by the Electronic Frontier Foundation (EFF) which monitors the embedded code in websites that may be tracking your web traffic. You can see a side-by-side comparison of ALA.org on the left and ChoosePrivacyWeek.org on the right:
The 2 potential trackers identified by Privacy Badger on ChoosePrivacyWeek.org are third-party services: fonts from Google and an embedded video player from Vimeo. These are possibly tracking users but are not optimized to do so. The 4 trackers on ALA.org merit a closer look. They’re all from Google; the ones of concern are placed by Google Analytics. One of us has written about how Google Analytics can be configured to respect user privacy, if you trust Google’s assurances. To its credit, ALA.org has turned on the “anonymizeIP” setting, which in theory obscures users’ identities. But it also has “demographics” turned on, which causes an advertising (cross-domain) cookie to be set for users of ALA.org, and Google’s advertising arm is free to use ALA.org user data to target advertising (which is how Google makes money). Privacy Badger allows you to disable any or all of these trackers and potential trackers (though doing so can break some websites).
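As for what IP anonymization amounts to: as we understand Google’s public description, the setting zeroes the last octet of an IPv4 address and the last 80 bits of an IPv6 address before the address is stored. The sketch below reproduces that kind of masking with Python’s standard `ipaddress` module. It is our own illustration, not Google’s code, and the function name `mask_ip` and the documentation-range addresses are assumptions made for the example.

```python
import ipaddress


def mask_ip(address: str) -> str:
    """Zero the low-order bits of an IP address.

    Keeps the first 24 bits of IPv4 (drops the last octet) and the
    first 48 bits of IPv6 (drops the last 80 bits).
    """
    ip = ipaddress.ip_address(address)
    keep_bits = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(network.network_address)


# Addresses from ranges reserved for documentation, not real visitors.
print(mask_ip("203.0.113.77"))
print(mask_ip("2001:db8:1234:5678::1"))
```

This also shows why we say “in theory”: a masked IPv4 address still narrows a visitor down to a block of 256 addresses, and it does nothing about cookies.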
Apart from giving data to third parties, any organization must have internal policies and protocols for handling the reams of data generated by website users. It’s easy to forget that server logs may grow to contain hundreds of gigabytes or more of data that can be traced back to individual users. We asked ALA about their log retention policies with privacy in mind. ALA was kind enough to respond:
“We always support privacy, so internal meetings are occurring to determine how to make sure that we comply with all applicable laws while always protecting member/customer data from exposure. Currently, ALA is taking a serious look at collection and retention considering the General Data Protection Regulation (GDPR) EU 2016/679, a European Union law on data protection and privacy for all individuals within the EU. It applies to all sites/businesses that collect personal data regardless of location.”
The ChoosePrivacyWeek.org website has a privacy statement that’s more emphatic:
“We will collect no personal information about you when you visit our website unless you choose to provide that information to us.”
The lack of tracking on the site is aligned with this statement, but we’d still like to see a statement about log retention. ChoosePrivacyWeek.org is hosted on a Dreamhost WordPress server, and usage log files at Dreamhost were recently sought by the Department of Justice in the Disruptj20.org case.
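A log retention policy does not have to be complicated. As a rough sketch of the idea, assuming log entries are available as (timestamp, line) pairs and assuming a 30-day window (a number we picked for illustration, not one taken from any real policy), pruning could look like this:

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # hypothetical window, not a figure from any real policy


def prune(entries, now, retention_days=RETENTION_DAYS):
    """Keep only (timestamp, line) entries no older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [(ts, line) for ts, line in entries if ts >= cutoff]


# Made-up log entries and a fixed "now" so the example is repeatable.
now = datetime(2018, 5, 1, tzinfo=timezone.utc)
entries = [
    (datetime(2018, 3, 1, tzinfo=timezone.utc), "GET /old-page"),
    (datetime(2018, 4, 20, tzinfo=timezone.utc), "GET /recent-page"),
]

for ts, line in prune(entries, now):
    print(ts.date(), line)
```

Logs that no longer exist cannot be subpoenaed, leaked or mined, which is why the retention window is worth stating publicly.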
Organizations express their priorities and values in their actions. ALA’s stance toward implementing HTTPS will be familiar to many librarians; limited IT resources get deployed according to competing priorities. In the case of ALA, a sorely needed website redesign was deemed more important to the organization than providing incremental security and privacy to website users by implementing HTTPS. Similarly, the demographic information provided by Google’s advertising tracker was valued more than member privacy (assuming ALA is aware of the trade-off). The ChoosePrivacyWeek.org website has a different set of values and objectives, and thus has made some different choices.1 2
In implementing their websites and services, libraries make many choices that affect user privacy. We want librarians, library administrators, library technology staff and library vendors to be aware of the choices they are making, and aware of the values they are expressing on behalf of an organization or of a library. We hope that they will CHOOSE PRIVACY.
T.J. Lamanna is the chair of the New Jersey Library Association Intellectual Freedom Committee and the emerging technologies librarian at the Cherry Hill Public Library. His time is spent discussing both practical and theoretical ways of protecting librarians and their patrons in a world of social engineering, hacking and malicious states. Whether it’s email, browsing history or texts, he educates the public on what they can do to keep their communications private.