The third principle of the American Library Association Code of Ethics is that “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.” That sentence leaves no room for misunderstanding.
On the whole, the history of libraries is one of service and safeguards against censorship, inequity, and privacy violations. Despite librarians’ long-held professional ethics, the reality is that the advent of digital data brought privacy concerns from day one, at least for the few who were paying attention. The one-on-one inviolable relationship between library worker and library customer fundamentally changed with digital data. What used to be a sacrosanct, trust-based bond between two parties turned into a cloudy series of agreements and contracts that, frankly, were not fully understood by the libraries that entered into them or the customers who clicked the “I agree to the terms of service” button. Suddenly, third parties like integrated library systems, scholarly databases, and eBook vendors had access to library patron databases and all they held: personally identifiable information like Social Security numbers, birth dates, and driver’s license numbers, along with the standard name, address, and other contact information.
Libraries then entered a race to catch up to the technology with our policies and practices, but the tech moved faster than we did. Inconsistently, but steadily, libraries started removing non-essential data from our patron databases and reevaluating contracts with digital providers. Concurrently, companies that libraries did and do business with used cookies to track the activities of library users within their products, then used that information to market services to those people. Companies made arrangements with fourth parties to connect library customer activity in a product with their personal, non-library use. Companies sold aggregate data about library customer activities. And all the while libraries and librarians have been unaware of, or unwilling to confront, the violations of one of our core values. The sad truth is that most libraries have no idea what vendors are collecting on our customers or simply do not care enough to prioritize customer privacy over concepts like ease of access, provision of digital content, and user demand.
People have developed an intensely complicated relationship with technology and privacy. On one hand, technology and digital data have made it easier to provide personalized online experiences. On the other hand, people are often surprised to discover how much of their privacy they have traded for those personalized experiences. How do we, as libraries, find that balance between customer service and privacy?
Enter the world of big data. Companies read the library marketplace and saw a space for data analytics in library services with a noble goal in mind: to create enough trend data to support data-based decision-making. With the tools at our disposal we can now, with a few clicks and search terms, bring up a map that will, in essence, show us that someone at a particular address checked out a particular book in the last week. I don’t want that level of information to be available to me, to a third-party vendor, or to anyone else. We should all be disturbed by the level of specificity and personally identifiable information used by the big data companies in the library marketplace.
I wither as I see more and more libraries using data collection that would have been unheard of in past decades to track customer usage, analyze trends in use, create fancy-looking reports for their parent agencies, and store and share data in ways that are increasingly hackable and shareable.
This is not a problem that is solely ours. Every organization or individual that collects data about the activities and profiles of people is facing this same conundrum. This seems like a natural place for libraries to take the lead in big data and user privacy. To draw a line in the sand and say no further. To date, we have not done that collectively or individually (for the most part). One of my greatest professional regrets is prioritizing what my customers and stakeholders say they want over my own understanding of the stringent privacy and confidentiality practices that I should be honoring as a librarian. I, like most of my peers, give the community what they want at the expense of their ability to control their personal data in an informed and conscious way.
Just because technology makes something possible doesn’t mean it’s something we should do. In most cases, decision makers and customers alike are unaware of the potential privacy issues with their data until that data is exploited by others. We are one big library data breach away from this issue becoming a nationwide kitchen-table conversation.
So how do we balance the potential of big data with privacy and confidentiality? A scorched-earth policy seems the most logical to my mind: libraries would cease keeping any non-essential data and refuse to do business with any company that does otherwise. But I am also a realist and know that the thousands of libraries in the United States alone are not going to be able to come to agreement on that level of stringency.
I do think, however, there are things that all libraries can do to better uphold our values. Make data policies and practices transparent. Give any collected data a clear purpose. Ensure data quality, and delete data when it is no longer necessary. Renegotiate contracts to ensure greater transparency, authentication, back-up, replication protections, and security protocols. Provide clear information on how customer data is going to be used. Provide a mechanism for an individual to review their personal data on our systems.
And above all, keep repeating that mantra—the third principle of the American Library Association Code of Ethics—and remember that privacy protection is part and parcel of what we signed up to do as librarians.