
The Inescapable Category of “User”

Increasingly, and in unevenly distributed ways, user data informs the borders we’re permitted to cross, the care we’re eligible to receive, and our access to housing, insurance, and employment. Corporations and governments alike now routinely create, purchase, and exchange user data, and not just in the Global North. News headlines over the past year highlight growing public concern with the everyday encroachments of user-data-as-surveillance: the unasked-for feature of every-few-seconds screenshots by Microsoft’s “Recall”; the dubious data collection practices of the US automotive industry; the stark feasibility of data from menstruation self-tracking apps leading to criminal charges against users; the US executive order blocking the sale of user data to “countries of concern”; and the repurposing of user data as chatbot training fodder, to name just a few. At the same time, user data has become an everyday, ubiquitous phenomenon. Navigating your way to this very page, you likely encountered the option to select how much digital information you consented to share with this website. In short, whether or not we see ourselves as such, we are, at any given moment, users who generate data.

Journalist Taylor Majewski recently proposed retiring the term “user” on the basis that it’s “unspecific enough to refer to just about everyone.” But the fact that “users” have become nearly interchangeable with “people” might not indicate irrelevance so much as a realignment of personhood. In other words, “user”—a term that has already undergone multiple shifts since it first came into use in the context of computers, and did not at first indicate individuals—is both descriptive and constitutive of identity and social relations. Amidst ever-expanding—and increasingly mandated—digitally-mediated modes of participation, “users” are no longer necessarily customers per se; they’re patients, students, welfare recipients, employees, citizens, residents, and applicants, and far more besides. Even nonhuman animals, fetuses, or the deceased can be users. User data, in turn, is potentially any information generated via our digital interactions or digitally recorded about us; it might include anything from text conversations to biometrics to credit history—along with any subsequent algorithmic inferring, interpreting, and predicting (however accurate or inaccurate) of our behaviors and attitudes.

PC Magazine concisely defines user data as “any data a user creates or owns,” but this glosses over the crux of user data’s ethical quagmire: to what extent do any users own their data? It’s overwhelmingly companies, not end-users, who profit from the transaction of user data. Third-party user data brokerage constitutes an entire industry. Instances of users explicitly brokering their own data exist (I specify “explicitly” because many users do so implicitly through use of apps that offer “rewards” in exchange for data), but any given individual user’s data is of insufficient value to make self-brokering viable: a portrait of Deleuzian “dividuality.” Further undercutting the idea that users “own” their own data is the fact that users are, by design, largely unaware of what data is being collected or circulated, which anthropologist Melissa Gregg describes as a “gift that is not given.” But even the companies seeking to require users to either provide their data or else pay for access aren’t reinforcing the idea that users “own” data so much as they are converting privacy from a right into a service.

Crossing the Industry-Academia Divide

In the introduction to the Journal of the Royal Anthropological Institute’s special issue “Towards an Anthropology of Data,” Rachel Douglas-Jones, Antonia Walford, and Nick Seaver (2021) point out that many anthropologists are suddenly finding themselves to be anthropologists of data precisely because of data’s growing entanglements with nearly every aspect of existence. But while the authors acknowledge and cite the contributions of industry-employed anthropologists, they do not, in this overview of the anthropology of data, center the outsized role that anthropology already plays in the sphere of user data. For any anthropology PhD on the job market in the past five or more years, it would be hard to miss the fact that out of all the jobs seeking workers with an advanced sociocultural anthropology degree, the most prominent category outside of academia is user experience (UX) research. UX research is a form of data work that, among other things, entails collecting, analyzing, and interpreting an array of user data with the aim of improving digital products and services from the vantage point of users. In other words, anthropologists happen to already be in the thick of one of the most pressing aspects of data in the twenty-first century.

Notably, anthropologists’ leadership in what’s now known as UX research isn’t new or obscure: Lucy Suchman and other anthropologists helped pioneer it in the late 1970s. Why, then, isn’t the work of UX anthropologists a self-evident cornerstone of the anthropology of data, and why does exiting academia to go into UX research still feel like a kind of unspoken professional exodus?

My attunement to these interlocked questions is rooted in my research background: as an academic ethnographer, I did participant observation research at an AI startup, at which my principal mode of participation was as a UX researcher. In other words, I found myself in the unusual situation of occupying both “realms”—industry and academic anthropology—simultaneously. I believe that tackling the elitism underlying the industry-academia divide is a worthy goal, and also an urgent one. Right now, we’re squandering an obvious pathway of collaboration by which anthropologists can do something about the dramatically shifting global norms of privacy in which we and many of our informants now find ourselves. What I’m proposing is a collaboration between industry and academic anthropologists to build a praxis of user data.

Cultivating a praxis of user data means centering users in all of their complexity; disambiguating user data and the extent of its reach; and making this topic into a more public, global conversation that insists on more perspectives and accounts of data colonialism. Precisely because user data is so vast and far-reaching in scope, the agendas that belong to an anthropological praxis of user data far exceed what I can list. I want to spark this direction of thinking by illuminating one potential direction: leveraging the roles of UX anthropologists as user advocates.

User Data Can Serve an Ethical Purpose, But It’s Still Far from Straightforward

In calling for anthropologists to pursue a praxis of user data, I should clarify: I don’t believe that all instances of user data collection, analysis, or application are abjectly terrible and unethical. In some situations, it can even be ethically necessary. In my ethnographic research at a startup that uses AI chatbots to provide interactive mental health care, I learned that for the psychologists working there, user data is on par with conventional clinical data. For them, collecting user data—namely, storing the conversations that users have with bots—is also key to ensuring the chatbots aren’t inadvertently causing any harm to users.

At my fieldsite, one of my first tasks was to comb through batches of de-identified conversation excerpts (“transcripts”) in order to assess the chatbot’s interactions. Among other things, it was my responsibility to flag any instances where the chatbot failed to recognize a statement from a user indicating a particularly vulnerable state of mind (User: “I hate myself and don’t deserve to live”). One of the psychologists’ recurring fears was that the chatbot might not only miss a user’s statement along these lines but also “validate” it (Chatbot: “It sounds like you have the right attitude, [USERNAME]!”). Being able to periodically monitor a random sampling of interactions was, to my colleagues, a sort of digitally mediated case supervision.

But even though I understood this oversight was necessary, I struggled with feeling like my AI error-flagging presence was invasive. Anthropologist Beth Semel’s designation of “sanctioned eavesdropper” is resonant. To be clear, my fieldsites didn’t sell any user data, and the data I worked with was anonymized: each user was identifiable only by a long sequence of digits, and their username was also scrubbed from any of the dialogue. Yet this was often still very personal data. At any given moment, users of digital care services must weigh their concerns about the surveillance of this data (including the potential for a database hack, inadvertent leak, or the possibility of some future change in company policy or ownership) against the pressing need to talk and receive emotional support. Unsurprisingly, I felt overwhelmingly responsible for and accountable to every single sequentially-numbered user whose data I brushed up against.

User Data as “Shadow Ethnography”

Rather than shaking off my unease, I allowed it to guide me. I came to think of user data as a kind of shadow ethnography. Like ethnographic data, user data effectively consists of empirical data about people arising from time spent observing their behaviors and interactions (even if the “observer” is not necessarily human). The murkiness of the relationship between users and their data arguably mirrors how, even with the most iron-clad IRB protocol and the most self-reflective of good intentions, the question of how fully an informant can ever consent to take part in our work isn’t settled. Again, even de-identified data can be unexpectedly intimate, insofar as it tells a story, as the 2006 AOL search data scandal unequivocally demonstrated. Of course, the corollary is that the stories it tells us, whether as prediction or interpretation, can be inaccurate or outright untrue. In