The Inescapable Category of “User”
Increasingly, and in unevenly distributed ways, user data informs the borders we’re permitted to cross, the care we’re eligible to receive, and our access to housing, insurance, and employment. Corporations and governments alike now routinely create, purchase, and exchange user data, and not just in the Global North. News headlines over the past year highlight growing public concern with the everyday encroachments of user-data-as-surveillance: the unasked-for feature of every-few-seconds screenshots by Microsoft’s “Recall”; the dubious data collection practices of the US automotive industry; the stark feasibility of data from menstruation self-tracking apps leading to criminal charges against users; the US executive order blocking the sale of user data to “countries of concern”; and the repurposing of user data as chatbot training fodder, to name just a few. At the same time, user data has become an everyday, ubiquitous phenomenon. Navigating your way to this very page, you likely encountered the option to select how much digital information you consented to share with this website. In short, whether or not we see ourselves as such, we are, at any given moment, users who generate data.
Journalist Taylor Majewski recently proposed retiring the term “user” on the basis that it’s “unspecific enough to refer to just about everyone.” But the fact that “users” have become nearly interchangeable with “people” might not indicate irrelevance so much as a realignment of personhood. In other words, “user”—a term that has already undergone multiple shifts since it first came into use in the context of computers, and did not at first indicate individuals—is both descriptive and constitutive of identity and social relations. Amidst ever-expanding—and increasingly mandated—digitally-mediated modes of participation, “users” are no longer necessarily customers per se; they’re patients, students, welfare recipients, employees, citizens, residents, and applicants, and far more besides. Even nonhuman animals, fetuses, or the deceased can be users. User data, in turn, is potentially any information generated via our digital interactions or digitally recorded about us; it might include anything from text conversations to biometrics to credit history—along with any subsequent algorithmic inferring, interpreting, and predicting (however accurate or inaccurate) of our behaviors and attitudes.
PC Magazine concisely defines user data as “any data a user creates or owns,” but this sidesteps the crux of user data’s ethical quagmire: to what extent do any users own their data? It’s overwhelmingly companies, not end-users, who profit from the transaction of user data. Third-party user data brokerage constitutes an entire industry. Instances of users explicitly brokering their own data exist (I specify “explicitly” because many users do so implicitly through use of apps that offer “rewards” in exchange for data), but any given individual user’s data is of insufficient value to make self-brokering viable, a portrait of Deleuzian “dividuality.” Further undercutting the idea that users “own” their own data is the fact that users are, by design, largely unaware of what data is being collected or circulated, what anthropologist Melissa Gregg describes as a “gift that is not given.” But even the companies seeking to require users to either provide their data or else pay for access aren’t reinforcing the idea that users “own” data so much as they are converting privacy from a right into a service.
Crossing the Industry-Academia Divide
In the introduction to the Journal of the Royal Anthropological Institute’s special issue “Towards an Anthropology of Data,” Rachel Douglas-Jones, Antonia Walford, and Nick Seaver (2021) point out that many anthropologists are suddenly finding themselves to be anthropologists of data precisely because of data’s growing entanglements with nearly every aspect of existence. But while the authors acknowledge and cite the contributions of industry-employed anthropologists, they do not, in this overview of the anthropology of data, center the outsized role that anthropology already plays in the sphere of user data. For any anthropology PhD on the job market in the past five or more years, it would be hard to miss the fact that out of all the jobs seeking workers with an advanced sociocultural anthropology degree, the most prominent category outside of academia is user experience (UX) research. UX research is a form of data work that, among other things, entails collecting, analyzing, and interpreting an array of user data with the aim of improving digital products and services from the vantage point of users. In other words, anthropologists happen to already be in the thick of one of the most pressing aspects of data in the twenty-first century.
Notably, anthropologists’ leadership in what’s now known as UX research isn’t new or obscure: Lucy Suchman and other anthropologists helped pioneer it in the late 1970s. In which case, why isn’t the work of UX anthropologists a self-evident cornerstone of the anthropology of data, and why does exiting academia to go into UX research still feel like a kind of unspoken professional exodus?
My attunement to these interlocked questions is rooted in my research background: as an academic ethnographer, I did participant observation research at an AI startup, at which my principal mode of participation was as a UX researcher. In other words, I found myself in the unusual situation of occupying both “realms”—industry and academic anthropology—simultaneously. I believe that tackling the elitism underlying the industry-academia divide is a worthy goal, and also an urgent one. Right now, we’re squandering an obvious pathway of collaboration by which anthropologists can do something about the dramatically shifting global norms of privacy in which we and many of our informants now find ourselves. What I’m proposing is a collaboration between industry and academic anthropologists to build a praxis of user data.
Cultivating a praxis of user data means centering users in all of their complexity; disambiguating user data and the extent of its reach; and making this topic into a more public, global conversation that insists on more perspectives and accounts of data colonialism. Precisely because user data is so vast and far-reaching in scope, the agendas that belong to an anthropological praxis of user data far exceed what I can list. I want to spark this direction of thinking by illuminating one potential direction: leveraging the roles of UX anthropologists as user advocates.
User Data Can Serve an Ethical Purpose, But It’s Still Far from Straightforward
In calling for anthropologists to pursue a praxis of user data, I should clarify: I don’t believe that all instances of user data collection, analysis, or application are abjectly terrible and unethical. In some situations, it can even be ethically necessary. In my ethnographic research at a startup that uses AI chatbots to provide interactive mental health care, I learned that for the psychologists working there, user data is on par with conventional clinical data. For them, collecting user data—namely, storing the conversations that users have with bots—is also key to ensuring the chatbots aren’t inadvertently causing any harm to users.
At my fieldsite, one of my first tasks was to comb through batches of de-identified conversation excerpts (“transcripts”) in order to assess the chatbot’s interactions. Among other things, it was my responsibility to flag any instances where the chatbot failed to recognize a statement from a user indicating a particularly vulnerable state of mind (User: “I hate myself and don’t deserve to live”). One of the psychologists’ recurring fears was that the chatbot might not only miss a user’s statement along these lines but also “validate” it (Chatbot: “It sounds like you have the right attitude, [USERNAME]!”). Being able to periodically monitor a random sampling of interactions was, to my colleagues, a sort of digitally mediated case supervision.
But even though I understood this oversight was necessary, I struggled with feeling like my AI error-flagging presence was invasive. Anthropologist Beth Semel’s designation of “sanctioned eavesdropper” is resonant. To be clear, my fieldsite didn’t sell any user data, and the data I worked with was anonymized: each user was identifiable only by a long sequence of digits, and their username was also scrubbed from any of the dialogue. Yet this was often still very personal data. At any given moment, users of digital care services must weigh their concerns about the surveillance of this data (including the potential for a database hack, inadvertent leak, or the possibility of some future change in company policy or ownership) against the pressing need to talk and receive emotional support. Unsurprisingly, I felt overwhelmingly responsible for and accountable to every single sequentially numbered user whose data I brushed up against.
User Data as “Shadow Ethnography”
Rather than shaking off my unease, I allowed it to guide me. I came to think of user data as a kind of shadow ethnography. Like ethnographic data, user data effectively consists of empirical data about people arising from time spent observing their behaviors and interactions (even if the “observer” is not necessarily human). The murkiness of the relationship between users and their data arguably mirrors how, even with the most ironclad IRB protocol and the most self-reflective of good intentions, the question of to what degree an informant can ever fully consent to take part in our work isn’t settled. Again, even de-identified data can be, as the 2006 AOL search data scandal unequivocally demonstrated, unexpectedly intimate, insofar as it tells a story. Of course, the corollary is that the stories it tells us, whether as prediction or interpretation, can be inaccurate or outright untrue. In “Google Maps Hack,” artist Simon Weckert pulled around a wagon filled with smartphones to create real-time traffic jams on Google Maps, showing just how fragile such interpretation is: it can be entirely false, while still being “real” in terms of its effects on our lives.
Because user data tells stories, and recognizing the ethnographic significance of sociologist Deborah Lupton’s concept of “data selves” (the digital representations of us that are constituted through the aggregation of data about us, which both reflect and shape our identities and behaviors), I want to complicate the understanding of user data by considering how it can be both an instrument of surveillance capitalism and ethnographic material. (Note that with this framing, I’m not whitewashing the extractive role that ethnographic data has served.) Surveillance capitalism, per Shoshana Zuboff, “aims to impose a new collective order based on total certainty.” But when taken up by anthropologists as ethnographic material, user data produces meaningful uncertainty.
User data became an ethnographic source in my research that illuminated much of my understanding of AI-human relationality. But I questioned whether or not to draw on this data at all. Ultimately I decided that, provided details were changed, I had an obligation, as an ethnographer of AI in mental health care, to not erase the presence of the data or the power of someone-ness behind it. To be an anthropologist is to do this constant work of “wrestling-with” our ethnographic data to protect, understand, and accurately represent our informants.
I believe that most UX anthropologists likewise understand their role as one of being advocates for users, even if NDAs limit their ability to openly discuss this. In my own capacity as a UX researcher, I felt it imperative to advocate for users by helping my startup colleagues recognize how, ultimately, the interpretability of transcript data remains non-definitive. For instance, after examining multiple conversations in which users expressed suicidal ideation, I recommended pulling back from presuming that suicidal ideation encased in a string of expletives (“Fuck you robot. I wanna die. Shit shit shit”) was not “really” suicidal ideation. It’s not that anyone was dismissing these users in terms of care: either way, the chatbot was programmed to deploy a referral to a crisis hotline. But I wanted my colleagues to reconsider how users “shit talking” to the chatbot might be one of the unique, otherwise-unacceptable ways in which they leverage having a nonhuman communication partner when seeking emotional support. I can’t render this observation in terms of its productivity, whether dollars saved, accounts won, or so forth. But as an ethnographer and data worker, it felt significant to center understanding users without that centering being contingent on already having a way to apply it.
I’m not suggesting that ethnography is “big data’s humanizing foil: an anthropological superpower with the power to halt surveillance capitalism in its tracks”—though in fairness, I am positioning ethnography as a complement. My point is that when user data is treated as ethnographic material, its end is allowed to be something other than prediction, evaluated by its accuracy or inaccuracy. If users are people, anthropologists can and do open up a world of nuance and uncertainty around use.
User Data Is a Frontier
Just as decolonization is not a metaphor, neither are frontiers, and user data is indeed a frontier. Within it, usership is increasingly both a mode and condition of social, economic, political, and legal participation/exclusion, to the point that it’s no longer possible to work towards any category of human rights or wellbeing without taking datafication into account. Anthropology’s disciplinary presence in UX research (and the fact that this presence is growing even as academic anthropology departments struggle to maintain funding and independence) might seem like a trivial detail in the story of what anthropology is, but I think it’s an indicator of which our discipline should take note. As corporations seek anthropologists to be their data workers, how can we ensure that anthropologists, rather than corporations, ultimately determine our role? Inviting and better integrating industry anthropologists into academic-centered spaces is one way to both increase transparency in user data practices and foster alignment on concrete goals for their reform.
Our discipline as we know it today arguably arose from the determination of a collective of anthropologists to remake it into something not principally in the service of colonial governments, demonstrating that anthropology has long been defined by what we do with what we know. With that in mind: what can we do with what we know about the still-in-flux norms of user data?