I like to think of David Weinberger’s book titles over the past two decades as a sort of tour of the Internet’s metastasizing complexity. First, in 2003, Small Pieces Loosely Joined. Then, in 2007, Everything Is Miscellaneous. In 2012, Too Big to Know. And last year, Everyday Chaos. They move, roughly, from connection to organization to information to, well, chaos, which sounds like the path I remember the Internet taking over that stretch of time.
I think of his work, especially Too Big to Know, whenever I hear someone talk about “Facebook” or “Twitter” or “YouTube” as if they were each a single unitary thing. Or whenever people assume that their News Feed must somehow be indicative (or at least suggestive) of what a billion other people’s News Feeds look like. The experience of social platforms is profoundly fractured, and the only people with any sort of god-like insight into the beast are the ones who work at the companies themselves, with access to the uncountable tendrils of personalization that touch every user interaction. That makes it very hard, in practice, to make defendable statements that begin “Facebook does x to its users” or “YouTube leads users to do y.” From the outside, they’re just too big to know.
Into that problem walks The Markup, the well-funded journalism startup that specializes in exposing those tendrils. “Our approach is scientific: We build datasets from scratch, bulletproof our reporting, and show our work,” its manifesto reads. And now it’s using “The Markup Method” to try to get its arms around the tech giants.
On Friday, it announced a new initiative called the Citizen Browser Project — “an initiative designed to measure how disinformation travels across social media platforms over time.”
At the center of The Citizen Browser Project is a custom web browser designed by The Markup to audit the algorithms that social media platforms use to determine what information they serve their users, what news and narratives are amplified or suppressed, and which online communities those users are encouraged to join. Initially, the browser will be implemented to glean data from Facebook and YouTube.
A real nerd knows: Why make a browser plugin when you can make a browser?
A nationally representative panel of 1,200 people will be paid to install the custom web browser on their desktops, which allows them to share real-time data directly from their Facebook and YouTube accounts with The Markup. Data collected from this panel will form statistically valid samples of the American population across age, race, gender, geography, and political affiliation, which will lead to important insights about how Facebook’s and YouTube’s algorithms operate. To protect the panel’s privacy, The Markup will remove personally identifiable information collected by the panel and discard it, only using the remaining redacted data in its analyses.
“Social media platforms are the broadcasting networks of the 21st century,” said The Markup’s editor-in-chief, Julia Angwin. “They dictate what news the public consumes with black-box algorithms designed to maximize profits at the expense of truth and transparency. The Citizen Browser Project is a powerful accountability check on that system that can puncture the filter bubble and point the public toward a more free and democratic discourse.”
To put it in terms that seem appropriate to the moment: The Citizen Browser Project is like a top-of-the-line poll that measures public opinion on a particular issue. It’s necessarily an imperfect instrument, with margins of error growing as each demographic or psychographic subset is sliced more thinly. But it’s still a lot more useful than an endless string of anecdotes, the way so much discussion about social media goes, which I think, in this metaphor, would be a reporter counting yard signs to measure voters’ excitement for a candidate.
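To make the slicing problem concrete, here is a rough back-of-the-envelope sketch of how a poll-style margin of error widens as the sample shrinks. It assumes simple random sampling, a 95 percent confidence level, and the worst-case 50/50 split; none of that is The Markup's stated methodology, and the subgroup sizes of 300 and 100 are hypothetical. Only the 1,200 figure comes from the announcement.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: z-score for the confidence level (1.96 is roughly 95 percent).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 1,200 is the reported panel size; 300 and 100 are hypothetical subgroups.
for n in (1200, 300, 100):
    print(f"n={n:>5}: +/- {margin_of_error(n) * 100:.1f} points")
```

Under those assumptions the full panel lands a little under three points either way, while a 100-person slice is closer to ten, which is why questions about narrow subgroups will come with much wider error bars.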
(As Weinberger put it in Too Big to Know: “The massive increase in the amount of information available makes it easier than ever for us to go wrong. We have so many facts at such ready disposal that they lose their ability to nail conclusions down, because there are always other facts supporting other interpretations.”)
Citizen Browser is our most ambitious project ever — and that is really saying something for us at @themarkup.
Here’s my newsletter on how we plan to audit the algorithms of disinformation: https://t.co/XV5gOIhXKq
— Julia Angwin (@JuliaAngwin) October 17, 2020
This is the sort of thing that The Markup is designed for and best at. There are not many other news organizations that have both the technical skill to do this sort of analysis and the funding necessary to convince a nationally representative sample of Americans to install some Frankenbrowser they’ve never heard of.
(The money’s important. Previous attempts to do this sort of work, like NYU’s Ad Observatory, have relied on volunteers installing a browser plugin. But those volunteers tend to be, well, the kind of people who sign up for online citizen data-collection projects — educated coastal liberals — skewing the results. To build a truly representative sample, you’ve got to pay people.)
What The Markup doesn’t have — being a small and new nonprofit news organization working a very particular niche — is a huge organic audience to show all of its unique work. And thus: “The Markup has teamed up with The New York Times to analyze the data and report on the project’s findings together.”
Here’s what Angwin said about the project in a Zoom talk Friday:
The problem we face right now is that there’s no oversight of the algorithmic gatekeepers. The thing you could say about the old white men who ran the news business of the past was you did know what their decisions were. Their decisions were displayed on the front page in the newspaper — that was what they felt was the most important. Or they were the top of the news hour on the six o’clock news. So we did know what the outcome of their decisions was.
But with algorithmic gatekeepers, every one of us sees a different News Feed. And so there’s not really a way to say: What are they choosing to amplify? And what are they choosing not to amplify? And this, I think, is a fundamental issue for our democracy. Because if we can’t see what they’re saying, we can’t hold them accountable for those things.
So I think in this context, journalism’s role is changing, right? We still have to do the work, the bread and butter work of witness, and we always will. But we have to spend more time adjusting to this new reality, which is: We have to spend time doing what I would call forensics. Verifying and authenticating witness accounts and digital data trails. And also auditing: I think we need to hold these algorithmic gatekeepers accountable for what narratives they choose to amplify. Because right now, they can tell us anything about what their decisions were, and we don’t have a good way to hold them accountable.
“So we are hoping that when we get this tool going in the next few weeks that we will be able to answer the types of questions that really haven’t been answerable so far,” Angwin said. “For instance, what are conservative women seeing at the top of their feed? What kind of groups are being recommended to black men? We’re going to build real-time dashboards. And we’re going to collect the ad targeting information.”
If this sounds like the most wonderful thing in the world to you, The Markup is currently hiring a data reporter to work on the Citizen Browser Project.
Otherwise…sit tight until there are some results to share. As Weinberger put it in Too Big to Know: “Knowledge is becoming a property of the network, rather than of individuals who know things, of objects that contain knowledge, and of the traditional institutions that facilitate knowledge…We will argue about whether our new knowledge will bring us closer to the truth, as I think it overall does. But one thing seems clear: Networked knowledge brings us closer to the truth about knowledge.”
Auditing the Algorithms.
A quick thread on — @themarkup's Citizen Browser Project.
— Kunal Mishra. (@kunaaalllll) October 17, 2020
When it comes to measuring the sources, impact and appeal of disinformation narratives, self-reporting falls short. Auditing how users actually behave online and where info comes from is vital. I hope what @themarkup is doing here can be replicated https://t.co/xKMB7tK3ZL pic.twitter.com/TCV2LHnQ72
— Christy Quirk (@CEQuirk) October 17, 2020
A custom web browser designed to audit the algorithms social media platforms use to determine what information they serve their users, what news and narratives are amplified or suppressed, and which online communities those users are encouraged to join. https://t.co/cXHeLZD8a7
— Lisa Tozzi (@lisatozzi) October 17, 2020
What political stories are being pushed to Black voters in swing states?
What groups are recommended to young voters in Texas?
Exciting new tool from @themarkup that audits the algorithms that decide what we see on our social media feeds. https://t.co/uNyiiL4vWA
— Civic Signals 🏞📲 (@civic_signals) October 17, 2020
some cool shit right here https://t.co/XflSSf3j89
— Craig Silverman (@CraigSilverman) October 16, 2020