Is this video “missing context,” “transformed,” or “edited”? This effort wants to standardize how we categorize visual misinformation

MediaReview wants to turn the mishmash vocabulary around manipulated photos and video into something structured.

By Joshua Benton @jbenton Jan. 16, 2020, 2:55 p.m.

One thing you learn when you start fact-checking is that, while some things are clearly true or false, there’s a lot that falls in between. What seems like a binary is often more of a messy spectrum: “factually accurate but deeply misleading,” “got the numbers wrong but the broader point is correct,” “there are some studies that support that, but the preponderance of data shows the opposite,” and so on. That’s how you end up with Pinocchio scales and with the need to delineate the merely “Mostly False” from the “Pants on Fire!”

That bespoke fuzziness makes it very hard to turn the output of a fact-check into hard, encodable data. And that, in turn, makes it hard to apply fact-checks to digital platforms at scale.

Now take that problem and — instead of textual statements and politicians’ quotes — apply it to images. If a photo has been ’shopped, was it changed just a little or a lot? Did the editing harmlessly change the white balance or fundamentally alter the reality the photo is supposed to represent? Is a tight crop excluding important context or appropriately directing a viewer’s focus to something?

Then apply all of that to videos. Where’s the line between a deepfake and a cheapfake? Your head starts to hurt.

The unsung heroes of the Internet are the people who develop the standards by which information gets encoded into structured data, and said heroes are now turning their attention to this particular problem, visual misinformation. Last week, schema.org’s Dan Brickley shared a proposed standard for fact-checking visuals called MediaReview.

MediaReview is meant to build on the existing ClaimReview, which is designed primarily for text. ClaimReview is a common format through which any fact-checker can lay out their findings in a way that machines (and thus search engines, social networks, and other platforms) can read. You may not come across the benefits of ClaimReview much in the wild (yet), but it’s what allows fact-checks from different sources to exist in a common database.

ClaimReview was built primarily by the Duke Reporters’ Lab, schema.org, and Google. The current draft of MediaReview is by the Reporters’ Lab’s Bill Adair and Joel Luther and builds on work previously done by The Washington Post.

MediaReview, which is currently in draft form, would define the structure of a visual fact-check, with fields for the identification of the object in question as well as a taxonomy of potential ratings. (The current proposal, for instance, allows images or videos to be “Authentic,” “MissingContext,” “Cropped,” “Transformed,” “Edited,” or “ImageMacro,” each with its own definition.)

We’re proposing a simple approach for fact-checkers to 1) identify whether a video or image has been manipulated or taken out of context and 2) assess any factual claims in that video or image.

Fact-checkers will use two levels of schema to indicate their findings about videos and images. One level, which could be called MediaReview, will indicate if the video or image has been manipulated, taken out of context, etc. The other level is the existing ClaimReview, which uses individual claims.

We envision the new approach to be “nested.” For example, every video would have an entry indicating if it was manipulated, taken out of context, etc. Within a video could be one or more claims that are assessed individually using ClaimReview markup.

Structuring the data like this allows fact-checkers to account for cases in which a piece of media can be manipulated and also contain one or more traditional factual claims.

This approach also creates a valuable new standard for fact-checking unaltered videos and images from political campaigns and other sources.
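The nesting described above can be sketched as a small data model. This is only an illustration, not the draft's actual schema: the six rating values come from the current proposal, but the class names, the field names (`media_url`, `explanation`, `claims`), and the `summarize` helper are all hypothetical, and the URL is a placeholder.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class MediaRating(Enum):
    # The six ratings named in the current draft proposal.
    AUTHENTIC = "Authentic"
    MISSING_CONTEXT = "MissingContext"
    CROPPED = "Cropped"
    TRANSFORMED = "Transformed"
    EDITED = "Edited"
    IMAGE_MACRO = "ImageMacro"


@dataclass
class ClaimAssessment:
    # Inner level: one factual claim found in the media, rated on its own.
    claim: str
    rating: str  # free text, since fact-checkers use their own rating scales


@dataclass
class MediaAssessment:
    # Outer level: was the image or video itself manipulated or decontextualized?
    media_url: str  # hypothetical identifier field
    rating: MediaRating
    explanation: str
    claims: List[ClaimAssessment] = field(default_factory=list)


def summarize(review: MediaAssessment) -> str:
    """One-line summary: the media-level rating plus how many claims it wraps."""
    return f"{review.rating.value}: {len(review.claims)} claim(s) assessed"


# Placeholder values only; an unaltered video can still carry checkable claims.
review = MediaAssessment(
    media_url="https://example.com/video",
    rating=MediaRating.AUTHENTIC,
    explanation="Unaltered.",
)
review.claims.append(ClaimAssessment("An example claim", "False"))
print(summarize(review))  # → Authentic: 1 claim(s) assessed
```

Keeping the media-level rating as a closed enum while leaving claim ratings as free text mirrors the split in the proposal: the new taxonomy is fixed, while claim-level ratings stay with each fact-checker.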

Adair and Luther give two examples of visual fact-checks and how they’d be encoded:

The video of Nancy Pelosi appearing to slur her words during a speech. Fact-checkers found the video was selectively edited and that it made an implicit claim that she was drunk. Here’s how it would be classified in the new and existing schema:

MediaReview – Transformed – The video was selectively edited to give a misleading impression of the speech.

Claim – “Video indicates Nancy Pelosi was drunk while giving a speech” – Rating – False.

Donald Trump’s campaign ad that aired during the World Series. The 30-second ad was not manipulated and contained several claims that could be fact-checked:

MediaReview – Authentic – The video is an authorized ad from a campaign that was not altered.

Claim – “President Trump is cutting illegal immigration in half” – False.

Claim – “President Trump has created 6 million new jobs” – Exaggeration.
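As a rough sketch of what "encoded" could look like in practice, the two examples above can be written out as nested records and serialized to JSON. The ratings, explanations, and claims are taken from the examples; the key names (`mediaRating`, `explanation`, `claims`, `claim`, `rating`) and the `media_review` helper are hypothetical and are not the draft's field names.

```python
import json


def media_review(rating, explanation, claims):
    """Build a nested record: one media-level verdict wrapping per-claim verdicts.

    All key names here are hypothetical, not the draft's field names.
    """
    return {
        "mediaRating": rating,
        "explanation": explanation,
        "claims": [{"claim": text, "rating": verdict} for text, verdict in claims],
    }


pelosi = media_review(
    "Transformed",
    "The video was selectively edited to give a misleading impression of the speech.",
    [("Video indicates Nancy Pelosi was drunk while giving a speech", "False")],
)

trump_ad = media_review(
    "Authentic",
    "The video is an authorized ad from a campaign that was not altered.",
    [
        ("President Trump is cutting illegal immigration in half", "False"),
        ("President Trump has created 6 million new jobs", "Exaggeration"),
    ],
)

# Machine-readable output that a platform could, in principle, ingest.
print(json.dumps([pelosi, trump_ad], indent=2))
```

Note that the media-level rating and the claim-level ratings are independent: the ad is "Authentic" as a piece of media even though one of its claims is rated False.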

They also outlined how 50 other fact-checks might be encoded, along with two more ripped-from-the-headlines examples. (Rep. Paul Gosar photoshopping together Barack Obama and Hassan Rouhani? “Transformed.” That doctored video of Joe Biden being racist? “MissingContext.”)

The Biden video illustrates the challenge in getting different publishers to encode information in a common way, Adair and Luther write:

We identified only three fact-checks tagged with ClaimReview, from Truth or Fiction?, FactCheck.org, and PolitiFact.

Snopes and CNN wrote fact-checks but didn’t use ClaimReview, while other outlets, including Vox, The Associated Press, The New York Times, and The Washington Post (Opinion), covered the story as news or opinion.

And those that used ClaimReview used an inconsistent mishmash of terms to describe the video: “out of context,” “decontextualized,” “deceptively edited,” “distorted,” and “false.” The mishmash illustrates the need for consistency and training that we’ve proposed with MediaReview.
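To see why a fixed vocabulary matters for machines, here is a minimal sketch that checks labels against the six ratings in the current draft. The five observed labels are the ones quoted above; the `off_vocabulary` function and the `DRAFT_RATINGS` name are illustrative assumptions, not part of any proposal.

```python
# The six rating values named in the current draft proposal.
DRAFT_RATINGS = {
    "Authentic",
    "MissingContext",
    "Cropped",
    "Transformed",
    "Edited",
    "ImageMacro",
}


def off_vocabulary(labels):
    """Return the labels that are not one of the draft's six rating values."""
    return [label for label in labels if label not in DRAFT_RATINGS]


# Terms the ClaimReview-tagged fact-checks used for the same Biden video.
observed = [
    "out of context",
    "decontextualized",
    "deceptively edited",
    "distorted",
    "false",
]

print(off_vocabulary(observed))            # all five are free text, so all are returned
print(off_vocabulary(["MissingContext"]))  # → []
```

Five free-text labels for one video cannot be grouped automatically, while a single shared value can be matched exactly.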

MediaReview is still, as I mentioned, in draft form, so there’s still time for you to suggest changes or improvements right in the Google Doc, where you can already see some ideas in the comments. A structured data standard is never really finished, exactly, but the sooner one gains consensus, the sooner it can be implemented by everyone.

Joshua Benton is the senior writer and former director of Nieman Lab. You can reach him via email (joshua_benton@harvard.edu) or Twitter DM (@jbenton).
