It’s people! Meet Soylent, the crowdsourced copy editor

By Justin Ellis @JustinNXT Nov. 3, 2010, 10 a.m.

The phrase “on-demand human computation” has a sinister tinge to it, if only because the idea of sucking the brain power out of a group of people is generally frowned upon. And yet, if you call it “crowdsourcing” everything sounds so much friendlier!

But calling Soylent “crowdsourced copy-editing” isn’t quite fair, since the system performs the type of jobs that are somewhere in the gray area between man and machine. More than a spell check, not quite the nightside copy editor versed in AP style, Soylent really is on-demand computation. It’s what all word processors need, the “Can you take a look at this?” button with a small workforce of people at your disposal.

Soylent is an add-in for Microsoft Word that uses Mechanical Turk as a distributed copy-editing system to perform tasks like proofreading and text-shortening, as well as a type of specialized edit its developers call “The Human Macro.” Currently in closed beta, Soylent was created by compsci students at MIT, Berkeley, and the University of Michigan.

For those unfamiliar, Mechanical Turk is an Amazon service that makes it easier for small tasks (and the money to pay for them) to be distributed among a group of humans called Turkers. While savvy writers could already use MTurk to edit their work, the team at Soylent believes their system can produce better and more efficient results than would a writer working alone.

“The idea of Soylent is, what if we could embed human knowledge in the word processor?” MIT’s Michael Bernstein, the lead researcher on Soylent, told me.

That sounds technical, but as Bernstein explains, we all call on friends for help when writing. Research paper, essay, email, story, or blog post — most people rely on a second pair of eyeballs for help at least some of the time. And one thing Mechanical Turk has to offer is a lot of eyeballs.

Soylent’s three current features are called Shortn, Crowdproof, and the Human Macro:

— Shortn: Ever write 1,700 words and blow right past your 1,200 word count? Shortn lets writers submit passages of text to MTurk for trimming. They can determine how much they want to cut with a handy slider tool.

— Crowdproof: A superpowered, sophisticated spell, grammar and style check that provides suggestions as well as explanations why your choices are wrong.

— The Human Macro: For more complicated changes — something like “change all verbs to past tense” — the Human Macro is, as Bernstein says, programming-as-craigslist-ad. The writer describes the changes she wants (capitalization of proper names, altering verb tense, annotating references with Creative Commons photos) in a request form, which humans then act on.

Bernstein argues that Soylent’s cold, detached eye is just what some writing needs. “It’s really hard to kill your own babies in your writing,” Bernstein said. “To be honest, another motivation for me is that it’s very time consuming to go and snip words and cut things from paragraphs an hour before deadline.”

But to writers already nervous about those babies being disappeared on the copy desk, handing over their copy to the faceless masses might not sound like a solution. In their research, Bernstein and his colleagues identified “lazy” and “overeager” individual Turkers, with the lazy ones doing the minimal amount of work and the overeager making wholesale changes. Bernstein said the distributed editing process behind Soylent eliminates this problem because no one Turker is working with whole passages of a document; the work is split among many.

Some in news circles are already experimenting with Mechanical Turk; ProPublica used it to identify companies getting stimulus dollars for the Recovery Tracker project. (Here at the Lab, we use it for the long transcripts we sometimes run of video or audio interviews.) MTurk could be used for any number of tasks that call for on-demand labor. But what makes Soylent different from using MTurk directly is a programming pattern Bernstein and his colleagues created called Find-Fix-Verify, which disseminates tasks across a large group of workers. The only thing required of writers is an Amazon account to pay Turkers; Soylent sets the payment rates.

Instead of one Turker reading over an entire page or paragraph, Soylent asks a group of workers to find areas that need fixing and make corrections. Those fixes are then filtered by other Turkers for inaccuracies, which produces a set of recommendations or an edited paragraph for the writer. Depending on the job and the document, Soylent usually took around 40 minutes to complete a task.
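For readers who think in code, the three stages described above can be sketched roughly as follows. This is an illustration only, not Soylent's actual code: the function names, the agreement threshold, and the majority-vote rule are all assumptions, and the "workers" are plain Python functions standing in for Turkers. It also assumes the flagged spans do not overlap.

```python
from collections import Counter
from typing import Callable, List, Tuple

# All names below are hypothetical; workers are simulated by functions.
Span = Tuple[int, int]                  # (start, end) character offsets
Finder = Callable[[str], List[Span]]    # flags spans that need work
Fixer = Callable[[str], str]            # proposes a rewrite of one span
Verifier = Callable[[str, str], bool]   # accepts or rejects a rewrite


def find_stage(text: str, finders: List[Finder], min_agreement: int = 2) -> List[Span]:
    """Keep only spans flagged independently by at least min_agreement workers."""
    votes = Counter()
    for finder in finders:
        votes.update(set(finder(text)))  # one vote per worker per span
    return sorted(span for span, n in votes.items() if n >= min_agreement)


def fix_stage(passage: str, fixers: List[Fixer]) -> List[str]:
    """Collect distinct candidate rewrites for a single flagged span."""
    return list(dict.fromkeys(fixer(passage) for fixer in fixers))


def verify_stage(passage: str, candidates: List[str], verifiers: List[Verifier]) -> str:
    """Return the candidate a majority of verifiers accept, else the original."""
    votes = Counter()
    for verifier in verifiers:
        for candidate in candidates:
            if verifier(passage, candidate):
                votes[candidate] += 1
    if not votes:
        return passage
    best, count = votes.most_common(1)[0]
    return best if count > len(verifiers) / 2 else passage


def find_fix_verify(text: str, finders: List[Finder], fixers: List[Fixer],
                    verifiers: List[Verifier]) -> str:
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end in sorted(find_stage(text, finders), reverse=True):
        passage = text[start:end]
        chosen = verify_stage(passage, fix_stage(passage, fixers), verifiers)
        text = text[:start] + chosen + text[end:]
    return text


# Toy run: two finders agree on the typo, a third flags something else.
draft = "I recieve mail."
finders = [lambda t: [(2, 9)], lambda t: [(2, 9)], lambda t: [(0, 1)]]
# A careful fixer, a "lazy" one (no change), and an "overeager" one.
fixers = [lambda p: "receive", lambda p: p, lambda p: "RECEIVE NOW"]
# Two careful verifiers and one who approves everything.
verifiers = [lambda old, new: new == "receive"] * 2 + [lambda old, new: True]
result = find_fix_verify(draft, finders, fixers, verifiers)
print(result)  # I receive mail.
```

The point of the split is visible in the toy run: the lone finder's stray flag is dropped for lack of agreement, and the lazy and overeager rewrites lose the vote, so no single worker's judgment decides the outcome.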

To news traditionalists, Soylent may sound like the latest turn toward outsourcing in journalism that has sent copy editing jobs to places in India. It could also be akin to the automated journalism being tested by some companies or the Huffington Post’s real-time headline testing. And some day it may be. But Soylent is far from ready for the mainstream, thanks to the processing time and payment methods. Bernstein says they’re working towards having real-time edits and managing payment through Soylent, as well as adapting the program to work on photo editing. Instead of outsourcing, think of Soylent as microsourcing.

And about that name: It comes from exactly what you’re thinking. Bernstein said they were looking for something familiar but also true to the idea of what they created. Soylent is made of people. It is, indeed, people.

“The original name was Homunculus,” Bernstein said. “It didn’t have the same ring to it.”
