
How The New York Times incorporates editorial judgment in algorithms to curate its home page

The Times’ algorithmic recommendations team on responding to reader feedback, newsroom concerns, and technical hurdles.

Whether on the web or the app, the home page of The New York Times is a crucial gateway, setting the stage for readers’ experiences and guiding them to the most important news of the day. The Times publishes over 250 stories daily, far more than the 50 to 60 stories that can be featured on the home page at a given time. Traditionally, editors have manually selected which stories appear, when, and where, updating the page multiple times a day. This manual process presents challenges:

  • How can we provide readers a relevant, useful, and fresh experience each time they visit the home page?
  • How can we make our editorial curation process more efficient and scalable?
  • How do we maximize the reach of each story and expose more stories to our readers?

To address these challenges, the Times has been actively developing and testing editorially driven algorithms to assist in curating home page content. These algorithms are editorially driven in that a human editor’s judgment or input is incorporated into every aspect of the algorithm — including deciding where on the home page the stories are placed, informing the rankings, and potentially influencing and overriding algorithmic outputs when necessary. From the get-go, we’ve designed algorithmic programming to elevate human curation, not to replace it.

Which parts of the home page are algorithmically programmed?

The Times began using algorithms for content recommendations in 2011 but only recently started applying them to home page modules. For years, we had only one algorithmically powered module, “Smarter Living,” on the home page, later joined by “Popular in The Times.” Both were positioned relatively low on the page.

Three years ago, the formation of a cross-functional team — including newsroom editors, product managers, data scientists, data analysts, and engineers — brought the momentum needed to advance our responsible use of algorithms. Today, nearly half of the home page is programmed with assistance from algorithms that help promote news, features, and sub-brand content, such as The Athletic and Wirecutter. Some of these modules, such as the features module located at the top right of the home page on the web version, are in highly visible locations. During major news moments, editors can also deploy algorithmic modules to display additional coverage to complement a main module of stories near the top of the page. (The topmost news package of Figure 1 is an example of this in action.)

How is editorial judgment incorporated into algorithmic programming?

Algorithmic programming comprises three steps: (1) Pooling: Creating a pool of eligible stories for the specific module; (2) Ranking: Sorting stories by a ranking mechanism; and (3) Finishing: Applying editorial guardrails and business rules to ensure the final output of stories meets our standards. Editorial judgment is incorporated into all of these steps, in different ways.

To make an algorithmic recommendation, we first need a pool of articles eligible to appear in a given home page module. A pool can be either manually curated by editors or automatically generated via a query based on rules set by the newsroom.
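As a rough illustration, a query-based pool might be generated along the lines of the sketch below. The fields, section names, and freshness threshold are hypothetical, not the Times’ actual schema or rules:

```python
from datetime import datetime, timedelta, timezone

def build_pool(stories, sections, max_age_hours=48):
    # Illustrative newsroom-defined rules: eligible stories come from
    # certain desks and must be fresher than a cutoff.
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    return [
        s for s in stories
        if s["section"] in sections and s["published_at"] >= cutoff
    ]

stories = [
    {"id": "a", "section": "Travel", "published_at": datetime.now(timezone.utc)},
    {"id": "b", "section": "Politics", "published_at": datetime.now(timezone.utc)},
]
pool = build_pool(stories, sections={"Style", "Travel", "Well"})  # keeps only "a"
```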

A pool typically includes more stories than the number of slots available in the module, so we need a mechanism to rank them to determine which ones to show first and in what order. While there are various ways to rank stories, the algorithm we frequently use on the home page is a contextual bandit, a reinforcement learning method (see our previous blog post for more information). In its simplest form, a bandit recommends the same set of engaging articles to all users; the “contextual” version uses additional reader information (e.g., reading history or geographical location) to adjust the recommendations and make the experience more relevant for each reader. For an example of the geo-personalized bandit, see here.
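The post doesn’t detail the bandit’s internals, but a minimal epsilon-greedy sketch conveys the core idea: estimate each story’s click-through rate per context (here reduced to a coarse reader bucket, such as region), usually exploit the best estimate, and occasionally explore. This is a generic illustration, not the Times’ model:

```python
import random
from collections import defaultdict

class ContextualBanditSketch:
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon                # exploration rate
        self.clicks = defaultdict(int)        # (context, story) -> clicks
        self.impressions = defaultdict(int)   # (context, story) -> impressions

    def score(self, context, story):
        n = self.impressions[(context, story)]
        # Unseen (context, story) pairs score highest so they get tried first.
        return self.clicks[(context, story)] / n if n else float("inf")

    def choose(self, context, pool):
        if random.random() < self.epsilon:    # explore: random eligible story
            return random.choice(pool)
        return max(pool, key=lambda s: self.score(context, s))  # exploit

    def update(self, context, story, clicked):
        self.impressions[(context, story)] += 1
        self.clicks[(context, story)] += int(clicked)
```

In this framing, the non-contextual bandit described above is simply the special case where every reader shares a single context.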

To prioritize mission-driven and significant stories, we use several approaches to quantify editorial importance. One approach is having editors assign a rank to each story in the pool, with more recent and newsworthy stories generally considered more important. Another method infers a story’s importance from its past promotion on the home page, where stories that remain in prominent positions longer are rated as more important. Regardless of the approach, editorial importance can be combined with the bandit to ensure that editorial judgment is incorporated into the ranking process, thus prioritizing stories deemed important by the newsroom.
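One simple way to combine the two signals, purely as a sketch, is a weighted blend of the bandit’s engagement estimate and a normalized importance score; the 0.7 weight here is illustrative and would in practice be tuned with editors and through A/B testing:

```python
def blended_score(bandit_score: float, editorial_importance: float, weight: float = 0.7) -> float:
    # Both inputs assumed normalized to [0, 1]; the weight is illustrative.
    return weight * bandit_score + (1 - weight) * editorial_importance

# A story with middling predicted engagement but high editorial importance
# can outrank a slightly "clickier" but less important one:
print(blended_score(0.55, 1.0))  # ~0.685
print(blended_score(0.65, 0.2))  # ~0.515
```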

Once we have a ranked list of stories, we make final adjustments based on predetermined rules developed with our newsroom partners before stories are shown to readers. One such intervention we developed is a “pinning” function that allows editors to override the algorithm and pin important stories at the top. Other important examples are our “already-read” and “already-seen” filters, which deprioritize stories that the user has already read or seen a certain number of times (Figure 2). This finishing step ensures that editorial judgment shapes the final output and that we maintain a dynamic and fresh user experience.
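Put together, the finishing step might look like this sketch, with editor pins applied first and the two filters applied to everything else (the story IDs and the impression cap are hypothetical):

```python
def finish(ranked_ids, pinned_ids, read_ids, seen_counts, max_impressions=3):
    pinned = [s for s in ranked_ids if s in pinned_ids]  # editor override: pins on top
    rest = [
        s for s in ranked_ids
        if s not in pinned_ids
        and s not in read_ids                             # "already-read" filter
        and seen_counts.get(s, 0) < max_impressions       # "already-seen" filter
    ]
    return pinned + rest

print(finish(["a", "b", "c", "d"], pinned_ids={"c"}, read_ids={"a"}, seen_counts={"b": 5}))
# -> ["c", "d"]: "c" is pinned to the top, "a" was read, "b" was seen too often.
```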

How do we set up an algorithmically powered module on the home page?

The process begins with clearly defining editorial intentions, standards, and boundaries, as well as reader goals, and then designing algorithms appropriately. To illustrate the process, consider the above-mentioned features module (Figure 1): Content in this module is among the most widely read on the home page. The goal of algorithmic programming here is twofold: to increase engagement by presenting readers with freshly published features and columns, and to ensure that the most relevant and engaging stories are displayed thereafter.

After several rounds of experimentation and extensive collaboration with editors, we realized that more features had to be built, both to achieve the intended reader experience and to make the newsroom comfortable with integrating algorithms into its process for programming the home page. Together, we built and launched the following features, which have been cornerstones in accelerating the use of algorithmic programming on the home page:

Exposure boosting: While pinning is an effective tool for increasing the exposure of important stories, it is also a rather blunt one: a pinned story is shown to all readers until unpinned by an editor. To meet a desire by home page editors for a “softer” and more dynamic solution, we developed an “exposure boosting” capability. While a “boosted” story also initially appears at the top of the module, it gradually moves down the slots over time — at a rate predetermined by editors — until it becomes subject to the algorithm’s bandit again. (Figure 3: Exposure Boosting).
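In code terms, boosting can be thought of as a time-decaying cap on a story’s slot. The sketch below assumes a linear, editor-set decay rate; all names are hypothetical:

```python
import time

def boosted_slot(bandit_slot: int, boost_start: float, seconds_per_slot: float) -> int:
    # The boosted story starts in slot 0 and drifts down one slot every
    # `seconds_per_slot` (editor-chosen) until it reaches the slot the
    # bandit would have given it, at which point the boost is inert.
    slots_decayed = int((time.time() - boost_start) / seconds_per_slot)
    return min(bandit_slot, slots_decayed)
```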

Smart refreshing: Another way to increase the exposure of our stories while ensuring that readers are presented with fresh content is to remove articles that a user has seen several times but has not clicked on. (Each time an article is shown to a user, whether they click on it or not, counts as an impression.) The assumption is that the reader is not interested in the displayed story, so the algorithm shows the next story on the list instead. This rather rudimentary logic has its drawbacks: Frequent visitors might experience recommendations refreshing too often, causing a disorienting “slot machine” effect, and they could quickly exhaust all recommendations, leaving the module static. At the same time, infrequent visitors, who never reach the impression limit, might see the same stories on widely spaced visits, making the home page feel stale.

These potential issues were especially of concern for high-traffic modules like the features module. To address them, we developed a capability called “smart refreshing.” This feature creates a more stable experience for frequent visitors by only increasing the impression counter if a certain amount of time has passed since the last impression. Effectively, impressions occurring less than that amount of time apart are collapsed into a single impression. For infrequent visitors, smart refreshing limits staleness by automatically refreshing recommendations after a set period since their first impression, even if the impression limit was not reached. Home page editors decide on the interval between impressions and the maximum duration a story remains after its initial view based on editorial judgment and A/B testing.
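A sketch of both halves of this logic, with hypothetical field names and editor-set thresholds:

```python
from dataclasses import dataclass

@dataclass
class ImpressionState:
    count: int = 0
    first_seen: float | None = None
    last_seen: float | None = None

def register_impression(state: ImpressionState, now: float, min_gap: float) -> None:
    # Views closer together than `min_gap` collapse into a single impression,
    # stabilizing the experience for frequent visitors.
    if state.first_seen is None:
        state.first_seen = now
    if state.last_seen is None or now - state.last_seen >= min_gap:
        state.count += 1
    state.last_seen = now

def should_refresh(state: ImpressionState, now: float, max_impressions: int, max_age: float) -> bool:
    # Refresh when the impression cap is hit, or when too much time has passed
    # since the first view (the staleness guard for infrequent visitors).
    if state.count >= max_impressions:
        return True
    return state.first_seen is not None and now - state.first_seen >= max_age
```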

Exposure minimums: In response to concerns from editors that some stories risk not getting enough exposure under purely algorithmic programming, we developed exposure minimums. This capability gives the newsroom reassurance that all stories (particularly less popular ones) receive a minimum number of impressions on the home page before the algorithm takes over their programming. This guarantee helps set editorial expectations for story exposure and has enabled the rollout of algorithms on prominent sections of the home page, such as the features module. Typically, higher minimum values increase story exposure but can interfere with the algorithm’s optimization, reducing overall engagement. To strike the right balance between exposure and engagement, the exposure minimum is determined in collaboration with our newsroom partners and through A/B testing.
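Conceptually, the guarantee is a gate in front of the bandit, as in this sketch (the threshold value is hypothetical):

```python
def rank_with_minimums(pool, impressions, bandit_scores, exposure_minimum=1000):
    # Stories still under their guaranteed minimum are served ahead of the
    # bandit's ranking; once a story crosses the threshold, the bandit takes over.
    under = [s for s in pool if impressions.get(s, 0) < exposure_minimum]
    over = [s for s in pool if impressions.get(s, 0) >= exposure_minimum]
    over.sort(key=lambda s: bandit_scores.get(s, 0.0), reverse=True)
    return under + over
```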

Algo visibility tools: One blocker we encountered while trying to scale algorithmic programming was editors’ lack of visibility into the reader experience and story performance. One of the biggest pieces of feedback from the newsroom was that editors and reporters couldn’t tell whether their stories would appear in an algorithmic module on the home page: with the “already-read” filter in place, their own stories, which they had of course already read, never showed up on their home pages.

To address this, our product designer, engineers and data scientists partnered with home page editors to conceptualize and build a browser extension that allows editors to track all the algorithmic modules on the home page, preview different A/B testing variants, and review all the stories that have been selected and eligible for promotion for each module. Our engineers also built a tool that sends automated alerts to editors about changes in algorithmic programming, including new stories added to the pool and any headline or summary updates. Additionally, the data science team developed a dashboard to provide near-real-time analytics for stories that were algorithmically programmed.

After rigorously testing each of these new features and getting editors familiar with these tools and concepts, we permanently implemented algorithmic programming for the features module in the spring of 2024. This approach not only streamlined the editorial workflow (daily updates to the module were reduced by a third) but also gave stories with a longer shelf life more time on the home page, lifting overall engagement. Our product colleagues were delighted that powering the features module with an algorithm also helped increase engagement with our sub-brands such as Wirecutter and Cooking.

Taking algorithmic programming further — breaking news and more

The strong foundation we built by incorporating editorial thinking into our algorithms, coupled with the trust we cultivated, led to growing newsroom demand for additional algorithmic programming tools. Today, editors also use a tailored set of algorithmic modules to power a secondary set of stories for a topic or news event. These modules are completely self-service for editors and have been particularly useful during major news events, when the volume of coverage produced often exceeds the real estate available on the home page.

Currently, algorithmic programming recommends stories within individual modules on the home page. Next, we want to explore and test reordering modules based on a mix of editorial importance, engagement, and personalization signals. We believe this approach can further improve a reader’s experience and amplify our journalism.

Zhen Yang is a data scientist on the algorithmic recommendations team at The New York Times. Celia Eddy, Alex Saez, Derrick Ho, and Christopher Wiggins contributed to this post. This article originally appeared on NYT Open and is © 2024 The New York Times Company.

Illustration by Vivek Thakker
