A year ago, we told you about BloombergGPT, the terminal giant’s attempt to train a large language model on the company’s epic corpus of financial news and data. Now it’s another financial news powerhouse — the Financial Times — joining the fray. The FT formally announced Ask FT this morning. (FT CEO John Ridding had hinted at it in a recent interview with my colleague Sarah Scire, describing a future “way of interrogating FT’s news and archives, semantic search, summarization tools, audio articulation of our stories.”) Here are The Verge’s Emma Roth and Quentyn Kennemer, who got a sneak peek:
The Financial Times has a new generative AI chatbot called Ask FT that can answer questions its subscribers ask. As with generalized AI bots (like ChatGPT, Copilot, or Gemini), users can expect a curated natural language answer to whatever they want to know — but with answers derived from the outlet’s decades of published information rather than sources that are harder to explain or are subject to ongoing legal action. So don’t expect it to give you the best recipe for fettuccine alfredo.
(I should note that the FT has, in fact, commented on the pleasure James Baldwin took pairing fettuccine alfredo with “fresh green beans in butter, followed by socca crêpes with almonds and whipped cream.”)
Sadly, I am not one of the “few hundred” FT subscribers granted beta access to Ask FT, so I can’t report back on its efficacy. But the idea is straightforward: Good news organizations have rich and deep archives in their chosen areas of coverage, and using an LLM to sort through those archives can summon up useful information that might prove resistant to traditional searching. (“This new feature will help our subscribers to make confident strategic and commercial decisions quickly by getting answers rather than search results,” FT Professional managing director Nick Fallon says in the press release.) The FT isn’t the first to launch an “Ask ___” AI product, and it definitely won’t be the last. They’ll all have to answer big questions about how these chatbots fit with their existing offerings — and with the habits of their audiences. Here are three that strike me as among the most important.

How much outside information should be included in the training? For example, a group of tech publications (including Macworld, PC World, and TechHive) released a chatbot a few months ago called Smart Answers, which was trained “only on the corpus of English language articles” they produced. That might help the bot align with the publications’ editorial standards, but it also excludes a world of information someone seeking, er, smart answers might need. It “doesn’t know who TikTok’s CEO is, for example.”
Only about half of BloombergGPT’s training data, in contrast, is Bloomberg-originated, with the other half coming from “general purpose datasets.” A narrow-purpose AI — like the San Francisco Chronicle’s Chowbot, which focuses solely on local food, or Aos Fatos’ FátimaGPT, which focuses on fact-checking — might fare better with a constrained training set than one seeking to tackle a more catholic set of questions. But how will users accustomed to ChatGPT’s seeming omniscience view LLMs with narrower views?

How much will timeliness matter? One of mainstream LLMs’ biggest weaknesses is the fact that they live in the past — “as of my last knowledge update,” you might say. If you ask ChatGPT Plus when its corpus of knowledge was last updated, it’ll tell you April 2023. Its integration with web search can bring it closer to the present, but with potential damage to accuracy on even the most cut-and-dried questions.
For instance, I just asked it: “Who won between Houston and Texas A&M last night?” It replied: “The Houston Cougars won against the Texas A&M Aggies with a final score of 70-66 last night.” Nope, that was the score when they met on Dec. 16; last night, Houston won 100-95 in overtime. The Verge’s story noted that Ask FT seemed to think Nikki Haley was still a major candidate for president — though it successfully answered a question about Microsoft that relied on news stories less than a week old.
While news orgs may think of their archives as their key LLM asset, I suspect real-world usage will include a lot of what-just-happened real-time queries. And those are questions where accurate answers will be more difficult to produce. Will publishers be able to steer users toward archive-driven questions — or will they find ways to increase the accuracy of real-time queries, perhaps with more aggressive use of wire services or other timely data sources?
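To make that archive-vs-real-time tension concrete, here is a minimal sketch of how an “Ask ___” bot might flag answers it shouldn’t trust its archive for. Everything here is hypothetical: the toy keyword-overlap scorer stands in for real semantic search, and `ArchiveStory`, `answer_or_defer`, and the seven-day freshness threshold are invented for illustration, not drawn from the FT’s (or anyone’s) actual system.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ArchiveStory:
    headline: str
    body: str
    published: date

def keyword_overlap(query: str, text: str) -> float:
    """Toy relevance score: the fraction of query words that appear in
    the text. A production system would use semantic/vector search."""
    query_words = {w.lower().strip("?.,") for w in query.split()}
    text_words = {w.lower().strip("?.,") for w in text.split()}
    return len(query_words & text_words) / max(len(query_words), 1)

def answer_or_defer(query: str, archive: list[ArchiveStory],
                    today: date, max_age_days: int = 7):
    """Retrieve the best-matching archive story, and flag the answer as
    possibly stale when that story is older than the threshold -- a hint
    that this is a what-just-happened query the archive can't safely
    answer on its own."""
    best = max(archive,
               key=lambda s: keyword_overlap(query, s.headline + " " + s.body))
    possibly_stale = (today - best.published).days > max_age_days
    return best, possibly_stale
```

A bot built this way could answer archive-driven questions directly but route flagged queries to a wire feed (or simply decline), rather than confidently reporting a December score in March.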
How will news orgs get user behavior to adapt to these new bots? Or how will they adapt the bots to match user behavior? Most general news consumption is not driven by a desire to seek specific information. People actively go to news sites for broad-spectrum updates on the world (“What’s going on today, local newspaper?”) or just to fill a content-sized hole in their day (“Give me something interesting to read, Washington Post”). And news consumption that doesn’t start at a news site — which is to say, most news consumption — generally starts with an answer, not a question, Jeopardy-style. (A headline on a social feed tells you about something that’s happened — and it’s almost always something you weren’t explicitly seeking information about five seconds earlier.)
LLMs are, in contrast, overwhelmingly about specific information seeking. They share that characteristic with Google and other search engines, which are powered by specific user intent. (Things like “car insurance quote” and “cheap flights.”)
What I’m saying is that bot-offering news orgs will need to find ways to bridge that divide. It’s easier to imagine with a financial outlet like the FT or Bloomberg, where specific information seeking aligns better with high-end business users. But even for an outlet as high-end as The New York Times, it’s not obvious what use cases an “Ask NYT” chatbot would fulfill. News-org-as-all-knowing-oracle will require some philosophical shifts from news-org-as-regular-producer-of-stories. (For example, imagine an AI that could generate an on-the-fly backgrounder whenever a reader sees a person or concept they don’t recognize in a story. That kind of “Who’s this Ursula von der Leyen person?” question is exactly the sort of specific information request that could be met contextually.)