In October 2023, Karen Rebelo came across a viral video of a stump speech by a former chief minister of the central Indian state of Madhya Pradesh. Rebelo, the deputy editor of Boom Live, a fact-checking publication based in Mumbai, is used to poring over video footage of prominent Indian political figures. But something about this particular recording of the politician, Kamal Nath, wasn’t adding up.
In the 40-second clip, Nath appears to address voters from a pulpit and says that if his party wins the upcoming state elections he plans to end the Ladli Behna program, a popular state welfare scheme introduced by the opposing Bharatiya Janata Party, or BJP. The comments were more than uncharacteristic; they were suspiciously favorable to the BJP’s election bid.
“I told my teammates there could be only two possibilities: either you have the world’s best impersonator, or you’re using a voice clone,” Rebelo told me on a recent WhatsApp call. “There’s no other third possibility.”
The doctored videos suggested only two scenarios – political parties had a really good voice impersonator or the use of AI voice cloning tech. Most AI audio detection tools online are highly unreliable. These videos were viral when misinfo about Israel-Hamas war exploded. (3/n) pic.twitter.com/Fanc31O0ZO
— Karen Rebelo (@Karen_Rebelo) January 20, 2024
Rebelo theorized that the clip was an audio deepfake: AI-generated audio made to convincingly mimic Nath’s voice. But she didn’t know how to prove it.
“You can no longer solely rely on your human skills as a fact checker,” she said, explaining that standard reporting strategies had fallen short in verifying the video. Synthetic media experts largely agree that the rise of widely accessible generative AI tools over the past two years — across image, audio, and video — has led to a proliferation of political deepfakes. Some of this content is sophisticated enough that a studied human eye, or ear, in combination with ordinary fact-checking methods like a reverse image search or cross-checking details, is simply not enough to debunk it.
Instead, these deepfakes require forensic analysis, often including algorithmic models that can parse through a piece of media to find signs of manipulation. But like many newsrooms in India, Rebelo’s team at Boom Live was hard-pressed to figure out which detection tools were out there and whether they’d be able to access them.
“We didn’t have a very reliable AI audio detection tool at that time. I was looking around for something. I couldn’t find anything that I could completely trust,” Rebelo recalled, remembering several dead ends she encountered running the clip through free tools on the internet. Ultimately, it took Rebelo a few months, plenty of rejected inquiries, and a cold email to the University at Buffalo professor Siwei Lyu, a continent away, to finally find someone who was willing to run a full forensic analysis of the audio clip for free.
According to Lyu’s analysis, the audio of Nath was, in fact, a fake. As were three other clips of politicians involved in the Madhya Pradesh state elections that were circulating on social media. On January 18, Boom Live finally ran its investigation exposing the audio deepfakes. The results of the state election, meanwhile, had been declared more than a month earlier.
In recent months, newsrooms across India have been trying to build out a process for deepfake detection. As Boom Live’s Rebelo found, this often requires building relationships with academic researchers, disinformation-focused nonprofits, and developers behind commercial AI detection tools, within India and abroad. A coalition of Indian fact-checking organizations, including Boom Live, launched a WhatsApp hotline in March, called the Deepfakes Analysis Unit, to try and centralize this work of authenticating political media across the country.
The momentum has largely been fueled by anticipation, and concern, around the ongoing Indian general elections. On April 19, voting began in what is expected to be the largest election in history, with nearly a billion people eligible to go to the polls. The election is so large it is split into seven phases, the last of which will take place on June 1. Among the candidates on the ballot is incumbent Prime Minister Narendra Modi, a member of the BJP, who is seeking a third term in office.
India may hold the largest election this year, but it is far from the last. In 2024, more than 60 countries will hold major elections, and over half of the world’s population will be eligible to go to the polls. In the coming weeks, Mexico and South Africa will hold their own general elections, followed closely by the E.U. parliamentary elections. On November 5, the U.S. will go to the polls in a presidential race that is forecast to once again be plagued by election disinformation.
While the infrastructure for deepfake detection is being pressure-tested in India right now, the reporting strategies journalists are carving out during the election offer a preview of the challenges that lie ahead for newsrooms around the world.
“Right now it’s manageable, because it’s one [potential deepfake] in a day. What happens when it becomes multiple things in a day? When it becomes multiple on the hour?” wondered Rebelo when we spoke in the days leading up to the first wave of voting. “The challenge will be what happens when it goes nuclear.”
So far in this Indian election cycle, academics studying synthetic media have been a critical resource for reporters like Rebelo. Among them is Mayank Vatsa’s lab at the Indian Institute of Technology (IIT) Jodhpur. Vatsa started seeing requests for deepfake testing from journalists trickle in last year. That trickle quickly became a steady stream, with one or two potential deepfakes hitting his inbox every week.
Vatsa soon realized that the time it took to sort through those requests and correspond with each journalist was, on its own, unsustainable. In February, he launched Itisaar in collaboration with the Indian startup DigitID. A web portal, Itisaar allows users to upload files directly and automates parts of the verification process; currently, individuals can upload files for free. Now, nearly three months into the project, Itisaar is receiving one or two potential deepfakes every hour.
Despite its nascency, the platform already has over 80 approved users, roughly 40% of whom are journalists. So far, Vatsa says, his team has fully evaluated more than 500 submitted samples.
Itisaar can output a simple report that states, up or down, whether a piece of media is a deepfake, along with a percentage confidence rating. But Vatsa explains that journalists often come back to him looking for more granular information, requesting comprehensive reports they can use to bolster their fact checks before running a story.
If an audio clip is one minute long, the Itisaar team can break the clip into four-second segments and mark whether those subclips have been altered. Some of the audio deepfakes they’ve analyzed take real audio recordings and splice in AI-generated segments. “We call this a half-truth,” said Vatsa.
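The article doesn’t detail how Itisaar is implemented, but a minimal sketch can illustrate the kind of windowed audio analysis Vatsa describes. In the Python below, the four-second window mirrors his example, while the `score_segment` stub, the sample rate, and the 0.5 flagging threshold are all hypothetical stand-ins for a trained detection model.

```python
import numpy as np

# Hypothetical sketch of segment-level audio deepfake scoring, in the spirit of
# the four-second windowing described above. The classifier here is a stub:
# a real system would run a trained model on spectral features of each window.

SAMPLE_RATE = 16_000          # samples per second (assumed)
WINDOW_SECONDS = 4            # segment length mentioned in the article

def score_segment(segment: np.ndarray) -> float:
    """Placeholder: return a probability that this segment is synthetic.
    A real detector would extract features (e.g., a spectrogram) and run a model."""
    return float(np.random.rand())  # stand-in for a model's output

def analyze_clip(samples: np.ndarray, threshold: float = 0.5) -> list[dict]:
    """Split a mono audio clip into fixed-length windows and flag each one."""
    window = SAMPLE_RATE * WINDOW_SECONDS
    report = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        p_fake = score_segment(chunk)
        report.append({
            "start_sec": start / SAMPLE_RATE,
            "end_sec": min(start + window, len(samples)) / SAMPLE_RATE,
            "p_fake": round(p_fake, 3),
            "flagged": p_fake >= threshold,
        })
    return report

if __name__ == "__main__":
    one_minute = np.zeros(SAMPLE_RATE * 60)   # stand-in for a real recording
    for row in analyze_clip(one_minute):
        print(row)
```

In a real pipeline, each window’s score would come from a trained model, and it is the per-segment flags that let a report call out a spliced-in “half-truth” rather than a fully synthetic clip.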
When a journalist submits a video clip, Itisaar can break down the clip by frame and mark where in the video the authentication model is “focusing” while making decisions. If the model is focusing mostly on the lip region, that gives a cue that the mouth has been altered and there’s a very high chance the video is a lip sync, according to Vatsa. If the model focuses on one face in a group, that’s a cue the video is a face swap. “We can get this kind of interpretability, and we provide these attention maps to the journalist,” he said.
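Again, this is not Itisaar’s code, only a rough sketch of how region-level attention could be turned into the kind of cue Vatsa describes: if a model’s attention concentrates on the mouth across frames, that points toward a lip sync. The heatmaps, bounding-box coordinates, and 0.5 cutoff below are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of reading an attention map: given a per-frame attention
# heatmap from a detection model, measure how much of the model's attention
# falls inside a lip-region bounding box.

def attention_share(heatmap: np.ndarray, box: tuple[int, int, int, int]) -> float:
    """Fraction of total attention mass inside a (top, left, bottom, right) box."""
    top, left, bottom, right = box
    region = heatmap[top:bottom, left:right]
    total = heatmap.sum()
    return float(region.sum() / total) if total > 0 else 0.0

def classify_cue(frame_heatmaps: list[np.ndarray],
                 lip_box: tuple[int, int, int, int]) -> str:
    """If attention concentrates on the mouth across frames, flag a likely lip sync."""
    shares = [attention_share(h, lip_box) for h in frame_heatmaps]
    return "possible lip sync" if np.mean(shares) > 0.5 else "no mouth-focused cue"

if __name__ == "__main__":
    # Fabricated heatmaps with attention concentrated in a 20x40 "mouth" area.
    frames = []
    for _ in range(30):
        h = np.random.rand(224, 224) * 0.1
        h[150:170, 90:130] += 5.0
        frames.append(h)
    print(classify_cue(frames, lip_box=(150, 90, 170, 130)))
```

A face-swap cue could be derived the same way, by comparing attention mass across the bounding boxes of different faces in a group shot.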
When videos of Bollywood actors Aamir Khan and Ranveer Singh criticizing Modi went viral during the first week of the election, Boom Live used Itisaar to test the clips. From Itisaar’s report, the team was able to determine the clips were made using a “voice swap,” meaning AI-generated audio had been used to overlay the fake critical comments onto real video footage. The Indian Express, NDTV, Hindustan Times, and other major national news outlets aggregated Boom Live’s findings.
#Watch A video of actor #RanveerSingh criticising PM Narendra Modi over rising unemployment and soaring inflation is a #deepfake video made using an artificial intelligence voice clone of the Bollywood actor.
Read here: https://t.co/Nw972gt2bw#BOOMFactCheck pic.twitter.com/7LyznXpZOU
— BOOM Live (@boomlive_in) April 21, 2024
There are notable benefits for reporters working with local researchers in India like Vatsa. “The beauty of our approach has been, with respect to audio, that it is multilingual and, with respect to video and images, that it encompasses the diversity of skin tone in the Indian context,” said Vatsa.
One challenge in reporting on election deepfakes in India is language. National newsrooms may have the resources to closely track political disinformation in Hindi, English, or Bengali, but the same often cannot be said for all 22 official languages in India, as well as the more than 120 major regional languages spoken across the country.
Vatsa’s team at IIT Jodhpur, however, has collected a database of 4.1 million audio deepfakes to train its tools. That cuts across Hindi, English, and 13 additional languages spoken on the Indian subcontinent, including Tamil, Telugu, and Urdu. The team’s deepfake video dataset, meanwhile, has more than 130,000 samples, mostly depicting Indian people.
“Data is the new oil, right? What kind of oil are you going to put in your engine?” asked Vatsa, questioning the efficacy of detection tools that run only on English-language datasets, or that otherwise limit themselves to Western data sources. “That’s the whole game here. If I have to solve the problem, for diversity in language, in a multilingual setting, I have to only put that oil into my engine.”

Vatsa has observed the shortcomings of English-dominated deepfake detection models firsthand. Last month, Itisaar received a video through its portal in a regional language spoken mostly in a single south Indian state. When the Itisaar team fed the video into a model trained solely on English, it came back as authentic. When they used their standard multilingual model, it correctly identified the video as a deepfake.
“I’m not contesting that a language-agnostic model can be created. Certainly, it can be,” said Vatsa. “But if you have language diversity available to you to train on, then you actually get much better performance, and higher confidence.” Vatsa claims that in almost all of the 15 languages Itisaar is trained on, its models currently register more than 80% accuracy in detecting deepfakes. In some cases, including Hindi and English, that number rises to 90 to 95%.
“Many of the tools which are available, we don’t know what technology has been used and what kind of diversity the databases have used. But their diversity — the population that they’re looking at — probably is very different from the population that I’m looking at right now,” he said.
While Itisaar is one example of the benefits of a homegrown deepfake detection tool, many reporters in India are, in practice, finding their strongest partnerships with experts abroad.
That includes Nilesh Christopher, an independent journalist based in Bangalore, who has been at the forefront of covering AI-generated political disinformation. (Christopher will also be a 2024–2025 Nieman Fellow.) Most recently, he reported a series of stories on the topic for Al Jazeera, including a piece exposing the rise of AI-generated videos of deceased Indian politicians, which are increasingly being used for posthumous election endorsements. For Wired, Christopher and Varsha Bansal reported that over 50 million AI-generated voice clone calls were made in India in the two months leading up to the elections, part of an emerging deepfake cottage industry for voter outreach.
7/ Personalized video outreach using AI
Many vendors have been pitching political parties on personalized video outreach. The candidate’s voice and likeness are cloned to spell out the name of the voter’s name. One such sample shared by a deepfake service provider⬇️… pic.twitter.com/ksCEJdiZLg
— Nilesh Christopher (@NilChristopher) May 20, 2024
In a recent phone call, Christopher echoed other journalists I spoke to — finding reliable detection tools to do this work is not easy. But for Christopher, one reason has been political self-censorship.
“The Indian organizations I reached out to directly wanted to do forensic analysis off record, and tell me about it. None of them wanted to be on-record, fearing political backlash,” he told me, explaining he has encountered this problem with both AI startups and academic research labs in India.
In recent years, political journalists and fact checkers have faced increasing scrutiny and censorship by the Modi government. In 2022, the Delhi police arrested Mohammed Zubair, the founder of the prominent fact checking site Alt News, because of a social media post. More recently, the government has mandated bans of journalists on social media platforms like Twitter and begun the contentious roll-out of a centralized government fact-checking body.
“Once the startups weigh in on a specific political issue, they face the risk of the politician coming after them,” Christopher said. He describes the resistance of local Indian startups to taking on political deepfake checks, particularly on the record, as a “systemic issue.”
“I’m quite confident that none of them would take up detection if it’s a very contentious high-stakes election issue,” he added, imagining an incident that had influence over an election outcome. “That hinders and pushes journalists behind in the news cycle as we try to report the truth.”
For many reporters, gaps in domestic deepfake testing have forced them to regularly seek out expertise overseas. Christopher, for one, has been working closely with Witness, a Brooklyn-based nonprofit that furthers the use of video and technology to defend human rights. Last year, Witness launched its Deepfake Rapid Response Force, an initiative that seeks to help journalists take on some of “the most difficult deepfake cases.”
For one of Christopher’s investigations, published by Rest of World in July 2023, Witness analyzed two audio clips of a politician from the southern state of Tamil Nadu. The politician claimed the embarrassing clips — in which he accuses his own party officials of corruption — were AI-generated. The Deepfake Rapid Response Force was able to conduct machine tests and bring in Tamil-speaking experts to review the recordings. Ultimately, one of the clips was deemed real by several sources. The deepfake claim appeared to be a cover to save face.
Services like this, run by Western nonprofits, are critical to journalists reporting on election deepfakes in India right now, according to Christopher. But there are still significant downsides. For one, the full review by the Deepfake Rapid Response Force took about three weeks.
“As someone who is in the Global South, without easy access to these tools, even when organizations like Witness are willing to help, the time lag between when you submit a piece of evidence and you get the result is so far out,” he said, noting that even when checks are expedited, there are few academic or nonprofit partners who can get something turned around in less than a few days.
Academics I spoke to confirmed that they are being inundated with requests from journalists in India, with limited time and resources to take on these pro-bono cases. That means as the election unfolds, there is a constant need to triage which potential deepfakes get prioritized.
“I try to limit that to high-stakes situations,” said Hany Farid, a leading researcher in synthetic media at the University of California, Berkeley, who has worked closely with Indian journalists in recent months. Currently, Farid says his team can only reasonably take on one to two checks per day, despite requests coming in from all over the world.
The lab’s process isn’t as simple as pressing a button; it often requires the team members to customize their forensics approach or run a series of different tests. “I don’t care about the sharks swimming in Central Park after a thunderstorm,” Farid said. “I try to limit myself to the things that I think are impactful and have a bit of a ticking clock.”
But he admits that turnaround times are a chronic challenge facing journalists. “You’ve got a content verification problem, but you also have a speed problem and a volume problem. And arguably, the speed and the volume are really the problem,” he said.
That’s especially concerning because some of the most nefarious election-related deepfakes in the region have been documented in the 48 hours before a major voting day.
In February, an audio clip of Imran Khan, the former prime minister of Pakistan, was released the day before the country’s general election. In the clip, an AI-generated voice mimicking Khan urged voters to boycott the elections, and claimed the vote had been rigged. In that case, the clip was quickly debunked — Khan has been in prison since August of last year.
“Intense fear of defeat: Imran Niazi has announced a boycott of the elections.” (translated from Urdu) pic.twitter.com/ns1SmZ5Tvx
— Javed Iqbal (@javedeqbalpk1) February 7, 2024
In January, a similar incident plagued neighboring Bangladesh’s national assembly vote. On election day, two AI-generated videos of independent candidates, in which they appeared to say they were withdrawing from the race and supporting their opposition, began circulating on Facebook.
The incidents raise the question: What happens in India when a fact check is needed in a matter of hours, not days or weeks? Amid competition for the limited time of a limited pool of academics, some newsrooms have already turned to commercial detection tools to solve this problem.
Among dozens of free tools, Hive Moderation comes up frequently: it offers a free dashboard for basic image, audio, and video detection alongside its paid enterprise services. AI or Not offers 10 free tests of image files per month before charges kick in. For audio deepfakes, Eleven Labs, a well-known AI audio generation company, has created its own free detection tool called AI Speech Classifier. It can only confirm whether Eleven Labs itself was used to create a recording, though; any other audio deepfake will slip beneath its radar.
Earlier this month, OpenAI announced it’s developing a similar detection tool for images generated using Dall-E. The tool has yet to launch. Last year, OpenAI released a similar classifier for AI-generated text, but it was shut down within six months because of its low rate of accuracy.
“I would prefer academic institutions any day of the week. There’s no question in my head at all, because the quality of the analysis will be far more robust than any commercial tool out there,” said Rebelo. In the past, Boom Live has tried to use free, publicly available AI image detectors, with mixed results. In one case Rebelo had to issue a retraction after a faulty result from an AI image testing service, an experience she calls “scarring.”
Given the limited bandwidth of academics, Boom Live invested in a subscription to Loccus, a tool that detects audio deepfakes and is already widely used in the banking sector to verify the voices of customers, among other use cases. Loccus’ profile has risen quickly, particularly after it entered a partnership with Eleven Labs last fall to build better detection tools.
“We’ve definitely seen a surge in demand this year [from journalists],” said Manel Terraza, the CEO and founder of Loccus. Most small fact-checking organizations, including those in India, have opted for the company’s monthly subscription. For larger media companies, Loccus tends to sell its API, which newsrooms use to build in-house tools and are billed for each minute of audio processed. Loccus has signed one such deal with Agence France-Presse (AFP), the French wire service.
Little information about Loccus’ products is available to prospective customers — including its pricing tiers for journalists. Most subscription deepfake detection products on the market, including competitors Reality Defender and Deep Media, require personal consultations before signup. “When you’re building a solution like ours, you need to be very careful around adversarial attacks,” said Terraza, explaining that bad actors could try to use Loccus to reverse engineer deepfakes that evade detection.
Some journalists told me the cost of Loccus, and similar subscription products, can quickly become prohibitive for already bootstrapped newsrooms, or for independent journalists like Christopher. “I met with a couple of websites that promised to give me a response. They were behind paywalls, though, which, again, creates access friction for a journalist on a deadline,” he said.
One solution to these cost problems, though, has started to emerge among Indian journalists: pooling newsroom resources.
The Misinformation Combat Alliance (MCA) offers one model for how news organizations in India are working together to address both capacity and cost barriers in deepfake detection. Launched in 2022, the MCA currently includes 12 Indian fact-checking organizations among its ranks, including Boom Live, Logically Facts, India Today, and The Quint.
In March, the coalition launched a dedicated WhatsApp tipline called the Deepfakes Analysis Unit (DAU). The tipline is open to submissions from anyone, including journalists. To subsidize its operating costs, the DAU is receiving direct funding from Meta, which owns WhatsApp and has long been criticized for enabling the flow of political disinformation across India.
The most popular messaging platform in the country, WhatsApp is end-to-end encrypted, meaning false information can be forwarded from private group chat to private group chat without easy detection. The BJP is reported to operate as many as 5 million WhatsApp groups across the country.
The DAU has hired three full-time editorial staff in India to triage requests that come in on WhatsApp. If the team determines something could be a deepfake, they’ll conduct verification using standard AI detection tools and OSINT techniques. If there are signs of manipulation, that same clip may be passed off to one of the DAU’s forensic partners. That includes deepfake detection labs at academic institutions, like Hany Farid’s lab at UC Berkeley. It also includes the developers of for-profit authentication tools, including IdentifAI and Contrails AI, which send back detailed analysis free of charge.
Currently, the DAU only handles video and audio clips in English, Hindi, Tamil, and Telugu; it does not accept images, or clips in other regional languages.
“The DAU becomes a sort of central nodal point to coordinate with a lot of different forensic partners. Otherwise, what was happening was all the fact-checkers were reaching out to set up individual partnerships with certain labs and startups. It wasn’t efficient,” said Tarunima Prabhakar, co-founder of the Delhi-based Tattle Civic Technologies, which is a member of the MCA and has been central to launching the DAU tip line.
There are other advantages to newsrooms working as a collective, according to Prabhakar, including bargaining power. “Big AI content-generation companies might not speak to a specific group in India, but as an alliance of 12 fact-checking orgs, they are speaking to us.”
The same can be said for academics. “We tend to try to work with umbrella groups, like when the AP does a fact-check, hundreds of organizations get that fact check,” said Farid, the Berkeley professor, who mentioned the MCA is among the umbrella organizations he works with closely.

The DAU is also able to offer access to commercial detection tools by proxy. With proper coordination, newsrooms can avoid paying repeatedly to test the same media item, by routing their verification through the DAU or reviewing public reports on content the DAU has already tested.
The DAU may not be a silver bullet for spotting deepfakes in India’s complex information ecosystem, but it is one innovative model for cross-industry, cross-newsroom collaboration. It’s also a model that could be replicated by other newsrooms around the world as they navigate deepfake coverage during their own upcoming election cycles.
The challenge of finding deepfakes amongst the deluge of political media circulating in India right now is daunting. The tools used to detect them are new. But the dynamics on the ground are, unfortunately, more than familiar.
“I would argue the bigger story is good old-fashioned social media: Twitter and YouTube, Facebook and TikTok, Telegram and WhatsApp,” said Farid. Political deepfakes can only travel so far without being disseminated across group chats and newsfeeds and For You Pages.
“If you send a deepfake to five of your email friends, I’m not worried. It’s how these deepfakes are not just being allowed on platforms, but being promoted and algorithmically amplified and rewarded and monetized,” he added. “That’s an age-old story there. In fact, it’s not even a generative AI story.”
This story has been corrected to accurately describe the Deepfakes Analysis Unit’s internal verification process.