
ElevenLabs: Revolutionizing AI Voice Technology

In the heart of Poland, where dubbed films have long been plagued by monotonous lektors speaking all dialogue in a deadpan Slavic tone, Mateusz “Mati” Staniszewski and Piotr Dabkowski identified an opportunity to transform the audio landscape. Having grown up enduring this “uniquely Polish horror” of quality foreign films being drowned out by a single narrator, the high school friends joined forces while working at Palantir and Google respectively to experiment with artificial intelligence. Their initial project—an AI public speaking coach—revealed the potential to create something revolutionary: AI-generated voices that sounded genuinely human.

By May 2022, the pair had quit their jobs and pooled their savings to work full-time on ElevenLabs. Their breakthrough came in January 2023 when they launched their first model, capable of taking any text and reading it aloud with emotional range—happiness, excitement, even laughter—far surpassing the robotic delivery of Siri and Alexa. The technology could also clone voices, opening doors for authors to instantly create audiobooks, YouTubers to translate content into multiple languages, and media companies to expand their audio offerings. “It was obvious that this was the best model and everyone was picking it off the shelf,” noted Jennifer Li of Andreessen Horowitz, which co-led a $19 million investment round. This innovation catapulted the Warsaw and London-based startup into the spotlight, with venture capitalists ultimately pouring in over $300 million, valuing the company at $6.6 billion and making both founders billionaires at just 30 years old.

ElevenLabs’ success stems from its remarkably human-sounding voice technology, which has attracted diverse clients from corporate giants to individual creators. About half of its $193 million in trailing 12-month revenue comes from corporations like Cisco, Twilio, and Adecco, which use the technology for customer service calls and job interviews. Epic Games employs it to voice characters in Fortnite, including Darth Vader (with the James Earl Jones estate’s consent). The remaining revenue comes from early adopters—YouTubers, podcasters, and authors who have embraced the technology’s capabilities. Unlike many AI companies struggling to turn a profit, ElevenLabs has achieved impressive financial results, with Forbes estimating $116 million in net profits over the past year—a 60% margin that sets it apart in the competitive AI landscape.

Despite entering a field where tech giants like Google, Microsoft, Amazon, and OpenAI are all vying for dominance, ElevenLabs has maintained its edge. Its 300-person team has developed voice models so superior that they can command premium prices, up to three times those of American competitors. The company offers the largest library of human-sounding voices (10,000 and counting), including those of A-list celebrities Michael Caine and Matthew McConaughey. Testing by data training startup Labelbox revealed that ElevenLabs’ models make half as many errors as those of its closest competitor, OpenAI. “We are one of the very few companies that are ahead of OpenAI, not only on speech, but speech-to-text and music. That’s hard,” Staniszewski proudly states. This success comes from the founders’ focused approach: a tight team of machine learning researchers obsessively tackling specific problems with limited resources, which forced them to innovate rather than rely on brute computing power.

However, ElevenLabs’ journey hasn’t been without controversy. The technology’s ability to clone voices has led to concerning misuse—from AI soundalikes of public figures narrating inappropriate content to fraudsters impersonating loved ones’ voices in sophisticated scams. A lawsuit from audiobook narrators Karissa Vacker and Mark Boyett alleged that ElevenLabs used copyright-protected audiobooks to train its models, resulting in unauthorized clones of their voices appearing as default options. Though this case was settled out of court, it highlighted the ethical challenges facing voice AI technology. The company has responded by implementing safeguards: creating a list of prohibited voices (mostly politicians and celebrities), employing human moderators and AI tools to monitor for misuse, requiring consent checks for newly cloned voices, and offering a free deepfake detector to the public.

As ElevenLabs matures, Staniszewski and Dabkowski are expanding beyond voice technology. Responding to demand from creators and media companies, they’ve launched an AI music generator and plan to introduce AI avatars for videos next year. Their most ambitious vision involves creating a comprehensive hub where clients can manage all their AI tools. “We are building a platform that allows you to create voice agents and deploy them smoothly,” explains Staniszewski. This direction puts them in competition with numerous startups and tech giants, but their profitability provides a strategic advantage. Still, they face significant challenges: voice models will eventually become commoditized, their premium pricing already faces resistance from some customers, and expanding into more computationally intensive areas like music and video requires substantial infrastructure investment. The company has already committed $50 million to a data center project in Oregon. Meanwhile, Dabkowski remains committed to their original mission of transforming dubbed films, boasting that their next model will be able to translate and voice an entire movie in one operation. “We never give up on our missions,” he declares, as the aging corps of Polish lektors continues its work, for now.
