Voice Cloning with ElevenLabs AI
The 1% Chance explores themes of hope, wellness and discovery in the context of MND / ALS. This article is for informational purposes only and is not medical advice. The author is not a healthcare professional. Please consult a healthcare professional about your own healthcare needs.
Transcript:
When I got the news that my dad only had a few weeks to live, I went to visit him. He was sitting up in bed. ‘Fit like, Grayz?’ he said wheezily, putting a brave face on it. I’d never seen him so frail.
I tried to ask how he was feeling, but he couldn’t understand me. My voice had been degrading steadily for over a year, an early symptom of the diagnosis. The voice in my head sounded fine but by now, the voice that came out of my mouth was a flat monotone, lacking any rise and fall, unable to round out vowels or make any consonant sounds. I tried again, but it was no use. It was devastating.
Why now? This one moment in life when words meant more than they ever had? The chance to say ‘thank you’ and ‘I love you’ one final time had been stolen by this damned disease. When we think about the losses that come with a diagnosis like MND / ALS, there’s just no way to account for or measure the value of a lost moment like this.
I’d already set up a synthetic voice app, but I didn’t want to use it until it was completely necessary. Perhaps this was the moment, but the artificial voice sounded frail and debilitated, leaving me feeling disempowered. In my experience, there’s always another way or a workaround. So, I started to look for alternatives.
I searched and searched until I found what looked like the best option. A company called ElevenLabs said they could use recordings of my voice to generate an AI version which I’d be able to use from an app on my phone. I had hours and hours of high-quality voice recordings from my podcast which I had backed up in multiple places. ElevenLabs wasn’t specifically designed for people in my position, like my other voice app was, but I saw how I could make it work.
As quickly as I could, I uploaded my voice recordings to their website. I waited with hope and trepidation. Could this work? If it did, I’d have my voice back, like some technological miracle from an episode of Star Trek.
It did work! In fact, the voice sounded so like me - it was eerie!
We had some fun with it at home, but the first place I tried it in public was a coffee shop. I preloaded the phrase and when I got to the counter, I waved my phone sheepishly at the barista and deployed my request: ‘Can I have a small decaf latte, please?’
It wasn’t loud enough, so I had to fumble to increase the volume, hold the phone across the counter and find the play button again. I got my coffee, but I felt stupid and wondered how I was going to make this work.
The next use I found for my artificial voice was to create an audio version of one of my daughter’s favourite bedtime stories, The Lion King. When she was little, it became her go-to bedtime book, and I’d read it to her night after night until she drifted off to sleep. After a while, I didn’t even need to look at the pages, and I’d sit there in the darkness, the words flowing from memory: “The animals came from faaaarrrr and near, across the dusty African plain…”
Having trouble ordering a coffee was inconvenient, but losing the ability to read a bedtime story for my daughter was gutting. The loss of those small but significant moments that we share with our kids dozens of times every day - little jokes at breakfast, comments on a TV programme, the chance to offer words of comfort when she’s upset, chatting in the car, or singing songs together - has been a source of deep pain, as I’m sure any parent can understand. Taking back the ability to do bedtime stories felt like one in the eye for MND: a little victory and, hopefully, not the last, thanks to ElevenLabs.
I use their ‘Creator’ subscription for US$22 per month. The plan includes Professional Voice Cloning, producing ‘clones that are virtually indistinguishable from the real thing, requiring a minimum of 30 minutes of clean audio to generate high-quality, lifelike voice clones.’
Professional Voice Cloning allowed me to recreate my own voice, which I find much more humanising than a random or similar voice that isn’t mine. The subscription also includes 100,000 credits per month (1,000 credits cover approximately 1 minute of audio), with additional credits charged at $0.30 per 1,000 credits.
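For anyone weighing up the cost, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration using the figures quoted in this article, which are approximate and may change. The function and constant names are my own invention for the example and are not part of any ElevenLabs tool.

```python
# Rough monthly cost estimate based on the figures quoted in this article.
# All names here are hypothetical; the numbers are approximate and may change.

BASE_PRICE_USD = 22.00          # 'Creator' subscription, per month
INCLUDED_CREDITS = 100_000      # credits included each month
CREDITS_PER_MINUTE = 1_000      # approximately 1 minute of audio
OVERAGE_USD_PER_1000 = 0.30     # price of each additional 1,000 credits


def included_minutes() -> float:
    """Approximate minutes of audio covered by the included credits."""
    return INCLUDED_CREDITS / CREDITS_PER_MINUTE


def monthly_cost(minutes_of_audio: float) -> float:
    """Estimated monthly cost in US dollars for a given amount of audio."""
    credits_used = minutes_of_audio * CREDITS_PER_MINUTE
    # Only credits beyond the monthly allowance are charged extra.
    extra_credits = max(0.0, credits_used - INCLUDED_CREDITS)
    overage = extra_credits / 1_000 * OVERAGE_USD_PER_1000
    return round(BASE_PRICE_USD + overage, 2)


if __name__ == "__main__":
    print(f"Included audio: about {included_minutes():.0f} minutes per month")
    for minutes in (60, 100, 150):
        print(f"{minutes} minutes -> about ${monthly_cost(minutes):.2f}")
```

On these assumptions, the included credits cover roughly 100 minutes of speech a month. A month with 150 minutes of audio would use 50,000 extra credits, adding about $15 to the $22 subscription.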
Their ‘Reader’ app allows me to generate speech on the go, save common phrases (like my coffee order) and even to scan text which I can then read aloud (bedtime stories, for example).
Amidst the relentless challenge of living with MND / ALS, my AI voice has been one of the brightest glimmers of hope, allowing me to chat with friends, send voice messages to people, create these audio blogs and even produce my own audiobook. By far the most important, however, is the ability to talk with my daughter, read her a bedtime story and tell her, ‘Daddy loves you’.
I stand with you.