Audiobooks have gained tremendous momentum in the last ten years. According to a survey by the Audio Publishers Association, the audiobook industry grew by 34 percent in 2018 alone. In 2019 and 2020, the industry saw growth rates of 14 percent and 12 percent respectively. While fewer than 3,000 audiobooks were produced in 2005, the number has grown rapidly over the years, hitting 71,502 in 2020. Going by these statistics, audiobooks are clearly booming.

But like all new technologies, the audiobook industry is facing its share of controversies, chief of which is the role of artificial intelligence in audiobook production. If you have ever listened to an audiobook, you most likely heard the voice of a professional actor who narrates books for a living. Or, if you were lucky, you may have listened to an audiobook narrated by its author, such as Trevor Noah’s Born a Crime.

Now, take a moment to imagine a world where, instead of listening to audiobooks voiced by human actors, you listen to audiobooks narrated by synthetic voices generated by artificial intelligence. If this sounds far-fetched, it is not. AI for audiobook creation is gaining traction fast and shaking up a traditional audiobook industry that chiefly relies on human voice actors.

What Do Book Narrators Think About the Rise of AI for Audiobooks?

As cited in Publishers Weekly, Hillary Hubert, a board member of the Professional Audiobook Narrators Association (PANA) and a member of the steering committee for the Screen Actors Guild – American Federation of Television and Radio Artists (SAG-AFTRA), said the role of AI in audiobook production has ignited much debate among professional narrators. It’s understandable that narrators would have fears about authors using AI for audiobook creation.

Most voice actors and narrators are wary of AI in audiobook production for several reasons. First on the list is the possibility of losing their livelihoods. The audiobook industry has thousands of people working in different roles such as voice actors, audio engineers, audiobook editors, etc. What happens to these people if their work can be automated using AI? This is an important question that the critics of AI in audiobook production have raised.

The second reason for hesitation around AI for audiobooks concerns the licensing of an actor’s voice. Let’s say you license your voice to Company X for AI-facilitated audiobook production. How much say do you retain in how your voice is used? What if your voice is used to narrate content that you find questionable? And how well will you be remunerated for the use of your voice? These are some of the key questions that audio narrators are grappling with. Licensing their voices to AI companies may seem akin to ceding the control they have over how those voices are used.

Lastly, critics are worried that the use of AI in audiobook production obliterates the connection that currently exists between a voice actor narrating an audiobook and the listener. Is it possible for listeners to appreciate an AI voice as much as they appreciate the voice of Stephen Fry or Neil Gaiman? Using synthetic voices for audiobooks, the argument goes, dilutes the special storytelling experience that listeners get from listening to a human voice.

Who Are the Proponents of AI for Audiobooks?

While most audiobook narrators view AI with skepticism, another group of people in the industry sees huge possibilities in using AI for audiobook production. These are chiefly AI entrepreneurs and small publishers. What are the reasons for their optimism?

First, proponents of AI in audiobook production cite the high cost of producing audiobooks with human narrators. A Wired article on the use of synthetic voices states that audio narrators charge about 250 dollars per finished hour. Other sources estimate that some narrators in the audiobook industry can charge as much as 1000 dollars per finished hour. Once you add editing and other production costs, creating a single audiobook easily costs several thousand dollars.

While big publishers may be fine with these costs, it is simply not feasible for small or individual publishers to spend that much on a single audiobook. This is where AI comes in. With AI, the cost of creating an audiobook is drastically reduced. As an example, DeepZen, an AI company that specializes in producing audiobooks, charges about 120 dollars per finished hour or less, depending on the services that the client opts for. Other companies, such as Speechki, promise to charge even less.
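To make the comparison concrete, here is a rough back-of-the-envelope sketch of the narration cost at the per-finished-hour rates quoted above. The 10-hour book length is an assumption chosen purely for illustration, and editing and other production costs are left out, so treat the results as ballpark figures rather than quotes.

```python
def narration_cost(finished_hours: float, rate_per_hour: float) -> float:
    """Narration cost in dollars for a book of the given finished length."""
    return finished_hours * rate_per_hour


# Rates per finished hour as quoted in this article.
RATES = {
    "Human narrator (typical)": 250,
    "Human narrator (high end)": 1000,
    "DeepZen (AI)": 120,
}

# Assumption for illustration only: a book that runs 10 finished hours.
BOOK_HOURS = 10

for label, rate in RATES.items():
    print(f"{label}: ${narration_cost(BOOK_HOURS, rate):,.0f}")
```

Changing BOOK_HOURS to match the expected length of your own book gives a quick sense of how the gap widens for longer titles.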

Besides reducing the costs of production, AI proponents argue that this new technology is the best way to scale audiobook creation. In an interview with Joanna Penn at The Creative Penn, Tylan Kamis, the CEO of DeepZen, said that there are almost 50 million eBooks in the world but only half a million in audio format. Nearly 90% of audiobooks are also in one language: English. If AI is widely adopted in the production of audiobooks, Tylan believes that more audiobooks could be produced at a faster rate. If the audiobook industry continues to rely on human actors for narration, it will take an extremely long time to convert the millions of books in the world into an audio format. This means that the people who benefit most from audiobooks, including those who are print-disabled, will continue to miss out. In addition, AI will make it easier to produce audiobooks in languages other than English. Instead of finding dozens of actors to narrate a book in different languages, publishers can use AI to automate the process and reduce the time needed to make a book available in different languages.

Lastly, supporters of AI for audiobooks believe that the technology will provide more revenue streams for audio narrators through voice licensing. As an example, Tylan (DeepZen’s CEO) cited the case of Edward Herrmann, a renowned audiobook narrator who died in 2014. According to Tylan, DeepZen managed to license his voice for the production of audiobooks. In this way, his legacy is preserved and, at the same time, his estate continues to receive income from the sales of the audiobooks that DeepZen produces using his cloned voice.


AI Content Creation Tools

If you would like to venture into AI-supported audiobook production, there are a number of platforms you can experiment with. Below we highlight a few examples:

  1. Google’s Auto-Narrated Audiobooks

This platform leverages Google’s research and technology to provide publishers with a fast and inexpensive way of creating an audiobook. The service offers a range of accents and genders. To use the platform, you first need to provide an eBook in the EPUB format and also offer it for sale on Google Play. You are also required to have the audio rights to the eBook you provide. The eBook should ideally have little emotion or dialogue. Books with plenty of charts and graphs are also not ideal if you intend to get the most out of this service.

Once your audiobook is ready, you can download it and sell it on any platform that accepts audiobooks narrated by synthetic voices (this excludes Amazon’s Audible platform, which does not currently allow AI-produced audiobooks). At the moment, this service is offered at no cost and is available only to users in the United States, Australia, Canada, the United Kingdom, New Zealand, and Spain. To learn more, check out Google’s step-by-step guide on how to create an auto-narrated audiobook.

  2. DeepZen

Based in the United Kingdom, DeepZen is a company that provides AI-generated audiobooks and voiceovers. DeepZen claims that all its synthetic voices are generated from licensed replicas of human voice actors. This, combined with experienced audio editors, provides a service that the platform claims is “indistinguishable from traditional narration.” According to the CEO, Tylan Kamis, DeepZen is planning to launch a portal where audio narrators can create synthetic versions of their own voices and use them to produce audiobooks faster.

  3. Speechki

Speechki is another popular AI audiobook production platform that promises “natural-sounding synthetic narration using artificial intelligence.” Speechki claims that it can provide users with an AI-generated audiobook in just 15 minutes and at a tenth of the cost of traditional audiobook production. Speechki allows clients to choose from 341 synthetic voices and 77 languages.

  4. Beyond Words

Beyond Words promises the user “ethically created AI voices” that are available for unlimited usage. Beyond Words provides clients with the latest text-to-speech voices from WaveNet (Google’s AI audio research program), Microsoft Azure, Yandex, and Amazon Polly. Beyond Words boasts clients such as the United Nations, the Irish Times, and the Japan Times. Packages range from free up to $250 per month.

  5. Audiobook.ai

This platform offers 146 voices in 43 languages. It also claims to produce an audiobook in just 10 minutes. Potential clients have the option of trying the service for free before they make a purchase. If you don’t want to deal with proofreading, the company also provides a “white-glove service” that handles the editing and proofreading.

  6. Podcastle.ai

This one is my current favorite.

Podcastle.ai converts text into speech and offers a variety of features that make the process simple and efficient. It offers studio-quality recording, audio detection, and voice-to-text translation. The audio detection feature allows you to easily transcribe any sound or dialogue, while the voice-to-text translation tool can be used to convert spoken words into written form. There is a paid plan that provides access to even more features, but the basic version also provides plenty of useful free tools.

I really like this one for the “magic dust” feature that smooths out the rough spots and makes your voice sound remarkably good. The podcasting component is the most advanced of the programs I’ve seen, with one-click background noise removal (which is also great for audiobooks). $29 a month will get you up to 200,000 words of text-to-speech narration, which is far beyond what’s needed for the typical book.

I’d recommend starting with the free version, which will do text-to-speech narration for roughly 6 to 8 pages, and then upgrading.

Audiobooks and podcasts are far from the only things you can create with this program.

Artificial Intelligence Won’t Replace You

Some authors fear it won’t be long before an AI writing generator takes over story writing itself. Proponents of AI say a story-writing AI will always need a human guide. According to Forbes, AI “can’t massage the phrasing or other intangibles.” The use of AI in publishing is already common. Editing programs such as ProWritingAid and Grammarly are examples of how AI can benefit an author’s content creation.

Next Steps

With its rapid growth and projected market size of 15 billion dollars by 2027, the audiobook industry is here to stay. As the industry scales, new technologies such as AI will inevitably have a big role to play. Whether you are an author, a professional book narrator, or an AI entrepreneur, you need to explore this nascent field to learn more about what it can offer you. While there are fears that AI-generated voices could replace human narrators, there is a strong chance the two will co-exist and perhaps ultimately enrich the audiobook experience for book lovers.

Take a look at a free audiobook maker online to get a taste of how artificial intelligence in publishing can be used: (YouTube link to come)

Want to create your own audiobook? Get your free download here: https://thepublishingcircle.com/Audio

Connect With Me On Social Media