Less is More: Sonantic Perfects Subtle Emotions & Non-Speech Sounds

Zeena Qureshi

CEO & Co-founder

March 8, 2022

Human beings are incredibly complex by nature and voice plays a critical role in helping us connect with the world around us. The average person has more than 6,200 thoughts a day and more than 27 categories of emotions to capture how we’re feeling. This is what makes speech so deeply dynamic.

At Sonantic, we are committed to emulating the human voice. Our AI-generated voice models can express a wide range of bold emotions. And yet, so much of verbal communication consists of subtleties.

The best performances demonstrate true complexity, vulnerability and authenticity, with subtlety being a critical component of artful communication. As the old saying goes, “It’s not what you say, it’s how you say it.” No one understands this concept better than storytellers and actors. Our most recent breakthroughs, subtle emotions and non-speech sounds, create new possibilities for our customers.

Developing Subtle Emotions

Perfecting subtle emotions is more difficult than producing bold ones from a technological standpoint. AI voice models tend to make emotions sound muted, so the more subtle an emotion is, the more likely it is to get drowned out and sound robotic. We researched methods to ensure our voice models counteract this muting effect, keeping the subtlety of the emotion intact. We’re excited to roll out new subtle styles for our customers such as “flirty,” “coy” and “teasing,” amongst others.

Capturing Non-Speech Sounds

When you think about it, communication is so much more than words. There are pauses, deep breaths, oohs and ahhs that give additional insight into how we’re thinking and feeling at any given moment. It’s these non-speech sounds that bring our conversations to life. Even something as simple as breathing, which is so innate to us, makes a difference in how natural humans sound. Without these little details, our conversations would sound robotic and monotonous.

As part of this launch, we’re excited to announce that our AI voice models can now breathe, laugh and even scoff. We had an “aha” moment when we were able to hear a laugh followed by a breath, which sounded so real. We wanted to break this down so you can hear the difference.

Breath Only:


Laugh + Breath:


The Breakthrough

So, how did we do it? There were two key elements: acting performances and algorithms.

Sonantic has always worked with the most talented voice actors to train our Voice Engine, and this most recent evolution was no different. But subtle emotions can be tricky even for the most seasoned performers, often requiring multiple takes to get it just right. We re-worked scripts, took more time preparing, recorded more takes and gave our actors as much creative freedom as possible. This ultimately gave us dynamic, versatile and multilayered material to work with.

Capturing non-speech sounds was no easy feat, and it required us to experiment with new algorithms until we got it just right. On our first few tries, things sounded garbled and robotic. But with great teamwork and collaboration amongst our Research, Data and Engineering teams, we developed new proprietary algorithms that allowed us to capture these subtleties with precision and accuracy.

Bringing Innovation to Life: “What’s Her Secret?”

To showcase these innovative advancements, we produced a video called “What’s Her Secret?” This is the most realistic product demo Sonantic has ever created.


When we came up with the concept for the video, we decided to focus on the theme of love because that’s when we feel and act the most vulnerable. The character’s calm, soothing voice fluctuates ever so slightly as she asks, “What would it take for you to fall in love with me?” As you listen to the dialogue, notice when she laughs, breathes or sighs at just the right moments. The video surprises viewers when it's revealed that, while the woman on screen is a real person, the main character speaking is an AI.

These advancements in speech synthesis make Sonantic’s platform more comprehensive. We’re excited to offer these new capabilities to our customers, and would like to thank everyone who’s been along this journey with us thus far. Together, we’re creating truly powerful tools for voice actors and entertainment studios to continue driving the future of voice innovation.
