The Gist
- AI evolution. Emotional intelligence is poised to revolutionize how AI models understand and interact with humans.
- Emotion tech. Hume AI’s technology reads emotions from voices and facial expressions, enhancing AI capabilities.
- Nonverbal signals. Current AI lacks emotional understanding; integrating nonverbal cues is crucial for better communication.
The AI field has made remarkable progress with incomplete data. Leading generative models like Claude, Gemini, GPT-4, and Llama can understand text but not emotion. They can’t process your tone of voice, rhythm of speech, or emphasis on words. They can’t read your facial expressions. They are effectively unable to process any of the nonverbal information at the heart of communication. And to advance further, they’ll need to learn to read it.
AI’s Next Leap: Emotional Intelligence
Though much of the AI sector is currently focused on making generative models larger via more data, compute, and energy, the field’s next leap may come from teaching emotional intelligence to the models. The problem is already captivating Mark Zuckerberg and attracting millions in startup funding, and there’s good reason to believe progress may be close.
Zuckerberg: AI Needs Emotional Intelligence
“So much of the human brain is just dedicated to understanding people and understanding your expressions and emotions, and that’s its own whole modality, right?” Zuckerberg told podcaster Dwarkesh Patel last month. “You could say, okay, maybe it’s just video or an image. But it’s clearly a very specialized version of those two.”
Ex-Meta Employee Leads AI Emotion Innovation
One of Zuckerberg’s former employees might be the furthest along in teaching emotion to AI. Alan Cowen, CEO of Hume AI, is a former Meta and Google researcher who’s built AI technology that can read the tune, timbre, and rhythm of your voice, as well as your facial expressions, to discern your emotions.
Hume’s EVI Bot Reads and Reacts to Emotions
As you speak with Hume’s bot, EVI, it processes the emotions you’re showing — like excitement, surprise, joy, anger, and awkwardness — and expresses its responses with “emotions” of its own. Yell at it, for instance, and it will get sheepish and try to defuse the situation. It will display its calculations on screen, indicating what it’s reading in your voice and what it’s giving back. And it’s quite sticky. Across 100,000 unique conversations, the average interaction between humans and EVI is 10 minutes long, a company spokesperson said.
Cowen: Voice Details Reveal Hidden Insights
“Every word carries not just the phonetics, but also a ton of detail in its tune, rhythm, and timbre that is very informative in a lot of different ways,” Cowen told me on Big Technology Podcast recently. “You can predict a lot of things. You can predict whether somebody has depression or Parkinson’s to some extent, not perfectly… You can predict in a customer service call, whether somebody’s having a good or bad call much more accurately.”
Hume’s $50M AI Reads Emotions in Voices
Hume, which raised $50 million in March, already offers its voice-emotion technology via an API, and it has working facial-expression tech that it has yet to release. The idea is to deliver far more data to AI models than they would get from transcribed text alone, enabling them to do a better job of making the end user happy. “Pretty much any outcome,” Cowen said, “it benefits to include measures of voice modulation and not just language.”
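For a concrete sense of what consuming such an API might look like, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the endpoint, payload shape, and response fields are hypothetical, not Hume’s documented interface.

```python
# Illustrative only: a minimal sketch of calling a voice-emotion API over HTTP.
# The endpoint, payload shape, and response fields are hypothetical,
# not Hume's documented interface.
import requests

API_KEY = "your-api-key"  # placeholder credential
ENDPOINT = "https://api.example.com/v1/voice/emotions"  # hypothetical endpoint


def score_emotions(audio_path: str) -> dict:
    """Send an audio clip and return emotion scores keyed by emotion name."""
    with open(audio_path, "rb") as f:
        response = requests.post(
            ENDPOINT,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"audio": f},
            timeout=30,
        )
    response.raise_for_status()
    return response.json()["emotions"]  # e.g. {"joy": 0.71, "anger": 0.04, ...}


scores = score_emotions("customer_call.wav")
print(max(scores, key=scores.get))  # the dominant emotion in the clip
```

The point of the extra data is in that return value: instead of a bare transcript, the downstream model also sees how something was said.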
Text Lacks Emotional Communication
Text is indeed a limited communication medium. Whenever anything gets somewhat complicated in a text exchange, humans tend to get on a call, send a voice note, or meet in person. We use emojis or write things like “heyy” to connote some emotion, but those tricks have their limits, Cowen said. Text is a good way to convey complex thoughts (as we’re doing here, for instance) but not to exchange them. To communicate effectively, we need nonverbal signals.
Voice Assistants Fail Without Emotional Context
Voice assistants like Siri and Alexa have been so disappointing, for instance, because they transcribe what people say and strip out all the emotion when digesting the meaning. That generative AI bots deliver quality experiences even in this form is notable, but it also shows how much better they could get, given how much information they’re missing.
Hume Trains AI on 1M Emotional Surveys
To program ‘emotional intelligence’ into machine learning models, the Hume team had more than 1 million people use survey platforms to rate how they were feeling, then connected those ratings to their facial expressions and speech. “We had people recording themselves and rating their expressions, and what they’re feeling, and responding to music, and videos, and talking to other participants,” Cowen said. “Across all of this data, we just look at what’s consistent between different people.”
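A toy sketch of that kind of setup, under heavy assumptions: synthetic stand-in features from recordings are regressed onto per-rater emotion ratings, which are averaged into a consensus target to approximate “what’s consistent between different people.” The data, sizes, and model choice are all invented for illustration.

```python
# A toy sketch of the supervised setup described above, with made-up data:
# features extracted from recordings are regressed onto participants'
# self-reported emotion ratings, averaged across raters so the model
# learns only what is consistent between different people.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_clips, n_features, n_raters, n_emotions = 1000, 64, 5, 4  # arbitrary sizes
X = rng.normal(size=(n_clips, n_features))  # stand-in prosody/face features
ratings = rng.normal(size=(n_raters, n_clips, n_emotions))  # per-rater ratings
y = ratings.mean(axis=0)  # consensus across raters: idiosyncratic noise averages out

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = Ridge().fit(X_train, y_train)  # multi-output regression, one output per emotion
print("held-out R^2:", model.score(X_test, y_test))
```

With real recordings and ratings, the held-out score would indicate how much of people’s shared emotional judgment the features actually capture.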
Hume AI Predicts and Modulates Emotional Responses
Today, Hume’s technology can predict how people will respond before it replies, and uses that to modulate its response. “This model basically acquires all of the abilities that come with understanding and predicting expression,” Cowen said. “It can predict if you’re going to laugh at something — which means it has to understand something about humor that it didn’t understand before — or it can predict if you’re going to be frustrated or if you’re going to be confused.”
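One way to picture that predict-then-modulate loop, as a hedged sketch: score each candidate reply with a reaction predictor and pick the one least likely to frustrate the user. The candidate replies and the stub predictor below are invented; Hume hasn’t published its system at this level of detail.

```python
# A hedged sketch of "predict, then modulate": rank candidate replies by a
# predicted user reaction and choose the least frustrating one. The reaction
# model here is a stub; in practice it would be a learned predictor.
CANDIDATES = [
    "Your claim was denied. Please resubmit the form.",
    "I'm sorry, that's frustrating. Let me walk you through resubmitting the form.",
]


def predict_frustration(reply: str, user_state: dict) -> float:
    """Stub reaction model: softening language is predicted to defuse anger."""
    softeners = ("sorry", "let me", "walk you through")
    score = user_state.get("anger", 0.0)
    if any(phrase in reply.lower() for phrase in softeners):
        score *= 0.5  # predicted to defuse, not inflame
    return score


user_state = {"anger": 0.8}  # e.g. output of the voice-emotion model
best = min(CANDIDATES, key=lambda r: predict_frustration(r, user_state))
print(best)
```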
Emotionally Intelligent AI to Revolutionize Companionship
The current set of AI products has been understandably limited given the incomplete information they’re working with, but that could change with emotional intelligence. AI friends or companions could become less painful to speak with; a New York Times columnist has already found a way to make friends with 18 of them. Elderly care, Cowen suggested, could improve with AI that looks out for people’s everyday problems while also being there as a companion.
Cowen Envisions Emotion-Aware AI in Everyday Products
Ultimately, Cowen’s vision is to build AI into products, allowing an AI assistant to read your speech, emotion, and expressions, and guide you through the experience. Imagine a banking app, for instance, that takes you to the correct pages to transfer money, or adjusts your financial plan, as you speak with it. “When it’s really customer service, and it’s really about a product,” Cowen said, “the product should be part of the conversation, should be integrated with it.”
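A hedged sketch of that kind of integration: the assistant routes a transcribed request, plus detected emotion, to in-app actions. The intents, actions, and keyword routing below are invented for illustration; a real system would use a learned intent model and the app’s actual navigation layer.

```python
# Illustrative only: an assistant that maps a transcribed request (plus
# detected emotion) onto app actions, so the product is "part of the
# conversation." All intents, actions, and routing logic here are invented.
def navigate(page: str) -> None:
    print(f"[app] opening {page} page")


ACTIONS = {
    "transfer_money": lambda: navigate("transfers"),
    "adjust_plan": lambda: navigate("financial-plan"),
}


def route(transcript: str, emotions: dict) -> None:
    """Crude keyword router standing in for a learned intent model."""
    intent = "transfer_money" if "transfer" in transcript.lower() else "adjust_plan"
    if emotions.get("confusion", 0.0) > 0.5:
        print("[assistant] You seem unsure; I'll walk you through each step.")
    ACTIONS[intent]()  # drive the app itself, not just the chat window


route("I want to transfer money to my sister", {"confusion": 0.7})
```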
Emotional Intelligence: AI’s Path Beyond Resource Limits
Increasingly, AI researchers are discussing the likelihood of slamming into a resource wall, given the limits on the data, compute, and energy they can throw at the problem. Model innovation, at least in the short term, seems like the most likely way around some of those constraints. And while programming emotional intelligence into AI isn’t guaranteed to be what advances the field, it has a real chance. It also shows a way forward, toward building deeper intelligence into this already impressive technology.