Look Whose AI is Talking

In the latest development in AI technology, Taco Bell’s parent company, Yum! Brands, announced it is testing Voice AI in its drive-thrus. Will consumers eat this innovation with their tacos? It remains to be heard. A bit earlier, Spotify released its AI DJ X to your devices and speakers, elevating “personalization to a whole new level.” Where is AI heading when it already seems ahead of us? And it does not need fifteen-minute breaks.

From a developer’s point of view, these new technologies are impressive. Bounteous’ Chris Crichton, who worked on Taco Bell’s AI, said as much in a LinkedIn post: “I’ve spent the last few weeks testing Taco Bell’s voice AI drive-thru experience. Like most innovation, it’s a work in progress with iterative improvements being made frequently (modifiers, integration into ODMB screen, loyalty integration, etc.)… and it’s IMPRESSIVE!” Voice AI is the result of Yum! Brands building its own technologies in addition to acquiring others.

Spotify, on the other hand, acquired a voice engine, Sonantic, to help with AI development, a decision somewhat influenced by social media. To give the DJ a bit of personality, the company based DJ X on an actual person involved in the project: Xavier “X” Jernigan, Spotify’s head of cultural partnerships.

“We have cultural experts and creative producers in-house. We use all of that knowledge of culture and what’s happening in music, our relationship with artists and our relationships with labels to help in the curation process. It’s not a replacement for a DJ, it’s an enhancement,” he told Forbes.

However, the voices we hear from Spotify’s DJ X and Taco Bell’s speakers are the culmination of work on underlying technologies and of consolidations, all dependent on computer processing power and the ability to transmit bits of information across distances. A similar process of combining technologies marked the birth of telecommunications in the last century and gave people an early education in consuming data across the spectrum.

Where do we go from here with AI audio? Opportunities are endless because distribution platforms are already in place. Terrestrial radio will not be immune to efforts to bring artificial intelligence onto its airwaves. 99.1 AI is coming on the air in your city soon.

Currently, we are at the end of the beginning stage of AI. Early developments in AI were more broadly accepted. Early word processors included features like spell check, autocompleted sentences, and preset email replies (“Sounds good!”). Even Adobe, with its Photoshop apps, could generate extra pixels to autocomplete an image.

We are starting to learn about AI: how it works and operates and what it can offer. At the same time, AI is doing the same thing, looking for better ways to manage and to communicate the way humans do. Can AI replace another human? As Christine Ro noted in her article for the BBC, “The convenience and breadth of an AI chatbot can’t compete with the pleasures of chatting with someone whose personality quirks I’ve learned over the course of years. It is however a useful supplement.”

Educational applications are another way to utilize the power of computer chips converging into a human voice. One of the early educational products is Gliglish, created by Fabien Snauwaert, which helps people learn by chatting with AI in other languages. I gave it a try to experience the features and see how it works. We chatted for a bit about movies, mainly the penultimate installment of the Mad Max series by George Miller. Gliglish AI corrected me about the movie title at first. “I think you mean ‘Mad Max: Fury Road.’ Why do you like it?” Gliglish asked. I opened up and mentioned it reminded me of Moby Dick, the way they chased after the big rig through the desert. The female voice of Gliglish noted, “That’s an interesting comparison! The chase scenes in the movie are definitely intense.” Boom, she gets it.

The application has been described as a success by its developer: “Gliglish is a website to learn languages by speaking with an AI teacher. People just talk into their microphone to practice on their phone or computer. It’s been taking the world of language learning by storm (440K views per month) with its focus on spoken practice and fun, casual conversations.”

Having a voice in personal and business matters is a human right. Connecting the brain to voice is natural; we don’t think about it too much. Things get tricky when it comes to giving a voice to other people or entities. What kind of technology is behind the voice? Apple recently introduced its own product, Apple Intelligence, doing what most other companies will do with their AI products: rebranding the technology and addressing privacy concerns.

Many people were speechless when they learned of AI’s capabilities. In the last six months, I’ve tested a large number of applications. Many companies are trying to develop useful applications for, in my case, advertising. The claim is that AI can simplify the process by automating tasks and saving time. The concern has always been the data that makes up the AI: how it was obtained, what the output is, and how to control it. Yes, some apps can create videos, but those apps require initial human guidance.

Are we talking about the next industrial revolution, with data flowing across production lines to be reassembled across various platforms? It remains to be seen. What do you say, AI?