A bunch of brainiacs from Huawei’s Noah’s Ark Lab in Paris just dropped some pre-print research discussing a framework they’re cookin’ up for “embodied artificial intelligence” (E-AI). They’re claiming this could be the next big leap towards achieving artificial general intelligence (AGI).
AGI, also known as “human-level AI” or “strong AI,” basically means an AI system that can handle pretty much any intellectual task you throw at it, as long as it’s got the computing resources it needs to run. Now, there’s no one-size-fits-all definition of what makes an AI system a true general intelligence, but companies like OpenAI are all-in on chasing this tech dream.
So, when the big GPT (generative pre-trained transformer) tech hit the scene in the late 2010s, a bunch of AGI experts started chanting the mantra: “scale is all you need.” Basically, they thought that if you beefed up transformers to massive scales beyond what we had at the time, it could pave the way to building an AGI model.
But the gist of the Huawei crew’s paper is that big language models like OpenAI’s ChatGPT and Google’s Gemini just don’t cut it when it comes to grasping the real world. Why? Because they’re not living in it.
“It is a prevalent belief that simply scaling up such models, in terms of data volume and computational power, could lead to AGI. We contest this view. We propose that true understanding … is achievable only through E-AI agents that live in the world and learn of it by interacting with it.”
According to the researchers, if AI agents want to really get down and dirty in the real world, they’ve gotta have some sort of body that can perceive, act, remember, and learn.
When we talk about perception here, we’re talking about giving the AI system the power to grab raw data straight from the real world, right as it happens, and then crunch and encode that data into some kind of learning space. Basically, the AI needs to be able to focus on what it wants, using its own “eyes” and “ears,” if it wants to really get a handle on the real world and act like a true general intelligence.
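To make that perceive-act-remember-learn loop concrete, here’s a minimal Python sketch. Everything in it (the EmbodiedAgent class, the random-projection “encoder” standing in for a learned one, the fake sensor frames) is an illustrative assumption on our part, not anything lifted from the Huawei paper:

```python
import numpy as np

class EmbodiedAgent:
    """Toy perceive-act-remember-learn loop (hypothetical illustration)."""

    def __init__(self, embedding_dim: int = 128):
        self.memory: list[np.ndarray] = []  # episodic store of past encodings
        # A random projection stands in for a learned perception encoder.
        self.encoder = np.random.randn(3, embedding_dim)

    def perceive(self, raw_observation: np.ndarray) -> np.ndarray:
        """Encode raw sensor data (e.g. pixels) into a learning space."""
        return raw_observation @ self.encoder

    def act(self, encoding: np.ndarray) -> int:
        """Pick one of four pretend actions from the current encoding."""
        return int(np.argmax(encoding[:4]))

    def remember(self, encoding: np.ndarray) -> None:
        """Store the encoded episode for later recall."""
        self.memory.append(encoding)

    def learn(self, reward: float) -> None:
        """Placeholder update: perturb the encoder after rewarding episodes."""
        if self.memory and reward > 0:
            self.encoder += 0.01 * reward * np.random.randn(*self.encoder.shape)


agent = EmbodiedAgent()
for step in range(10):
    observation = np.random.rand(3)         # stand-in for a camera/mic frame
    encoding = agent.perceive(observation)  # perceive: raw data -> learning space
    action = agent.act(encoding)            # act on the world
    agent.remember(encoding)                # remember the episode
    agent.learn(reward=np.random.rand())    # learn from feedback
```

The point of the sketch is just the shape of the loop: raw sensor data gets encoded into a learning space, actions come out of that space, and both memory and the encoder itself get updated as the agent goes.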
In the end, the researchers lay out a theoretical plan for how a large language model (LLM) or a basic AI model could one day have a body and achieve these aims. But they’re quick to mention there are tons of hurdles to jump over. One biggie is that the beefiest LLMs right now are chillin’ in massive cloud data centers, which makes giving them bodies a tough nut to crack with today’s tech.