
How virtual training environments are teaching AI models to work in the real world

Posted by Ditto on Jan 29, 2020 11:00:00 AM

Whether it’s powering a self-driving car or serving as the brain behind a dexterous robotic arm, AI is capable of performing tasks more accurately, efficiently and safely than we humans can – with a major caveat. Like us, it has to learn how to do those things in the first place.

The artificial neural networks powering deep learning have made adaptive learning a reality, but when it comes to robotics and autonomous vehicles, there’s a high cost of failure. A self-driving car crash means costly repairs, and reinforcement learning on a physical robotic arm can take a huge amount of time, with hundreds of thousands of attempts needed before a task is mastered.

Enter virtual training environments.


Learning in VR

In the past few years, major AI developers have found a way to avoid damaging their physical capital during the learning process by training their programmes in virtual environments.

Thanks to deep learning algorithms that are able to observe, imitate and troubleshoot tasks, the AI ‘brains’ behind robotics can learn in an entirely risk-free setting by ‘imagining’ the education process. Developers demonstrate the action they want their AI to take in a virtual environment, and the system is then fed as many digitally constructed variations of the task as possible, learning and improving with each attempt.
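
To make the idea concrete, here’s a minimal sketch in Python of the demonstrate-then-randomise step. The scene fields, the demonstration format and the noise levels are all hypothetical placeholders, not OpenAI’s actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def randomise_scene(base_scene):
    # Return a randomly perturbed copy of a nominal simulated scene.
    # The perturbed fields (lighting, camera pose, object position) are
    # illustrative stand-ins for whatever a real simulator exposes.
    scene = dict(base_scene)
    scene["light_intensity"] = rng.uniform(0.5, 1.5)
    scene["camera_angle_deg"] = rng.normal(0.0, 5.0)
    scene["object_xy"] = base_scene["object_xy"] + rng.normal(0.0, 0.02, size=2)
    return scene

# One recorded demonstration (here just a placeholder action sequence) is
# paired with thousands of randomised variants of the scene, so the model
# never trains on exactly the same conditions twice.
base_scene = {
    "light_intensity": 1.0,
    "camera_angle_deg": 0.0,
    "object_xy": np.array([0.3, 0.1]),
}
demonstration = ["reach", "grasp", "lift", "place"]

training_set = [(randomise_scene(base_scene), demonstration) for _ in range(10_000)]
```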

This doesn’t just cut down on damage to assets; it also has the potential to cut learning times dramatically. Cloud-powered virtual environments allow an exponential increase in the number of learning attempts that can be made at once. This kind of learning acceleration has been witnessed before: OpenAI’s video game-playing bot gained ‘lifetimes of experience’ in around two weeks before going on to beat the best Dota 2 players in the world.
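
The acceleration comes from running many simulator instances side by side. Here’s a toy sketch of that pattern, with a trivial stand-in where a real physics simulator would go:

```python
from multiprocessing import Pool

def run_episode(seed):
    # Stand-in for one simulated learning attempt. A real setup would step
    # a physics simulator here; a toy calculation keeps the sketch runnable.
    state = seed
    for _ in range(1_000):
        state = (1103515245 * state + 12345) % 2**31  # toy 'dynamics'
    return state % 100  # pretend episode reward

if __name__ == "__main__":
    # Eight local workers run eight attempts at once; swap the pool for a
    # cloud cluster and the same pattern scales to thousands of episodes.
    with Pool(processes=8) as pool:
        rewards = pool.map(run_episode, range(64))
    print(f"completed {len(rewards)} simulated episodes")
```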

Virtual training in action

Perhaps unsurprisingly given their past success, OpenAI are among the leaders in virtual training for AI-powered robotics. The organisation has been teaching a robotic arm to complete tasks by educating it in VR. The AI’s ‘vision network’ is shown hundreds of thousands of digitally constructed images representing changes to its environment, while its ‘imitation network’ observes a human guide performing an action in VR. The setup is designed to mimic the way that humans intuitively learn and build knowledge through their environment, physical bodies and brains (known as ‘embodied cognition’).

When the two networks come together, the system is able to account for variations and intuitively complete a task such as block-stacking without ever having physically stacked blocks. The educated AI ‘brain’ is simply transferred to the robotic arm, and it’s ready to go. As OpenAI themselves put it:

‘We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.’
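
Schematically, the pairing might look like the sketch below. Both networks are reduced to stubs (the real ones are deep neural networks) and the class and method names are our own invention; the point is the data flow: pixels go into the vision network, estimated block positions come out, and the imitation network turns those positions plus a single demonstration into the next action.

```python
import numpy as np

class VisionNetwork:
    # Maps a camera image to estimated block positions. The real version is
    # a deep network trained on randomised simulated images; this stub just
    # returns fixed guesses so the pipeline runs end to end.
    def predict_positions(self, image):
        return np.array([[0.30, 0.10], [0.45, 0.12]])

class ImitationNetwork:
    # Maps (current block positions, one demonstration) to the next action.
    def predict_action(self, positions, demonstration):
        # Stub: a real network would pick the demonstration step that
        # matches the observed state.
        return demonstration[0]

camera_frame = np.zeros((224, 224, 3))  # stand-in camera image
demonstration = ["reach", "grasp", "lift", "place"]

vision, imitation = VisionNetwork(), ImitationNetwork()
positions = vision.predict_positions(camera_frame)
action = imitation.predict_action(positions, demonstration)
print(f"blocks seen at {positions.tolist()}; next action: {action}")
```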

Beyond stacking blocks

OpenAI aren’t the only group experimenting with virtual training. Nvidia’s ISAAC platform ‘lets developers train and test their robot software … using virtual simulation’, and boasts that ‘engineering iterations and testing can be done in minutes … [with] no risk of damage or injury.’

When it comes to autonomous vehicles, there’s been some impressive evidence of the method’s efficacy. Researchers at MIT have used the same basic principles to train a drone to navigate complex obstacle courses, letting it fly through an empty warehouse while ‘hallucinating’ the course it has to complete. The approach lets developers test autonomous navigation software without damaging vehicles – the implications for the self-driving car industry are obvious.
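
A rough sketch of the ‘hallucination’ trick: the drone’s real position is checked against purely virtual obstacle geometry, so a crash is just a log message while the aircraft flies through empty space. The coordinates and names here are illustrative, not taken from MIT’s system:

```python
import numpy as np

# Virtual obstacles as (centre_x, centre_y, radius) in metres. They exist
# only in software; the warehouse the drone actually flies through is empty.
VIRTUAL_OBSTACLES = [(2.0, 1.0, 0.5), (4.0, -0.5, 0.3)]

def virtual_collision(real_xy, obstacles, drone_radius=0.2):
    # Flag a 'collision' between the drone's real position and virtual geometry.
    for cx, cy, r in obstacles:
        if np.hypot(real_xy[0] - cx, real_xy[1] - cy) < r + drone_radius:
            return True
    return False

# In flight, real position estimates stream in; only the obstacles are fake,
# so a navigation failure costs nothing but a log entry.
for xy in [(0.0, 0.0), (1.9, 1.1), (5.0, 0.0)]:
    status = "virtual collision!" if virtual_collision(xy, VIRTUAL_OBSTACLES) else "clear"
    print(xy, status)
```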

Explaining the training

Virtual training environments may well be the missing link in the quest for embodied cognition in AI-powered robots. The approach cuts risk, takes far less time and costs less, all while providing repeatable results.

It’s a potential boost for explainable AI, but it also presents a challenge. Virtual training of the sort practised at OpenAI has a logical, human process to it that past black-box algorithms have lacked. The vision-imitation network pairing provides a clear trail of breadcrumbs when developers want to know how their AI has come to a decision.

But when it’s operated at the scale that cloud computing affords, there’s a risk that autonomous deep learning networks will learn bad habits or mis-infer the task at hand. It’s something developers will need to watch for and experiment with as AI takes on the fallibility that comes with human-inspired learning processes.


Topics: Explainable AI, Latest Developments, Future of AI, Deep Learning