ARTIFICIAL INTELLIGENCE: OpenAI uses a game of Hide and Seek to train the next generation of sophisticated AIs

Without wading into slippery-slope arguments about the evolution of biological organisms, we know for a fact that natural selection and competition both play a massive part in the development of the complex life forms, and the intelligence, that exist today. It is this dynamic that a team at the San Francisco-based for-profit AI research lab OpenAI is trying to replicate in a virtual world, with the aim of creating a more sophisticated AI.

The experiment centers on two existing ideas. (1) Multi-agent learning: pitting multiple algorithms against each other to provoke emergent behaviors through inherent competition and coordination (similar to an ant colony). (2) Reinforcement learning: the best-known, yet time- and resource-intensive, method of training an AI via trial and error (similar to teaching a child how to ride a bicycle). The latter was the method initially used to train OpenAI's 'Dota 2' bot, OpenAI Five, which played the equivalent of 180 years' worth of the multiplayer online video game against itself, and against past versions of itself, every single day. This was not in vain: earlier this year OpenAI Five won 7,215 games of Dota 2 against human players from around the globe, ending up with an astounding overall victory rate of 99.4%.
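To make the trial-and-error idea behind reinforcement learning concrete, here is a minimal, self-contained sketch of tabular Q-learning on a toy five-state corridor. Everything in it, the environment, the reward of 1 at the goal, and the hyperparameters, is an illustrative assumption for this example; OpenAI's actual agents are trained with far larger-scale methods and environments.

```python
# Toy tabular Q-learning: an agent in a 5-state corridor learns, purely by
# trial and error, that moving right leads to a reward at the final state.
# This is an illustrative sketch, not OpenAI's actual training setup.
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # action 0 = move left, action 1 = move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]

def step(state, action):
    """One environment transition: clamp moves to the corridor, reward at goal."""
    nxt = max(0, min(N_STATES - 1, state + ACTIONS[action]))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for episode in range(300):               # the trial-and-error loop
    state = 0
    for _ in range(100):                 # cap episode length
        if random.random() < EPS:        # sometimes explore at random...
            action = random.randrange(2)
        else:                            # ...otherwise exploit the best estimate
            best = max(q[state])
            action = random.choice([a for a in range(2) if q[state][a] == best])
        nxt, reward, done = step(state, action)
        # Q-learning update: nudge the estimate toward reward + discounted future value
        q[state][action] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][action])
        state = nxt
        if done:
            break

# After training, the greedy policy moves right (action 1) in every non-goal state.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told the rules; it stumbles into the goal, and the reward gradually propagates backward through the Q-table until the learned policy heads straight for it. OpenAI's hide-and-seek agents follow the same principle, just at vastly greater scale and in a richer environment.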

Let's get back to the experiment, shall we? By playing a simple game of hide and seek hundreds of millions of times, two opposing teams of AI agents developed complex hiding and seeking strategies involving tool use and collaboration. The emergence of such strategies offers insight into OpenAI's dominant research approach: dramatically scaling existing AI techniques to see what properties emerge. The strategies grew steadily more sophisticated, from hiders learning to use boxes to block exits and barricade themselves inside rooms, to agents eventually learning to exploit glitches in their environment, with hiders getting rid of ramps for good by shoving them through walls at a certain angle, and seekers surfing boxes to gain higher ground and catch the hiders. It is important to note that these are algorithms learning from one another, autonomously developing strategies to compete and cooperate without human intervention. Pretty mind-blowing, if you ask us.

This is interesting because human-relevant strategies and skills can emerge from multi-agent competition and standard reinforcement learning algorithms at scale. These results inspire confidence that in a more open-ended and diverse environment, such as financial markets, multi-agent dynamics could lead to extremely complex and human-relevant behavior, and potentially solve problems that humans don't yet know how to solve. We have found that the total impact of AI implementations across financial sectors could reach $1 trillion by 2030, equivalent to a 22% reduction in traditional costs. Breaking this down, the potential cost exposure consists of $490 billion in the front office (distribution), $350 billion in the middle office, and $200 billion in the back office (manufacturing). To learn more, or to get access to the report, click here.