OpenAI, the AI research organization, can claim a world first: its artificial intelligence system trained to play the complex strategy game Dota 2 has bested a world champion e-sports team. The competition was held in San Francisco today and dubbed the OpenAI Five Finals, ending the organization’s public demonstrations of its Dota-playing technology on a high note.
The competition on the human side included five top Dota 2 pros from team OG, which won the world’s most coveted e-sports prize last year when it took the No. 1 spot at The International, the premier annual Dota 2 tournament with prizes now totaling $25 million. OG faced off in a best-of-three contest against the OpenAI Five bots, all trained using the same deep reinforcement learning techniques and controlled independently by different layers of the same system. Reinforcement learning is effectively a trial-and-error approach to self-improvement, wherein the AI is dropped into the game environment with zero understanding of how the game works and trained extensively using reward systems and other incentivizing mechanisms.
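To make the trial-and-error idea concrete, here is a minimal sketch in Python. It is not OpenAI's method: it swaps in a multi-armed bandit with an epsilon-greedy rule, a far simpler setting than Dota 2, and every name and number in it (`train_bandit`, the reward probabilities, `epsilon`) is a hypothetical chosen for illustration. The agent starts with no knowledge of which action pays off and improves only from the rewards it receives.

```python
import random

def train_bandit(reward_probs, episodes=5000, epsilon=0.1, seed=0):
    """Learn action values purely from trial and error (epsilon-greedy).

    reward_probs is a hypothetical environment: the chance that each
    action yields a reward of 1. The agent never sees these numbers.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    values = [0.0] * n  # estimated reward per action; starts with no knowledge
    counts = [0] * n    # how often each action has been tried
    for _ in range(episodes):
        # Occasionally explore a random action; otherwise exploit the
        # action with the best current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda a: values[a])
        # The environment responds with a reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental running average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

values = train_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("preferred action:", best_action)
```

The same loop of act, observe reward, and update estimates is the core of reinforcement learning, though a system that plays a full Dota 2 match needs vastly larger models and far more experience than this toy.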
Today’s performance is by far the highest-quality demonstration of OpenAI Five’s capabilities to date, with the system having narrowly lost two games to less capable e-sports teams last August. According to OpenAI co-founder and chairman Greg Brockman, who is also the organization’s chief technology officer, OpenAI Five improves by playing itself in an accelerated virtual environment. “OpenAI Five is powered by deep reinforcement learning, which means we didn’t code it how to play. We coded it how to learn,” Brockman told the crowd ahead of the competition. “In its 10 months of existence, it’s already played 45,000 years of Dota 2 gameplay. That’s a lot — it hasn’t grown bored yet.”
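Brockman's figures imply a striking ratio of simulated play to real time. The short Python calculation below works it out from the two numbers he quoted. The 30-day month is a simplifying assumption, and the result is only an aggregate: the article does not say how much comes from faster simulation and how much from many games running at once.

```python
# Figures quoted by Brockman; the variable names are illustrative.
YEARS_PLAYED = 45_000      # years of Dota 2 gameplay
WALL_CLOCK_MONTHS = 10     # months the system has existed

# Years of gameplay accumulated per year of real time.
speedup = YEARS_PLAYED * 12 / WALL_CLOCK_MONTHS

# Years of gameplay per real day, assuming 30-day months.
years_per_day = YEARS_PLAYED / (WALL_CLOCK_MONTHS * 30)

print(f"about {speedup:,.0f} times real time")
print(f"about {years_per_day:.0f} years of play per day")
```

In other words, the system accumulated on the order of a century and a half of gameplay for every day it trained.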
Dota 2 is a vastly complex strategy game, involving more than 100 unique characters, deep skill trees and item lists, and a dizzying array of variables playing out onscreen at any given moment in a match. As such, OpenAI imposes certain limits when its AI system plays professional players, most prominently by capping the number of heroes used by both five-player teams.
In this case, each squad had 17 heroes to choose from. OpenAI also chose the so-called “Captain’s Draft” game mode, which lets each team strategically ban heroes to prevent the other team from selecting those characters before using a distinct picking order. That lets the captain build off strengths between hero combinations and leverage enemy hero weaknesses through strong counters once the teams do begin filling out the roster one by one. As in prior matches, OpenAI also disabled summoning and illusion features, both of which involve bringing on additional variables in the form of hero copies and unique creatures that OpenAI hasn’t trained its system to account for.
Beyond that, the game is played just like a normal Dota 2 match, with the ultimate goal of destroying the enemy team’s “ancient,” a large tower at the end of each team’s territory that becomes vulnerable only when the opposing team successfully destroys smaller towers throughout the course of the match, in between hero-on-hero team fights.
In the first match of the day, OpenAI Five surprised OG and claimed victory through reliance on a number of aggressive tactics, including the peculiar decision to spend earned in-game currency to instantly revive heroes upon death, even early on in the match. As Brockman noted, OpenAI Five is fond of strategies that favor short-term gain, revealing its deficiencies in mastering the type of long-term planning humans are great at and typically rely on to win such strategy contests. However, in this match, the early buybacks paid off and OpenAI Five gained an edge that OG simply could not overcome as the match dragged on into the 30-minute range.
We see this happen in test games all the time: the bots buy back, the humans laugh, and then the humans lose. Hard to know if it’ll happen here too…
— Greg Brockman (@gdb) April 13, 2019
In the second match, OpenAI Five performed even better, gaining an early advantage against OG in the first few minutes and then ruthlessly advancing on the human players until it clinched victory in a little more than half the time it needed to win the first match. Mike Cook, an avid Dota 2 player and viewer who specializes in the blending of AI and game design, noted how unusually aggressively OpenAI Five began playing in the second match, and how little OG was doing to combat its advances across the map. Cook noted specifically how well OpenAI Five was able to take advantage of its specific hero picks.
This is probably over already, sadly. OpenAI have four of the top five heroes ranked by net worth. At ten minutes in, against bots with the execution of OpenAI, this is really bad. #openaifive
— mike cook (@mtrc) April 13, 2019
For OpenAI, the victory here is not just a cause for celebration in and of itself, but a testament that its approach to reinforcement learning and its general philosophy about AI are yielding milestones. The research team will no longer hold any public demonstrations of its AI bot, but it’s now working on software that will let humans collaborate alongside the OpenAI Five software in real time, playing on a team with the bots and learning from their peculiar, unprecedented strategies and behaviors. The organization is also releasing a platform for the public to play against OpenAI Five, a mode it’s calling Arena, that will be open for three days starting April 18th.
Special announcement: we’re inviting the entire Internet to play OpenAI Five (whether as a competitor or teammate) at once.
Sign up today! Very excited to see what we learn from observing OpenAI Five in the wild. pic.twitter.com/TaMhxdgVIt
— Greg Brockman (@gdb) April 13, 2019
OpenAI says that the collaboration software may not ever make it to the public, although I was able to try it for myself here at the event. (Despite having world-class Dota 2 AI on my team, I unfortunately was crushed in much less dramatic fashion than OG.) But Sam Altman, co-founder and CEO of OpenAI, says this type of work is evidence that collaboration with AI agents could result in huge benefits in the future.
“That’s an important lesson for how the world is going to work, training these things and have them work in parallel,” Altman says in an interview with The Verge. “Collaboration is one of the more positive visions we have for the future of the world — AI works alongside humans to make humans better and have more fun and more impact.”
Altman says OpenAI will likely continue to dabble with Dota 2 and other video game environments, primarily because they are such good test beds for AI and good benchmarking tools for measuring progress. But he tells me there probably does not exist a video game out there right now that a system like OpenAI Five can’t eventually master at a level beyond human capability. For the broader AI industry, mastering video games may soon become passé, simple table stakes required to prove your system can learn fast and act in a way required to tackle tougher, real-world tasks with more meaningful benefits.
Ultimately, OpenAI wants to take its Dota 2 learnings and expand them to new domains outside games and, eventually, into the real world. To that end, the organization is working on using reinforcement learning and other techniques to imbue robotic hands with more deft, dexterous, and humanlike movement.
“What OpenAI is trying to do is build general artificial intelligence and to share those benefits with the world and make sure it’s safe,” Altman says, referring to the quest to build a multi-purpose AI system capable of performing any task a human can. “We’re not here to beat video games, as fun as that is. We’re here to uncover secrets along the path to AGI.”
Correction: A previous version of this article said OpenAI co-founder Sam Altman was the organization’s chairman. He is in fact the CEO, while CTO Greg Brockman is its chairman.