New LLM-based AI Agent Achieves Human Performance Levels in Pokémon
A new creation from Georgia Tech researchers explores the capabilities of large language models (LLMs) with an action-based artificial intelligence (AI) agent through online Pokémon battles.
The invention, PokéLLMon, uses LLMs with a suite of optimizations to autonomously turn text into actions. It is the first LLM-embodied AI agent that achieves human-parity performance in tactical battle games.
School of Computer Science Professor Ling Liu and two of her Ph.D. students, Sihao Hu and Tiansheng Huang, created PokéLLMon.
The group says it chose Pokémon battles as the testbed for two key reasons. First, the game involves many strategies that can easily be translated into text. Second, the win rate of games can be directly measured and used to evaluate the AI agent’s performance.
Depending on the opposing human player’s gameplay and battle strategies, the PokéLLMon agent can use three optimization techniques:
· In-context reinforcement: With this technique, PokéLLMon uses text-based feedback from battles to refine its strategy and learn which approaches are effective.
· Knowledge-augmented generation: This optimization uses game knowledge that the team coded into the model to enable just-in-time decision making and combat potential hallucinations of LLMs.
· Consistent-action generation: When facing a powerful opponent, this optimization helps PokéLLMon take timely and effective actions instead of resorting to panic-induced strategy switching.
These strategies enable PokéLLMon to achieve human performance levels in games against human players.
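As a rough illustration of how the three techniques could fit together in a single turn, here is a minimal Python sketch. It is not the team's implementation: the type chart, the prompt wording, the function names, and the use of majority voting over several sampled answers for consistent-action generation are all assumptions made for this example, and `llm` stands in for any function that maps a prompt to an action string.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical, tiny subset of type matchups; real game data is far larger.
TYPE_CHART: Dict[Tuple[str, str], float] = {
    ("water", "fire"): 2.0,
    ("fire", "water"): 0.5,
    ("electric", "water"): 2.0,
    ("electric", "ground"): 0.0,
}


def effectiveness_note(move_type: str, target_type: str) -> str:
    """Knowledge augmentation (assumed form): state the fact explicitly
    in the prompt instead of relying on what the model remembers."""
    mult = TYPE_CHART.get((move_type, target_type), 1.0)
    return f"{move_type} moves deal x{mult} damage to {target_type} targets"


def build_prompt(state: str, feedback: List[str], knowledge: List[str]) -> str:
    """In-context reinforcement (assumed form): text feedback from earlier
    turns is placed in the prompt so the model can adjust its next choice."""
    lines = ["Battle state: " + state]
    lines += ["Feedback from earlier turns: " + f for f in feedback]
    lines += ["Known fact: " + k for k in knowledge]
    lines.append("Choose one action.")
    return "\n".join(lines)


def consistent_action(llm: Callable[[str], str], prompt: str, samples: int = 3) -> str:
    """Consistent-action generation (assumed form): query the model several
    times and keep the most frequent answer, which damps erratic switching."""
    votes = Counter(llm(prompt) for _ in range(samples))
    return votes.most_common(1)[0][0]
```

In use, a caller would build the prompt from the current battle state, the previous turn's text feedback, and any relevant `effectiveness_note` facts, then pass it with a real model wrapper to `consistent_action`. Voting over a few samples trades extra model calls for steadier decisions.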
“Our LLM-embodied AI agent is the first one that leverages LLM to play online Pokémon battles in real time, achieving a win rate of 49% in ladder competitions, and 56% in invited battles through our testing of over 100 battles,” said Liu.
Hu said that the first version of the PokéLLMon agent can still be tricked by expert human players and some of their attrition strategies.
The research group will continue its exploration of LLM-embodied AI action agents for online games. One plan involves creating an LLM-enhanced AI agent that can play fully autonomously in open-world video games.
For more information about PokéLLMon, read the full research paper, “PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models,” or visit the project’s GitHub page.