Opponent modeling

From PokerAI

From Salim & Rohwer, 2005

This section is quoted from Salim and Rohwer, Poker Opponent Modeling (see Links):

Opponent modeling aims to predict an opponent’s future actions accurately. For poker, this is difficult: it is a game of imperfect information, chance and incomplete knowledge. Contrast this with other games targeted by machine learning research. Chess has a game state known to each player, and there is no risk or chance, since playing the best move is always the best action. Other games where chance is present, like backgammon, still retain perfect information. And games that do retain both chance and imperfect information typically include just one opponent. These games, such as degenerate one-against-one poker games and RoShamBo (the children’s game of rock, paper, scissors), do not have the additional complexity of play against multiple opponents.

These difficulties have led researchers to conclude that “opponent modeling in poker appears to have many of the characteristics of the most difficult problems in machine learning: noise, uncertainty, an unbounded number of dimensions to explore, and a need to quickly learn and generalize from relatively small number of heterogeneous training examples.” Heterogeneous is used because when a player folds or quits the game early, their cards, the missing link of poker’s imperfect nature, are not revealed to the other players.

What is the gain from studying opponent modeling? Human poker players are good at understanding their opponents. The best human players are frequently able to form an accurate model from single data points. And while the best poker programs have improved with opponent modeling, their developers conclude that there are numerous opportunities for improvement and that opponent modeling is critical for a poker program to defeat the best human players.

In computer poker game-playing research, the University of Alberta Poker Research group leads a sparse field of researchers. They have developed an excellent poker-playing machine called Poki, building on their prior work developing a world champion checkers program, Chinook. Poki and the University of Alberta research group are focused on adaptive artificial intelligence. Key to Poki’s success thus far is adjustment to new information. Yet the deluge of information leads Poki to adjust more slowly to opponents. For example, in heads-up trials with an online poker legend, their PokiBot outplayed the human over 3500 hands. But then the human changed course, refocusing after modeling the PokiBot. He changed his playing style from overly aggressive to cagey and passive, outplaying PokiBot over the next 3500 hands. In the group’s paper, ’The Challenge of Poker’, they conclude that the build-up of inertia after thousands of observed data points can be detrimental if the player changes mood. Past success may have been due to the static or fixed playing style of opponents. They also conclude that it is difficult to track good players who consistently alter their playing style over relatively brief periods. This adjustment inertia helps explain why the human expert player proved superior to Poki.

Our prescription for Poki’s adaptation inertia is to vary playing style, pursuing the emotion of the table by tracking the ebb and flow of the game. This means tracking opponents with both a long-term and a short-term picture: keeping long-term measures of frequency and building short-term models for adaptation. For us, a black-box neural net does not provide a simple enough understanding. By using a case base and a case-based reasoning framework, we will understand the influences on our poker bot. The combination of long-term opponent characteristics and short-term opponent changes will target a table temperature. Is one player who usually has an air of passivity, folding early and often, suddenly playing like a maniac and betting aggressively over the short term? If we recognize this historical difference and alter our expectations quickly, then our poker bot should loosen its hand strength threshold requirement and let the maniacal opponent lose more!
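
The long-term versus short-term idea above can be illustrated with a minimal Python sketch. This is not the authors’ implementation: the class name OpponentTracker, the function adjusted_threshold, the window size, the sensitivity and floor values, and the choice of bets and raises as the measure of aggression are all assumptions made for illustration. It keeps a lifetime aggression frequency, compares it with the frequency over a recent window of actions, and lowers a hand strength threshold when a normally passive opponent turns aggressive.

```python
from collections import deque

# Assumption: bets and raises count as aggressive, everything else as passive.
AGGRESSIVE_ACTIONS = {"bet", "raise"}


class OpponentTracker:
    """Tracks one opponent's aggression over the long and the short term."""

    def __init__(self, window=20):
        # window is a hypothetical size for the short-term picture.
        self.total_actions = 0
        self.total_aggressive = 0
        self.recent = deque(maxlen=window)  # oldest entries drop off automatically

    def observe(self, action):
        aggressive = action in AGGRESSIVE_ACTIONS
        self.total_actions += 1
        self.total_aggressive += int(aggressive)
        self.recent.append(aggressive)

    def long_term_aggression(self):
        if self.total_actions == 0:
            return 0.0
        return self.total_aggressive / self.total_actions

    def short_term_aggression(self):
        if not self.recent:
            return 0.0
        return sum(self.recent) / len(self.recent)

    def temperature_shift(self):
        # Positive when the opponent is currently more aggressive than usual.
        return self.short_term_aggression() - self.long_term_aggression()


def adjusted_threshold(base, shift, sensitivity=0.5, floor=0.2):
    """Loosen (lower) the hand strength threshold against unusual aggression.

    sensitivity and floor are illustrative values, not tuned parameters.
    """
    if shift <= 0:
        return base
    return max(floor, base - sensitivity * shift)


# Usage: a passive player (80 folds) who suddenly raises 20 times in a row.
tracker = OpponentTracker(window=20)
for action in ["fold"] * 80 + ["raise"] * 20:
    tracker.observe(action)

threshold = adjusted_threshold(0.6, tracker.temperature_shift())
print(tracker.long_term_aggression(), tracker.short_term_aggression(), threshold)
```

A sliding window is only one way to build the short-term model. An exponentially weighted average or a case base of recent hands would serve the same purpose of reacting faster than the lifetime statistics allow.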

Other

The University of Alberta games group has published some interesting reading on opponent modeling in heads-up limit hold’em.

See Also

Links

  • Michel Salim and Paul Rohwer, Poker Opponent Modeling, CS Department Indiana University [1] (Broken link)
  • Bayes’ Bluff: Opponent Modelling in Poker [2] | PokerAI discussion
  • Bayes-Relational Learning of Opponent Models from Incomplete Information in No-Limit Poker [3] | PokerAI discussion
  • Page 4 of Monte-Carlo Search in Poker using Expected Reward Distributions [4]