.. raw:: html

A Python module of imperfect information games and learning algorithms.

.. raw:: html

About • Installation • Background • Features • Getting Started • Documentation • References • License

About ----- Imperfecto is a Python module of heavily commented implementations of algorithms and games with imperfect information such as Rock-Paper-Scissor and Poker. See `Features <#features>`_ for a list of games and algorithms we provide. See `Installation <#installation>`_ for instruction on how to install this module, and `Getting Started <#getting-started>`_ on usage and how to customize our module for your own game. For a complete documentation, see `Documentation <#documentation>`_. Installation ------------ Clone this repo and install it as a module. .. code-block:: $ git clone https://github.com/vlongle/Imperfecto.git $ cd imperfecto $ pip3 install -e . Try to import the module to check if the installation has been successful. .. code-block:: $ python3 >> import imperfecto Background ---------- Imperfect Information Games ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Games such as Chess are known as "perfect information" games as there is no private information to each player. Every player can observe each other's moves, and the game state (i.e., the chess board). However, games such as Poker are "imperfect-information" as each player has their own private card. Games with simultaneously moves such as rock-paper-scissor can also be modeled as imperfect information as each player doesn't know the opponent's move ahead of time. Games with imperfect information are typically modeled as `extensive-form game (EFG) `_. EFG models sequential games with a game tree. Players take turn making moves until a termination node is reached, where the payoff is revealed. Players make decision at each decision node. However, the players do not know exactly which node they are in. Instead, they only know that they're in a set of nodes known as an information set. An information set represents all the possible worlds that are consistent with what the player knows. Features -------- .. list-table:: :header-rows: 1 * - Games - Progress * - Rock-paper-scissor - ✔️ * - Asymmetric Rock-paper-scissor - ✔️ * - Bar-crowding Game - ✔️ * - Prisoner's Dilemma - ✔️ * - Kuhn Poker - ✔️ * - Leduc Poker - * - Texas Hold' Em .. list-table:: :header-rows: 1 * - Algorithm - Progress * - Regret Matching - ✔️ * - Counterfactual Regret Minimization (CFR) - ✔️ * - Monte Carlo CFR - * - Deep CFR - * - Bandit - * - Opponent modeling Getting Started --------------- Look at all the demos that we have in ``imperfecto/demos`` folder, and try any of them. For example, run (from the repo's root folder) .. code-block:: $ python3 imperfecto/demos/regret_matching_demo.py --help to print out the available options for that demo file. Pick your desired options and run a demo file. For example, .. code-block:: $ python3 imperfecto/demos/regret_matching_demo.py --game PrisonerDilemmaGame --train_regret_matching Writing Your Own Games! ^^^^^^^^^^^^^^^^^^^^^^^ Normal-form Games ~~~~~~~~~~~~~~~~~ Add a new game by writing an ``ACTION_ENUM`` class that defines the possible action that is available to each player. Then, write a Game class that inherits from our ``NormalFormGame``. You will need to implement the ``get_payoffs`` function. .. code-block:: from enum import intenum from typing import sequence from imperfecto.games.game import normalformgame from imperfecto.utils import lessverboseenum class MY_CUSTOM_ACTION(lessVerboseEnum, IntEnum): ... class MyCustomNormalFormGame(NormalFormGame): actions = MY_CUSTOM_ACTION n_players = ... def get_payoffs(self, history: Sequence[MY_CUSTOM_ACTION]) -> Sequence[float]: ... See ``imperfecto/games/prisoner_dilemma.py`` for an example. You can then use your custom game with the functions available in ``imperfecto/demos/regret_matching_demo.py``. For example, .. code-block:: ... from imperfecto.demos import to_train_regret_matching Game = MyCustomNormalFormGame to_train_regret_matching(Game) TODOs ----- * Implement other variations of counterfactual regret stuff like Monte Carlo, deep counterfactual * Implement some cool visualization of training like this (https://medium.com/hackernoon/artificial-intelligence-poker-and-regret-part-2-ee2e329d6571). * Implement bandit stuff (Exp3 for example, UCB, beta Thompson sampling stuff https://www.youtube.com/watch?v=XxTgX8FlDlI, https://www.cs.ubc.ca/labs/lci/mlrg/slides/Introduction_to_Bandits.pdf). A whole book on bandit lol (https://tor-lattimore.com/downloads/book/book.pdf) * Consider opponent modeling as well (https://abhinavrobinson.medium.com/ai-wins-rock-paper-scissors-mostly-65e0542f945b this is a nice summary, and the CMU slides somewhere as well... ( http://www.cs.cmu.edu/afs/cs/academic/class/15780-s13/www/lec/ganzfried_gt.pdf)) * Maybe also implement some deep RL methods like Q-learning, actor-critic for these games. * More game: Leduc Poker, Texas Hold Em. Documentation ------------- Please see `https://vlongle.github.io/Imperfecto/ `_ for full documentation. References ---------- * `An Introduction to Counterfactual Regret Minimization (Neller \& Lanctot) `_ * `Steps to building a Poker AI (Thomas Trenner) `_ * ` `_ License ------- MIT License @ Long Le 2022.