With this project, my aim is to create an AI player for the cooperative card game Hanabi (Antoine Bauza, 2010). This will be a term project for the Artificial Intelligence graduate course, in the Computer Sciences department of UFPR (2019/1).

Hanabi was one of the games discussed in a paper I co-authored with my former advisor, Dr. André Battaiola: "Distinctive Features and Game Design". The game rules may be found at the publisher’s site.

Hanabi in brief

In Hanabi, there is a deck with five suits of cards. Each suit is identified by its color: red (R), green (G), blue (B), yellow (Y), white (W).

In each suit, there are 10 cards, numbered 1 to 5: three 1s, two 2s, two 3s, two 4s, one 5.
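The deck composition above can be sketched in C as follows. This is only an illustration, not the project's actual data structure: the names `Card`, `build_deck`, and `card_name` are assumptions made for this example, and the deck is built in a fixed order, without shuffling.

```c
#include <assert.h>

enum { NUM_SUITS = 5, NUM_RANKS = 5, DECK_SIZE = 50 };

/* Suit letters as used in the text: R, G, B, Y, W. */
static const char SUIT_LETTERS[NUM_SUITS] = { 'R', 'G', 'B', 'Y', 'W' };

/* Copies of each number within one suit: three 1s, two 2s, two 3s, two 4s, one 5. */
static const int COPIES_PER_RANK[NUM_RANKS] = { 3, 2, 2, 2, 1 };

/* Hypothetical card representation, assumed for this sketch. */
typedef struct {
    int suit;  /* index into SUIT_LETTERS, 0..4 */
    int rank;  /* card number, 1..5 */
} Card;

/* Fills deck[] in a fixed (unshuffled) order and returns the number of cards. */
static int build_deck(Card deck[DECK_SIZE])
{
    int n = 0;
    for (int suit = 0; suit < NUM_SUITS; suit++) {
        for (int rank = 1; rank <= NUM_RANKS; rank++) {
            for (int copy = 0; copy < COPIES_PER_RANK[rank - 1]; copy++) {
                deck[n].suit = suit;
                deck[n].rank = rank;
                n++;
            }
        }
    }
    assert(n == DECK_SIZE);  /* 5 suits x 10 cards */
    return n;
}

/* Writes the two-character name used in the text, such as "3G", into out. */
static void card_name(Card c, char out[3])
{
    out[0] = (char)('0' + c.rank);
    out[1] = SUIT_LETTERS[c.suit];
    out[2] = '\0';
}
```

Keeping the copy counts in a table makes the 3-2-2-2-1 distribution explicit, so the total of 50 cards follows directly from the rules rather than from a hard-coded list.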

During play, each player will have four or five cards in his hand. A player may not see his own cards: he holds them facing outward, so that his fellow players see the card faces while he sees only the card backs. Conversely, he can see the cards of all other players.

All players play as a team. Their common objective is to play 5 cards of each suit, in strict ascending order. Thus, the first card to be played in each suit must be a 1. For instance, if the players have already played 1G, 2G, 3G in the green suit, and 1Y, 2Y in the yellow suit, but no other cards, the next card to be played must be either 4G, or 3Y, or a 1 from any of the three remaining suits.
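The ascending-order rule reduces to a simple test if the table state is stored as the highest number played so far in each suit. The sketch below assumes that representation; the names `top` and `is_playable` are hypothetical, chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>

/* Suit indices in the order used in the text: R, G, B, Y, W. */
enum { SUIT_R, SUIT_G, SUIT_B, SUIT_Y, SUIT_W, NUM_SUITS };

/* Hypothetical card representation, assumed for this sketch. */
typedef struct {
    int suit;  /* SUIT_R .. SUIT_W */
    int rank;  /* card number, 1..5 */
} Card;

/*
 * top[s] holds the highest number already played in suit s (0 if none).
 * A card is playable exactly when it is the next number in its suit.
 */
static bool is_playable(const int top[NUM_SUITS], Card c)
{
    assert(c.suit >= 0 && c.suit < NUM_SUITS);
    return c.rank == top[c.suit] + 1;
}
```

For the example in the text (1G, 2G, 3G and 1Y, 2Y played), `top` would be `{0, 3, 0, 2, 0}`, and the function accepts 4G, 3Y, 1R, 1B, and 1W while rejecting every other card.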

When a player takes his turn, he must execute one action only, choosing between three possibilities:

  • play a card from his hand;
  • discard a card from his hand;
  • give some information to another player about the cards in that player’s hand.

After playing or discarding a card, the player must take a new card from the deck.

When giving information on another player’s cards, the information must consist of either (1) pointing out all cards in that player’s hand with the same number; or (2) pointing out all cards in that player’s hand with the same suit. There can be no other communication between players during the game.
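Since a hint always points at every card in the hand that matches the chosen number or suit, its effect can be computed mechanically. The following is a minimal sketch, assuming the hint result is returned as a bitmask over hand positions; `HintKind` and `hint_mask` are hypothetical names, not part of any existing code.

```c
#include <assert.h>

enum { MAX_HAND = 5 };  /* a hand holds four or five cards */

/* Hypothetical card representation, assumed for this sketch. */
typedef struct {
    int suit;  /* 0..4 for R, G, B, Y, W */
    int rank;  /* card number, 1..5 */
} Card;

typedef enum { HINT_RANK, HINT_SUIT } HintKind;

/*
 * Returns a bitmask with bit i set when hand[i] is pointed at by the hint.
 * A hint names one number or one suit and touches ALL matching cards.
 */
static unsigned hint_mask(const Card hand[], int hand_size,
                          HintKind kind, int value)
{
    assert(hand_size >= 0 && hand_size <= MAX_HAND);
    unsigned mask = 0;
    for (int i = 0; i < hand_size; i++) {
        int field = (kind == HINT_RANK) ? hand[i].rank : hand[i].suit;
        if (field == value)
            mask |= 1u << i;
    }
    return mask;
}
```

For a hand holding 1R, 3G, 1B, 4G, a hint about number 1 touches positions 0 and 2, and a hint about the green suit touches positions 1 and 3. A player function could use the same mask to record what each player knows about his own hand.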

The game also includes eight information tokens, kept in a common pool. When a player gives information on another player’s cards, he must spend one of the tokens, removing it from the pool; no information may be given if the pool is empty. Conversely, if a player discards a card from his hand, one of the spent tokens is put back in the pool.

If a player decides to play a card from his hand, and there is no legal place for it to be played (for instance, if a 5G or a 2B were played in the example above), the card is discarded and no information token is returned to the pool. No more than three cards may be misplayed in a game; if a fourth card is misplayed, the game ends immediately.
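The token and misplay bookkeeping described above could be adjudicated along these lines. This is a sketch under stated assumptions: the `Table` structure and the function names are hypothetical, the misplay limit follows the wording of this text (a fourth misplay ends the game), and drawing a replacement card is left out.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_SUITS = 5, MAX_INFO_TOKENS = 8, MAX_MISPLAYS = 3 };

/* Hypothetical card representation, assumed for this sketch. */
typedef struct {
    int suit;  /* 0..4 for R, G, B, Y, W */
    int rank;  /* card number, 1..5 */
} Card;

/* Hypothetical shared table state. */
typedef struct {
    int top[NUM_SUITS];  /* highest number played in each suit, 0 if none */
    int info_tokens;     /* tokens currently in the pool */
    int misplays;        /* cards misplayed so far */
    bool game_over;
} Table;

static void table_init(Table *t)
{
    for (int s = 0; s < NUM_SUITS; s++)
        t->top[s] = 0;
    t->info_tokens = MAX_INFO_TOKENS;
    t->misplays = 0;
    t->game_over = false;
}

/* Giving information costs one token; refused when the pool is empty. */
static bool spend_info_token(Table *t)
{
    if (t->info_tokens == 0)
        return false;
    t->info_tokens--;
    return true;
}

/* A discard puts one spent token back, if any token has been spent. */
static void discard_card(Table *t)
{
    if (t->info_tokens < MAX_INFO_TOKENS)
        t->info_tokens++;
}

/* Returns true on a legal play; a misplay returns no token to the pool. */
static bool play_card(Table *t, Card c)
{
    assert(c.suit >= 0 && c.suit < NUM_SUITS);
    if (c.rank == t->top[c.suit] + 1) {
        t->top[c.suit] = c.rank;
        return true;
    }
    t->misplays++;
    if (t->misplays > MAX_MISPLAYS)  /* the fourth misplay ends the game */
        t->game_over = true;
    return false;
}
```

Keeping all three actions as small functions over one state structure lets the game master reject illegal moves (such as a hint with an empty pool) in a single place.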

The game also ends immediately if all five suits have been completed.

If the last card is taken from the deck, each player will play one last time before the end of the game.

The common score is the sum of the highest card value played in each suit. Thus, the score will be a number from 0 to 25.
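If the table state records the highest number played in each suit, the score is a plain sum. A minimal sketch, with `top` and `hanabi_score` as hypothetical names:

```c
#include <assert.h>

enum { NUM_SUITS = 5, MAX_RANK = 5 };

/*
 * top[s] holds the highest number played in suit s (0 if none).
 * The common score is the sum over all suits, so it ranges from 0 to 25.
 */
static int hanabi_score(const int top[NUM_SUITS])
{
    int total = 0;
    for (int s = 0; s < NUM_SUITS; s++) {
        assert(top[s] >= 0 && top[s] <= MAX_RANK);
        total += top[s];
    }
    return total;
}
```

For the earlier example (three green cards and two yellow cards played), the function returns 5.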

Rationale: why Hanabi?

Most AI player implementations create players for competitive games, whether digital or non-digital. Even so, there are quite a few AI players in cooperative games, often as bot players filling in a team.

In this project, called HAI, the challenge is to implement a player able to play by itself, and not as a support to a human player. Hanabi can be played with two to five players, so a full team of such AI players can play an entire game.

In most cooperative games, communication between players is essential, and AI players in such games must take this into account. In Hanabi, player communication is rigidly restricted: players may not communicate with one another, except through their moves in play. As such, there is no need to introduce communication capabilities in HAI.

Furthermore, the score offers a ready-made gauge with which to measure player effectiveness.

High concept

HAI will be implemented as a C program. The main routine will be the “game master”. The AI player will be implemented as a function called by the main routine, receiving the game state as input parameters and returning its move. The main routine will adjudicate the results of the move chosen by the player function. When the game ends, the main routine will output elapsed time and game score.
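One possible shape for the interface between the game master and the player function is sketched below. Everything here is an assumption made for illustration, not the final design: the names `GameView`, `Move`, `PlayerFn`, and `baseline_player` are hypothetical, the hand size is assumed to be four cards, and the game master is assumed to blank the moving player's own hand before the call. The sample player is a deliberately naive first stage: it hints the number of the first playable card it sees in another hand, and otherwise discards its first card.

```c
#include <assert.h>

enum { NUM_SUITS = 5, NUM_PLAYERS = 4, HAND_SIZE = 4 };  /* HAND_SIZE is assumed */

typedef struct {
    int suit;  /* 0..4 for R, G, B, Y, W */
    int rank;  /* card number, 1..5 */
} Card;

typedef enum { MOVE_PLAY, MOVE_DISCARD, MOVE_HINT_RANK, MOVE_HINT_SUIT } MoveKind;

typedef struct {
    MoveKind kind;
    int card_index;  /* play/discard: position in the mover's own hand */
    int target;      /* hints: index of the player receiving the hint */
    int value;       /* hints: number 1..5 or suit 0..4 */
} Move;

/* Game state as seen by one player (hypothetical layout). */
typedef struct {
    int self;                            /* index of the player to move */
    int top[NUM_SUITS];                  /* highest number played per suit */
    int info_tokens;                     /* tokens left in the pool */
    int misplays;                        /* cards misplayed so far */
    int hand_size[NUM_PLAYERS];
    Card hands[NUM_PLAYERS][HAND_SIZE];  /* hands[self] is assumed blanked */
} GameView;

/* The game master would call a function of this type once per turn. */
typedef Move (*PlayerFn)(const GameView *view);

/* Naive first-stage player: hint a playable card if possible, else discard. */
static Move baseline_player(const GameView *v)
{
    assert(v->self >= 0 && v->self < NUM_PLAYERS);
    Move m = { MOVE_DISCARD, 0, 0, 0 };
    if (v->info_tokens == 0)
        return m;  /* no hint may be given with an empty pool */
    for (int p = 0; p < NUM_PLAYERS; p++) {
        if (p == v->self)
            continue;  /* own cards are not visible */
        for (int i = 0; i < v->hand_size[p]; i++) {
            Card c = v->hands[p][i];
            if (c.rank == v->top[c.suit] + 1) {
                m.kind = MOVE_HINT_RANK;
                m.target = p;
                m.value = c.rank;
                return m;
            }
        }
    }
    return m;
}
```

Passing the player as a function pointer would let the main routine swap in the player function of each development stage without changing the adjudication code.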

Player function development will happen in stages; in each stage the player function will (hopefully) have a better AI algorithm than the previous one. For each stage, a series of games will be recorded, in order to compare performance from one stage to another, both in elapsed time and in score.

The number of players will be fixed at four; future enhancements may allow for two, three, or five players.