G5AIAI - Introduction to Artificial Intelligence

This course is run at The University of Nottingham within the School of Computer Science & IT. The course is taught by Graham Kendall (EMAIL: gxk@cs.nott.ac.uk)


Game Playing

Introduction

In this part of the course we start to look at game playing, in particular the problems of searching when we are playing games.
The search problems we have looked at up until now assume that the situation is not going to change (for example, if we are searching for a route between two towns, the towns do not move during the search). In game playing this is not the case. If you make a move in (say) chess then your opponent is going to move next, so you cannot be certain what the board state will be after your opponent's move.

This area of computer science has been studied for many years by AI researchers. Even Charles Babbage thought about using his Analytical Engine to play tic-tac-toe, and Alan Turing devised a chess playing program that was never implemented (see [BOW53] for both).

These notes are largely based on the course text (Russell & Norvig, 1995) but, if you are interested in this subject, there is a lot of literature available.

In addition, you should take a look at the reading list for this part of the course.

Brief History of Game Playing

Game playing has a long history within AI research. Chess has been the holy grail of game playing programs and, as a consequence, has received particular interest, culminating in Deep Blue (see [IBMRes]) beating Kasparov in 1997, albeit with specialised hardware [HAM97] and brute force search rather than AI techniques. But even as long ago as 1950 chess was of interest to computer scientists.

Chess

On March 9th 1949, Claude Shannon (Shannon, 1950) delivered a paper at a New York conference. In it he presented his ideas as to how a computer could play chess. He spoke about the size of the search space and how deep the computer would have to look to find the next move. Chess has 10^120 unique games (with an average of 40 moves - the average length of a master game).
Working at 200 million positions per second, Deep Blue would require 10^100 years to evaluate all possible games. To put this in some sort of perspective, the universe is only about 10^10 years old and 10^120 is larger than the number of atoms in the universe.
Shannon also noted that analysing to a depth of 40 moves at a rate of one game per microsecond would take a computer 10^90 years to make its first move.
In 1957 artificial intelligence pioneers Herbert Simon and Allen Newell predicted that a computer would beat a human at chess within 10 years.
Simon admits "I was a little too far-sighted with chess, but there was no way to do it with machines that were as slow as the ones back then." In 1958, the first computer able to play chess was an IBM 704, with about one-millionth of Deep Blue's capacity.

Despite the problems of getting computers to play chess there have been steady improvements. In 1967, a program called Mac Hack started competing successfully in human tournaments. In 1983, a program called Belle attained "expert" status from the United States Chess Federation.
In the mid-1980s, scientists at Carnegie Mellon University started the work that was to become the Deep Blue project. They used a Sun workstation that could examine 50,000 positions per second. The project moved to IBM in 1989.
On May 11, 1997, Kasparov lost a six-game match against Deep Blue with a score of 3.5-2.5: two wins for Deep Blue, one for Kasparov and three draws. Many people see this date as the day that a computer won the world chess championship.

Chess still receives a lot of research interest as scientists turn to learning techniques that allow a computer to 'learn' to play chess, rather than being 'told' how it should play [KEN01a]. Kendall and Whitwell developed a learning algorithm that allows the computer to tune its own evaluation function rather than being programmed with a fixed strategy. There is still a lot more work to do in this area but the initial results are encouraging. This paper is available from my web site (if you are interested) as a pdf (265kb) file.

 

Checkers

Learning techniques were being used for checkers (draughts) as far back as the 1950s, with Samuel's seminal work ([SAM59], reproduced in [SAM63] and [SAM00]). Arthur Samuel wrote the first checkers program in 1952. The original program was written for an IBM 701 computer. In 1954 he re-wrote the program for an IBM 704 and added a learning mechanism. What makes this program stand out in AI history is that it was able to learn its own evaluation function. Taking into account that the IBM 704 had only 10,000 words of main memory, magnetic tape for long-term storage and a cycle time of almost one millisecond, this can be seen as a major achievement in the development of AI.
Samuel made the program play against itself and after only a few days play, the program was able to beat its creator and compete on equal terms with strong human opponents.
It remains a testament to Samuel that there was little more work done on checkers until Jonathan Schaeffer et al. developed Chinook ([SCH96] or [SCH97]).

Jonathan Schaeffer's work led to the development of Chinook. In 1992 Chinook won the US Open and subsequently challenged for the world championship. Dr. Marion Tinsley had been the world champion for over 40 years. In that time he had only lost three games. Playing Chinook, he lost his fourth and fifth games but ultimately won the match by 21.5 points to Chinook's 18.5 points. In August 1994 there was a re-match but the match ended prematurely when Dr. Tinsley had to withdraw for health reasons. As a result, Chinook became the official world champion. Schaeffer (1996, p.447) claimed that Chinook was rated at 2814. The best human players are rated at 2632 and 2625.

Despite Chinook becoming the world champion, the search has continued for a checkers player that is built using "true" AI techniques. Chellapilla and Fogel ([CHE99], [CHE00], [FOG01]) have developed Anaconda, so named due to the stranglehold it places on its opponents. It is also called Blondie24 [FOG01], which is the name it used when playing on the internet. This name was chosen, successfully, to attract players on the assumption that they were playing against a blonde 24-year-old female. Anaconda (or Blondie24) uses an artificial neural network (ANN), with 5000 weights, which are evolved by an evolutionary strategy. The inputs to the ANN are the current board position and it outputs a value which is used in a minimax search. During the training period the program is given no information other than whether it won or lost (it is not even told by how much). Anaconda is certainly not given any strategy and contains no database of opening and end-game positions. Co-evolution is used to develop Anaconda, by playing games against itself. Once Anaconda is able to play at a suitable level, it often searches to a depth of 10, but depths of 6 and 8 are also common in play. This program has been available to the delegates at the Congress on Evolutionary Computation (CEC) conference for the past two years (CEC'00, San Diego and CEC'01, Seoul), with Fogel offering a prize of $100 (CEC'00) and $200 (CEC'01) to anybody who could defeat it. The prize remains unclaimed and at the next conference (CEC'02, Hawaii) the prize rises to $300.
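
To give a rough idea of the architecture (the network size, board encoding and activation function below are my own illustrative choices, not the details of Anaconda itself), a board-evaluation network and an evolutionary mutation step might look something like this in Python:

    import math
    import random

    # A toy, fully connected network with one hidden layer, standing in for the
    # much larger network described above. The board is a vector of 32 numbers
    # (one per playable square), positive for our pieces and negative for the
    # opponent's; the single output is a score that would be fed into a minimax
    # search. All sizes and details here are illustrative only.
    INPUTS, HIDDEN = 32, 10

    def random_weights():
        return {"w1": [[random.uniform(-0.2, 0.2) for _ in range(INPUTS)]
                       for _ in range(HIDDEN)],
                "w2": [random.uniform(-0.2, 0.2) for _ in range(HIDDEN)]}

    def evaluate(board, weights):
        # Feed-forward pass: hidden layer then a single output, both with tanh.
        hidden = [math.tanh(sum(w * x for w, x in zip(row, board)))
                  for row in weights["w1"]]
        return math.tanh(sum(w * h for w, h in zip(weights["w2"], hidden)))

    def mutate(weights, sigma=0.05):
        # Evolutionary step: perturb every weight with Gaussian noise. In a
        # co-evolutionary setting, a population of such networks would play
        # games against each other and the better scorers would survive.
        return {"w1": [[w + random.gauss(0, sigma) for w in row]
                       for row in weights["w1"]],
                "w2": [w + random.gauss(0, sigma) for w in weights["w2"]]}

    board = [0.0] * INPUTS
    board[0], board[31] = 1.0, -1.0          # one piece each, just for the demo
    net = random_weights()
    print(evaluate(board, net))
    print(evaluate(board, mutate(net)))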

 

Poker

Poker has an equally long research history, with von Neumann and Morgenstern [VON44] experimenting with a simplified, two-player version of poker in their seminal work on economics. They recognised that accomplished poker players regularly adopt bluffing as part of their game, which would have to be catered for in any automated poker player.

Findler [FIN77] studied automated poker over a 20 year period. He also worked on a simplified game, based on 5-card draw poker with no ante and no consideration of betting position (the computer always played last). He concluded that dynamic and adaptive algorithms are required for successful play, and that static mathematical models were unsuccessful and easily beaten. Other than von Neumann's and Findler's work, research into poker playing has been limited. There have been three main research groups working on the subject, all relatively recently.

One of the groups is headed by Jonathan Schaeffer (of Chinook fame). He and members of his research team have developed ideas which have led to the release of Loki, which is, arguably, the strongest automated poker playing program to date. It is still a long way from being able to compete in the World Series of Poker, an annual event held in Las Vegas, but initial results are promising. Schaeffer's work concentrates on two main areas [BIL98a], [SCH99]. The first is making betting decisions using probabilistic knowledge [BIL99] to determine what action to take (fold, call or raise) given the current game state. Billings et al. also use real-time simulation of the remainder of the game, which allows the program to reach a statistically significant result to support its decision making. Schaeffer's group is also looking at opponent modelling [BIL98b]. This allows Loki to maintain a model of each opponent that it uses to decide what betting decisions to make, on the assumption that players adopt differing styles which any automated player has to be able to exploit.

The Gala system developed by Koller and Pfeffer [KOL97] allows games of imperfect information to be specified and solved, using a tree based approach. However, due to the size of the trees they state "…we are nowhere close to being able to solve huge games such as full-scale poker, and it is unlikely that we will ever be able to do so."

Luigi Barone and Lyndon While have also recently carried out research into the automation of poker [BAR98], [BAR99], [BAR00]. They recognise that there are four main types of poker player: Loose Passive, Loose Aggressive, Tight Passive and Tight Aggressive players ([BAR99], [BAR00]). In their first paper [BAR98] they suggest using evolutionary strategies as a way of modelling an adaptive poker player. They use a simple poker variant where each player has two private cards, access to five community cards and there is only one round of betting. This initial work incorporates three main areas of analysis: hand strength, betting position and risk management. The work demonstrates how a player evolved using evolutionary strategies can adapt its style to two types of table (loose and tight). In [BAR99] they develop their 1998 work by introducing a hypercube, an n-dimensional vector used to store candidate solutions. The hypercube has one dimension for the betting position (early, middle and late) and another for risk management (selected from the interval 0..3). At each stage of the game the relevant candidate solution is selected from the hypercube (e.g. middle betting position and risk management 2) and the decision is made whether to fold, call or raise. To make this decision the hypercube entry holds seven real numbers which are used, via suitably defined functions, to produce probabilities for each choice. It is these seven real values that are evolved, depending on whether the player won the hand or not. Barone reports that this poker player improves on the 1998 version and is able to adapt to different playing styles. Their 2000 work [BAR00] extends the dimensions of the hypercube to include the four betting rounds (pre-flop, post-flop, post-turn and post-river) and an opponent dimension, so that the evolved player can choose from the hypercube according to the type of player it is up against. The authors report that this player outperforms a competent static player.
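
To give a flavour of the hypercube idea (the values and the mapping from stored numbers to probabilities below are my own guesses for illustration; the papers' actual functions are not reproduced here), the structure can be thought of as a table indexed by game attributes, each cell holding a small vector of evolved numbers that is turned into fold/call/raise probabilities:

    import math
    import random

    ACTIONS = ("fold", "call", "raise")

    # An illustrative "hypercube": one cell per (betting position, risk level),
    # each cell holding seven evolved real numbers as in the 1999 paper. The
    # initial values, and how they are mapped to probabilities, are guesses.
    positions = ("early", "middle", "late")
    risks = range(4)
    hypercube = {(p, r): [random.uniform(-1.0, 1.0) for _ in range(7)]
                 for p in positions for r in risks}

    def action_probabilities(cell):
        # Map the first three stored numbers to a fold/call/raise distribution
        # via a softmax (purely an illustrative "suitably defined function").
        exps = [math.exp(x) for x in cell[:3]]
        total = sum(exps)
        return dict(zip(ACTIONS, (e / total for e in exps)))

    def choose_action(betting_position, risk):
        probs = action_probabilities(hypercube[(betting_position, risk)])
        return random.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]

    print(choose_action("middle", 2))   # e.g. 'call'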

Poker is interesting, from an artificial intelligence research point of view, as it is a game of imperfect information. Unlike chess and checkers, games such as poker have some information that is unseen. Poker also contains other unknowns, such as the playing styles of the other players, who may use bluffing (and double bluffing) during the course of the game. Unlike complete information games, where the techniques to solve the games (computational power allowing), such as minimax search and alpha-beta pruning, have been known and understood for a long time, games of imperfect information have not received the same sort of analysis; doing so could prove relevant to many other areas such as economics, on-line auctions and negotiation.

Some work at Nottingham [KEN01b] has recently started to look at poker. In fact, this publication was the result of a third year undergraduate project that was written up and submitted to a conference. The paper considers how a computer can learn how to play poker, rather than being told how to play. If you are interested, the paper is available from my web site in both word (227kb) and pdf (52kb) formats.

 

The Prisoner's Dilemma


It is interesting to consider game "theory" and game "playing" within the context of this part of the course. Game theory was developed by von Neumann and Morgenstern [VON44] in 1944 as a tool to model and study the economic environment. Even in this early work, they considered a simplified version of poker as a means to test their methods. The prisoner's dilemma (PD) and iterated prisoner's dilemma (IPD) have also received much attention within the context of game theory. There is much literature to draw upon in this area, with Axelrod ([AXE84], [AXE87]) carrying out the most famous study, which revealed that a tit-for-tat strategy beat all other strategies entered in his tournaments. However, despite the large literature base, this is an on-going area of research, with Paul Darwen and Xin Yao ([DAR95], [DAR01]) carrying out some recent work. In particular, their 2001 work [DAR01] extends the prisoner's dilemma by offering more choices than simply "cooperate" or "defect." They show that allowing more choice decreases the cooperation between the players. This is an area worthy of future research as it models some real world problems better than the binary decisions that exist within the original IPD.
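
As a quick illustration of the iterated prisoner's dilemma itself, here is a minimal Python sketch using the standard payoff values (5 for defecting against a cooperator, who gets 0; 3 each for mutual cooperation; 1 each for mutual defection), with tit-for-tat playing against a strategy that always defects:

    # Payoffs (row player, column player) for one round of the prisoner's
    # dilemma, using the standard values T=5, R=3, P=1, S=0.
    PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def tit_for_tat(my_history, their_history):
        # Cooperate on the first round, then copy the opponent's last move.
        return "C" if not their_history else their_history[-1]

    def always_defect(my_history, their_history):
        return "D"

    def play(strategy_a, strategy_b, rounds=10):
        history_a, history_b = [], []
        score_a = score_b = 0
        for _ in range(rounds):
            move_a = strategy_a(history_a, history_b)
            move_b = strategy_b(history_b, history_a)
            pay_a, pay_b = PAYOFF[(move_a, move_b)]
            score_a, score_b = score_a + pay_a, score_b + pay_b
            history_a.append(move_a)
            history_b.append(move_b)
        return score_a, score_b

    # -> (9, 14): tit-for-tat is exploited once, then both defect every round.
    print(play(tit_for_tat, always_defect))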

If you are interested in the Prisoner's Dilemma you might like to read [POU92]. This discusses the problem and also relates it to the real world (such as the cold war). In fact, it is really the other way round: real-life prisoner's dilemmas (such as the cold war) motivated the development of the computer models and experiments.

 

Minimax

As I said above, the problem with playing games is that there is another agent (he/she/it) involved that is trying to thwart your every move. Your objectives are in direct opposition.
In 1944, John von Neumann co-authored a book [VON44] which outlined a search method called minimax, so called because it tries to maximise your position whilst minimising your opponent's.
In order to implement minimax you need some way of measuring how good (or bad) your position is. This is often called a utility function.

Take a look at this game tree

[Figure: example game tree, with moves for the computer (MAX) and the opponent (MIN) and values at the terminal states]

This is a "made up" game, but we can make some observations about it.
Firstly, let's assume that we have generated the full game tree so, starting from the initial state, this represents the entire search space for the game. Of course, this is not a serious game but anything larger and we would have trouble drawing the entire search tree (even for something like tic-tac-toe).
The game starts with the computer making a move from the initial state, A, which allows the game to move to state B or C. The opponent is now allowed to move, and could move to state D, E, F or G. Finally, the computer moves again and we reach the eight terminal states shown.
Once we have expanded all the search tree we can decide who has won the game and we can assign each terminal state a value showing which player won.
We will assume that a negative value represents a win for the opponent and a positive value represents a win for the computer (we could, I suppose, have a zero value for a draw, but we have not bothered in this tree).

Once we reach this position we know, with absolute certainty, who will win the game if we follow a particular branch through the search tree. The aim for the computer is to decide at any point which branch is the correct one to take. That is, we want to force the opponent into taking moves that will ultimately lead the computer to win. And this is what minimax allows us to do.
Once we have generated the complete search tree we can work back up the tree, assigning values to nodes based on the player whose turn it is to move. The computer will be trying to maximise the value at any given node and (we assume) the opponent will be trying to minimise the value at any given node. If we assume the worst and give our opponent credit for what he/she/it will do, then we can assume they will always play a perfect game and minimise the value at any given choice point.

If you look at the search tree you can see that the values have been propagated back up through the tree based on whose turn it is to play and whether they are trying to maximise or minimise at that point.

Once we have created the complete game tree, the computer player is now able to play the perfect game. At each node it just moves to the next node that has the highest value. If you follow the game through you will find that the computer will win as it will start with move B. We assume the opponent will move to E (i.e. it is trying to minimise) and we will take the move that gives us a victory with a value of 1.
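
If you would like to see this back-up process in code, here is a minimal Python sketch. It assumes the game tree is represented as nested lists, where a leaf is just its terminal value and an internal node is a list of its children (the tree below is made up for the purpose of the example, not the one in the figure):

    def minimax_value(node, maximising):
        # Propagate terminal values back up a complete game tree. A leaf is a
        # number (positive = win for the computer, negative = win for the
        # opponent); an internal node is a list of its child nodes.
        if not isinstance(node, list):
            return node                                   # terminal state
        child_values = [minimax_value(child, not maximising) for child in node]
        return max(child_values) if maximising else min(child_values)

    # A small, made-up tree: the computer (MAX) moves first, then the opponent
    # (MIN), then the computer again, giving eight terminal states.
    tree = [[[1, -2], [3, -1]], [[-4, 1], [2, -3]]]
    print(minimax_value(tree, maximising=True))   # -> 1, a win for the computer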

Of course, there is a flaw in this argument as the computer gets two moves to the opponent's one. This is not really fair, is it? But, without drawing an even bigger game tree, we can't show a complete game and, as we shall see in a moment, this is irrelevant anyway.

It is also worth mentioning at this point that, in the context of game search trees, a move by a single player is usually referred to as a ply. One complete round, that is a move by each player, is therefore two ply. The search tree above, with three levels of moves, is three ply (one and a half complete rounds).

So, let's look at a larger example (taken from Luger, 1998). The game we are going to consider is called nim. We start with a pile of tokens. At each move the player must divide a pile of tokens into two non-empty, non-equal piles. So, a pile of seven tokens can be divided into piles of 6-1, 5-2 or 4-3, but a pile of six tokens could not be divided into 3-3. A player who can no longer make a move loses the game.
The complete search tree for this game can be shown as follows.

[Figure: the complete search tree for nim starting from a single pile of seven tokens]

To make things (appear!) fair, we will let MIN go first. At every leaf node we assign a value of one if the play has resulted in a win for MAX, and zero if it has resulted in a win for MIN (the various turns are shown on the left). These values are then propagated up the tree in the same way we did before.
It can be seen that, if we let MIN go first, we can guarantee that we (MAX) will win (of course, if we made ourselves act first then MIN could guarantee a win for itself). This is obvious from the search tree, as at level 0 (the top of the tree) the minimum value is 1. That is, MIN cannot minimise the value to zero, which shows it is ultimately going to lose (assuming we play correctly!).
Another way to convince yourself that MAX is guaranteed to win is to follow the bold arrows. These are the search paths that lead to MAX winning. You can see, on each of these paths, that when it is MIN's turn to move it cannot move to a position where the state is labelled with a zero.
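
If you want to check this result for yourself, here is a small Python sketch of the game. The tuple-of-piles representation is my own choice, and I assume (consistent with the tree above) that a player who cannot make a move loses:

    from functools import lru_cache

    def splits(pile):
        # All ways of dividing a pile into two non-empty, unequal piles.
        return [(a, pile - a) for a in range(1, (pile + 1) // 2)]

    def moves(state):
        # Successor states: choose one pile and split it.
        successors = []
        piles = list(state)
        for i, p in enumerate(piles):
            for a, b in splits(p):
                successors.append(tuple(sorted(piles[:i] + piles[i + 1:] + [a, b])))
        return successors

    @lru_cache(maxsize=None)
    def value(state, max_to_move):
        # Minimax value of a state: 1 if MAX wins with perfect play, 0 if MIN
        # wins. A player who cannot make a move loses.
        successors = moves(state)
        if not successors:
            return 0 if max_to_move else 1
        child_values = [value(s, not max_to_move) for s in successors]
        return max(child_values) if max_to_move else min(child_values)

    print(value((7,), max_to_move=False))   # -> 1: MAX wins when MIN moves first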


The minimax procedure outlined above is not practical for games of any size, as we cannot generate the entire game tree in order to find the terminal values and propagate them back up through the tree.
Therefore, we have to be able to decide a value for each node that reflects our chances of winning from that point. The only way to do this is to use a heuristic evaluation. For our purposes, we will assume that we have a heuristic function that returns a single value reflecting the probability of winning from the current position. Note that this is different from the heuristic values we have looked at previously. In search problems where we are searching for a goal, the heuristic value represents an estimate of how close we are to the goal. In game playing we are estimating the chances of winning from a given position. It is not literally a probability of winning, but it can be thought of in that way.

 

Digression

Of course, we have the problem of trying to work out a suitable evaluation function. A common method is to capture all the features that we think might be important and then use a linear function to combine them, using a different weight for each feature. We still have the problem of deciding what the weights should be. An obvious method is to get the program to play against itself and adjust the weights based on the outcome of the game. This weight adjustment can be done in a variety of ways. In the Chellapilla and Fogel paper [CHE00] mentioned above, a neural network was used to play checkers; the weights were stored in the synapses and were adjusted using a Gaussian distribution. Barone and While [BAR00] used a similar (Gaussian) method to adjust the weights in a program that learnt how to play poker.
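
As a small illustration (the features, piece values and step size below are my own choices, not those used in any of the papers mentioned), a linear evaluation function and a Gaussian weight-adjustment step might look like this:

    import random

    # Toy position: piece counts for the computer ("us") and the opponent
    # ("them"). The features are simply material differences per piece type.
    PIECES = ("pawn", "knight", "bishop", "rook", "queen")

    def features(position):
        return [position["us"][p] - position["them"][p] for p in PIECES]

    def evaluate(position, weights):
        # Linear evaluation function: a weighted sum of the features.
        return sum(w * f for w, f in zip(weights, features(position)))

    def mutate(weights, sigma=0.1):
        # One self-play tuning step: perturb every weight with Gaussian noise
        # (a simple evolutionary strategy; sigma is an assumed step size).
        return [w + random.gauss(0.0, sigma) for w in weights]

    position = {"us":   {"pawn": 8, "knight": 2, "bishop": 2, "rook": 2, "queen": 1},
                "them": {"pawn": 7, "knight": 2, "bishop": 1, "rook": 2, "queen": 1}}
    weights = [1.0, 3.0, 3.0, 5.0, 9.0]
    print(evaluate(position, weights))   # -> 4.0 (one pawn and one bishop up)
    print(mutate(weights))               # a slightly perturbed set of weights
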
Work at The University of Nottingham [KEN01a] has developed a chess playing program that adjusts the weights using a Gaussian distribution, with the standard deviation parameter set using feedback on how far apart the population of solutions were from one another. This work was a third year project and a paper has now been submitted to an international conference (CEC 2001, Congress on Evolutionary Computation, COEX Center, Seoul, Korea, May 27-29, 2001).

End of Digression

 

Fixed Depth Minimax

If we have an evaluation function then we can apply a minimax procedure using a fixed depth. This means we expand the tree to a given depth (which is normally based on time available). Once we have the tree, we evaluate the positions, using the evaluation function, and then propagate those values back up the tree in the usual way. Once our opponent makes their move, the search tree is generated again in order to decide which move to make next.
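
A minimal Python sketch of fixed depth minimax might look like this (the successor and evaluation functions are placeholders to be supplied for a particular game, not a fixed API):

    def minimax(state, depth, maximising, successors, evaluate):
        # Depth-limited minimax: search `depth` plies, then fall back on the
        # heuristic evaluation function. `successors` and `evaluate` are
        # supplied by the caller.
        children = successors(state)
        if depth == 0 or not children:
            return evaluate(state)
        values = [minimax(c, depth - 1, not maximising, successors, evaluate)
                  for c in children]
        return max(values) if maximising else min(values)

    # Tiny demonstration: states are integers, each state below 8 has two
    # successors, and the "evaluation" is simply the state value itself.
    succ = lambda s: [2 * s, 2 * s + 1] if s < 8 else []
    print(minimax(1, 3, True, succ, evaluate=lambda s: s))   # -> 13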

One of the problems with this limited look-ahead is known as the horizon problem. An example ([LUG98, p149] and [LUG02, p148]) probably makes this easier to explain. Assume that you are playing chess and your evaluation function is based on the number and values of the pieces you have on the board. Your look-ahead may give you an opportunity to take one of your opponent's rooks, and this gives a good evaluation. However, if you had looked ahead another n ply you would have seen that your opponent was sacrificing their rook so that they could capture your queen.
One solution to this problem is to further expand promising nodes so that you can look over the horizon. However, this means you cannot explore other paths as thoroughly, and you still do not get rid of the horizon problem, as there will always (well, nearly always) be something lurking just beyond wherever you stop searching.

Another problem with a fixed depth look-ahead is that only quiescent positions should be evaluated (that is, positions which do not exhibit wild swings in the evaluation function from one move to the next). To take the same example as above, if our evaluation function simply considers material advantage in chess, then positions in which favourable captures are available are not quiescent. In these circumstances we can continue to look ahead until quiescence is reached. This is often referred to as a quiescence search.
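
A simplified sketch of a quiescence search, again with placeholder helper functions to be supplied by the caller, might look like this:

    def quiescence(state, maximising, evaluate, noisy_successors):
        # Rather than evaluating a position that still has "noisy" moves
        # available (e.g. captures), keep searching those moves until a quiet
        # position is reached. `noisy_successors(state)` returns only the
        # states reachable by such moves; it, and `evaluate`, are hypothetical
        # helpers. The player may also "stand pat" and accept the static value.
        best = evaluate(state)                      # the stand-pat value
        for child in noisy_successors(state):
            value = quiescence(child, not maximising, evaluate, noisy_successors)
            best = max(best, value) if maximising else min(best, value)
        return best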

 

Alpha-Beta Pruning

The problem with minimax is that we expand every node down to a certain depth even though, in certain cases, we are wasting our time. To combat this, a procedure known as alpha-beta pruning has been developed (it is called this for historical reasons). It was probably invented by John McCarthy and further details can be found in [PEA84].
Alpha-beta pruning uses a depth first search. In doing so it maintains two variables, alpha and beta. Alpha, which is associated with MAX, can never decrease, and beta, which is associated with MIN, can never increase.

Take a look at this search tree.


[Figure: search tree illustrating alpha-beta pruning]

Assume we have evaluated, using depth first search, down to node I. This means we can maximise node D to a value of 6. If we now continue the search we will eventually evaluate node J and assign it a value of 8. At this point we do not need to evaluate K (or any of its siblings), for the following reason: the node above J is a maximising node, so once J has been seen its value will be at least 8; but the minimising player above that can already guarantee a value of 6 by choosing D, and so will never allow play to reach a node worth 8 or more. Evaluating K therefore cannot change the outcome.

Similarly, we can cut off the search from node G downwards.
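
Putting this together, here is a minimal Python sketch of depth-first minimax with alpha-beta pruning, using the same nested-list tree representation as in the earlier sketch (the leaf values are made up so that cut-offs of the kind described above occur):

    def alphabeta(node, maximising, alpha=float("-inf"), beta=float("inf")):
        # A leaf is a number, an internal node is a list of child nodes.
        # alpha (the best value MAX can already guarantee) never decreases;
        # beta (the best value MIN can already guarantee) never increases.
        # When alpha >= beta the remaining children cannot affect the result.
        if not isinstance(node, list):
            return node
        if maximising:
            best = float("-inf")
            for child in node:
                best = max(best, alphabeta(child, False, alpha, beta))
                alpha = max(alpha, best)
                if alpha >= beta:       # cut off: MIN will never allow this branch
                    break
            return best
        else:
            best = float("inf")
            for child in node:
                best = min(best, alphabeta(child, True, alpha, beta))
                beta = min(beta, best)
                if alpha >= beta:       # cut off: MAX will never allow this branch
                    break
            return best

    # Illustrative values: the left subtree is maximised to 6 and, as soon as
    # one of its sibling's children evaluates to 8, the remaining child (a leaf
    # worth 5) is pruned, just as K was pruned above.
    tree = [[[5, 6], [8, 5]], [[2, 1], [3, 0]]]
    print(alphabeta(tree, maximising=True))       # -> 6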

The improvement we can get with alpha-beta searching depends on the problem being considered and the order in which the nodes are expanded. At worst, it will perform the same as minimax (i.e. no pruning will be possible). At best, alpha-beta pruning allows us to search roughly twice as deep in the same time, since it reduces the effective branching factor to about the square root of its original value.

 

Other Resources

One of the world's leading game playing research groups (The University of Alberta GAMES Group) is headed by Jonathan Schaeffer (of Chinook fame). From this page there are links to Checkers, Chess, Poker, Awari (which was the subject of the G5BAIM coursework in 2000/01) and many other games.

From these pages, you can find many papers that you might like to download. In particular you might like to take a look at [SCH00]. This is a recent paper by Jonathan Schaeffer that looks at various game playing techniques (such as mini-max search and alpha-beta pruning) and then goes on to talk about the state of the art of various types of games. The paper is available from here.

 

Exam Questions

I have supplied a complete set of old exam papers so that you can see the type of questions I have set in the past. The questions also contain answers (though not model answers), including to those above.

I suggest that you try question 1 from the 2000/01 G5BAIM (Artificial Intelligence Methods) exam before taking the self assessment part of this course, as you may be asked questions relating to this exam question. If you download the exam paper, the answers are supplied as well. It's obviously better if you can try the exam question without looking at the answers, but that is up to you and your conscience!

 

References

  1. [AXE84] Axelrod R. M. The Evolution of Cooperation. BASIC Books, New York, 1984.
  2. [AXE87] Axelrod R. M. The Evolution of Strategies in the Iterated Prisoner's Dilemma. In Genetic Algorithms and Simulated Annealing, ch. 3, pp 32-41. Morgan Kaufmann, 1987.
  3. [BAR98] Barone L. and While L. Evolving Adaptive Play for Simplified Poker. In proceedings of IEE International Conference on Computational Intelligence (ICEC-98), pp 108-113, 1998.
  4. [BAR99] Barone, L. and While, L. An Adaptive Learning Model for Simplified Poker Using Evolutionary Algorithms. In proceedings of the Congress on Evolutionary Computation (CEC-99), pp 153-160, 1999.
  5. [BAR00] Barone, L. and While, L. Adaptive Learning for Poker. In proceedings of the Genetic and Evolutionary Computation Conference, pp 566-573, 2000.
  6. [BIL98a] Billings D., Papp D., Schaeffer J. and Szafron D. "Poker as a Testbed for Machine Intelligence Research." In Advances in Artificial Intelligence (Mercer R. and Neufeld E. eds.), Springer-Verlag, pp 1-15, 1998.
  7. [BIL98b] Billings D., Papp D., Schaeffer J. and Szafron D. "Opponent Modeling in Poker." In Proceedings of AAAI-98 (15th National Conference of the American Association for Artificial Intelligence (AAAI)), pp 493-499, 1998
  8. [BIL99] Billings, D., Peña, L., Schaeffer, J. and Szafron, D. Using probabilistic knowledge and Simulation to Play Poker. In proceedings AAAI-99 (Sixteenth National Conference of the American Association for Artificial Intelligence (AAAI)), July 18-22, 1999, Orlando, Florida, AAAI Press, Menlo Park, California, pp 697-703, 1999.
  9. [BOW53] Bowden, B. V. 1953. Faster Than Thought. London: Pitman
  10. [CHE99] Chellapilla K. and Fogel, D. B. "Evolution, Neural Networks, Games, and Intelligence." Proc. IEEE, Vol 87, No. 9, pp 1471-1498, 1999.
  11. [CHE00] Chellapilla K. and Fogel, D. B. "Anaconda Defeats Hoyle 6-0: A Case Study Competing an Evolved Checkers Program against Commercially Available Software." In Proceedings of Congress on Evolutionary Computation, July 16-19 2000, La Jolla Marriot Hotel, La Jolla, California, USA, pp 857-863, 2000
  12. [DAR95] Darwen P. and Yao X. "On Evolving Robust Strategies for Iterated Prisoners Dilemma." In Progress in Evolutionary Computation, vol. 956 of LNAI, pp 276-292, Springer, 1995.
  13. [DAR01] Darwen P. and Yao X. "Why More Choices Cause Less Cooperation in Iterated Prisoner's Dilemma." In proceedings of Congress of Evolutionary Computation (CEC'01), pp 987-994, 2001
  14. [FIN77] Findler N. "Studies in Machine Cognition Using the Game of Poker." CACM 20(4), pp 230-245, 1977
  15. [FOG01] Fogel, D. "Blondie24 : Playing at the Edge of AI". Morgan Kaufmann, 2001, ISBN:1558607838
  16. [HAM97] Hamilton, S. and Garber, L. "Deep Blue's hardware-software synergy". IEEE Computer, 30(10), pp 29-35, October 1997.
  17. [IBMRes] http://www.research.ibm.com/deepblue/meet/html/d.3.html
  18. [KEN01a] Kendall, G. and Whitwell, G.. "An Evolutionary Approach for the Tuning of a Chess Evaluation Function using Population Dynamics". Accepted for CEC2001 (Congress on Evolutionary Computation 2001), COEX Center, Seoul, Korea, May 27-29, 2001
  19. [KEN01b] Kendall G. and Willdig M. An Investigation of an Adaptive Poker player. Accepted for The 14th Australian Joint Conference on Artificial Intelligence (AI'01), Adelaide, Australia, 10 - 14 December 2001
  20. [KOL97] Koller D. and Pfeffer A. Representations and Solutions for Game-Theoretic Problems. Artificial Intelligence 94(1-2), pp 167-215, 1997.
  21. [LUG98] Artificial Intelligence : Structures and Strategies for Complex Problem Solving by George F Luger and William A Stubblefield (3rd Edition, 1998, Addison Wesley Longman, ISBN 0-805-31196-3)
  22. [PEA84] Pearl, J. 1984. Heuristics: Intelligent Strategies for Computer Problem Solving. Reading, MA: Addison-Wesley
  23. [POU92] Poundstone, W. Prisoner's Dilemma. Doubleday, 1992
  24. [SAM59] Samuel A. L. "Some Studies in Machine Learning using the Game of Checkers". IBM Journal of Research and Development, 3(3), pp 210-229, 1959
  25. [SAM63] Samuel, A. L. 1963. Some Studies in Machine Learning using the Game of Checkers. In Computers and Thought, ed E. A. Feigenbaum and J. Feldman. New York: McGraw Hill
  26. [SAM00] Samuel A. L. "Some Studies in Machine Learning using the Game of Checkers". IBM Journal of Research and Development, Vol. 44 No. 1/2 January/March, pp 207-226, 2000
  27. [SCH96] Schaeffer J., Lake R., Lu P. CHINOOK The World Man-Machine Checkers Champion. AI Magazine, 17(1), pp 21-30, 1996
  28. [SCH97] Schaeffer, J. "One Jump Ahead : Challenging Human Supremacy in Checkers". Springer-Verlag, 1997, ISBN:0387949305
  29. [SCH99] Schaeffer J., Billings D., Peña L. and Szafron D. "Learning to Play Strong Poker." In proceedings of the Sixteenth International Conference on Machine Learning (ICML-99) (Invited Paper)
  30. [SCH00] Schaeffer J. The Games Computers (and People) Play. Advances in Computers, vol. 50 Marvin Zelkowitz editor, Academic Press, pp. 189-266, 2000
  31. [SHA50] Shannon, C. E. 1950. Programming a computer to play chess. Philosophical Magazine [Series 7] 41:256-275
  32. [VON44] von Neumann J., Morgenstern O. "Theory of Games and Economic Behavior." Princeton NJ: Princeton Univ. Press, 1944

 


 Last Updated : 14 Sep 2001