Alpha–beta pruning

The alpha-beta search (also Alpha - Beta Cut or alpha-beta pruning called ) is an optimized variant of the minimax search procedure, ie an algorithm for determining an optimal train for games with two opposing parties. During the search, two values ​​alpha and beta to be updated, indicating which result, players can achieve with optimal play. Using these values, it can be decided which parts of the search tree need not be investigated because they can not influence the outcome of problem solving.

The simple ( non-optimized ) alpha-beta search yields exactly the same result as the minimax search.

Informal Description

The minimax algorithm analyzes the complete search tree. But it also nodes are considered that are not included (the choice of the branch to the root) in the result. The alpha-beta search tries to ignore as many of these nodes.

An illustrative example of the operation is a two-person game in which the first player selects one of several pockets and his opponent receives the object with the lowest value of this bag.

The Minimax algorithm searches for the selection of all bags completely and therefore requires a lot of time. The alpha-beta search contrast is checked first, only the first pocket completely by the object with the minimum value. In all other bags will be searched only as long as, until the value of an item falls below this minimum. If this is the case, the search is aborted in this case and the next case examined. Otherwise, this bag is a better choice for the first player and its minimum value is used for the further search as a new frontier.

Similar situations are familiar every chess player who just checks a specific train to see if it appears advantageous to him. If he finds in his analysis of turn unfavorable for themselves response of the opponent, then he "refutes " this train as view and discard. It would be completely pointless, yet to investigate further responses of the opponent, in order to determine whether the enemy still has more effective rebuttals and how bad the train is actually planned for the player.

The algorithm

The alpha-beta search works basically the same as the above informal description. The idea is that the two values ​​( alpha and beta) are passed, which describe the worst-case scenario, the player. The alpha value is the result that will reach at least Player A, the beta value is the result, which will reach more than Player B ( Here it should be noted that the goal for Player B to obtain the lowest possible result, since he " minimizing " plays! )

If a maximizing node ( Player A ) a train whose return exceeds the beta value, the search is aborted in this node ( beta cutoff, because Player B A would not even offer this option, because it its previous maximum would exceed concession ). Returns the train instead, a result that exceeds the current alpha value, this is lifted upward.

The same applies for minimizing node, is terminated at values ​​less than alpha (alpha - cutoff ) and the beta value is adjusted downward.

The figure above shows an example tree with 18 leaves, of which only 12 are evaluated. The three-edged values ​​of an internal node describe the alpha value, the return value and the beta value.

The search algorithm used is a so-called alpha-beta window whose lower limit of the alpha value and the upper limit is the beta value. This window is passed to the child node, starting at the root with the maximum window [- inf, inf ]. The sheets 1, 2 and 3 are evaluated by a maximizing node and the best value of 10 is given to minimizing parent node. This adjusts the beta value ( indicated by the orange line on the left below) and passes the new window [- inf, 10] the next -maximizing child node that owns the leaves 4, 5 and 6. The return value of 12 Sheet 5 is so good that it exceeds the beta value of 10. Thus, sheet 6 must no longer be considered because the result 12 of this sub-tree is better than that of the left subtree, and would therefore never elected by the minimizing player.

Similarly, when minimizing node with the 3- alpha cutoff. Although this part of the tree was only partially evaluated, it is clear that the maximizing root node would never choose this option because of the minimized node could force a result of more than 3, while from the middle part of the tree, the result is 12 is ensured.

Implementation

Note: Differences for easy Minimax algorithm are highlighted in yellow.

Main program ( excerpt):

GespeicherterZug = NULL;   gewuenschteTiefe int = 4;   int rating = max ( gewuenschteTiefe,                       -infinity, infinity ); if ( gespeicherterZug == NULL)      There were no other trains more;   else      gespeicherterZug run; The normal version is:

Int max ( int depth,           int alpha, int beta) {if ( depth == 0 or keineZuegeMehr ( spieler_max ) )         return evaluate ();      int MaxValue = alpha; generiereMoeglicheZuege ( spieler_max );      while ( still train there) {         fuehreNaechstenZugAus ();         int value = min ( depth -1,                        MaxValue, beta); macheZugRueckgaengig ();         if ( value > max value ) {            MAX VALUE = value;            if ( max value > = beta) break; if ( deep == start deep )               gespeicherterZug = train;         }      }      return maximum value;   }   int min ( int depth,           int alpha, int beta) {if ( depth == 0 or keineZuegeMehr ( spieler_min ) )         return evaluate ();      int MinValue = beta; generiereMoeglicheZuege ( spieler_min );      while ( still train there) {         fuehreNaechstenZugAus ();         int value = max ( depth -1,                        alpha, min value ); macheZugRueckgaengig ();         if ( value < MinValue ) {            MinValue = value;            if ( min value < = alpha) break; }      }      return min value;   } The NegaMax variant is:

MiniMax int (int player, int depth,               int alpha, int beta) {if ( depth == 0 or keineZuegeMehr ( player ) )         return rate ( player );      int MaxValue = alpha; generiereMoeglicheZuege ( player );      while ( still train there) {         fuehreNaechstenZugAus ();         int value = miniMax ( player, depth -1,                             -beta - max value ); macheZugRueckgaengig ();         if ( value > max value ) {            MAX VALUE = value;            if ( max value > = beta) break; if ( deep == start deep )               gespeicherterZug = train;         }      }      return maximum value;   } Note: While the standard implementation for a player maximized and minimized for the other player, maximize Negamax variant for both players and reversed and negated alpha and beta in the recursion. It follows that the evaluation function must behave differently in the two implementations.

  • Standard implementation: The better the board position for maximizing player, the greater the return value of the evaluation function is (rate function ()). The better it is for the minimizing player, the smaller the return value.
  • Negamax implementation: Since both players maximize the evaluation function must provide all the larger values ​​, the better the board position of the straight -withdrawing (rate function ( player ) ). The value is always specified from the point of view.

Optimizations

Is it possible to calculate due to the complexity of the considered game the game tree only up to a certain depth, two optimization approaches are possible. Firstly, the evaluation function can be improved, on the other hand provides the alpha-beta algorithm itself optimization potential. The better the evaluation function, the less deep the game tree must be considered in order to achieve a certain skill level.

Below are details only on optimization of alpha-beta search.

Presorting of trains

Unlike the minimax algorithm with alpha - beta search, the order will be processed in the child node (trains ), an essential role. The faster the alpha-beta - window is reduced, the more cutoff can be achieved. Therefore, it is important to first consider the features that promise the best result. In practice, various heuristics are used. In chess, for example, you can sort the features according to whether or which character is beaten, or even which character beats. "Tower beats lady " is therefore far from "Tower takes Rook " sorted and " farmer takes Rook " is sorted between the two.

Principal Variation Search

A node in the alpha-beta search belongs to one of three categories (based on the NegaMax variant):

  • Alpha - node: Each Folgezug returns a value less than or equal to alpha, which means that not a good train is possible here.
  • Beta - node: At least one Folgezug returns a value greater than or equal to beta, which means a cutoff.
  • Principal- variation - node: At least one Folgezug returns a value greater than alpha, but all provide a value less beta.

Sometimes you can detect early stage to which node it is. Returns the first tested Folgezug a value greater than or equal Beta, then it is a beta node. If it returns a value less than or equal alpha, then an alpha node may be (assuming the trains are well -sorted ). But it returns a value between alpha and beta, so may constitute a Principal Variation node.

The principal variation search now assumes that a Folgezug, which gives a value between Alpha and Beta, will turn out to be the best possible train. Therefore, the alpha-beta is in the window following the mininum ( alpha, alpha 1) reduced ( nonzero ), in order to achieve a maximum number of cutoffs, but the remaining features yet to prove as bad.

If, during search with this zero window out that the train has a value greater than alpha ( but the result is still less than beta, otherwise you can equal a beta- cut make ), the search must be repeated with a larger window: Since you already know a lower bound on the pull value from the zero window, the window in the repeat search of this value should reach beta. Because of the repetition sometimes necessary to search only be achieved by this method success when the trains were well -sorted, because it misclassifications are minimized in one of these three categories, that is, it is unlikely that a Folgezug has a better value than alpha and the search must be repeated. On the other hand principal variation node can be used in combination with the iterative deepening for the sorting of the trains.

Int Alpha Beta ( int depth, int alpha, int beta) {      if ( depth == 0 ) return Evaluate ();      PVgefunden BOOL = FALSE;      best = -infinity;      GeneriereMoeglicheZuege ();      while ( ZuegeUebrig ())      {          FuehreNaechstenZugAus (); if ( PVgefunden ) {value = alpha beta (low -1, alpha- 1, alpha ); if ( value > alpha && value < beta) value = Alpha Beta ( depth -1, -beta, value); else }              value = Alpha Beta ( depth -1, -beta, -alpha);          MacheZugRueckgaengig ();          if ( value > best )          {              if ( value > = beta)                  returnValue;              best value =;              if ( value > alpha)              {                  alpha = value;                  PVgefunden = TRUE;              }          }      }      return best; } Iterative depth-first search (Iterative Deepening )

The iterative deepening search is gradually increasing the depth of the search tree. Since the alpha-beta search is a depth-first search, you can not determine how long the computation will take usually before. Therefore, one starts with a small search depth increases and this gradually. The result of a calculation can be used to better pre-sort the trains with increased search depth.

Aspiration windows

Aspiration windows are used together with the iterative deepening search. Generally, the alpha-beta search starts at the root, with a maximum window. In iterative deepening search but it can be assumed that a new calculation with higher depth will provide a similar result value. Therefore, the search window can be set to a (relatively) small area around the resulting value of the previous calculation initial. Turns out that this window was too small, must (similar to the principal- variation - search) search with a larger maximum window or be repeated.

Killer heuristic

The killer heuristic is a special type of Zugvorsortierung. It is assumed here that trains that have caused a cutoff in other parts of the search tree will cause a cutoff (at the same depth). Therefore, they will in future always be considered first, provided they are in the game just considered position valid moves. This heuristic can not be usefully applied in all games, as the view that these so-called killer trains are still valid moves in other parts of the search tree, depends on the nature of the game.

Find peace

In the alpha-beta search or the minimax algorithm, it is not very favorable when searching on reaching a fixed search depth is canceled. In this way, for example, could in chess hitting a covered farmers by the lady appear to be advantageous if the strike back below the search horizon and therefore "overlooked" by the algorithm would. For this reason, the transition from the alpha-beta search to the evaluation function is performed dynamically, and such that below the minimum depth of search in terms of the evaluation function an approximate constancy is awaited. This " rest " in relation to the values ​​of the evaluation function is eponymous for the procedure, it is called an idle search (English Quiescence search). In chess, the rest search results, in particular, that an exchange of blows is analyzed until the end.

Zero - train - Search

The zero - train - search is an optimization that makes use of the fact that in some games (eg chess) is the train right of advantage.

Comparison of Minimax and Alpha Beta

The following table shows a sample calculation of a chess position with a constant search depth of four half-moves ( each player draws twice). So that the normal minimax algorithm has been applied and alpha-beta without Zugsortierung and with (simple ) Zugsortierung. The percentage at the cutoffs are based on the entire search tree and describes how much of the entire search tree has not been evaluated. These are estimates, which are based on that the subtrees are roughly the same size (with cutoffs is not known how big the cut-away part tree really would ).

It is clearly seen that the alpha-beta search represents a significant speed increase over minimax. The Zugsortierung improves the computation time in this example by a factor of 10, the fact that with Zugsortierung the number of cutoffs absolutely decreases can be explained by the fact that these are undertaken at higher levels in the search tree and thus larger parts to be cut away.

History

While the centrality of the minimax algorithm for chess programming has already been highlighted by the pioneers Alan Turing and Claude Shannon in the 1940s, the beginnings of the alpha-beta search in the late 1950s, initially only alpha cutoffs were used. Only in the 1960s, the alpha-beta algorithm has been an integral part of chess programs. A first, but unpublished description dates from the year 1963. The first detailed study was published in 1975.

52512
de