How to use the Negamax algorithm

I was wondering how I can use the negamax algorithm. I am trying to write an agent for a mancala game in C #. The algorithm gives you one number when playing node.

Let's say my AI player wants to make a move. The negamax function returns one number. So this tells me which result is best for this. How can I use this number?

if he turned the player, I tried to take his possible steps and check the negamax for each of them. However, if I first make a transition and then check negamax, then when negamax is working (let's say we're still only at level 1) it will evaluate the move and then player B should be the next step.

I am very confused about this. When I see the negamax pseudocode (for example on the wikipedia page) it says to try to get the player to move. If I do this, it will return the highest score without telling me what move that result got.

How should you use Negamax?

+3


source to share


2 answers


It's fun.

It's all about examining each node in the tree of possible moves. If you are using alpha-beta pruning, you can make the algorithm more efficient by "pruning" (not evaluating) some branches of the tree. I'm going to assume that you are not using pruning and will be looking at a full tree.

If Mancala is a very simple game like Tic-Tac-Toe, you can implement the algorithm without the need for a "score function". With tic-tac-toe, if you play every possible move, you get either a win, a loss, or a draw. You would implement the negamax algorithm without any relation to the intermediate states of the game (i.e. any movement to the last), because there are a very limited number of possible moves, and the AI ​​engine can easily figure out all the possibilities until the very end.

In chess, on the other hand, the "evaluation function" (EF, hereinafter) is important because no equipment on this planet can compute every possible sequence of chess movements until the end of the game. So most chess AIs will go 12-14 levels and then evaluate the resulting position by assigning 8 points for the queen, 5 for the rook, 3 for the bishop or knight, 1 for the pawn, and then assign extra points for things like squares. controlled (more points for controlled center squares), king safety, etc.

For Mancala, as far as I can tell, it is probably difficult enough that a scoring function is needed, but perhaps this scoring function will be simple, such as the number of seeds still in possession, as well as adding points for seeds that are in an advanced position ... (I looked at Wiki Mancala, it looks like there are many possible options - I'm not sure who you are working with.)

Thus, the Negamax algorithm must be implemented at a certain depth (i.e. not until the end of the game, using all possible pieces) and with a simple EF. Let's say you are implementing an AI that is looking 5 steps deep. The good thing about non-gamax is that it is completely symmetrical and zero; in other words, if a position is rated 5 for an AI, it is rated -5 for a human player. And if it is 13 for a person, then it will be -13 for AI. This is the "singular" that is being discussed. With that in mind, the AI ​​algorithm would look something like this (again, without cropping):

1) Explore every possible AI move

2) For each of these steps, investigate every possible opponent's response

3) For each of these possible answers, examine every possible AI move

4) For each of these possible AI moves, investigate every possible opponent's response



5) Finally, for each of the opponent's possible responses, investigate each possible AI move

We've now reached depth 5 and you've created a tree with 5 levels and maybe thousands or millions of leaves (bottom layer nodes) of the tree. You code this so that each node has a reference to its parent node and references all of its child nodes so that you can easily traverse the tree, from parent to children and then back again.

Once you've set up the tree correctly, now it's time to implement the negamax algorithm as follows (let's say a higher score is better for an AI player):

6) For each answer against level 4, find the HIGHEST estimate of all movements of the AI ​​children and crop all other children. You determine which of the 5-now-moves your AI will play in response to every possible response from the 4th from the current opponent. So now every level 4 answer has exactly one level 5 accepted answer. You now assign the grade you made from a 5th level child to a 4th level parent. This suggests that if you reach this 4th level opponent, the AI ​​will make that 5th level and the board will judge that score.

7) Then you rate every third level AI move and each one find the lowest mark among all the opponent's moves from 4th place, crop all other children and assign a 4th level grade (which came from the highest 5th level node) up to the 3rd level. You do the same as in step 6, except that you are using a LOWEST child score (b / c is the movement of the AI, not the movement of the opponent).

8) Do the same for the 2nd level as step 6, finding the HIGHEST score among all movements from the third to the last, and assign the second level nodes the same high scores.

9) Do the same for the 1st level as in step 7, find the LOWEST score among all the displacements from the second to the next, and assign the 1st level nodes the same low scores.

10) Look at all 1st level nodes and your AI should be playing the one with the HIGHest score.

Obviously, you would make the depth not hardcoded to 5, but instead a parameter, and you would use recursion (as in the Wiki) to accomplish this. To select depth, look at how long it takes to launch and set n to the highest depth that still allows the AI ​​to react quickly. Once you've got the basics in place, you can add pruning strategies later, which will allow you to increase depth without evaluating entire tree branches, which obviously aren't the right move, but that's the complete basic non-gamax I've laid out for you.

Good luck, this should be fun to program!

+5


source


Onemancat gives a very detailed explanation - +1.



The short answer to your question is that negamax returns a score for a specific position, so you have to play every move in the first layer, call negamax on each resulting position to evaluate it, and then select the move with the best result as the result.

+2


source







All Articles