Describe how the minimax and alpha-beta algorithms change for two-player, non- z
ID: 3592571 • Letter: D
Question
Describe how the minimax and alpha-beta algorithms change for two-player, non- zero-sum games in which each player has a distinct utility function and both utility functions are known to both players. If there are no constraints on the two terminal utilities, is it possible for any node to be pruned by alpha-beta? What if the player’s utility functions on any state differ by at most a constant k, making the game almost cooperative?
NOTE: Please do not use the answer from chegg textbook solutions. Because I do not understand it. Thank you.
Explanation / Answer
Solution:
Case 1: Minimax
Since each player has a separate utility function, if played right, minimax algorithm should work as usual. Each player is going to choose what is best for him, even if it's good for the other player too. So, we proceed by assigning increasingly positive values for states which increase better for Max. Similarly, we assign increasingly negative values for states increasingly better for Min. If the above two conditions are satisfied, then the minmax algo works unchanged. If in case both players have increasingly positive values, then each player can opt to choose the maximum value everytime to win. This can be called as maximax algo.
Case 2 : Alpha-beta
Unlike minimax algorithm, alpha beta pruning will not work in this case. This is because in alpha-beta, we go by the notion that what's good for Max is bad for Min. Let's take an example. Min won’t allow Max to go down a path since Min can force something worse, hence Max knowing this does not have to explore the path. Since we have been given that this is a zero-sum assumption, the same state cannot be good for both Max and Min. You cannot just assume that what Min likes, Max won't and also vice versa.
Please, please upvote and ask your doubts in the comments.