I'm trying to develop a Tic-Tac-Toe game with an unbeatable AI and am at a point where my negamax function returns correct output most of the time. At times, though, under certain predictable conditions, the computer chooses a board move that doesn't make sense and fails to block a user win.
Based on the research I've done, the negamax algorithm itself seems to look fine. I'm more concerned that there may be something wrong in my evaluation function; in other words, in the way the heuristic value is calculated. Most of my energies were directed at tackling negamax, so the other functions were mostly afterthoughts.
Here's just one example of what I'm describing:
// user: X, computer: O, turn: computer
// this scenario should return 1 to block user from [0, 1, 2]
// returns 5 when negamax called with compNum
// output is correct when negamax called with userNum (simulating user's turn)
var scenario01 = [
userNum, 0, userNum,
userNum, compNum, 0,
compNum, 0, 0
];
When faced with this scenario, rather than choosing square 1 (top middle: array is zero-indexed) to block user from achieving a win, the negamax function outputs 5, a square that doesn't have any immediate use to either player. However, in testing, if I call negamax on the same scenario with userNum as an argument, negamax does its job and finds the winning square 1 for the user.
I know I'm missing something deceptively simple, and I have a feeling that it's related to the way I've set up userNum and compNum and the way they're hard-coded. I may have taken my Minimax ideas and translated them unsuccessfully to negamax, meaning that I may not be maximizing value for both user and computer. Or I may be maximizing from the wrong player's point-of-view. If someone would kindly provide a push in the right direction, I would be grateful.
The full, working version, along with tests, is available at: https://repl.it/CnGD/4
I didn't want to post an overwhelming amount here, but I can create a snippet on this site if requested.
Negamax function
var negamax = (board, player, depth, lvl, α, β) => {
if ((board.getWinner(userNum, compNum) !== null) || depth >= lvl) {
return board.getWinner(userNum, compNum);
// also tried: return board.score(depth) * player;
}
var highScore = -Infinity;
board.available().forEach(function (move) {
board.mark(move, player);
var score = -negamax(board, -player, depth + 1, lvl, -β, -α);
board.undo(move);
if (score > highScore) { // if better cell is found
highScore = score; // note best score so far
bestMove = move; // note best move so far
α = Math.max(α, score);
if (α >= β) { // alpha-beta pruning
return bestMove;
}
}
});
return bestMove; };
Evaluation function(s)
// these functions are part of a Board prototype
// I haven't included it all here, but will provide a link
getWinner: function (userNum, compNum) {
return (
this.validateWin(this.occupied(userNum)) ? userNum :
this.validateWin(this.occupied(compNum)) ? compNum :
this.matchAll(true) ? 0 :
null
);
},
score: function (depth) {
if (board.getWinner(userNum, compNum) === userNum) {
return depth - 10;
} else if (board.getWinner(userNum, compNum) === compNum) {
return 10 - depth;
} else {
return 0;
}
}