Real-Time A* Search
Simply repeating minimin search for each move ignores information from previous searches and can result in infinite loops. In addition, since actions are committed to on the basis of limited information, the best move may often be to undo the previous move. The principle of rationality is that backtracking should occur when the estimated cost of continuing the current path exceeds the cost of going back to a previous state plus the estimated cost of reaching the goal from that state. Real-time A* (RTA*) implements this policy in constant time per move on a tree.
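As a minimal sketch of this decision rule only (the function name and parameters below are hypothetical and are not part of RTA* itself), the comparison can be written as a single predicate:

```python
def should_backtrack(est_cost_continue, cost_back, est_cost_from_previous):
    # Principle of rationality: back up when returning to a previous state
    # and reaching the goal from there is estimated to be cheaper than
    # continuing along the current path.
    return est_cost_continue > cost_back + est_cost_from_previous

# For example: continuing is estimated at 10, while backing up costs 2 and
# the goal is estimated at 6 from the previous state, so backtracking wins.
assert should_backtrack(10, 2, 6) is True
```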
For each move, the f(n) = g(n) + h(n) value of each neighbour of the current state is computed, where g(n) is now the cost of the edge from the current state to the neighbour, instead of from the initial state. The problem solver moves to the neighbour with the minimum f(n) value, and stores with the previous state the best f(n) value among the remaining neighbours. This represents the h(n) value of the previous state from the perspective of the new current state. This is repeated until a goal is reached. To determine the h(n) value of a previously visited state, the stored value is used, while for a new state the heuristic evaluator is called. Note that the heuristic evaluator may employ minimin lookahead search with branch-and-bound as well.
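The per-move procedure described above can be sketched roughly as follows. This is an illustrative sketch, not a definitive implementation: the function signature, the helper callables (neighbors, edge_cost, heuristic, is_goal), and the dead-end handling are assumptions, and states are assumed to be hashable.

```python
def rta_star(start, neighbors, edge_cost, heuristic, is_goal):
    """Sketch of Real-Time A* (RTA*) on a graph of hashable states.

    neighbors(s)    -> iterable of successor states of s
    edge_cost(s, n) -> cost of the single edge from s to n
    heuristic(s)    -> heuristic estimate of cost from s to a goal
                       (it may itself run minimin lookahead search)
    is_goal(s)      -> True if s is a goal state
    """
    h_table = {}        # updated h values for previously visited states
    current = start
    path = [start]

    def h(state):
        # Use the stored value for previously visited states;
        # otherwise call the heuristic evaluator.
        return h_table.get(state, heuristic(state))

    while not is_goal(current):
        # f(n) = g(n) + h(n), with g(n) the cost of the edge from the
        # current state to the neighbour n.
        best_f = second_f = float("inf")
        best_n = None
        for n in neighbors(current):
            f = edge_cost(current, n) + h(n)
            if f < best_f:
                best_f, second_f = f, best_f
                best_n = n
            elif f < second_f:
                second_f = f

        if best_n is None:
            raise RuntimeError("dead end: current state has no neighbours")

        # Store with the previous state the best f value among the remaining
        # neighbours (the second-best f overall); this becomes its h value
        # as seen from the new current state.
        h_table[current] = second_f

        current = best_n
        path.append(current)

    return path
```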
In a finite problem space in which there exists a path to a goal from every state, RTA* is guaranteed to find a solution, regardless of the heuristic evaluation function. Moreover, on a tree, RTA* makes locally optimal decisions given the information it has seen so far.