
# Learning Real-Time A* (LRTA*) Search

If a problem is to be solved repeatedly with the same goal state but different initial states, then one would like an algorithm that improves its performance over time. Learning Real-Time A* (LRTA*) is such an algorithm. It behaves almost identically to RTA*, except that instead of storing the second-best f value of a node as its new heuristic value, it stores the best value instead. Once one problem instance is solved, the stored heuristic values are saved and become the initial heuristic values for the next problem instance.

While LRTA* is less efficient than RTA* for solving a single problem instance, if it starts with admissible initial heuristic values then, over repeated trials, its heuristic values eventually converge to their exact values, at which point the algorithm returns optimal solutions.
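As a minimal sketch of one trial, assuming an explicit graph stored as an adjacency dictionary (all names here are illustrative, not from the original article):

```python
def lrta_star_trial(graph, h, start, goal, max_steps=10_000):
    """Run one LRTA* trial; return the path taken.

    `graph` maps each state to {neighbour: edge_cost}; `h` is a mutable
    table of heuristic values that persists across trials.  After each
    move, the *best* f value among the neighbours is written back as the
    new h value of the state just left, so h improves over repeated
    trials and is reused for the next problem instance.
    """
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return path
        # f(n') = edge cost to neighbour + stored (or default) h(neighbour).
        # The default of 0 is admissible when edge costs are nonnegative.
        f = {n: cost + h.get(n, 0) for n, cost in graph[state].items()}
        best = min(f, key=f.get)
        h[state] = f[best]          # LRTA*: store the best f value
        state = best
        path.append(state)
    raise RuntimeError("step limit exceeded")
```

Passing the same `h` dictionary into successive calls captures the cross-instance learning described above.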


## Real-Time A* Search

Simply repeating minimin search for each move ignores information from previous searches and can result in infinite loops. In addition, since actions are committed to on the basis of limited information, often the best move may be to undo the previous move. The principle of rationality is that backtracking should occur when the estimated cost of continuing along the current path exceeds the cost of returning to a previous state plus the estimated cost of reaching the goal from that state. Real-Time A* (RTA*) implements this policy in constant time per move on a tree.

For each move, the f(n) = g(n) + h(n) value of each neighbour of the current state is computed, where g(n) is now the cost of the edge from the current state to the neighbour, instead of from the initial state. The problem solver moves to the neighbour with the minimum f(n) value, and stores with the previous state the best f(n) value among the remaining neighbours. This represents the h(n) value of the previous state from the perspective of the new current state. This is repeated until a goal is reached. To determine the h(n) value of a previously visited state, the stored value is used, while for a new state the heuristic evaluator is called. Note that the heuristic evaluator may employ minimin lookahead search with branch-and-bound as well.
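The per-move rule above can be sketched as follows, again assuming an adjacency-dictionary graph and a heuristic function `h0` for states not yet visited (names are illustrative):

```python
import math

def rta_star(graph, h0, start, goal, max_steps=10_000):
    """Move step by step toward the goal; return the path taken.

    With each state left behind, RTA* stores the second-best f value of
    its neighbours, i.e. the best f among the *remaining* neighbours:
    the estimated cost of coming back and going another way.
    """
    h = {}                          # stored values for visited states
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return path
        def hval(n):
            # Stored value for previously visited states,
            # heuristic evaluator for new ones.
            return h[n] if n in h else h0(n)
        f = sorted((cost + hval(n), n) for n, cost in graph[state].items())
        best_f, best_n = f[0]
        # Second-best f, or infinity if there is only one way out.
        h[state] = f[1][0] if len(f) > 1 else math.inf
        state = best_n
        path.append(state)
    raise RuntimeError("step limit exceeded")
```

Because the stored value includes the edge cost back plus the estimate onward, revisiting a state automatically applies the backtracking criterion described above.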

In a finite problem space in which a path to a goal exists from every state, RTA* is guaranteed to find a solution, regardless of the heuristic evaluation function. Moreover, on a tree, RTA* makes locally optimal decisions given the information it has seen so far.

## Minimin Search

Minimin search determines individual single-agent moves in constant time per move. The algorithm searches forward from the current state to a fixed depth determined by the informational or computational resources available. At the search horizon, the A* evaluation function f(n) = g(n) + h(n) is applied to the frontier nodes. Since all the decisions are made by a single agent, the value of an interior node is the minimum of the frontier values in the subtree below it. A single move is then made to the neighbour of the current state with the minimum value.

Most heuristic functions obey the triangle inequality characteristic of distance measures. As a result, f(n) = g(n) + h(n) is guaranteed to be monotonically nondecreasing along a path. Moreover, since minimin search has a fixed depth limit, we can apply depth-first branch-and-bound to prune the search tree. The performance improvement due to branch-and-bound is quite dramatic, in some cases extending the achievable search horizon by a factor of five relative to brute-force minimin search on sliding-tile puzzles.
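A minimal sketch of minimin search with branch-and-bound, assuming a `successors` function yielding `(edge_cost, child)` pairs and a heuristic `h` satisfying the monotonicity property above (all names are illustrative):

```python
import math

def minimin(state, g, depth, h, successors, alpha=math.inf):
    """Return the backed-up f value of `state`.

    At the horizon (depth == 0) return f = g + h(state); interior values
    are the minimum over children.  `alpha` is the best frontier value
    found so far: since f is nondecreasing along a path, a node with
    f >= alpha cannot lead to anything better and is pruned.
    """
    f = g + h(state)
    if depth == 0 or f >= alpha:     # horizon reached, or pruned
        return f
    best = alpha
    for cost, child in successors(state):
        best = min(best, minimin(child, g + cost, depth - 1, h, successors, best))
    return best

def best_move(state, depth, h, successors):
    """Pick the neighbour with the minimum backed-up value; the best
    value so far seeds the pruning bound for later neighbours."""
    best_val, best = math.inf, None
    for cost, child in successors(state):
        val = minimin(child, cost, depth - 1, h, successors, best_val)
        if val < best_val:
            best_val, best = val, child
    return best
```

Threading the running minimum through `alpha` is exactly the depth-first branch-and-bound pruning described above; without it, the same code would expand the full tree to the horizon.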

Minimin search with branch-and-bound is an algorithm for evaluating the immediate neighbours of the current node. As such, it is run until the best child is identified, at which point the chosen move is executed in the real world. We can view the static evaluation function combined with lookahead search as simply a more accurate, but computationally more expensive, heuristic function. In fact, it provides an entire spectrum of heuristic functions trading off accuracy for cost, depending on the search horizon.


# Interleaving Search And Execution

In AI search there are computational limits, as well as uncertainty due to the opponent's moves in two-player games. So it is wise to interleave search and execution, with each search determining only the next move to be made. This paradigm is also applicable to single-agent problems. In the case of autonomous vehicle navigation, for example, information is limited by the horizon of the vehicle's sensors, and it must physically move to acquire more information. Thus, one move must be computed at a time, and that move executed before computing the next. The following algorithms are designed to address this scenario.
