Search algorithms

Seek, and ye shall find

To find a solution, you usually need to search for it. For artificial agents, this is no different. On this page, we explore several ways in which agents can search, and list some of the advantages and disadvantages. The JavaScript on this page makes use of HTML5.

We represent a search problem as a graph, where each node is a point in the solution. For example, this could be a road map, where different nodes are different cities. But nodes can also represent different actions in a plan. For example, an agent that wants to make a cup of tea can search through his repertoire of actions and find that it should pass through the nodes of “putting on the kettle” and “pouring the hot water” to get to the goal of “enjoying a nice cup of tea”.

In our representation, agents always try to find a path from the start S to the goal G. Every arrow represents a path from one node to another, and can only be travelled in one direction. The number next to each edge shows the cost of taking that path (e.g. the time, effort, or energy consumed by taking the path). When there is no edge connecting two nodes, it is not possible to go from one node to the other directly. However, it may still be possible to go there indirectly.

Below, we describe a number of search algorithms, and show an example of the way this algorithm works in practice. This list of algorithms includes

Each example consists of the example network and the stack. The stack shows what options the agent can still explore. Whenever a node is put onto the stack (the agent recognizes the possibility of going to that node), the corresponding node in the network turns green. When the node is removed from the stack (the agent has finished considering the option), it turns blue. As soon as the goal node G turns blue, the agent has found a solution and stops. In some cases, it may happen that an edge turns red. This indicates that the agent has chosen not to investigate that path at all.

To compare these search algorithms more directly, you can also view the search algorithm script page, which provides a simultaneous view of the network, the stack, the search tree, and the algorithm.

Depth first search

An agent that uses depth first search is always moving forward. Every time the agent reaches a node, it looks at all the options available from that point and chooses to continue searching in the last direction that the agent has observed. Backtracking to a previous point only happens when the agent reaches a dead end, and is therefore forces to turn back.

One of the main advantages of depth first search is that it requires very little memory. But because of that, depth first search may lead agents to go around in circles. Fortunately, there are no circular paths in the example above. In general, to prevent going around in circles, agents can skip visited nodes. Whenever they see a node for the second time, they ignore the possibility of going there.

Depth limited search

As mentioned above, the disadvantage of depth first search is that you may get stuck in a circular path. One way to overcome this is stop searching after a given number of steps. For example, the agent may restrict itself for paths that visit no more than three nodes. This way, the agent is sure not to get stuck in a circular path for all eternity. The downside, of course, is that if there the path from start to goal visits at least four nodes, the agent cannot find a solution even though it exists.

Iterative deepening search

The disadvantage of depth first search is that you may get stuck in a circular path. Depth limited search solves this partially by putting a limit on the length of the path. Unfortunately, this may lead the agent to be unable to find the solution, because it takes too many steps.

To solve this problem, the agent could simply increase its search limit and try again. This is exactly what iterative deepening search does. Whenever the agent is unable to find a path from start to goal, the agent increases its search depth limit and restarts the search. But the demonstration shows that this solution has its own problems: the agent may be doing the same search over and over again, forgetting what it has learned every time.

Breadth first search

If depth first search is about always moving forward, breadth first search is about not straying too far. An agent that follows breadth first search explores its options in the order it observed these options. Rather than staying on one chosen path, the agent will jump back and forth between options until it finds the solution.

Breadth first search has a huge advantage over depth first search. Because breadth first search means a meticulous search for a path to the goal, it is guaranteed to find such a path if it exists. It will even find the path that takes the fewest number of steps, though this may not be the path with the lowest cost. This advantage comes at a cost of the memory needed to remember all paths that are not yet explored.

Uniform cost search

Uniform cost search is a refinement of breadth first search. Whereas breadth first search always considers the path with the fewest steps first, uniform cost search prefers to consider paths with lower costs. An agent that follows uniform cost search orders its stack of paths to consider based on the length of the path. In the stack view of the example above, the length of the path is listed below the node. Since the agent always looks at the path with the shortest length first, as soon as the agent considers the goal node G, it has found the shortest path from start to goal.

Greedy best first search

Unlike the previous search algorithms, which were all uninformed search, greedy best first search is an informed search algorithm. That is, greedy best first search makes use of information that is not part of the problem description in the form of a heuristic. This heuristic gives an estimate for the cost of reaching the goal from any node. For example, a heuristic on a road map could be the straight-line distance between two cities. This is easier to calculate than the road distance, but still gives the agent some indication of whether it is looking in the right direction. This is the main idea of the heuristic: it should be easy to calculate, but still give information that is relevant to the search.

An agent that follows greedy best first search orders its stack based on the heuristic value, and therefore tries nodes with a low heuristic value first. That is, it first tries those nodes that are expected to be very close to the goal node already, and ignore those paths that go into the wrong direction. However, like depth first search, this can cause the agent to go around in circles. Another disadvantage of greedy best first search is that it needs a heuristic, which may not be easy to get.

A* search

Like greedy best first search, A* search is an informed search algorithm that makes use of a heuristic in addition to the problem description. But instead of relying only on the heuristic, A* search also looks at how long the path already is (similar to uniform cost search. That is, A* search makes an estimate of the total length of the path rather than the remaining distance, and orders the stack based on this estimate.

Like uniform cost search, A* search will find the shortest length path, but only if the heuristic follows a simple rule: the heuristic cannot overestimate the cost of actually reaching the goal. That is, the heuristic value for a node cannot be more than the actual cost of reaching the goal from that node. It can, however, be less. In fact, for the special case that the heuristic value is zero for all nodes, A* search is the same as uniform cost search. But for a more informative heuristic, A* search could be a lot more efficient in finding the path with the lowest cost.