AI class unit 2
These are my notes for unit 2 of the AI class.
Problem Solving
Introduction
In this unit we talk about problem solving: the theory and technology of building agents that can plan ahead to solve problems. In particular we're talking about problem solving where the complexity of the problem comes from the fact that there are many states, as in this problem here:
A navigation problem where there are many choices to start with, and the complexity comes from picking the right choice now, and picking the right choice at the next intersection, and the intersection after that: stringing together a sequence of actions.
This is in contrast to the type of complexity shown in this picture:
Where the complexity comes from the partial observability, that we can't see through the fog where the possible paths are or we can't see the results of our actions or even the actions themselves are not known. This type of complexity will be covered in another unit.
Here's an example of a problem:
This is a route finding problem, where we are given a start city, Arad, and a destination city, Bucharest (the capital of Romania), and the problem is to find a route from Arad to Bucharest. The actions that the agent can execute are driving from one city to the next along one of the roads shown on the map. The question is: given only the knowledge shown here, is there a solution that the agent can come up with to the problem of driving from Arad to Bucharest?
What is a Problem?
The answer is no. There is no solution that the agent could come up with, because Bucharest does not appear on the map. So the agent doesn't know any actions that can arrive there, so let's give the agent a better chance:
Now we've given the agent the full map of Romania. The start is in Arad; and the destination, or goal, is in Bucharest. The agent is given the problem of coming up with a sequence of actions that will arrive at the destination. Now is it possible for the agent to solve this problem?
The answer is yes, there are many routes, or steps, or sequences of actions, that will arrive at the destination, here is one of them:
Now let's formally define what a problem looks like. A problem can be broken down into a number of components.
First, the initial state that the agent starts out with. In our route finding problem the initial state was the agent being in the city of Arad.
Next, a function Actions that takes a state as input and returns a set of possible actions that the agent can execute when the agent is in this state. In some problems agents will have the same actions available in all states and in other problems they'll have different actions dependent on the state. In the route finding problem the actions are dependent on the state, when we're in one city we can take the routes to the neighbouring cities but we can't go to any other cities.
Next, we have a function called Result which takes as input a state and an action and delivers as its output a new state. So for example, if the agent is in the city of Arad (that would be the state) and takes the action of driving along route E671 toward Timisoara then the result of applying that action in that state would be the new state where the agent is in the city of Timisoara.
Next we need a function called GoalTest which takes a state and returns a boolean value (true or false) telling us if this state is a goal or not. In a route finding problem the only goal would be being in the destination city, the city of Bucharest, and all the other states would return false for the GoalTest.
Finally we need one more thing, which is a PathCost function, which takes a path, a sequence of state-action transitions, and returns a number which is the cost of that path. Now for most of the problems we deal with we'll make the PathCost function additive, so that the cost of a path is just the sum of the costs of the individual steps. So we'll implement the PathCost function in terms of a StepCost function. The StepCost function takes a state, an action, and the state that results from that action, and returns a number n which is the cost of that action. In the route finding example the cost might be the number of miles travelled, or maybe the number of minutes it takes to get to that destination.
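Here's a minimal sketch of those components in Python, assuming we represent states as city names and actions as the name of the city we drive to; the class shape and the snake_case names are mine, not from the course:

```python
class Problem:
    """The components of a problem, for the route finding domain."""
    def __init__(self, initial, goal, roads):
        self.initial = initial      # initial state, e.g. 'Arad'
        self.goal = goal            # e.g. 'Bucharest'
        self.roads = roads          # dict: city -> {neighbouring city: distance}

    def actions(self, state):
        # Actions depend on the state: we can only drive to neighbours.
        return list(self.roads[state])

    def result(self, state, action):
        # Driving toward a neighbouring city leaves us in that city.
        return action

    def goal_test(self, state):
        return state == self.goal

    def step_cost(self, state, action, result):
        # Cost of one step, here the road distance in kilometers.
        return self.roads[state][result]
```

PathCost then falls out as the sum of step costs along a path, as described above.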
Example: Route Finding
Now let's see how the definition of a problem maps on to the route finding domain. First the initial state we're given: let's say we start off in Arad. And the goal test: let's say that the state of being in Bucharest is the only state that counts as a goal, and all other states are not goals.
Now the set of all the states here is known as the state space, and we navigate the state space by applying actions. The actions are specific to each city, so when we're in Arad there are three possible actions (following each of the connecting roads).
As we follow roads we build paths, or sequences of actions. So just being in Arad is the path of length zero and now we could start exploring the space and add in various paths of length one, length two, length three, etc.
Now at every point we want to separate the state space out into three parts. First, the ends of the paths, the farthest points that have been explored, we call the Frontier. Then to the left of that in this diagram we have the Explored part of the state space, and off to the right we have the Unexplored part.
One more thing: in this diagram we've labelled the step cost of each action along the route. So the step cost of going between Neamt and Iasi would be 87, corresponding to a distance of 87 kilometers. And the cost of the path going from Arad to Oradea (via Zerind) would be 75 + 71 = 146.
Tree Search
Now let's define a function for solving problems. It's called Tree Search because it superimposes a search tree over the state space. Here's how it works:
It starts off by initialising the frontier to be the path consisting of only the initial state. Then it goes into a loop in which it first checks whether we still have anything left in the frontier; if not we fail, there can be no solution. If we do have something then we make a choice -- and Tree Search is really a family of functions, not a single algorithm, depending on how we make that choice, and we'll see some of the options later -- we go ahead and make a choice of one of the paths from the frontier, and remove that path from the frontier. We find the state which is at the end of the path, and if that state is a goal then we're done, we found a path to the goal. Otherwise we do what's called expanding that path: we look at all the actions available from that state, and for each action we build a new path consisting of the old path, the action, and the state that results from that action; and we stick all of those new paths back on to the frontier.
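As a rough Python sketch, using the Problem class sketched earlier (the list-based frontier and the particular choice of which path to remove are mine for illustration):

```python
def tree_search(problem):
    # The frontier holds paths; a path is represented here as a list of
    # states (the actions are omitted for brevity).
    frontier = [[problem.initial]]
    while frontier:
        # The choice of which path to remove here is what distinguishes
        # the members of the Tree Search family; pop(0) takes the oldest
        # path, which happens to give breadth-first behaviour.
        path = frontier.pop(0)
        state = path[-1]                  # the state at the end of the path
        if problem.goal_test(state):
            return path                   # found a path to the goal
        for action in problem.actions(state):
            frontier.append(path + [problem.result(state, action)])
    return None                           # frontier empty: failure
```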
Now Tree Search represents a whole family of algorithms, and where you get the family resemblance is that they're all looking at the frontier, popping items off, and checking whether they pass the goal test; but where you get the difference is in the choice of which item on the frontier to expand next. Which path do we look at first? We'll go through different sets of algorithms that make different choices for which path to look at first.
The first algorithm to consider is called Breadth-First Search. Now it could be called "Shortest First Search", because what it does is always takes off the frontier one of the paths that hasn't been considered yet that is the shortest possible.
So how does it work? Well we start off with the path of length zero, starting in the start state:
And that's the only path in the frontier, so it's the shortest one, so we pick it and then we expand it. We add in all the paths that result from applying all the possible actions. So now we've removed the first path from the frontier, but we've added in three new paths:
Now we're in a position where we have three paths on the frontier, and we have to pick the shortest one. Now in this case all three paths have the same length, length 1. So we break the tie at random -- or using some other technique -- and let's suppose that in this case we choose the path from Arad to Sibiu. So once the path from Arad to Sibiu is taken off the frontier which paths are going to be added to the frontier?
The answer is that in Sibiu the action function gives us four actions, corresponding to travelling along each of the four roads. So we have to add in paths for each of those actions:
The paths are:
- Arad to Sibiu to Oradea
- Arad to Sibiu to Fagaras
- Arad to Sibiu to Rimnicu Vilcea
- Arad to Sibiu to Arad
Now it may seem silly and redundant to have a path that starts in Arad, goes to Sibiu, and then returns to Arad... how can that help us get to our destination in Bucharest? But if we're dealing with a tree search we can see why it's natural to have this type of formulation, and why the tree search doesn't even notice that it's backtracked. What the tree search does is superimpose on top of the state space a tree of searches, and the tree looks like this:
We start off in state A, from which there were three actions giving us paths to Z, S and T. And from S there were four actions which gave us paths to O, F, R, A. Notice that we returned to A in the state space, but in the tree it's just another item in the tree.
Here's another representation of the search space:
What's happening is that as we start to explore the state we keep track of the frontier which is the set of states that are at the ends of the paths that we haven't explored yet, and behind that frontier is the set of explored states, and ahead of the frontier is the unexplored states.
Now the reason we keep track of the explored states is so that when we expand a path we can recognise duplicates. Say we expanded from L and pointed back to state T: if we hadn't kept track of that we would have to add in a new state for T below L:
But because we've already seen T, we know that going there would be a step back into the already explored part of the state space; and because we've kept track of that, we don't need to add it again.
Graph Search
Now we see how to modify the Tree Search function to make it be a Graph Search function, to avoid those repeated paths.
What we do is start off by initialising a set, called the Explored Set, of states that we've already explored. Then whenever we expand a path we add its end state to the explored set. And when we're expanding a path and adding new states to the end of it, we don't add a new state if we've already seen it in either the frontier or the explored set.
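The change from the tree search sketch above is small; here's a hedged sketch of the same loop with the graph search modification:

```python
def graph_search(problem):
    frontier = [[problem.initial]]
    explored = set()
    while frontier:
        path = frontier.pop(0)
        state = path[-1]
        if problem.goal_test(state):
            return path
        explored.add(state)               # remember states we have expanded
        in_frontier = {p[-1] for p in frontier}
        for action in problem.actions(state):
            new_state = problem.result(state, action)
            # The graph search modification: skip states we have already
            # seen in either the explored set or the frontier.
            if new_state not in explored and new_state not in in_frontier:
                frontier.append(path + [new_state])
    return None
```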
Now back to Breadth-First Search, and let's assume that we're using Graph Search, so that we've eliminated the duplicate paths. The path that goes from Arad to Sibiu back to Arad is removed. We're left with five possible paths. Given these five paths, show which ones are candidates to be expanded next by the Breadth-First Search algorithm:
Breadth First Search
Breadth First Search always considers the shortest paths first, and in this case there are two paths of length one, Arad to Zerind and Arad to Timisoara, so those would be the two paths that would be considered. Now let's suppose that the tie is broken in some way and we choose the path from Arad to Zerind. So we want to expand that node (Zerind): we remove it from the frontier and put it in the explored list, and now we ask, what paths are we going to add?
In this case there is nothing to add, because of the two neighbours one is in the explored list and the other is in the frontier, so if we're using Graph Search then we won't add either of those.
So we move on and look for another shortest path. There's one path left of length one, so we expand it, add in the path to Lugoj, and put Timisoara on the explored list. Now we've got three paths of length two, and we choose one of them. Let's say we choose the path to Fagaras. Which states do we add to the path, and does the algorithm terminate now because we've reached the goal, or do we continue?
The answer is that we add one more path, the path to Bucharest, we don't add the path going back from Fagaras because it's in the explored list. But we don't terminate yet. True, we have added a path that ends in Bucharest, but the goal test isn't applied when we add a path to the frontier, rather it's applied when we remove a path from the frontier, and we haven't done that yet.
Now why doesn't the general Tree Search or Graph Search algorithm stop when it adds a goal node to the frontier? The reason is that it might not be the best path to the goal. Here we found a path of length two, and we added a path of length three that reached the goal. A general Graph Search or Tree Search doesn't know whether there might be some other path that we could expand that has a distance of, say, 2.5. But there's an optimisation that can be made: if we know we're doing Breadth First Search, then we know there's no possibility of a path of length 2.5, so we could change the algorithm to check states as soon as they're added to the frontier rather than waiting until they're expanded. In that case we can write a specific Breadth First Search routine that terminates early and gives us a result as soon as we add a goal state to the frontier.
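Here's a sketch of that specialised breadth-first routine with the early goal test, again using the Problem class from earlier (the deque and the combined `seen` set are implementation choices of mine):

```python
from collections import deque

def breadth_first_search(problem):
    if problem.goal_test(problem.initial):
        return [problem.initial]
    frontier = deque([[problem.initial]])
    seen = {problem.initial}          # states already in frontier or explored
    while frontier:
        path = frontier.popleft()     # shortest (oldest) path first
        state = path[-1]
        for action in problem.actions(state):
            new_state = problem.result(state, action)
            if new_state not in seen:
                # The breadth-first-specific optimisation: test the goal as
                # soon as a state is added, since no later path can be shorter.
                if problem.goal_test(new_state):
                    return path + [new_state]
                seen.add(new_state)
                frontier.append(path + [new_state])
    return None
```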
Breadth First Search will find this path that ends up in Bucharest and if we're looking for the shortest path in terms of number of steps Breadth First Search is guaranteed to find it. But, if we're looking for the shortest path in terms of total cost, by adding up the step costs, then it turns out there is another path that is shorter. So let's look at how we could find that path.
Uniform Cost Search
An algorithm that has traditionally been called Uniform Cost Search, but could be called "Cheapest First Search", is guaranteed to find the path with the cheapest total cost. Let's see how it works.
We start out as before in the start state:
And we pop that initial path of length zero off, moving it from the frontier to explored:
Then add in the paths out of that state:
As before there will be three of those paths. And now, which path are we going to pick next to expand, according to the rules of cheapest first?
Cheapest-first says we should pick the path with the lowest total cost. That would be Arad to Zerind, which has a cost of 75 compared to 118 and 140 for the other paths. So we get to Zerind and we take that path off the frontier and put it on the explored list, add in its neighbours, not going back to Arad, but adding in Zerind to Oradea. Summing up the total cost of that path gives 75 + 71 = 146 for Arad to Zerind to Oradea.
Now the question is which path gets expanded next?
Of the three paths on the frontier we have paths with cost 146, 140 and 118. 118 is the cheapest, so Arad to Timisoara gets expanded. We take it off the frontier, move it to explored, and add in the successors, which in this case is only one (Lugoj), and that has a path total of 229.
Which path do we expand next? Well we've got 146, 140 and 229. So 140 is the lowest, so we take Sibiu off the frontier and add it to explored, and add in the path to Rimnicu Vilcea for a path cost of 220, and the path to Fagaras with total cost of 239.
So the question is which path do we expand next?
The answer is Oradea with a path cost of 146. Take it off the frontier and put it on explored, but there's nothing to add because both of its neighbours have already been explored.
Which path do we look at next?
The answer is Rimnicu Vilcea. 220 is less than 229 and 239. Take it off the frontier, put it on explored. Add in (to the frontier) two more paths, to Pitesti (366) and Craiova (317). Now notice that we're closing in on Bucharest: we've got two neighbours almost there, but neither of them has had its turn yet. Instead the cheapest path is Arad to Timisoara to Lugoj for 229, so take that off the frontier and add it to the explored list. Add 70 to the path cost so far for 299 to Mehadia.
Now the cheapest node is 239 at Fagaras so we expand and get a path to Bucharest for 460.
Now the question is: are we done? Can we terminate the algorithm?
And the answer is no, we are not done yet. We've put Bucharest, the goal state, onto the frontier, but we haven't popped it off the frontier yet; and the reason is because we've got to look around and see if there is a better path that can reach Bucharest. (There was a mistake in the video for the rest of the path, Pitesti should have already been off the frontier).
Eventually we pop off all of the paths and we'd get to the point where the 418 path was popped off the frontier and the cheapest path to Bucharest is known.
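Here's a sketch of cheapest-first search using Python's heapq as the priority queue (the counter is just a tie-breaker so that equal-cost entries never compare the path lists themselves):

```python
import heapq
import itertools

def uniform_cost_search(problem):
    counter = itertools.count()
    # Frontier entries are (total path cost g, tie-breaker, path).
    frontier = [(0, next(counter), [problem.initial])]
    explored = set()
    while frontier:
        cost, _, path = heapq.heappop(frontier)   # cheapest path first
        state = path[-1]
        if problem.goal_test(state):
            return path, cost     # goal test on removal, as described above
        if state in explored:
            continue              # a cheaper path here was already expanded
        explored.add(state)
        for action in problem.actions(state):
            new_state = problem.result(state, action)
            if new_state not in explored:
                step = problem.step_cost(state, action, new_state)
                heapq.heappush(frontier,
                               (cost + step, next(counter), path + [new_state]))
    return None
```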
Search Comparison
So we've looked at two search algorithms. The first is Breadth First Search in which we always expand first the shallowest paths, the shortest paths. The second is Cheapest First Search in which we always expand the path with the lowest total cost. And we take this opportunity to introduce a third algorithm called Depth First Search which is in a way the opposite of Breadth First Search. In Depth First Search we always expand first the longest path -- the path with the most links in it.
Note: "optimal" above indicates that the algorithm will find a path and that path will be the "shortest" path, where for Cheapest First that means the path with the lowest total cost, and for Breadth First Search and Depth First Search the "shortest" path is the one with the least number of steps.
Given the non-optimality of Depth First Search, why would anyone choose to use it? Well the answer has to do with the storage requirements. Above we've illustrated a state space consisting of a very large or even infinite binary tree. As we go to levels 1, 2, 3 down to level n the tree gets larger and larger. Now let's consider the frontier for each of these search algorithms:
For Breadth First Search we know that the frontier looks as shown above. So when we get down to level n we require storage space for 2^n paths in a Breadth First Search.
For Cheapest First Search the frontier is going to be more complicated. It's going to sort of work out this contour of cost, but it's going to have a similar number of nodes to Breadth First Search.
For Depth First Search at any point our frontier is only going to have n nodes, as opposed to 2^n, so that's a substantial savings for Depth First Search. Now of course if we're also keeping track of the explored set then we don't get that much savings. But without the explored set Depth First Search has a huge advantage in terms of space saved.
One more property of the algorithms to consider is the property of completeness, meaning if there is a goal somewhere will the algorithm find it? So let's move from very large trees to infinite trees and let's say that there's some goal hidden somewhere deep down in that tree. So the question is are each of these algorithms complete? That is, are they guaranteed to find a path to the goal?
The answer is that Breadth First Search is complete. Even if the tree is infinite, if the goal is placed at any finite level, eventually we're going to march down and find that goal.
Same with Cheapest First Search, no matter where the goal is if it has a finite cost eventually we're going to go down and find it.
But not so for Depth First Search. If there's an infinite path Depth First Search will keep following that. So it will keep going down and down and down along that path and never get to the path on which the goal sits, so Depth First Search is not complete.
More on Uniform Cost Search
Let's try to understand a little better how Uniform Cost Search works. We start at a start state, and then we start expanding out from there, looking at different paths, and what we end up doing is expanding in terms of contours, like on a topographic map:
Where first we expand out to a certain distance, then to a farther distance, and then to a farther distance.
Now at some point we meet up with the goal:
Now we've found a path from the start to the goal, but notice that the search really wasn't directed in any way toward the goal. It was expanding out everywhere in the space, and depending on where the goal is we should expect to have to explore half the space on average before we find the goal. If the space is small that can be fine, but when the space is large that won't get us to the goal fast enough.
Unfortunately, with only what we know so far, there's really nothing we can do better than that. So if we want to improve, if we want to be able to find the goal faster, we're going to have to add more knowledge.
The type of knowledge that's proven most useful in search is an estimate of the distance from a state to the goal. So let's say we're dealing with a route finding problem where we can move in any direction (up or down, right or left), and we take as our estimate the straight-line distance between a state and the goal, and we'll try to use that estimate to find our way to the goal fastest.
Now an algorithm called Greedy Best-First Search does exactly that. It expands first the path that is closest to the goal according to the estimate. So what do the contours look like in this approach?
Well we start at S and we look at all the neighbouring states and the ones that appear to be closest to the goal we would expand first. So we start expanding and that leads us directly to the goal. So now instead of exploring whole circles that go out everywhere in the search space, our search is directed toward the goal. In this case it gets us immediately to the goal, but that won't always be the case if there are obstacles along the way.
Consider this search space:
We have a start state and a goal, and there's an impassable barrier. Now Greedy Best-First Search will start expanding out as before, trying to get towards the goal, and when it reaches the barrier, what will it do next? Well, it will try to continue along paths that keep getting closer and closer to the goal. So it won't consider going back and up, which is farther from the goal; rather it will continue expanding out along the bottom lines, which always get closer and closer to the goal:
Eventually it will find its way toward the goal. So it does find a path, and it does it by expanding a small number of nodes, but it's willing to accept a path that is longer than other paths. Now if we had explored in the other direction we could have found a much simpler path, a much shorter path, by just popping over the barrier and then going directly to the goal. But Greedy Best-First Search wouldn't have done that, because that would have involved getting to the barrier and then considering states which are farther from the goal.
What we would really like is an algorithm that combines the best parts of Greedy Search which explores a small number of nodes in many cases, and Uniform Cost Search which is guaranteed to find a shortest path. We'll learn how to do that next with something called the A* (A-star) algorithm.
A* Search
A* Search works by always expanding the path that has the minimum value of the function f, defined as the sum f = g + h. The function g of a path is just the path cost so far. And the function h of a path is the h-value of the final state of the path, which is the estimated distance from that state to the goal.
Here's an example of how A* works. Suppose we found this path through the state space to the state X, and we're trying to give a measure to the value of this path. The measure f is the sum of g, the path cost so far, and h, the estimated distance remaining from X to the goal.
Now minimising g helps us keep the paths short, and minimising h helps us keep focused on finding the goal. The result is a search strategy that is the best possible, in the sense that it finds the lowest cost path while expanding the minimum number of paths possible. It could be called Best Estimated Total Path Cost First, but the name A* is traditional.
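As a sketch, A* is the uniform cost search loop with the priority changed from g to f = g + h, where h is passed in as a function over states (for Romania it would just look up the straight-line distances mentioned below):

```python
import heapq
import itertools

def a_star_search(problem, h):
    counter = itertools.count()
    # Frontier entries are (f = g + h, g, tie-breaker, path).
    frontier = [(h(problem.initial), 0, next(counter), [problem.initial])]
    explored = set()
    while frontier:
        f, g, _, path = heapq.heappop(frontier)   # minimum f first
        state = path[-1]
        if problem.goal_test(state):
            return path, g
        if state in explored:
            continue
        explored.add(state)
        for action in problem.actions(state):
            new_state = problem.result(state, action)
            if new_state not in explored:
                g2 = g + problem.step_cost(state, action, new_state)
                heapq.heappush(frontier,
                               (g2 + h(new_state), g2, next(counter),
                                path + [new_state]))
    return None
```

Note that Greedy Best-First Search from the previous section is the same loop with priority h alone, and Uniform Cost Search is the same loop with priority g alone.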
Now let's go back to Romania and apply the A* algorithm. We're going to use a heuristic which is a straight-line distance between a state and the goal. The goal again is Bucharest so the distance from Bucharest to Bucharest is zero. For all the other states the straight-line distance is written on the map.
Now I should say that all the roads here are drawn as straight lines, but actually roads are going to be curved to some degree, so the actual distance along the roads is going to be longer than the straight-line distance.
Now we start out as usual from Arad as the start state. And we'll expand out Arad so we'll add three paths:
The evaluation function f will be the sum of the path cost and the estimated distance. So for Arad to Sibiu, f = 140 + 253 = 393; for Arad to Zerind, 75 + 374 = 449; and for Arad to Timisoara, 118 + 329 = 447. Now the question is, out of all the paths that are on the frontier, which path will we expand next under the A* algorithm?
Let's go ahead and expand the path at Sibiu now, so we're going to add three paths. The path to Oradea has a path cost of 291 and estimated distance to the goal of 380 for a total of 671. The path to Fagaras has a path cost of 239 and an estimated distance of 176 for a total of 415. The path to Rimnicu Vilcea is 220 + 193 = 413. And now the question is which state do we expand next?
The answer is that we expand the node at Rimnicu Vilcea next, because its total of 413 is less than all the other ones on the frontier.
So we expand the node at Rimnicu Vilcea, giving us two more paths: the node at Pitesti with an f-value of 417 and the node at Craiova with an f-value of 526. So the question is, which path are we going to expand next?
The answer is that we expand the node at Fagaras next, because its f-value is less than all the other paths in the frontier.
Now we expand Fagaras and we get a path that reaches the goal, with a path length of 450 and an estimated distance of 0, for a total f-value of 450. Now the question is what we do next: are we at the end of the algorithm, or is there a node we should expand next?
The answer is that we're not done yet, because the algorithm works by doing the goal test when we take a path off the frontier, not when we put a path on the frontier. Instead we just continue in the normal way and choose the node on the frontier with the lowest value, and that would be the path through Pitesti with a total of 417.
So let's expand the node at Pitesti. We could go down to Craiova, but that's a path we've seen before. Then we go in the direction of Bucharest and we reach our goal: the h-value is 0 because we're at the goal, and the g-value works out to 418. Now again we don't stop here just because we've put a path onto the frontier; the goal test isn't applied yet. But when we go back to the frontier it turns out that this 418 is the lowest cost path on the frontier. So now we pull it off, do the goal test, and we've found our path to the goal, and it is in fact the shortest possible path.
In this case A* was able to find the lowest cost path. Now the question, which you'll have to think about, because we haven't explained it yet, is whether A* will always do this? Answer 'yes' if you think A* will always find the shortest cost path, or answer 'no' if you think it depends on the particular problem given, or answer 'no' if you think it depends on the particular heuristic estimate function h.
The answer is that it depends on the h function. A* will find the lowest cost path if the h function for a state is less than or equal to the true cost of the path to the goal through that state. In other words, we want h to never overestimate the distance to the goal. We also say that such an h is optimistic. Another way of stating that is that h is admissible, meaning that it's admissible to use it to find the lowest cost path. So think of all of these as being the same way of stating the conditions under which A* finds the lowest cost path.
Optimistic Heuristics
Here we give you an intuition as to why an optimistic heuristic function h finds a lowest cost path. When A* ends it returns a path p with estimated cost c, and it turns out that c is also the actual cost, because at the goal the h component is zero and so the path cost is the total cost as estimated by the function.
Now all the paths on the frontier have an estimated cost that's greater than or equal to c, and we know that because the frontier is explored in cheapest-first order. But if h is optimistic then the estimated cost of a path is less than or equal to its true cost, so the path p must have a cost that's less than or equal to the true cost of any of the other paths on the frontier. And any path that goes beyond the frontier must cost at least as much as the frontier path it extends, because we agreed that step costs are always zero or more. So the path p must be a minimal cost path.
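To see the chain of comparisons at a glance (writing end(q) for the final state of a frontier path q; the notation is mine):

- f(q) = g(q) + h(end(q)) >= c, because paths come off the frontier in cheapest-first order, and p came off with estimated cost c;
- the true cost of any solution through q is >= f(q), because h never overestimates the remaining cost;
- therefore the true cost of any solution through q is >= c, the actual cost of p.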
Now this argument, I should say, only goes through just as-is for tree search. For graph search the argument is slightly more complicated, but the general intuitions hold the same.
State Space
NOTE: video missing. Looks like a copy and paste error when I was making the notes.
So far we've looked at the state space of cities in Romania, a two-dimensional physical space. But the technology for problem solving through search can deal with many types of state spaces, dealing with abstract properties, not just (x,y) position in a plane.
Here we introduce another state space: the vacuum world. It's a very simple world in which there are only two positions, as opposed to the many positions in the Romanian state space, but there are additional properties to deal with as well. So the robot vacuum cleaner can be in either of the two positions, but as well as that, each of the positions can either have dirt in it or not have dirt in it. And now the question is, to represent this as a state space, how many states do we need?
The answer is that there are 8 states. There are two physical positions that the robot vacuum cleaner can be in, either position A or position B, but in addition to that the state captures how the world is, as well as where the robot is in the world. So, position A can be dirty or not, that's two possibilities. And B can be dirty or not, that's two more possibilities. When you multiply those together we get 2 x 2 x 2 = 8 possible states.
Here's a diagram of the state space for the vacuum world:
Note that there are 8 states, and we have the actions connecting these states, just as we did in the Romanian problem. Now let's look at a path through this state space. Let's say we start in the top left and apply the action of moving right; then we end up in a position where the state of the world looks the same, except the robot has moved from position A to position B. Now if we apply the action "turn on sucking action", then we end up in a state where the robot is in the same position, but that position is no longer dirty.
Let's take this very simple vacuum world:
And make a slightly more complicated one. First we'll say the robot has a power switch which can be in one of three conditions: on, off, or sleep. Next we'll say that the robot has a dirt sensing camera, and that camera can either be on or off. Third -- this is the deluxe model of robot -- the brushes that clean up the dust can be set at one of five different heights, to be appropriate for whatever level of carpeting you have. And finally, rather than just having the two positions, we'll extend that out and have ten positions. And now the question is, how many states are in this state space?
The answer is that the number of states is the cross-product of the numbers of values of all the variables, since they are each independent and any combination can occur. For the power we have 3 possible positions; the camera has 2; the brush height has 5; the dirt has 2 possibilities for each of the 10 positions, so that's 2^10, or 1024; and then the robot itself can be in any of those 10 positions as well. That works out to 3 x 2 x 5 x 1024 x 10 = 307,200 states in the state space. And notice how a fairly trivial problem, where we're only modelling a few variables and only 10 positions, works out to a large number of states. That's why we need efficient algorithms for searching through state spaces.
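A quick arithmetic check of that count (the variable names are mine):

```python
positions = 10
power, camera, brush = 3, 2, 5
dirt = 2 ** positions          # each of the 10 positions dirty or clean
print(power * camera * brush * dirt * positions)   # 307200
```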
Sliding Blocks Puzzle
Here we introduce one more problem that can be solved with search techniques, and this is a sliding blocks puzzle, called the 15 puzzle. You may have seen something like this. So there are a bunch of little squares, or blocks, or tiles; and you can slide them around. The goal is to get into a certain configuration. So we'll say this is the goal state:
Where the numbers 1 to 15 are in order, left to right, top to bottom. And the starting state would be some state where all the positions are messed up. Now the question is: can we come up with a good heuristic for this? Let's examine that as a way of thinking about where heuristics come from. The first heuristic we're going to consider we'll call h1, and that's equal to the number of misplaced blocks. The second heuristic, h2, is equal to the sum of the distances that each block would have to move to get to the right position. Now the question is: which, if any, of these heuristics are admissible?
h1 is admissible because every tile that's in the wrong position must be moved at least once to get into the right position, so h1 never overestimates. How about h2? h2 is also admissible, because every tile in the wrong position can be moved closer to the correct position no faster than one space per move. Therefore, both are admissible; but notice that h2 is always greater than or equal to h1. That means that, with the exception of breaking ties, an A* search using h2 will always expand fewer paths than one using h1.
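Here's a sketch of the two heuristics in Python, assuming we represent a board as a dict mapping each tile number to its (row, column) position, with the blank left out; the representation is my choice, not from the course:

```python
def h1(state, goal):
    """Number of misplaced tiles: each must move at least once."""
    return sum(1 for tile in state if state[tile] != goal[tile])

def h2(state, goal):
    """Sum of Manhattan distances: a tile gains at most one square
    of progress per move, so this never overestimates either."""
    total = 0
    for tile, (row, col) in state.items():
        goal_row, goal_col = goal[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total

def h_combined(state, goal):
    """The general max-combination pattern discussed at the end of this
    section; here h2 already dominates h1, so the max equals h2."""
    return max(h1(state, goal), h2(state, goal))
```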
Now we're trying to build an artificial intelligence that can solve problems like this all on its own. You can see that the search algorithms do a great job of finding solutions to problems like this. But, you might complain that in order for the search algorithms to work we had to provide it with a heuristic function. The heuristic function came from the outside. You might think that coming up with a good heuristic function is really where all the intelligence is. And so a problem solver that uses a heuristic function given to it really isn't intelligent at all. So let's think about where the intelligence could come from, and can we automatically come up with good heuristic functions?
Here we sketch a description of a program that can automatically come up with good heuristics given a description of a problem. Suppose this program is given a description of the sliding blocks puzzle where we say that a block can move from square A to square B if A is adjacent to B and B is blank. Now imagine that we tried to loosen this restriction: we cross out "B is blank", and then we get the rule "a block can move from A to B if A is adjacent to B". And that gives us our heuristic h2, because a block can move to any adjacent square. We could also cross out the other part of the rule, and then we get "a block can move from any square A to any square B", regardless of any condition; and that gives us heuristic h1.
So we see that both of our heuristics can be derived from a simple mechanical manipulation of the formal description of the problem. Once we've generated these candidate heuristics automatically, another way to come up with a good heuristic is to say that a new heuristic h is equal to the maximum of h1 and h2. That's guaranteed to be admissible as long as h1 and h2 are admissible, because it still never overestimates; and it's guaranteed to be at least as good, because it's getting closer to the true value. The only problem with combining multiple heuristics like this is that there is some cost to compute the heuristic, and it could take longer to compute even if we end up expanding fewer paths.
Now crossing out parts of the rules like this is called generating a relaxed problem. What we've done is take the original problem, where it's hard to move squares around, and made it easier by relaxing one of the constraints. You can think of that as adding new links in the state space: if we have a state space in which there are only particular links, relaxing the problem is as if we added new operators that traverse the state space in new ways. Adding new operators only makes the problem easier, so the cost of solving the relaxed problem never overestimates the cost of solving the original, and thus the heuristic is admissible.
Problems with Search
We've seen what search can do for problem solving. It can find the lowest cost path to a goal, it can do so while never generating more paths than it has to, and it can do that with a heuristic function that we generate on our own by relaxing the existing problem definition. But let's be clear on what search can't do. All the solutions that we have found consist of a fixed sequence of actions; in other words the agent in Arad thinks, comes up with a plan that it wants to execute, and then essentially closes its eyes and starts driving, never considering along the way whether something has gone wrong. That works fine for this type of problem, but only when we satisfy the following conditions:
Problem solving technology works when the following set of conditions is true:

- First, the domain must be fully observable; in other words, we must be able to see what initial state we start out with.
- Second, the domain must be known; that is, we have to know the set of actions available to us.
- Third, the domain must be discrete; there must be a finite number of actions to choose from.
- Fourth, the domain must be deterministic; we have to know the result of taking an action.
- Finally, the domain must be static; there must be nothing else in the world that can change the world except our own actions.

If all of these conditions are true then we can search for a plan which solves the problem and is guaranteed to work. In later units we will see what to do if any of these conditions fail to hold.
A Note on Implementation
Our description of the algorithms has talked about paths in the state space. Now let's talk a little about how to implement that in a computer program. We talk about paths, but in implementations we talk about nodes. A node is a data structure with four fields. The state field indicates the state at the end of the path. The action field is the action it took to get there. The cost field is the total cost of the path. And the parent field is a pointer to another node.
So we have a linked list of nodes representing a path. We'll use the word "path" for the abstract idea, and the word "node" for the representation in the computer memory; but otherwise you can think of those two terms as being synonyms because they're in a one-to-one correspondence.
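A minimal sketch of the node structure and of recovering a path from it (the field names follow the description above; the helper function is mine):

```python
class Node:
    """One link in the chain of nodes representing a path."""
    def __init__(self, state, action=None, cost=0, parent=None):
        self.state = state      # state at the end of the path
        self.action = action    # action taken to get here
        self.cost = cost        # total cost of the path so far
        self.parent = parent    # pointer to the previous node, or None

def path_states(node):
    """Follow parent pointers back to recover the path as a list of states."""
    states = []
    while node is not None:
        states.append(node.state)
        node = node.parent
    return list(reversed(states))
```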
Now there are two main data structures that deal with nodes: the frontier and the explored list. Let's talk about how to implement them. In the frontier, the operations we have to deal with are removing the best item and adding in new ones, and that suggests we should implement it as a priority queue, which knows how to keep track of the best items in the proper order. But we also need an additional operation, a membership test (is a new state already in the frontier?), and that suggests representing it as a set, which can be built from a hash table or a tree. So the most efficient implementations of search actually have both representations. The explored set, on the other hand, is easier: all we have to do there is add new members and check for membership, so we represent it as a single set, which again can be done with either a hash table or a tree.
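A sketch of that dual frontier representation (priority queue plus membership structure); details like the tie-breaking counter are mine, and a production version would also handle stale heap entries when a state's priority improves:

```python
import heapq
import itertools

class Frontier:
    """Priority queue for 'remove the best' plus a dict for membership tests."""
    def __init__(self):
        self.heap = []
        self.members = {}                  # state -> priority currently queued
        self.counter = itertools.count()   # tie-breaker: nodes aren't comparable

    def add(self, node, priority):
        heapq.heappush(self.heap, (priority, next(self.counter), node))
        self.members[node.state] = priority

    def pop(self):
        priority, _, node = heapq.heappop(self.heap)
        self.members.pop(node.state, None)
        return node

    def __contains__(self, state):
        return state in self.members
```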