<div class="notebook open-fullscreen"> <div class="nb-cell html" name="htm1"> <h1>A* Planning</h1> <small><em>Author: Paul Brown (<a href="https://twitter.com/PaulBrownMagic" target="_blank" title="Twitter">@paulbrownmagic</a>)</em></small> </div> <div class="nb-cell markdown" name="md1"> A* is a search algorithm using a heuristic to improve efficiency for most cases. We can use it for planning by considering a state of the world as a node and an action as an edge transitioning between nodes. Let's start with a scenario. ## Scenario Robbie the one-handed robot is in a hurry, it needs to vacuum the car before its owner goes to work. To do this it'll need to unlock and open the door to the garage, unlock and open the car door, take the vacuum to the car and use it. But Robbie's owner, who likes things clean and tidy, also expects everything returned to its proper place with doors locked before they wake up. ## Requirements For this to work we need to describe the world in a suitable manner so that we can use A*, then we'll need to code A* and run it to obtain a plan. ### Describing the World We have states of the world, which are comprised of fluents (things that change value between states). We have actions that connect these states, they're possible in some states and result in others. We don't want to encode every possible node and edge, so we'll encode these as rules. </div> <div class="nb-cell program" data-background="true" name="p2"> fluent(location(robbie, hallway)). fluent(location(car-key, hallway)). fluent(location(garage-key, hallway)). fluent(location(vacuum-cleaner, kitchen)). fluent(door(hallway-kitchen, unlocked)). fluent(door(kitchen-garage, locked)). fluent(door(garage-car, locked)). fluent(holding(nothing)). fluent(clean(car, false)). %facts, unlike fluents, don't change fact(home(car-key, hallway)). fact(home(garage-key, hallway)). fact(home(vacuum-cleaner, kitchen)). % s0, the initial situation, is the (ordered) set % of fluents s0(Situation) :- setof(S, fluent(S), Situation). % Take a list of Actions and execute them execute_process(S1, [], S1). % Nothing to do execute_process(S1, [Action|Process], S2) :- poss(Action, S1), % Ensure valid Process result(S1, Action, Sd), execute_process(Sd, Process, S2). % Does a fluent hold (is true) in the Situation? % This is the query mechanism for Situations % Use-case 1: check a known fluent holds(Fluent, Situation) :- ground(Fluent), ord_memberchk(Fluent, Situation), !. % Use-case 2: search for a fluent holds(Fluent, Situation) :- member(Fluent, Situation). % Utility to replace a fluent in the Situation replace_fluent(S1, OldEl, NewEl, S2) :- ord_del_element(S1, OldEl, Sd), ord_add_element(Sd, NewEl, S2). % Lots of actions to declare here... % Still less code than writing out the % graph we're representing % % Robbie's Action Repertoire : % - goTo(Origin, Destination) % - pickup(Item) % - drop(Item) % - put_away(Item) % the tidy version of drop % - unlock(Room1-Room2) % - lock(Room1-Room2) % - clean_car poss(goto(L), S) :- % If robbie is in X and the door is unlocked holds(location(robbie, X), S), ( holds(door(X-L, unlocked), S) ; holds(door(L-X, unlocked), S) ). poss(pickup(X), S) :- % If robbie is in the same place as X and not % holding anything dif(X, robbie), % Can't pickup itself! holds(location(X, L), S), holds(location(robbie, L), S), holds(holding(nothing), S). poss(put_away(X), S) :- % If robbie is holding X, it belongs in L % and robbie is in L (location(X, L) is implicit) holds(holding(X), S), fact(home(X, L)), holds(location(robbie, L), S). poss(drop(X), S) :- % Can drop something if holding it % Can't drop nothing! dif(X, nothing), holds(holding(X), S). poss(unlock(R1-R2), S) :- % Can unlock door between R1 and R2 % Door is locked holds(door(R1-R2, locked), S), % Holding the key to the room holds(holding(R2-key), S), % Located in one of the rooms ( holds(location(robbie, R1), S) ; holds(location(robbie, R2), S) ). poss(lock(R1-R2), S) :- % Can lock door R1-R2 % Only if it's locked, robbie has the key % and is in one of the rooms holds(door(R1-R2, unlocked), S), holds(holding(R2-key), S), ( holds(location(robbie, R1), S) ; holds(location(robbie, R2), S) ). poss(clean_car, S) :- % Robbie is in the car with the vacuum-cleaner holds(location(robbie, car), S), holds(holding(vacuum-cleaner), S). result(S1, goto(L), S2) :- % Robbie moves holds(location(robbie, X), S1), replace_fluent(S1, location(robbie, X), location(robbie, L), Sa), % If Robbie is carrying something, it moves too dif(Item, nothing), ( holds(holding(Item), S1), replace_fluent(Sa, location(Item, X), location(Item, L), S2) ; \+ holds(holding(Item), S1), S2 = Sa ). result(S1, pickup(X), S2) :- % Robbie is holding X replace_fluent(S1, holding(nothing), holding(X), S2). result(S1, drop(X), S2) :- % Robbie is no-longer holding X, % its location is not changed replace_fluent(S1, holding(X), holding(nothing), S2). result(S1, put_away(X), S2) :- % Robbie is no-longer holding X, % its location is not changed replace_fluent(S1, holding(X), holding(nothing), S2). result(S1, unlock(R1-R2), S2) :- % Door R1-R2 is unlocked replace_fluent(S1, door(R1-R2, locked), door(R1-R2, unlocked), S2). result(S1, lock(R1-R2), S2) :- % Door R1-R2 is locked replace_fluent(S1, door(R1-R2, unlocked), door(R1-R2, locked), S2). result(S1, clean_car, S2) :- % The car is clean replace_fluent(S1, clean(car, false), clean(car, true), S2). </div> <div class="nb-cell query" name="q2"> s0(S0), setof(A, poss(A, S0), PossibleActions). </div> <div class="nb-cell query" name="q3"> s0(S0), execute_process(S0, [goto(kitchen), pickup(vacuum-cleaner), goto(hallway)], S1), ord_subtract(S0, S1, Was), ord_subtract(S1, S0, Now), format("Was: ~w~nNow: ~w~n", [Was, Now]), !. </div> <div class="nb-cell markdown" name="md2"> OK, that's not an insignificant amount of code, we had a lot to declare. But we can now answer questions about what is possible in a situation and what the result of doing an action will be. So we can now search to reach a goal. The good news is, we've written a Depth-First search mechanism already in `execute_process/2`, the bad news is that'll fail miserably to find a plan by getting stuck in the `goto(kitchen), goto(hallway), goto(kitchen), goto(hallway)` loop immediately. ## Goals and an Iterative Deepening Aside Seen as we need predicates for goals, and we're only one other predicate away from iterative deepening, let's write that next to give us something to benchmark against. We'll express goals in the same manner as fluents. </div> <div class="nb-cell program" data-background="true" name="p3"> goal(clean(car, true)). goal(door(garage-car, locked)). goal(door(kitchen-garage, locked)). % Everything must be tidied away goal(location(X, L)) :- fact(home(X, L)). goal(holding(nothing)). % The Goal Situation is the (ordered) set of fluents that % describe a goal goal_situation(S) :- setof(G, goal(G), S). % Test to see if Situation satifies the Goal % Note that the Situation can contain fluents % not described in Goal reached_goal(GoalSituation, Situation) :- ord_subtract(GoalSituation, Situation, []). % [] -> no goals not in Situation % Checks if something is a proper list, % with a bonus use of generating lists % of increasing length list([]). list([_|T]) :- list(T). iterative_deepening_search(Process) :- s0(S0), goal_situation(GoalSituation), % generate a list (increasing length via failure) list(Process), % generate a solution execute_process(S0, Process, Result), % test solution reached_goal(GoalSituation, Result). </div> <div class="nb-cell query" name="q4"> time(iterative_deepening_search(Process)). </div> <div class="nb-cell markdown" name="md4"> So, when I run it I get "Time limit exceeded" from Pengines. In comparison, with only the goal of cleaning the car a solution is found in 81 seconds with 17 actions. The complexity of this problem is a combination of the length of steps to be taken and the revisiting of situations already discovered. Let's try and improve it via A*. ## A* Search So when we're doing this search, we're exploring our graph as a tree. Iterative deepening does a Depth-First Search of this tree to a limited depth, over and over again, thus approximating a breadth-first search in result, but not efficiency. As a tree we want to prune branches that are useless to follow. Typically we do this with costs associated with taking the action to find the shortest-path (Branch and Bound), but our actions have equal cost, so we'll set the cost of an action to 1 in our search. We also want to avoid revisiting situations we've already been in as a shorter path to those has already been found, so we'll cut those out and gain some pruning that way. This is the Extended List. The trick of A* is in it's use of a heuristic to order the nodes in the search in terms of a guess at how close to the goal they are. We can make such a heuristic by counting how many of the fluents that need to hold in the goal don't hold in the situation. So in the goal, that count will be 0. If you're thinking ahead, you'll notice that our example is a "there-and-back-again" kind of example, by which I mean many of the fluents that hold in the goal also hold in s0. Robbie will need to move away from the goal in order to reach it, so prior to the car been clean the heuristic will actually be working against the search. It would be irresponsible of me to choose an easy example! So we need to code our heuristic and the search. We'll use a module with heaps in to take care of ordering the goals for us. </div> <div class="nb-cell program" data-background="true" name="p1"> :- use_module(library(heaps)). % Use to order search heuristic_distance_to_goal(GoalSituation, Situation, Distance) :- ord_subtract(GoalSituation, Situation, Dif), length(Dif, Distance). % Add a Cost-Sit-Process triple to the search heap % of open nodes. Carrying Process for interest. add_to_open_nodes(AccCost, H1, Sit-Process, Goal, H2) :- heuristic_distance_to_goal(Goal, Sit, D), succ(AccCost, ActCost), % one action has been taken, so incr Priority is ActCost + D, % Priority cost add_to_heap(H1, Priority, ActCost-Sit-Process, H2). % Add a list of Sit-Process Pairs open_add_pairs(_, Heap, _, [], _, Heap). open_add_pairs(AccCost, H1, Sits, [S-P|T], G, H2) :- ( ord_memberchk(S, Sits) -> add_to_open_nodes(AccCost, H1, S-P, G, Hd) ; Hd = H1 ), open_add_pairs(AccCost, Hd, Sits, T, G, H2). % Convenience predicate to heap get_from_open_nodes(H1, Sit-Process, H2) :- get_from_heap(H1, _Priority, Sit-Process, H2). % convenience predicate to start a_star % and reverse the process for natural reading a_star(Sit, Process) :- s0(S0), goal_situation(GoalSituation), a_star(S0, GoalSituation, Sit-Answer), reverse(Answer, Process). % a_star search setup a_star(StartSituation, GoalSituation, Answer) :- % Create heap of open search nodes heuristic_distance_to_goal(GoalSituation, StartSituation, D), singleton_heap(Open, D, 0-StartSituation-[]), % Do the search a_star(Open, GoalSituation, [StartSituation], Answer). a_star(Open, GoalSituation, Closed, Answer) :- % Get the most promising Sit-Process pair get_from_open_nodes(Open, AccCost-Sit-Process, RemainingSearch), % If we've reached the goal, return the Answer ( reached_goal(GoalSituation, Sit), Answer = Sit-Process % Otherwise (or searching for other solutions), find the % reachable situations via some additional action A that is % recorded onto the process ; setof(S-[A|Process], (poss(A, Sit), result(Sit, A, S)), AS_Pairs), % Exclude visited nodes pairs_keys(AS_Pairs, Children), ord_union(Closed, Children, Closed1, Sits), % Add new open nodes (with tag-along processes) to search heap open_add_pairs(AccCost, RemainingSearch, Sits, AS_Pairs, GoalSituation, Open1), % Carry on searching a_star(Open1, GoalSituation, Closed1, Answer) ). </div> <div class="nb-cell query" name="q1"> time(once(a_star(S, P))). </div> <div class="nb-cell markdown" name="md3"> Well, that sure is a big difference! We've not even included all the possible Prolog optimizations, figuring the code was already hard enough to understand. Carrying along the Process with the Situation to reach it certainly doesn't help clarity, but it sure is useful for Robbie to know how to execute the plan. If we weren't interested in the shortest process to reach the goal, we'd be able to remove the accumulated costs from the A* algorithm and get a solution to this in about 0.2 seconds. Other variations include removing the heuristic or extended list: have fun! </div> </div>