<div class="notebook"> <div class="nb-cell markdown" name="md1"> # An introduction to Prolog for SQL programers Part 3 *By Robert Laing* Following on from [Part 1](https://swish.swi-prolog.org/p/sql2prolog.swinb) and [Part 2](https://swish.swi-prolog.org/p/sql2prolog2.swinb), this is a translation into Prolog of the examples given in Stanford's [Databases: OLAP and Recursion](https://learning.edx.org/course/course-v1:StanfordOnline+SOE.YDB-OLAP_RECURSION0001+2T2020/home) MooC. These are examples of [transitive closures](https://en.wikipedia.org/wiki/Transitive_closure), something very easy to do in Prolog: ### EXAMPLE 1: Ancestors </div> <div class="nb-cell program" data-background="true" name="p1"> %! parentof(?Parent:text, ?Child:text) is nondet parentof('Alice', 'Carol'). parentof('Bob', 'Carol'). parentof('Carol', 'Dave'). parentof('Carol', 'George'). parentof('Dave', 'Mary'). parentof('Eve', 'Mary'). parentof('Mary', 'Frank'). </div> <div class="nb-cell markdown" name="md2"> ```sql with recursive Ancestor(a,d) as (select parent as a, child as d from ParentOf union select Ancestor.a, ParentOf.child as d from Ancestor, ParentOf where Ancestor.d = ParentOf.parent) select a from Ancestor where d = 'Mary'; ``` A direct translation of the above would be: ```prolog ancestor(Parent, Child) :- parentof(Parent, Child). ancestor(Ancestor, Child) :- parentof(Descendent, Child), ancestor(Ancestor, Descendent). ``` I personally find the way I've written it below easier to undertand, but they both work the same. The key thing is to put the recursive call at the end, else there's a good chance of hanging. </div> <div class="nb-cell program" name="p2"> ancestor(Parent, Child) :- parentof(Parent, Child). ancestor(Ancestor, Child) :- parentof(Ancestor, Descendent), ancestor(Descendent, Child). </div> <div class="nb-cell markdown" name="md4"> > Find all of Mary's ancestors. </div> <div class="nb-cell query" data-chunk="10" data-tabled="true" name="q2"> ancestor(Ancestor, 'Mary'). </div> <div class="nb-cell markdown" name="md6"> > Find all of Alice's descendents. </div> <div class="nb-cell query" data-chunk="10" data-tabled="true" name="q1"> ancestor('Alice', Descendent). </div> <div class="nb-cell markdown" name="md3"> ### EXAMPLE 2: Company hierarchy </div> <div class="nb-cell program" data-background="true" name="p3"> %! employee(?ID:int, ?Salary:int) is nondet employee(123, 100). employee(234, 90). employee(345, 80). employee(456, 70). employee(567, 60). %! manager(?MID:int, ?EID:int) is nondet manager(123, 234). manager(234, 345). manager(234, 456). manager(345, 567). %! project(?Name:text, ?MgrID:int) is nondet project('X', 123). project('Y', 234). project('Z', 456). </div> <div class="nb-cell markdown" name="md5"> > Find total salary cost of project 'X' ```sql with recursive Superior as (select * from Manager union select S.mID, M.eID from Superior S, Manager M where S.eID = M.mID ) select sum(salary) from Employee where ID in (select mgrID from Project where name = 'X' union select eID from Project, Superior where Project.name = 'X' AND Project.mgrID = Superior.mID ); /*** Alternative formulation tied specifically to project 'X' **/ with recursive Xemps(ID) as (select mgrID as ID from Project where name = 'X' union select eID as ID from Manager M, Xemps X where M.mID = X.ID) select sum(salary) from Employee where ID in (select ID from Xemps); ``` </div> <div class="nb-cell program" data-background="true" name="p4"> superior(MID, EID) :- manager(MID, EID). superior(MID, EID) :- manager(MID, ID), superior(ID, EID). project_salary_cost(Project, Total) :- project(Project, MID), employee(MID, ManagerSalary), aggregate_all(sum(Salary), EID, ( superior(MID, EID), employee(EID, Salary) ), EmployeesSalaries), Total is ManagerSalary + EmployeesSalaries. </div> <div class="nb-cell query" name="q3"> project_salary_cost('X', XTotal). </div> <div class="nb-cell markdown" name="md7"> > Total salary costs for projects 'Y' and 'Z' ```sql insert into Project values ('Y', 234); insert into Project values ('Z', 456); with recursive Yemps(ID) as (select mgrID as ID from Project where name = 'Y' union select eID as ID from Manager M, Yemps Y where M.mID = Y.ID), Zemps(ID) as (select mgrID as ID from Project where name = 'Z' union select eID as ID from Manager M, Zemps Z where M.mID = Z.ID) select 'Y-total', sum(salary) from Employee where ID in (select ID from Yemps) union select 'Z-total', sum(salary) from Employee where ID in (select ID from Zemps); ``` Prolog is the clear winner here </div> <div class="nb-cell query" name="q4"> project_salary_cost('Y', YTotal), project_salary_cost('Z', ZTotal). </div> <div class="nb-cell markdown" name="md8"> ### EXAMPLE 3: Airline flights </div> <div class="nb-cell program" data-background="true" name="p5"> :- dynamic flights/4. %! flights(?Orig:text, ?Dest:text, ?Airline:text, ?Cost:int) is nondet flights('A', 'ORD', 'United', 200). flights('ORD', 'B', 'American', 100). flights('A', 'PHX', 'Southwest', 25). flights('PHX', 'LAS', 'Southwest', 30). flights('LAS', 'CMH', 'Frontier', 60). flights('CMH', 'B', 'Frontier', 60). flights('A', 'B', 'JetBlue', 195). </div> <div class="nb-cell markdown" name="md9"> > Find cheapest way to fly from 'A' to 'B' ```sql /*** First find all costs ***/ with recursive Route(orig,dest,total) as (select orig, dest, cost as total from Flight union select R.orig, F.dest, cost+total as total from Route R, Flight F where R.dest = F.orig) select * from Route where orig = 'A' and dest = 'B'; /*** Then find minimum; note returns cheapest cost but not route ***/ with recursive Route(orig,dest,total) as (select orig, dest, cost as total from Flight union select R.orig, F.dest, cost+total as total from Route R, Flight F where R.dest = F.orig) select min(total) from Route where orig = 'A' and dest = 'B'; /*** Alternative formuation tied specifically to origin 'A' ***/ with recursive FromA(dest,total) as (select dest, cost as total from Flight where orig = 'A' union select F.dest, cost+total as total from FromA FA, Flight F where FA.dest = F.orig) select * from FromA; with recursive FromA(dest,total) as (select dest, cost as total from Flight where orig = 'A' union select F.dest, cost+total as total from FromA FA, Flight F where FA.dest = F.orig) select min(total) from FromA where dest = 'B'; /*** Alternative formuation tied specifically to destination 'B' ***/ with recursive ToB(orig,total) as (select orig, cost as total from Flight where dest = 'B' union select F.orig, cost+total as total from Flight F, ToB TB where F.dest = TB.orig) select * from ToB; with recursive ToB(orig,total) as (select orig, cost as total from Flight where dest = 'B' union select F.orig, cost+total as total from Flight F, ToB TB where F.dest = TB.orig) select min(total) from ToB where orig = 'A'; ``` </div> <div class="nb-cell program" name="p6"> route(Orig, Dest, Total) :- route_(Orig, Dest, 0, Total). route_(Orig, Dest, Total, Final) :- flights(Orig, Dest, _Airline, Cost), Final is Total + Cost. route_(Orig, Dest, Total, Acc) :- flights(Orig, Stopover, _Airline, Cost), Total1 is Total + Cost, route_(Stopover, Dest, Total1, Acc). </div> <div class="nb-cell query" name="q5"> aggregate_all(min(Total), route('A', 'B', Total), Cheapest). </div> <div class="nb-cell markdown" name="md10"> ## Path Something Prolog can do easily, which would be tricky in SQL, is return the path. In this example, the cheapest travel option comes with the trade-off it involves four flights _A ✈ PHX ✈ LAS ✈ CMH ✈ B_, so the $20 saving in not taking JetBlue's direct flight for $195 may not be worth it. </div> <div class="nb-cell program" name="p7"> route(Orig, Dest, Path, Total) :- route_(Orig, Dest, [Orig], Reversed, 0, Total), reverse(Reversed, Path). route_(Orig, Dest, Path, [Dest|Path], Total, Final) :- flights(Orig, Dest, _Airline, Cost), Final is Total + Cost. route_(Orig, Dest, Path, Acc1, Total, Acc2) :- flights(Orig, Stopover, _Airline, Cost), Total1 is Total + Cost, route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2). </div> <div class="nb-cell query" data-chunk="4" data-tabled="true" name="q6"> route('A', 'B', Path, Total). </div> <div class="nb-cell markdown" name="md11"> ## Guarding against endless cycles Next we are going to confuse the route finder by adding a loop back from CMH to PHX. A snag here is that tabling doesn't prevent Prolog from generating ever longer routes, meaning `aggregate_all(min(Total), route('A', 'B', Total), Cheapest).` will result in a stack overflow instead of cheapest price. ```prolog assertz(flights('CMH', 'PHX', 'Frontier', 80)), route2('A', 'B', Path, Total). ``` Limiting the solutions to 6 yields: ```prolog Path = ['A', 'B'], Total = 195 Path = ['A', 'ORD', 'B'], Total = 300 Path = ['A', 'PHX', 'LAS', 'CMH', 'B'], Total = 175 Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'], Total = 345 Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'], Total = 515 Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'], Total = 685 ``` Here keeping the path as a _history list_ is handy to halt recursion when we are returning to places already visited: </div> <div class="nb-cell program" name="p8"> route(Orig, Dest, Path, Total) :- route_(Orig, Dest, [Orig], Reversed, 0, Total), reverse(Reversed, Path). route_(Orig, Dest, Path, [Dest|Path], Total, Final) :- flights(Orig, Dest, _Airline, Cost), Final is Total + Cost. route_(Orig, Dest, Path, Acc1, Total, Acc2) :- flights(Orig, Stopover, _Airline, Cost), \+memberchk(Stopover, Path), Total1 is Total + Cost, route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2). </div> <div class="nb-cell query" name="q7"> assertz(flights('CMH', 'PHX', 'Frontier', 80)), aggregate_all(min(Total), route('A', 'B', _, Total), Cheapest). </div> <div class="nb-cell markdown" name="md12"> Another way to prune the search is to keep a globally stored best price (initialised to something high, 1000 in the example), and halting recursion whenever the path being explored is leading somewhere more expensive. </div> <div class="nb-cell program" name="p9"> :- dynamic best_price/1. best_price(1000). route(Orig, Dest, Path, Total) :- route_(Orig, Dest, [Orig], Reversed, 0, Total), reverse(Reversed, Path). route_(Orig, Dest, Path, [Dest|Path], Total, Final) :- flights(Orig, Dest, _Airline, Cost), Final is Total + Cost, best_price(Best), ( Final < Best -> retract(best_price(Best)), assertz(best_price(Final)) ; true ). route_(Orig, Dest, Path, Acc1, Total, Acc2) :- flights(Orig, Stopover, _Airline, Cost), Total1 is Total + Cost, best_price(Best), Total1 < Best, route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2). </div> <div class="nb-cell query" data-chunk="3" data-tabled="true" name="q8"> assertz(flights('CMH', 'PHX', 'Frontier', 80)), route('A', 'B', Path, Total), best_price(Best). </div> </div>