? users online
  • Logout
    • Open hangout
    • Open chat for current file
<div class="notebook">

<div class="nb-cell markdown" name="md1">
# An introduction to Prolog for SQL programers Part 3

*By Robert Laing*

Following on from [Part 1](https://swish.swi-prolog.org/p/sql2prolog.swinb) and [Part 2](https://swish.swi-prolog.org/p/sql2prolog2.swinb), this is a translation into Prolog of the examples given in Stanford's [Databases: OLAP and Recursion](https://learning.edx.org/course/course-v1:StanfordOnline+SOE.YDB-OLAP_RECURSION0001+2T2020/home) MooC.

These are examples of [transitive closures](https://en.wikipedia.org/wiki/Transitive_closure), something very easy to do in Prolog:

### EXAMPLE 1: Ancestors
</div>

<div class="nb-cell program" data-background="true" name="p1">
%! parentof(?Parent:text, ?Child:text) is nondet
parentof('Alice', 'Carol').
parentof('Bob', 'Carol').
parentof('Carol', 'Dave').
parentof('Carol', 'George').
parentof('Dave', 'Mary').
parentof('Eve', 'Mary').
parentof('Mary', 'Frank').
</div>

<div class="nb-cell markdown" name="md2">
```sql
with recursive
  Ancestor(a,d) as (select parent as a, child as d from ParentOf
                    union
                    select Ancestor.a, ParentOf.child as d
                    from Ancestor, ParentOf
                    where Ancestor.d = ParentOf.parent)
select a from Ancestor where d = 'Mary';
```

A direct translation of the above would be:

```prolog
ancestor(Parent, Child) :-
    parentof(Parent, Child).

ancestor(Ancestor, Child) :-
    parentof(Descendent, Child),
    ancestor(Ancestor, Descendent).
```

I personally find the way I've written it below easier to undertand, but they both work the same. The key thing is to put the recursive call at the end, else there's a good chance of hanging.
</div>

<div class="nb-cell program" name="p2">
ancestor(Parent, Child) :-
    parentof(Parent, Child).

ancestor(Ancestor, Child) :-
    parentof(Ancestor, Descendent), 
    ancestor(Descendent, Child).
</div>

<div class="nb-cell markdown" name="md4">
&gt; Find all of Mary's ancestors.
</div>

<div class="nb-cell query" data-chunk="10" data-tabled="true" name="q2">
ancestor(Ancestor, 'Mary').
</div>

<div class="nb-cell markdown" name="md6">
&gt; Find all of Alice's descendents.
</div>

<div class="nb-cell query" data-chunk="10" data-tabled="true" name="q1">
ancestor('Alice', Descendent).
</div>

<div class="nb-cell markdown" name="md3">
### EXAMPLE 2: Company hierarchy
</div>

<div class="nb-cell program" data-background="true" name="p3">
%! employee(?ID:int, ?Salary:int) is nondet
employee(123, 100).
employee(234, 90).
employee(345, 80).
employee(456, 70).
employee(567, 60).

%! manager(?MID:int, ?EID:int) is nondet
manager(123, 234).
manager(234, 345).
manager(234, 456).
manager(345, 567).

%! project(?Name:text, ?MgrID:int) is nondet
project('X', 123).
project('Y', 234).
project('Z', 456).
</div>

<div class="nb-cell markdown" name="md5">
&gt; Find total salary cost of project 'X'

```sql
with recursive
  Superior as (select * from Manager
               union
               select S.mID, M.eID
               from Superior S, Manager M
               where S.eID = M.mID )
select sum(salary)
from Employee
where ID in
  (select mgrID from Project where name = 'X'
   union
   select eID from Project, Superior
   where Project.name = 'X' AND Project.mgrID = Superior.mID );

/*** Alternative formulation tied specifically to project 'X' **/

with recursive
  Xemps(ID) as (select mgrID as ID from Project where name = 'X'
                union
                select eID as ID
                from Manager M, Xemps X
                where M.mID = X.ID)
select sum(salary)
from Employee
where ID in (select ID from Xemps);
```
</div>

<div class="nb-cell program" data-background="true" name="p4">
superior(MID, EID) :-
    manager(MID, EID).

superior(MID, EID) :-
    manager(MID, ID),
    superior(ID, EID).

project_salary_cost(Project, Total) :-
    project(Project, MID),
    employee(MID, ManagerSalary),
    aggregate_all(sum(Salary), EID,
                  (  superior(MID, EID), employee(EID, Salary)  ), 
                  EmployeesSalaries),
    Total is ManagerSalary + EmployeesSalaries.
</div>

<div class="nb-cell query" name="q3">
project_salary_cost('X', XTotal).
</div>

<div class="nb-cell markdown" name="md7">
&gt; Total salary costs for projects 'Y' and 'Z'

```sql
insert into Project values ('Y', 234);
insert into Project values ('Z', 456);

with recursive
  Yemps(ID) as (select mgrID as ID from Project where name = 'Y'
                union
                select eID as ID
                from Manager M, Yemps Y
                where M.mID = Y.ID),
  Zemps(ID) as (select mgrID as ID from Project where name = 'Z'
                union
                select eID as ID
                from Manager M, Zemps Z
                where M.mID = Z.ID)
select 'Y-total', sum(salary)
from Employee
where ID in (select ID from Yemps)
union
select 'Z-total', sum(salary)
from Employee
where ID in (select ID from Zemps);
```

Prolog is the clear winner here
</div>

<div class="nb-cell query" name="q4">
project_salary_cost('Y', YTotal),
project_salary_cost('Z', ZTotal).
</div>

<div class="nb-cell markdown" name="md8">
### EXAMPLE 3: Airline flights
</div>

<div class="nb-cell program" data-background="true" name="p5">
:- dynamic flights/4.

%! flights(?Orig:text, ?Dest:text, ?Airline:text, ?Cost:int) is nondet
flights('A', 'ORD', 'United', 200).
flights('ORD', 'B', 'American', 100).
flights('A', 'PHX', 'Southwest', 25).
flights('PHX', 'LAS', 'Southwest', 30).
flights('LAS', 'CMH', 'Frontier', 60).
flights('CMH', 'B', 'Frontier', 60).
flights('A', 'B', 'JetBlue', 195).
</div>

<div class="nb-cell markdown" name="md9">
&gt; Find cheapest way to fly from 'A' to 'B'

```sql
/*** First find all costs ***/

with recursive
  Route(orig,dest,total) as
    (select orig, dest, cost as total from Flight
     union
     select R.orig, F.dest, cost+total as total
     from Route R, Flight F
     where R.dest = F.orig)
select * from Route
where orig = 'A' and dest = 'B';

/*** Then find minimum; note returns cheapest cost but not route ***/

with recursive
  Route(orig,dest,total) as
    (select orig, dest, cost as total from Flight
     union
     select R.orig, F.dest, cost+total as total
     from Route R, Flight F
     where R.dest = F.orig)
select min(total) from Route
where orig = 'A' and dest = 'B';

/*** Alternative formuation tied specifically to origin 'A' ***/

with recursive
  FromA(dest,total) as
    (select dest, cost as total from Flight where orig = 'A'
     union
     select F.dest, cost+total as total
     from FromA FA, Flight F
     where FA.dest = F.orig)
select * from FromA;

with recursive
  FromA(dest,total) as
    (select dest, cost as total from Flight where orig = 'A'
     union
     select F.dest, cost+total as total
     from FromA FA, Flight F
     where FA.dest = F.orig)
select min(total) from FromA where dest = 'B';

/*** Alternative formuation tied specifically to destination 'B' ***/

with recursive
  ToB(orig,total) as
    (select orig, cost as total from Flight where dest = 'B'
     union
     select F.orig, cost+total as total
     from Flight F, ToB TB
     where F.dest = TB.orig)
select * from ToB;

with recursive
  ToB(orig,total) as
    (select orig, cost as total from Flight where dest = 'B'
     union
     select F.orig, cost+total as total
     from Flight F, ToB TB
     where F.dest = TB.orig)
select min(total) from ToB where orig = 'A';
```
</div>

<div class="nb-cell program" name="p6">
route(Orig, Dest, Total) :-
    route_(Orig, Dest, 0, Total).

route_(Orig, Dest, Total, Final) :-
    flights(Orig, Dest, _Airline, Cost),
    Final is Total + Cost.

route_(Orig, Dest, Total, Acc) :-
    flights(Orig, Stopover, _Airline, Cost),
    Total1 is Total + Cost,
    route_(Stopover, Dest, Total1, Acc).
</div>

<div class="nb-cell query" name="q5">
aggregate_all(min(Total), route('A', 'B', Total), Cheapest).
</div>

<div class="nb-cell markdown" name="md10">
## Path

Something Prolog can do easily, which would be tricky in SQL, is return the path. In this example, the cheapest travel option comes with the trade-off it involves four flights _A ✈ PHX ✈ LAS ✈ CMH ✈ B_, so the $20 saving in not taking JetBlue's direct flight for $195 may not be worth it.
</div>

<div class="nb-cell program" name="p7">
route(Orig, Dest, Path, Total) :-
    route_(Orig, Dest, [Orig], Reversed, 0, Total),
    reverse(Reversed, Path).

route_(Orig, Dest, Path, [Dest|Path], Total, Final) :-
    flights(Orig, Dest, _Airline, Cost),
    Final is Total + Cost.

route_(Orig, Dest, Path, Acc1, Total, Acc2) :-
    flights(Orig, Stopover, _Airline, Cost),
    Total1 is Total + Cost,
    route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2).
</div>

<div class="nb-cell query" data-chunk="4" data-tabled="true" name="q6">
route('A', 'B', Path, Total).
</div>

<div class="nb-cell markdown" name="md11">
## Guarding against endless cycles

Next we are going to confuse the route finder by adding a loop back from CMH to PHX. A snag here is that tabling doesn't prevent Prolog from generating ever longer routes, meaning `aggregate_all(min(Total), route('A', 'B', Total), Cheapest).` will result in a stack overflow instead of cheapest price.

```prolog
assertz(flights('CMH', 'PHX', 'Frontier', 80)),
route2('A', 'B', Path, Total).
```
Limiting the solutions to 6 yields:

```prolog
Path = ['A', 'B'],
Total = 195
Path = ['A', 'ORD', 'B'],
Total = 300
Path = ['A', 'PHX', 'LAS', 'CMH', 'B'],
Total = 175
Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'],
Total = 345
Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'],
Total = 515
Path = ['A', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'PHX', 'LAS', 'CMH', 'B'],
Total = 685
```
Here keeping the path as a _history list_ is handy to halt recursion when we are returning to places already visited:
</div>

<div class="nb-cell program" name="p8">
route(Orig, Dest, Path, Total) :-
    route_(Orig, Dest, [Orig], Reversed, 0, Total),
    reverse(Reversed, Path).

route_(Orig, Dest, Path, [Dest|Path], Total, Final) :-
    flights(Orig, Dest, _Airline, Cost),
    Final is Total + Cost.

route_(Orig, Dest, Path, Acc1, Total, Acc2) :-
    flights(Orig, Stopover, _Airline, Cost),
    \+memberchk(Stopover, Path),
    Total1 is Total + Cost,
    route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2).
</div>

<div class="nb-cell query" name="q7">
assertz(flights('CMH', 'PHX', 'Frontier', 80)),
aggregate_all(min(Total), route('A', 'B', _, Total), Cheapest).
</div>

<div class="nb-cell markdown" name="md12">
Another way to prune the search is to keep a globally stored best price (initialised to something high, 1000 in the example), and halting recursion whenever the path being explored is leading somewhere more expensive.
</div>

<div class="nb-cell program" name="p9">
:- dynamic best_price/1.
best_price(1000).

route(Orig, Dest, Path, Total) :-
    route_(Orig, Dest, [Orig], Reversed, 0, Total),
    reverse(Reversed, Path).

route_(Orig, Dest, Path, [Dest|Path], Total, Final) :-
    flights(Orig, Dest, _Airline, Cost),
    Final is Total + Cost,
    best_price(Best),
    (   Final &lt; Best
    -&gt;  retract(best_price(Best)), assertz(best_price(Final))
    ;   true
    ).

route_(Orig, Dest, Path, Acc1, Total, Acc2) :-
    flights(Orig, Stopover, _Airline, Cost),
    Total1 is Total + Cost,
    best_price(Best),
    Total1 &lt; Best,
    route_(Stopover, Dest, [Stopover|Path], Acc1, Total1, Acc2).
</div>

<div class="nb-cell query" data-chunk="3" data-tabled="true" name="q8">
assertz(flights('CMH', 'PHX', 'Frontier', 80)),
route('A', 'B', Path, Total),
best_price(Best).
</div>

</div>