# Shortest path in a graph

## Contents |

## The shortest ever shortest-path program

Having written "Hello world," let's move on to a useful problem: finding the shortest path from "Start" to "End" in a weighted directed graph.

If you have ever implemented Dijkstra's algorithm yourself, you will be interested to see how short, easy, and declarative it looks in Dyna.
Save this as `path.dyna`

:

:- item(item, double, inf). pathto(V) min= edge("Start", V). pathto(V) min= pathto(U) + edge(U,V). goal min= pathto("End").

We chose to use strings to name the graph's vertices, such as `"Start"`

and `"End"`

. The program is explained below, but let's run it first.

## Encoding the input

The following input graph is adapted from Goodrich & Tamassia's data structures textbook. It shows several available flights between U.S. airports, with their distances in miles. We would like to get from a friend's house, 10 miles from Boston (BOS), to our destination, 20 miles from Chicago (ORD).

This "parameter file," `flights.par`

, encodes the graph by specifying all of the graph's edges and their values (lengths in miles). Edges that are not mentioned have the default value of infinity, meaning "you can't get there from here."

% driving to and from the airport edge("Start","BOS") := 10. edge("ORD","End") := 20. % available flights edge("BOS","JFK") := 187. edge("BOS","MIA") := 1258. edge("JFK","DFW") := 1391. edge("JFK","SFO") := 2582. edge("JFK","MIA") := 1090. edge("MIA","DFW") := 1121. edge("MIA","LAX") := 2342. edge("DFW","ORD") := 802. edge("DFW","LAX") := 1235. edge("ORD","DFW") := 802. edge("LAX","ORD") := 1749.

## Compile and run the program

Now, what's the shortest path from `Start`

to `End`

? Compile the program as before. But this time, run it with an argument: the file of weighted edges.

dynac path.dyna --driver=goal ./path flights.par

The driver program reads in these edge facts at runtime. For each one, it executes a C++ statement such as `c[edge("Start","BOS")]=10`

. The resulting shortest path length magically appears in `c[goal]`

as soon as the driver program looks at `c[goal]`

in order to print it:

2410

In other words, the shortest path from "Start" to "End" is 2410 miles long.

You should be able to run `./path`

on far larger input files without recompiling. Notice that running the program is much, much faster than compiling it!

## Some variations

We have only been printing the value of `goal`

. If you wrote your own C++ driver program, you could just as easily ask it to print the lengths of the shortest paths from `"Start"`

to other places:

Note: *Instead of writing your own driver program from scratch you can always copy the generated driver program name.build/name_generated_driver.cpp, modify it and use its name on the command line, instead of the --driver option.*

cout << c[pathto("LAX")]; // length of shortest path to Los Angeles cout << c[pathto("MIA")]; // length of shortest path to Miami

In fact, you could delete the last line of the Dyna program in that case. There is no longer anything special about `"End"`

or `goal`

.

Your C++ program could also dynamically change the edges. Flying from Miami to Los Angeles may be 2342 miles, but it feels to you like -1000 because they show such good movies on that flight:

cout << c[goal]; // shortest trip (2410 miles) c[edge("MIA","LAX")] = -1000; // found out about the movies! cout << c[goal]; // so go via Miami (feels like 2037 miles) c[edge("BOS","ORD")] = 856; // new edge (flight) now available cout << c[goal]; // so fly direct (886 miles)

## Backtracing

So far, we have only found the *length* of a shortest path. But what is the path itself? That is, how was `goal`

reached? To find out, you can simply use a different driver program:

dynac path.dyna --driver=backtrace

But you should first add a compiler directive into your `path.dyna`

:

:- pragma(keep_best_antecedents(item)).

This directive tells the program to remember how it derived the values. That makes efficient backtracing possible. There are also inefficient ways to backtrace, but they are not implemented yet in the compiler, so the directive is currently *required* if you want to backtrace at all.

Here is the output. *Note:* `add(x,y)`

corresponds to the Dyna expression `x+y`

, and will in fact print as `x+y`

in a future version of the compiler.

goal = 2410 pathto("End") = 2410 add(pathto("ORD"),edge("ORD","End")) pathto("ORD") = 2390 add(pathto("DFW"),edge("DFW","ORD")) pathto("DFW") = 1588 add(pathto("JFK"),edge("JFK","DFW")) pathto("JFK") = 197 add(pathto("BOS"),edge("BOS","JFK")) pathto("BOS") = 10 edge("Start","BOS") = 10 edge("BOS","JFK") = 187 edge("JFK","DFW") = 1391 edge("DFW","ORD") = 802 edge("ORD","End") = 20

*Because of some temporary naming issues, it will display "times" instead of "add".*

As we'll see below, this backtrace implies that the shortest path is `Start->BOS->JFK->DFW->ORD->End`

.

Perhaps you would like to print this output in a different format, or trace back from `pathto("LAX")`

instead of from `goal`

. Then take a copy of the driver program (which is already specifically tailored for `path.dyna`

) and modify it to your taste:

cp path.build/path_driver.cpp . # copy to current directory # ... now edit path_driver.cpp ... dynac path.dyna path_driver.cpp # your own driver instead of --driver

## Understanding the program

To understand the backtrace, you should understand how the program works. It looks for all paths of at least one edge from "Start".

### Inference rules

The first inference rule says that there is a path to vertex `V`

if there is a single edge from the "Start" vertex to `V`

:

pathto(V) min= edge("Start", V).

Alternatively, there is a path to `V`

if there is a path to any vertex `U`

and an additional edge that continues from that `U`

to `V`

:

pathto(V) min= pathto(U) + edge(U,V).

The final rule merely says that we are looking for the best path to the "End" vertex:

goal min= pathto("End").

### Inference rules as equations

But what are the `min=`

and `+`

doing? In fact, the inference rules are equations. They state how to find the *values* of all `pathto`

and `goal`

items.

Those items have values just like the `hello`

, `world`

and `goal`

items in the previous example. But this program is more complicated. It involves lots of *different* `pathto`

items for different airports, distinguished from one another by their arguments: `pathto("JFK")`

, `pathto("End")`

, etc. These items may all have different values.

The program also mentions `edge`

items, but doesn't say how to find their values. We got their values at runtime from the `flights.par`

file.

The first line of the program says that all items have
`double`

values, and are initially **Failed to parse (Missing texvc executable; please see math/README to configure.): \infty**

("`inf`

"):

:- item(item, double, inf).

### Why these particular equations?

Assuming that each edge's value represents its length in the input graph, the rules are carefully written so that `pathto(V)`

's value will be the total length of the shortest path from vertex `"Start"`

to vertex `V`

.

In principle, there are several ways to get to `V`

: one can get there by an edge from `"Start"`

or an edge from some other `U`

. The `min=`

operator finds the minimum over all these possibilities. Think of it as keeping a running minimum (just as `+=`

would keep a running total). In particular, `pathto(V)`

is found as `min(edge("Start",V), min`

, which involves minimizing over all possible _{U} pathto(U)+edge(U,V))`U`

.

If there are no paths to `V`

, then `pathto(V)`

is a minimum over no lengths at all. **Failed to parse (Missing texvc executable; please see math/README to configure.): \infty**

is the minimum of no things (the identity element for`min`

), just as 0 is the sum of no things (the identity element for`+`

). This "default" value for items is also specified explicitly by the`:- item(item, double, inf)`

declaration.

### Understanding the backtrace

The result of these rules is illustrated by the backtrace. To repeat a fragment:

... pathto("DFW") = 1588 add(pathto("JFK"),edge("JFK","DFW")) pathto("JFK") = 197 ... edge("JFK","DFW") = 1391 ...

Each `pathto`

item is defined using `min=`

, so the item has considered a particular set of expressions and has picked the minimum-valued one.

- Indented under each
`pathto`

item, the backtrace shows the particular (minimum-valued) expression that supplied its value. For example, the minimum path from Start to DFW (`pathto("DFW")`

), of length 1588 miles, was found by going through JFK (`pathto("JFK")+edge("JFK","DFW")`

). - Indented under each expression, the backtrace lists the items that participate in the expression. For example, under
`pathto("JFK")+edge("JFK","DFW")`

, it lists`pathto("JFK")`

and`edge("JFK,"DFW")`

. Their lengths are 197 and 1391 miles, which sum to 1588.

### Solving simultaneous equations

Why is `min(edge("Start",V), min`

the right definition of
_{U} pathto(U)+edge(U,V))`pathto(V)`

? Because the shortest path to `V`

is either a single `"Start" -> V`

edge, or else ends in `...UV`

for some `U`

, in which case it must be the *shortest* path ending in `...UV`

. And the shortest path ending in `...UV`

must have length `pathto(U)+edge(U,V)`

, provided that `pathto(U)`

is itself correct.

Note that the correctness of `pathto(V)`

depends on the correctness of `pathto(U)`

. For example, `pathto("DFW")`

and `pathto("ORD")`

depend on each other, since they fall on the same cycle in the flights graph.

Thus, you can think of the Dyna program as specifying a system of simultaneous equations over `pathto`

, `edge`

, and `goal`

unknowns. The equations enforce consistency among the values. If all the items have the desired values, then all the equations will hold.

Executing the program solves this system of equations. (More accurately, when the driver program evaluates `c[goal]`

, the chart object `c`

solves *enough* of the system to determine the value of `goal`

.)

*Dyna is not just an ordinary equation solver. Its unknowns have structured names: items like pathto("DFW") or my(very(very(deeply(nested(term))))). Inference rules can construct new items on the fly. Thus, the number of equations and unknowns is determined dynamically as the program runs. This enables Dyna to simulate any Turing machine -- including Turing machines that use arbitrary amounts of memory and fail to halt.*

*In this simple program, though, the number of unknowns is limited by the number of nodes in the input graph. That is because there are no inference rules that infer the existence of additional airports outside the original graph.*