Shortest path in a graph
The shortest ever shortest-path program
Having written "Hello world," let's move on to a useful problem: finding the shortest path from "Start" to "End" in a weighted directed graph.
If you have ever implemented Dijkstra's algorithm yourself, you will be interested to see how short, easy, and declarative it looks in Dyna.
Save this as
:- item(item, double, inf). pathto(V) min= edge("Start", V). pathto(V) min= pathto(U) + edge(U,V). goal min= pathto("End").
We chose to use strings to name the graph's vertices, such as
"End". The program is explained below, but let's run it first.
Encoding the input
The following input graph is adapted from Goodrich & Tamassia's data structures textbook. It shows several available flights between U.S. airports, with their distances in miles. We would like to get from a friend's house, 10 miles from Boston (BOS), to our destination, 20 miles from Chicago (ORD).
This "parameter file,"
flights.par, encodes the graph by specifying all of the graph's edges and their values (lengths in miles). Edges that are not mentioned have the default value of infinity, meaning "you can't get there from here."
% driving to and from the airport edge("Start","BOS") := 10. edge("ORD","End") := 20. % available flights edge("BOS","JFK") := 187. edge("BOS","MIA") := 1258. edge("JFK","DFW") := 1391. edge("JFK","SFO") := 2582. edge("JFK","MIA") := 1090. edge("MIA","DFW") := 1121. edge("MIA","LAX") := 2342. edge("DFW","ORD") := 802. edge("DFW","LAX") := 1235. edge("ORD","DFW") := 802. edge("LAX","ORD") := 1749.
Compile and run the program
Now, what's the shortest path from
End? Compile the program as before. But this time, run it with an argument: the file of weighted edges.
dynac path.dyna --driver=goal ./path flights.par
The driver program reads in these edge facts at runtime. For each one, it executes a C++ statement such as
c[edge("Start","BOS")]=10. The resulting shortest path length magically appears in
c[goal] as soon as the driver program looks at
c[goal] in order to print it:
In other words, the shortest path from "Start" to "End" is 2410 miles long.
You should be able to run
./path on far larger input files without recompiling. Notice that running the program is much, much faster than compiling it!
We have only been printing the value of
goal. If you wrote your own C++ driver program, you could just as easily ask it to print the lengths of the shortest paths from
"Start" to other places:
Note: Instead of writing your own driver program from scratch you can always copy the generated driver program
name.build/name_generated_driver.cpp, modify it and use its name on the command line, instead of the
cout << c[pathto("LAX")]; // length of shortest path to Los Angeles cout << c[pathto("MIA")]; // length of shortest path to Miami
In fact, you could delete the last line of the Dyna program in that case. There is no longer anything special about
Your C++ program could also dynamically change the edges. Flying from Miami to Los Angeles may be 2342 miles, but it feels to you like -1000 because they show such good movies on that flight:
cout << c[goal]; // shortest trip (2410 miles) c[edge("MIA","LAX")] = -1000; // found out about the movies! cout << c[goal]; // so go via Miami (feels like 2037 miles) c[edge("BOS","ORD")] = 856; // new edge (flight) now available cout << c[goal]; // so fly direct (886 miles)
So far, we have only found the length of a shortest path. But what is the path itself? That is, how was
goal reached? To find out, you can simply use a different driver program:
dynac path.dyna --driver=backtrace
But you should first add a compiler directive into your
This directive tells the program to remember how it derived the values. That makes efficient backtracing possible. There are also inefficient ways to backtrace, but they are not implemented yet in the compiler, so the directive is currently required if you want to backtrace at all.
Here is the output. Note:
add(x,y) corresponds to the Dyna expression
x+y, and will in fact print as
x+y in a future version of the compiler.
goal = 2410 pathto("End") = 2410 add(pathto("ORD"),edge("ORD","End")) pathto("ORD") = 2390 add(pathto("DFW"),edge("DFW","ORD")) pathto("DFW") = 1588 add(pathto("JFK"),edge("JFK","DFW")) pathto("JFK") = 197 add(pathto("BOS"),edge("BOS","JFK")) pathto("BOS") = 10 edge("Start","BOS") = 10 edge("BOS","JFK") = 187 edge("JFK","DFW") = 1391 edge("DFW","ORD") = 802 edge("ORD","End") = 20
Because of some temporary naming issues, it will display "times" instead of "add".
As we'll see below, this backtrace implies that the shortest path is
Perhaps you would like to print this output in a different format, or trace back from
pathto("LAX") instead of from
goal. Then take a copy of the driver program (which is already specifically tailored for
path.dyna) and modify it to your taste:
cp path.build/path_driver.cpp . # copy to current directory # ... now edit path_driver.cpp ... dynac path.dyna path_driver.cpp # your own driver instead of --driver
Understanding the program
To understand the backtrace, you should understand how the program works. It looks for all paths of at least one edge from "Start".
The first inference rule says that there is a path to vertex
V if there is a single edge from the "Start" vertex to
pathto(V) min= edge("Start", V).
Alternatively, there is a path to
V if there is a path to any vertex
U and an additional edge that continues from that
pathto(V) min= pathto(U) + edge(U,V).
The final rule merely says that we are looking for the best path to the "End" vertex:
goal min= pathto("End").
Inference rules as equations
But what are the
+ doing? In fact, the inference rules are equations. They state how to find the values of all
Those items have values just like the
goal items in the previous example. But this program is more complicated. It involves lots of different
pathto items for different airports, distinguished from one another by their arguments:
pathto("End"), etc. These items may all have different values.
The program also mentions
edge items, but doesn't say how to find their values. We got their values at runtime from the
The first line of the program says that all items have
double values, and are initially Failed to parse (Missing texvc executable; please see math/README to configure.): \infty
:- item(item, double, inf).
Why these particular equations?
Assuming that each edge's value represents its length in the input graph, the rules are carefully written so that
pathto(V)'s value will be the total length of the shortest path from vertex
"Start" to vertex
In principle, there are several ways to get to
V: one can get there by an edge from
"Start" or an edge from some other
min= operator finds the minimum over all these possibilities. Think of it as keeping a running minimum (just as
+= would keep a running total). In particular,
pathto(V) is found as
min(edge("Start",V), minU pathto(U)+edge(U,V)), which involves minimizing over all possible
If there are no paths to
pathto(V) is a minimum over no lengths at all. Failed to parse (Missing texvc executable; please see math/README to configure.): \infty
is the minimum of no things (the identity element for
min), just as 0 is the sum of no things (the identity element for
+). This "default" value for items is also specified explicitly by the
:- item(item, double, inf)declaration.
Understanding the backtrace
The result of these rules is illustrated by the backtrace. To repeat a fragment:
... pathto("DFW") = 1588 add(pathto("JFK"),edge("JFK","DFW")) pathto("JFK") = 197 ... edge("JFK","DFW") = 1391 ...
pathto item is defined using
min=, so the item has considered a particular set of expressions and has picked the minimum-valued one.
- Indented under each
pathtoitem, the backtrace shows the particular (minimum-valued) expression that supplied its value. For example, the minimum path from Start to DFW (
pathto("DFW")), of length 1588 miles, was found by going through JFK (
- Indented under each expression, the backtrace lists the items that participate in the expression. For example, under
pathto("JFK")+edge("JFK","DFW"), it lists
edge("JFK,"DFW"). Their lengths are 197 and 1391 miles, which sum to 1588.
Solving simultaneous equations
min(edge("Start",V), minU pathto(U)+edge(U,V)) the right definition of
pathto(V)? Because the shortest path to
V is either a single
"Start" -> V edge, or else ends in
...UV for some
U, in which case it must be the shortest path ending in
...UV. And the shortest path ending in
...UV must have length
pathto(U)+edge(U,V), provided that
pathto(U) is itself correct.
Note that the correctness of
pathto(V) depends on the correctness of
pathto(U). For example,
pathto("ORD") depend on each other, since they fall on the same cycle in the flights graph.
Thus, you can think of the Dyna program as specifying a system of simultaneous equations over
goal unknowns. The equations enforce consistency among the values. If all the items have the desired values, then all the equations will hold.
Executing the program solves this system of equations. (More accurately, when the driver program evaluates
c[goal], the chart object
c solves enough of the system to determine the value of
Dyna is not just an ordinary equation solver. Its unknowns have structured names: items like
my(very(very(deeply(nested(term))))). Inference rules can construct new items on the fly. Thus, the number of equations and unknowns is determined dynamically as the program runs. This enables Dyna to simulate any Turing machine -- including Turing machines that use arbitrary amounts of memory and fail to halt.
In this simple program, though, the number of unknowns is limited by the number of nodes in the input graph. That is because there are no inference rules that infer the existence of additional airports outside the original graph.