Several perspectives on Dyna

From Dyna
Jump to: navigation, search

Contents

Dyna as an equation solver

A Dyna program looks like a system of equations. Various quantities are defined in terms of other quantities (perhaps circularly).

The twist is that the overall number of variables isn't known when you write the program, or even when you provide the input. The input is a set of "axiomatic" quantities from which Dyna will try to infer the values of a potentially unlimited number of other quantities. This deductive ability is very useful for parsing and other search problems.

How can you write down a system of equations if you don't even know how many unknowns you have? Each equation is really an equational schema that can be instantiated as needed to obtain many possible equations. Dyna's notation is very handy even just for explanations between people.

Dyna as a language for dynamic programming

Dyna was originally developed for implementing dynamic programming algorithms. Dynamic programming is a computer science technique that is most simply described as divide-and-conquer with storage and reuse of the subproblems.

Dyna favors the bottom-up or "data-driven" approach where subproblems are solved first, and then their stored solutions are combined into solutions of larger problems. However, it will soon also support mixing this with the top-down or "goal-driven" approach, where one starts by trying to solve a larger problem, and recursively solves subproblems, storing ("memoizing") their solutions for potential reuse.

Dyna's twist in both cases is a prioritized agenda. It treats "important" subproblems first for efficiency's sake. This is especially helpful for optimization or search problems: Dyna tries to extend or combine the most promising partial solutions first. The general strategy is the A* algorithm (in its generalization to hypergraphs).

Also discuss Bellman-style dynamic programming with an infinite horizon.
Note DPROG as related work.

Dyna as a weighted logic programming language

Logic programming, in particular the Prolog language, enjoyed a spell of popularity in natural language processing some years ago. Parsing and other NLP tasks can be regarded as examples of deductive reasoning, which is Prolog´s strength. However, classical deductive reasoning is less relevant to the statistical methods that are now the meat and potatoes of the field (and of related fields).

Dyna's twist is that it is a weighted language. In Dyna, theorems are not merely deduced. They are deduced to have particular values (0.3, -65, "text", true, false, or perhaps other terms or data structures of your choice). This is a powerful extension that allows probabilistic reasoning, natural handling of negation (one can deduce a false value for a theorem), and numerical computation in general (e.g., neural networks, where the values of some nodes are derived from the values of others, perhaps cyclically). Furthermore, Dyna supports training of the weights in order to maximize an objective function, which may itself be specified as the value of a designated "goal" theorem.

In short, Dyna makes it easier to build statistical models (and more) by harnessing the power of deduction.

(There already exists a probabilistic variant of Prolog called PRISM, but it is limited to probabilities, in fact to generative branching stochastic processes.)

Another useful twist is that one can change the basic facts (axioms), or their values, and the changes to theorems will percolate through the proof forest. This is useful for weight training and for dynamic algorithms (e.g., dynamic graph algorithms).

Dyna as an efficient logic programming language

Dyna also tries to improve on Prolog's efficiency. Prolog´s default backtracking search saves on memory but takes exponential time to parse a sentence. Getting Prolog to do something else is both inconvenient and expensive.

Search strategies. Dyna's default strategy comes from chart parsing. Instead of starting with a goal and "chaining backwards" to find rationales, it starts with a set of facts and "chains forward" to see what can be proved from them. Furthermore, as it proves intermediate results, it prioritizes them and follows the most promising ones first. So a machine translation system written in Dyna can use probabilities not only to define the best translation, but also to speed up the search for it. (cite Caraballo & Charniak, Klein & Manning)

Compiler strategies. Dyna programs are intended to be entirely declarative specifications. They explicitly do not prescribe any execution strategy. The compiler is therefore free to use forward chaining, backward chaining with or without memoization (and flushing), any priority function, multiresolution search, equation solvers, mixed strategies, etc. The compiler's choices can be controlled by compiler directives (pragmas), but the eventual goal is for these directives to be generated automatically by learning from actual executions. The same applies to various storage and memory layout decisions.

Storage strategies. The compiler can also decide how to lay out your data in memory. If you were implementing some graph algorithm by hand, you would have a range of correct options for how to store your graph in memory. Different graph data structures would be optimal for different typical kinds of graphs and different algorithms to run on them. Dyna gives you a simple, uniform description of all your data as terms, but under the hood, the compiler is free to choose efficient data structures that implement this description. As above, we are starting with manual storage pragmas with the eventual goal of generating the pragmas automatically.

Strong typing. Dyna has a strong (and interesting) type system that can improve time and space efficiency as well as code safety.

Compilation into C++. Dyna programs compile into C++ classes. They are not designed to be extended at runtime. The compiler already knows all the queries that the program will have to issue at runtime, and can produce hard-coded implementations for them, as well as code to maintain indices in their service (perhaps with some help from the profiler.

Ground terms only. The Dyna protoype drops Prolog's ability to compute with partially instantiated terms by unification. All such patterns of computation must be removed before compilation, by transforming the program. This considerably reduces overhead in storing and analyzing terms. (However, the next version will handle non-ground terms.)

(discuss Mercury and XSB projects)

Dyna as an in-memory deductive database

A Dyna item foo("a","b") with value 3 can be regarded as a tuple ("a","b",3) in a relation foo, where the sub-tuple ("a","b") serves as the unique key.

Thus, a Dyna chart (collection of items) can be regarded as a database of tuples. Inference rules can be used to derive new tuples automatically from old ones.

Once you compile a Dyna program into C++ classes, you can easily use C++ to manipulate this database and perform precompiled database queries on it. The deep connection between logic programming and databases has been recognized for over 25 years.

Dyna differs from an ordinary relational database in several ways:

  • It is designed to run in main memory. Such databases are supposedly 10 times faster than running an ordinary commercial database system in memory. Dyna may get further speedups from the fact that all the tables and queries are known at compile time.
  • It adds deduction and truth maintenance. When you add to the chart or change it, other parts of the chart may grow or change in response, according to precompiled rules stated in your Dyna program.
  • Its terms are a convenient generalization of relational database tuples: they can contain other tuples as elements (i.e., HiLog rather than Datalog). Furthermore, legal tuples and subtuples can be restricted via Dyna's pattern types. Think of XML tree-structured markup, with restrictions defined by a DTD schema. (Unlike XML, equal subtrees are automatically structure-shared.)
  • Terms are immutable objects. However, (axiomatic) terms may be asserted or retracted, and their associated values can be changed. This supports several styles of database modification.
  • Dyna is Turing-complete, unlike relational algebra or Datalog.
  • Dyna is primarily designed to support the management of data generated during a computation, rather than data arriving from the real world. So it doesn't currently waste time ensuring the atomicity or recoverability of transactions in the case of a power failure. (Nor does it lock records for parallel access to the database, although parallelization is a future topic.)

Most programming is simply a matter of maintaining data structures in memory, looking things up in them, and recording new data as a result. These actions can often be regarded as deduction or truth maintenance according to rules. That is why Dyna seems widely applicable.

Dyna as a system for statistical modeling

(discuss both modeling in Dyna and Dyna's support for parameter training)

Dyna as a language for dynamic algorithms

Let Failed to parse (Missing texvc executable; please see math/README to configure.): F(x)

be a function, such as the function that returns the minimum spanning tree of a graph Failed to parse (Missing texvc executable; please see math/README to configure.): x

. If the graph changes slightly (e.g., an edge is added or removed), one would rather not have to recompute the new graph's minimum spanning tree from scratch. A dynamic algorithm for Failed to parse (Missing texvc executable; please see math/README to configure.): F

is one that can rapidly update Failed to parse (Missing texvc executable; please see math/README to configure.): F(x)
as Failed to parse (Missing texvc executable; please see math/README to configure.): x
changes.  Typically, a dynamic algorithm will maintain extra information about how it computed the current Failed to parse (Missing texvc executable; please see math/README to configure.): F(x)

.

We suspect that Dyna is a natural language for implementing dynamic algorithms. The input to a Dyna program is encoded as a set of axioms, from which the program derives intermediate quantities and a final answer. In a sense, any Dyna program is dynamic, because the axioms can be changed, and the changes automatically percolate through the (stored) intermediate quantities to the final result.

(Whether this is efficient is up to the Dyna programmer. The art of designing a good dynamic algorithm is to organize the overall computation so that percolation of changes is cheap. For example, one might use heap-like data structures that would be wasteful in a non-dynamic setting but ensure O(log n) updates when an axiom is changed.)

Dyna as a general-purpose programming language

Dyna didn't start out as a general-purpose programming language. It was just going to be a tool for rapid prototyping of NLP systems that used dynamic programming. But one of the first test programs that anyone wrote was a Turing machine; so we knew it was Turing-complete. We also noticed connections to logic programming. Then we realized we needed garbage collection, and the type system got interesting because of efficiency and queries. Someone wrote an Emacs mode for syntax highlighting of Dyna programs. We realized that our parse-forest or proof-forest visualizer was really a new kind of debugger. And lately we have been writing parts of the compiler itself using Dyna (declaration inference and program transformation) because it seemed much easier that way. All in all, it seems fair to call it a programming language now.

Dyna as an extension to C++

Even if you are just writing ordinary C++ programs, Dyna may be a useful tool for you. You can jot down a few Dyna type declarations and compile them into C++ classes that provide

  • appropriate constructors and accessors
  • good storage layout
  • memory management and garbage collection
  • standard object methods: equality, copy, hash, etc.
  • tagged unions, subtypes, and pattern matching
  • storage of associated values in a chart
  • automatic computation of other values in the chart (by rule)
  • database queries over the chart
  • optimized memory layout via profiling

Such classes may be handy in your C++ program. We've used a slew of C++ tricks to make them feel like natural extensions to the C++ language.

Here's a simple example.

Personal tools
Namespaces

Variants
Actions
Navigation
Tools