# Syntactic sugar for inference rules

Future versions of Dyna may add a couple of special notations for use in inference rules. These notations are merely syntactic sugar intended to improve readability.

## Anonymous accumulations

Sometimes it is handy to write a summation, or other accumulation, without giving it a name. Dyna might introduce notation as in these two examples:

foo1(U) max= edge(U,V) * (+= edge(V,W)).

foo2(U) += edge(U,V) * (max= edge(V,W)).


which would be syntactic sugar, respectively, for

foo1(U) max= edge(U,V) * temp1(V).
temp1(V) += edge(V,W).

foo2(U) += edge(U,V) * temp2(V).
temp2(V) max= edge(V,W).


Thus, (+= ...) is a summation operator. It sums over all variables that are mentioned only inside (+= ...); in this case, it sums over W. Similarly, (max= ...) is a maximization operator.
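
What the desugared rules compute can be sketched in Python (not Dyna itself), using a hypothetical weighted edge relation represented as a dict. The node and item names mirror the example above but are otherwise made up:

```python
# Hypothetical weighted edge relation; keys are (source, target) pairs.
edge = {("a", "b"): 2.0, ("a", "c"): 3.0, ("b", "c"): 4.0, ("c", "a"): 1.0}

# temp1(V) += edge(V,W): the anonymous (+= ...) sums over W,
# the variable mentioned only inside the accumulation.
temp1 = {}
for (v, w), val in edge.items():
    temp1[v] = temp1.get(v, 0.0) + val

# foo1(U) max= edge(U,V) * temp1(V): maximize over V.
foo1 = {}
for (u, v), val in edge.items():
    if v in temp1:
        foo1[u] = max(foo1.get(u, float("-inf")), val * temp1[v])
```

Note that the inner `+=` accumulation is computed once per V and then reused, exactly as the named item temp1(V) would be.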

### Binarizing inference rules using anonymous accumulations

Notice that the CKY parsing rule

constit(X,I,K) += rewrite(X,Y,Z)*constit(Y,I,J)*constit(Z,J,K).


can be asymptotically sped up -- in fact Dyna currently does this speedup automatically -- by taking advantage of the distributive law and introducing an intermediate summation such as

constit(X,I,K) += (+= rewrite(X,Y,Z)*constit(Y,I,J)) * constit(Z,J,K).


which sums over Y, yielding the "binarized" program

constit(X,I,K) += temp(X,Z,I,J)*constit(Z,J,K).
temp(X,Z,I,J) += rewrite(X,Y,Z)*constit(Y,I,J).    % slashed constituent


Notice that temp(X,Z,I,J) is really a "slashed" X/Z constituent, i.e., an X that is missing a Z. Another possible binarization is

constit(X,I,K) += rewrite(X,Y,Z) * (+= constit(Y,I,J)*constit(Z,J,K)).


which sums over J and is equivalent to

constit(X,I,K) += rewrite(X,Y,Z)*temp(Y,Z,I,K).
temp(Y,Z,I,K) += constit(Y,I,J)*constit(Z,J,K).    % double constituent
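
By the distributive law, either binarization computes the same totals as the original three-factor rule. The following Python sketch (with a hypothetical random grammar and chart) checks this for the first, "slashed constituent" binarization:

```python
import random

random.seed(0)
NT = ["s", "np", "vp"]             # hypothetical nonterminals
n = 3                              # span endpoints range over 0..n
rewrite = {(x, y, z): random.random() for x in NT for y in NT for z in NT}
constit = {(x, i, k): random.random()
           for x in NT for i in range(n) for k in range(i + 1, n + 1)}

# Direct rule: one big sum over Y, Z, and the split point J.
direct = {}
for (x, y, z), rw in rewrite.items():
    for i in range(n + 1):
        for j in range(n + 1):
            for k in range(n + 1):
                left, right = constit.get((y, i, j)), constit.get((z, j, k))
                if left is not None and right is not None:
                    direct[(x, i, k)] = direct.get((x, i, k), 0.0) + rw * left * right

# Binarized: temp(X,Z,I,J) += rewrite(X,Y,Z)*constit(Y,I,J) sums out Y first.
temp = {}
for (x, y, z), rw in rewrite.items():
    for i in range(n + 1):
        for j in range(n + 1):
            left = constit.get((y, i, j))
            if left is not None:
                temp[(x, z, i, j)] = temp.get((x, z, i, j), 0.0) + rw * left

# constit(X,I,K) += temp(X,Z,I,J)*constit(Z,J,K) then sums out Z and J.
binarized = {}
for (x, z, i, j), t in temp.items():
    for k in range(n + 1):
        right = constit.get((z, j, k))
        if right is not None:
            binarized[(x, i, k)] = binarized.get((x, i, k), 0.0) + t * right

assert set(direct) == set(binarized)
assert all(abs(direct[k] - binarized[k]) < 1e-9 for k in direct)
```

The speedup comes from never forming a product that mentions Y, Z, and J at once: each step multiplies only two factors, so the loops never range over all five index variables simultaneously.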


## The if ever syntax

Note also that an if ever construction such as

foo(X) += bar(X) if ever edge(X,Y) != 0.


can be regarded as syntactic sugar for an anonymous accumulation that performs the existential quantification

foo(X) += bar(X) if (|= edge(X,Y) != 0).


This in turn is syntactic sugar for

:- item(temp, bool, false).
foo(X) += bar(X) if temp(X).
temp(X) |= edge(X,Y) != 0.
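
A small Python sketch of the desugared program (with hypothetical edge and bar values) shows the intended semantics: temp(X) is an existential "or" over Y, and foo(X) fires only where some nonzero edge exists:

```python
edge = {("a", "b"): 0, ("a", "c"): 7, ("b", "c"): 0}   # hypothetical values
bar = {"a": 1.5, "b": 2.5}

# temp(X) |= edge(X,Y) != 0 : boolean "or" over Y, starting from false
temp = {}
for (x, y), val in edge.items():
    temp[x] = temp.get(x, False) or (val != 0)

# foo(X) += bar(X) if temp(X)
foo = {x: v for x, v in bar.items() if temp.get(x, False)}
```

Here temp("a") is true because edge("a","c") is nonzero, so only foo("a") is derived.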


## Evaluated subterms

This part of the language is still being worked out.

In future, Dyna will probably let you write expressions something like these:

foo(=2+3)              to mean foo(5)
foo(=bar(nil))         to mean foo(5), if bar(nil) has value 5
foo(=complex("3+4i"))  to mean foo(3+4i), where 3+4i is a constructed primitive term
word(W,I,=I+1)         in an inference rule
baz(P,=baz(P,Q))       in an inference rule

A subterm preceded by = is evaluated in place.

### Evaluated subterms as syntactic sugar

In fact, the above examples are just syntactic sugar for the semi-expressions

foo(X) whenever X is 2+3.
foo(X) whenever X is bar(nil).
foo(X) whenever X is complex("3+4i").
word(W,I,X) whenever X is I+1.
baz(P,X) whenever X is baz(P,Q).  % note universal quantification over Q


Because these semi-expressions contain variables, they are really patterns, not expressions, and can only be used in queries and inference rules. In particular, foo(=bar(nil)) is not a term and cannot be constructed in C++. (The argument of a foo term would have to be an integer, in our example, not a variable like X or an integer-valued expression like bar(nil).)

Given the inference rules

goal += foo(=bar(baz)).
goal += word(W,I,=I+1).
goal += baz(P,=baz(P,Q)).


the antecedents of goal would be matches to the above patterns:

foo(5) whenever 5 is bar(baz).
word(np,5,6) whenever 6 is 5+1.
baz(a,c) whenever c is baz(a,b).


These semi-expressions are slightly clumsy, but they provide the necessary information, especially if analyzed in C++ with pattern-match accessors (see pattern).

### The is operator

The is operator in the semi-expressions is adapted from Prolog. α is β is true if the term α (usually a primitive) is the value of the expression β.

(By contrast, α = β is true if α and β are equal terms, and α == β is true if α and β are expressions with equal values. All three have boolean values, but α = β is properly regarded as a foreign axiom because it does not evaluate its subterms α and β, α == β as a computed expression because it does evaluate them, and α is β as a piece of special Dyna syntax.)


### Evaluation of subterms versus non-subterms

Note that the antecedent expression in an inference rule is always evaluated. Thus, if the value of a is the term b*c, and the value of b*c is 6, then the following are different:

goal = a.        % goal is the actual term b*c
goal = =a.       % goal is 6
goal = ==a.      % goal is 6 (since the value of 6 is itself)

goal = foo(a).   % goal is the value of the item foo(a)
goal = foo(=a).  % goal is the value of the item foo(b*c)
goal = foo(==a). % goal is the value of the item foo(6)


The last example expands to

goal = foo(=X) whenever X is a.


which expands further to

goal = foo(Y) whenever Y is X whenever X is a.


or more concisely,

goal = foo(Y) whenever (Y is X) & (X is a).
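
The different levels of evaluation above can be simulated in Python, assuming a hypothetical value table that maps each term to its value (terms are encoded as tuples here purely for illustration):

```python
# Hypothetical value table: the value of a is the term b*c,
# and the value of b*c is 6, as in the example above.
value = {"a": ("*", "b", "c"),
         ("*", "b", "c"): 6}

def ev(term):
    """Evaluate a term once; primitives (here, ints) evaluate to themselves."""
    return term if isinstance(term, int) else value[term]

goal1 = ev("a")          # goal = a   : antecedents are always evaluated once,
                         #              yielding the actual term b*c
goal2 = ev(ev("a"))      # goal = =a  : one extra evaluation yields 6
goal3 = ev(ev(ev("a")))  # goal = ==a : still 6, since 6 evaluates to itself
```

Each = in the rule adds one call to ev on top of the single evaluation every antecedent expression gets for free.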


### Automatic evaluation of subterms

It might be convenient to allow the = to be omitted in some or all circumstances, allowing syntax like word(W,I,I+1).

However, this must be done carefully since it is often useful to write terms such as constit(s/np,3,5) where s/np is simply a structured subterm (akin to slash(s,np)) rather than a request to divide the value of s by the value of np.

Various conventions could be adopted. Why should I+1 but not s/np be evaluated?

• Perhaps because the user has declared that + subterms (or int+int subterms) should be automatically evaluated. (Maybe it's even enough to explicitly declare them to be computed expressions; note that common operators like + and / will be implicitly declared.)
• Perhaps because the user has declared that / subterms (or nonterm/nonterm subterms) should not be automatically evaluated. (Maybe it's even enough to explicitly declare them to be structures; note that such declarations would otherwise be inferred.)
• Perhaps because I+1 has a value whereas s/np doesn't (either because s and np don't, or because / isn't defined to return a value when it takes arguments of their types). In other words, all computed expressions should be evaluated, even as subterms (unless somehow protected, e.g., by a non-traditional quote operator, or by having also been declared as structures, or because their context says not to evaluate them).
• An even stronger version of the previous option would be to evaluate all expressions by default, not just computed expressions but also items. But this seems to interfere with convenient constructions like priority(constit(...)) and gradient(constit(...)).
• Perhaps because the context of this instance of I+1 allows an int but not a +. (This is a bit dangerous, though, and would also make type inference difficult.)

Some of the issues in designing this include

• whether evaluation of non-void terms is opt-in or opt-out
• whether the decision of whether to evaluate is up to the term or the context in which it appears
• concrete syntax for declarations and for the quote operator that protects against evaluation
• ensuring that foo(bar(3)) means the same thing in Dyna as in C++

#### Possible approach

At present, I think the leading proposal -- if we do auto-evaluation at all -- is to do it aggressively as follows.

• Ordinary evaluation:
• Evaluate all terms that have values (as in C).
• Never evaluate void terms (as in Prolog; this allows nesting). Or conceivably, just get this effect by adopting a convention where void terms evaluate to themselves.
• Pass by reference: One can declare contexts that suppress evaluation. foo(int X, literal term Y) or perhaps foo(int X, quoted term Y) will suppress evaluation of foo's second argument: it is supposed to be a term, not something that might evaluate to a term. Not sure whether type inference should try to infer this kind of declaration.
• Example: int s(literal peano Predecessor). Now s(s(s(s(z)))) can evaluate to 4 wherever it's used (thanks to rules like s(X) = X+1), but each s protects the Peano number inside it from evaluation.
• Example: double priority(literal term) attaches priorities to terms, not their values.
• Example: X==foo(Y) evaluates its arguments (value equality), while X~~foo(Y) does not (term equality) and is equivalent to quote(X)==quote(foo(Y)). (One can mix them, e.g., X==quote(foo(Y)).)
• Quote: We can quote terms explicitly to prevent their evaluation, e.g., foo(quote(s(s(s(s(z)))))). Formally, we'd define the polymorphic function foreign T quote(quoted T) to be the identity. Notice that quoting a void term has no effect (so it is always safe to quote if you're not sure whether something is void).
• Quoting only goes down one level, I think, so quote(foo(2+2)) would be equivalent to quote(foo(4)). This is different from Lisp and avoids the need for the backquote operator, I think. If you want to quote at multiple levels, do it explicitly: quote(foo(quote(2+2))).
• Variable substitution is not affected by quoting, so quote(foo(Y)) ranges over the terms foo(1), foo(2), etc.
• An aggregation operator can be declared to suppress evaluation or not. They typically suppress evaluation of the consequent, but generally allow the antecedent to evaluate if it's non-void. Unevaluated antecedents could be useful for aggregation operators on terms, notably cons= or whatever we call adding an element to a set.
• Eval: We can eval terms explicitly to force their evaluation, e.g., =favoriteitem to get the value of the term stored by favoriteitem. Not clear how this should work where evaluation is suppressed: should priority(=favoriteitem) set the priority of favoriteitem, or -- more consistently but maybe less usefully -- the priority of a new term eval(favoriteitem)?
• Question: What do the instantiated antecedent expressions of goal += foo(X*Y) if bar(X,Y) look like?
• Disadvantage: When reading a program, it's not immediately obvious which terms are evaluated and which aren't. You have to know who is void and who suppresses evaluation. A smart syntax-coloring editor might help by colorizing the functors (and parens, if not infix) of unevaluated terms.
• Disadvantage: Presumably this convention introduces an unfortunate discrepancy between the concrete syntax on the Dyna side and the C++ side.
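
The one-level quoting convention proposed above can be sketched in Python, with terms encoded as tuples (an illustrative encoding, not Dyna's):

```python
def evaluate(t):
    """Sketch of the proposed one-level quote: ints are values, '+' computes,
    and 'quote' keeps its argument's top-level functor while still evaluating
    the argument's subterms -- so quoting goes down only one level."""
    if isinstance(t, int):
        return t
    op, *args = t
    if op == "+":
        return evaluate(args[0]) + evaluate(args[1])
    if op == "quote":
        inner = args[0]
        if isinstance(inner, int):
            return inner                 # quoting a primitive has no effect
        f, *sub = inner
        return (f,) + tuple(evaluate(s) for s in sub)
    # any other functor just builds structure from its evaluated subterms
    return (op,) + tuple(evaluate(s) for s in args)

# quote(foo(2+2)) is equivalent to quote(foo(4)):
assert evaluate(("quote", ("foo", ("+", 2, 2)))) == ("foo", 4)
# quoting explicitly at both levels protects the inner term as well:
assert evaluate(("quote", ("foo", ("quote", ("+", 2, 2))))) == ("foo", ("+", 2, 2))
```
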

Cool idea: Possibly we should steal C++ punctuation. We can think of terms as pointers to values; they just happen to have lots more structure than a 32-bit sequence.

• &s(s(z)) for quoting (rather than quote(s(s(z))), or 's(s(z)), which we can't parse because ' is already used to quote functors)
• *favoriteitem for eval(favoriteitem) (rather than =favoriteitem)
• priority(term &) to suppress evaluation of the argument (rather than priority(quoted term)); after all, this is just pass-by-reference
• double *derivative(double *Expression) to define a function on double-valued terms (rather than double =derivative(double =Expression))
• Warning: C++ notation in Dyna could be really confusing if the analogy proves faulty!

Maybe this C++-like notation is the secret to getting the C++ concrete syntax to work as in Dyna?

• Note that operator& and operator* can be overloaded.
• Given the declaration int foo(string), the C++ expression &foo("arg") would have to return some kind of term -- maybe a term<int>, analogous to an int *. (Presumably this would be a subtype of term<term>, which is the highest-level class).
• foo("arg") itself would have to return int (or more plausibly, something that implicitly casts to int; it would also have to support operator& and -- if foo is an axiom type -- be usable as an lvalue).
• We'd probably want to drop the use of a special chart object and just use terms directly as both lvalues (in the case of axioms) and rvalues. But the c[foo(bar)] notation could still be used as syntactic sugar, equivalent to chart(c,foo(bar)) where foo(bar) is not evaluated. The question is how bar is evaluated: arguably c[foo(bar)] should use c[bar] as the argument to foo, since the whole expression is to be evaluated in the context of the chart; I'm not sure whether we can pull that off.