lfg user interface thoughts

[Avery.Andrews 940922.1519]

I've been doing some thinking about how to redo my LFG parser
interface, and attempting to apply PCT. Though I can't be sure about the
extent to which the results are PCT, common sense, or nonsense,
here's what I've come up with so far. Presumably the idea of a good
UI is to make as simple as possible the relationship between
(a) the difference between what the user wants and what they've got
(b) what they've got to do to reduce this difference. On this account,
my current interfaces are pretty bad, because what the user is doing
most of the time is trying to find a grammar that gives an `acceptable'
structure to a string (acceptable in that the structure doesn't
make silly claims about meaning and constituency). But these UI's
require the user to cycle in a rather clumsy way between 'edit grammar'
and 'parse sentence' modes, which don't mesh very well with what I
conjecture to be the basic structure of the activity, a control-system
trying to diddle `this-grammar' so that it will give a decent result
for `this-sentence' (Chapman and Agre-style 'indexed representations').

So there should simultaneously be on the screen one window with the
sentence to be parsed, and another open on the grammar, and a
parse-button, so that clicking the parse-button recompiles the grammar
if it has been modified since the last parse, and then parses the
sentence.
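
A minimal sketch of the parse-button logic, assuming the grammar lives
in an editable text buffer. The names ParseSession, edit_grammar and
press_parse are made up for illustration, and compile_grammar and parse
are stand-ins for whatever the real parser provides:

```python
class ParseSession:
    """Holds the grammar text and recompiles it only when it has changed."""

    def __init__(self, compile_grammar, parse):
        # compile_grammar(text) -> compiled grammar; parse(grammar, sentence) -> result.
        # Both are hypothetical callables supplied by the parser back end.
        self._compile = compile_grammar
        self._parse = parse
        self.grammar_text = ""
        self._compiled = None
        self._dirty = True        # nothing has been compiled yet
        self.compile_count = 0    # kept only so the behaviour can be observed

    def edit_grammar(self, new_text):
        """Called by the grammar window whenever the user changes the text."""
        if new_text != self.grammar_text:
            self.grammar_text = new_text
            self._dirty = True

    def press_parse(self, sentence):
        """The parse-button: recompile if needed, then parse the sentence."""
        if self._dirty:
            self._compiled = self._compile(self.grammar_text)
            self._dirty = False
            self.compile_count += 1
        return self._parse(self._compiled, sentence)
```

The point of the dirty flag is that the user never has to think about
an 'edit grammar' mode versus a 'parse sentence' mode; pressing the
button always parses against the grammar as it currently appears.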

Furthermore the sentence being parsed is typically one of a number of
sentences, so the window containing this sentence should be a view on
a file containing lots of sentences. There should be another button
that expands the sentence window to show more sentences at a time,
and a select-sentence button that will contract the view to the sentence
that the caret is in (one button with caption toggling between
'view sentences' and 'select sentence' might be the way to go).
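
The 'select sentence' step could be sketched as follows, on the
assumption (not stated above) that the sentence file holds one sentence
per line and that the caret is a character offset into the file text.
The function name sentence_at_caret is hypothetical:

```python
def sentence_at_caret(text, caret):
    """Return (start, end, sentence) for the line that contains the caret.

    Assumes one sentence per line; start and end are character offsets
    into text, with end pointing at the newline (or the end of the text).
    """
    caret = max(0, min(caret, len(text)))      # clamp stray offsets
    start = text.rfind("\n", 0, caret) + 1     # 0 when no newline precedes the caret
    end = text.find("\n", caret)
    if end == -1:                              # caret is on the last line
        end = len(text)
    return start, end, text[start:end]
```

Contracting the view then amounts to showing only text[start:end], and
expanding it again means showing the whole file with the caret left
where it was.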

A fundamentally more difficult aspect of the problem is
grammar-debugging. Grammars fail in two ways: (a) providing structures that
should be blocked, and (b) failing to provide structures that ought to be provided.
Failure type (a) is probably beyond the powers of a UI to help with,
but perhaps some help can be provided for type (b). What my present
system does is just say 'no structures found', but this isn't very
helpful because it's all-or-nothing. What we need is a structured
space of possible results for the parse, one that gives some
clues about what direction to go in to make things better.

One perhaps useful idea would be to provide a table of 'maximal partial
analyses', that is, the different ways of breaking the string down
into constituents that aren't incorporatable into any larger
constituent. Suppose the sentence is:

  the man from Queensland's coastline walked

and the user has forgotten to provide an S -> NP VP rule.
Then the chart would look something like this:

             NP:2 | VP
  NP | PP |
the man from Queensland's coastline walked

where the NP:2 in the top row indicates that the first 5 words constitute an
NP, which is two-ways ambiguous, and the bottom row indicates a second
analysis that doesn't group 'the man from Queensland's coastline' under
a single NP node. Parts of speech for the words are another problem:
perhaps listing them underneath the line would be useful.

The idea is that by leaving out of the chart information about sequences
of constituents that do get integrated into a larger constituent, the
user will have their attention focussed on the structures that indicate
how close they are getting to having the whole sentence parsed (pretty
close, in the example above).
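
One possible reading of 'maximal partial analyses' can be sketched in
code: list every way of tiling the string with chart edges such that no
run of adjacent edges is the daughter sequence of some larger edge.
This is only one interpretation of the idea, and everything named here
(Edge, irreducible_rows, the chart itself) is an assumption made for
the sake of the example. The toy chart is hand-built, leaves the PP and
VP unanalysed, and records only one of the two NP readings:

```python
from collections import namedtuple

# A chart edge: category label plus the word positions it spans.
Edge = namedtuple("Edge", "label start end")


def irreducible_rows(derivations, leaves, n_words):
    """Return every left-to-right tiling of words 0..n_words by chart edges
    in which no run of adjacent edges is a recorded daughter sequence."""
    reducible = {tuple(seq) for seqs in derivations.values() for seq in seqs}
    by_start = {}
    for edge in set(derivations) | set(leaves):
        by_start.setdefault(edge.start, []).append(edge)

    rows = []

    def extend(row, pos):
        if pos == n_words:
            rows.append(row)
            return
        for edge in sorted(by_start.get(pos, [])):
            new_row = row + [edge]
            # Only suffixes need checking: earlier runs were checked already.
            if not any(tuple(new_row[i:]) in reducible
                       for i in range(len(new_row))):
                extend(new_row, edge.end)

    extend([], 0)
    return rows


# Toy chart for "the man from Queensland's coastline walked", no S rule.
det = Edge("Det", 0, 1)
nom_man = Edge("NOM", 1, 2)
np_the_man = Edge("NP", 0, 2)
pp = Edge("PP", 2, 5)
nom_long = Edge("NOM", 1, 5)
np_long = Edge("NP", 0, 5)
vp = Edge("VP", 5, 6)

# Each built edge maps to the daughter sequences that produced it.
derivations = {
    np_the_man: [(det, nom_man)],
    nom_long: [(nom_man, pp)],
    np_long: [(det, nom_long)],
}
leaves = [det, nom_man, pp, vp]   # edges with no recorded daughters

for row in irreducible_rows(derivations, leaves, 6):
    print(" | ".join(edge.label for edge in row))
```

On this toy chart the two rows printed are NP | PP | VP and NP | VP,
matching the picture above. The search enumerates tilings exhaustively,
which is fine for a sketch but would need pruning on real charts, and
with a fuller chart this criterion may well admit more rows than the
picture shows.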

There are several things that the user will want to do with the chart,
including (a) inspecting the structure(s) ascribed to a given
substring, (b) adding to the chart the 'expansion' of a given
constituent. For example, expanding the NP in the above chart might
yield

             NP:2 | VP
Det> NOM |
          NP | NOM |
the man from Queensland's coastline walked

(c) trying to find out why a specific sequence of subconstituents
doesn't itself make up a constituent of a given type. For example,
why the NP followed by VP above doesn't constitute an S (in this
case, putatively, because the user forgot to provide a suitable
S rule).
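
Operation (b) might be sketched like this, assuming each chart edge
records the daughter sequences that built it. The names Edge and expand
are hypothetical, and the word spans given to the daughters in the
example are guesses at what the two NP readings would look like:

```python
from collections import namedtuple

# A chart edge: category label plus the word positions it spans.
Edge = namedtuple("Edge", "label start end")


def expand(row, index, derivations):
    """Return one new row per recorded derivation of row[index],
    with that edge replaced by its daughters."""
    target = row[index]
    return [row[:index] + list(daughters) + row[index + 1:]
            for daughters in derivations.get(target, [])]


np_whole = Edge("NP", 0, 5)
vp = Edge("VP", 5, 6)

# Two ways of building the ambiguous NP (spans are illustrative only).
derivations = {
    np_whole: [
        (Edge("Det", 0, 1), Edge("NOM", 1, 5)),
        (Edge("NP", 0, 4), Edge("NOM", 4, 5)),
    ],
}

for new_row in expand([np_whole, vp], 0, derivations):
    print(" | ".join(edge.label for edge in new_row))
```

An edge with no recorded derivations (a word, say) simply yields no new
rows, so the interface can grey out the 'expand' action for it.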

Since current linguistic theories work by imposing many different
kinds of constraints simultaneously, a possible approach is to
disable, or 'relax' some kind of constraint. The results of this
are often hard to make sense of, however, since relaxing a constraint
lets in floods of structures for the constituents that are working,
as well as the one that's problematic. Restricting
constraint-relaxation to a single sequence of constituents might
ameliorate this problem. E.g. when we relax all the constraints except
the PS rules, and still can't get the NP and the VP to form an S, it
might be obvious that the problem is with the PS rules. If relaxing
agreement gives us a structure, then the problem will be with agreement.
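
A rough sketch of relaxation restricted to one sequence of
constituents, assuming constraints can be modelled as independent named
checks over the daughters. Real LFG constraints interact far more than
this, so the code shows only the shape of the diagnosis. The names
combines and diagnose, and the dictionary representation of
constituents, are invented for the example:

```python
def combines(daughters, mother, ps_rules, constraints, relaxed=()):
    """True if the daughters form a `mother` under the PS rules and
    every constraint that is not named in `relaxed`."""
    labels = tuple(d["label"] for d in daughters)
    if (mother, labels) not in ps_rules:
        return False
    return all(check(daughters)
               for name, check in constraints.items()
               if name not in relaxed)


def diagnose(daughters, mother, ps_rules, constraints):
    """Return None if the sequence already combines; ["PS rules"] if it
    fails with everything else relaxed; otherwise the constraints whose
    relaxation on its own lets the sequence through."""
    if combines(daughters, mother, ps_rules, constraints):
        return None
    if not combines(daughters, mother, ps_rules, constraints,
                    relaxed=tuple(constraints)):
        return ["PS rules"]
    return [name for name in constraints
            if combines(daughters, mother, ps_rules, constraints,
                        relaxed=(name,))]


# Hypothetical constituents carrying a number feature.
np = {"label": "NP", "num": "sg"}
vp = {"label": "VP", "num": "pl"}
constraints = {"agreement": lambda ds: ds[0]["num"] == ds[1]["num"]}

print(diagnose([np, vp], "S", set(), constraints))
print(diagnose([np, vp], "S", {("S", ("NP", "VP"))}, constraints))
```

Because the relaxation applies only to the chosen NP VP sequence, the
rest of the chart is untouched and no flood of extra structures
appears. An empty list from diagnose would mean that no single
relaxation is enough, i.e. more than one constraint is failing.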

Well, that's as far as I've gotten. B.t.w., to actually code these
ideas up into an interface would be a pretty monstrous task, I think,
so don't hold your breath waiting for me to do it ...

and apologies for my previous un-timestamped posting (on
supervenience), which was supposed to be a reply to (Bill Powers
(940917.1640 MDT))

Avery.Andrews@anu.edu.au