new summary re language

[From: Bruce Nevin (Tue 930309 12:17:00, Thu 930311 12:08:14)]

In our discussions of language we have had the proposal (from
Bill) that a word-perception be one of the inputs satisfying the
input function for a category-level comparator.

To the extent that (most) words are to be identified with
category perceptions, analysis of structure in discourses should
tell us something about the organization of the category level.
The advantage of this is that words and word dependencies are
much easier to manage and analyze than nonverbal perceptions, and
it is much easier to reach agreement about results. This is
partly because words are by "design" public, and nonverbal
perceptions are ineluctably private. I believe it is also
because dependencies among words are much better defined, but
a comparison with the aim of demonstrating this is virtually
impossible so long as we have only the words with which the
nonverbal perceptions are associated as our means for reaching
agreement about those nonverbal perceptions. To me, that very
limitation is a pretty convincing demonstration in itself.

In what follows, I will rely heavily on a notion of dependency
between words. The saying of one word ("jump") is dependent upon
the saying of one or more other words in the same utterance
("child" or "fish", for example). I will clarify this notion
presently, after some preliminary definitions. (See the first
indented paragraph within square brackets, below.)

Analysis of an utterance yields a structure of dependencies among
the particular words in it.

Words are classified according to these dependencies across many
(by predictive claim, all) utterances of a language.

Def: Operator words (operators) are words the saying of which is
  dependent upon the saying of other words. Symbol: O.

Def: The words upon which an operator depends in a given
  utterance are (in) its argument, or are its argument words
  (arguments).

Def: Words with no such dependency are primitive arguments --
  mostly what we think of as concrete nouns. Symbol: N.

Operators are further classified according to the above
dependency classes of words in their argument. All that is
needed is this simple dependency on dependency. Thus, On depends
on one N (jump), Ono depends on one N and one O in that order
(believe), Oon on the same argument word classes in the reverse
order (surprise), and so on.

In addition to the operator and argument words, there are other
morphemes that correlate with no category perception, but rather
help to identify the dependency relations, morphemes like `-ing',
`to' of the infinitive, `that' in e.g. `I think that you
understand me'.

Words may be present in zero form if their presence is so
strongly expectable in a given position relative to other words
in the utterance that they need not actually be said in that
position: John plays piano and Mary [plays] violin. Or
expectable words may be given reduced form (e.g. pronouns).
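
As an illustration of zeroing, a toy sketch in Python under the
deliberately crude assumption (mine) that conjoined clauses are
parallel word sequences; it brackets the zeroed word rather than
deleting it, following the notation above.

def zero_repeats(conjuncts):
    """Bracket any word in a later conjunct that repeats the word in
    the same position of the first conjunct (it is expectable there,
    so it need not actually be said)."""
    first = conjuncts[0]
    reduced = [first]
    for clause in conjuncts[1:]:
        reduced.append(["[%s]" % w if i < len(first) and w == first[i]
                        else w for i, w in enumerate(clause)])
    return reduced

for clause in zero_repeats([["John", "plays", "piano"],
                            ["Mary", "plays", "violin"]]):
    print(" ".join(clause))
# John plays piano
# Mary [plays] violin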

  [What is meant by dependency can be seen intuitively in
  perceptual terms. Perceptions that I imagine in association
  with the On operator "jump" necessarily involve something or
  someone (indicated by some word of class N) jumping.
  Perceptions that I imagine in association with "child" do not
  entail anything in particular about the child that I
  imagine--no word of some O subclass is required. However, the
  notion of dependency used here has its basis in the occurrence
  or occurrability of words in utterances. It may be that the
  latter reflects merely the former, but the privileges of
  occurrence of words are enormously easier to manage and analyze
  (in a way that enables agreement among investigators) than
  those of perceptions. Furthermore, it is not clear to what
  extent words and discourses, and agreements reached and
  communicated by means of them, may determine which perceptual
  inputs humans pass along for higher-level perceptual control,
  which they supply by imagination, and which they ignore; it
  seems wise not to beg the question.]

Given these preliminaries, I will summarize proposals I made in
1969-1970 (in my MA thesis at Penn).

At the first stage of analysis, the words in utterances are in
linear sequence (some of them perhaps having their phonemic
content reduced, even to zero, as noted).

a child
        \
           jumped
                  \
                   > over
                  /
            a book

A child jumped over a book. This is reduced from "A child
jumped; said jumping is over a book." The past tense is reduced
from "A child jumps; said jumping is before my saying this."
I'll ignore these reductions for now. It's easier to draw the
dependency graph at the next stage of analysis, undoing the
linearization step of producing sentences.
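
The same structure can be written down directly. A minimal sketch
(the nested-pair representation is a hypothetical choice of mine):
each node is (word, list of argument nodes), with the primitive
arguments (N words) at the leaves.

# "over" (Oon) takes the operator "jump" and the N word "book";
# "jump" (On) takes the N word "child".
tree = ("over", [("jump", [("child", [])]),
                 ("book", [])])

def leaves(node):
    """Collect the primitive arguments at the leaves."""
    word, args = node
    if not args:
        return [word]
    return [w for a in args for w in leaves(a)]

print(leaves(tree))   # ['child', 'book']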

At a further stage of analysis, the words/category perceptions
are not in a particular linear order. (Alternative
linearizations as sentences are possible.)

  [Where (the saying of) an operator word is dependent upon (the
  saying of) two or more argument words, it is convenient for us
  to speak of the argument words as the first argument, second
  argument, etc., based upon the linearization felt to be most
  basic or most common, but the means for identifying and
  differentiating the words required in the argument of a given
  operator must be stated in other than sequential terms. I don't have
  a proposal at hand. Case grammar is an obvious candidate, but
  gets notoriously messy and fraught with disagreements between
  investigators. The simplest solution is to use the most basic
  or prevalent linear order as means for distinguishing multiple
  arguments prior to the linearization step. This runs into
  problems as dependencies in discourse pile up, and even in a
  single sentence such as the present example.]

child
      \
       jump --------\
            \        > over
             \      /
              \  book
               \
                > before
               /
            say
           /
          I

Where at the earlier stage we often had repetitions of a word,
with the word-occurrences identified as "the same" (and often
one occurrence reduced to zero on that account), at this stage a
single word/category perception signal may satisfy the argument
requirement under both operators (for both occurrences).

Q: do we postulate any mechanism in PCT with the plasticity
   required for this?

We no longer have a dependency tree for each sentence (a
rooted, directed graph with one or a few branches at each node
and primitive argument words at the leaves). For sentences with
modifiers or conjunctions we have
something like a semilattice structure.
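
A minimal sketch of the difference, in the same hypothetical
representation: mapping each operator to its argument words, the
single occurrence of "jump" is an argument of both "over" and
"before", so the structure is a directed acyclic graph rather
than a tree.

from collections import Counter

deps = {
    "jump":   ["child"],
    "over":   ["jump", "book"],
    "say":    ["I"],
    "before": ["jump", "say"],   # "jump" is shared with "over"
}

# In a tree each word is the argument of at most one operator;
# here "jump" serves two, which breaks the tree property.
uses = Counter(arg for args in deps.values() for arg in args)
print(uses["jump"])   # 2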

Continuing further with the analysis, we reach a mesh-like
structure for a discourse. Central, topical, or thematic words
have more dependencies than other words/categories occurring in
the discourse.

Discourses in the same subject-matter domain use the same
vocabulary, and they use it in the same way. Put in other terms,
the "same" word may satisfy the input of different category
perceptions, depending upon the subject-matter domain of the
discourse and other perceptual context in which it occurs.
"Wear" means one thing to a tailor and another to a sailor (in
context of their respective specializations). One aspect of this
is seen in the classifier vocabulary for a given domain. To take
an example from work done over the past 20 years, in a physiology
domain, "heart" is classified as a "body part". In a
pharmacology domain, "heart" is not a primitive term but occurs
only in phrases, and it is these phrases which are then treated
as primitives for the domain and classified, as for example, "the
beating of the heart" is classified as a "symptom". (In this
way, a study of these matters can disclose, in the structures of
linguistic information found in subfield discourses, a basis for
logical precedence and other relations among different scientific
subfields.)
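
A toy sketch of domain-relative classification; the entries follow
the physiology/pharmacology example above, but the representation
and names are hypothetical choices of mine.

classifiers = {
    "physiology":   {"heart": "body part"},
    "pharmacology": {"the beating of the heart": "symptom"},
}

def classify(term, domain):
    """Return the classifier a term satisfies in a domain, if any."""
    return classifiers.get(domain, {}).get(term)

print(classify("heart", "physiology"))                       # body part
print(classify("heart", "pharmacology"))                     # None: not primitive
print(classify("the beating of the heart", "pharmacology"))  # symptom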

One subject-matter domain may be distinguished from another by
vocabulary shared across utterances in the subject matter, where
the words satisfy the classifier vocabulary for the domain in a
relatively small number of dependency relations specifiable among
the classifier words. The shared vocabulary, recurrent
dependencies, and the classifier vocabulary appear in the mesh of
word dependencies in memory on the basis of discourses previously
encountered. (Again: for "word dependencies" read perceptions of
dependencies between words and between associated category
perceptions.)

An example of classifier terminology for a domain:

"Mail should not be thought of as an application, it's an
  enabling technology." --Eugene Lee, director of product
  planning for Email software vendor Beyond, quoted in Byte,
  3/93:90.

Some software packages are called applications. Some things
(including but not limited to software packages) are called
enabling technologies. Perceptions associated with the software
packages called applications, especially those generalizable from
the set of them with which a person is familiar, come to be
treated as expectations about some new thing called an
application. Similarly with the different perceptions associated
with things called enabling technologies and generalizable from
the known set of them. Mr. Lee tells us we ought to have the
latter expectations of electronic mail, rather than the former.
(Um--give us some examples of enabling technologies, please?)

Word dependencies are clearly also dependencies between category
perceptions. I talk about them in terms of word relations
because the words are for us a principal way of getting at the
categories; and ex hypothesi the two are equivalent for our
purposes (see ref to Bill's proposal, above).

Now, if I have encountered a given operator-argument dependency
o-a in a conversation or text, I have a basis in memory for
finding that dependency acceptable in subsequent discourse. (I
might have a basis in memory of nonverbal perceptual input for
expecting the dependency between category perceptions associated
with o and a respectively. Or I might not--I might only have
imagined perceptual signals generalized from various experiences
categorized as o and as a in the past, perhaps only vaguely, just
enough to "make sense" of the utterance. I might have only a
fuzzy notion of what an "enabling technology" is, or of what
email is.) If o satisfies classifier word O and a satisfies
classifier word A in the given subject-matter domain (in the mesh
of word dependencies [perceptions of dependencies between words
and between associated category perceptions] in memory on the
basis of prior discourses), and if I have encountered other
satisfiers of O and A (or even O and A themselves) in the given
dependency relation, then I have a strong expectation that some
(to me) novel o-a dependency is also acceptable.

A lot of learning a subject matter is learning its classifier
vocabulary. As mentioned, relatively few dependencies among
classifier words recur with great regularity in discourses of a
given domain (Harris et al., _The Form of Information in Science_;
Naomi Sager, _Natural Language Processing_). For nautical
discourse, we might find the following sorts of dependencies
(among others):

        VESSEL      MANEUVERS      IN DIRECTION

        ketch       wears away     around starboard tack
        yawl        bends          to port
        schooner    comes about    into the wind

On this basis, we would find acceptable something like "the
schooner wore away to port", but "the starboard wore away around
the yawl" is absolutely unacceptable -- nonsensical -- as
nautical usage. (Though a person ignorant of domain norms might
well find this perfectly sayable and perfectly understandable --
say, the starboard side of the yawl rubbed against the pier
during winter
storage, and was eroded away, around the curve of the vessel on
that side.) Similarly, "the [rubbing of the] rope wore away the
paint" is acceptable not as specifically nautical usage but on
the basis of dependencies in a more basic subject matter, perhaps
mechanics, even though it may occur embedded in a nautical
discussion.
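
The nautical judgments can be reproduced by a minimal sketch of
the classifier-based expectation described earlier; the word
lists and names are a hypothetical rendering of the table above.

classifier = {
    "wears away": "MANEUVER", "bends": "MANEUVER",
    "comes about": "MANEUVER",
    "ketch": "VESSEL", "yawl": "VESSEL", "schooner": "VESSEL",
    "starboard": "DIRECTION-WORD",
}

# o-a dependencies encountered in prior discourse of the domain:
observed = {("wears away", "ketch"), ("bends", "yawl"),
            ("comes about", "schooner")}
observed_classes = {(classifier[o], classifier[a]) for o, a in observed}

def expected(o, a):
    """A novel o-a dependency is expected to be acceptable if its
    classifier pair has been encountered in the domain."""
    return (classifier.get(o), classifier.get(a)) in observed_classes

print(expected("wears away", "schooner"))   # True: novel pair, familiar classes
print(expected("wears away", "starboard"))  # False: not MANEUVER-VESSEL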

As we learn the word dependencies for a domain, we learn to match
our nonverbal perceptions to the mesh of dependencies remembered
from prior discourse. We must do so for the words to "make
sense". It is also true, though perhaps not so obvious, that we
must do so for the nonverbal perceptions to "make sense", in
particular for them to make the same sense to us as they do to
those with whom we seek to cooperate and from whom we seek to
learn. We must replace imagined perception with memory of actual
perception, and so we become experienced and expert in the
domain. But while learning (and perhaps while supposedly expert)
we may prefer imagined perceptions that match domain norms shared
with others over actual but discrepant perceptions, if control
for the perception of agreement and cooperation has higher gain
than control for discrepancies between imagined perception and
actual perception, however that is managed.

And now once more beyond the pale: in learning to be people with
those around us we learn to discount and ignore some perceptions,
or to cover or reinterpret them with the aid of imagined
perceptions. This is what has happened with so-called "psychic"
perceptions, which have for many centuries been the occasion of
painful and shameful death, and which apparently not all people
access equally (variability analogous to color blindness).

        Bruce Nevin
        bn@bbn.com