Selection (Resnik's Thesis)

I've only started to look at this, but it does seem very interesting.

Phil Resnik was here at BBN but I didn't know him.

Steve Johnson is the one who implemented Harris' operator grammar in
Prolog for his NYU thesis. He is currently in medical informatics at
Columbia Presbyterian.

  Bruce

------- Forwarded Message

Received: from BBN.COM by LAGAVULIN.BBN.COM id aa07615; 8 Dec 93 19:16 EST
Received: from cucis.cis.columbia.edu by BBN.COM id aa04154; 8 Dec 93 20:26 EST
Received: by cucis.cis.columbia.edu (AIX 3.2/UCB 5.64/4.04)
          id AA56744; Wed, 8 Dec 1993 20:26:29 -0500

···

Date: Wed, 8 Dec 1993 20:26:29 -0500
From: Stephen Johnson <johnson@cucis.cis.columbia.edu>
Message-Id: <9312090126.AA56744@cucis.cis.columbia.edu>
To: bnevin@BBN.COM
Subject: looks interesting...

Delivery-Date: Mon, 06 Dec 93 20:30:22 -0500
From: Philip.Resnik@east.sun.com (Philip Resnik - Sun Microsystems Labs
      BOS) (by way of yarowsky@unagi.cis.upenn.edu (David Yarowsky))
Subject: Philip Resnik's Dissertation Available
To: empiricists@CSLI.Stanford.EDU
Date: Mon, 06 Dec 1993 13:54:39 -0800
Sender: roscheis@CSLI.Stanford.EDU

Hello,

Some readers of this list may be interested in my doctoral
dissertation, now available by anonymous ftp.

  @phdthesis{resnik:dissertation,
    author = "Philip Resnik",
    title = "Selection and Information:
        A Class-Based Approach to Lexical Relationships",
    school = "University of Pennsylvania",
    month = "December",
    year = "1993",
    note = "(Institute for Research in Cognitive Science
        report IRCS-93-42)"}

It can be retrieved in PostScript format from linc.cis.upenn.edu as
file /pub/ircs/technical-reports/93-42.ps -- abstract and sample ftp
session are included at the end of this message. The file is about 1
megabyte, 173 printed pages.

Hardcopies can be ordered for the cost of reproducing and mailing the
thesis ($7.35) by contacting Jodi Kerper, IRCS, 3401 Walnut Street,
Suite 400C, Philadelphia, PA 19104-6228 USA
(jbkerper@central.cis.upenn.edu).

----------------------------------------------------------------

             Abstract

  Selectional constraints are limitations on the applicability of
  predicates to arguments. For example, the statement ``The number two
  is blue'' may be syntactically well formed, but at some level it is
  anomalous --- BLUE is not a predicate that can be applied to numbers.

  In this dissertation, I propose a new, information-theoretic account
  of selectional constraints. Unlike previous approaches, this proposal
  requires neither the identification of primitive semantic features nor
  the formalization of complex inferences based on world knowledge. The
  proposed model assumes instead that lexical items are organized in a
  conceptual taxonomy according to class membership, where classes are
  defined simply as sets --- that is, extensionally, rather than in
  terms of explicit features or properties. Selection is formalized in
  terms of a probabilistic relationship between predicates and concepts:
  the selectional behavior of a predicate is modeled as its
  distributional effect on the conceptual classes of its arguments,
  expressed using the information-theoretic measure of relative entropy.
  The use of relative entropy leads to an illuminating interpretation of
  what selectional constraints are: the strength of a predicate's
  selection for an argument is identified with the quantity of
  information it carries about that argument.

  In addition to arguing that the model is empirically adequate, I
  explore its application to two problems. The first concerns a
  linguistic question: why some transitive verbs permit implicit direct
  objects (``John ate.'') and others do not (``*John brought.''). It
  has often been observed informally that the omission of objects is
  connected to the ease with which the object can be inferred. I have
  made this observation more formal by positing a relationship between
  inferability and selectional constraints, and have confirmed the
  connection between selectional constraints and implicit objects in a
  set of computational experiments.

  Second, I have explored the practical applications of the model in
  resolving syntactic ambiguity. A number of authors have recently
  begun investigating the use of corpus-based lexical statistics in
  automatic parsing; the results of computational experiments using the
  present model suggest that often lexical relationships are better
  viewed in terms of underlying conceptual relationships such as
  selectional preference and concept similarity. Thus the
  information-theoretic measures proposed here can serve not only as
  components in a theory of selectional constraints, but also as tools
  for practical natural language processing.
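
The relative-entropy measure described in the abstract can be sketched
roughly as follows. This is only an illustration, not code from the
dissertation: it assumes that the strength of a predicate's selection is
the relative entropy between the distribution of argument classes given
the predicate and the prior distribution of those classes. The class
names, the helper name relative_entropy, and every probability below are
invented toy values, not corpus estimates.

```python
import math


def relative_entropy(posterior, prior):
    """Return D(posterior || prior) in bits over a shared set of classes.

    Both arguments map class names to probabilities. Classes with zero
    posterior probability contribute nothing to the sum.
    """
    total = 0.0
    for cls, q in posterior.items():
        if q > 0.0:
            total += q * math.log2(q / prior[cls])
    return total


# Hypothetical prior over conceptual classes of direct objects.
prior = {"food": 0.25, "artifact": 0.25, "person": 0.25, "abstraction": 0.25}

# Hypothetical class distributions for the objects of two verbs.
# "eat" concentrates its objects in one class; "bring" barely shifts the prior.
eat = {"food": 0.7, "artifact": 0.1, "person": 0.1, "abstraction": 0.1}
bring = {"food": 0.25, "artifact": 0.25, "person": 0.25, "abstraction": 0.25}

strength_eat = relative_entropy(eat, prior)
strength_bring = relative_entropy(bring, prior)

print("eat:  ", strength_eat)
print("bring:", strength_bring)
```

With these made-up numbers the verb that constrains its object more
tightly gets the larger value, which is the direction the abstract
describes for verbs that allow implicit objects.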

----------------------------------------------------------------

An example of retrieving and printing the file.

  % ftp linc.cis.upenn.edu
  Name (linc.cis.upenn.edu:presnik): anonymous
  331 Guest login ok, send your complete e-mail address as password.
  Password: <<YOUR E-MAIL ADDRESS>>
  > cd pub/ircs/technical-reports
  > get 93-42.ps
  > quit
  % lpr 93-42.ps
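
The same retrieval could be scripted, for instance with Python's
standard ftplib module. This is a minimal sketch that mirrors the
session above; the function name fetch_report and the local file name
are hypothetical, and the host and path are simply those given in the
message, which may no longer be reachable.

```python
from ftplib import FTP


def fetch_report(ftp, directory, filename, local_path):
    """Download one file over an already-connected FTP object.

    `ftp` is expected to behave like ftplib.FTP. Passing it in, rather
    than creating it here, keeps the function easy to try out offline.
    """
    ftp.login()  # ftplib logs in as "anonymous" when no user is given
    ftp.cwd(directory)
    with open(local_path, "wb") as out:
        # Binary mode copies the file byte for byte.
        ftp.retrbinary("RETR " + filename, out.write)
    ftp.quit()


# Example call (not executed here because it needs network access):
#   fetch_report(FTP("linc.cis.upenn.edu"),
#                "pub/ircs/technical-reports", "93-42.ps", "93-42.ps")
```

The downloaded file can then be sent to a PostScript printer as in the
last line of the session.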

------- End of Forwarded Message