social references, perceptual phenomena

[From Rick Marken (930520.1300)]

Bruce Nevin (Wed 930519 16:20:22) --

My understanding is that the reference signal is the playback
from memory

Probably not the best way to think of it. I think if you read about
(or, better still, work with) my spreadsheet model (in "Mind
Readings") you'll get a better feel for how it works.

I said:

What you
describe is exactly what happens in Tom's cooperation experiment

You say:

I don't understand how it is the same.

What I meant was that a conversation is similar to what is going on
in Tom's experiment. In order to control a higher order variable (say,
"being polite") both people in the conversation have to be controlling
for a similar variable (just as both subjects in Tom's experiment had to control
for "three lines in a row"). In order to achieve this higher level goal, both
people have to control their parts of the conversation properly -- just as
the subjects in Tom's experiment both had to control "their pair of lines"
properly. The analysis of the conversation is a bit more difficult
than the analysis of Tom's experiment because we don't know what variables
are actually being controlled by the parties to the conversation. But the
principle is the same in both cases -- which is that, for each control system
involved in the interaction, effects on at least some of the variables being
controlled are mediated by a feedback connection that runs through the
other control system; that is, some of the feedback functions for both
control systems EXIST BECAUSE the other control system is CONTROLLING.
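Not a model of Tom's actual experiment -- just a toy sketch, with made-up
gains and references, of the principle: two proportional control systems
whose feedback paths run through each other, so each one's perception
depends on both outputs and each can only control because the other is
controlling.

```python
# Toy illustration: two coupled control systems. Each system's
# perception is a mix of its own output and the other's output,
# so the feedback function of each runs through the other.
def simulate(steps=2000, dt=0.01, gain=50.0):
    r1, r2 = 10.0, -5.0        # reference signals (arbitrary)
    o1 = o2 = 0.0              # output quantities
    for _ in range(steps):
        p1 = o1 + 0.5 * o2     # perception 1 depends partly on system 2
        p2 = o2 + 0.5 * o1     # perception 2 depends partly on system 1
        o1 += gain * (r1 - p1) * dt   # leaky-integrator output functions
        o2 += gain * (r2 - p2) * dt
    return p1, p2

p1, p2 = simulate()   # both perceptions end up at their references
```

Run it and each perception settles at its own reference even though
neither system can produce its perception alone.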

Bruce Nevin (Thu 930520 14:30:21) --

Some interesting experimental data about speech:

We pause here for you to guess what you would perceive with these
experimental setups.

Fun questions. My experience is that it is very hard to predict how
you will perceive. For example, a friend and I once tried to make an
auditory analog of the Julesz random dot stereograms by phase shifting
some of the components of computer generated "noise" (harmonically
spaced equal amplitude sine waves). We thought we could phase shift
a "vowel" type sound out -- it should have been heard in the left ear when
the noise was shifted right, for example. But no -- just noise.
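For what it's worth, the stimulus construction above is easy to sketch in
code. This is an illustration only -- the frequencies, sample rate, and
which components get shifted are made up, not the values we used.

```python
import numpy as np

# "Noise" built from harmonically spaced, equal-amplitude sine waves,
# with the phases of a "vowel-like" subset shifted in one channel only
# (the interaural difference that was supposed to pop the vowel out,
# stereogram-style).
fs = 8000
t = np.arange(fs) / fs                  # one second of samples
harmonics = np.arange(1, 40) * 100.0    # 100 Hz .. 3900 Hz
shifted = {7, 8, 9, 22, 23, 24}         # pretend "formant" components

left = np.zeros_like(t)
right = np.zeros_like(t)
for k, f in enumerate(harmonics, start=1):
    right += np.sin(2 * np.pi * f * t)
    phase = np.pi / 2 if k in shifted else 0.0   # shift left channel only
    left += np.sin(2 * np.pi * f * t + phase)

stereo = np.stack([left, right], axis=1)   # ready to write out as a WAV
```

Note that the phase shift leaves the energy in each channel identical --
only the interaural phase relations differ, which is exactly why we
expected (wrongly) to hear the shifted components segregate.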

I also tried an auditory analog of another Julesz demo -- spatial frequency
masking. That's the one where a bunch of grey squares turns into Lincoln's
face when you squint or move back far enough (presumably filtering out the
high spatial frequency noise of the square edges). I mixed vowels with
"noise" that had a "high frequency" or "low frequency" SPECTRAL profile.
Since vowel perception seems to be based on "low frequency"
spectral shape (the "formants") I thought the low frequency "noise"
would mask better than the high frequency "noise" -- even though the
noise energy was the same in both cases, producing the same level of
masking based on S/N ratio. In fact, the result was that both noises
produced the same level of "masking" -- because the result of mixing
vowel and noise was always a "new vowel". A linguist told me that
that would happen but I didn't believe him -- I'm sure both you and Avery
wish that I had learned my lesson right then and there about the wisdom
of listening to linguists.
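The masking-stimulus recipe above can be sketched as follows. The cutoff
frequency and the vowel stand-in are illustrative, not the originals; the
point is just scaling the two noises to identical total energy before
mixing, so any masking difference can't be blamed on S/N ratio.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 8000, 8000
spectrum = rng.standard_normal(n // 2 + 1) + 1j * rng.standard_normal(n // 2 + 1)

# Shape the same random spectrum two ways: low- vs high-frequency profile.
freqs = np.fft.rfftfreq(n, d=1 / fs)
low_noise = np.fft.irfft(np.where(freqs < 1000, spectrum, 0), n)
high_noise = np.fft.irfft(np.where(freqs >= 1000, spectrum, 0), n)

# Scale both noises to the same total energy.
target = 1.0
low_noise *= np.sqrt(target / np.sum(low_noise ** 2))
high_noise *= np.sqrt(target / np.sum(high_noise ** 2))

vowel = np.sin(2 * np.pi * 500 * np.arange(n) / fs)  # crude vowel stand-in
mix_low = vowel + low_noise
mix_high = vowel + high_noise
```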

Experiment 1: In one ear (say, the right ear) play the acoustic signal
for syllables, only with gaps (silence) in place of the transitions.
In the other ear (the left) play the glissandi for various syllables,
appropriately timed to match the gaps.
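(Editorial aside: the channel assembly Bruce describes can be sketched like
this. The "syllable" signal and the transition time spans here are made up
for illustration; the point is just the complementary gap/glissando timing.)

```python
import numpy as np

# Right channel: the syllable with silence in place of the transitions.
# Left channel: only the transitions (glissandi), timed to fall exactly
# in those gaps.
fs = 8000
syllable = np.sin(2 * np.pi * 300 * np.arange(fs) / fs)  # stand-in signal
transitions = [(1000, 1400), (4000, 4400)]               # glissando sample spans

right = syllable.copy()
left = np.zeros_like(syllable)
for start, end in transitions:
    left[start:end] = syllable[start:end]   # glissando goes to the left ear
    right[start:end] = 0.0                  # gap (silence) in the right ear

dichotic = np.stack([left, right], axis=1)
```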

In experiment 1, what do you hear in the right ear
and what do you hear in the left?

My guess is that the right ear heard, if not the "correct" syllables,
at least some syllables; the configuration systems that hear syllables
were probably able to piece the left-ear stuff in to produce the
syllable perception. The left ear probably sounded like clicks --
sensations that were just happening in parallel to the syllable
perceptions.
Are there changes as amplitude is
decreased equally in both channels, and if so, what?

My guess is that new syllables were heard as the amplitude was
decreased equally in both channels;
'baa' became 'daa' perhaps. But there would still be the separate
clicks and syllables in each channel.

Experiment 2: play the syllable /ba/ three times while displaying
a face pronouncing /bE vE thE/ perfectly synchronized with the
acoustic signal. (/E/ is the vowel in "bed", /dh/ is the consonant
in "the".).

In experiment 2,
what do you hear in the headphones?

I think listeners would hear what they see -- they'd hear /bE vE thE/. I think
this is because language perception has a visual component and
the perceptual functions that produce these syllables probably
have inputs from the visual system. If the sound input to these
functions is not radically different from what could go with the
visual input, then the visual input could be the main determiner
of the magnitude of the perceptual signal generated by each
syllable configuration detector.

Given my track record at predicting perceptual phenomena, I would
not be surprised if I'm 0 for 2 on your test.

Looking forward to the results (sort of)