Similarities and differences

[From Bill Powers (920701.1300)]

Martin Taylor (920701.0240) --

In trying to sort out the roles of ECSs in various modes -- passive
observation, active control, and model-based "shadowing" -- I think you're
exploring interesting territory. But I think your thesis concerning
"similarities and differences" is a step backward.

I'm sure that these terms have been in your mind for a long time and that
they've settled down to some meanings that are perfectly clear to you. It
isn't clear to me, however, that either "similarity" or "difference"
describes a basic feature of perception. I don't doubt that one can point
out similarities and differences between different objects of perception,
but I don't think that the labels describe what's going on. I just finished
paying attention for 15 minutes or so as I made and ate lunch, and didn't
find any occasion to notice either a similarity or a difference. Of course
if you had pointed out similarities and differences to me I would probably
have agreed with you that they seem to exist. But they're so arbitrary! And
they simply don't seem to dominate experience as you claim they do (unless
you go into a mode where you're actively looking for them).

How is a locomotive similar to a diamond ring? They're both expensive. How
is one window in this room different from the window next to it? One is to
my left, the other is to my right. The idea of a generalized similarity or
difference detector seems to me impractical, because there are simply too
many ways to decide that perceptions are either similar to or different
from other perceptions -- or both at the same time.

By examining any pair of perceptions, we can always find a multitude of
ways in which they're different, and another multitude of ways in which
they're similar. To see how two perceptions are similar, we look for
higher-level perceptions that can arise from either of them and are in fact
the same perception. And to see how they are different, we simply note
lower-level attributes that change as we transfer attention from one to the
other. Two chairs can be similar in that both give rise to the perception
"chair," a category, even though at the same time they may belong to
disjoint categories such as reclining and electric. In terms of the simple
category chair they are identical. But when we transfer our attention or
gaze from one to the other and at a lower level of perception, we
experience a change of configuration or sensation. So we can say they are
different, and point to the attributes that are not identical in the two
chairs.

I think, in fact, that differences are probably just transition
perceptions, and that we experience differences primarily in terms of
changes of configuration. It's difficult to state the difference between
two events such as bouncing and exploding, or between two relationships
such as above and between, or between dogs and automobiles, or between
falling and rotating, or between honesty and persistence (to sample the
higher levels). This is because they are above the transition level where
we notice change. The best we can do is say that they're not the same
perceptions (and logically, therefore, are "different" even though we can
see no basis for the difference).

If you could characterize similarity detectors or difference detectors in
some way beside just saying that they detect similarities and differences,
perhaps I could come closer to understanding what you mean. Or to skip to
the real issue, just what do you mean by similarities and differences?

I also have trouble with ECS's wresting control from each other. I think I
have to ask you to diagram this process and explain how it works. I think
some of the notions in your treatise are carrying us dangerously close to
the modern Scholasticism that infects so much of academia (as some of my
own recent pronouncements have been doing). We're drifting into a mode of
discourse that is far from the spirit of modeling and experimentation, the
approach in which models aren't just POSSIBLE explanations of experimental
results, but are the ONLY PLAUSIBLE explanations, with no serious rivals or
alternatives.

Maybe I'm just suffering a backlash from my own long spew about
reorganization that kept me up past my bedtime last night. I woke up this
morning feeling very dissatisfied with what's going on on the net. Your
post hit me when I was already in that state. I suddenly got a picture of a
bunch of old men (and young people trying to imitate old ones) sitting
around a table comfortably debating about angels and pinheads, trying to
solve the riddles of the universe by clever manipulations of words.
Thinking about Brooks' subsumption architecture, I realized that one motive
behind my criticisms is simple jealousy: I wish I were building little
robots and making them actually do things that are interesting. I wish I
were doing real experiments to test real hypotheses instead of just sitting
here and writing and writing and writing. I have a longing to be doing
something REAL. I want some meat to get my teeth into. RAW RED MEAT.

I need another vacation from retirement.


----------------------------------------------------------------------
Grumpily,

Bill P.

[Martin Taylor 920702 11:00]
(Bill Powers 920701 13:00)

Two points that I will try to make brief, for a change: similarity and
difference detection, and mode of discourse.

I think you misinterpret the similarity and difference detectors as relating
CEVs currently affecting sensor systems. That's not what I meant. In pre-PCT
terms, they refer to similarity to or difference from some template. I
look at the "template" now as being a reference level. A similarity detector
has a gain function concave upward (e.g. gain = error to a power greater than
unity), and may well have zero gain for some finite level of error. A
difference detector has a gain function that has some appreciable slope
near zero error. There is a second kind of similarity detector, which is
not an ECS so far as I can see, but a perceptual function akin to what
neural network people call a "radial basis function". It emits a larger
signal the nearer the incoming perceptual pattern matches its "template."
It could well be in an ECS whose reference level is near zero, causing an
error when something like the target is in the sensory data stream.
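As a numerical sketch of these three definitions (a hypothetical illustration; the function names, thresholds, and width parameter are mine, not from the posting):

```python
import math

def difference_gain(error, slope=1.0):
    # Difference detector: gain with appreciable (nonzero) slope
    # near zero error.
    return slope * error

def similarity_gain(error, power=2.0, dead_zone=0.5):
    # Similarity detector: gain concave upward (error to a power
    # greater than unity), with zero gain for a finite range of
    # error around zero.
    if abs(error) <= dead_zone:
        return 0.0
    return (abs(error) - dead_zone) ** power

def radial_basis(pattern, template, width=1.0):
    # Second kind of similarity detector: a perceptual function that
    # emits a larger signal the nearer the incoming pattern is to
    # its "template".
    dist2 = sum((p - t) ** 2 for p, t in zip(pattern, template))
    return math.exp(-dist2 / (2 * width ** 2))

print(difference_gain(0.1))          # nonzero even for small error
print(similarity_gain(0.1))          # zero inside the dead zone
print(similarity_gain(2.0))          # grows faster than linearly outside it
print(radial_basis([1, 2], [1, 2]))  # maximal (1.0) at an exact match
```

The Gaussian form of the radial-basis function is one common choice, picked here only to make the "larger signal the nearer the match" behaviour concrete.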

Does that help? As for whether it is a scholastic point: no it isn't,
because regardless of your theoretical viewpoint, it nonetheless happens
that tasks that depend on similarity give different results from tasks that
depend on difference (precision). Failure to notice which is important in
a particular task has led to some pretty sterile arguments among "scholasts"
of different schools.

On modes of discourse, I thoroughly agree with your wish to do reality testing.
It is so clear that working in the imagination mode permits one to entertain
contradictions that would soon be exposed when tested. But we can't always
"go real" because of resource limitations, as I discussed in the "similarity
difference" posting. But we can work out the implications of our models
to the extent possible, using what you called the truthsaying approach. What
MUST be true? What CANNOT be true? We may be wrong in our analyses, but
they do help to show where reality testing might be fruitful.

I have no resources to devote to reality testing, other than the contract
to Chris Love, of which the Little Baby is a small part (and he finishes
the contract fairly soon). So perforce I devote the time to consideration
of the implications of fundamental truths applied to situations that might
become real. Then I look in the natural world to see if there is anything
that applies. The similarity-difference issue is a case in point. There
are a bunch of strange phenomena out there, which become perfectly natural
and obvious once one realizes that PCT must be true and that there are more
degrees of freedom for sensor systems than for external output (joints,
and shape-changing effects like facial expression). Similarity and difference
are a natural consequence of resource limitation in a control system. They
have been observed as puzzling phenomena. Now they are not. Is that "modern
Scholasticism"?

Martin

[Martin Taylor 920703 11:00]
(Bill Powers 920703.0600)

"Unfortunately" I think we are converging to a state of similarity too fast
to retain sufficient difference to sustain disagreements. I'll try to get
to substantive comments later today or over the weekend. But one quick
point about "apples and oranges": the difference system is not expected to
come into play except when the percept being controlled is within the range
of control. This means that its value is not wildly far from the reference.
If this were not the case, all ECSs would be in conflict about almost every
percept. I assume "apple controllers" come into play only with respect to
things that are sufficiently like apples to make it reasonable to try to
perceive them as apples.

I'll answer the plausibility post later. I think we are even closer to
agreement there, but some questions do remain.

Martin

[From Bill Powers (920703.0600)]

Martin Taylor (920701.1100) --

RE: Similarities and differences.

A similarity detector has a gain function concave upward (e.g. gain =
error to a power greater than unity), and may well have zero gain for
some finite level of error. A difference detector has a gain function
that has some appreciable slope near zero error.

Allow me to pursue the question of perceiving similarities and differences
between distinct percepts. I'll get to templates at the end.

Suppose that the basic form of a perception p of a single variable v is
nonlinear and approximated by p = k*v^2. This is an approximation of the
low to middle range of the perceptual response. The slope at zero input is
zero, increasing linearly as the variable departs from zero.

In detection of a difference relationship, I propose that the perception is
derived from the difference in a single attribute between perceptions of
two sets of variables (the same argument can be extended to multiple
attributes). In general, the amount of one attribute in one set can be
expressed as c + d/2, and in the other as c - d/2, where c is the amount
common to both variables and d is the amount of difference. The difference
in amount of attribute as perceived at the relationship level is p1 - p2.
If each perception has the same nonlinear relationship to the amount of
attribute, approximated as a square, we have (leaving out scaling factors)

perceived difference = p1 - p2 = (c + d/2)^2 - (c - d/2)^2, or

             p1 - p2 = 2*d*c.

The variable d is the difference itself. The function p1 - p2 is the
computation that yields perception of this difference, a relationship. Note
that if there is no common attribute at all (c = 0), there can be no
difference! This suggests the old "apples and oranges" observation: you
can't compare things that have nothing in common. Note also that if the
functions are linear, so that p1 = v1 and p2 = v2, then the perception of
difference is just p1 - p2 = v1 - v2, and the slope is still nonzero at
zero difference. And finally, note that a nonzero slope at zero difference
will still be found for other forms of positively-accelerating
nonlinearity. This is easy to show graphically.
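The algebra above can be checked numerically (a minimal sketch; the sample values of c and d are arbitrary):

```python
# With p = v^2, the perceived difference (c + d/2)^2 - (c - d/2)^2
# reduces to 2*d*c, and vanishes when there is no common attribute
# (c = 0) -- the "apples and oranges" observation.
def perceived_difference(c, d):
    p1 = (c + d / 2) ** 2
    p2 = (c - d / 2) ** 2
    return p1 - p2

c, d = 3.0, 0.4
assert abs(perceived_difference(c, d) - 2 * d * c) < 1e-9
assert perceived_difference(0.0, d) == 0.0  # no common attribute, no difference
print(perceived_difference(c, d))           # approximately 2*0.4*3.0 = 2.4
```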

Similarity is more difficult to define because there is no simple natural
definition as there is for difference. If two percepts are identical, how
much similarity do they indicate? An infinite amount? A lot? According to
your definition, the closer two percepts come to being identical, the
higher the perceived similarity, with the slope increasing as identity is
approached. If this rule applies generally, identity must correspond with
maximum slope, so there is maximum slope when the amounts of the common
attribute in each set of percepts reaches equality, and a cusp is found at
that point, or a singularity.

A simple proposal is that similarity is perceived through a function that
is the reciprocal of the difference relationship:

     sim = k/(p1 - p2).

As division by zero is approached, of course, the similarity perception
simply goes to maximum, not infinity, as analog dividers have finite
limits. This function has the required accelerating nonlinearity as
similarity shades toward identity. It changes sign abruptly at p1 = p2.

A reciprocal function would, of course, resemble a power function with an
exponent greater than one -- I doubt that judgements of similarity yield
data that is quantitative enough to distinguish between a best-fit
reciprocal and a best-fit power function. The only reason I could see for
choosing a power function would be an attempt to be consistent with
Stevens' "power laws" of stimulus magnitude estimates. There's no physical
reason to suppose that power laws are involved, although of course one can
always fit a power-law curve to a nonlinear relationship.
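A minimal sketch of that reciprocal function, with the finite limit an analog divider would impose (the limit value and at-identity convention are my assumptions):

```python
def similarity(p1, p2, k=1.0, limit=100.0):
    # sim = k / (p1 - p2), clamped to a finite maximum as an analog
    # divider would be; the sign changes abruptly at p1 = p2.
    diff = p1 - p2
    if diff == 0:
        return limit  # identity: maximum, not infinity (an assumption)
    return max(-limit, min(limit, k / diff))

print(similarity(5.0, 4.0))    # 1.0
print(similarity(5.0, 5.0))    # 100.0 -- saturated at identity
print(similarity(4.0, 5.0))    # -1.0 -- sign flips past identity
```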

I've treated similarity and difference detection here simply as perceptual
functions, examples of relationship perceptions. I don't think there's any
need to bring in error signals, especially as I resist including them in
the world of conscious experience (although they keep getting back into it,
especially with the Revised Model that we keep abandoning again).

There is a second kind of similarity detector, which is not an ECS so far
as I can see, but a perceptual function akin to what neural network people
call a "radial basis function". It emits a larger signal the nearer the
incoming perceptual pattern matches its "template."

As I've defined a similarity detector, there doesn't seem to be any
difference here, except that a perception is being compared with a template
instead of another perception. I really don't think that "similarity" is an
appropriate idea in relation to templates. What knows that there is a
similarity? Another similarity detector?

Nor do I think that "templates" are a necessary construct here. The concept
of a template is an alternative to the concept of a perceptual function,
and one that is hard to defend. The image of moving a negative around over
a positive image looking for a match doesn't fit the pandemonium model in
which all perceptions and reference signals are one-dimensional variables.

I will try to obtain the references you cited. I anticipate that what you
are interpreting as reference signals or templates will also be explainable
as perception of a difference-relationship or a similarity-relationship
between two distinct percepts -- perhaps one of them being remembered. My
bias is always to associate objects of consciousness with perceptual
signals produced by input functions, not with reference signals (unless
imagination routes them into the perceptual channels) or error signals
(which in the model as it stands today are not part of the perceptual
system). Perception of differences should not be confused with error
signals.


-----------------------------------------------------------------------
I agree that when reality testing is difficult, it's best to rely on
truthsaying -- saying what MUST be true and what CAN'T be true, according
to the model. There's little point in spending a lot of time on what MIGHT
be true (as I've done above). The model is most easily tested when it leads
to flat statements allowing of no conceivable alternatives -- when it
stands or falls on statements of truth. And I also think that such extreme
statements lead very naturally to simple experimental designs. What we're
talking about is REAL falsifiability.
----------------------------------------------------------------------
Best

Bill P.

[Martin Taylor 920703 16:00]
(Bill Powers 920703.0600)

I think I can accept most of what you wrote, and once again find myself
frustrated by my evident obscurity in writing. I'll try to rephrase one or
two points, and maybe we will have a common understanding. Or maybe some
real disagreement lurks in the words.

Nor do I think that "templates" are a necessary construct here. The concept
of a template is an alternative to the concept of a perceptual function,
and one that is hard to defend. The image of moving a negative around over
a positive image looking for a match doesn't fit the pandemonium model in
which all perceptions and reference signals are one-dimensional variables.

I intended "template" as a non-specific notion that included a perceptual
function but was not restricted to it. To beat the dead horse: if an ECS
has four inputs, A, B, C, D, then if its perceptual function is F(3A+B-C-3D) it
has a template for linearity, in my terminology. I'm sorry to say that I was
aware of this possible misunderstanding when I used the term, but hoped
(ignoring Murphy) that it would not occur. In this case, "template" can
certainly be a remembered percept. It can equally be any source of a reference
signal.


---------------
There's also a terminological problem in "similarity" and "difference", because
the distinction shows up in behaviour rather than in any perceptual logic.
It is a question of what one is controlling for, so far as I can see. I was
describing a (to me, plausible) mechanism when I should have been standing
back a little. Let me try again.

I consider three situations, with respect to the organism (not necessarily
with respect only to an ECS).

(1) some percept is being actively controlled to be as close to its reference
    as the other controlled percepts permit.

(2) some percept is not being actively controlled, but if
      (2a) it departs too far from some reference, or
      (2b) it comes sufficiently close to some reference, then
    control relating to this percept must become active or bad things happen.

Condition (1) is what I have identified with the difference detection of an
active ECS. I could equally have talked about "identity" detection, but that
word has connotations of labelling and category. Some people use it, however.
Condition (2) is what I have identified with similarity, whether the
criterion for action is approach to or departure from a reference. (2a) is
"I don't want to be too hot or too cold", and (2b) is "I don't want to see
a tiger looking at me hungrily from too close." To complete the set, (1)
is "I want to keep near the centre line of my traffic lane."

In my discussion, I put all the nonlinearity into the comparator, but as you
point out, it could equally well be in the perceptual function (but see below).
In the para on "template" above, I used F(3A+B-C-3D) as the template for
linearity. Now let us use this function F to illustrate a possibility. In all
cases, we are dealing with a normal ECS that has a simple difference as a
comparator, and a linear output gain function. We will assume that the
reference level is zero. (If it isn't, the function F can be applied to the
comparator output rather than the output of the perceptual function, a rather
more sanitary procedure that does not affect the argument). There are at
least three cases, corresponding to my three sorts of detector.

Type (1): F(x) = x (|x| < some limit)
   This is a difference (identity) detector--an active zero-seeking controller.
It tries to see its input as linear, and continuously controls it to maintain
linearity as closely as it can.

Type (2a): F(x) = 0 (|x| <= t)
               = |x-t|^2 (|x| > t)
   This is a similarity detector that produces output only when x deviates from
zero by a sufficient quantity. If the input is sufficiently nearly linear, it
does nothing.

Type (2b): F(x) = t^2 - x^2 (|x| <= t)
               = 0 (|x| > t)
   This is a similarity detector that produces output only when x lies within
t of zero. It does not want to perceive linearity, and if the input is
sufficiently far from a straight line, it does nothing. Otherwise it produces
output that presumably induces actions that cause the input to deviate from
linearity. This is an alerting ECS.

The thesis, from the degrees of freedom argument, is that ECSs of type 1 can
exist to control any of the degrees of freedom implicit in the sensory input,
but that no more than a few can be simultaneously satisfied. To affect which
few degrees of freedom are controlled, many parallel ECSs of type 2 can be
accepting input, but none will provide output unless the similarity condition
is violated, at which time they will provide output.

I attempted in my big posting to suggest several different possibilities for
what happens when one of the type 2 ECSs does start to provide output, all of
which have the same functional result: a Type 1 ECS will be controlling a
perceptual degree of freedom that was not previously being controlled, and
that will occur at the expense of removing from control another perceptual
degree of freedom.

I anticipate that what you
are interpreting as reference signals or templates will also be explainable
as perception of a difference-relationship or a similarity-relationship
between two distinct percepts -- perhaps one of them being remembered. My
bias is always to associate objects of consciousness with perceptual
signals produced by input functions, not with reference signals (unless
imagination routes them into the perceptual channels) or error signals
(which in the model as it stands today are not part of the perceptual
system). Perception of differences should not be confused with error
signals.

Yes, I try to go along with that. If I say something that disagrees with it,
or seems to, either I have been thinking sloppily or I have been writing
sloppily. You should pull me up on such occasions. I don't think this
was such an occasion, as I never conceived of error signals (or references)
contributing to percepts at any level.
---------------------

The model is most easily tested when it leads
to flat statements allowing of no conceivable alternatives -- when it
stands or falls on statements of truth. And I also think that such extreme
statements lead very naturally to simple experimental designs. What we're
talking about is REAL falsifiability.

The problem with truthsaying is that language is not logic, and it is very
easy sometimes to believe you are making a watertight case for the necessity
of something that isn't necessary at all. I know I'm arguing out of both
sides of my mouth here. The problem is that reality testing needs lots of
resources, so the best truthsaying I can achieve is the most cost-effective
procedure for me. But I can never be really sure that what I say MUST be
true is not standing on some rickety foundation that might be swept away.

Extreme statements need to be interpreted, and as with "template," the
boundaries of the intent of the statement are not always clear. It is often
very difficult to say just what has been falsified by any experiment.

On falsification, my notion is that all exact theories are false, so that
one gains no information in finding one to be false. Vaguer theories may
include the truth somewhere (insofar as truth means something that will not
be found false within a finite time), but they are harder to demonstrate
to be false when they don't include the truth.

As you may have guessed some long time ago, I reject firmly the notion we
are taught in school, that science consists of the rejection of falsifiable
hypotheses. I take it to be more like the reorganization problem (when we
agree on it), continual evolution of better descriptions of nature, by which
I mean more concise descriptions that cover wider ranges of conditions.

Martin