A Long Post

[From Gabriel 921212 12:44CST]

This is a long post, being the concatenation of three in an offline
correspondence. I and Bill C. have noted Gary's concern about offline
discussion depriving net members of necessary context. We'd like to share
our whole discussion, but it is complicated by being in part telephone
conversations, in part subject to non-disclosure requirements, and
in part very technical and full of alphabet soup.

Some background on the participants. One group is the (by now) gang
of five interested in data-fusion/information-fusion, why the two are
so often confused, and uses in better Government and Defense. All of
the gang of five work in more or less defense related institutions,
and are to varying extents "thinking the unthinkable" so as to keep
it from happening.

There is also a "gang of three", less active in the discussion, one
being also a member of the gang of five. The gang of three is mainly
interested in making better Management Information Systems organisations,
because the present breed work with great expense, low speed, and
terrible cost effectiveness.

The thing we all have in common with PCT is feedback paths, the
information that moves along them and the way people behave when
the information is perceived. The gang of five is concerned
with military matters mainly because there is more experimental
data and recorded detailed history, so that we can test theories
best in the military context. The long term agenda is beating swords
into plowshares, even though we still each keep a sword hung by the door
just in case. We feel there is compelling evidence that CEOs and
military commanders have rather largely the same problems except
that being laid off is preferable to being killed or badly wounded.
But each is a likely consequence of utter defeat of your unit.

The gang of three are concerned with people and information. There
is no need to cut and weld pipe, or to shoot guns. This makes all
the theory simpler. Our experimental evidence is about a quarter
century of collective experience in the software business. This has
advantages and disadvantages. Having grown up with the industry we
can see the evidence of evolution in the embryology. But three is
a small sample.

···

---------------------------------------------------------------------
[Gabriel to Bill Cunningham at some ungodly hour of the morning]

This continues our afternoon TELCON about the problem of search/alert modes
in PCT, and also scratches my itch about discovery being a collective
phenomenon of populations of ECSs.

For those readers who did not share the earlier discussion, Bill and I
have been discussing the question of how a single control system can
ever "discover" anything, as distinct from "tracking" something already
perceived, and whether this is a fundamentally absent issue in PCT, so
we need to add it to the theory of advice to Cdrs.

I think the following suggests an interesting line of thought.

An N'th order control system is exactly mimicked by and exactly mimics,
i.e. it is precisely represented by a set of N coupled first order
ordinary differential equations (ODEs) - Bill Gear's lectures 101.
(By the way, Bill is now directing a research consortium in Tokyo,
funded by the Japanese Govt.)
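
A minimal sketch of that reduction, assuming a linear scalar equation
y^(n) = u - (a_0*y + a_1*y' + ... + a_(n-1)*y^(n-1)) purely for
illustration. The names companion_rhs, rk4_step and integrate are
hypothetical, and nothing here is taken from Gear's lectures or from PCT;
it only shows one N'th order equation rewritten as N coupled first order
ones and stepped forward numerically.

```python
def companion_rhs(coeffs, u=0.0):
    """Build the first order system for y^(n) = u - sum(coeffs[k] * y^(k)).

    The state is the list [y, y', ..., y^(n-1)], so an N'th order
    equation becomes N coupled first order equations.
    """
    def rhs(state):
        # The derivative of y^(k) is simply the next state entry y^(k+1) ...
        derivs = list(state[1:])
        # ... and the last entry's derivative comes from the original equation.
        derivs.append(u - sum(c * s for c, s in zip(coeffs, state)))
        return derivs
    return rhs


def rk4_step(rhs, state, dt):
    """Advance the state by one classical fourth order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [
        s + dt * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def integrate(rhs, state, dt, steps):
    """Apply rk4_step repeatedly and return the final state."""
    for _ in range(steps):
        state = rk4_step(rhs, state, dt)
    return state
```

For y'' + y = 0 with y(0) = 1 and y'(0) = 0, the call
integrate(companion_rhs([1.0, 0.0]), [1.0, 0.0], 0.001, 1000) follows
cos(t) and -sin(t) over one unit of time.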

A set of N coupled first order ODEs for the kind of N we usually can
think about - say N <= 50 has nothing to do with algorithms to examine
search spaces, that is to say, PCT seems to have little to tell us
intuitively about discovery unless there is some unknown mathematics,
which seems unlikely. N~10**4, N~10**7, or N~10**9 are far enough outside
ordinary experience to make an intuitive argument straight out of PCT
unlikely to carry any weight.

How to make a connection?? Well, if you uncouple the ODE's a bit,
you can make them essentially all the same (good neuroanatomy),
and have each one generate a path through state space determined
by initial conditions. Different conditions, different paths.

Now, suppose we take a leap of faith, and guess that a collective
set of ECSs (i.e. first order ODEs) are given a set of different
initial conditions such that for any point P in the search space
there exists at least one path passing within distance epsilon
of P, and we can make epsilon as small as we want by taking
enough ECSs (concentrating enough attention on the problem).

If as an ECS traverses a path, it notes the minimum value of an
objective function on that path, a vote at the end of the collective
behaviour can tell us which set of initial conditions led to the
closest approach to the global minimum of the objective function
in the search space defined by the set of all sets of initial
conditions tried, one for each of the ECSs assigned to the
task of discovery by focussing attention. This vote is good
neuroanatomy - it's done by a voting tree, and is not very far
from the Kanerva neuroanatomical model for memory.
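
A toy sketch of the collective search, under assumptions that are not in
the note above: every unit obeys the same pair of equations x' = -y,
y' = x, so its path is a circle whose radius is set by its initial
condition (r, 0); the objective function and its minimum at (0.6, 0.3)
are invented for the example; and the "vote" is a plain minimum over
units rather than a voting tree.

```python
import math


def objective(x, y):
    # Hypothetical objective with its global minimum (value 0) at (0.6, 0.3).
    return (x - 0.6) ** 2 + (y - 0.3) ** 2


def run_unit(radius, steps=720):
    """Trace one unit's path and return the lowest objective value on it.

    The path is the closed-form solution of x' = -y, y' = x starting
    from (radius, 0): a circle, sampled at evenly spaced angles.
    """
    best = math.inf
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        best = min(best, objective(x, y))
    return best


def collective_search(n_units, max_radius=1.0):
    """Give each unit its own initial condition, then take the 'vote'.

    Returns (lowest objective value found, initial radius that found it).
    """
    radii = [max_radius * (i + 1) / n_units for i in range(n_units)]
    results = [(run_unit(r), r) for r in radii]
    return min(results)
```

Adding units shrinks the gap between neighbouring circles, so no point of
the disc is farther than max_radius / n_units from some path. That is the
epsilon of the argument, and collective_search(100) lands closer to the
minimum than collective_search(10) does.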

This model also explains some of Joe's and Bill's observations.

If you don't have a good set of Bayesian priors (initial conditions)
Joe's observation that you can't cope with something new on short
notice follows, because you have not built the setup to explore
that part of the search space.

The thing about false lock, only local minima, and misguided
prosecution at law, is a case where you made a first pass through
the space, decided that you could abandon a large part of the
initial condition sets you tried and concentrate on the promising
ones, which turned out not to go anywhere near the global minimum.
This is one Bill and I have debated, ever since I raised the case
of the possible terrorist found through a pistol permit. Perhaps
we have a resolution.

Well, that's my 4AM insight for today. Try it on Stark/Vincennes,
and the other litany of INTEL failures. Note by the way that it's
a devil and deep sea problem. If you don't focus in on the problem
you don't have the resources to solve it. If you focus in too early
you have a high probability of being on the wrong track. Is this
a type 1 vs type 2 error issue too? I have a strong feeling that
the type1/type2 tradeoff in Sequential Statistics a la Abraham Wald
is at the bottom of a lot of things we see.
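
A minimal sketch of Wald's sequential probability ratio test for yes/no
observations, to make the tradeoff concrete. The hypotheses (hit rate p0
under H0, p1 under H1) and the function name sprt are assumptions for
illustration; alpha is the tolerated type 1 error rate, beta the type 2
rate, and the stopping thresholds are Wald's usual approximations.

```python
import math


def sprt(observations, p0, p1, alpha, beta):
    """Sequential test of H0 (hit rate p0) against H1 (hit rate p1).

    Returns (decision, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: commit to H1
    lower = math.log(beta / (1 - alpha))   # cross this: commit to H0
    llr = 0.0                              # running log likelihood ratio
    for n, hit in enumerate(observations, start=1):
        if hit:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    # Ran out of data before either threshold was reached.
    return "undecided", len(observations)
```

Tightening alpha and beta pushes the two thresholds apart, so more
observations are needed before committing; loosening them allows an
earlier decision at a higher risk of being on the wrong track.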

This is the spot incidentally where the exceptionally able strike
team of software developers always beats the Mongolian horde of average
ones, and it's why most software these days is so very bad.

It also seems to me to bear on the distinction that Bill and Tom make
between managers and leaders, and what Bill keeps repeating is different
about a 4*.

                John 921211 04:32 CST
-----------------------------------------------------------------------------

Adopting CSGL date/time convention since this discussion likely to expand.

:From Bill Cunningham 921211.0830 EST:

(John Gabriel 921211.0432 CST)

John, I find your suggestion "not implausible" and very attractive, on the
surface. The idea is certainly convenient, but I haven't a clue whether
it's just a convenient model or a DEEP explanation of what actually happens.

My first thought is "What do I tell a mathematical illiterate?" I have to do
that sooner or later.

ANS: "Keep your options open as long as you possibly can, knowing that as
soon as you exclude new ideas you are committed to one/few. Actively
seek as many working hypotheses as possible. Use brainstorming.
Be very slow to throw out old data."

I think Joe would say play ALL the data (especially negative) against
ALL the hypotheses, and if selection of a promising minimum doesn't
lead to satisfactory closure--reopen the search by generating new
hypotheses.
--------------
I think an action oriented commander would argue that he didn't have time
for that b------t, that most of the time available had to be spent planning
the execution.

My response would be, remember your METT-T doctrine. Mission, enemy, terrain,
troops--and time available. Start with mission and time available and
identify drop dead points when you MUST make a decision.

I'd remind him of Stark/Vincennes and say make your drop dead decision to
defend the ship if you can't find contrary evidence by t=tcrit. Put
that order into motion and search like hell for the disproof. But
don't fire til you see the whites of their eyes.

I also think I'd drag in the fratricide problem. If I expect no friendlies,
my range of hypotheses is limited to one and my pucker factor reduces my
search time accordingly. But if I expect an ambivalent situation
and my mission is say hostage rescue, then I have to allow more decision
time--even at the risk of getting shot at. I can only justify that if
the mission warrants. So the critical search is one for those prior
constraints that can't be changed. And that also restricts the search
space.

Or I might cite yesterday's XXX report and say you have to ask the right
question. Shaping the debate so it starts with wide range and narrows
"efficiently/effectively" is the ART of command. I'll bet I can find
that in Druzhinin & Kontorov. It certainly fits with my telephonic
comments that the great commanders seem to be in a very wide search mode,
and then focus like a hawk on some issue. They are bimodal to the max.

Along the same lines, how about the restraint of the Marines in Somalia?
I'm referring to the incident where the blonde female ABC reporter found
herself face down in the sand with a rifle in her back. The reporters
had all their lights on the Marines, and a few shots were fired. My
first reaction, if illuminated and under fire, would be to eliminate the
illuminator whilst my buddy sprayed the area to discourage any aimed
fire that might occur until the lights went out. Those guys done good.
__________________

With respect to the strike team of software writers vs the horde, I'd
comment that, intellect aside, the strike team is deliberately in a search
mode and the horde is deliberately in track mode. Once you have a horde,
the problem becomes one of work breakdown structure and you've already
constrained the search space. Now, it certainly helps if the strike
team is made up of associative thinkers with sufficiently different
backgrounds to enrich the search space and sufficiently common background/
focus to communicate efficiently and place SOME constraint on the search.
Rosabeth Moss Kanter writes that innovative organizations have very
good lateral communication whilst the stagnant ones are highly vertical.
__________________

Now the question for Martin. How does this square with your dual channel
models? Particularly the fast association/slow detail?

Bill C
------------------------------------------------------------------------
Subject: Reply to Bill C's note

[From Gabriel 921211 11:12CST]
A wondrous bright light Bill, and nobody shoving an M16 in your back
either. Let me respond mathematically, but without formality - I don't
get to do that often which is why you shone such a great light on the
subject. Usually it's not possible to set out the idea without the
formalism. And I truly enjoy talking about my real research to
somebody other than myself.

The reason why my 04:30 note is appealing but not proven is that it's
an imagined neuroanatomical implementation of the mathematical
abstractions of search spaces, ordinary differential equations,
optimal decision theory, and Hamiltonian system dynamics.

There are lots of other possible implementations, and one can only
tell between them by experiment outside the realm of the common
mathematical abstraction. That is to say, the mathematics has degrees
of freedom which are lost once you go to a physical system of any kind.
I can implement the procedure in neuroanatomical wetware (perhaps),
organisational wetware (certainly, but I have serious constraints
about who I choose to put in the organisation if I want it to be
cost effective), silicon (at great capital expense, but practically
zero cost of replication), or software (same as silicon except easier
to change when I find a mistake).

BUT since the mathematical abstraction has lots of properties we
can observe to be approximately true outside the neuroanatomical
black box, and actually observe in operations like YYY,
it's useful independent of the details of wetware or software or
silicon or organisations. Its usefulness lies in the observed
fact that it's a detailed abstract implementation of the phenomena
we observe, or a "model." It's predictive, it elicits strong
recognition reflexes from those who know the physical implementations,
and so on. That is to say, it has the properties of PCT. The
basis for PCT in neuroanatomy is not bad, but certainly not
conclusive for anything much bigger than the Moths and the Bats.
The real justification for PCT is that Bill P. can build the Little Man
and Little Arm models in software, and they have lots of the
properties of their analogues in wetware, but they are built from
the piece parts of the PCT premise.

Now, there are some other universal abstractions that have been very
useful, and it's worth giving them a passing glance because they
are meta-meta.....meta theories. Hard to use, but very powerful.
And they have to do with constricting degrees of freedom which
is one of the things that interest us.

If you look at planetary dynamics for instance, you find there are
some transformations of coordinates that don't change the equations
of motion. For example those that arise because gravitation is a
central force, and so, although orbits are not circular, one ellipse
is as good an orbit as another of the same size but rotated in 3 space,
and two ellipses with the same T**2/A**3 are both equally good. And
that (r**2)*(d theta/dt) is the same at all points in the orbit.

These facts, observed by Kepler, can be made to yield some astonishing
results. That acceleration is directed towards the sun, that it is
inversely proportional to r**2, and that the constant of proportionality
is the same for all the planets.
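
For the special case of a circular orbit, the step from Kepler's facts to
the inverse-square law takes only a few lines. This is a sketch under
that simplifying assumption (so A = r and the angular rate is constant),
with k standing for the common value of T**2/A**3 and h for the constant
(r**2)*(d theta/dt); the general elliptical case needs more work.

```latex
\begin{align*}
a_\theta &= \frac{1}{r}\,\frac{d}{dt}\!\left(r^2\,\frac{d\theta}{dt}\right)
          = \frac{1}{r}\,\frac{dh}{dt} = 0
  && \text{(no transverse acceleration, so the force is central)}\\
a_r &= -\,\omega^2 r = -\,\frac{4\pi^2 r}{T^2}
  && \text{(circular motion, } \omega = 2\pi/T\text{)}\\
T^2 &= k\,r^3 \;\Longrightarrow\; a_r = -\,\frac{4\pi^2}{k}\,\frac{1}{r^2}
  && \text{(the same } k \text{ for every planet)}
\end{align*}
```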

Now we make the great gedanken experiment. Introduce an imaginary
new planet, and assume the same things are true. We can at once
conclude the usual statements of Newton's Laws and Gravitation are
true for this imaginary new planet, and so on .... BUT the conclusion
depends on two things, the Bayesian priors, and the assumption that
they are true for the new planet - which was OK until Einstein, and
still good enough for Govt. work.

Now, when we do the same kind of gedanken experiment for the next
level up in the hierarchy, we arrive at the idea of symmetry
operators for the system, and the theorem that initial conditions
and the symmetry operators alone determine an orbit.

This has a counterpart in psychology, it's Gestalt theory, and
invariants, and although I still don't really know what reorganisation
is, I suspect it has to do with throwing away some invariants, as
distinct from simply changing initial conditions.

Now back to software, which can mimic any physical system, so it's
potentially a useful abstraction too. If we have a program

        y = f(x)

if it's going to be useful it had better yield reproducible
results, so that for each input x, there is only one possible output y.
That is to say, f is a many:1 mapping - several x may each give the same
answer, but a particular x better not give different answers Mon Tue Wed,
from Thur Fri Sat - on the seventh day f takes a rest.

This divides the set of all possible x into subsets, such that for every
x in a subset, y=f(x) is the same.

You can see there are all kinds of symmetry lollygagging around -
the "brotherhood" of all the x giving the same y is just a restriction
of necessary varieties from the set of all x to the set of all y, i.e.
a Ross Ashby necessary variety. If you leave a few brotherhoods out
of your consideration of possible inputs, you have left out some
y values, i.e. possible outcomes of running the dynamical system
(campaign, TACWAR model....) represented by f().
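
A minimal sketch of the "brotherhood" idea, assuming a finite list of
inputs and a deterministic f. The function names and the x % 3 example
are hypothetical, chosen only to show the grouping.

```python
from collections import defaultdict


def brotherhoods(f, inputs):
    """Group the inputs into classes that f maps to the same output."""
    classes = defaultdict(list)
    for x in inputs:
        classes[f(x)].append(x)
    return dict(classes)


def outcomes_from_representatives(f, classes):
    """Run f on one representative per class and collect the outcomes."""
    return {f(members[0]) for members in classes.values()}


def example_f(x):
    # Hypothetical many:1 mapping used only for the demonstration.
    return x % 3


groups = brotherhoods(example_f, range(9))
all_outcomes = outcomes_from_representatives(example_f, groups)

# Leave one brotherhood out of consideration and its outcome disappears.
partial = {y: xs for y, xs in groups.items() if y != 2}
seen_outcomes = outcomes_from_representatives(example_f, partial)
```

Here groups holds three classes of three inputs each, and seen_outcomes
lacks the outcome of the class that was dropped.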

Better get off my soapbox before it breaks. Merry Christmas to All.

This gets mathematical at about the same rate as the deterioration of
enemy war capacity in Bill's example from strategic bombing, where B29s
mined the Straits of Tsushima and damaged the Japanese prosecution
of the war in the Pacific.

But the idea of symmetries, brotherhoods of scenarios all leading to
(nearly enough) the same outcome, and Gestalt, and human perception of
same, and search amongst them is very much our business.

                John

PS I think I just did Ross Ashby wrong. The set of all brotherhoods
is the necessary variety, and we need just one representative from
each to get all the possible outcomes. But, if a brotherhood ain't
really a brotherhood, i.e. we've put two different phenomena in the
same class (f(x) can be a classifier) we may get an unpleasant
surprise if the actual scenario we face is not the one belonging
with the representative we chose for the "brotherhood". For the
mathematician, there is less error in the abstraction than in
the implementation - that's how I did Ross A. wrong, and why
there are bugs in programs.

-------------------------------------------------------------------------
[Cunningham to Gabriel]

Ref my 0900 response to John's 0432 post

Any military commander will jump down your throat to tell you that
a halfway right solution boldly and fully executed is far more likely
to succeed than the best solution implemented 30 milliseconds later.
This is so well ingrained that search beyond the first local minimum
isn't likely.

This takes us right back to the information campaign, and I suspect
there is a commercial world counterpart. Given the fog and friction of war,
the above approach is more likely to catch the adversary unable to perceive
and respond to the action boldly taken. The friction problem places a
premium on early decision. The guy who first perceives more or less the
right situation and who acts immediately upon that perception wins every
meeting engagement.

That also applies to predator/prey or breeding competition in nature. So
good old Ma Nature fosters evolution of a two-channel fusion system. One
is fast acting, finding approximate solution AND IMMEDIATELY ALERTING THE
PROPER EFFECTORS/OVERRIDING ONGOING CONTROL. The other is the slower, more
precise tracker. Rather obviously, you can't survive without both qualities--
and neither can our commander.

I guess that argues nature hasn't selected for processing negative information.
Wonder why. I'll bet because the system is optimized for real time response
to own sensors. We can certainly point to species alerted by negative
info of "no birds singing==danger".

One more great argument for the information campaign is that we can process
much more, and more quickly. A stated goal should be to extend search time
to permit selection of a global minimum in time to execute fully. We're
not actually extending the search time, but exploring more possibilities in
less time. Now, that's going to take a change in commander's mindset!!!!
And that's a training issue, once rest of system is in place.

Now, I feel more comfortably aligned with Martin's two channels.

Bill C.