[From Chad Green (2014.08.21 15:03 ET)]
Erling, I wanted to let you know that I just shared your observations below with Francis Heylighen of the Global Brain Institute (http://pespmc1.vub.ac.be/HEYL.html )
since he had expressed a strong interest in Hawkins’s book On Intelligence.
Best,
Chad
···
Chad T. Green, PMP
Program Analyst
Loudoun County Public Schools
21000 Education Court
Ashburn, VA 20148
Voice: 571-252-1486
Fax: 571-252-1575
Web: http://cmsweb1.loudoun.k12.va.us/page/1966
“The future is already here—it’s just not very evenly distributed.” - William Gibson
From: Control Systems Group Network (CSGnet) [mailto:CSGNET@LISTSERV.ILLINOIS.EDU]
On Behalf Of Erling Jorgensen
Sent: Saturday, November 16, 2013 2:19 PM
To: CSGNET@LISTSERV.ILLINOIS.EDU
Subject: B:CP Chapter 15: Memory & Address Signals
[From Erling Jorgensen (2013.11.16 1100EST)]
I’m a little behind in keeping up with the chapters.
There is a piece of Bill Powers’ chapter on memory that is only slowly coming into resolution for me. It strikes me as a critical piece, but I haven’t seen much discussion of it. By way of forewarning, this turned into an extended essay,
touching first on Powers’ understanding of reference signals, and then examining Jeff Hawkins’ treatment of the neurophysiology of the cerebral cortex of the brain.
I’m using the first edition of Behavior: The Control of Perception (1973).
On p. 217, Bill says, almost in passing:
“This, incidentally, solves the problem of translating an error in a higher-order variable into a specific value of a lower-order perception, a problem that has quietly been lurking in the background.”
I’d like to unpack what I think is going on in this passage and in the surrounding pages of B:CP.
First, what problem is being solved? And why is it a problem in the first place? To give it a name, it is the issue of commensurability. Wikipedia defines commensurability as follows: “Two concepts or things are commensurable if they
are measurable or comparable by a common standard.”
It’s the problem of interface. How will two things communicate in a way that is understandable to both sides? What language or units or frames do they hold in common?
At this juncture in the history of Perceptual Control Theory, this may seem a trivial problem. After all, we have already had numerous rigorous demonstrations of principle, in the various computer simulations that show just how well a
negative-feedback control architecture can operate. There are even simulations of hierarchical arrangements of control loops, where different classes of perception are being defined by the perceptual input functions, but stable control is achieved nonetheless
as higher levels set reference standards for lower level loops.
In all these simulations, commensurability between the different levels is presupposed. It’s handled by the computer. The equations assign numbers to the different variables, and the computations proceed straightaway. It is why we operate
with an assumption that variables perform as scalar quantities, even if they may represent a more complex form of perception.
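To make the scalar assumption concrete, here is a minimal toy sketch (my own illustration, not from any published PCT simulation; the gain and slowing parameters are arbitrary): every signal in the loop is just a number, so the comparator can subtract perception from reference directly.

```python
# Toy scalar negative-feedback control loop (illustrative only).
def run_control_loop(reference, disturbance, gain=50.0, slowing=0.01, steps=200):
    """Drive a scalar perception toward a scalar reference despite a disturbance."""
    output = 0.0
    perception = 0.0
    for _ in range(steps):
        perception = output + disturbance      # input function (identity here)
        error = reference - perception         # comparator: plain subtraction
        output += slowing * (gain * error - output)  # leaky-integrator output
    return perception

# With loop gain 50, the perception settles near the reference (within ~2%).
print(run_control_loop(reference=10.0, disturbance=3.0))
```

Because all three signals share one unit, commensurability never arises here; the point of the essay is that real neurons give us no such guarantee for free.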
That scalar assumption is itself based on a decision that Bill sets out on page 22 of B:CP:
“As the basic measure of nervous-system activity, therefore, I choose to use neural current, defined as the number of impulses passing through a cross section of all parallel redundant fibers in a given bundle per unit time.” (italics omitted)
Bill was an engineer, and he knew you had to pay attention to the units and definitions of the variables, so that they could be properly compared and manipulated. So again, the issue of commensurability seems to be solved, by how Bill
sets out his definitions. When we lay out equations for how control loops operate, so says he, we’re talking about “neural currents.” That is the common measure that will make the different layers commensurate.
This is a credible (and insightful) way to map the concepts of engineering control theory onto the biological mechanisms of organisms with their nervous systems. Whenever we build maps, we use what can be called “convenient fictions,”
which simplify but hopefully retain essential features of the territory we are trying to map.
The problem that Bill knew was “quietly lurking in the background” was that it can sometimes be hard to squeeze the underlying details into the convenient fictions we choose for our map. In the case of nervous system operation, there is
a sizeable risk of incommensurability, as Bill rightly understood.
Think about this. Neurons operate by means of neural impulses and frequency of neural firing. They can synapse laterally onto other nearby neurons, and affect the patterns of firing of those cells. This creates the possibility of both
spatial and temporal patterns occurring within clusters of cells. Some of those clusters may form what we in CSG call, somewhat sweepingly, “perceptual input functions.” (Notice, a lot of unspecified details are hidden behind our terms here.)
Now, when a reference signal comes from a higher level, exactly how is it to specify what patterns will be the standard of reference, to be reproduced by the control loops at that point, and implemented by the control loops cascading down
from that level in the hierarchy? This is no trivial problem. And the way Bill approaches it I think is quite brilliant.
Returning to p. 217 in the Memory chapter of B:CP (1st ed.), Bill introduces a key postulate:
“We will assume from now on that all reference signals are retrieved recordings of past perceptual signals.” (italics omitted)
This is the first step in handling incommensurability. If the pattern of a current perceptual signal is being compared to the pattern of a past perceptual signal, at the same point in the brain where that signal first arose, then at least
apples are being compared to apples.
Bill goes on to draw the implication of that postulate, continuing on p. 217:
“This requires giving the outputs from higher-order systems the function of address signals, whereas formerly they were reference signals. The address signals select from lower-order memory those past values of perceptual signals that
are to be recreated in present time.”
This is a part of PCT we rarely discuss. Reference signals are address signals. It is like saying, “Give me more of ‘that’.” The reference doesn’t even need to know what “that” signifies. The “address” here is a way of ensuring the
reference signal is getting to the right locale, and by taking the form of a retrieved perceptual signal, both the reference and the perception are guaranteed to be speaking a comparable “language”.
There is a final step that Bill takes, which I think is highly significant. On pp. 212-213, Bill speaks of associative addressing:
“In associative addressing, the information sent to the computing device’s address input is not a location number, but a fragment of what is recorded in one or more locations in memory” (p. 212). He continues, “Any part of the stored information
can be used as an address, a match resulting in replay of the rest of the information in that memory unit” (p. 213).
This results in a pointer (an “address”) that is already contextually relevant. In other words, “Give me the rest of ‘that’.”
So then, a reference signal has the following features:
a) It only has to be an address signal, essentially signaling more of “that.”
b) It is a recorded copy of what it is looking for, leading to a common “currency” for comparison between perception and reference.
c) It only needs to carry a fragment of what it is looking for, because it will result in the replay of “the rest” of what it is looking for.
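These three features can be caricatured in a few lines of code. Here is a toy content-addressable store (entirely my own construction, not Powers’): the “address” is a fragment of a stored pattern, and a match replays the whole pattern.

```python
# Toy associative addressing: a fragment of the content retrieves the full pattern.
memory = [
    "the quick brown fox jumps",
    "happy birthday to you",
    "row row row your boat",
]

def retrieve(fragment):
    """Return the first stored pattern that contains the fragment, else None."""
    for pattern in memory:
        if fragment in pattern:
            return pattern
    return None

print(retrieve("birthday"))   # replays "happy birthday to you"
```

The caller never needs a numeric location; any fragment of the content itself is a sufficient address, which is the sense in which a reference can carry a mere fragment of what it is looking for.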
This is quite an amazing solution set for keeping signals commensurate between different levels of control and perception in the brain. By giving reference signals these features back in 1973, Bill was implicitly making scientific predictions
about the brain’s wiring and its functional characteristics.
Here is where it really gets interesting.
Jeff Hawkins and his colleagues at the Redwood Neuroscience Institute have articulated arrangements of cells in the neocortex of the brain that may be able to produce the features laid out under Bill Powers’ proposal.
Hawkins’ main published work in this area (with Sandra Blakeslee) is On Intelligence (2004). His goal is to discern a neurophysiologically plausible way to understand intelligence, and how actual brains construct and carry out that process.
My reservation, as I have studied Hawkins’ proposals, is that he seems to put a lot of stock in the construct of “prediction.” The core of his theory is what he calls “the memory-prediction model” (p. 5). Nonetheless, some of what his
model uses prediction for is not that far off from what Powers’ model of PCT ascribes to “control.” So there may be a family resemblance going on.
For instance, Hawkins gives an analogy for what a higher region is saying to a lower region when it sends a prediction down the hierarchy: “This is basically an instruction to you about what to look for in your own input stream” (p. 136).
That sounds almost like providing a reference for a specified perception. Or consider how Hawkins talks about motor commands (which we in CSG consider commands for certain perceptual consequences of motor behavior): “As strange as it sounds, when your own
behavior is involved, your predictions not only precede sensations, they determine sensation. …Thinking, predicting, and doing are all part of the same unfolding of sequences moving down the cortical hierarchy” (p. 158). To speak of determining sensation
is to start to speak of control.
Be that as it may, Hawkins is one of the few to try to show the broad functional utility of the “wetware” in the brain. In other words, he tries to spell out not just what brains do, but how they may be doing it.
The part that I think relates to Bill Powers’ idea of reference signals as memory address signals is when Hawkins, in chapter 6 of On Intelligence, refers to cortical inputs that relay the “name” of a given sequence. It’s a complicated
discussion, but let me see if I can walk through it.
The physiological heart of Hawkins’ work is the neocortex, in particular the almost modular arrangement into cortical columns. His research institute has tried to distill the essence of a vast literature on the horizontal and vertical
structure and synapses among the six cellular layers of those columns. He considers the neocortical columns as the basic functional units, which construct invariances and sequential patterns out of the neural firings from a flow of sensory experience.
Let me give a flavor for what Hawkins is trying to do. In the process, I’ll raise various features that may be consistent with a PCT view of what is going on. These are just a series of quotes from Hawkins (2004), to hopefully provide
context for what I’ll present below:
** “Higher regions of your cortex are keeping track of the big picture while lower areas are actively dealing with the fast changing, small details” (p. 127).
** “All objects in your world are composed of subobjects that occur consistently together; that is the very definition of an object. When we assign a name to something, we do so because a set of features consistently travels together”
(p. 126).
** “(E)ach cortical region has a name for each sequence it knows. This ‘name’ is a group of cells whose collective firing represents the set of objects in the sequence” (p. 129).
** “So whenever I see any of these events, I will refer to them by a common name. It is this group name, not the individual patterns, that I will pass on to higher regions of the cortex” (p. 129).
** “By collapsing predictable sequences into ‘named objects’ at each region in our hierarchy, we achieve more and more stability the higher we go. This creates invariant representations” (p. 130).
** “The opposite effect happens as a pattern moves back down the hierarchy: stable patterns get ‘unfolded’ into sequences” (p. 130).
** “(T)he unfolding pattern is not a rigid sequence, but the end result is the same: slow-changing, high-level patterns unfolding into faster-changing, low-level patterns” (p. 132).
In the last quotation above, it is worth noticing the timing relationships that stable hierarchical control would require: higher-level patterns would change more slowly, lower-level patterns would change more quickly.
Without those relative time differentiations, you cannot obtain stabilized control within a hierarchical system.
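That timing claim can be sketched numerically (again my own toy code, with arbitrary parameters): a higher loop that adjusts the lower loop’s reference only every twentieth tick still brings the lower perception to the top-level reference, precisely because it runs on a slower timescale than the loop beneath it.

```python
# Two-level toy hierarchy: a slow higher loop sets the reference for a fast lower loop.
def run_hierarchy(top_reference, disturbance, ticks=2000, ratio=20):
    low_ref = 0.0    # reference handed down to the lower loop
    low_out = 0.0
    low_perc = 0.0
    for t in range(ticks):
        low_perc = low_out + disturbance
        if t % ratio == 0:  # higher level updates only every `ratio` ticks
            low_ref += 0.5 * (top_reference - low_perc)
        low_out += 0.01 * (50.0 * (low_ref - low_perc) - low_out)
    return low_perc

# The lower perception settles at the top-level reference despite the disturbance.
print(run_hierarchy(top_reference=10.0, disturbance=3.0))
```

If the update ratio is made too small, the higher loop starts reacting to a lower loop that has not yet settled, and the nested system can oscillate; that is the instability the relative time differentiations avoid.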
With those kinds of components, Hawkins identifies what he considers the functional job of cells at various layers of the cortical column. For instance, he notes: “Converging inputs from lower regions always arrive at layer 4 – the main
input layer” (p. 141). And because of synapses to other layers within the column, it seems “the entire column becomes active when driven from below” (p. 148). In PCT, we might think of the columnar pattern of firing as constructing a perception out of the
ascending inputs coming from lower levels of perception in the cortex.
In a similar manner, Hawkins notes: “Layer 6 cells are the downward-projecting output cells from a cortical column and project to layer 1 in the regions hierarchically below” (p. 141). From a PCT perspective, we’re starting to see the
construction of input and output functions, including their possible locale within the cortical columns. He even makes a fascinating aside, a bit later on: “(I)n addition to projecting to lower cortical regions, layer 6 cells can send their output back into
layer 4 cells of their own column. When they do, our predictions become the input. This is what we do when daydreaming or thinking” (p. 156). Seems like we’re getting close to a physiological locale for what in PCT we have called “the imagination connection.”
Let’s shift over to aspects that may represent reference signals, or more specifically Powers’ notion of “address signals.” Hawkins speaks of “two inputs to layer 1. Higher regions of cortex spread activity across layer 1 in lower regions.
Active columns within a region also spread activity across layer 1 in the same region via the thalamus” (p. 146). He goes on to suggest the functional significance of these two inputs. “We can think of these inputs to layer 1 as the name of a song (input
from above) and where we are in a song (delayed activity from active columns in the same region)” (p. 146). This is remarkably similar to what I raised above about a reference signal being a contextually relevant pointer.
So then, there are three main streams of activation in Hawkins’ schematic layout: “converging patterns going up the cortical hierarchy, diverging patterns going down the cortical hierarchy, and a delayed feedback through the thalamus” (p.
147). He further specifies how these streams interact.
He states, “(H)alf of the input to layer 1 comes from layer 5 cells in neighboring columns and regions of the cortex. This information represents what was happening moments before” (p. 149). He also notes, “The other half of the input
to layer 1 comes from layer 6 cells in hierarchically higher regions. …It represents the name of the sequence you are currently experiencing” (p. 149). This is the portion I paraphrased above with the analogy, “Give me more of that.”
Hawkins’ conclusion is as follows: “Thus the information in layer 1 represents both the name of a sequence and the last item in the sequence. In this way, a particular column can be shared among many different sequences without getting
confused. Columns learn to fire in the right context and in the correct order” (p. 149). Thus, there is commensurate communication, which is not getting confused as to where and how it is talking. It uses the right name, in the right context, using a common
currency to avoid confusion.
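Hawkins’ “name plus position” arrangement can be caricatured as a lookup keyed on context (the sequence names and items below are invented for illustration): the same item yields different continuations under different sequence names, so a shared “column” never gets confused.

```python
# Toy context-addressed sequence memory: (sequence name, last item) -> next item.
transitions = {
    ("song_A", "C"): "G",   # in one sequence, C is followed by G
    ("song_B", "C"): "D",   # in another, the same C is followed by D
}

def next_item(name, last_item):
    """Predict the next item given the sequence name and the last item heard."""
    return transitions.get((name, last_item))

print(next_item("song_A", "C"))   # -> G
print(next_item("song_B", "C"))   # -> D
```

The item "C" alone is ambiguous; only the pair (name, last item) disambiguates it, which is the functional role Hawkins assigns to the two inputs converging on layer 1.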
There are a few fine points that fit into the picture. The cortical columns seem to have ways of storing the sequences they have constructed. Once a column has become active, it seems that, through synaptic strengthening, cells in layers
2, 3, and 5 can learn to keep firing, even when the initiating input via layer 4 is no longer active. That would suggest that they can be summoned (or retrieved), given the right naming input from above.
It appears as though the “name” is a key way that columns are communicating vertically up and down the hierarchy. Going up the hierarchy, “(w)hen a layer 4 cell fires, it is ‘voting’ that the input fits its label” (p. 147). Its synapses
get the whole column active, and further projections continue up the hierarchy. Furthermore, there is lateral inhibition of nearby columns, to further shape and refine the name, so that higher regions do not get a jumble of possible names from below.
Going down the hierarchy, a layer 6 cell does the communicating. Hawkins suggests what layer 6 may be saying: “I speak for my region of cortex, …my job is to tell the lower regions of cortex what we think is happening. I represent our
interpretation of the world” (p. 154f.). That descending output is projected (via layer 1) to cells of layer 2 at the next lower cortical level, which “learn to be driven purely from the hierarchically higher regions of cortex. …The layer 2 cells would therefore
represent the constant name pattern from the higher region” (p. 153). Here we come incredibly close to seeing a reference signal in action, as constrained by the features that Powers laid out.
Hawkins summarizes the overall scheme as follows: “Every moment in your waking life, each region of your neocortex is comparing a set of expected columns driven from above with the set of observed columns driven from below” (p. 156). To
bring this closer in line with Perceptual Control Theory, we would call the signals from above the “desired” input, not merely the “expected” set. To use a PCT formulation, perceptions get compared to references. Either way, the “bottom-up/top-down
matching mechanism” (p. 156) enables sending the right contextually relevant name to the next lower region of the hierarchy. That fragment of what is wanted will then be matched and unfolded into relevant patterns of what is desired, all the way down.
The striking part for me is that Jeff Hawkins’ neurophysiological way of understanding how the cortex works is quite compatible with the broad functional contours that Bill Powers laid out, three decades earlier. I have not seen indications
that Hawkins is familiar with Powers’ approach, although his institute has obviously surveyed a wide swath of theory and research into the brain. If there is no direct familiarity, I think that would make this potential overlap between these two theories
all the more remarkable.
Comments, criticisms, and questions are welcome.
All the best,
Erling
