Fairly typical!
On 29 September 2017 at 12:34, Rupert Young rupert@perceptualrobots.com wrote:
[From Rupert Young (2017.09.29 12.35)]
Very useful response, Ben. My comments below.
On 29/09/2017 02:50, B Hawker wrote:
BH: The purpose of the state-based approach is to allow it to solve problems that are not amenable to simple error reduction by a single controller. For example, reaching the exit point of a room where one must bypass a wall first, which involves going in the wrong direction. Simply trying to get as close to the exit as possible would fail, so the weighting of states lets it get around this problem by learning that the gap(s) in the wall are key places to reach in the trajectory. Do I think this is a sensible approach? Not necessarily at all. A properly structured hierarchy of error-reducing nodes should solve the problem.
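As a toy illustration of that failure mode, the Python sketch below has a greedy agent move only to reduce its straight-line distance to the exit. The grid, wall layout and exit position are made up for the purpose of illustration; the agent gets stuck against the wall because reaching the gap means temporarily increasing its distance to the exit.

import math

WALL_X, GAP_Y = 3, 5          # vertical wall at x = 3, with a single gap at y = 5
EXIT, pos = (6, 0), (0, 0)    # exit on the far side of the wall

def blocked(p):
    return p[0] == WALL_X and p[1] != GAP_Y

def dist(p):
    return math.hypot(p[0] - EXIT[0], p[1] - EXIT[1])

for step in range(50):
    candidates = [(pos[0] + dx, pos[1] + dy)
                  for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    candidates = [c for c in candidates if not blocked(c)]
    best = min(candidates, key=dist)
    if dist(best) >= dist(pos):   # no move reduces the distance: the agent is stuck
        print("stuck at", pos, "after", step, "steps; never reaches the gap at",
              (WALL_X, GAP_Y))
        break
    pos = best
else:
    print("reached", pos)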
Do such examples actually exist yet? I've been thinking about the Mountain Car problem (https://en.wikipedia.org/wiki/Mountain_car_problem), which seems a good and reasonably simple learning problem, over continuous variables, that PCT should be able to address, and is analogous to your problem above. Do we have a hierarchy of error-reducing nodes that can solve that problem?
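I haven't seen a worked PCT solution, but as a starting point here is a minimal single-loop sketch in Python that treats the car's mechanical energy as the controlled perception. The dynamics are the standard Mountain Car equations; the energy perception, reference value and gain are hand-chosen assumptions on my part, and nothing is learned or reorganised.

import math

# Standard Mountain Car dynamics, with a continuous drive in [-1, 1].
FORCE, GRAVITY = 0.001, 0.0025
X_MIN, X_MAX, V_MAX, GOAL = -1.2, 0.6, 0.07, 0.5

def height(x):                       # potential implied by the gravity term
    return GRAVITY * math.sin(3 * x) / 3.0

def step(x, v, drive):
    v += drive * FORCE - GRAVITY * math.cos(3 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x = max(X_MIN, min(X_MAX, x + v))
    if x <= X_MIN and v < 0:
        v = 0.0                      # hit the left wall
    return x, v

# One control loop: the perception is total mechanical energy, the reference is
# a little more than the potential energy at the top of the right-hand hill, and
# the output pumps energy in the direction the car is already moving.
E_REF = height(math.pi / 6) + 0.0004    # hand-chosen reference (an assumption)
GAIN = 5000.0                           # hand-chosen gain (an assumption)

x, v = -0.5, 0.0
for t in range(1, 2001):
    energy = 0.5 * v * v + height(x)            # perceptual signal
    error = E_REF - energy                      # reference minus perception
    drive = max(-1.0, min(1.0, GAIN * error))   # loop output
    x, v = step(x, v, drive * (1.0 if v >= 0 else -1.0))
    if x >= GOAL:
        print("reached the flag at step", t)
        break
else:
    print("did not reach the flag")

The open question for PCT, as I see it, is whether reorganisation could arrive at that energy perception and reference on its own rather than having them built in.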
BH: To actually help address the original point, reinforcement learning could be used in a
variety of ways. Would it be effective? Probably not. The
biggest underlying conceptual difference that I can see is
that RL assumes that behaviour is state- or action-driven,
whereas PCT and other closed-loop control theories treat
behaviour as an emergent property of the control system’s
response to a problem over time. Trying to put action or
state spaces into PCT anywhere will be problematic.
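To make that difference concrete, here is a toy side-by-side in Python; every name and number is a placeholder rather than anyone's actual model. The RL view adjusts a value stored against a discrete state-action pair after the fact, while the control view continuously computes output from the present gap between reference and perception.

# RL view: behaviour is read off a table of state/action values, which is
# adjusted after each sampled transition (a one-step Q-learning update).
alpha, gamma = 0.1, 0.9
Q = {("in_room", "go_left"): 0.0, ("in_room", "go_right"): 0.0}
reward, best_next_value = -1.0, 0.0              # from one observed transition
Q[("in_room", "go_right")] += alpha * (
    reward + gamma * best_next_value - Q[("in_room", "go_right")])

# Control view: behaviour is whatever output the loop is producing, moment to
# moment, from the error between a reference and the current perception.
reference, perception, output, gain, dt = 1.0, 0.2, 0.0, 2.0, 0.01
for _ in range(3):
    error = reference - perception               # compare
    output += gain * error * dt                  # integrating output function
    perception += 0.5 * output                   # assumed effect on the world

print(Q[("in_room", "go_right")], round(perception, 3))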
I guess this RL assumption comes out of the trad AI problem-solving legacy, which involved discrete state spaces. PCT still needs to provide ways of solving such cases, which would address the high-level reasoning (or program-level) aspects of cognition. Perhaps it's a matter of re-framing state spaces and actions as perceptions and goals.
BH: Could reinforcement learning be used to learn the values for gains? It could, but the results would be poor and it would be overkill. Could reinforcement learning
combined with Deep Learning act as a reorganiser for a PCT
hierarchy? Yes, but I think as alluded to before, it should
be clear what needs reorganising anyway if the rate of
error is high. You don’t need RL or DNN (Deep Neural
Networks) to do that. Could Reinforcement Learning combined
with Deep Neural Networks act as a perception generator?
That’s a much more plausible possibility, but I don’t know
where you’d start. It could definitely learn to identify
patterns in the world that relate to specific problems, and
then a PCT hierarchy could incorporate it and minimise
error. After all, if you have the right perceptions, HPCT
should be more than sufficient. It’s where the perceptions
come from that I think is the thing PCT doesn’t answer.
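For what it's worth, here is a minimal Python sketch of the kind of reorganisation alluded to above, with no RL or DNN involved: a Powers-style "e-coli" rule that keeps nudging a loop's gain in a random direction while accumulated error is falling and tumbles to a new direction when it rises. The plant, step sizes and thresholds are all assumptions, and the loop's own accumulated error stands in for a separate intrinsic-error signal.

import random

# One loop controlling a one-dimensional variable. The sign and strength of the
# environmental feedback are unknown to the loop, so a wrong gain makes the
# error grow instead of shrink.
def run_loop(gain, steps=200, dt=0.01, reference=1.0):
    perception, total_error = 0.0, 0.0
    for _ in range(steps):
        error = reference - perception
        output = gain * error                    # proportional output function
        perception += -0.8 * output * dt         # environment link (note the sign)
        total_error += abs(error) * dt
    return total_error

# E-coli style reorganisation: keep drifting the gain in the current random
# direction while accumulated error falls; tumble to a new random direction
# when it rises. Stop once the error is acceptably small.
gain, direction = 10.0, random.choice((-1.0, 1.0))
previous = run_loop(gain)
for trial in range(1, 201):
    gain += direction * 2.0                      # assumed reorganisation step
    current = run_loop(gain)
    if current > previous:
        direction = random.choice((-1.0, 1.0))   # tumble
    previous = current
    if current < 0.5:
        print("settled on gain", gain, "after", trial, "trials")
        break
else:
    print("still reorganising; latest accumulated error", round(previous, 3))

The point is just that persistent error is itself the signal for where to reorganise; nothing has to learn a value function to find it.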
Yes, I think that although there is some work on learning gains (arm reorg in LCS3), what is lacking is how perceptual functions are learned. Though we should be able to take techniques from neural networks for this, such as autoencoders or convolutional NNs.
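As a very rough sketch of what I mean (the data, layer sizes and learning rate are arbitrary placeholders): a small autoencoder trained only to reconstruct its inputs ends up with bottleneck activations that could serve as candidate perceptual signals for a hierarchy to control.

import numpy as np

rng = np.random.default_rng(0)

# Fake "sensor" data: 200 samples of 8 inputs that really depend on only
# 2 underlying factors plus noise.
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = factors @ mixing + 0.05 * rng.normal(size=(200, 8))

n_in, n_hidden, lr = 8, 2, 0.01
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))

for epoch in range(500):
    H = np.tanh(X @ W1)              # bottleneck: candidate perceptions
    X_hat = H @ W2                   # reconstruction
    err = X_hat - X
    # Gradient descent on mean squared reconstruction error.
    dW2 = H.T @ err / len(X)
    dH = err @ W2.T * (1 - H ** 2)
    dW1 = X.T @ dH / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2

print("reconstruction MSE:", round(float((err ** 2).mean()), 4))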
On a slightly different matter, I came across this paper, which uses RL for adaptive control systems rather than discrete state spaces, which is much closer to what PCT normally addresses: "Reinforcement learning and optimal adaptive control: An overview and implementation examples"
https://pdfs.semanticscholar.org/b4cf/9d3847979476af5b1020c61367fa03626d22.pdf
I'm still looking at it so I'm not sure what it is doing yet. Are you familiar with this approach?
Rupert