I finally got a chance to see your talk and it was quite impressive. My only complaint is that you gave only a cursory description of your learning algorithm. You did show a quick set of different hierarchies that were generated by the algorithm, but you didn’t say much about how the “evolution” process worked or what the criterion for working was. So when you compared your algorithm to the RL algorithm, showing that the former did much better than the latter, I didn’t know what made one better than the other. I also had questions about how the relative levels were selected and whether the output parameters were varied independently of the input parameters. Is there a place where you give some details on the two learning algorithms you compared in this study?
One little nit: In the Cartpole Controller chart at 32:00 you refer to the variables IPA, IPV, ICP and ICV – which I presume refer to pole angle, pole velocity, cart position and cart velocity – as environmental variables. These are actually perceptions, since they have to be derived from more elementary sensory variables that are more direct analogs of environmental variables. It would be nice to show what those environmental variables are. For example, pole angle is probably derived from the x and y position of the tip of the pole relative to the bed of the cart.
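To make that last point concrete, here is a rough sketch of how pole angle and pole velocity could be computed as perceptions from more elementary position signals. This is only my guess at the kind of derivation involved, not a description of your simulation: the function names, the choice of a pivot point on the cart, the sign convention and the finite-difference velocity are all assumptions of mine.

```python
import math


def pole_angle(tip_x, tip_y, pivot_x, pivot_y):
    """Perceived pole angle in radians, measured from vertical.

    Assumed convention: 0 means upright, positive means the pole
    leans toward +x. Inputs are the tip and pivot positions, which
    stand in for the more elementary sensory variables.
    """
    dx = tip_x - pivot_x  # horizontal offset of the tip from the pivot
    dy = tip_y - pivot_y  # vertical offset of the tip from the pivot
    return math.atan2(dx, dy)


def pole_velocity(prev_angle, angle, dt):
    """Perceived angular velocity as a finite difference over one time step.

    The difference is wrapped into (-pi, pi] so that a jump across
    +/-pi is not mistaken for a fast rotation.
    """
    diff = angle - prev_angle
    wrapped = math.atan2(math.sin(diff), math.cos(diff))
    return wrapped / dt
```

With the tip directly above the pivot, `pole_angle` returns 0, and the value grows as the tip moves toward +x. The point is just that the angle is a function of lower-level signals rather than something read directly off the environment.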
Anyway, it was a very interesting talk. I especially like the way your algorithm came up with two different successful pole-balancing hierarchies, one much more elegant than the other.