Roman Ilin
Ph. D. Candidate
The University of Memphis
Department of Computer Science
Computational Neurodynamics Lab

Home


Brain is a Non-linear Intelligent Controller

The main function of the brain can be defined as "Intelligent Control".  Which means that the brain guides our actions with the goal of maximizing our long term utility.  Incidentally, this is the definition of Dynamic Programming (DP)!  The solution to any DP is given by the Bellman Equation.   However the brain does not solve equations.   The solutions to DP problems are approximated by the neural activity.  Thus, we can say that the brain solves to the Approximate Dynamic Programming (ADP) problem.  In general, ADP is referred to any attempt to approximate solution of the Bellam Equation with neural networks.

Example: 2D Maze Navigation

2D Maze navigation is a simple problem that is solved using DP.   The problem itself consists of finding the shortest path from any location on a 2D rectangular  grid to the goal cell, given that some of the grid cells contain obstacles.  The brute force approach is for each maze location to consider the length of each path from the location to the goal and find the minimum.  The number of paths is obviously exponential in the size of the maze.   And faster way of solving this problem is by computing the length of the shortest path starting from the goal to each of the cells.  We get one number for each maze cell and they form the values of the J-function, which is the Strategic Utility function of the Bellman Equation!  Now we find the path by simply following the gradient of J.

2D maze is a toy example however is shows that the ability to calculate the J function is critical.  Humans can calculate it easily for any maze configuration.  Can a neural network to the same?

It turned out that a feed-forward architecture (MLP) cannot solve the generalized 2D maze navigation.  The recurrent network however is able to solve it.  This special type or recurrent NN is called Simultaneous Recurrent neural network, or SRN.  It is even possible to design a network which can solve this particular problem.  However, we are really interested in training the network.  We achieved some encouraging results applying Extended Kalman Filter to train the SRN.