Steepest Descent Based Adaptive Step-Size Reinforcement Learning And Application Of Reinforcement Learning In Brain Fiber Tracking
Date
2015-06
Authors
Zadeh, Khosrow Amiri
Abstract
Incremental temporal difference (TD) models of reinforcement learning (RL) offer powerful techniques for value estimation in sequential-selection problems arising in machine learning, adaptive control, decision support systems, and industrial/autonomous robotics. Since these models mostly operate on gradient-descent learning, they come with certain limitations, especially in their step-size settings. Gradient-descent TD (GTD) learning models are sensitive to the type of observations and to parameter settings. These limitations are more pronounced in on-line applications, where GTD models are expected to remain adaptive under non-stationary observations. These issues indicate a gap: the existing TD models are not adaptive; that is, incremental GTD algorithms suffer from a parameter dependency problem. Consequently, a set of enhanced models that eliminates or minimizes these issues is desirable.
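For context, the standard tabular TD(0) update below shows the fixed step-size that this parameter dependency refers to. The update rule is the textbook one, not a result of this thesis:

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """Standard tabular TD(0) value update with a fixed step-size.

    The fixed alpha is the parameter dependency the abstract points
    to: too large and the estimate oscillates, too small and it
    converges slowly, and no single setting suits non-stationary
    observations.
    """
    delta = r + gamma * V[s_next] - V[s]   # TD error
    V[s] += alpha * delta                  # fixed-step update
    return V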
This thesis presents a new class of TD models for reinforcement learning, comprising separate TD learning models governed by the steepest descent (SD) optimization approach. The major focus is the optimal computation of the step-size in incremental TD learning. Experimental results indicate that the proposed models converge faster than comparable existing models; faster here means that the error curves reach their minimum level in fewer trials. Depending on the model, this improvement is between 40% and 70%, while the proposed models retain the linear complexity of the standard TD model. Moreover, the new models do not depend on step-size settings. These improvements indicate that the presented TD models are adaptive and may fill the gap.
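The abstract does not give the exact update rule, so the following is only a hypothetical sketch of the general idea of a steepest-descent step with an adaptive step-size: a semi-gradient TD step whose step-size is chosen by a backtracking (Armijo) line search instead of being fixed. The function name and all parameter defaults are illustrative assumptions, not the thesis's formulation:

import numpy as np

def semi_gradient_td_step(w, phi, phi_next, r, gamma,
                          alpha0=1.0, beta=0.5, c=1e-4):
    """One linear semi-gradient TD(0) step with the step-size picked
    by backtracking line search along the steepest-descent direction.

    Hypothetical illustration of adaptive step-size selection; not
    the thesis's exact method.
    """
    target = r + gamma * (w @ phi_next)      # bootstrapped target, held fixed
    def loss(v):                             # squared TD error at this sample
        return 0.5 * (target - v @ phi) ** 2
    delta = target - w @ phi                 # TD error
    g = -delta * phi                         # semi-gradient of the loss w.r.t. w
    d = -g                                   # steepest-descent direction
    alpha = alpha0
    # Backtrack until the Armijo sufficient-decrease condition holds.
    while loss(w + alpha * d) > loss(w) + c * alpha * (g @ d):
        alpha *= beta
    return w + alpha * d, alpha

Because the step-size is recomputed from the current sample at every update, no manual tuning of a global alpha is required, which is the sense in which such schemes remove the step-size setting.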
The second objective of this thesis is to extend a reinforcement learning approach to brain fiber tracking, a process known as tractography. Tractography behaves essentially as a sequential-selection task under uncertainty, which is precisely the setting RL targets. The process plays an important role in pre- and post-operative brain surgery studies and in traumatic brain injury assessment. However, it still suffers from long computational times, while streamline tractography models, which have short computational times, fail to precisely reveal brain fiber profiles in regions containing differing fiber structures. These limitations motivated the application of a suitable RL approach to the brain fiber tracking process. Experimental results on both artificial and real datasets indicated considerable improvement in the fiber tracking process, especially under the uncertainty conditions that challenge other fiber tracking techniques.
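To make the sequential-selection framing concrete, here is a minimal sketch of fiber tracking cast as repeated action selection. The state, action, and scoring definitions (position plus incoming direction as state; candidate unit directions as actions; fodf_value as an assumed local diffusion-agreement score; q_value as an assumed learned value estimate) are hypothetical and do not come from the thesis text:

import numpy as np

def track_fiber(seed, dirs, fodf_value, q_value, max_steps=200, step=0.5):
    """Grow one streamline by repeatedly selecting the next step
    direction greedily with respect to a learned action value.

    Hypothetical framing only: `dirs` is a set of candidate unit
    directions, `fodf_value(pos, d)` scores agreement of d with the
    local diffusion signal, and `q_value(state, d)` is a learned
    value estimate supplied by the caller.
    """
    pos, prev_d = np.asarray(seed, float), None
    path = [pos.copy()]
    for _ in range(max_steps):
        state = (tuple(pos), None if prev_d is None else tuple(prev_d))
        # Sequential selection: restrict to candidates that do not
        # bend too sharply, then pick the highest-valued direction.
        cands = [d for d in dirs if prev_d is None or d @ prev_d > 0.5]
        if not cands:
            break                        # stopping criterion: no valid turn
        d = max(cands, key=lambda c: q_value(state, c))
        if fodf_value(pos, d) < 0.1:
            break                        # stopping criterion: weak signal
        pos = pos + step * d
        prev_d = d
        path.append(pos.copy())
    return np.asarray(path)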
Keywords
Reinforcement Learning, Brain Fiber Tracking