Steepest Descent Based Adaptive Step-Size Reinforcement Learning And Application Of Reinforcement Learning In Brain Fiber Tracking

Date
2015-06
Authors
Zadeh, Khosrow Amiri
Abstract
Incremental temporal difference (TD) models of reinforcement learning (RL) offer powerful techniques for value estimation in sequential-selection problems across machine learning, adaptive control, decision support systems, and industrial/autonomous robotics. Because these models mostly operate by gradient-descent learning, they suffer from certain limitations, particularly in their step-size settings. Gradient-descent TD (GTD) learning models are sensitive to the type of observations and to parameter settings. These limitations are more pronounced in on-line applications, where GTD models are expected to remain adaptive under non-stationary observations. These issues reveal a gap: existing TD models are not adaptive, that is, incremental GTD algorithms suffer from a parameter-dependency problem. Consequently, a set of enhanced models that eliminates or minimizes these issues is desirable.

This thesis presents a new class of TD models for reinforcement learning. The proposed class comprises separate TD learning models governed by the steepest-descent (SD) optimization approach, with the major focus on optimal computation of the step-size in incremental TD learning. Experimental results indicate that the proposed models converge faster than comparable existing models; "faster" here means that the error curves reach their minimum level in fewer trials. Depending on the model, this improvement is between 40% and 70%, while the proposed models retain the same linear complexity as the standard TD model. Moreover, the new models do not depend on step-size settings. These improvements indicate that the presented TD models are adaptive and may fill the gap.

The second objective of this thesis is to extend a reinforcement learning approach to the brain fiber tracking process, known as tractography, because tractography is essentially a sequential-selection task under uncertainty, which is exactly the setting RL targets. Tractography plays an important role in pre- and post-operative brain surgery studies and in traumatic brain injury assessments. However, the process still suffers from long computational times, while streamline tractography models, though fast, fail to precisely reveal brain fiber profiles in areas containing mixed fiber structures. These considerations motivated applying a suitable RL approach to fiber tracking. Experimental results on both artificial and real datasets showed considerable improvement in the fiber tracking process, especially under the conditions of uncertainty that challenge other fiber tracking techniques.
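To make the step-size issue concrete: in linear TD(0) the value weights are updated as theta <- theta + alpha * delta_t * phi(s_t), where delta_t = r_{t+1} + gamma * theta^T phi(s_{t+1}) - theta^T phi(s_t) is the TD error and the fixed step-size alpha must be hand-tuned per problem. The Python sketch below illustrates the general idea of replacing the hand-tuned alpha with a per-step value obtained by an exact line search along the descent direction. This is a minimal illustrative sketch only: the abstract does not specify how the thesis computes the steepest-descent step-size, so a simple per-sample line search on the one-step squared TD error (a residual-gradient-style update) stands in for it, and the transitions iterable is a hypothetical interface.

    import numpy as np

    def td0_sd_sketch(transitions, n_features, gamma=0.99, eps=1e-12):
        # Minimal sketch of adaptive step-size TD(0) with linear function
        # approximation. NOT the thesis's algorithm: the step-size below is
        # the exact line-search minimizer of the one-step squared TD error
        # along the residual-gradient direction, used only to illustrate
        # removing the hand-tuned step-size parameter.
        #
        # `transitions` is assumed to yield (phi, r, phi_next) tuples:
        # feature vectors of the current/next state, plus the reward.
        theta = np.zeros(n_features)
        for phi, r, phi_next in transitions:
            delta = r + gamma * theta @ phi_next - theta @ phi  # TD error
            d = phi - gamma * phi_next   # descent direction (up to the factor delta)
            # Exact line search: the alpha that zeroes the post-update TD
            # error, since delta(theta + a*delta*d) = delta * (1 - a * d@d).
            alpha = 1.0 / (d @ d + eps)  # per-step step-size, no tuning
            theta += alpha * delta * d
        return theta

With this choice the one-step TD error is driven to zero after every update, which mirrors, in the simplest possible setting, the abstract's claim that step-size settings can be removed from incremental TD learning.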
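The framing of tractography as a sequential-selection task can likewise be sketched as a small decision process: the state is the current position plus the incoming direction, the actions are candidate step directions, and the local diffusion signal scores each candidate. The sketch below is purely illustrative and assumes hypothetical inputs (a direction-scoring callable odf_score derived from the diffusion data, and a simple curvature constraint); the thesis's actual RL formulation is not described in the abstract.

    import numpy as np

    def track_fiber_sketch(seed, seed_dir, odf_score, n_steps=200,
                           step_len=0.5, max_turn_cos=0.5):
        # Illustrative sketch of streamline tracking as sequential
        # selection: at each step, choose the direction that maximizes a
        # score combining local diffusion evidence with path smoothness.
        # `odf_score(pos, u)` is a hypothetical callable returning the
        # diffusion-data support for unit direction u at position pos.
        rng = np.random.default_rng(0)
        candidates = rng.normal(size=(64, 3))   # fixed set of unit directions
        candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)

        pos = np.asarray(seed, float)
        prev = np.asarray(seed_dir, float)
        prev /= np.linalg.norm(prev)
        path = [pos.copy()]
        for _ in range(n_steps):
            ok = candidates @ prev > max_turn_cos  # curvature limit
            if not ok.any():
                break
            dirs = candidates[ok]
            scores = np.array([odf_score(pos, u) for u in dirs])
            best = dirs[np.argmax(scores)]         # greedy action selection
            pos = pos + step_len * best
            prev = best
            path.append(pos.copy())
        return np.array(path)

The greedy selection above uses only local evidence; an RL treatment would replace it with a learned value that also accounts for the quality of the remaining path under uncertainty, which is what makes the task a natural fit for TD methods.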
Keywords
Reinforcement Learning, Brain Fiber Tracking