Control of nonholonomic mobile robot by an adaptive actor-critic method with simulated experience based value-functions

Rafiuddin Syam, Keigo Watanabe, Kiyotaka Izumi, Kazuo Kiguchi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)

Abstract

This paper proposes a new adaptive actor-critic algorithm under the assumption that a predictive model (such as a kinematic model for a robot) is available and that only the measurement at time k is used to update the learning algorithms. Two value-functions are realized as pure static mappings: they reduce to nonlinear current estimators, which can be easily constructed with any artificial neural network (NN) using sigmoidal or radial basis functions (RBFs), provided that all inputs to the value-functions are based on simulated experiences generated from the predictive model. In addition, if the predictive model is used to construct a model-based actor (MBA) within the adaptive actor-critic framework, the MBA can be viewed as a network whose connection weights are the elements of the feedback gain matrix, so that temporal difference (TD) learning can also be applied naturally to update the actor's weights. Because the method updates the learning using only one measurement at time k, relatively fast learning is expected compared with the previous approach, which needs two measurements at times k and k + 1 to update the actor-critic networks. The effectiveness of the proposed approach is illustrated by simulating a trajectory-tracking control problem for a nonholonomic mobile robot.
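The idea of updating from a single measurement at time k can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the robot model, the cost, or the estimator structure, so the standard unicycle kinematics, the squared position-error cost, the linear value-function (standing in for the sigmoidal or RBF network), and all names (`unicycle_step`, `features`, `value`, `td_update`, `gamma`, `lr`, `dt`) are hypothetical choices made here.

```python
import math


def unicycle_step(state, control, dt=0.1):
    """Predictive model (assumed): one Euler step of unicycle kinematics."""
    x, y, theta = state
    v, omega = control  # linear and angular velocity commands
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)


def features(state, reference):
    """Hypothetical features: squared position errors plus a bias term."""
    ex = state[0] - reference[0]
    ey = state[1] - reference[1]
    return (ex * ex, ey * ey, 1.0)


def value(weights, state, reference):
    """Linear value-function estimate; a stand-in for a sigmoidal/RBF network."""
    return sum(w * f for w, f in zip(weights, features(state, reference)))


def td_update(weights, state_k, control_k, ref_k, ref_next,
              gamma=0.9, lr=0.05, dt=0.1):
    """Semi-gradient TD(0) step that needs only the measurement at time k.

    The state at k + 1 is not measured; it is a simulated experience
    produced by the predictive model.
    """
    predicted_next = unicycle_step(state_k, control_k, dt)
    ex = predicted_next[0] - ref_next[0]
    ey = predicted_next[1] - ref_next[1]
    cost = ex * ex + ey * ey  # assumed one-step tracking cost
    td_error = (cost
                + gamma * value(weights, predicted_next, ref_next)
                - value(weights, state_k, ref_k))
    phi = features(state_k, ref_k)
    new_weights = tuple(w + lr * td_error * f for w, f in zip(weights, phi))
    return new_weights, td_error


# Example: one update from a single measured state at time k.
weights, delta = td_update((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0),
                           ref_k=(0.0, 0.0), ref_next=(0.2, 0.0))
```

In a scheme that waits for the measurement at k + 1, `predicted_next` would be replaced by the next measured state, so each update would lag the measurement by one step; the sketch shows only why a predictive model removes that dependency.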

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Robotics and Automation
Pages: 3960-3965
Number of pages: 6
Volume: 4
Publication status: Published - 2002
Externally published: Yes
Event: 2002 IEEE International Conference on Robotics and Automation - Washington, DC, United States
Duration: May 11 2002 to May 15 2002

Other

Other: 2002 IEEE International Conference on Robotics and Automation
Country: United States
City: Washington, DC
Period: 5/11/02 to 5/15/02

Fingerprint

Mobile robots
Learning algorithms
Kinematics
Trajectories
Robots
Neural networks
Feedback

Keywords

  • Adaptive actor-critic method
  • Neural networks
  • Nonholonomic mobile robot
  • Predictive model
  • Temporal difference (TD) learning
  • Value-function

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering

Cite this

Syam, R., Watanabe, K., Izumi, K., & Kiguchi, K. (2002). Control of nonholonomic mobile robot by an adaptive actor-critic method with simulated experience based value-functions. In Proceedings - IEEE International Conference on Robotics and Automation (Vol. 4, pp. 3960-3965)

@inproceedings{9ef7f5fb61cc4aa1a5b9b65e2c2f2bf4,
title = "Control of nonholonomic mobile robot by an adaptive actor-critic method with simulated experience based value-functions",
abstract = "In this paper, a new adaptive actor-critic algorithm is proposed under the assumption that a predictive model (such as kinematic model for a robot) is available and only the measurement at time k is used to update the learning algorithms. Two value-functions are realized as a pure static mapping, according to the fact that they can be reduced to nonlinear current estimators, which can be easily constructed by using any artificial neural networks (NNs) with sigmoidal function or radial basis function (RBF), if all the inputs to the present value-functions are based on simulated experiences generated from the predictive model. In addition, if a predictive model is assumed to be used to construct a model-based actor (MBA) in the framework of adaptive actor-critic approach, then this type of MBA can be viewed as a network whose connection weights are composed of the elements of feedback gain matrix, so that the temporal difference (TD) learning can also be naturally applied to update the weights of the actor. Since the present method can update the learning by using only one measurement at time k, a relatively fast learning is expected, compared with the previous approach that needs two measurements at times k and k + 1 to update the actor-critic networks. The effectiveness of the proposed approach is illustrated by simulating a trajectory-tracking control problem for a nonholonomic mobile robot.",
keywords = "Adaptive actor-critic method, Neural networks, Nonholonomic mobile robot, Predictive model, Temporal difference (TD) learning, Value-function",
author = "Rafiuddin Syam and Keigo Watanabe and Kiyotaka Izumi and Kazuo Kiguchi",
year = "2002",
language = "English",
volume = "4",
pages = "3960--3965",
booktitle = "Proceedings - IEEE International Conference on Robotics and Automation",

}

TY - GEN

T1 - Control of nonholonomic mobile robot by an adaptive actor-critic method with simulated experience based value-functions

AU - Syam, Rafiuddin

AU - Watanabe, Keigo

AU - Izumi, Kiyotaka

AU - Kiguchi, Kazuo

PY - 2002

Y1 - 2002

AB - In this paper, a new adaptive actor-critic algorithm is proposed under the assumption that a predictive model (such as kinematic model for a robot) is available and only the measurement at time k is used to update the learning algorithms. Two value-functions are realized as a pure static mapping, according to the fact that they can be reduced to nonlinear current estimators, which can be easily constructed by using any artificial neural networks (NNs) with sigmoidal function or radial basis function (RBF), if all the inputs to the present value-functions are based on simulated experiences generated from the predictive model. In addition, if a predictive model is assumed to be used to construct a model-based actor (MBA) in the framework of adaptive actor-critic approach, then this type of MBA can be viewed as a network whose connection weights are composed of the elements of feedback gain matrix, so that the temporal difference (TD) learning can also be naturally applied to update the weights of the actor. Since the present method can update the learning by using only one measurement at time k, a relatively fast learning is expected, compared with the previous approach that needs two measurements at times k and k + 1 to update the actor-critic networks. The effectiveness of the proposed approach is illustrated by simulating a trajectory-tracking control problem for a nonholonomic mobile robot.

KW - Adaptive actor-critic method

KW - Neural networks

KW - Nonholonomic mobile robot

KW - Predictive model

KW - Temporal difference (TD) learning

KW - Value-function

UR - http://www.scopus.com/inward/record.url?scp=0036057068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036057068&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0036057068

VL - 4

SP - 3960

EP - 3965

BT - Proceedings - IEEE International Conference on Robotics and Automation

ER -