Civil and Environmental Engineering and Construction Faculty Research

A Graph-Based Reinforcement Learning Method with Converged State Exploration and Exploitation

Han Li, Wenzhou University
Tianding Chen, Minnan Normal University
Hualiang Teng, University of Nevada, Las Vegas
Yingtao Jiang, University of Nevada, Las Vegas

Document Type

Article

Publication Date

1-1-2019

Publication Title

Computer Modeling in Engineering and Sciences

Publisher

Tech Science Press

Volume

118

Issue

2

First page number:

253

Last page number:

274

Abstract

In any classical value-based reinforcement learning method, an agent, despite its continuous interactions with the environment, cannot quickly generate a complete and independent description of the entire environment, which leaves the learning method struggling with the dilemma of choosing between two tasks: exploration and exploitation. This problem becomes more pronounced when the agent has to deal with a dynamic environment whose configuration and/or parameters are constantly changing. In this paper, the problem is approached by first mapping a reinforcement learning scheme to a directed graph, so that the set containing all the states already explored continues to be exploited in the context of that graph. We prove that the two tasks of exploration and exploitation eventually converge in the decision-making process, and thus there is no need to face the exploration vs. exploitation tradeoff as all existing reinforcement learning methods do. Rather, this observation indicates that a reinforcement learning scheme is essentially the same as searching for the shortest path in a dynamic environment, which is readily tackled by the modified Floyd-Warshall algorithm proposed in the paper. The experimental results confirm that the proposed graph-based reinforcement learning algorithm performs significantly better than both the standard Q-learning algorithm and an improved Q-learning algorithm in solving mazes, making it an algorithm of choice in applications involving dynamic environments.
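
To make the shortest-path view of maze solving concrete, the following is a minimal sketch of the textbook Floyd-Warshall algorithm applied to a small grid maze treated as a directed graph. It is not the modified algorithm proposed in the paper, whose details are not given in this record. The maze layout, the unit edge costs, and the names `maze_to_graph`, `floyd_warshall`, and `shortest_path` are all assumptions made for illustration.

```python
from itertools import product

INF = float("inf")

# Hypothetical 3x3 maze: '#' is a wall, 'S' the start, 'G' the goal.
MAZE = [
    "S.#",
    ".##",
    "..G",
]


def maze_to_graph(maze):
    """Map each open cell to a node and each legal move to a unit-cost directed edge."""
    cells = [(r, c) for r, row in enumerate(maze)
             for c, ch in enumerate(row) if ch != "#"]
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    nxt = [[j if i == j else None for j in range(n)] for i in range(n)]
    for (r, c), i in index.items():
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            j = index.get((r + dr, c + dc))
            if j is not None:  # neighbour exists and is not a wall
                dist[i][j] = 1
                nxt[i][j] = j
    return cells, index, dist, nxt


def floyd_warshall(dist, nxt):
    """Standard all-pairs shortest paths; updates dist and nxt in place."""
    n = len(dist)
    # product yields k in the outermost position, as the algorithm requires.
    for k, i, j in product(range(n), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
            nxt[i][j] = nxt[i][k]
    return dist, nxt


def shortest_path(cells, index, nxt, start, goal):
    """Follow next-hop pointers from start to goal; None if unreachable."""
    i, j = index[start], index[goal]
    if nxt[i][j] is None:
        return None
    path = [start]
    while i != j:
        i = nxt[i][j]
        path.append(cells[i])
    return path


cells, index, dist, nxt = maze_to_graph(MAZE)
floyd_warshall(dist, nxt)
path = shortest_path(cells, index, nxt, (0, 0), (2, 2))
print(dist[index[(0, 0)]][index[(2, 2)]], path)
```

In this sketch, a change to the maze would be handled by rebuilding the graph and rerunning `floyd_warshall`; how the paper's modified algorithm handles a dynamic environment is not described here.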

Keywords

Reinforcement learning; Graph; Exploration and exploitation; Maze

Disciplines

Computer Engineering

File Format

pdf

File Size

1140 KB

Language

English

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Repository Citation

Li, H., Chen, T., Teng, H., Jiang, Y. (2019). A Graph-Based Reinforcement Learning Method with Converged State Exploration and Exploitation. Computer Modeling in Engineering and Sciences, 118(2), 253-274. Tech Science Press.
http://dx.doi.org/10.31614/cmes.2019.05807

Included in

Computer Engineering Commons
