An Improved Method for the Reinforcement Learning Action Strategy ε-greedy

Li Chen; Li Maojun; Du Jiajia

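Cite this article: Li Chen, Li Maojun, Du Jiajia. An improved method for the reinforcement learning action strategy ε-greedy [J]. Computing Technology and Automation (计算技术与自动化), 2019, (2): 141-145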

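Authors	Affiliation
Li Chen, Li Maojun, Du Jiajia	(School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China)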

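Abstract (translated from the Chinese): Reinforcement learning is a form of unsupervised learning within machine learning, and one of the difficulties in applying it in practice is how to balance exploration against exploitation. Building on Q-learning combined with ε-greedy, this paper proposes a strategy that adjusts the parameter dynamically. The strategy uses the learner's learning status in each state during the learning process as the basis for adapting the parameter, and thereby balances exploration and exploitation better. In addition, an action pruning mechanism that incorporates trial and error is introduced to prune the candidate action set and improve the learner's exploration efficiency. Finally, simulation experiments on a maze problem verify the effectiveness of the proposed method.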

Keywords (translated from the Chinese): reinforcement learning; ε-greedy strategy; exploration and exploitation

A Modified Method to Reinforcement Learning Action Strategy ε-greedy

Abstract: Reinforcement learning is a form of unsupervised learning in machine learning, and one of the difficulties in its practical application is how to balance exploration and exploitation. To address this problem, a dynamic parameter adjustment strategy based on Q-learning combined with the ε-greedy strategy is presented. The strategy adapts the parameter according to the agent's learning status in each state of the environment during the learning process, so as to better balance exploration and exploitation. Meanwhile, an action reduction method that incorporates trial and error is introduced to prune the action set and improve the agent's exploration efficiency. Simulation results on a maze problem verify the effectiveness of the proposed method.

Keywords: reinforcement learning; ε-greedy strategy; exploration and exploitation
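
The abstract names two ingredients, a per-state adaptive ε and a trial-and-error pruning of the candidate action set, but does not state the formulas. The following is only a minimal sketch of how such a scheme could look on top of tabular Q-learning, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the class name `AdaptiveEpsilonGreedyAgent`, the multiplicative decay of ε on each visit to a state, the `failed` flag that marks an action as useless in a state, and the one-dimensional corridor standing in for the maze.

```python
import random
from collections import defaultdict


class AdaptiveEpsilonGreedyAgent:
    """Tabular Q-learning with a per-state epsilon and per-state action pruning.

    Illustrative only: the adaptation and pruning rules are assumptions,
    not the rules from the paper.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9,
                 eps_max=1.0, eps_min=0.05, decay=0.9, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.eps_min, self.decay = eps_min, decay
        self.rng = random.Random(seed)
        self.q = defaultdict(float)               # (state, action) -> Q value
        self.eps = defaultdict(lambda: eps_max)   # state -> exploration rate
        self.pruned = defaultdict(set)            # state -> removed actions

    def candidates(self, state):
        """Actions still allowed in this state (never an empty list)."""
        allowed = [a for a in self.actions if a not in self.pruned[state]]
        return allowed or self.actions

    def choose(self, state):
        """Epsilon-greedy choice over the pruned candidate set."""
        allowed = self.candidates(state)
        if self.rng.random() < self.eps[state]:
            return self.rng.choice(allowed)       # explore
        best = max(self.q[(state, a)] for a in allowed)
        ties = [a for a in allowed if self.q[(state, a)] == best]
        return self.rng.choice(ties)              # exploit, random tie-break

    def update(self, state, action, reward, next_state, done, failed=False):
        """One Q-learning step, then pruning and epsilon adaptation."""
        if failed:
            # Assumed trial-and-error rule: drop an action that proved useless here.
            self.pruned[state].add(action)
        if done:
            target = reward
        else:
            future = max(self.q[(next_state, a)]
                         for a in self.candidates(next_state))
            target = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        # Assumed adaptation rule: each visit to a state shrinks that state's epsilon.
        self.eps[state] = max(self.eps_min, self.eps[state] * self.decay)


def run_corridor(agent, length=5, episodes=50, max_steps=200):
    """Train on a hypothetical 1-D corridor: start in cell 0, goal in the last cell."""
    steps_per_episode = []
    for _ in range(episodes):
        state, steps = 0, 0
        while state != length - 1 and steps < max_steps:
            action = agent.choose(state)
            nxt = state + (1 if action == "right" else -1)
            failed = nxt < 0                      # walked into the left wall
            nxt = max(nxt, 0)
            done = nxt == length - 1
            reward = 1.0 if done else (-1.0 if failed else -0.01)
            agent.update(state, action, reward, nxt, done, failed)
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return steps_per_episode


if __name__ == "__main__":
    agent = AdaptiveEpsilonGreedyAgent(["left", "right"])
    print(run_corridor(agent))
```

Keeping ε and the pruned set per state, rather than as one global value, means that frequently visited states become greedy early while rarely visited states keep exploring, which is one plausible reading of "adapting the parameter according to the learning status in each state".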
