An Improved Method for the ε-greedy Action Strategy in Reinforcement Learning
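Cite this article: Li Chen, Li Maojun, Du Jiajia. A Modified Method to Reinforcement Learning Action Strategy ε-greedy [J]. Computing Technology and Automation (计算技术与自动化), 2019, (2): 141-145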
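Author affiliation
Li Chen (李琛), Li Maojun (李茂军), Du Jiajia (杜佳佳) (School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China)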
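Abstract (translated from the Chinese): Reinforcement learning is a form of unsupervised learning within machine learning, and one of the difficulties in applying it in practice is how to balance exploration against exploitation. Building on Q-learning combined with ε-greedy, this paper proposes a strategy that adjusts the parameter ε dynamically. The strategy uses the learner's progress in each state during learning as the basis for adapting ε, so that exploration and exploitation are better balanced. An action-pruning mechanism combined with trial and error is also introduced: it prunes the set of candidate actions to make the learner's exploration more efficient. Finally, simulation experiments on a maze problem verify the effectiveness of the proposed method.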
Keywords (translated from the Chinese): reinforcement learning; ε-greedy strategy; exploration and exploitation
 
A Modified Method to Reinforcement Learning Action Strategy ε-greedy
Abstract: Reinforcement learning is a form of unsupervised learning within machine learning, and one of the difficulties in applying it in practice is how to balance exploration and exploitation. To address this problem, a strategy that dynamically adjusts the parameter ε is proposed on the basis of Q-learning combined with the ε-greedy strategy. The strategy adapts the parameter according to the agent's learning status in each state of the environment during learning, so as to better balance exploration and exploitation. In addition, an action-reduction mechanism combined with the trial-and-error method is introduced to prune the candidate action set and thereby improve the agent's exploration efficiency. Simulation results on a maze problem verify the effectiveness of the proposed method.
Keywords: reinforcement learning; ε-greedy strategy; exploration and exploitation
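The abstract names three ingredients: Q-learning with ε-greedy action selection, a per-state adaptive ε, and trial-and-error pruning of the candidate action set. It does not give the adaptation rule or the pruning criterion, so the following is only a minimal sketch under stated assumptions and not the authors' method. The maze layout, the reward values, the schedule ε = eps0 / sqrt(visits to the state), and the rule "prune an action once it is seen to leave the state unchanged" are all hypothetical choices made for illustration.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; a move into a wall or off the grid leaves the state unchanged."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    if not inside or MAZE[nr][nc] == "#":
        return state, -1.0, False      # blocked move (assumed penalty)
    if MAZE[nr][nc] == "G":
        return (nr, nc), 10.0, True    # goal reached (assumed reward)
    return (nr, nc), -0.1, False       # ordinary move (assumed step cost)


def train(episodes=500, alpha=0.5, gamma=0.9, eps0=1.0, seed=0):
    rng = random.Random(seed)
    q = {}       # (state, action index) -> estimated value, default 0.0
    visits = {}  # state -> number of visits
    pruned = {}  # state -> action indices removed from the candidate set
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap on episode length
            visits[state] = visits.get(state, 0) + 1
            # Assumed per-state schedule: explore less in well-visited states.
            eps = eps0 / visits[state] ** 0.5
            candidates = [i for i in range(4)
                          if i not in pruned.get(state, set())]
            if rng.random() < eps:
                a = rng.choice(candidates)  # explore
            else:
                a = max(candidates, key=lambda i: q.get((state, i), 0.0))  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            if nxt == state:
                # Assumed pruning rule: the action was tried and proved useless here.
                pruned.setdefault(state, set()).add(a)
            best_next = 0.0 if done else max(q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((state, a), 0.0)
            # Standard Q-learning update.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q, pruned


if __name__ == "__main__":
    q, pruned = train()
    print("pruned at start:", sorted(pruned.get((0, 0), set())))
```

In this sketch a pruned action is never sampled again in that state, so later exploration is spent only on moves that can change the state. A fixed ε would keep retrying blocked moves at the same rate throughout training.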