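Cite this article: LI Weike, YUE Hongwei, WANG Hongmin, YANG Yong, ZHAO Min, DENG Fuqin. Formation of Modular Self-reconfigurable Robots Based on Improved Reinforcement Learning [J]. Computing Technology and Automation, 2022, (3): 6-13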
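LI Weike 1,4, YUE Hongwei 1, WANG Hongmin 1, YANG Yong 3, ZHAO Min 2, DENG Fuqin 1,2,3
(1. Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, Guangdong 529020; 2. Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, Guangdong 518116; 3. Shenzhen 3irobotix Co., Ltd., Shenzhen, Guangdong 518006; 4. R&D Center, CETC Potevio Science & Technology Co., Ltd., Guangzhou, Guangdong 510310)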
Formation of Modular Self-reconfigurable Robots Based on Improved Reinforcement Learning
Abstract: With traditional reinforcement learning algorithms, a modular self-reconfigurable robot lacks prior knowledge of its surroundings and therefore selects actions at random, which wastes iterations and slows convergence. To address this, a two-stage reinforcement learning algorithm is proposed. In the first stage, an improved Q-learning algorithm based on knowledge sharing among robots speeds up training and yields the optimal Q table. To reduce the number of iterations and improve convergence speed in this stage, the Manhattan distance is introduced as the reward value, guiding each robot toward the center point and reducing the influence of sparse rewards. In the second stage, each robot uses the resulting Q table and its current position to find the optimal path to its specified target point, and the robots together form the specified formation. Experimental results show that, on a 50×50 grid map, the proposed algorithm successfully trains the robots to reach the specified target points while reducing the total number of exploration steps by nearly 50% compared with the baseline algorithm. In addition, when the robots switch formations, the formation runtime is reduced by nearly a factor of five.
Keywords: modular self-reconfigurable robots; reinforcement learning; multi-robot; formation
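
The two stages described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single robot on a small grid, four axis-aligned moves, a reward equal to the decrease in Manhattan distance to the target, and a terminal bonus. The names `train_q_table`, `greedy_path`, `shaped_reward`, and `GOAL_BONUS`, as well as all hyperparameter values, are hypothetical, and the knowledge sharing among robots is omitted.

```python
import random

# Assumed 4-connected moves on the grid: +x, -x, +y, -y.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
GOAL_BONUS = 10.0  # assumed terminal reward, not taken from the paper


def manhattan(a, b):
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(state, action, size):
    """Apply a move and clamp the result to the size x size grid."""
    x = min(max(state[0] + action[0], 0), size - 1)
    y = min(max(state[1] + action[1], 0), size - 1)
    return (x, y)


def shaped_reward(state, nxt, target):
    """Assumed shaping: +1 when a move gets closer, -1 when farther, 0 if blocked."""
    if nxt == target:
        return GOAL_BONUS
    return float(manhattan(state, target) - manhattan(nxt, target))


def train_q_table(size, target, episodes=3000, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Stage 1: tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) to an estimated value
    n = len(ACTIONS)
    for _ in range(episodes):
        state = (rng.randrange(size), rng.randrange(size))
        for _ in range(4 * size * size):  # step cap per episode
            if state == target:
                break
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: q.get((state, i), 0.0))
            nxt = step(state, ACTIONS[a], size)
            reward = shaped_reward(state, nxt, target)
            best_next = 0.0 if nxt == target else max(
                q.get((nxt, i), 0.0) for i in range(n))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_path(q, start, target, size):
    """Stage 2: follow the highest-valued action from start toward target."""
    path = [start]
    state = start
    for _ in range(size * size):  # guard against loops in an untrained table
        if state == target:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state = step(state, ACTIONS[a], size)
        path.append(state)
    return path


if __name__ == "__main__":
    table = train_q_table(size=5, target=(2, 2))
    print(greedy_path(table, (0, 0), (2, 2), size=5))
```

Because every move toward the target earns an immediate positive reward, the agent receives feedback long before it first reaches the goal, which is the effect the abstract attributes to the Manhattan-distance reward. The sketch uses a 5×5 grid for speed, whereas the experiments in the paper use a 50×50 map.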