Dynamic Following Control of an Induction Brazing Robotic Arm Based on Reinforcement Learning
Submitted: 2026-01-16  Revised: 2026-01-29
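Author  Affiliation  Postal code
张逸扬  School of Electronic and Information Engineering, Suzhou University of Science and Technology  215009
杨亭亭  School of Electronic and Information Engineering, Suzhou University of Science and Technology
李廷成  School of Electronic and Information Engineering, Suzhou University of Science and Technology
丁戍辰  School of Electronic and Information Engineering, Suzhou University of Science and Technology
吕 喆*  School of Electronic and Information Engineering, Suzhou University of Science and Technology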
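Funding: National Natural Science Foundation of China (62503347, 62273247); Natural Science Foundation of Jiangsu Province (BK20250992); Natural Science Research Project of Jiangsu Higher Education Institutions (25KJB120009); Suzhou Science and Technology Program (SYG2025131).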
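Chinese abstract (in translation): In conventional induction brazing, process deviations are corrected by manual adjustment, which suffers from slow response and long adjustment cycles. To address this, an improved Deep Deterministic Policy Gradient (DDPG) algorithm is proposed. First, the algorithm uses a multi-objective reward function to optimize tracking and obstacle avoidance at the same time. Second, a dynamic priority scoring mechanism quantifies the learning value of each sample, and a dual-layer experience replay framework balances exploration and exploitation efficiency during training. Finally, simulations on a six-degree-of-freedom (6DOF) robotic arm (ABB IRB 2600) show that the improved algorithm raises training speed by 7.5% and reduces end-effector tracking error by 40%-60%. This verifies that the proposed method improves the learning efficiency and tracking accuracy of the robotic arm and offers a feasible route to intelligent precision joining.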
Chinese keywords (in translation): induction brazing  deep deterministic policy gradient algorithm  robotic arm
 
Dynamic Following Control of an Induction Brazing Robotic Arm Based on Deep Reinforcement Learning
Abstract: In traditional induction brazing, process deviations are corrected by manual adjustment, which causes bottlenecks such as response lag and long adjustment cycles. To address this, an improved Deep Deterministic Policy Gradient (DDPG) algorithm is proposed. First, a multi-objective reward function is designed to optimize tracking and obstacle avoidance simultaneously. Second, a dynamic priority scoring mechanism is introduced to quantify the learning value of each sample. Furthermore, a dual-layer experience replay framework is adopted to balance exploration and exploitation during training. Finally, simulations on a six-degree-of-freedom (6DOF) robotic arm (ABB IRB 2600) show that the improved algorithm increases training speed by 7.5% and reduces end-effector tracking error by 40%-60%. These results verify that the proposed method enhances the learning efficiency and tracking accuracy of the robotic arm, providing a feasible solution for the intelligent implementation of precision joining.
Keywords: Induction brazing  Deep Deterministic Policy Gradient algorithm  Robotic arm
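The abstract names three ingredients (a multi-objective reward, a priority score per sample, and a dual-layer replay buffer) without giving their formulas. The following is a minimal sketch of how such pieces might fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function and class names, the weights `w_track` and `w_avoid`, the safety radius `safe_dist`, the scoring rule (absolute TD error plus reward magnitude), the threshold, and the fixed sampling split. The score here is computed once at insertion, whereas the dynamic mechanism described in the abstract may update it during training.

```python
import math
import random
from collections import deque


def reward(ee_pos, target_pos, obstacle_pos,
           safe_dist=0.10, w_track=1.0, w_avoid=0.5):
    """Hypothetical two-term reward: tracking penalty plus obstacle-proximity penalty."""
    track_err = math.dist(ee_pos, target_pos)      # end-effector to target distance
    clearance = math.dist(ee_pos, obstacle_pos)    # end-effector to obstacle distance
    # 0 outside the assumed safety radius, rising linearly to 1 at contact
    avoid_pen = max(0.0, safe_dist - clearance) / safe_dist
    return -w_track * track_err - w_avoid * avoid_pen


class TwoTierReplay:
    """Hypothetical two-tier buffer: high-score transitions are stored separately."""

    def __init__(self, capacity=10000, score_threshold=1.0, priority_fraction=0.5):
        self.priority = deque(maxlen=capacity)   # transitions judged more informative
        self.regular = deque(maxlen=capacity)    # everything else
        self.score_threshold = score_threshold
        self.priority_fraction = priority_fraction

    @staticmethod
    def score(td_error, reward_value):
        # Assumed scoring rule, not taken from the paper
        return abs(td_error) + abs(reward_value)

    def add(self, transition, td_error, reward_value):
        high = self.score(td_error, reward_value) >= self.score_threshold
        (self.priority if high else self.regular).append(transition)

    def sample(self, batch_size, rng=random):
        # Draw a fixed share from the priority tier, the rest from the regular tier;
        # the batch is smaller than batch_size if the tiers do not hold enough items
        n_pri = min(len(self.priority), int(batch_size * self.priority_fraction))
        n_reg = min(len(self.regular), batch_size - n_pri)
        return (rng.sample(list(self.priority), n_pri)
                + rng.sample(list(self.regular), n_reg))
```

In this sketch, `priority_fraction` is the knob that trades replaying informative transitions against keeping the batch diverse; a DDPG training loop would call `add` after each environment step and `sample` before each critic and actor update.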