Research on Lip-reading Based on BiLSTM-Attention
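Cite this article: Liu Dayun, Fang Guozhi, Luo Tianyi, Wei Huajie, Wang Qian, Li Xiuzheng, Li Ao. Research on Lip-reading Based on BiLSTM-Attention [J]. Computing Technology and Automation, 2020, (1): 150-155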
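Authors and affiliations
Liu Dayun 1, Fang Guozhi 2, Luo Tianyi 3, Wei Huajie 1, Wang Qian 1, Li Xiuzheng 1, Li Ao 1 (1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080; 2. School of Measurement-Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin, Heilongjiang 150080; 3. School of Automation, Harbin University of Science and Technology, Harbin, Heilongjiang 150080)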
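Abstract (translated from the Chinese): To address the problems of lip feature extraction and temporal relationship recognition in lip-reading, a deep learning model combining a bi-directional long short-term memory network (BiLSTM) with an attention mechanism is proposed. First, the heights and widths at different positions of the lip, derived from 20 lip key points, are taken as the lip features, and BiLSTM is used to temporally encode the lip feature sequence. Next, the attention mechanism is used to discover the different weights that the lip temporal features at different time steps contribute to the overall lip-reading result. Finally, Softmax is used for classification. The model is compared experimentally with traditional lip-reading models on the public lip-reading datasets GRID and MIRACL-VC. Accuracy improves by at least 13.4% on GRID, by at least 15.3% on the MIRACL-VC word dataset, and by at least 9.2% on the MIRACL-VC phrase dataset. The model is also compared with other encoding models, and the experimental results show that it effectively improves lip-reading accuracy.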
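Keywords (translated from the Chinese): lip-reading  bi-directional long short-term memory network  attention mechanism  deep learning  temporal encoding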
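The geometric feature step in the abstract (heights and widths at different lip positions, computed from 20 key points) can be sketched as below. The abstract does not say which points are paired, so the pairing here is an assumption: it supposes the 20 points follow the mouth ordering of the common 68-point face landmark layout (12 outer-lip points followed by 8 inner-lip points). The names `WIDTH_PAIRS`, `HEIGHT_PAIRS`, and `lip_features` are hypothetical and not taken from the paper.

```python
import math

# ASSUMPTION: points 0-11 trace the outer lip and 12-19 the inner lip,
# as in the mouth region of the common 68-point landmark layout.
# The paper's actual pairing may differ.
WIDTH_PAIRS = [(0, 6), (12, 16)]            # outer and inner mouth corners
HEIGHT_PAIRS = [
    (1, 11), (2, 10), (3, 9), (4, 8), (5, 7),   # outer lip, upper vs lower
    (13, 19), (14, 18), (15, 17),               # inner lip, upper vs lower
]


def lip_features(points):
    """Return [widths..., heights...] for one frame of 20 (x, y) lip points."""
    if len(points) != 20:
        raise ValueError("expected exactly 20 lip key points")
    pairs = WIDTH_PAIRS + HEIGHT_PAIRS
    # Euclidean distance between each paired key point.
    return [math.dist(points[i], points[j]) for i, j in pairs]


def lip_feature_sequence(frames):
    """Map a video (list of frames, each with 20 points) to a feature sequence."""
    return [lip_features(frame) for frame in frames]
```

Each frame yields a 10-dimensional vector under this pairing, and a clip becomes a sequence of such vectors that a temporal encoder such as BiLSTM could consume.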
 
Research on Lip-reading Based on BiLSTM-Attention
Abstract: To address existing problems in lip feature extraction and temporal relation recognition in lip-reading, a deep learning model combining bi-directional long short-term memory (BiLSTM) with an attention mechanism is proposed. First, the heights and widths at different positions of the lip, obtained from 20 lip key points, are taken as the lip features. Second, the BiLSTM model encodes the temporal information in the lip feature sequence. Third, the attention mechanism is used to learn the different weights that the lip sequential features at different time steps carry in the overall recognition. Finally, a Softmax classifier performs the classification. Compared with conventional lip-reading models on the public lip-reading datasets GRID and MIRACL-VC, the proposed model improves recognition accuracy by at least 13.4% on GRID, by at least 15.3% on the MIRACL-VC word dataset, and by at least 9.2% on the MIRACL-VC phrase dataset. The model is also compared with other encoding models, and the experimental results show that it effectively improves lip-reading accuracy.
Keywords: lip-reading  bi-directional long short-term memory  attention mechanism  deep learning  sequential coding
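The attention and Softmax steps named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the attention formula, so this assumes one common form: each time step's score is the dot product of a learned vector `w` with `tanh` of the hidden state, the scores are normalised with softmax, and the weighted sum of hidden states is classified. The hidden states stand in for BiLSTM outputs, and all names and parameter values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Weight each time step and return (weights, context vector).

    hidden_states: list of T vectors (stand-ins for BiLSTM outputs).
    w: scoring vector of the same dimension (ASSUMED attention form).
    """
    scores = [
        sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
        for h in hidden_states
    ]
    alphas = softmax(scores)                 # one weight per time step
    dim = len(w)
    context = [
        sum(a * h[d] for a, h in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return alphas, context


def classify(context, weights, bias):
    """Linear layer followed by softmax, giving class probabilities."""
    logits = [
        sum(wk * c for wk, c in zip(row, context)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy usage: 3 time steps, 2-dimensional hidden states, 2 classes.
H = [[0.1, -0.2], [0.9, 0.4], [0.3, 0.0]]
alphas, context = attention_pool(H, w=[1.0, 0.5])
probs = classify(context, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
```

In a trained model, `w` and the classifier parameters would be learned jointly with the BiLSTM. Time steps whose hidden states score higher receive larger weights in the context vector.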