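Semi-supervised sensitive data extraction algorithm for human-computer interaction information incorporating attention mechanisms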
Submitted: 2022-07-13  Revised: 2022-08-26
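Author; Affiliation; Postal code
Mou Shaoxia (牟少霞); University of Perpetual Help, Philippines (菲律宾永恒大学); 250011
Lü Bingcai (吕冰彩); Shandong Provincial Education Admissions and Examination Institute (山东省教育招生考试院);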
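Chinese abstract (translated): To improve sensitive data extraction, a semi-supervised method for extracting sensitive data from human-computer interaction information is proposed that incorporates attention mechanisms. Convolution-like attention and human-computer interaction attention are combined to build a bidirectional long short-term memory model with a conditional random field (Bi-LSTM-CRF) and interactive attention. The model's convolution-like interactive attention mechanism converts each sensitive word into a character matrix, and a Bi-LSTM encodes this matrix to obtain a distributed representation of the word's character-level features. A second Bi-LSTM encoding of this representation yields hidden states that carry the word's context. On these hidden states, the convolution-like attention layer and the interactive attention layer apply attention weighting, producing a convolution-like attention matrix and an interactive attention matrix. Concatenating the two gives a two-layer attention matrix, which the gated recurrent unit of the interactive attention layer updates into a new attention matrix. A fully connected layer then reduces its dimension to obtain the predicted label for each sensitive word, achieving semi-supervised sensitive data extraction from human-computer interaction information. Experimental results show that the method effectively reduces the complexity of sensitive data extraction and achieves a high recall rate.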
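Chinese keywords (translated): attention mechanism  human-computer interaction  semi-supervised  sensitive data extraction  BiLSTM model  CRF model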
 
Semi-supervised sensitive data extraction algorithm for human-computer interaction information based on attention mechanism
Abstract: To improve sensitive data extraction, a semi-supervised method for extracting sensitive data from human-computer interaction information is proposed that integrates attention mechanisms. A Bi-LSTM-CRF model is built by combining convolution-like attention with human-computer interaction attention. The model first converts sensitive words into a character matrix, and a Bi-LSTM encodes this matrix to obtain a distributed representation of the words' character-level features. A second Bi-LSTM encoding of this representation produces hidden states that capture each word's context. Based on these hidden states, the convolution-like attention layer distributes attention weights over the words within a short range of each word, giving the convolution-like attention matrix; the interactive attention layer distributes attention weights over all sensitive words, giving the interactive attention matrix. The two matrices are concatenated into a two-layer attention matrix, which the gated recurrent unit of the interactive attention layer updates into a new attention matrix. A fully connected layer reduces the dimension of this matrix to obtain the predicted label for each sensitive word, realizing semi-supervised sensitive data extraction from human-computer interaction information. Experimental results show that the method effectively reduces the complexity of sensitive data extraction and achieves a high recall rate.
Keywords: Attention mechanism  Human-computer interaction  Semi-supervised  Sensitive data extraction  BiLSTM model  CRF model
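
The attention step described in the abstract (short-range attention and all-word attention over the Bi-LSTM hidden states, concatenation, then a fully connected layer that yields one label per word) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so dot-product scoring, the window size, the random hidden states standing in for Bi-LSTM output, and every function name below are assumptions. The Bi-LSTM encoders, the gated recurrent unit update, and the CRF layer are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(hidden, t, positions):
    """Weighted sum of hidden[j] for j in positions, queried by hidden[t].

    Dot-product scoring is an assumption; the abstract does not name one.
    """
    weights = softmax([dot(hidden[t], hidden[j]) for j in positions])
    dim = len(hidden[t])
    return [sum(w * hidden[j][k] for w, j in zip(weights, positions))
            for k in range(dim)]


def two_layer_attention(hidden, window=1):
    """Concatenate short-range and all-position attention contexts per word."""
    n = len(hidden)
    rows = []
    for t in range(n):
        # Short-range attention: only neighbours within `window` of position t.
        local_pos = list(range(max(0, t - window), min(n, t + window + 1)))
        local_ctx = attend(hidden, t, local_pos)
        # All-position attention: every word in the sequence.
        global_ctx = attend(hidden, t, list(range(n)))
        rows.append(local_ctx + global_ctx)  # list concatenation
    return rows


def predict_labels(features, weights, bias):
    """Fully connected layer followed by argmax, one label index per word."""
    labels = []
    for x in features:
        scores = [dot(w, x) + b for w, b in zip(weights, bias)]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Hypothetical stand-in for Bi-LSTM output: 5 words, hidden size 4.
random.seed(0)
hidden = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
features = two_layer_attention(hidden, window=1)

# Hypothetical untrained layer mapping 8 features to 3 tag indices.
fc_weights = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
fc_bias = [0.0, 0.0, 0.0]
labels = predict_labels(features, fc_weights, fc_bias)
print(labels)
```

Because the weights here are random and untrained, the printed labels carry no meaning; the sketch only shows how the two attention matrices are formed, concatenated, and reduced to one label per word. In a full Bi-LSTM-CRF model, the per-word scores would feed a CRF layer rather than a plain argmax.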