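Research on Intelligent Speech Recognition Method for Multi Person Conversation in Complex Scenes Based on Support Vector Machine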
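Cite this article: Liu Zihan, Shen Li, Xi Mengting, Lu Jiaxin, Zhu Jiajia, Zha Junjie. Research on Intelligent Speech Recognition Method for Multi Person Conversation in Complex Scenes Based on Support Vector Machine[J]. Computing Technology and Automation (计算技术与自动化), 2024, (4): 59-65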
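Authors and affiliation
Liu Zihan, Shen Li, Xi Mengting, Lu Jiaxin, Zhu Jiajia, Zha Junjie (Information and Communication Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing, Jiangsu 210000)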
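Abstract (translated from the Chinese): A single feature of multi-person conversational speech carries too little information about the gender combination of the speakers, which makes recognition results inaccurate. To address this problem, a support vector machine (SVM) based intelligent recognition method for multi-person conversational speech in complex scenes is proposed. A distance metric is used to detect change points in multi-person conversations in complex scenes. The log-likelihood values of any two datasets are computed to build a score set, and the T-Test similarity measure is then used to judge whether the two datasets differ significantly. An SVM discriminant function is constructed, and the mapping logic of the SVM is used to separate similar voices. The SVM's binary linear classifier is used to build the optimal discriminant function, which is combined with the male and female pitch frequency and the non-resonant frequency of the signal to recognize multi-person conversational speech. The experimental results show that, for pitch frequency recognition, the amplitude fluctuation ranges for males and females are -0.5 to 0.5 and -0.7 to 0.7 respectively, consistent with the experimental data. For recognition of the signal's non-resonant frequency, the frequency fluctuation ranges for males and females are -600 to 600 Hz and -360 to 405 Hz respectively; the male range differs from the experimental data by only 50 Hz, and the female range is consistent with the experimental data.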
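Keywords (translated from the Chinese): support vector machine; complex scenes; multi-person conversation; intelligent speech recognition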
 
Research on Intelligent Speech Recognition Method for Multi Person Conversation in Complex Scenes Based on Support Vector Machine
Abstract: A single feature in multi-person conversation speech represents too little gender combination information, which leads to inaccurate speech recognition results. To address this problem, a support vector machine (SVM) based intelligent recognition method for multi-person conversation speech in complex scenes is proposed. A distance measurement method is used to detect change points in multi-person conversations in complex scenes. The log-likelihood values of any two datasets are calculated to construct a score set. The T-Test similarity measurement method is then used to determine whether the two datasets differ significantly. An SVM discriminant function is constructed, and the mapping logic of the SVM is used to separate similar voices. The binary linear classifier of the SVM is used to construct the optimal discriminant function and, combined with the male and female pitch frequency and the non-resonant frequency of the signal, intelligent recognition of multi-person conversation speech is realized. The experimental results show that, for pitch frequency recognition, the amplitude fluctuation range of the proposed method is -0.5 to 0.5 for males and -0.7 to 0.7 for females, which is consistent with the experimental data. For non-resonant frequency recognition, the frequency fluctuation ranges for males and females are -600 to 600 Hz and -360 to 405 Hz, respectively. The male frequency fluctuation range differs from the experimental data by only 50 Hz, while the female range is consistent with the experimental data.
Keywords: support vector machine; complex scenes; multi-person conversation; intelligent speech recognition
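
The abstract's change-point step (log-likelihood scores for two datasets, followed by a T-Test for significant difference) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one scalar feature per frame (for example a pitch value in Hz), a single Gaussian fitted to the first segment as the scoring model, Welch's form of the t statistic, and a hypothetical decision threshold of 2.0; none of these choices are stated in the abstract, and the function names are illustrative.

```python
import math
from statistics import mean, variance


def log_likelihood_scores(segment, mu, sigma):
    """Per-frame log-likelihood of each value under N(mu, sigma^2).

    Assumption: a single Gaussian is the scoring model; the paper's
    actual model is not given in the abstract.
    """
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return [const - (x - mu) ** 2 / (2.0 * sigma ** 2) for x in segment]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def is_change_point(left, right, threshold=2.0):
    """Flag a change point when the two score sets differ significantly.

    The model is fitted to `left`, both segments are scored under it,
    and the two score sets are compared with a t statistic.
    `threshold` is a hypothetical cut-off, not a value from the paper.
    Both segments need at least two non-identical values.
    """
    mu = mean(left)
    sigma = math.sqrt(variance(left))
    scores_left = log_likelihood_scores(left, mu, sigma)
    scores_right = log_likelihood_scores(right, mu, sigma)
    return abs(welch_t(scores_left, scores_right)) > threshold


if __name__ == "__main__":
    # Made-up pitch values (Hz), for illustration only.
    speaker_a = [118, 121, 119, 122, 120, 117, 123, 120]
    speaker_a_later = [119, 121, 120, 118, 122, 120, 121, 119]
    speaker_b = [218, 222, 219, 221, 220, 217, 223, 220]
    print(is_change_point(speaker_a, speaker_a_later))
    print(is_change_point(speaker_a, speaker_b))
```

In this sketch a segment pair that passes the test would be treated as a candidate boundary between speakers; the segments on either side could then be handed to the SVM discriminant function described in the abstract.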