A Method for Determining the Grammatical Relation of N1+N2 Structures Based on Decision Trees and Word Semantic Similarity
Submitted: 2020-08-11  Revised: 2020-09-16
Author  Affiliation  E-mail
Yang Quan (杨泉)  Beijing Normal University  yangquan@bnu.edu.cn
Funding: Research project of the State Language Commission
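Chinese abstract: This paper presents a decision-tree-based method for determining the grammatical relation of N1+N2 structures. First, a corpus of the structure was built, and each entry was annotated with the information needed to construct the feature set: part of speech, semantic codes from Tongyici Cilin (《同义词词林》), grammatical relation, and word semantic similarity. Second, to show that similarity is a reasonable basis for judging grammatical relations, the paper examines, on linguistic grounds, how the semantic similarity between the two nouns of an N1+N2 structure relates to their grammatical relation: (1) by grammatical relation, the similarity between the two nouns ranks as coordinate relation > appositive relation > attributive-head relation > subject-predicate relation; (2) by the focus of linguistic function, it ranks as parallel-focus phrases > post-focus phrases. Finally, a feature set was built on this basis, and the C4.5 decision tree algorithm was used to establish a method that automatically determines the grammatical relation of N1+N2 structures. On the test set of the self-built corpus, the method reaches an accuracy of 89.39%.
Chinese keywords: word semantic similarity  Tongyici Cilin (《同义词词林》)  phrase level  grammatical relation  decision tree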
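The abstract names C4.5 as the learning algorithm. As a rough illustration of the criterion C4.5 uses to choose a split (gain ratio, i.e. information gain divided by split information), here is a minimal Python sketch. The feature names `sim_band` and `same_class`, the four toy rows, and their labels are hypothetical and are not taken from the paper's corpus or feature set.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of one categorical feature, the split criterion of C4.5."""
    total = len(rows)
    # Group the class labels by the value the feature takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a discretised similarity band and a flag saying
# whether the two nouns share a top-level semantic class.
rows = [
    {"sim_band": "high", "same_class": "yes"},
    {"sim_band": "high", "same_class": "yes"},
    {"sim_band": "low", "same_class": "yes"},
    {"sim_band": "low", "same_class": "no"},
]
labels = ["coordinate", "coordinate", "attributive", "attributive"]

for name in ("sim_band", "same_class"):
    print(name, round(gain_ratio(rows, labels, name), 4))
```

In this toy data the similarity band separates the two relations perfectly, so it gets the higher gain ratio and would be chosen as the first split. A full C4.5 implementation also handles continuous attributes, missing values, and pruning, none of which is shown here.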
 
Research on the judgment method of N1+N2 structure grammatical relation based on decision tree and word semantic similarity
Abstract: This paper presents a decision-tree-based method for determining the grammatical relation of N1+N2 structures. First, a corpus of the structure is built, and each entry is annotated with part of speech, Cilin semantic code, grammatical relation, and word semantic similarity. Then, to show that similarity is a reasonable basis for judging grammatical relations, the relationship between the semantic similarity of the two nouns in an N1+N2 structure and their grammatical relation is studied according to linguistic principles: 1) from the perspective of grammatical relation, the semantic similarity between the two nouns ranks as coordinate relation > appositive relation > attributive-head relation > subject-predicate relation; 2) from the perspective of language function focus, it ranks as parallel-focus phrases > post-focus phrases. Finally, a feature set is constructed on this basis, and a method for automatically determining the grammatical relation of N1+N2 structures is established using the C4.5 decision tree algorithm. On the test set of the self-built corpus, the method achieves an accuracy of 89.39%.
Keywords: word semantic similarity  Cilin  phrase level  grammatical relationship  decision tree
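The abstract does not say how word semantic similarity is computed from Cilin codes. Purely to illustrate how a hierarchical code can yield a similarity score, the sketch below counts how many levels two codes share. It assumes the five-level code layout of the extended edition of Cilin (for example `Aa01A01=`: major class, middle class, minor class, word group, atomic word group, then a one-character flag). The function names and the scoring rule are hypothetical and are not the paper's measure.

```python
# End positions of the five hierarchical levels inside an 8-character code
# such as "Aa01A01=" (assumed layout; the trailing flag is ignored).
LEVEL_ENDS = (1, 2, 4, 5, 7)


def shared_levels(code_a, code_b):
    """Count how many leading hierarchy levels two codes have in common."""
    depth = 0
    for end in LEVEL_ENDS:
        if code_a[:end] != code_b[:end]:
            break
        depth += 1
    return depth


def toy_similarity(code_a, code_b):
    """Fraction of shared levels in [0, 1]; an illustrative score only."""
    return shared_levels(code_a, code_b) / len(LEVEL_ENDS)


# Illustrative codes: same word group but a different atomic word group,
# then two codes that differ at the top level.
print(toy_similarity("Aa01A01=", "Aa01A02="))
print(toy_similarity("Aa01A01=", "Ba01A01="))
```

A score of this kind could then be discretised or used directly as one feature among the part-of-speech and semantic-code features the abstract lists. Published Cilin-based measures typically weight the levels unequally, which this sketch does not attempt.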