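Zheng Yi (郑轶) (School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318)
Chinese keywords: CRFs  character  character information  information extraction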
Character Information Extraction Based on Conditional Random Fields
Abstract: This paper treats the extraction of character information from Baike HTML as a sequence labeling problem and uses conditional random fields (CRFs) to label the raw data. It details the methods of data analysis and feature selection, and describes how to extract information directly from the raw data, without a data preprocessing step or a sentence parser. Omitting these steps improves the efficiency of information extraction. Two comparative experiments show that the proposed method can accurately extract character information from raw HTML.
Keywords: CRFs  CRF  character  information extraction
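To illustrate what labeling raw HTML as a sequence might look like, the following is a minimal sketch of turning an HTML fragment into a token sequence with per-token features, with no preprocessing or sentence parsing. The tokenizer, the function names, and the feature set (`token`, `is_tag`, `is_digit`, `prev`, `next`) are assumptions made for illustration; the abstract does not state which features the paper actually selects.

```python
import re

# Hypothetical tokenizer: an HTML tag is one token, and each run of
# non-space text between tags is one token. Tags are kept, not stripped.
TOKEN_RE = re.compile(r"<[^>]+>|[^<\s]+")


def tokenize(html):
    """Split raw HTML into a flat sequence of tag and text tokens."""
    return TOKEN_RE.findall(html)


def token_features(tokens, i):
    """Build an illustrative feature dict for the token at position i."""
    tok = tokens[i]
    return {
        "token": tok,
        "is_tag": tok.startswith("<"),
        "is_digit": tok.isdigit(),
        # Neighboring tokens give the labeler local context.
        "prev": tokens[i - 1] if i > 0 else "BOS",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "EOS",
    }


def sequence_features(html):
    """Return the token sequence and one feature dict per token."""
    tokens = tokenize(html)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    tokens, feats = sequence_features("<td>Born</td><td>1970</td>")
    for tok, f in zip(tokens, feats):
        print(tok, f)
```

Each feature dict corresponds to one position in the sequence; a CRF toolkit would be trained on such sequences paired with per-token labels (for example, a label marking which tokens hold a birth year). The label scheme is likewise an assumption here.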