Iterative Elimination Method of Redundant Data in Customer Service Terminals Based on a Decision Tree Algorithm
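Cite this article: ZHANG Li, DING Maomao, LI Wei, WANG Ying, LÜ Jingxian, WANG Xiaoyi. Iterative elimination method of redundant data in customer service terminals based on a decision tree algorithm [J]. Computing Technology and Automation (计算技术与自动化), 2022, (4): 118-122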
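Authors and affiliations
ZHANG Li (张莉)1, DING Maomao (丁毛毛)2, LI Wei (李玮)3, WANG Ying (王颖)4, LÜ Jingxian (吕静贤)5, WANG Xiaoyi (王笑一)1
1. Tianjin University, Tianjin 300072
2. China Agricultural University, Beijing 100193
3. University of Salford, Manchester, UK 03101
4. North China Electric Power University, Beijing 102206
5. University of Bordeaux 1, Bordeaux, France 33000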
Abstract: To improve the usability of customer service terminal data, reduce interference from redundant data, and support mining potential customers and formulating sales strategies, this paper studies an iterative method for eliminating redundant customer service terminal data based on a decision tree algorithm. A data warehouse approach is used to extract and integrate the terminal data. Character data are preprocessed by stop-word removal and Chinese word segmentation; numerical data are preprocessed by missing-value filling and discrete-value deletion. An ID3 decision tree is then built to classify the data, the similarity among data assigned to the same class is calculated, and redundancy judgment rules are constructed to detect redundant data, which a joint eliminator then removes. Experimental results show that the method eliminates redundant customer service terminal data, with a space reduction ratio closer to the redundancy rate, a better elimination effect, and higher accuracy.
Keywords: decision tree algorithm; customer service terminal; redundant data elimination; similarity between data classes
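The two core steps named in the abstract, ID3 classification and similarity-based redundancy judgment within a class, can be illustrated with a minimal sketch. ID3 chooses splits by information gain; everything else here is an assumption made for illustration and is not taken from the paper: the Jaccard similarity over token sets, the 0.8 threshold, the keep-first elimination rule, and all function names (`entropy`, `information_gain`, `best_attribute`, `jaccard`, `eliminate_redundant`).

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """ID3 split criterion: entropy reduction from splitting on attr."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Attribute with the highest information gain (ID3's choice at a node)."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


def jaccard(a, b):
    """Jaccard similarity of two token collections (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def eliminate_redundant(records, classes, threshold=0.8):
    """Within each class, keep a record only if it is not too similar to a
    record already kept (assumed rule). Returns the kept indices and the
    space reduction ratio, i.e. the fraction of records removed."""
    kept_by_class = defaultdict(list)
    kept = []
    for i, (tokens, cls) in enumerate(zip(records, classes)):
        if any(jaccard(tokens, records[j]) >= threshold
               for j in kept_by_class[cls]):
            continue  # judged redundant under the assumed threshold rule
        kept_by_class[cls].append(i)
        kept.append(i)
    ratio = 1 - len(kept) / len(records) if records else 0.0
    return kept, ratio
```

In this sketch the class labels passed to `eliminate_redundant` would come from the trained decision tree, and the returned ratio can be compared against a known redundancy rate, which is the comparison the abstract reports.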