Improvement of MID3 Algorithm Based on Correlation Coefficient
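Cite this article: Lü Jie (吕洁), Wang Fengqin (王凤芹), Wang Lina (王丽娜). Improvement of MID3 Algorithm Based on Correlation Coefficient [J]. 计算技术与自动化 (Computing Technology and Automation), 2023, (1): 119-122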
Authors and affiliation
Lü Jie, Wang Fengqin, Wang Lina (Naval Aviation University, Yantai, Shandong 264001)
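Abstract (Chinese version, translated): To address the multi-value bias that decision tree algorithms show in classification, this paper proposes an improved MID3 algorithm based on the correlation coefficient. While generating the decision tree, the algorithm brings the correlation between each attribute and the classification result into the attribute selection at each node, which resolves the multi-value bias of the ID3 algorithm to some extent. It also considers two levels of nodes at a time so that the tree structure is optimized globally. Experiments on UCI data set samples compare the proposed algorithm with the ID3 algorithm in terms of efficiency. The results show that the algorithm improves the average classification accuracy and produces a more reasonable decision tree structure.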
Keywords (Chinese version, translated): correlation coefficient  ID3  MID3  information entropy  decision tree
 
Improvement of MID3 Algorithm Based on Correlation Coefficient
Abstract: To address the multi-value bias of decision tree algorithms in classification, an improved MID3 algorithm based on the correlation coefficient is proposed. While generating the decision tree, the algorithm introduces the correlation between each attribute and the classification result into the attribute selection at each node, which mitigates the multi-value bias of the ID3 algorithm to a certain extent. At the same time, two levels of nodes are considered together so that the structure of the tree is optimized globally. The proposed algorithm is compared with ID3 on UCI data set samples, and the efficiency of the two algorithms is reported. The results show that the algorithm improves the average classification accuracy and yields a more reasonable decision tree structure.
Keywords: correlation coefficient  ID3  MID3  information entropy  decision tree
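
The multi-value bias mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's selection formula, so the weighting used here is an assumption made only for illustration: ID3's information gain is multiplied by the absolute Pearson correlation between the integer-coded attribute and the class label. The function names (`information_gain`, `abs_pearson`, `weighted_gain`) and the toy data are hypothetical, and the two-level node optimization is not covered.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Plain ID3 information gain of splitting `labels` by attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def abs_pearson(xs, ys):
    """Absolute Pearson correlation of two numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def weighted_gain(values, labels):
    """ASSUMED criterion (not taken from the paper): gain scaled by |correlation|."""
    return information_gain(values, labels) * abs_pearson(values, labels)


# Hypothetical toy data: 8 samples, binary class label.
labels = [0, 1, 1, 0, 1, 0, 0, 1]
record_id = [0, 1, 2, 3, 4, 5, 6, 7]  # many-valued attribute, one value per sample
flag = [0, 1, 1, 0, 1, 0, 1, 1]       # binary attribute that tracks the label closely

for name, attr in (("record_id", record_id), ("flag", flag)):
    print(name, information_gain(attr, labels), weighted_gain(attr, labels))
```

On this toy data, plain information gain ranks `record_id` first, because every one of its values isolates a single sample, while the correlation-weighted score ranks `flag` first. Pearson correlation on integer-coded nominal attributes depends on the coding, so a real implementation would need the coefficient actually defined in the paper.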