Research on an Information-Entropy-Based Deduplication Mining Algorithm for Rough-Set Attribute Emergency Data
Received: 2020-06-12  Revised: 2020-07-08
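Author  Affiliation  E-mail
Zeng Weijia (曾维佳)  School of Digital Technology, Dalian University of Science and Technology  jiabishao0ll@163.com
Qin Fang (秦放)  School of Digital Technology, Dalian University of Science and Technology
Li Lin (李琳)  School of Digital Technology, Dalian University of Science and Technology
Xu Peng (徐鹏)  School of Digital Technology, Dalian University of Science and Technology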
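Funding: General Project of the 2019 Liaoning Provincial Social Science Planning Fund, "Research on Optimizing a Big-Data-Driven Three-Dimensional Grid Architecture for Ecological Environment Monitoring, Early Warning, and Prevention and Control in Liaoning Province" (Project No. L19BGL044)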
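Abstract (translated from the Chinese): Rough-set attribute emergency data contain redundant features that reduce mining efficiency, so a deduplication mining algorithm for such data based on information entropy is proposed. Rough set theory is combined with information entropy. The emergency data are first discretized; after discretization, attributes that have no effect on the conditional information entropy of the decision table are reduced. With the decision attribute set and the condition attribute set defined, the entropy value for which the number of attribute combinations with the reduced attribute set B is smallest is selected to carry out the reduction, which removes redundant features and completes deduplication mining of the emergency data. Deduplication mining was carried out on emergency data from large ships. The results show that the algorithm effectively mines deduplicated emergency data related to ship turning ability, that the increase-ratio feature of the data can be used to analyze how each factor affects turning ability, and that mining efficiency is high: with 1400 records, the run takes only 0.33 s.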
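Keywords (translated from the Chinese): information entropy  rough set attribute  emergency data  deduplication mining  discretization  reduction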
 
Research on a deduplication mining algorithm for rough-set attribute emergency data based on information entropy
Abstract: Rough-set attribute emergency data contain redundant features, which reduce mining efficiency. A deduplication mining algorithm for rough-set attribute emergency data based on information entropy is proposed. Rough set theory is combined with information entropy, and the emergency data are discretized. After discretization, attributes that have no influence on the conditional information entropy of the decision table are reduced. The decision attribute set and the condition attribute set are defined, and the entropy value with the minimum number of attribute combinations for the reduced attribute set B is selected to carry out the reduction, which removes redundant features and completes deduplication mining of the emergency data. Deduplication mining was carried out on emergency data from large ships. The results show that the algorithm effectively mines deduplicated emergency data related to ship turning ability, that the increase-ratio feature of the data can be used to analyze the influence of each factor on turning ability, and that the mining efficiency of the algorithm is high: with a data volume of 1400 records, the run takes only 0.33 s.
Keywords: information entropy  rough set attribute  emergency data  deduplication mining  discretization  reduction
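
The reduction step described in the abstract (dropping attributes that leave the conditional information entropy of the decision table unchanged) can be illustrated with a minimal sketch. The code below is not the authors' implementation. It assumes an already discretized table, uses a simple backward elimination in place of the paper's selection rule over attribute combinations, and treats deduplication as removing repeated rows after projecting onto the kept attributes. The function names and the toy table are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, cond_idx, dec_idx):
    """Shannon conditional entropy H(D | C) in bits for a discretized table."""
    n = len(rows)
    # Group rows into equivalence classes by their condition-attribute values.
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        blocks[key][row[dec_idx]] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * math.log2(p)
    return h


def reduce_attributes(rows, cond_idx, dec_idx, tol=1e-12):
    """Assumed rule: drop an attribute if H(D | C) stays the same without it."""
    kept = list(cond_idx)
    target = conditional_entropy(rows, kept, dec_idx)
    for a in list(cond_idx):
        trial = [i for i in kept if i != a]
        if abs(conditional_entropy(rows, trial, dec_idx) - target) <= tol:
            kept = trial
    return kept


def deduplicate(rows, kept, dec_idx):
    """Project onto kept attributes plus the decision, keep first occurrences."""
    projected = [tuple(r[i] for i in kept) + (r[dec_idx],) for r in rows]
    return list(dict.fromkeys(projected))


# Hypothetical toy table: columns 0-2 are condition attributes, column 3 is
# the decision. Column 2 copies column 0, and the last row repeats the second.
rows = [
    (0, 0, 0, "a"),
    (0, 1, 0, "b"),
    (1, 0, 1, "b"),
    (1, 1, 1, "a"),
    (0, 1, 0, "b"),
]
kept = reduce_attributes(rows, [0, 1, 2], 3)
unique_rows = deduplicate(rows, kept, 3)
```

In this sketch the result depends on the order in which attributes are tried: column 0 is tested first and removed, so its copy in column 2 is the one that survives. A different order would keep column 0 instead.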