| An Efficient Method for Estimating Null Values in Relational Databases |
| Cite this article: 刘力, 王立松, 吴非. An Efficient Method for Estimating Null Values in Relational Databases [J]. 计算技术与自动化, 2016, (3): 110-114 |
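| Abstract (translated from the Chinese): Because the real world is complex, missing and uncertain information is ubiquitous. Databases, as a tool for representing the real world, use null values to express missing information. To address the null-value problem in relational databases, this paper proposes a null-value estimation method based on fuzzy clustering and linear regression. The method first mines the data in the table to find the set of attributes associated with the attribute being estimated. This step uses only the information provided by the data itself, which avoids the errors that subjectivity introduces when experts choose the condition attributes. Next, fuzzy clustering over the resulting attribute set yields a partition of the original data, and the null values in the relation are then estimated from the resulting clusters using linear regression. Finally, the mean absolute error rate is used to measure the accuracy of the estimates. Experimental results show that the method achieves relatively high accuracy compared with other methods. |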
| Keywords (translated from the Chinese): relational database; null value; fuzzy clustering; multiple linear regression |
| |
| An Efficient Method for Estimating Null Values in Relational Databases |
| Abstract: Missing, indefinite, and ambiguous information is pervasive because of the complexity of the real world. Relational databases, as an important tool for representing the real world, use null values to express missing information. Focusing on the estimation of null values in relational databases, this paper proposes a new estimation method based on fuzzy clustering and multiple linear regression. The method starts by mining the data in the database to find the set of attributes associated with the attribute to be estimated. Because this step relies exclusively on the information in the data, without any other prior knowledge, it yields relatively objective condition attributes and avoids the errors that subjectivity introduces when experts are left to choose the condition attributes. A partition of the original data is then obtained by fuzzy clustering on this attribute set, and the null values are estimated from the resulting clusters using multiple linear regression. Finally, the mean absolute error rate is used to measure estimation accuracy. The experimental results show that the proposed method achieves relatively high accuracy compared with other methods. |
| Keywords: relational database; null value; fuzzy clustering; multiple linear regression |
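
The abstract names four steps: attribute selection, fuzzy clustering, per-cluster regression, and scoring by mean absolute error rate. It does not give their details, so the following is only a minimal sketch of how such a pipeline could look, not the authors' implementation. Everything specific here is an assumption: the function names, the Pearson-correlation threshold used for attribute selection, standard fuzzy c-means with fuzzifier `m = 2`, assigning each row to its highest-membership cluster, and the error-rate formula `mean(|estimate - actual| / |actual|)`. It uses NumPy only.

```python
import numpy as np


def select_attributes(X, y, threshold=0.3):
    """Assumed selection rule: keep columns whose |Pearson r| with y >= threshold."""
    corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return np.flatnonzero(corr >= threshold)


def memberships(X, centers, m=2.0):
    """Fuzzy c-means membership of each row in each cluster; rows sum to 1."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    inv = np.maximum(dist, 1e-12) ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # membership-weighted means
        U_new = memberships(X, centers, m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fit_cluster_regressions(X, y, U):
    """Fit one least-squares model (with intercept) per cluster.

    Assumption: each row is assigned to its highest-membership cluster.
    Clusters with too few rows fall back to the model fitted on all rows.
    """
    labels = U.argmax(axis=1)
    A = np.column_stack([np.ones(len(X)), X])
    global_coef = np.linalg.lstsq(A, y, rcond=None)[0]
    coefs = []
    for k in range(U.shape[1]):
        mask = labels == k
        if mask.sum() >= A.shape[1]:
            coefs.append(np.linalg.lstsq(A[mask], y[mask], rcond=None)[0])
        else:
            coefs.append(global_coef)
    return np.array(coefs)


def estimate_nulls(X_missing, centers, coefs, m=2.0):
    """Estimate the null attribute with the model of each row's nearest cluster."""
    labels = memberships(X_missing, centers, m).argmax(axis=1)
    A = np.column_stack([np.ones(len(X_missing)), X_missing])
    return (A * coefs[labels]).sum(axis=1)


def mean_absolute_error_rate(estimated, actual):
    """Assumed definition: mean of |estimate - actual| / |actual|."""
    estimated, actual = np.asarray(estimated), np.asarray(actual)
    return float(np.mean(np.abs(estimated - actual) / np.abs(actual)))
```

In use, rows whose target attribute is known would serve as training data: pick columns with `select_attributes`, cluster them with `fuzzy_c_means`, fit with `fit_cluster_regressions`, then call `estimate_nulls` on the rows where the target is null. A membership-weighted blend of the per-cluster predictions is an equally plausible reading of the abstract and would only change `estimate_nulls`.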