High-Dimensional Outlier Data Detection Algorithm Based on Correlation Subspace
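Cite this article: Zhao Xiangbing, Zhang Tiangang. High-dimensional outlier data detection algorithm based on correlation subspace [J]. Computing Technology and Automation (计算技术与自动化), 2022, (1): 82-86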
Author affiliation
Zhao Xiangbing, Zhang Tiangang (School of Computer and Network Engineering, Shanxi Datong University, Datong, Shanxi 037009)
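Abstract (Chinese version): To improve the accuracy and efficiency of outlier detection, an outlier detection algorithm based on correlation subspaces is proposed. The algorithm first derives a sparsity matrix from the local density distribution of the data and amplifies the sparsity features with a Gaussian similarity kernel function. It then computes the sparsity similarity factor of the data in each attribute dimension, determines the subspace vectors and the correlation subspace, and combines data sparsity with dimension weights to obtain an outlier factor for each data object; the objects with the largest outlier factors are selected as outliers. Finally, artificial data sets and UCI experimental data sets are used to verify the accuracy and effectiveness of the algorithm.
Keywords (Chinese version): data mining; outlier data; sparsity; Gaussian kernel function; similarity factor; correlation subspace; simulation experiment; algorithm analysis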
 
High-Dimensional Outlier Detection Algorithm Based on Correlation Subspace
Abstract: To improve the accuracy and efficiency of outlier detection, an outlier detection algorithm based on correlation subspace is proposed. First, a sparsity matrix is obtained from the local density distribution of the data, and the sparsity features are amplified by a Gaussian similarity kernel function. Then, the data sparsity similarity factor in each attribute dimension is calculated, and the subspace vectors and the correlation subspace are determined. The outlier factor of each data object is obtained by combining data sparsity and dimension weights, and the objects with the largest outlier factors are selected as outliers. Finally, artificial data sets and UCI experimental data sets are used to verify the accuracy and effectiveness of the algorithm.
Keywords: data mining; outlier data; sparsity; Gaussian kernel function; similarity factor; correlation subspace; simulation experiment; algorithm analysis
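
The pipeline summarized in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and should not be read as the authors' method: the function names, the per-attribute k-nearest-neighbour sparsity, the median-scaled Gaussian kernel, the fixed threshold that stands in for the sparsity similarity factor, and the variance-based dimension weights are all hypothetical choices.

```python
import numpy as np

def sparsity_matrix(X, k=5):
    """Assumed sparsity: mean distance to the k nearest neighbours,
    computed separately in each attribute dimension."""
    n, d = X.shape
    S = np.empty((n, d))
    for j in range(d):
        col = X[:, j]
        dist = np.abs(col[:, None] - col[None, :])  # pairwise 1-D distances
        dist.sort(axis=1)                           # column 0 is the self-distance 0
        S[:, j] = dist[:, 1:k + 1].mean(axis=1)
    return S

def amplify(S):
    """Assumed Gaussian-kernel amplification: maps sparsity into [0, 1],
    approaching 1 for objects far sparser than the per-dimension median."""
    sigma = np.median(S, axis=0) + 1e-12
    return 1.0 - np.exp(-0.5 * (S / sigma) ** 2)

def outlier_factors(X, k=5, threshold=0.9):
    """Assumed outlier factor: weighted sum of amplified sparsity over the
    dimensions flagged as relevant for each object."""
    A = amplify(sparsity_matrix(X, k))
    # Binary subspace vector per object; the fixed threshold is a
    # placeholder for the paper's sparsity similarity factor.
    V = (A > threshold).astype(float)
    # Hypothetical dimension weights: normalized variance of each column.
    w = A.var(axis=0)
    w = w / (w.sum() + 1e-12)
    return (V * A * w).sum(axis=1)

def top_outliers(X, n_outliers=3, k=5):
    """Return the row indices of the n_outliers largest outlier factors,
    in decreasing order of score."""
    scores = outlier_factors(X, k=k)
    return np.argsort(scores)[::-1][:n_outliers]
```

Calling `top_outliers(X, n_outliers=3)` on an `(n, d)` array returns the row indices with the largest scores. Because the kernel is bounded, objects that are extremely sparse in a dimension all score close to 1 there, so this sketch mainly separates objects by how many dimensions they are sparse in.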