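Lin Peng1,2, Chen Xi1,2, Long Pengfei1,2, Fu Ming1,2 (1. Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, Hunan 410114, China; 2. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China)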
Chinese keywords: boundary-correcting method; grid-sliding method; CLIQUE algorithm; MapReduce
Improved CLIQUE Algorithm and Its Parallelization
Abstract: CLIQUE is an efficient clustering algorithm, but its clustering results suffer from serrated cluster boundaries, and its efficiency degrades considerably as data size and dimensionality increase. This paper proposes an improved CLIQUE algorithm. The algorithm first applies a boundary-correcting method and a grid-sliding method to improve the quality of the grid partition: it scans the borders of dense regions and the sparse regions, and then retrieves dense grid cells that were pruned. The improved algorithm is then parallelized on top of MapReduce. A series of experiments tests the clustering accuracy, processing time, speedup, and scalability of the improved algorithm. The experimental results show that the improved algorithm raises clustering accuracy by 17% to 26%. The parallel algorithm effectively reduces the runtime when processing massive data and scales well.
Keywords: boundary-correcting method; grid-sliding method; CLIQUE; MapReduce
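
For orientation, the grid-density step that CLIQUE-style clustering builds on can be sketched in a map/reduce shape. The sketch below is only an illustration under stated assumptions: it is not the improved algorithm proposed in this paper, it does not implement the boundary-correcting or grid-sliding methods, and it runs in a single process rather than on a MapReduce cluster. The function names and the parameters `lower`, `width`, and `min_points` are hypothetical.

```python
import math
from collections import Counter


def map_point_to_cell(point, lower, width):
    """Map step: emit (cell_index, 1) for the grid cell containing the point.

    `lower` is the assumed lower bound of the data space in each dimension,
    and `width` is the assumed side length of every grid cell.
    """
    cell = tuple(int(math.floor((x - lo) / width)) for x, lo in zip(point, lower))
    return cell, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each grid cell."""
    counts = Counter()
    for cell, n in pairs:
        counts[cell] += n
    return counts


def dense_cells(points, lower, width, min_points):
    """Return the cells holding at least `min_points` points.

    `min_points` is a hypothetical absolute density threshold chosen for
    this sketch.
    """
    counts = reduce_counts(map_point_to_cell(p, lower, width) for p in points)
    return {cell for cell, n in counts.items() if n >= min_points}


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (1.5, 1.5), (2.9, 0.1)]
    print(dense_cells(pts, lower=(0.0, 0.0), width=1.0, min_points=2))
```

In this toy input three points share one cell and the other two sit alone, so only the shared cell passes the threshold. A fixed grid of this kind is what produces the serrated cluster boundaries mentioned in the abstract, since a cluster edge that cuts through a cell can leave that cell below the threshold.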
