OSSData: A Distributed Data Acquisition Framework for Open Source Communities
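Cite this article: Lin Wei (林维), Chen Xi (陈曦), Wang Song (王松). OSSData: A Distributed Data Acquisition Framework for Open Source Communities[J]. Computing Technology and Automation (计算技术与自动化), 2019, (1): 102-107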
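Author affiliations
Lin Wei (林维), Chen Xi (陈曦), Wang Song (王松) (Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, Hunan 410114)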
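Chinese abstract: In recent years, open source software has developed rapidly, and its application domains and scope have grown ever wider; at the same time, its success has drawn large numbers of developers into open source development. As a result, open source communities have accumulated large volumes of data on software usage and development. This rich data has gradually attracted researchers' attention, and existing work has already produced a series of studies on the crowd-based development model and quality assurance mechanisms of open source software. To better support such research, a user-customizable data acquisition framework for open source communities is proposed. The framework offers high flexibility and robustness: it can be deeply customized to users' actual needs and can keep running stably and continuously, thereby improving the efficiency and quality of data acquisition.
Chinese keywords: open source community  data acquisition  web crawler  distributed framework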
 
OSSData: A Distributed Acquisition Framework for Open Source Communities
Abstract: In recent years, open source software has developed rapidly, and its application fields and scope have become increasingly extensive. At the same time, the success of open source software has attracted many developers to contribute. Consequently, open source software communities have accumulated a large amount of data on software applications and development. The richness of this data has gradually attracted the attention of researchers, and a series of studies have been conducted on the collaborative development and quality assurance of open source software. To better support this kind of academic research, a customizable data acquisition framework for open source communities is proposed. The framework is designed for high flexibility and robustness, and it supports stable, long-running acquisition for customized tasks, which improves the efficiency and quality of data acquisition.
Keywords: open source community  data acquisition  web crawler  distributed framework