Design of HBase Secondary Indexes for Big Data Storage (面向大数据存储的HBase二级索引设计)

Li Bin; Guo Jingwei; Peng Qian

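Cite this article: Li Bin, Guo Jingwei, Peng Qian. Design of HBase Secondary Indexes for Big Data Storage [J]. Computing Technology and Automation, 2019, (2): 124-129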

Authors	Affiliation
Li Bin, Guo Jingwei, Peng Qian	State Grid Ningxia Electric Power Co., Ltd., Yinchuan, Ningxia 750011

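Abstract (translated from the Chinese): HBase has no built-in secondary index, so a query on a non-row-key column has to apply a filter over a full table scan, and this performs poorly at big-data scale. To address the problem, the paper combines the index structure of HBase table row keys with the secondary index structure of relational databases and proposes a secondary index scheme in which index column values are clustered. The scheme also supports composite indexes and handles special index column values, which improves the performance of the secondary index and widens the scenarios it applies to. Finally, tests on a purpose-built system show that the secondary index greatly improves HBase query efficiency.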

Keywords (translated from the Chinese): computer software; HBase; secondary index; clustering; escaping

Design of HBase Secondary Indexes for Big Data Storage

Abstract: HBase lacks secondary indexes, so queries on non-row-key columns must use filters combined with a full table scan, which performs poorly on massive data. To solve this problem, this paper combines the index structure of HBase table row keys with the secondary index structure of relational databases and proposes a secondary index solution based on aggregating index column values. In addition, the proposed mechanism supports joint (composite) indexes and the processing of special index column values, which improves the performance of the secondary index and broadens its range of application. Finally, system tests show that the secondary index greatly improves HBase query efficiency.

Keywords: computer software; HBase; secondary index; aggregation; escape
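The abstract does not give the index key layout, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes an index row key of the form escaped column value(s), a separator, then the original row key, so that entries with equal column values sit next to each other in HBase's lexicographic key order and a lookup becomes a prefix scan instead of a full table scan. A sorted Python list stands in for the index table; `SecondaryIndex`, `escape`, the separator byte `0x00`, and the escape byte `0x01` are all hypothetical choices made for illustration.

```python
import bisect

SEP = b"\x00"  # assumed separator between key components (not from the paper)
ESC = b"\x01"  # assumed escape byte (not from the paper)


def escape(value: bytes) -> bytes:
    """Encode a column value so that it never contains SEP."""
    # Escape ESC first, so the bytes introduced for SEP are not escaped again.
    return value.replace(ESC, ESC + b"\x02").replace(SEP, ESC + b"\x01")


class SecondaryIndex:
    """Toy index table: a sorted list of keys stands in for an HBase table."""

    def __init__(self, columns):
        self.columns = list(columns)  # one column = single index, several = composite
        self._keys = []               # kept sorted, like HBase row keys

    def _prefix(self, values):
        # Each escaped value is terminated by SEP.
        return b"".join(escape(v) + SEP for v in values)

    def put(self, row_key: bytes, row: dict) -> None:
        """Add an index entry for one data row."""
        key = self._prefix(row[c] for c in self.columns) + row_key
        pos = bisect.bisect_left(self._keys, key)
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def lookup(self, *values: bytes) -> list:
        """Return data row keys whose leading index columns equal `values`."""
        prefix = self._prefix(values)
        start = bisect.bisect_left(self._keys, prefix)
        remaining = len(self.columns) - len(values)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # matching entries are contiguous, so the scan can stop
            rest = key[len(prefix):]
            # Skip any unmatched trailing index columns; the row key is last.
            result.append(rest.split(SEP, remaining)[-1])
        return result


index = SecondaryIndex(["city", "status"])
index.put(b"row1", {"city": b"Yinchuan", "status": b"ok"})
index.put(b"row2", {"city": b"Yinchuan", "status": b"fault"})
index.put(b"row3", {"city": b"Yin\x00chuan", "status": b"ok"})  # value contains SEP

print(index.lookup(b"Yinchuan"))         # rows matching on the first column only
print(index.lookup(b"Yinchuan", b"ok"))  # rows matching on both columns
print(index.lookup(b"Yin"))              # the escaped value of row3 does not match
```

The third row shows why special values need handling in such a layout: without escaping, a value containing the separator byte would produce a key that falsely matches the prefix for `Yin`.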
