HBase: Limitations, Advantage & Problems

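This article takes a close look at the limitations and advantages of HBase. It points out HBase's strength in large-scale data storage and real-time queries, along with challenges such as a single point of failure, difficult cross-table operations, and high hardware requirements.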


The HBase architecture has a single point of failure, and no exception handling mechanism is associated with it.

Problems with HBase

  • In a production environment where HBase runs on a cluster of more than 5000 nodes, only one HMaster acts as master for all the slave Region Servers. If the HMaster goes down, it can only be recovered after a long time, even though clients can still connect to the Region Servers. A second HMaster can be configured, but only one is active at a time, and activating the standby after the main HMaster fails takes a long time. The HMaster is therefore a performance bottleneck.
  • In HBase, we cannot implement cross-table or join operations directly. Joins can be implemented with MapReduce, but that takes a lot of design and development time. Table joins are difficult to perform in HBase, and in some use cases it is impossible to create a join across the tables involved.
  • Migrating data from external RDBMS sources to HBase requires a new design, and the process takes a lot of time.
  • HBase is hard to query. We may have to integrate it with a SQL layer such as Apache Phoenix, which lets us write queries against the data in HBase. Having Apache Phoenix on top of HBase is very useful.
  • Another drawback is that a table cannot have more than one index; only the row key acts as a primary key. Searches on more than one field, or on any field other than the row key, are therefore slow. We can work around this by writing MapReduce code or by integrating with Apache Solr or Apache Phoenix.
  • Security for controlling different users' access to data in HBase has been slow to improve.
  • HBase doesn't fully support partial keys
  • HBase allows only one default sort order per table
  • It is very difficult to store large binary files in HBase
  • HBase's storage model limits real-time queries and sorting
  • Searching table contents is limited to key lookups and range lookups on row key values, which restricts the queries that can run in real time
  • HBase has no default indexing. Programmers have to write several lines of code or script to get indexing functionality
  • HBase is expensive in terms of hardware requirements and memory allocation.
    • More servers must be installed for a distributed cluster environment (for example, separate servers for the NameNode, DataNodes, ZooKeeper, and Region Servers)
    • For performance, it requires high-memory machines
    • Cost and maintenance overhead are also higher
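
The indexing points in the list above (one index on the row key, key and range lookups only, secondary indexes written by hand) can be illustrated with a small sketch. This is not the HBase API: `ToyTable`, the sample rows, and the `<city>#<row key>` index layout are all assumptions made for illustration, using an in-memory table sorted by row key.

```python
import bisect


class ToyTable:
    """In-memory stand-in for a table sorted by row key (hypothetical, not HBase)."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> dict of column -> value

    def put(self, row_key, columns):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key):
        """Key lookup: direct access by the full row key."""
        return self._rows.get(row_key)

    def scan(self, start, stop):
        """Range lookup: rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]

    def full_scan_where(self, column, value):
        """Search on a non-key column: every row has to be examined."""
        return [k for k in self._keys if self._rows[k].get(column) == value]


# Made-up sample data.
users = ToyTable()
users.put("user#001", {"city": "Pune", "name": "Asha"})
users.put("user#002", {"city": "Oslo", "name": "Bjorn"})
users.put("user#003", {"city": "Pune", "name": "Chen"})

# Hand-maintained secondary index: a second table keyed by "<city>#<row key>".
city_index = ToyTable()
for key, row in users.scan("", "\uffff"):
    city_index.put(row["city"] + "#" + key, {})


def users_in_city(city):
    # A prefix scan on the index table replaces the full scan on the data table.
    prefix = city + "#"
    hits = city_index.scan(prefix, prefix + "\uffff")
    return [k[len(prefix):] for k, _ in hits]


print(users_in_city("Pune"))
```

In this sketch the application has to keep `city_index` in step with `users` on every write, which is the kind of extra code the list refers to. A layer such as Apache Phoenix or Apache Solr takes over that bookkeeping.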

Advantages of HBase:

  • Can store large data sets on top of HDFS file storage, and can aggregate and analyze billions of rows in HBase tables
  • In HBase, the database can be sharded
  • Operations such as data reading and processing take a small amount of time compared with traditional relational models
  • Supports random read and write operations
  • HBase is used extensively for online analytical operations.
  • For example, in banking applications, HBase can be used for real-time data updates from ATMs.

Limitations with HBase:

  • We cannot expect HBase to completely replace traditional models. Some features of traditional models are not supported by HBase
  • HBase cannot perform SQL-like functions. It doesn't support the SQL structure, so it has no query optimizer
  • HBase is CPU and memory intensive, with large sequential input/output access, whereas MapReduce jobs are primarily I/O bound with fixed memory. Integrating HBase with MapReduce jobs results in unpredictable latencies
  • Integrating HBase with Pig and Hive jobs sometimes results in memory issues on the cluster
  • In a shared cluster environment, the setup must allocate fewer task slots per node to leave room for HBase's CPU requirements
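
Because HBase has no SQL structure and no query optimizer, a join has to be written out by hand, typically as a MapReduce job. The sketch below imitates the map, shuffle, and reduce phases of a reduce-side join in plain Python. It does not use Hadoop or HBase; the two sample tables, their column names, and the function names are hypothetical stand-ins for scans of two HBase tables.

```python
from collections import defaultdict

# Made-up sample rows standing in for scans of two tables: (row key, columns).
customers = [("c1", {"name": "Asha"}), ("c2", {"name": "Bjorn"})]
orders = [
    ("o1", {"customer_id": "c1", "amount": 250}),
    ("o2", {"customer_id": "c2", "amount": 90}),
    ("o3", {"customer_id": "c1", "amount": 40}),
]


def map_phase(customers, orders):
    """Emit (join key, tagged record) pairs, as the mappers of a join job would."""
    for key, row in customers:
        yield key, ("customer", row)
    for key, row in orders:
        yield row["customer_id"], ("order", {"order_id": key, **row})


def shuffle(pairs):
    """Group records by join key, the step the framework runs between map and reduce."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(groups):
    """Pair each customer with each of that customer's orders (an inner join)."""
    joined = []
    for key in sorted(groups):
        custs = [r for tag, r in groups[key] if tag == "customer"]
        ords = [r for tag, r in groups[key] if tag == "order"]
        for c in custs:
            for o in ords:
                joined.append((c["name"], o["order_id"], o["amount"]))
    return joined


result = reduce_phase(shuffle(map_phase(customers, orders)))
for row in result:
    print(row)
```

Everything a SQL engine would plan automatically (which side to tag, how to group, how to pair rows) is spelled out here by the developer, which is where the design and development time goes.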

https://www.guru99.com/hbase-limitations-advantage-problems.html
