HBase – Performance Tuning & Benchmarking
OVERVIEW OF APACHE HBASE
The whole concept of big data, or total data, and how to collect it and get it into a data lake can sound daunting, but
it becomes less so once the data collection problem is broken down into subsets. All of the technologies
mentioned below are part of the Big Data ecosystem:
- Apache HBase is a popular open-source, scale-out, column-oriented NoSQL database designed for low-latency
random access to data stored on the Hadoop platform.
- Apache Hadoop is a Big Data framework that uses HDFS (Hadoop Distributed File System) to store data
and the MapReduce framework to process it. Java is the native language for writing MapReduce
programs.
- Apache HBase and Apache Cassandra are both NoSQL databases that do not enforce strict ACID transactions.
Both are column-oriented databases and need proper data modelling to be used effectively. Their performance can
be evaluated by benchmarking the database with a tool called YCSB (Yahoo! Cloud Serving Benchmark).
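
To make the idea of benchmarking concrete, the following is a minimal Python sketch of the kind of measurement a YCSB-style run produces: a mixed read/update workload is issued against a key-value store, and throughput and latency percentiles are derived from the recorded timings. It is an illustration only and not YCSB itself. The store is a plain in-memory dict standing in for a database, and the names run_workload, percentile and summarize are hypothetical helpers introduced here. The 50/50 read/update mix mirrors YCSB's workload A; the uniform key choice is a simplifying assumption.

```python
import math
import random
import time


def run_workload(store, operation_count, read_proportion=0.5, seed=42):
    """Run a mixed read/update workload against a dict-like store.

    Returns per-operation latencies (seconds) and total elapsed time.
    """
    rng = random.Random(seed)  # fixed seed keeps the key sequence repeatable
    keys = list(store.keys())
    latencies = {"read": [], "update": []}

    started = time.perf_counter()
    for _ in range(operation_count):
        # Assumption: uniform key choice. Several standard YCSB workloads
        # use a skewed (zipfian) choice instead.
        key = rng.choice(keys)
        op = "read" if rng.random() < read_proportion else "update"
        t0 = time.perf_counter()
        if op == "read":
            store.get(key)
        else:
            store[key] = rng.getrandbits(32)
        latencies[op].append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    return {"latencies": latencies, "elapsed": elapsed}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def summarize(result):
    """Derive throughput and p95/p99 latency per operation type."""
    total_ops = sum(len(v) for v in result["latencies"].values())
    elapsed = result["elapsed"]
    summary = {
        "operations": total_ops,
        "throughput_ops_per_sec": total_ops / elapsed if elapsed > 0 else float("inf"),
    }
    for op, samples in result["latencies"].items():
        if samples:  # skip operation types that never ran
            summary[op] = {
                "count": len(samples),
                "p95_us": percentile(samples, 95) * 1e6,
                "p99_us": percentile(samples, 99) * 1e6,
            }
    return summary


if __name__ == "__main__":
    store = {f"user{i}": i for i in range(1000)}  # stand-in for a loaded table
    print(summarize(run_workload(store, operation_count=10000)))
```

Against a real cluster the store calls would be replaced by client reads and writes, and the absolute numbers would be dominated by network and disk rather than by Python overhead. The structure of the measurement (load data, run a timed operation mix, report throughput and tail latency) stays the same.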