Book chapter

NoSQL Overview and Performance Testing of HBase Over Multiple Nodes with MySQL

2018; Springer Nature; Language: English

10.1007/978-981-13-1498-8_24

ISSN

2194-5357

Authors

Nabanita Das, Swagata Paul, Bidyut Biman Sarkar, Satyajit Chakrabarti

Topic(s)

Data Stream Mining Techniques

Abstract

The growing number of web-based applications in social networks, media, biology, physics, and the Internet of Things is continuously generating large volumes of data, or Big Data, on the scale of terabytes, petabytes, and zettabytes over short periods of time. Consequently, an immense number of read and write requests is generated, and these must be handled without much latency. Storing and analyzing such huge amounts of mixed ASCII and non-ASCII data efficiently, economically, and with minimal delay is an immediate concern. Conventional database systems like MySQL are incapable of handling such large volumes of data in real time. It is claimed that column-based NoSQL databases such as Accumulo, Cassandra, and HBase, or document-based ones such as Apache CouchDB, Couchbase, and MongoDB, can handle such data volumes efficiently. In this work, we focus on the column-based Apache HBase, a distributed NoSQL database management system developed in the Big Data domain on top of the distributed file system architecture provided by Hadoop (HDFS). We begin with a discussion of NoSQL HBase and the association between HBase and Hadoop. Then some of the important features of HBase are explained. After that, we discuss the advantages and limitations of HBase in distributed data processing relative to other NoSQL database management systems. Finally, we perform experiments comparing the time performance of HBase with that of the traditional database MySQL as data size increases.

Reference(s)