What are the benefits of using Hadoop, HBase, or Hive?
From my understanding, HBase avoids using MapReduce and provides column-oriented storage on top of HDFS. Hive is a SQL-like interface for Hadoop and HBase.
I would also like to know how Hive compares with Pig.
Here is how I have used Hive, HBase, and Pig, based on my hands-on experience across different projects.
Hive is used mostly for:
--Analytics, where you need to analyze historical data
--Generating business reports based on certain columns
--Efficiently managing data together with its metadata
--Joining tables on frequently used columns, using bucketing
--Efficient storage and querying, using partitioning
--Not suited to transaction/row-level operations such as update and delete
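To make the partitioning and bucketing points concrete, here is a minimal Python sketch of the idea, not Hive itself. The table layout, the `dt` and `user_id` column names, the bucket count, and the plain-modulo bucket function are all assumptions chosen for illustration; Hive applies its own hash function and stores partitions as directories and buckets as files.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # assumed bucket count, chosen only for illustration


def bucket_of(user_id: int) -> int:
    # A plain modulo stands in for the hash function a real engine would use.
    return user_id % NUM_BUCKETS


def load(rows):
    """Group rows as table[partition_value][bucket_id] -> list of rows."""
    table = defaultdict(lambda: defaultdict(list))
    for row in rows:
        table[row["dt"]][bucket_of(row["user_id"])].append(row)
    return table


def query(table, dt, user_id):
    """Read only the matching partition and bucket instead of every row."""
    candidates = table.get(dt, {}).get(bucket_of(user_id), [])
    return [r for r in candidates if r["user_id"] == user_id]


# Hypothetical sample data.
rows = [
    {"dt": "2014-01-01", "user_id": 1, "amount": 10},
    {"dt": "2014-01-01", "user_id": 5, "amount": 20},
    {"dt": "2014-01-02", "user_id": 1, "amount": 30},
    {"dt": "2014-01-02", "user_id": 2, "amount": 40},
]
table = load(rows)
result = query(table, "2014-01-01", 5)
```

The query touches one partition and one bucket, which is why filtering on the partition column and joining on the bucketed column tend to be cheaper than a full scan.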
Pig is mostly used for:
--Frequent analysis of very large datasets
--Generating aggregated values/counts over very large datasets
--Generating enterprise-level key performance indicators very frequently
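The aggregation use case above is roughly a group-then-summarize data flow. As a rough sketch in Python rather than Pig Latin, the following computes a count and a sum per group; the `region` and `amount` fields and the sample records are hypothetical, and a real Pig script would run the same kind of grouping as distributed jobs over data in HDFS.

```python
from collections import defaultdict


def aggregate(records):
    """Return {group: {"count": n, "sum": total}} for (group, value) pairs."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for region, amount in records:
        totals[region]["count"] += 1
        totals[region]["sum"] += amount
    return dict(totals)


# Hypothetical input: (region, amount) pairs.
records = [("east", 10), ("west", 5), ("east", 7)]
kpis = aggregate(records)
```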
HBase is mostly used:
--For real-time processing of data
--For efficiently managing complex and nested schemas
--For real-time querying and faster results
--For easy scalability with columns
--For transaction/row-level operations such as update and delete
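The row-level operations mentioned above can be pictured with a small in-memory sketch. This is not the HBase client API: `TinyRowStore` and its methods are hypothetical names, the `family:qualifier` column strings and row keys are made-up examples, and the sketch ignores cell versions, timestamps, and distribution across region servers. It only shows the access pattern of put, get, delete, and a key-range scan over rows sorted by row key.

```python
class TinyRowStore:
    """Hypothetical in-memory stand-in for a row-keyed, sparse column store."""

    def __init__(self):
        self._rows = {}  # row_key -> {column: value}

    def put(self, row_key, column, value):
        # Insert and update are the same operation: the latest value wins.
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        return dict(self._rows.get(row_key, {}))

    def delete(self, row_key, column=None):
        # Delete one column if given, otherwise the whole row.
        if column is None:
            self._rows.pop(row_key, None)
        else:
            self._rows.get(row_key, {}).pop(column, None)

    def scan(self, start, stop):
        # Rows come back in row-key order; start is inclusive, stop exclusive.
        for key in sorted(self._rows):
            if start <= key < stop:
                yield key, dict(self._rows[key])


store = TinyRowStore()
store.put("user#1", "info:name", "Ann")
store.put("user#1", "info:city", "Pune")
store.put("user#2", "info:name", "Bob")
store.put("user#1", "info:city", "Delhi")  # row-level update
store.delete("user#2")                     # row-level delete
```

Each row can carry a different set of columns, which is what makes adding columns cheap compared with a fixed relational schema.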