Is there any way to create indexes in Solr so that I can run near-real-time (NRT) full-text search over data stored in HBase?
I don't want to store the whole text in my Solr indexes, so I set stored="false" on the fields.
Note: I am working with large datasets (we are talking TB/PB of data) and want near-real-time search.
UPDATED
Cloudera Distribution : 5.4.x is used with Cloudera Search components.
Solr : 4.10.x
HBase : 1.0.x
Indexer Service : Lily HBase Indexer with cloudera morphlines
Are there any other NRT indexer services or frameworks that can be used instead of Lily on Cloudera? Just a thought.
Cloudera: please check this article and Hbase-Solr using Cloudera-search, which describe how to achieve that. See the screenshot below, as described in those articles. Also have a look at the known issues with Cloudera Search.
Yes, you can consider Morphlines. They can be used for near-real-time applications as well as batch-processing applications.
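As a rough illustration of what a morphline for the Lily HBase Indexer can look like, here is a minimal sketch built around the extractHBaseCells command. The collection name (hbase-collection1), the column family (data) and the output field (text) are placeholders I made up for the example, and $ZK_HOST stands for your ZooKeeper ensemble; adapt them to your own table and schema. The output field would be the one you declared with stored="false" in your Solr schema.

```
SOLR_LOCATOR : {
  # Hypothetical collection name; replace with your own
  collection : hbase-collection1
  # ZooKeeper ensemble used by SolrCloud
  zkHost : "$ZK_HOST"
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              # Assumed column family "data"; the wildcard matches every qualifier in it
              inputColumn : "data:*"
              # Assumed Solr field, indexed but not stored
              outputField : "text"
              type : string
              source : value
            }
          ]
        }
      }

      # Log each output record at debug level while testing the mapping
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
```

With a setup like this, Solr holds only the index. As far as I know, the indexer uses the HBase row key as the Solr document id by default, so a search can return the ids and the full text can then be read back from HBase by row key.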
I don't know much about the Hortonworks platform or how this can be achieved there.