I am designing an app to run on HBase and want to interactively explore the contents of my cluster. I am in the hbase shell and I want to perform a scan of all keys starting with the chars "abc". Such keys might include "abc4", "abc92", "abc20014", etc. I tried a scan:
hbase(main):003:0> scan 'mytable', {STARTROW => 'abc', ENDROW => 'abc'}
But this does not seem to return anything, since there is technically no rowkey "abc", only rowkeys starting with "abc".
What I want is something like:
hbase(main):003:0> scan 'mytable', {STARTSROWPREFIX => 'abc', ENDROWPREFIX => 'abc'}
I hear HBase can do this quickly and that it is one of its main selling points. How do I do this in the hbase shell?
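So it turns out to be very easy. The start row is inclusive and the end row is exclusive; the logic is start <= key < end. So the answer is to keep STARTROW => 'abc' and set ENDROW => 'abd', the first value that sorts after every key beginning with "abc".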
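To illustrate the start <= key < end rule, here is a minimal plain-Ruby sketch that models the scan on an in-memory list of keys. It does not talk to HBase; `ROW_KEYS` and `range_scan` are hypothetical names made up for this example, and the corresponding hbase shell command appears only in the final comment.

```ruby
# Plain-Ruby model of a half-open row-key range scan (start <= key < stop).
# ROW_KEYS and range_scan are hypothetical; nothing here connects to HBase.
ROW_KEYS = %w[abb9 abc20014 abc4 abc92 abd abd1 xyz].sort.freeze

# Keep only the keys that fall inside the half-open range [start_row, stop_row).
def range_scan(keys, start_row, stop_row)
  keys.select { |key| key >= start_row && key < stop_row }
end

# In this model an identical start and stop row selects nothing.
same_bounds = range_scan(ROW_KEYS, 'abc', 'abc')

# Bumping the last character of the prefix ('c' -> 'd') covers every "abc..." key
# and excludes 'abd' itself, because the stop row is exclusive.
matches = range_scan(ROW_KEYS, 'abc', 'abd')

puts same_bounds.inspect
puts matches.inspect

# Corresponding hbase shell command (run inside the hbase shell, not plain Ruby):
#   scan 'mytable', {STARTROW => 'abc', ENDROW => 'abd'}
```

Keys sort byte by byte, so "abc20014" comes before "abc4"; the range still captures both because each is at least "abc" and less than "abd".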
I think what you need is a filter.
Check out the answer to the following question: Scan with filter using HBase shell
More filters are listed at http://hbase.apache.org/book/client.filter.html
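In recent versions of HBase you can now use the ROWPREFIXFILTER option of scan in the hbase shell, passing the prefix 'abc'.
This is effectively a scan with STARTROW => 'abc' and STOPROW => 'abd' (and it also works for binary keys).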
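The prefix-to-range translation described above can be sketched in plain Ruby as follows. `stop_row_for_prefix` is a hypothetical helper written for this example, not a function from HBase; it assumes the usual rule of dropping trailing 0xFF bytes and incrementing the last remaining byte, which is what makes the approach work for binary keys as well.

```ruby
# Hypothetical helper: smallest row key that sorts after every key with the
# given prefix. Returns nil when the prefix is empty or consists only of 0xFF
# bytes, meaning "no upper bound, scan to the end of the table".
def stop_row_for_prefix(prefix)
  bytes = prefix.b.bytes
  bytes.pop while bytes.last == 0xFF   # a trailing 0xFF byte cannot be incremented
  return nil if bytes.empty?
  bytes[-1] += 1                       # bump the last byte that still has room
  bytes.pack('C*')                     # back to a binary string
end

prefix = 'abc'
stop   = stop_row_for_prefix(prefix)

# Explicit range form, built from the computed stop row:
puts "scan 'mytable', {STARTROW => '#{prefix}', STOPROW => '#{stop}'}"

# Shorthand form in the hbase shell (run inside the shell, not plain Ruby):
#   scan 'mytable', {ROWPREFIXFILTER => 'abc'}
```

For a binary prefix such as the two bytes 0x01 0xFF, the helper drops the 0xFF and returns the single byte 0x02, a case that simply incrementing the last character by hand would get wrong.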
This method is a LOT more efficient than the "PrefixFilter" approach, because the latter puts all records through the comparison code that is present in the PrefixFilter class.
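The accepted solution won't work in all cases (binary keys). In addition, using a PrefixFilter on its own can be slow because it scans the table from the beginning until it reaches the prefix. A more performant solution is to combine a STARTROW with a FILTER: set STARTROW to the prefix so the scan begins there, and use a PrefixFilter on the same prefix to cut it off once the keys no longer match.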
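As a rough illustration of why the start row matters, the plain-Ruby sketch below counts how many rows a simplified prefix-filtered scan has to examine with and without a start row. `prefix_filter_scan` and the sample keys are hypothetical, and the model only approximates how a real region scanner and PrefixFilter cooperate; the matching hbase shell command is given in the last comment.

```ruby
# Simplified model of a scan with an optional start row plus a prefix filter.
# Returns [matching_keys, rows_examined]. Hypothetical code; no HBase involved.
def prefix_filter_scan(sorted_keys, prefix, start_row: '')
  examined = 0
  matches = []
  sorted_keys.each do |key|
    next if key < start_row      # a real scan seeks to start_row instead of iterating
    examined += 1
    if key.start_with?(prefix)
      matches << key
    elsif key > prefix
      break                      # keys are sorted, so nothing later can match
    end
  end
  [matches, examined]
end

keys = %w[aaa1 aaa2 abb9 abc20014 abc4 abc92 abd xyz].sort

filter_only_rows, filter_only_cost = prefix_filter_scan(keys, 'abc')
seeked_rows, seeked_cost = prefix_filter_scan(keys, 'abc', start_row: 'abc')

puts "filter only:       #{filter_only_rows.inspect}, examined #{filter_only_cost} rows"
puts "start row + filter: #{seeked_rows.inspect}, examined #{seeked_cost} rows"

# Corresponding hbase shell command (run inside the hbase shell, not plain Ruby):
#   scan 'mytable', {STARTROW => 'abc', FILTER => "PrefixFilter('abc')"}
```

Both calls return the same rows; the difference is only in how many keys are touched before the first match, which grows with the amount of data stored ahead of the prefix.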