HBase (Easy): How to Perform Range Prefix Scan in

Posted 2019-01-21 11:29

I am designing an app to run on HBase and want to interactively explore the contents of my cluster. I am in the hbase shell and want to scan all keys starting with the characters "abc". Such keys might include "abc4", "abc92", "abc20014", etc. I tried this scan:

hbase(main):003:0> scan 'mytable', {STARTROW => 'abc', ENDROW => 'abc'}

But this does not seem to return anything, since there is technically no rowkey "abc", only rowkeys starting with "abc".

What I want is something like:

hbase(main):003:0> scan 'mytable', {STARTSROWPREFIX => 'abc', ENDROWPREFIX => 'abc'}

I hear HBase can do this quickly and that this is one of its main selling points. How do I do this in the hbase shell?

4 answers
迷人小祖宗
Reply #2 · 2019-01-21 11:32

So it turns out to be very easy. The scan range includes the start row but excludes the end row: the logic is start <= key < end. So the answer is:

scan 'mytable', {STARTROW => 'abc', ENDROW => 'abd'}
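The 'abc' to 'abd' step generalizes to any prefix: take the prefix and increment its last byte. Below is a minimal Python sketch of that rule, not HBase code. The function name `prefix_stop_row` is hypothetical, and the handling of trailing 0xFF bytes (which cannot be incremented) is an assumption about how one would extend the trick to binary keys.

```python
def prefix_stop_row(prefix: bytes) -> bytes:
    """Return the smallest key that sorts after every key starting with `prefix`.

    Hypothetical helper for illustration; it mirrors start <= key < end.
    """
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        # Empty or all-0xFF prefix: there is no upper bound.
        # An empty stop row in HBase means "scan to the end of the table".
        return b""
    # Increment the last remaining byte: b"abc" -> b"abd".
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


print(prefix_stop_row(b"abc"))  # b'abd'
```

The returned value is what you would pass as ENDROW, with the prefix itself as STARTROW.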
爷、活的狠高调
Reply #3 · 2019-01-21 11:38

I think what you need is a filter.

Check out the answer to the following question: Scan with filter using HBase shell

More filters are listed at http://hbase.apache.org/book/client.filter.html

再贱就再见
Reply #4 · 2019-01-21 11:40

In recent versions of HBase you can now do this in the hbase shell:

scan 'mytable', {ROWPREFIXFILTER => 'abc'}

This effectively does the following (and also works for binary row keys):

scan 'mytable', {STARTROW => 'abc', ENDROW => 'abd'}

This method is a LOT more efficient than the "PrefixFilter" approach, because the latter puts all records through the comparison code that is present in the PrefixFilter class.
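To make the efficiency argument concrete, here is a rough Python model rather than HBase itself. A sorted in-memory list stands in for the table, and the names `range_scan` and `filter_scan` are hypothetical. It assumes, as described above, that a filter-only scan pushes every row through the prefix comparison, while a start/stop range can seek straight to the matching slice.

```python
from bisect import bisect_left

# Row keys are stored sorted, as in an HBase table.
rows = sorted([b"abb9", b"abc20014", b"abc4", b"abc92", b"abd1", b"zzz"])


def range_scan(rows, start, stop):
    """Seek to `start`, return keys with start <= key < stop."""
    lo = bisect_left(rows, start)
    hi = bisect_left(rows, stop)
    return rows[lo:hi]


def filter_scan(rows, prefix):
    """Examine every row and keep those matching the prefix."""
    examined = 0
    matches = []
    for key in rows:
        examined += 1  # each row goes through the comparison
        if key.startswith(prefix):
            matches.append(key)
    return matches, examined


by_range = range_scan(rows, b"abc", b"abd")
by_filter, examined = filter_scan(rows, b"abc")
print(by_range == by_filter)  # True
print(len(by_range), examined)  # 3 6
```

Both approaches return the same three keys, but in this model the filter-only version had to look at all six rows to find them.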

欢心
Reply #5 · 2019-01-21 11:44

The accepted solution won't work in all cases (binary keys). In addition, using a PrefixFilter can be slow because it performs a table scan until it reaches the prefix. A more performant solution is to use a STARTROW and a FILTER like so:

 scan 'my_table', {STARTROW => 'abc', FILTER => "PrefixFilter('abc')"}
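As a sketch of why this combination copes with binary keys, the Python model below seeks to the prefix and then keeps rows only while they still match it, so no end row has to be computed. This is an illustration over a sorted list, not HBase code: `prefix_scan` is a hypothetical name, and stopping at the first non-matching key is an assumption about how the filter behaves once the scan has passed the prefix.

```python
from bisect import bisect_left

# Binary keys whose prefix ends in 0xFF, where "bump the last
# character" has no next character to bump to.
rows = sorted([b"ab\x001", b"ab\xff1", b"ab\xff\xff2", b"ac", b"b"])


def prefix_scan(rows, prefix):
    """Start at the prefix (STARTROW), keep keys while they match it."""
    i = bisect_left(rows, prefix)  # skip everything before the prefix
    matches = []
    while i < len(rows) and rows[i].startswith(prefix):
        matches.append(rows[i])
        i += 1
    return matches


result = prefix_scan(rows, b"ab\xff")
print(result)  # [b'ab\xff1', b'ab\xff\xff2']
```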