Background: when accessing HBase through happybase, the Table.scan method has the following signature:
def scan(self, row_start=None, row_stop=None, row_prefix=None, columns=None, filter=None, timestamp=None, include_timestamp=False, batch_size=1000, scan_batching=None, limit=None, sorted_columns=False):
Looking at the source, we can see that row_prefix is simply converted into a row_start / row_stop range:
if row_prefix is not None:
    if row_start is not None or row_stop is not None:
        raise TypeError(
            "'row_prefix' cannot be combined with 'row_start' "
            "or 'row_stop'")

    row_start = row_prefix
    row_stop = str_increment(row_prefix)
def str_increment(s):
    """Increment and truncate a byte string (for sorting purposes)

    This functions returns the shortest string that sorts after the given
    string when compared using regular string comparison semantics.

    This function increments the last byte that is smaller than ``0xFF``,
    and drops everything after it. If the string only contains ``0xFF``
    bytes, `None` is returned.
    """
    for i in xrange(len(s) - 1, -1, -1):
        if s[i] != '\xff':
            return s[:i] + chr(ord(s[i]) + 1)

    return None
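The quoted source is Python 2 code (xrange, chr/ord on str). As a rough sketch of the same idea for Python 3 bytes, something like the following should behave equivalently. The name bytes_increment is my own and is not part of happybase.

```python
def bytes_increment(b):
    """Return a byte string that sorts after every string starting with b.

    Hypothetical Python 3 port of the str_increment logic quoted above.
    Returns None when b is empty or contains only 0xFF bytes.
    """
    # Walk backwards to find the last byte that can still be incremented.
    for i in range(len(b) - 1, -1, -1):
        if b[i] != 0xFF:
            # Increment that byte and drop everything after it.
            return b[:i] + bytes([b[i] + 1])
    return None


print(bytes_increment(b"user42"))    # b'user43'
print(bytes_increment(b"ab\xff"))    # b'ac'
print(bytes_increment(b"\xff\xff"))  # None
```

The result is used as an exclusive upper bound: every key that starts with the prefix sorts before it, and no key outside the prefix falls between the prefix and this bound.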
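Consider the following scenario: a weibo (microblog) table whose row key has the form
userID_weiboID
Suppose we want to fetch all posts of a given user. There is no need to work out an explicit scan range such as 'userID_0' ~ 'userID_a';
we can simply pass row_prefix='userID' instead.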
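To see what the prefix scan selects, here is a small in-memory simulation. It is only a sketch: HBase is replaced by a sorted Python list, and the helper names prefix_stop and scan_keys as well as the sample row keys are made up for illustration.

```python
import bisect


def prefix_stop(prefix):
    """Exclusive upper bound for a prefix scan (same idea as str_increment)."""
    for i in range(len(prefix) - 1, -1, -1):
        if prefix[i] != 0xFF:
            return prefix[:i] + bytes([prefix[i] + 1])
    return None


def scan_keys(sorted_keys, row_prefix):
    """Return the keys in [row_prefix, prefix_stop(row_prefix))."""
    start = bisect.bisect_left(sorted_keys, row_prefix)
    stop_key = prefix_stop(row_prefix)
    if stop_key is None:
        # No upper bound exists, so scan to the end of the table.
        stop = len(sorted_keys)
    else:
        stop = bisect.bisect_left(sorted_keys, stop_key)
    return sorted_keys[start:stop]


# Hypothetical row keys in the form userID_weiboID, sorted bytewise like HBase.
keys = sorted([
    b"1001_0001", b"1001_0002", b"1001_a9",
    b"10010_0001", b"1002_0001",
])

with_separator = scan_keys(keys, b"1001_")
without_separator = scan_keys(keys, b"1001")
print(with_separator)     # only rows of user 1001
print(without_separator)  # also contains the row of user 10010
```

Note that a prefix without the trailing '_' also matches other users whose IDs begin with the same characters (user 10010 in this sample), so in practice it is safer to include the separator, for example table.scan(row_prefix=b'1001_').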
PS: I will provide a Java implementation of str_increment later.
Original article: http://blog.csdn.net/woshiaotian/article/details/41746009