码迷,mamicode.com

hbase

Posted: 2015-06-16 16:11:15


http://grokbase.com/t/hbase/user/125ya2cxxs/scan-addfamily-vs-familyfilter-equal

 

Just to add on: the Javadoc for FamilyFilter clearly says

* If an already known column family is looked for, use {@link
org.apache.hadoop.hbase.client.Get#addFamily(byte[])}
* directly rather than a filter.

So addFamily should be better.

Regards
Ram

-----Original Message-----
From: Anoop Sam John
Sent: Thursday, May 31, 2012 11:49 AM
To: user@hbase.apache.org
Subject: RE: Scan addFamily vs FamilyFilter(EQUAL, ...)

Hi,
As I understand the Scan code, if you want to scan only some of the
CFs (not all), you should go with Scan#addFamily.
FamilyFilter achieves the same result, but there is a difference
in performance.
When you specify the CFs in the scan, scanners are created only for
the corresponding Stores. No scanners are created for the other CFs,
so those stores are not scanned (their HFile data is not fetched).
When you instead use a FamilyFilter and do not specify any families
(using Scan#addFamily), all the stores are scanned and data is fetched
from their HFiles. Only the KVs that match your FamilyFilter are then
included in the Result; the others are discarded. So I expect there
will be a performance difference. Please correct me if I am wrong.
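
A toy model of the difference described above, in plain Java rather than HBase code. The class `StoreScanModel`, its sample data, and the `cellsRead` counter are all hypothetical and only illustrate the idea; they are not how the region server is implemented. Selecting families up front reads only the chosen stores, while filtering afterwards reads every store and then discards what does not match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class StoreScanModel {
    // One modeled store per column family; the value is the list of cells it holds.
    static final Map<String, List<String>> STORES = new TreeMap<>();
    static {
        STORES.put("cf1", Arrays.asList("r1:a", "r2:a", "r3:a"));
        STORES.put("cf2", Arrays.asList("r1:b", "r2:b"));
        STORES.put("cf3", Arrays.asList("r1:c"));
    }

    // Number of cells read from the modeled stores by the most recent scan.
    static int cellsRead;

    // Models Scan#addFamily: only the named stores are opened and read.
    public static List<String> scanWithFamilies(Set<String> families) {
        cellsRead = 0;
        List<String> result = new ArrayList<>();
        for (String family : families) {
            List<String> cells = STORES.getOrDefault(family, Collections.emptyList());
            cellsRead += cells.size();
            result.addAll(cells);
        }
        return result;
    }

    // Models a family filter with no families selected:
    // every store is read, and non-matching cells are dropped afterwards.
    public static List<String> scanWithFamilyFilter(String wantedFamily) {
        cellsRead = 0;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> store : STORES.entrySet()) {
            for (String cell : store.getValue()) {
                cellsRead++;
                if (store.getKey().equals(wantedFamily)) {
                    result.add(cell);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> viaFamilies = scanWithFamilies(Collections.singleton("cf1"));
        System.out.println(viaFamilies + " cellsRead=" + cellsRead);
        List<String> viaFilter = scanWithFamilyFilter("cf1");
        System.out.println(viaFilter + " cellsRead=" + cellsRead);
    }
}
```

Both calls return the same three cells, but in this model the filtered scan touches all six cells while the family-scoped scan touches only three.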

@Stack
One thing I ran into when using the Scan.addFamily / Scan.addColumn is
that those two methods overwrite each other.
The Scan#addColumn Javadoc clearly describes this overwriting, so it
seems to be intentional, correct?
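
The overwriting can be pictured with a simplified family map. This is a sketch under the assumption that the map behaves as the Javadoc describes: selecting a whole family discards earlier column selections for it, and selecting a column replaces an earlier whole-family selection. The class `FamilyMapModel` is hypothetical and uses Strings where HBase uses byte arrays.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class FamilyMapModel {
    // family -> selected qualifiers; a null value stands for "all columns of this family".
    private final Map<String, NavigableSet<String>> familyMap = new TreeMap<>();

    // Select the whole family; any qualifiers added earlier for it are dropped.
    public FamilyMapModel addFamily(String family) {
        familyMap.remove(family);
        familyMap.put(family, null);
        return this;
    }

    // Select one column; this replaces an earlier whole-family selection.
    public FamilyMapModel addColumn(String family, String qualifier) {
        NavigableSet<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            // Reached both for an unseen family and for a family selected as a whole.
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    // Render the selection, e.g. "cf1=ALL" or "cf1=[q1]".
    public String describe() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, NavigableSet<String>> e : familyMap.entrySet()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append('=');
            sb.append(e.getValue() == null ? "ALL" : e.getValue().toString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Column first, then family: the column selection is lost.
        System.out.println(new FamilyMapModel().addColumn("cf1", "q1").addFamily("cf1").describe());
        // Family first, then column: only the column remains selected.
        System.out.println(new FamilyMapModel().addFamily("cf1").addColumn("cf1", "q1").describe());
    }
}
```

In this model the order of calls decides the outcome, which is why mixing the two methods on the same family can be surprising.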


-Anoop-
________________________________________
From: saint.ack@gmail.com [saint.ack@gmail.com] on behalf of Stack
[stack@duboce.net]
Sent: Wednesday, May 30, 2012 11:13 PM
To: user@hbase.apache.org
Subject: Re: Scan addFamily vs FamilyFilter(EQUAL, ...)
On Wed, May 30, 2012 at 9:59 AM, Kevin wrote:

I am curious and trying to learn which method is best for limiting
a scan to a particular column or column family. The Scan class carries a
Filter instance and a TreeMap holding the family map, and I am unsure how
they get carried through to the server-side functionality. In terms of
performance, is there any difference between doing Scan.addFamily(x) and
Scan.setFilter(new FamilyFilter(CompareFilter.CompareOp.EQUAL, x))?

There is probably no noticeable difference in performance, but
Scan#addFamily is the more natural way of expressing column family
scoping.
St.Ack
 
 
 

Original post: http://www.cnblogs.com/jvava/p/4580956.html
