hbase

Posted: 2015-06-16 16:11:15

http://grokbase.com/t/hbase/user/125ya2cxxs/scan-addfamily-vs-familyfilter-equal

Just to add on.
The Javadoc for FamilyFilter clearly says:

* If an already known column family is looked for, use {@link
org.apache.hadoop.hbase.client.Get#addFamily(byte[])}
* directly rather than a filter.

So addFamily should be better.

Regards
Ram

-----Original Message-----
From: Anoop Sam John
Sent: Thursday, May 31, 2012 11:49 AM
To: user@hbase.apache.org
Subject: RE: Scan addFamily vs FamilyFilter(EQUAL, ...)

Hi,
As I understand the Scan code, in your scenario, where you want to
scan only some of the CFs (not all of them), you should go with
Scan#addFamily.
FamilyFilter achieves the same result, but there is a difference in
performance.
When you specify the CFs in the Scan, scanners are created only for
the Stores of those CFs. No scanners are created for the other CFs,
so those Stores are not scanned (their HFile data is not fetched).
When you instead use FamilyFilter without restricting the Scan via
Scan#addFamily, all the Stores get scanned and their data is fetched
from the HFiles. Only the KVs that match your FamilyFilter are then
included in the Result, and the rest are discarded. So I expect
there will be a performance difference. Please correct me if I am
wrong.
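
To make the difference described above concrete, here is a toy model in plain Java. It is not HBase code: the class FamilyScanModel, its methods, and the "one list of cells per family" layout are all hypothetical, and it assumes the behaviour is as described in this message (one Store per CF, and a filter applied only after the Store has been read).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only, not HBase code: one "store" per column family. */
public class FamilyScanModel {

    /** What a scan returned and how many stores it had to read. */
    public static final class Outcome {
        public final List<String> cells;
        public final int storesRead;

        Outcome(List<String> cells, int storesRead) {
            this.cells = cells;
            this.storesRead = storesRead;
        }
    }

    // family name -> cells held by that family's store
    private final Map<String, List<String>> stores = new LinkedHashMap<>();

    public void put(String family, String cell) {
        stores.computeIfAbsent(family, f -> new ArrayList<>()).add(cell);
    }

    /** Models Scan#addFamily: only the requested stores are opened. */
    public Outcome scanFamilies(Set<String> families) {
        List<String> out = new ArrayList<>();
        int read = 0;
        for (String family : families) {
            List<String> store = stores.get(family);
            if (store != null) {
                read++;
                out.addAll(store);
            }
        }
        return new Outcome(out, read);
    }

    /** Models a FamilyFilter(EQUAL, wanted) scan with no families set:
     *  every store is read, then non-matching cells are dropped. */
    public Outcome scanWithFamilyFilter(String wanted) {
        List<String> out = new ArrayList<>();
        int read = 0;
        for (Map.Entry<String, List<String>> e : stores.entrySet()) {
            read++; // the store is read before the filter can reject it
            if (e.getKey().equals(wanted)) {
                out.addAll(e.getValue());
            }
        }
        return new Outcome(out, read);
    }

    public static void main(String[] args) {
        FamilyScanModel table = new FamilyScanModel();
        table.put("cf1", "row1/cf1:a");
        table.put("cf2", "row1/cf2:b");
        table.put("cf3", "row1/cf3:c");

        Outcome byFamily = table.scanFamilies(Collections.singleton("cf1"));
        Outcome byFilter = table.scanWithFamilyFilter("cf1");

        // Same cells come back either way ...
        System.out.println(byFamily.cells.equals(byFilter.cells)); // true
        // ... but the filter variant touched every store.
        System.out.println(byFamily.storesRead + " vs " + byFilter.storesRead); // 1 vs 3
    }
}
```

In this sketch both scans return the same cells, and the only difference is how many stores were read to produce them, which is the cost this message is pointing at.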

@Stack

One thing I ran into when using the Scan.addFamily / Scan.addColumn is
that those two methods overwrite each other.
The Scan#addColumn Javadoc clearly mentions this overriding, so it
seems to be intentional, correct?
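
The overriding behaviour can be pictured with a small stand-in for the family map that the Scan carries. This is a simplified sketch in plain Java, not the HBase source: FamilyMapModel and describe are hypothetical names, and it assumes that a family mapped to null means "all columns of that family", in line with what the Javadoc says about the two methods overriding each other. Details may differ between HBase versions.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only: a simplified stand-in for the Scan family map. */
public class FamilyMapModel {

    // family -> requested qualifiers; null means "every column in the family"
    private final Map<String, NavigableSet<String>> familyMap = new TreeMap<>();

    /** Select the whole family, discarding any qualifiers added earlier. */
    public FamilyMapModel addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    /** Select one column; a previous whole-family selection is narrowed. */
    public FamilyMapModel addColumn(String family, String qualifier) {
        NavigableSet<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>();
        }
        qualifiers.add(qualifier);
        familyMap.put(family, qualifiers);
        return this;
    }

    /** Human-readable view of what is selected for a family. */
    public String describe(String family) {
        if (!familyMap.containsKey(family)) {
            return "not selected";
        }
        NavigableSet<String> qualifiers = familyMap.get(family);
        return qualifiers == null ? "all columns" : qualifiers.toString();
    }

    public static void main(String[] args) {
        // addColumn after addFamily: only the named column remains selected.
        System.out.println(new FamilyMapModel()
                .addFamily("cf").addColumn("cf", "q1").describe("cf")); // [q1]

        // addFamily after addColumn: back to the whole family.
        System.out.println(new FamilyMapModel()
                .addColumn("cf", "q1").addFamily("cf").describe("cf")); // all columns
    }
}
```

Under this model the last call for a given family wins, so the order of addFamily and addColumn calls matters when both target the same family.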

-Anoop-
________________________________________
From: saint.ack@gmail.com [saint.ack@gmail.com] on behalf of Stack
[stack@duboce.net]
Sent: Wednesday, May 30, 2012 11:13 PM
To: user@hbase.apache.org
Subject: Re: Scan addFamily vs FamilyFilter(EQUAL, ...)

On Wed, May 30, 2012 at 9:59 AM, Kevin wrote:
I am curious and trying to learn which method is best when wanting to limit
a scan to a particular column or column family. The Scan class carries a
Filter instance and a TreeMap of the family map and I am unsure how they
get carried through to the server-side functionality. In terms of
performance is there any difference between doing Scan.addFamily(x) and
Scan.setFilter(new FamilyFilter(CompareFilter.CompareOp.EQUAL, x))?

There is probably no noticeable difference in performance, but
Scan#addFamily is the more natural way of expressing column family
scoping.
St.Ack

Original article: http://www.cnblogs.com/jvava/p/4580956.html