Installing HBase-2.2.1 on Hadoop-3.1.2

Date: 2019-11-09 11:51:05


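1. Preface

This article installs HBase-2.2.1 on top of Hadoop-3.1.2. For the Hadoop-3.1.2 installation itself, see the companion article "Installing hadoop-3.1.2 on zookeeper-3.5.5". The environment is 64-bit CentOS Linux 7.2.

The installation follows quickstart.html, the guide that ships with HBase. It can be found in the docs/getting_started directory, or online at http://hbase.apache.org/book/quickstart.html.

An external ZooKeeper is used. For the ZooKeeper installation, see the same companion article.

For distributed installation, see http://hbase.apache.org/book/standalone_dist.html#distributed. For configuring HBase with an external ZooKeeper, see http://hbase.apache.org/book/zookeeper.html. All of these online documents are also available in the docs directory of the unpacked binary package.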

2. Abbreviations

Abbreviation	Full name	Description
JVM	Java Virtual Machine	Java virtual machine
jps	JVM Process Status	JVM process status tool
HDFS	Hadoop Distributed File System	Hadoop distributed file system
HBase	Hadoop Database	Hadoop database
WAL	Write-Ahead Log	Write-ahead log, similar to the MySQL binlog
RS (rs)	RegionServer
zk	ZooKeeper
mr	MapReduce
DistCp	Distributed Copy	Distributed copy
RFA	RollingFileAppender	A Log4j appender type that starts a new file when the current file reaches a given size
DRFA	DailyRollingFileAppender	A Log4j appender type that produces one log file per day

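3. Installation plan

3.1. Users

	Install user	Run user	Install and run group
HBase	hbase	hbase	supergroup
Hadoop	hadoop	hadoop
ZooKeeper	zk	zk

Hadoop's default group is supergroup. To avoid permission problems, HBase is best placed in the same group. The group can also be changed when installing Hadoop through the property dfs.permissions.superusergroup (formerly dfs.permissions.supergroup); either way, both should use the same group.

It is generally recommended that HBase and Hadoop share one ZooKeeper cluster, so ZooKeeper is installed and deployed as a separate cluster.

Why is the ZooKeeper user named zk rather than zookeeper? When a user name is longer than 8 characters, ps and some other commands show the numeric user ID instead of the user name.

3.2. Directories

	Install directory
HBase	/data/hbase
Hadoop	/data/hadoop
ZooKeeper	/data/zookeeper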

4. Ports

2888	ZooKeeper; on the Leader, listens for connections from Followers
3888	ZooKeeper; used for Leader election
2181	ZooKeeper; listens for client connections
16010	hbase.master.info.port, the HTTP port of HMaster
16000	hbase.master.port, the RPC port of HMaster
16030	hbase.regionserver.info.port, the HTTP port of HRegionServer
16020	hbase.regionserver.port, the RPC port of HRegionServer
8080	hbase.rest.port, the port of the HBase REST server
9095	hbase.thrift.info.port, the HTTP port of the HBase Thrift server

5. Downloading the package

Official site: http://hbase.apache.org/, which links to the HBase downloads.

A mirror in China: http://mirror.bit.edu.cn/apache/hbase/. HBase-2.2.1 is at http://mirror.bit.edu.cn/apache/hbase/2.2.1/. Download hbase-2.2.1-bin.tar.gz.

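6. Editing the configuration files

6.1. Approach

Edit the files on one machine, then copy them in bulk to the other nodes of the cluster. The batch command tool mooon_ssh and the batch upload tool mooon_upload are enough for this.

6.2. Editing conf/regionservers

This step is optional. If passwordless SSH login from the Master to the RegionServers is not set up, or start-hbase.sh and stop-hbase.sh will not be used, the regionservers file can be left empty.

regionservers is similar to Hadoop's slaves file; it does not need to be edited on the RegionServer machines.

List the IP address or host name of every HRegionServer in the regionservers file, exactly one per line and never several on one line. The configuration used in this article: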

hadoop@test_64:~/hbase/conf> cat regionservers

192.168.31.30

192.168.31.31

192.168.31.32

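6.3. Editing conf/hbase-env.sh

The same changes are needed on every machine. Configure one machine first and then copy the file to the others, for example with scp. The main changes are listed below.

1) Set JAVA_HOME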

# The java implementation to use. Java 1.6 required.

export JAVA_HOME=/data/jdk

Here /data/jdk is the JDK installation directory.

2) Set HADOOP_HOME

export HADOOP_HOME=/data/hadoop

3) Set HBASE_CLASSPATH

# Extra Java CLASSPATH elements. Optional.

export HBASE_CLASSPATH=$HADOOP_HOME/etc/hadoop

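This setting may look confusing: why would a CLASSPATH point at Hadoop's configuration directory? Its purpose is to let HBase find the Hadoop configuration; the name is admittedly not well chosen.

Alternatively, a symbolic link to Hadoop's hdfs-site.xml can be created in HBase's conf directory.

4) Set HBASE_MANAGES_ZK

# Tell HBase whether it should manage it's own instance of Zookeeper or not.

export HBASE_MANAGES_ZK=false

If HBASE_MANAGES_ZK is true, HBase uses its bundled ZooKeeper. Deploying ZooKeeper separately is recommended, so that it can serve other systems as well.

5) Set JVM options

JVM option	Value	Description
HBASE_THRIFT_OPTS	export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xmx2048m -Xms2048m"	JVM settings for the HBase Thrift server. -Xmx sets the maximum heap size and -Xms the heap size allocated at startup. With 32 GB of physical memory, 2048m is a reasonable value to consider.
SERVER_GC_OPTS	Enables GC logging	Useful for analyzing pause durations and similar. hbase-env.sh contains three commented-out SERVER_GC_OPTS lines; uncomment any one of them.

To keep "hbase shell" from printing INFO and WARN log messages to the screen, change "HBASE_ROOT_LOGGER" in hbase-env.sh: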

# HBASE_ROOT_LOGGER=INFO,DRFA

HBASE_ROOT_LOGGER=OFF,DRFA

6.4. Editing conf/log4j.properties

1) Set the log directory:

hbase.log.dir=/data/hbase/log

2) Set the log level of the Cleaner:

log4j.logger.org.apache.hadoop.hbase.master.cleaner.CleanerChore=DEBUG

6.5. Editing conf/hbase-site.xml

hbase-site.xml is the HBase configuration file. The default hbase-site.xml is empty, as shown below:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--

/**

* Licensed to the Apache Software Foundation (ASF) under one

* or more contributor license agreements. See the NOTICE file

* distributed with this work for additional information

* regarding copyright ownership. The ASF licenses this file

* to you under the Apache License, Version 2.0 (the

* "License"); you may not use this file except in compliance

* with the License. You may obtain a copy of the License at

* http://www.apache.org/licenses/LICENSE-2.0

* Unless required by applicable law or agreed to in writing, software

* distributed under the License is distributed on an "AS IS" BASIS,

* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

* See the License for the specific language governing permissions and

* limitations under the License.

-->

<configuration>

</configuration>

That is fine; start from this file. Do not start from hbase-default.xml in the docs directory, which is long and tiring to read.

Edit hbase-site.xml and add the following (taken from standalone_dist.html; search for "Fully-distributed"):

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop10102:8020/hbase</value>
    <description>The directory shared by RegionServers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop10108,hadoop10109,hadoop10110,hadoop10111,hadoop10112</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
      By default this is set to localhost for local and pseudo-distributed modes
      of operation. For a fully-distributed setup, this should be set to a full
      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
      this is the list of servers which we will start/stop ZooKeeper on.
    </description>
  </property>
  <property>
    <name>hbase.master.maxclockskew</name>
    <description>Time(ms) difference of regionserver from master</description>
  </property>
</configuration>

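The host and port in hbase.rootdir (hdfs://hadoop10102:8020 in the example above) correspond to "dfs.namenode.rpc-address" in hdfs-site.xml. "hbase.zookeeper.quorum" is set to the host names or IP addresses of the ZooKeeper cluster nodes.

If HDFS runs in cluster (nameservice) mode, hbase.rootdir should use the nameservice form instead, for example:

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://test/hbase</value>
</property>

That is, the value is the dfs.nameservices value from hdfs-site.xml followed by the hbase directory. In the example above, test is the dfs.nameservices value in hdfs-site.xml, which fs.defaultFS in core-site.xml also uses.

For more information, see http://hbase.apache.org/book/config.files.html.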
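Since the same hbase-site.xml has to be produced for every node, it can help to generate it from a small script. The following is only a minimal sketch under assumptions: the function name build_hbase_site is hypothetical, and the property values are the sample values from this section, not values to copy into a real cluster.

```python
import xml.etree.ElementTree as ET


def build_hbase_site(properties):
    """Render a dict of property name -> value as hbase-site.xml text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
        ET.indent(root, space="  ")
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode") + "\n"


# Sample values taken from this section; adjust them to the real cluster.
SITE_XML = build_hbase_site({
    "hbase.rootdir": "hdfs://test/hbase",
    "hbase.cluster.distributed": "true",
    "hbase.zookeeper.quorum": "hadoop10108,hadoop10109,hadoop10110,hadoop10111,hadoop10112",
})

if __name__ == "__main__":
    print(SITE_XML)
```

The printed text can be redirected into conf/hbase-site.xml and then distributed to the other nodes in the same way as the other configuration files.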

6.5.1. hbase.master.info.port

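Specifies the HTTP port of HMaster.

6.5.2. hbase.master.info.bindAddress

Specifies the IP address that the HMaster HTTP server binds to. If it is not set, an IPv6 address may be used.

6.5.3. hbase.hregion.majorcompaction

The interval between major compactions, in milliseconds; 0 means major compactions are only run manually. To keep the number of small files down, HBase merges them through compaction when necessary. There are two kinds: minor compaction and major compaction.

A major compaction merges all StoreFiles, whereas a minor compaction merges only some of them.

The default is 604800000 ms, i.e. one major compaction every 7 days. A major compaction has a severe performance impact, so automatic major compaction is usually turned off and run by hand during idle periods instead.

To run a major compaction manually in the HBase shell:

major_compact 'tablename'

Or from the command line:

echo "major_compact 'tablename'" | hbase shell

Usage of major_compact: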

Examples:

Compact all regions in a table:

hbase> major_compact 't1'

hbase> major_compact 'ns1:t1'

Compact an entire region:

hbase> major_compact 'r1'

Compact a single column family within a region:

hbase> major_compact 'r1', 'c1'

Compact a single column family within a table:

hbase> major_compact 't1', 'c1'

6.5.4. zookeeper.session.timeout

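The default is 90 seconds (90000 ms). This is the session timeout between a RegionServer and ZooKeeper. When it expires, ZooKeeper removes the RegionServer from the list of live RegionServers and the HBase Master is notified. The Master then reassigns the Regions of that RegionServer so that other RegionServers take them over. This parameter therefore determines how quickly a RegionServer failover can complete.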

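6.5.5. Other parameters

Parameter	Default	Description
hbase.regionserver.thrift.port	9090	Service port of the ThriftServer
hbase.regionserver.thrift.http	false	Whether to run in HTTP mode. The server runs either in HTTP mode or in non-HTTP mode, never in both at once.
hbase.thrift.ssl.enabled	false	Whether the ThriftServer enables SSL
hbase.master.hfilecleaner.ttl	300000	In milliseconds; the default is 5 minutes. Used by org.apache.hadoop.hbase.master.cleaner.CleanerChore; only the source files StorefileRefresherChore.java and TimeToLiveHFileCleaner.java read hbase.master.hfilecleaner.ttl directly. When migrating data between clusters with ExportSnapshot, this value must be longer than the migration time of the largest table; otherwise table files that have already been migrated may be deleted during the migration. The error "Can't find hfile" is usually caused by this. The bug was introduced by HBASE-21511 and is to be fixed in a newer release.
hbase.cleaner.scan.dir.concurrent.size		Coefficient for the number of Cleaner threads
hbase.hregion.majorcompaction	604800000	In milliseconds; the default is 7 days. Period of major compaction; 0 turns automatic major compaction off.
hbase.hregion.majorcompaction.jitter	0.5	Jitter applied to the major compaction period, so that RegionServers do not all run major compactions at the same time
hbase.hstore.compactionThreshold	3	Minimum number of files for a minor compaction
hbase.hstore.compaction.max	10	At most 10 StoreFiles are selected for one minor compaction
hbase.hregion.max.filesize	10737418240	The default is 10 GB; a split happens when a StoreFile reaches this size
hbase.hregion.memstore.flush.size	134217728	In bytes; the default is exactly 128 MB. When a MemStore reaches this size it is flushed to HDFS as an HFile.
hbase.hregion.memstore.block.multiplier	4	Defined in hbase-common, org/apache/hadoop/hbase/HConstants.java. A multiplier: when the MemStore size of a Region reaches (hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size), writes to that Region are blocked and a flush is forced. Writes resume only after the flush succeeds; while they are blocked, the error "RegionTooBusyException: Over memstore limit" can be seen.
hbase.regionserver.global.memstore.upperLimit (old) hbase.regionserver.global.memstore.size (new)	0.4	Read in hbase-server, org/apache/hadoop/hbase/io/util/MemorySizeUtil.java. A floating-point ratio: the global MemStore size of a RegionServer. When the MemStore usage of a RegionServer exceeds this ratio, all updates on that RegionServer are blocked, so the total MemStore size should be kept below this limit.
hbase.regionserver.global.memstore.lowerLimit (old) hbase.regionserver.global.memstore.size.lower.limit (new)	0.95	Read in hbase-server, org/apache/hadoop/hbase/io/util/MemorySizeUtil.java. A floating-point ratio: when the MemStore usage of a RegionServer exceeds it, MemStoreFlusher picks a Region to flush even if no Region has reached its flush size.
hbase.hstore.compaction.kv.max	10	Number of KVs read from the HFile at a time during compaction
hbase.hstore.blockingStoreFiles	10	When the number of StoreFiles exceeds this value, a Split or Compact runs before the flush, and the flush is blocked until that finishes or until the time given by hbase.hstore.blockingWaitTime has passed. If the RegionServer log shows many "has too many store files; delaying flush up to" messages, this value needs adjusting; it can usually be raised, for example to 100.
hbase.hstore.blockingWaitTime	90000	In milliseconds; the default is 90 seconds
hbase.hregion.memstore.mslab.enabled	true	MemStore-Local Allocation Buffer. Whether MSLAB is enabled; it reduces Full GCs caused by heap fragmentation.
hbase.regionserver.regionSplitLimit	1000	Maximum number of Regions managed by one RegionServer
hbase.regionserver.logroll.period	3600000	In milliseconds; the default is 1 hour. Interval at which WAL files are rolled.
hbase.server.thread.wakefrequency	10000	Used for the periodic check of whether a compaction is needed; the check period is (hbase.server.thread.wakefrequency * hbase.server.compactchecker.interval.multiplier)
hbase.server.compactchecker.interval.multiplier	1000	Used for the periodic check of whether a compaction is needed; the check period is (hbase.server.thread.wakefrequency * hbase.server.compactchecker.interval.multiplier)
hbase.thrift.info.port	9095	Port of the info web UI; if the value is below 0, the web UI is not started
hbase.thrift.info.bindAddress	0.0.0.0	Address of the info web UI; the default is "0.0.0.0"
hbase.regionserver.hlog.blocksize		See WALUtil.java. When the WAL size reaches (hbase.regionserver.hlog.blocksize * hbase.regionserver.maxlogs), a flush is triggered.
hbase.regionserver.maxlogs		See AbstractFSWAL.java
hbase.regionserver.logroll.multiplier		See AbstractFSWAL.java. A multiplier; not to be confused with hbase.hregion.memstore.block.multiplier.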
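Several rows of the table describe values that are products of two settings. As a quick sanity check, the arithmetic can be sketched as follows, using only the defaults quoted in the table; the helper names are hypothetical and nothing here reads a real HBase configuration.

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR
MIB = 1024 * 1024

# Defaults as quoted in the table above.
DEFAULTS = {
    "hbase.hregion.majorcompaction": 604800000,
    "hbase.hregion.memstore.flush.size": 134217728,
    "hbase.hregion.memstore.block.multiplier": 4,
    "hbase.server.thread.wakefrequency": 10000,
    "hbase.server.compactchecker.interval.multiplier": 1000,
}


def major_compaction_days(conf):
    """Major compaction period converted from milliseconds to days."""
    return conf["hbase.hregion.majorcompaction"] / MS_PER_DAY


def memstore_block_limit_mib(conf):
    """MemStore size (MiB) at which writes to a Region are blocked."""
    limit = (conf["hbase.hregion.memstore.block.multiplier"]
             * conf["hbase.hregion.memstore.flush.size"])
    return limit / MIB


def compaction_check_period_hours(conf):
    """Period of the 'is a compaction needed' check, in hours."""
    period_ms = (conf["hbase.server.thread.wakefrequency"]
                 * conf["hbase.server.compactchecker.interval.multiplier"])
    return period_ms / MS_PER_HOUR


if __name__ == "__main__":
    print(major_compaction_days(DEFAULTS))
    print(memstore_block_limit_mib(DEFAULTS))
    print(compaction_check_period_hours(DEFAULTS))
```

With the defaults, the blocking limit works out to 512 MiB, which matches the "Over memstore limit=512.0M" text in the RegionTooBusyException log lines quoted in section 7.3.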

7. Starting HBase

7.1. Starting the Master

To stop the Master, replace start with stop in the command below.

hbase-daemon.sh start master

7.2. Starting a RegionServer

To stop the RegionServer, replace start with stop in the command below.

hbase-daemon.sh start regionserver

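7.3. Starting the ThriftServer

HBase has two ThriftServer versions, which implement two different and mutually incompatible interfaces: thrift and thrift2, the latter added later. Since HBase 2.x, the two share everything except the interface.

Start the ThriftServer with the command below; to stop it, replace start with stop.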

hbase-daemon.sh start thrift2 --framed -nonblocking

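In production, avoid the "-nonblocking" mode of the Java ThriftServer, because it is single-threaded. Use "-hsha" or "-threadedselector" instead; "-threadedselector" offers the highest concurrency and is the recommended mode.

The options of "hbase thrift2" can be listed with "hbase thrift2 --help"; "hbase thrift" is similar.

1) Hsha mode

Any one of the following three commands can be used:

hbase-daemon.sh start thrift2 --framed -hsha --workers 30

hbase-daemon.sh start thrift2 --framed -hsha -w 30

hbase thrift2 --framed -hsha -w 30

Note that the option is "-hsha", not "--hsha". "-f" is equivalent to "--framed", and "-w" to "--workers".

2) ThreadedSelector mode

Either of the following two commands can be used:

hbase-daemon.sh start thrift2 --framed -threadedselector -s 10 -m 10 -w 50

hbase thrift2 --framed -threadedselector -s 10 -m 10 -w 50

Here "-s" is equivalent to "--selectors", and "-m" to "--minWorkers".

3) Error "Cannot get replica 0 location for"

This error means that a request accessed an HBase table that does not exist.

ERROR [thrift-worker-18] client.AsyncRequestFutureImpl: Cannot get replica 0 location for

4) Error "ExecutorService rejected execution"

The queue is too small. Specify the queue size when starting the ThriftServer: "-q 10000" or "--queue 10000".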

WARN [Thread-16] server.TThreadedSelectorServer: ExecutorService rejected execution!

java.util.concurrent.RejectedExecutionException: Task org.apache.thrift.server.Invocation@1642f20a rejected from org.apache.hadoop.hbase.thrift.THBaseThreadPoolExecutor@7cc22596[Running, pool size = 20, active threads = 20, queued tasks = 1000, completed tasks = 32792177]

5) Error "RegionTooBusyException: Over memstore limit"

Related settings: hbase.hregion.memstore.block.multiplier and hbase.hregion.memstore.flush.size. The former defaults to 4 and the latter to 128 MB. When this error appears, consider raising hbase.hregion.memstore.block.multiplier to 8.

INFO [thrift-worker-2] client.RpcRetryingCallerImpl: RegionTooBusyException: Over memstore limit=512.0M, regionName=f146450279edde75342d003affa36be6, server=hadoop003,16020,1572244447374

at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4421)

at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:3096)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2877)

6) Error "client.RpcRetryingCallerImpl: Exception: Over memstore limit"

Related settings: hbase.hregion.memstore.block.multiplier and hbase.hregion.memstore.flush.size. The former defaults to 4 and the latter to 128 MB.

INFO [thrift-worker-24] client.RpcRetryingCallerImpl: Exception: Over memstore limit=512.0M, regionName=f80b7247cce17f3faac8731212ce43ba, server=hadoop003,16020,1572244447374

at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4421)

at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:3096)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2877)

...

, details=row '13871489517670268' on table 'test' at region=test,13628864449,1572513486889.f80b7247cce17f3faac8731212ce43ba., hostname=hadoop003,16020,1572244447374, seqNum=10634659, see https://s.apache.org/timeout

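7.4. Starting the HBase REST server

The default port is 8080; another value can be given with the command-line option "-p" or "--port".

bin/hbase-daemon.sh start rest -p 8080

Simple access examples (assuming the HBase REST server was started on 10.143.136.232):

1) Show the HBase version:

http://10.143.136.232:8080/version/cluster

2) Show the cluster status:

http://10.143.136.232:8080/status/cluster

3) List all non-system tables:

http://10.143.136.232:8080/

4) List all regions of table test:

http://10.143.136.232:8080/test/regions

5) Fetch the whole row with rowkey 100000797550117 (the returned values must be base64-decoded):

http://10.143.136.232:8080/test/100000797550117

6) Fetch column field0 of column family cf1 in the row with rowkey 100000797550117 (the returned values must be base64-decoded):

http://10.143.136.232:8080/test/100000797550117/cf1:field0

For more, see: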

http://hbase.apache.org/book.html#_rest
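Because the REST server returns row keys, column names and values base64-encoded, and expects the same encoding when writing (see 7.4.6), a small helper on the client side is usually needed. The sketch below is illustrative only: the function names are hypothetical, it assumes the JSON layout with "Row", "key", "Cell", "column" and "$" that the Put example in 7.4.6 uses, and it assumes all values are UTF-8 text. It performs no HTTP requests.

```python
import base64
import json


def b64encode_text(text):
    """Base64-encode a UTF-8 string and return the result as text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def b64decode_text(data):
    """Decode a base64 string back into UTF-8 text."""
    return base64.b64decode(data).decode("utf-8")


def build_put_payload(row_key, cells):
    """Build a JSON body for a PUT; key, column and value are base64-encoded."""
    return json.dumps({
        "Row": [{
            "key": b64encode_text(row_key),
            "Cell": [
                {"column": b64encode_text(column), "$": b64encode_text(value)}
                for column, value in cells.items()
            ],
        }]
    })


def decode_rows(body):
    """Decode a JSON row document into (row_key, {column: value}) pairs."""
    rows = []
    for row in json.loads(body)["Row"]:
        cells = {b64decode_text(c["column"]): b64decode_text(c["$"])
                 for c in row["Cell"]}
        rows.append((b64decode_text(row["key"]), cells))
    return rows


# Payload copied from the JSON Put example in section 7.4.6.
SAMPLE = ('{"Row":[{"key":"cm93NQo=", "Cell": '
          '[{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}')

if __name__ == "__main__":
    print(decode_rows(SAMPLE))
```

Decoding the sample shows that its key, column and value each end with a newline character, which suggests they were produced with echo piped into base64; encoding with a helper such as b64encode_text avoids that stray newline.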

7.4.1. Cluster-Wide

Endpoint	HTTP Verb	Description	Example
/version/cluster	GET	Show the HBase version	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/version/cluster"
/status/cluster	GET	Show the cluster status	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/status/cluster"
/	GET	List all non-system tables	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/"

Note: these URLs can also be opened directly in a browser, e.g. http://10.143.136.232:8080/version/cluster.

7.4.2. Namespace

Endpoint	HTTP Verb	Description	Example
/namespaces	GET	List all namespaces	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/namespaces/"
/namespaces/*namespace*	GET	Describe the specified *namespace*	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/namespaces/*special_ns*"
/namespaces/*namespace*	POST	Create a new *namespace*	curl -vi -X POST -H "Accept: text/xml" "http://example.com:8000/namespaces/*special_ns*"
/namespaces/*namespace*/tables	GET	List all tables in the specified *namespace*	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/namespaces/*special_ns*/tables"
/namespaces/*namespace*	PUT	Alter an existing *namespace*	curl -vi -X PUT -H "Accept: text/xml" "http://example.com:8000/namespaces/*special_ns*"
/namespaces/*namespace*	DELETE	Delete a *namespace*; the *namespace* must already be empty	curl -vi -X DELETE -H "Accept: text/xml" "http://example.com:8000/namespaces/*special_ns*"

Note: the parts between asterisks (italic in the original) are values to be supplied.

7.4.3. Table

Endpoint	HTTP Verb	Description	Example
/*table*/schema	GET	Show the schema of the specified table	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users*/schema"
/*table*/schema	POST	Create a new table from a schema, or modify the schema of an existing table	curl -vi -X POST -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" /></TableSchema>' "http://example.com:8000/*users*/schema"
/*table*/schema	PUT	Update an existing table from a schema	curl -vi -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" KEEP_DELETED_CELLS="true" /></TableSchema>' "http://example.com:8000/*users*/schema"
/*table*/schema	DELETE	Delete the table	curl -vi -X DELETE -H "Accept: text/xml" "http://example.com:8000/*users*/schema"
/*table*/regions	GET	List all regions of the table	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users*/regions"

7.4.4. Get

Endpoint	HTTP Verb	Description	Example
/*table/row/column:qualifier/timestamp*	GET	Get the value of the given column in the given column family of the given table at the given timestamp. The returned value is base64-encoded and must be base64-decoded before use.	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/row1*" and curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/row1/cf:a/1458586888395*"
/*table/row/column:qualifier*	GET	Get the value of the given column in the given column family of the given table	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/row1/cf:a*" and curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/row1/cf:a/*"
/*table/row/column:qualifier/?v=number_of_versions*	GET	Get the given number of versions of the given column in the given column family of the given table	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/row1/cf:a?v=2*"

7.4.5. Scan

Endpoint	HTTP Verb	Description	Example
/*table*/scanner/	PUT	Create a scanner	curl -vi -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d '<Scanner batch="1"/>' "http://example.com:8000/*users*/scanner/"
/*table*/scanner/	PUT	Create a scanner with a filter. The filter can be written in a text file, in a format such as: <Scanner batch="100"> <filter> { "type": "PrefixFilter", "value": "u123" } </filter> </Scanner>	curl -vi -X PUT -H "Accept: text/xml" -H "Content-Type: text/xml" -d @filter.txt "http://example.com:8000/*users*/scanner/"
/*table/scanner/scanner-id*	GET	Fetch the next batch of data; when no data is left, the HTTP status code returned is 204	curl -vi -X GET -H "Accept: text/xml" "http://example.com:8000/*users/scanner/145869072824375522207*"
/*table/scanner/scanner-id*	DELETE	Delete the specified scanner and release its resources	curl -vi -X DELETE -H "Accept: text/xml" "http://example.com:8000/*users/scanner/145869072824375522207*"

7.4.6. Put

Endpoint	HTTP Verb	Description	Example
/*table/row_key*	PUT	Write one row into the specified table. The row key, column family, column name and column value must all be base64-encoded.	The two requests below, first with an XML body and then with a JSON body

curl -vi -X PUT \
-H "Accept: text/xml" \
-H "Content-Type: text/xml" \
-d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93NQo="><Cell column="Y2Y6ZQo=">dmFsdWU1Cg==</Cell></Row></CellSet>' \
"http://example.com:8000/users/fakerow"

curl -vi -X PUT \
-H "Accept: text/json" \
-H "Content-Type: text/json" \
-d '{"Row":[{"key":"cm93NQo=", "Cell": [{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}' \
"http://example.com:8000/users/fakerow"

8. Basic HBase commands

Run "hbase shell" to enter the command-line interface. For details, see the official document quickstart.html.

# List the tables

hbase(main):003:0> list

hbase(main):003:0> create 'test', 'cf' # create table test with one column family, cf

0 row(s) in 1.2200 seconds

hbase(main):003:0> list 'test'

# List only the tables of a given namespace

list_namespace_tables 'default'

hbase(main):003:0> desc 'test' # describe the table

1 row(s) in 0.0550 seconds

hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1' # insert value1 into column a of column family cf in table test

0 row(s) in 0.0560 seconds

hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'

0 row(s) in 0.0370 seconds

hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'

0 row(s) in 0.0450 seconds

hbase(main):007:0> scan 'test' # scan table test

ROW COLUMN+CELL

row1 column=cf:a, timestamp=1288380727188, value=value1

row2 column=cf:b, timestamp=1288380738440, value=value2

row3 column=cf:c, timestamp=1288380747365, value=value3

3 row(s) in 0.0590 seconds

# Scan with conditions

hbase(main):007:0> scan 'test',{LIMIT=>1}

hbase(main):007:0> scan 'test',{LIMIT=>1,STARTROW=>'0000'}

hbase(main):008:0> get 'test', 'row1' # fetch one row from table test

COLUMN CELL

cf:a timestamp=1288380727188, value=value1

1 row(s) in 0.0400 seconds

# Fetch the data of one column

get 'test', 'row1', 'cf1:col1'

# Or

get 'test', 'row1', {COLUMN=>'cf1:col1'}

Syntax:

get <table>,<rowkey>,[<family:column>,....]

scan <table>, {COLUMNS => [ <family:column>,.... ], LIMIT => num}

Example:

scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => '20180906'}

# Before a table can be deleted with drop, it must first be disabled with disable

hbase(main):012:0> disable 'test'

0 row(s) in 1.0930 seconds

hbase(main):013:0> drop 'test'

0 row(s) in 0.0770 seconds

# Truncate a table

truncate 'test'

# Count the rows of a table, form 1

count 'test'

# Count the rows of a table, form 2

count 'test',{INTERVAL=>10000}

# Delete one column value in a row

delete 't1','row1','cf1:col1'

# Delete a whole row

deleteall 't1','row1'

# Exit the hbase shell

hbase(main):014:0> exit

# Pre-splitting 1: EndKeys given on the command line

create 'mytable','cf1',SPLITS=>['EndKey1','EndKey2','EndKey3']

# Pre-splitting 2: EndKeys read from a file (mytable.splits holds one EndKey per line)

create 'mytable','cf1',SPLITS_FILE=>'mytable.splits'

# Snappy compression for a column family

create 'mytable',{NAME=>'cf1',COMPRESSION=>'SNAPPY'},SPLITS=>['EndKey1','EndKey2','EndKey3']

# Pre-splitting with a split algorithm

create 'mytable','cf1',{NUMREGIONS=>7,SPLITALGO=>'HexStringSplit'}

A second way to count the rows of a table (not an HBase shell command; test below is the table name):

bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'test'
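For the SPLITS_FILE form of pre-splitting shown above, the split file has to be produced somehow. The sketch below generates evenly spaced hexadecimal split keys and the matching file content. It is an illustration of the idea only, with hypothetical function names; it is not HBase's HexStringSplit implementation, and it assumes row keys are fixed-width lowercase hex strings.

```python
def hex_split_keys(num_regions, width=8):
    """Return num_regions - 1 evenly spaced hex split keys of the given width."""
    if num_regions < 2:
        return []
    key_space = 16 ** width  # number of distinct keys with `width` hex digits
    return [
        format(key_space * i // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


def splits_file_content(keys):
    """Format split keys as file content, one key per line."""
    return "".join(key + "\n" for key in keys)


if __name__ == "__main__":
    # Write the result to a file and pass that file as SPLITS_FILE.
    print(splits_file_content(hex_split_keys(4)), end="")
```

A table created with N - 1 split keys starts with N regions, so hex_split_keys(7) yields the six keys needed for seven regions.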

9. Commonly used HBase commands

1) Show the usage of a command

hbase(main):007:0> help "count" # show the usage of the count command

2) Namespace commands

hbase(main):007:0> create_namespace 'mynamespace' # create a namespace

hbase(main):007:0> drop_namespace 'mynamespace' # delete a namespace

hbase(main):007:0> list_namespace # list the namespaces

3) Table references

test=get_table "test" # the table "test" can then be used as an object

test.count # count the rows of the table

test.scan # scan the whole table

test.scan LIMIT=>1 # limited scan

test.scan LIMIT=>1,STARTROW=>"0" # limited scan with a start row

4) import

All of the following commands, including the "import" lines, are run directly in the HBase shell:

import org.apache.hadoop.hbase.filter.SingleColumnValueFilter

import org.apache.hadoop.hbase.filter.CompareFilter

import org.apache.hadoop.hbase.util.Bytes

# Returns all columns

scan 'test',{STARTROW =>'2016081100AA1600011516', STOPROW =>'2016081124ZZ1600011516',LIMIT=>2, FILTER=>SingleColumnValueFilter.new(Bytes.toBytes('cf1'),Bytes.toBytes('id'),CompareFilter::CompareOp.valueOf('EQUAL'),Bytes.toBytes('1299840901201608111600011516'))}

# Returns all columns except the column used in the filter

import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter

scan 'test',{STARTROW =>'2016081100AA1600011516', STOPROW =>'2016081124ZZ1600011516',LIMIT=>2, FILTER=>SingleColumnValueExcludeFilter.new(Bytes.toBytes('cf1'),Bytes.toBytes('id'),CompareFilter::CompareOp.valueOf('EQUAL'),Bytes.toBytes('1299840901201608111600011516'))}

# Create a pre-split table (splits apply to the whole table, not to one column family, hence the separate {})

create 'test',{NAME => 'cf1', VERSIONS => 1},{SPLITS_FILE => 'splits.txt'}

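10. Backup HMaster

There can be zero or more backup HMasters. Their configuration is identical to that of the active HMaster, so it is enough to copy an already configured HMaster installation to the new machine and start it with the same command. Once it is started, HBase shell commands can be run there as well.

11. Access control

11.1. Configuration changes

To enable HBase access control, add the following two properties to hbase-site.xml:

<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

11.2. Managing permissions

Permissions are managed from the HBase shell and can be controlled at two levels, table (Table) and column family (Column Family). superuser is the super user:

11.2.1. Granting permissions

grant <user> <permissions> <table> [ <column family> [ <column qualifier> ] ]

permissions is zero or more of the letters R, W, X, C and A (R: read, W: write, X: execute, C: create, A: admin).

11.2.2. Revoking permissions

revoke <user> <table> [ <column family> [ <column qualifier> ] ]

11.2.3. Changing the owner

alter 'tablename', {OWNER => 'username'}

11.2.4. Viewing permissions

To see which permissions users hold on a table: user_permission <table>.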

12. HBase Web

12.1. Master Web

12.2. Region Web

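This page shows the MemStore and StoreFile information of a Region. A new StoreFile can be seen to appear each time the MemStore reaches 128 MB and is flushed (test below is the table name, with one column family, cf1):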

http://192.168.1.31:8080/region.jsp?name=72d8e0fec34a847d6e8c6ad62e6ec154

Region: test,196401049269744,1506394586723.72d8e0fec34a847d6e8c6ad62e6ec154.

Column Family: cf1

Memstore size (MB): 48

Store Files

4 StoreFile(s) in set.

Store File Size (MB) Modification time

hdfs://hadoop/hbase/data/default/test/72d8e0fec34a847d6e8c6ad62e6ec154/cf1/1d0a9744d1084de7b87629bad9459ac9 1041 Tue Sep 26 10:59:17 CST 2017

hdfs://hadoop/hbase/data/default/test/72d8e0fec34a847d6e8c6ad62e6ec154/cf1/56ab5f353fe94c15b183d4cf0880772a 206 Tue Sep 26 10:59:44 CST 2017

hdfs://hadoop/hbase/data/default/test/72d8e0fec34a847d6e8c6ad62e6ec154/cf1/a1b48379e66846a8b15bf42e830e0711 68 Tue Sep 26 10:59:52 CST 2017

hdfs://hadoop/hbase/data/default/test/72d8e0fec34a847d6e8c6ad62e6ec154/cf1/69595130c92c439ca939f366bcd2a843 68 Tue Sep 26 11:00:02 CST 2017

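13. Operations

13.1. Gracefully restarting a RegionServer

Use the command graceful_stop.sh, in the form:

graceful_stop.sh --restart --reload --debug regionserver_hostname

It first moves all Regions on that RegionServer to other RegionServers and then stops or restarts the RegionServer. Note that the command turns the balancer off and turns it back on when finished: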

Valid region move targets:

hadoop-095,16020,1505315231353

hadoop-092,16020,1505315376659

hadoop-094,16020,1505389352958

hadoop-096,16020,1505459565691

[main] region_mover: Moving 60 region(s) from hadoop-091,16020,1505315285824 on 4 servers using 1 threads.

thread-pool.rb:28] region_mover: Moving region ... (1 of 60) to ... for ...

thread-pool.rb:28] region_mover: Moving region ... from ... to ...

thread-pool.rb:28] region_mover: Moved region ... cost: 0.646

Reloaded hadoop-091 region(s)

Restoring balancer state to true

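13.2. Region balancing

The following command lets the HMaster automatically balance the number of Regions across RegionServers (run it in the HBase shell; false turns balancing off):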

hbase(main):042:0> help "balance_switch"

Enable/Disable balancer. Returns previous balancer state.

Examples:

hbase> balance_switch true

hbase> balance_switch false

Restarting a RegionServer turns the balancer off. The command "balance_switch true" turns it back on and returns the previous balancer state (true or false).

13.3. Splitting a Region

When a Region grows beyond hbase.hregion.max.filesize (10 GB by default), it is automatically split into two Regions.

A Region can also be split on demand. The simplest way is the Split function of the HBase web UI: enter the key of the Region to split. For example, to split the Region named "test,03333333,1467613810867.38b8ef87bbf2f1715998911aafc8c7b3.", enter test,03333333,1467613810867 and click Split.

38b8ef87bbf2f1715998911aafc8c7b3 is the ENCODED name of the Region. It is an MD5 value, namely the result of md5(test,03333333,1467613810867).
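The relationship described above can be sketched in a few lines. This is a hedged illustration that takes the statement in this section at face value (the encoded name is the MD5 of "table,startkey,regionid"); the function name is hypothetical, and the result has not been checked here against the sample value above or against a running cluster.

```python
import hashlib


def encoded_region_name(table, start_key, region_id):
    """MD5 hex digest of 'table,start_key,region_id', per the description above."""
    name = "{},{},{}".format(table, start_key, region_id)
    return hashlib.md5(name.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(encoded_region_name("test", "03333333", "1467613810867"))
```

Comparing the printed digest with the encoded name shown in the HBase web UI is a quick way to confirm whether the assumption holds for a given cluster version.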

In the hbase shell the operation is: split 'regionName', 'splitKey'. Inside the HBase shell, help "split" prints the usage:

hbase(main):041:0> help "split"

Split entire table or pass a region to split individual region. With the

second parameter, you can specify an explicit split key for the region.

Examples:

split 'TABLENAME'

split 'REGIONNAME'

split 'ENCODED_REGIONNAME'

split 'TABLENAME', 'splitKey'

split 'REGIONNAME', 'splitKey'

split 'ENCODED_REGIONNAME', 'splitKey'

13.4. Merging Regions

Pre-splitting can leave some Regions very small or empty. Such Regions can be merged; the HBase shell has a built-in command for this, merge_region.

The HBase shell implements many of its commands by calling Ruby scripts under lib/ruby. The command scripts are all written in Ruby and live in lib/ruby/shell/commands. They cannot be run directly, because they are only Ruby modules for the individual functions; they have to be run from inside the hbase shell. The file name is the command name, and running a command without arguments, or through help, prints its usage, for example:

hbase(main):039:0> help "merge_region"

Merge two regions. Passing 'true' as the optional third parameter will force

a merge ('force' merges regardless else merge will fail unless passed

adjacent regions. 'force' is for expert use only).

You can pass the encoded region name or the full region name. The encoded

region name is the hash suffix on region names: e.g. if the region name were

TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then

the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396

Examples:

hbase> merge_region 'FULL_REGIONNAME', 'FULL_REGIONNAME'

hbase> merge_region 'FULL_REGIONNAME', 'FULL_REGIONNAME', true

hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'

hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true

In fact, the encoded Region name ENCODED_REGIONNAME is an MD5 value. An online merge example:

hbase(main):003:0> merge_region '000d96eef8380430d650c6936b9cef7d','b27a07c88dbbc070f716ee87fab15106'

0 row(s) in 0.0730 seconds

13.5. Viewing the data of a given Region

hbase hfile -p -f /hbase/data/default/test/045ed11a1fdcd8bfa621f1160788a124/cf1/2502aeb799b84dda8340cdf4ad59e1f8

test is the table name, 045ed11a1fdcd8bfa621f1160788a124 is the Region name, and 2502aeb799b84dda8340cdf4ad59e1f8 is the HFile.

Sample output:

K: 471200000000798496948594841700001313/cf1:c1/1493957134206/Put/vlen=28/seqid=0 V: 1214271401201503056013176259

K: 471200000000798496948594841700001313/cf1:c2/1493957134206/Put/vlen=4/seqid=0 V: 110

K: 471200000000798496948594841700001313/cf1:c3/1493957134206/Put/vlen=3/seqid=0 V: ABC

K: 471200000000798496948594841700001313/cf1:c4/1493957134206/Put/vlen=12/seqid=0 V: 16.12.57.1

K: 471200000000798496948594841700001313/cf1:c5/1493957134206/Put/vlen=28/seqid=0 V: 1214271401201503051700001313

K: 471200000000798496948594841700001313/cf1:c6/1493957134206/Put/vlen=0/seqid=0 V:

K: 471200000000798496948594841700001313/cf1:c7/1493957134206/Put/vlen=19/seqid=0 V: 2015-09-05 12:05:15

K: 471200000000798496948594841700001313/cf1:c8/1493957134206/Put/vlen=3/seqid=0 V: bko
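When this output needs further processing, each line can be split into its parts. The parser below is a minimal sketch based only on the line layout visible in the sample above (row/family:qualifier/timestamp/type/vlen/seqid, then the value); the names are hypothetical, and row keys or values with unusual characters may need a stricter approach.

```python
import re

# Layout assumed from the sample output:
# K: <row>/<family>:<qualifier>/<timestamp>/<type>/vlen=<n>/seqid=<n> V: <value>
CELL_LINE = re.compile(
    r"^K: (?P<row>.+?)/(?P<family>[^/:]+):(?P<qualifier>[^/]*)"
    r"/(?P<timestamp>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+)/seqid=(?P<seqid>\d+)"
    r" V:\s?(?P<value>.*)$"
)


def parse_cell_line(line):
    """Parse one 'K: ... V: ...' line into a dict, or return None if it does not match."""
    match = CELL_LINE.match(line.strip())
    if match is None:
        return None
    cell = match.groupdict()
    for field in ("timestamp", "vlen", "seqid"):
        cell[field] = int(cell[field])
    return cell


if __name__ == "__main__":
    sample = ("K: 471200000000798496948594841700001313/cf1:c3/"
              "1493957134206/Put/vlen=3/seqid=0 V: ABC")
    print(parse_cell_line(sample))
```

Lines that do not follow the layout, such as headers or blank lines, yield None and can simply be skipped by the caller.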

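13.6. How to move a Region?

When the regions of a table are unevenly distributed across the RegionServers, they can be moved: enter the hbase shell and run the move command.

The format of the move command is:

move 'ENCODED value of the region to move','full name of the target RegionServer'

Example:

move 'bd118dd803bb4c8f8e28ea87ebec8335','hadoop-391,16020,1499860514933'

The full name of a RegionServer is shown in the web UIs of both the HMaster and the RegionServers.

13.7. How to check Region sizes?

Use the hdfs command. For example, to see the size of each region of the table mytable, run:

hdfs dfs -du hdfs:///hbase/data/default/mytable

The output lists each region; the first column is the region size in bytes: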

306 hdfs:///hbase/data/default/mytable/.tabledesc

0 hdfs:///hbase/data/default/mytable/.tmp

2866280 hdfs:///hbase/data/default/mytable/023b0259adb715e3b4f6abecc1073ef3

2846541 hdfs:///hbase/data/default/mytable/2d2b9c6497e94a1f5227e9e4d26c4dc3

2911682 hdfs:///hbase/data/default/mytable/46e92458ce457ae69650013ca12e2415

2846062 hdfs:///hbase/data/default/mytable/6c1e8b29a5fc941434343dce11648e01

2846135 hdfs:///hbase/data/default/mytable/6c86670d1d3a2b23d9eee836efb959ca

2875079 hdfs:///hbase/data/default/mytable/7ee492fbc245a34a075e961f75e20f85

2846527 hdfs:///hbase/data/default/mytable/8b41e528bd529b123c9ce8d66816a21b

2846194 hdfs:///hbase/data/default/mytable/913f32ee373d382be3062562b87367fe

2846291 hdfs:///hbase/data/default/mytable/b9a1ef3d48f3fa45637473ae82d258c1

2912207 hdfs:///hbase/data/default/mytable/b9f3d37ce1c7a73035a517decde7ffec

2846121 hdfs:///hbase/data/default/mytable/bd118dd803bb4c8f8e28ea87ebec8335

2912197 hdfs:///hbase/data/default/mytable/c2c25cdc9376377c96b9d45768c7d793

2846253 hdfs:///hbase/data/default/mytable/c9e1362c88aeae241a5737989d2c8bf8

2911869 hdfs:///hbase/data/default/mytable/cf4f2bda053820b56db112929e077b74

2913065 hdfs:///hbase/data/default/mytable/e6ed2d434d3e0b8b59f859fdcb615a80

2912094 hdfs:///hbase/data/default/mytable/fec9c2376b0c3466c6f872aef3c3177d
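To spot the largest or smallest regions quickly, the listing can be fed to a short script. The sketch below parses text in the format shown above; the function name is hypothetical, and the sample is an excerpt of the listing above. It takes the first column as the size and the last column as the path, so it should also tolerate listings that carry an extra column between the two.

```python
def region_sizes(du_output):
    """Map region directory name -> size in bytes from 'hdfs dfs -du' output."""
    sizes = {}
    for line in du_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        size, path = int(parts[0]), parts[-1]
        name = path.rstrip("/").rsplit("/", 1)[-1]
        if name.startswith("."):
            continue  # .tabledesc and .tmp are not regions
        sizes[name] = size
    return sizes


# Excerpt of the listing above.
SAMPLE = """\
306 hdfs:///hbase/data/default/mytable/.tabledesc
0 hdfs:///hbase/data/default/mytable/.tmp
2866280 hdfs:///hbase/data/default/mytable/023b0259adb715e3b4f6abecc1073ef3
2846541 hdfs:///hbase/data/default/mytable/2d2b9c6497e94a1f5227e9e4d26c4dc3
2911682 hdfs:///hbase/data/default/mytable/46e92458ce457ae69650013ca12e2415
"""

if __name__ == "__main__":
    sizes = region_sizes(SAMPLE)
    for name in sorted(sizes, key=sizes.get, reverse=True):
        print(name, sizes[name])
```

The region names printed here are the encoded names, so they can be passed directly to commands such as move or merge_region.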

13.8. WAL tool

The HBase WAL is similar to the MySQL binlog. The WAL tool is in the library hbase-server-X.Y.Z.jar and can dump and split WALs. It is run as follows:

$ hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog

Usage: FSHLog <ARGS>

Arguments:

--dump Dump textual representation of passed one or more files

For example: FSHLog --dump hdfs://example.com:9000/hbase/.logs/MACHINE/LOGFILE

--split Split the passed directory of WAL logs

For example: FSHLog --split hdfs://example.com:9000/hbase/.logs/DIR

Example:

$ hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump hdfs:///hbase/WALs/hadoop-203,16020,1493809707539/hadoop-203%2C16020%2C1493809707539.1506389208593

Writer Classes: ProtobufLogWriter

Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec

Sequence=117 , region=f9c1b75be5fcb024a9cf74c47affb52e at write timestamp=Tue Sep 26 09:32:34 CST 2017

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770146, column=cf1:js

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770146, column=cf1:lsud

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770146, column=cf1:at

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770146, column=cf1:rat

Sequence=118 , region=f9c1b75be5fcb024a9cf74c47affb52e at write timestamp=Tue Sep 26 09:32:44 CST 2017

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770160, column=cf1:js

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770160, column=cf1:lsud

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770160, column=cf1:at

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748770160, column=cf1:rat

Sequence=119 , region=f9c1b75be5fcb024a9cf74c47affb52e at write timestamp=Tue Sep 26 09:32:51 CST 2017

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748767888, column=cf1:js

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748767888, column=cf1:lsud

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748767888, column=cf1:at

row=g-FMZffo5E36_sJCNx2BS4jJorpo-616451068748767888, column=cf1:rat

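13.9. Compact and Split

When both a Split and a Compact are needed, compact first and split afterwards. A Major Compact merges all StoreFiles of each column family in a Region into a single StoreFile.

After that, split the Regions that exceed the configured size. Raising hbase.hstore.compactionThreshold reduces how often Compact runs, and a larger hbase.hregion.majorcompaction lowers the frequency of Major Compact.

How to run a Major Compact: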

hbase(main):001:0> major_compact

ERROR: wrong number of arguments (0 for 1)

Here is some help for this command:

Run major compaction on passed table or pass a region row

to major compact an individual region. To compact a single

column family within a region specify the region name

followed by the column family name.

Examples:

Compact all regions in a table:

hbase> major_compact 't1'

hbase> major_compact 'ns1:t1'

Compact an entire region:

hbase> major_compact 'r1'

Compact a single column family within a region:

hbase> major_compact 'r1', 'c1'

Compact a single column family within a table:

hbase> major_compact 't1', 'c1'

1) Example 1: major-compact column family cf2 of the Region whose encoded ID is a6e65f0540bfcd1cb2740bb4b033d134

hbase(main):003:0> major_compact 'a6e65f0540bfcd1cb2740bb4b033d134','cf2'

0 row(s) in 0.1990 seconds
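
The same statement can also be issued from the Linux shell, for example from a cron job. The following is a minimal sketch, not part of HBase: the helper name major_compact_stmt is hypothetical, and it only prints the statement so that it can be piped into the non-interactive HBase shell (hbase shell -n).

```shell
#!/bin/sh
# Hypothetical helper: print a major_compact statement for the HBase shell.
# $1 = table name or encoded region name, $2 = optional column family.
major_compact_stmt() {
    if [ -n "${2:-}" ]; then
        printf "major_compact '%s', '%s'\n" "$1" "$2"
    else
        printf "major_compact '%s'\n" "$1"
    fi
}

# Usage on a running cluster (left commented out on purpose):
# major_compact_stmt a6e65f0540bfcd1cb2740bb4b033d134 cf2 | hbase shell -n
```

Printing the statement first makes it easy to review what will be executed before handing it to the cluster.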

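13.10. Can a RegionServer process be killed directly?

If a RegionServer process is killed directly instead of exiting gracefully through graceful_stop.sh, HMaster moves its Regions to other RegionServers only after the number of milliseconds set by zookeeper.session.timeout has elapsed.

zookeeper.session.timeout can be thought of as the heartbeat timeout between a RegionServer and HMaster. The two do not talk to each other directly for this purpose, though: HMaster learns that a RegionServer has timed out through its ZooKeeper node.

Restarting the RegionServer immediately after the kill also works.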

13.11. Decommissioning a RegionServer

Run graceful_stop.sh on the RegionServer to be decommissioned to stop it. HBase turns off the load balancer during decommissioning, so once it finishes, enter the HBase shell and run: balance_switch true.
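
The two steps can be wrapped in a small script. This is a sketch under assumptions: the function names run and decommission_regionserver are hypothetical, HBASE_HOME defaults to the /data/hbase install directory used in this article, and DRY_RUN=1 only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical wrapper around graceful_stop.sh. With DRY_RUN=1 (the default
# here) the commands are only printed; set DRY_RUN=0 on a real cluster.
HBASE_HOME=${HBASE_HOME:-/data/hbase}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# $1 = hostname of the RegionServer to decommission.
decommission_regionserver() {
    run "$HBASE_HOME/bin/graceful_stop.sh" "$1" || return 1
    # Switch the balancer back on once the RegionServer has stopped.
    run sh -c "echo 'balance_switch true' | '$HBASE_HOME/bin/hbase' shell -n"
}

# Usage: decommission_regionserver hadoop-203
```

Checking the balancer state afterwards is still worthwhile, since some graceful_stop.sh versions may restore it on their own.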

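13.12. RowKey and RegionServer inconsistency

If row keys and RegionServers become inconsistent, "hbase hbck -repair" can fix it; if there are holes, use "hbase hbck -repairHoles". Note that in HBase 2.x the repair options of the bundled hbck are disabled, and the separate HBCK2 tool is used for repairs instead.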

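13.13. Migrating data with snapshots

Snapshots support migrating data across versions, for example from a source HBase cluster running "Hadoop-2.7.3 + HBase-1.2.6" to a target cluster running "Hadoop-3.1.2 + HBase-2.2.1".

The example below migrates the HBase table test from cluster 192.168.31.30 to 192.168.32.30. Besides snapshots, data can also be migrated with copyTable and export/import at the HBase level, or with distcp at the HDFS level. Of these, export/import and copyTable support partial copies restricted to a time range.

Running each tool without arguments prints its help, for example: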

hbase org.apache.hadoop.hbase.mapreduce.Export

hbase org.apache.hadoop.hbase.mapreduce.Import

hbase org.apache.hadoop.hbase.mapreduce.CopyTable

hadoop distcp

13.13.1. Creating a snapshot

This step is done in the HBase shell of the source cluster, where "test" is the table name and "test.snapshot" is the snapshot name.

snapshot 'test','test.snapshot'

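13.13.2. Migrating the snapshot files

This step runs in the Linux shell, on a machine in either the target or the source cluster; running it on the target cluster is recommended. The migration is an MR job (Map only, no Reduce). Use the bandwidth parameter to throttle it, because the migration generates heavy traffic.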

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test.snapshot -copy-from hdfs://192.168.31.30/hbase -copy-to hdfs://192.168.32.30/hbase -mappers 10 -bandwidth 50

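Always set the bandwidth parameter: without a limit, the migration traffic is usually large enough to saturate a 10-gigabit NIC. The Owner of the migrated files in the target cluster can also be specified: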

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -overwrite -snapshot test.snapshot -copy-from hdfs://192.168.31.30/hbase -copy-to hdfs://192.168.32.30/hbase -mappers 10 -bandwidth 30 -chuser hbase -chgroup supergroup

In addition, the "-overwrite" parameter can be added:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -overwrite -snapshot test.snapshot -copy-from hdfs://192.168.31.30/hbase -copy-to hdfs://192.168.32.30/hbase -mappers 10 -bandwidth 50

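If the migration fails with the error "Can't find hfile", increase hbase.master.hfilecleaner.ttl so that it is at least longer than the migration takes.

If ExportSnapshot reports "Operation category READ is not supported in state standby", its "-copy-to" parameter points at the standby NameNode; point it at the active NameNode and run it again.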
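
To pick a value for hbase.master.hfilecleaner.ttl, the migration time can be estimated from the snapshot size and the throttle settings. The sketch below is an illustration only: the function name min_cleaner_ttl_ms is hypothetical, and it assumes that the -bandwidth limit (in MB per second) applies to each mapper separately, which should be verified for the HBase version in use.

```shell
#!/bin/sh
# Hypothetical helper: lower bound, in milliseconds, for
# hbase.master.hfilecleaner.ttl during an ExportSnapshot run.
# $1 = snapshot size in MB, $2 = -mappers value, $3 = -bandwidth value.
# Assumption: the bandwidth limit applies to each mapper separately.
min_cleaner_ttl_ms() {
    rate=$(( $2 * $3 ))                   # aggregate throughput in MB/s
    secs=$(( ($1 + rate - 1) / rate ))    # transfer time, rounded up
    echo $(( secs * 1000 ))
}

# Usage: min_cleaner_ttl_ms 1000000 10 50
```

The result is a best case with every mapper running at full speed, so the configured value should leave generous headroom above it.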

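13.13.3. Restoring the snapshot

Restore by snapshot name. This step is done in the HBase shell of the target cluster.

restore_snapshot 'test.snapshot'

restore_snapshot requires the table to be disabled; otherwise restore through bulkload instead.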

hbase \
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
-Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=1024 \
hdfs://192.168.31.30/hbase/archive/datapath/tablename/filename \
tablename

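The parameter hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily caps the number of hfiles per column family per Region during a Load; the default is 32.

BulkLoad imports files one at a time, so it is considerably more work. During a BulkLoad, consider setting hbase.client.bulk.load.validate.hfile.format to false to speed up loading.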

13.13.4. Deleting a snapshot

delete_snapshot 'test.snapshot'

13.13.5. Listing snapshots

list_snapshots

Appendix 1: Metadata

The directory structure of HBase in ZooKeeper:

[zk: localhost:2181(CONNECTED) 0] ls /hbase

[backup-masters, draining, flush-table-proc, hbaseid, master, master-maintenance, meta-region-server, namespace, online-snapshot, replication, rs, running, splitWAL, switch, table, table-lock]

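Since version 0.96, root-region-server has been replaced by meta-region-server and the original root has been removed. Like the old root, the new meta has exactly one Region and is never split into several.

Version 0.96 also introduced namespaces and removed the -ROOT- table; the former .META. table was replaced by the hbase:meta table, where hbase is the namespace name. A namespace is comparable to a DB name in MySQL and groups tables logically.

Clients do not need to reach the master for DML operations, but DDL operations depend on it, as does list in the hbase shell.

The web UI of the active hbase master shows three system tables: hbase:meta, hbase:namespace and hbase:acl (absent if ACL is not enabled). The metadata of hbase:namespace and hbase:acl is also stored in hbase:meta, which can be observed by running scan 'hbase:meta' in the hbase shell.

hbase(main):015:0* scan 'hbase:meta',{LIMIT=>10}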

hbase:acl,,1460426731436.0bbdf170c309223c0ce830 column=info:regioninfo, timestamp=1460426830411, value={ENCODED => 0bbdf170c309223c0ce830facdff9edd, NAME => 'hbase:acl,,1460426731436.0bbdf

facdff9edd. 170c309223c0ce830facdff9edd.', STARTKEY => '', ENDKEY => ''}

hbase:acl,,1460426731436.0bbdf170c309223c0ce830 column=info:seqnumDuringOpen, timestamp=1461653766642, value=\x00\x00\x00\x00\x00\x00\x002

facdff9edd.

hbase:acl,,1460426731436.0bbdf170c309223c0ce830 column=info:server, timestamp=1461653766642, value=hadoop-034:16020

facdff9edd.

hbase:acl,,1460426731436.0bbdf170c309223c0ce830 column=info:serverstartcode, timestamp=1461653766642, value=1461653610096

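The first column is the Region name. serverstartcode is the start time of the RegionServer process hosting the region, server is the host and port of that RegionServer, and regioninfo has the following structure:

1) ENCODED is the MD5 hash of the Region name

2) NAME is the Region name

3) STARTKEY, when empty, marks the first Region

4) ENDKEY, when also empty, means the table has only one Region.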

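Appendix 2: Basic concepts

1) HBase and HDFS

HBase table data is stored on HDFS. For an HBase table test in namespace abc (the default namespace is named default), the HDFS path of the table is hdfs:///hbase/data/abc/test. Under the table directory, each Region has its own subdirectory, for example: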

$ hdfs dfs -ls hdfs:///hbase/data/abc/test/

Found 7 items

drwxr-xr-x - hadoop supergroup 0 2017-07-19 17:45 hdfs:///hbase/data/abc/test/.tabledesc

drwxr-xr-x - hadoop supergroup 0 2017-07-19 17:45 hdfs:///hbase/data/abc/test/.tmp

drwxr-xr-x - hadoop supergroup 0 2019-05-30 16:07 hdfs:///hbase/data/abc/test/37b03ca897147840c3676bb7d622af2f

drwxr-xr-x - hadoop supergroup 0 2019-05-30 16:07 hdfs:///hbase/data/abc/test/8e52bcdc2b1292fabf6bc8ea8e8be8ba

drwxr-xr-x - hadoop supergroup 0 2019-09-07 09:36 hdfs:///hbase/data/abc/test/ba6fa3def428d0d5e53a17d30c5fa7de

drwxr-xr-x - hadoop supergroup 0 2019-09-07 09:36 hdfs:///hbase/data/abc/test/cdf6d4ff4dc0b6cf1c23e1db133dbfe1

drwxr-xr-x - hadoop supergroup 0 2018-04-17 23:05 hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092

Under a Region directory, each column family has its own subdirectory:

$ hdfs dfs -ls hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092

Found 4 items

-rw-r--r-- 3 hadoop supergroup 99 2018-04-17 23:05 hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092/.regioninfo

drwxr-xr-x - hadoop supergroup 0 2018-04-17 23:06 hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092/.tmp

drwxr-xr-x - hadoop supergroup 0 2018-04-17 23:06 hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092/cf1

drwxr-xr-x - hadoop supergroup 0 2019-10-15 10:56 hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092/recovered.edits

This layout makes it easy to check the size of a table or of a Region with the "hdfs dfs -du" command, for example:

$ hdfs dfs -du -h hdfs:///hbase/data/abc/

4.7 G hdfs:///hbase/data/abc/test

$ hdfs dfs -du -h hdfs:///hbase/data/abc/test

287 hdfs:///hbase/data/abc/test/.tabledesc

0 hdfs:///hbase/data/abc/test/.tmp

167.1 M hdfs:///hbase/data/abc/test/37b03ca897147840c3676bb7d622af2f

137.3 M hdfs:///hbase/data/abc/test/8e52bcdc2b1292fabf6bc8ea8e8be8ba

2.6 G hdfs:///hbase/data/abc/test/ba6fa3def428d0d5e53a17d30c5fa7de

870.1 M hdfs:///hbase/data/abc/test/cdf6d4ff4dc0b6cf1c23e1db133dbfe1

1000.0 M hdfs:///hbase/data/abc/test/fb3b4847d6cf504aea3990859e2b8092
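
When a table has many Regions, it helps to rank them by size. A minimal sketch follows, assuming the listing is produced without -h so that sizes are plain byte counts; the function name largest_regions is hypothetical. It reads the size from the first column and the path from the last one, so it accepts "hdfs dfs -du" output with or without an extra disk-usage column.

```shell
#!/bin/sh
# Hypothetical helper: read `hdfs dfs -du <table-dir>` output on stdin and
# print the N largest Region directories as "<bytes> <encoded-region-name>".
# $1 = number of Regions to print (default 5).
largest_regions() {
    awk '{
        n = split($NF, parts, "/")
        name = parts[n]
        if (name !~ /^\./) print $1, name   # skip .tabledesc and .tmp
    }' | sort -k1,1nr | head -n "${1:-5}"
}

# Usage on a running cluster:
# hdfs dfs -du hdfs:///hbase/data/abc/test | largest_regions 3
```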

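2) Region, MemStore and StoreFile

The unit of load balancing in HBase is the Region. A RegionServer operates on Regions: it loads them into memory and serves reads, writes, splits and merges. A Region is served by only one RegionServer at any moment and can be moved between RegionServers with the move command.

A Region consists of one or more Stores, each holding one column family (Column Family). Each Store in turn consists of one MemStore and zero or more StoreFiles. StoreFiles are stored on HDFS (as HFile files) and the MemStore lives in memory (it can be rebuilt by replaying the WAL log files).

Flushing a MemStore to HDFS produces a new StoreFile. When the number of StoreFiles reaches a threshold, a merge (Compact) is triggered; a Major Compact merges all StoreFiles into a single StoreFile.

If the WAL (Write Ahead Log, similar to MySQL's binlog) is not enabled, a failure before the MemStore is flushed loses the data that was only in the MemStore.

A Major Compact merges all StoreFiles of one column family in a Region into one large StoreFile. Different column families map to different StoreFiles: with one column family there is one StoreFile after the merge, with two column families there are two, and so on.

3) StoreFile files

They can be viewed in the RegionServer web UI, for example: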

http://192.168.1.31:8080/region.jsp?name=78d2bddb0fdc2c735c79da68658f3011

Assuming the Hadoop cluster is named hadoop, the table is test and the Region ID is 78d2bddb0fdc2c735c79da68658f3011, the StoreFile paths may look like this:

hdfs://hadoop/hbase/data/default/test/78d2bddb0fdc2c735c79da68658f3011/cf1/4facea216b15471e9d6d85cf59bd9d8a

hdfs://hadoop/hbase/data/default/test/78d2bddb0fdc2c735c79da68658f3011/cf1/404aea6371cb4722af91a07c10a3fcf3

The size of a StoreFile can be obtained with "hdfs dfs -ls", for example:

$ hdfs dfs -ls hdfs://hadoop/hbase/data/default/test/78d2bddb0fdc2c735c79da68658f3011/cf1/4facea216b15471e9d6d85cf59bd9d8a

-rw-r--r-- 3 zhangsan supergroup 6283978864 2017-09-24 12:47 hdfs://hadoop/hbase/data/default/test/78d2bddb0fdc2c735c79da68658f3011/cf1/4facea216b15471e9d6d85cf59bd9d8a

4) Region name

A Region name identifies a Region. Its format is: table name,StartKey,RegionID, for example:

test,83--G40V6UdCnEHKSKqR_yjJo798594847946710200000795,1461323021820.d4cc7afbc2d6bf3843c121fedf4d696d.

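Here test is the table name, the middle string is the StartKey, and the last part, 1461323021820.d4cc7afbc2d6bf3843c121fedf4d696d., is the Region ID (note that it contains 2 dots): a timestamp followed by the encoded name. For the first Region the StartKey is empty, so the name looks like this: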

t_user,,1461549916081.f4e17b0d99f2d77da44ccb184812c345.
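
The format can be taken apart with plain shell parameter expansion. The sketch below is only an illustration: the function name parse_region_name is hypothetical, and it assumes that the table name contains no comma. It splits the trailing part of the name into its numeric prefix (printed as region_id) and the 32-character hash (printed as encoded).

```shell
#!/bin/sh
# Hypothetical helper: split a Region name into its components.
# $1 = full Region name, including the trailing dot.
parse_region_name() {
    name=$1
    table=${name%%,*}       # text before the first comma
    rest=${name#*,}         # everything after the first comma
    tail=${name##*,}        # text after the last comma: <id>.<hash>.
    startkey=${rest%,*}     # drop the trailing ",<id>.<hash>."
    tail=${tail%.}          # strip the final dot
    printf 'table=%s\nstartkey=%s\nregion_id=%s\nencoded=%s\n' \
        "$table" "$startkey" "${tail%%.*}" "${tail#*.}"
}

# Usage:
# parse_region_name 't_user,,1461549916081.f4e17b0d99f2d77da44ccb184812c345.'
```

Taking the table name from the first comma and the ID from the last one keeps the parsing correct even when the StartKey itself contains commas.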

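Appendix 3: crontab monitoring script

With this approach no password-less login needs to be set up between the Master and Slave nodes, because the processes are started by crontab and start-all.sh, start-dfs.sh, start-yarn.sh, start-hbase.sh and the like never have to be called explicitly.

Download the monitoring tool:

https://github.com/eyjian/libmooon/blob/master/shell/process_monitor.sh

Usage example of the monitoring tool: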

PMONITOR=/usr/local/bin/process_monitor.sh

JAVA=/usr/local/jdk/bin/java

HBASE_DAEMON=/data/hbase/bin/hbase-daemon.sh

# Monitor the HBase Master (start only on the Master)

#* * * * * $PMONITOR "$JAVA -Dproc_master" "$HBASE_DAEMON start master"

# Monitor the HBase RegionServer (start only on RegionServers)

#* * * * * $PMONITOR "$JAVA -Dproc_regionserver" "$HBASE_DAEMON start regionserver"

# Monitor the HBase ThriftServer2 (usually shares machines with the RegionServer)

#* * * * * $PMONITOR "$JAVA -Dproc_thrift2" "$HBASE_DAEMON start thrift2 --framed -threadedselector -s 10 -m 20 -w 20"

# Monitor the HBase RESTServer (usually shares machines with the RegionServer)

#* * * * * $PMONITOR "$JAVA -Dproc_rest" "$HBASE_DAEMON start rest -p 8080"

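Appendix 4: Batch operation tools

Download the batch operation tools:

https://github.com/eyjian/libmooon/releases

mooon_ssh runs commands in batch, and mooon_upload uploads files in batch.

Usage example of the batch operation tools:

# Set environment variables

export H='192.168.1.21,192.168.1.22,192.168.1.23,192.168.1.24,192.168.1.25,192.168.1.26'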

export U=root

export P=password

export PORT=201810

# Upload /etc/hosts and /etc/profile to the hosts listed in H

mooon_upload -s=/etc/hosts,/etc/profile -d=/etc

# Check that /etc/hosts is identical on all hosts

mooon_ssh -c='md5sum /etc/hosts'

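Appendix 5: JVM basics

1) The jps command

If jps does not show the running Java processes, check the Owner and permissions of the directory /tmp/hsperfdata_test (assuming the user name is test). When they are wrong, jps cannot see the running Java processes.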
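
The check can be scripted. This is a sketch under assumptions: the function name check_hsperfdata is hypothetical, "correct" is taken to mean that the directory is owned by the user and is not writable by group or others (the exact rule the JVM applies may differ), and stat -c is the GNU coreutils form found on CentOS.

```shell
#!/bin/sh
# Hypothetical helper: check owner and permissions of a hsperfdata directory.
# $1 = user name (default: current user), $2 = directory (default derived).
check_hsperfdata() {
    user=${1:-$(id -un)}
    dir=${2:-/tmp/hsperfdata_$user}
    [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
    owner=$(stat -c '%U' "$dir")
    [ "$owner" = "$user" ] || { echo "wrong owner: $owner"; return 1; }
    perm=$(stat -c '%a' "$dir")
    # Assumption: any group or other write bit counts as wrong.
    if [ $(( 0$perm & 022 )) -ne 0 ]; then
        echo "too permissive: $perm"; return 1
    fi
    echo "ok: $dir"
}

# Usage: check_hsperfdata test
```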

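2) JVM memory

JVM memory consists of 3 parts:

	Purpose	Notes
Young generation (Young)	Short-lived; generally holds newly created objects	Divided into one Eden area and two Survivor areas, one of which is always empty. Garbage collection of the young generation is called Minor GC: it runs frequently, but each run is short. An object's age grows by 1 for every Minor GC it survives in a Survivor area, and by default it is promoted to the Old generation at age 15.
Old generation (Tenured)	Also called Old; long-lived, mainly holds objects with a long lifetime in the application	Objects are allocated in Eden first, but large objects are allocated directly in Old. Garbage collection of the old generation is called Major GC or Full GC: it runs less often, and each run takes longer.
Permanent generation (Perm)	Stores meta and class information; removed in JDK8	The permanent generation is collected rarely and inefficiently. Because JDK8 has no permanent generation, the JVM parameters "-XX:PermSize" and "-XX:MaxPermSize" are obsolete and are replaced by "-XX:MetaspaceSize" and "-XX:MaxMetaspaceSize" respectively.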

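3) JVM GC (Garbage Collection)

A Minor GC is triggered when Eden runs out of space. A Full GC is triggered when Old runs out of space, when Perm runs out of space, or when the measured average size promoted to Old by Minor GC exceeds the free space left in Old.

Collection strategies
Serial	Serial collector	A single thread handles all garbage collection serially; suited to collecting small amounts of data; enabled with "-XX:+UseSerialGC".
Parallel New Collector	Parallel collector	The multi-threaded version of Serial, collecting with multiple threads in parallel; enabled with "-XX:+UseParallelOldGC"; "-XX:ParallelGCThreads=<N>" sets the number of garbage collection threads.
CMS (Concurrent Mark Sweep)	Concurrent collector, response time first	Concurrent multi-threaded collection; enabled with "-XX:+UseConcMarkSweepGC"; applies to the old generation only.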

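4) Parallel and concurrent

Parallel	Multiple garbage collection threads work in parallel while application threads wait
Concurrent	Garbage collection threads and application threads make progress over the same period; they interleave rather than necessarily run in parallel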

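5) CMS garbage collection in the JVM

CMS (Concurrent Mark Sweep) is a concurrent mark-and-sweep collector that trades CPU throughput for the shortest collection pauses.

CMS does not compact the heap (Heap). That shortens collection pauses but leaves fragmentation and wastes more heap space.

Although CMS is an old generation collector, it still has to scan the young generation.

To enable it, add "-XX:+UseConcMarkSweepGC" to the JVM parameters. This parameter selects CMS for old generation collection; CMS uses the "mark-sweep" algorithm.

CMS runs in the following phases:

1) Initial mark (STW initial mark)

The JVM must be paused, officially called STW (Stop The World). The pause is short. This phase marks only the objects directly reachable from GC ROOT, not those reachable indirectly.

2) Concurrent marking (Concurrent marking)

No pause. Application threads and the marking threads run concurrently. This phase marks all reachable objects: objects reachable through GC ROOT TRACING are live.

3) Concurrent precleaning (Concurrent precleaning)

No pause; this phase is also concurrent.

4) Remark (STW remark)

This phase pauses the JVM again, rescans the objects in the heap and marks the live ones; it includes the Rescan sub-phase.

5) Concurrent sweeping (Concurrent sweeping)

No pause. Application threads and collector threads run concurrently.

6) Concurrent reset (Concurrent reset)

No pause. The CMS data structures are reset, ready for the next collection.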

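6) JVM memory parameters

Running "java -X" prints descriptions of the related parameters. Defaults differ between JDK versions; "java -XX:+PrintFlagsFinal -version | grep HeapSize" shows the defaults.

Parameter	Description	Example
-Xms	Initial Java heap size, the memory allocated when the JVM starts	-Xms256m
-Xmx	Maximum Java heap size, the most memory that can be allocated at run time	-Xmx2048m
-Xss	Java thread stack size, the memory allocated to each thread	-Xss128m

-XX:OnOutOfMemoryError	Action to take when memory runs out	-XX:OnOutOfMemoryError='kill -9 %p'
-XX:+UseConcMarkSweepGC	Selects the concurrent garbage collector	-XX:+UseConcMarkSweepGC
-XX:+UseParallelGC	Selects the parallel garbage collector, which runs on multiple CPUs at once and is a stop-the-world collector (the process is paused during collection, STW); it cannot be combined with the CMS collector. With this parameter the young generation is collected in parallel and the old generation serially.	-XX:+UseParallelGC
-XX:+UseSerialGC	Selects the serial garbage collector, also a stop-the-world collector. With this parameter both the young and the old generation are collected serially.	-XX:+UseSerialGC
-XX:+UseParNewGC	Selects the multi-threaded version of the serial collector, also a stop-the-world collector. With this parameter the young generation is collected in parallel and the old generation still serially.	-XX:+UseParNewGC
-XX:+UseParallelOldGC	Both the young and the old generation use the parallel collector	-XX:+UseParallelOldGC
-XX:+UseConcMarkSweepGC	Old generation collector; pauses (STW) are reduced, but throughput drops.	-XX:+UseConcMarkSweepGC
-XX:ParallelGCThreads	Sets the number of parallel collection threads	-XX:ParallelGCThreads=10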
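
For HBase, such parameters are usually passed through conf/hbase-env.sh. The fragment below is a sketch, not a tuning recommendation: the heap, stack and thread values are illustrative assumptions, and the CMS flag only works on JDK versions that still ship CMS (it was removed in JDK 14).

```shell
# Fragment for conf/hbase-env.sh (values are illustrative, not tuned).
# HBASE_REGIONSERVER_OPTS is appended to the RegionServer JVM command line.
export HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS:-} -Xms4g -Xmx4g -Xss256k -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=10"
```

Setting -Xms and -Xmx to the same value avoids heap resizing at run time; HBASE_MASTER_OPTS can be set the same way for the Master.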


Original article: https://www.cnblogs.com/aquester/p/11824565.html
