High-Availability Clusters: Resource Management with heartbeat and crm (Part 2)

Date: 2015-06-19 21:40:09

Tags: using crm under heartbeat

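I. High-availability clusters: resource management with heartbeat and crm

1. Cluster working models:

A/P: two nodes working in an active/passive (primary/standby) model

N-M: N > M, N nodes running M services

N-N: N nodes running N services

A/A: active/active (dual-primary) model

2. How resources are transferred

rgmanager: failover domain, priority

pacemaker:

Resource stickiness: how strongly a resource prefers to stay on the node it is currently running on

Resource constraints (three types):

Location constraint: which node a resource prefers to run on

inf: positive infinity (the resource always prefers this node while it is available)

-n: a negative score (the resource prefers to avoid this node)

-inf: negative infinity (the resource must never run on this node)

Colocation constraint: how strongly resources tend to run on the same node

inf: the resources must run together

-inf: the resources must never run together

Order constraint: the order in which resources are started and stopped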
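
The interplay of location scores and stickiness described above can be illustrated with a toy model. This is only a minimal sketch, not the real policy engine: the function name `pick_node`, the score tables, and the simplified rule (add stickiness to the current node's location score, treat -inf as a veto) are assumptions made for illustration.

```python
import math

INF = math.inf

def pick_node(location_scores, current_node=None, stickiness=0):
    """Return the node with the highest total score, or None if every node is vetoed.

    location_scores: dict mapping node name -> location score (may be +/-INF).
    current_node:    node the resource runs on now (receives the stickiness bonus).
    stickiness:      how strongly the resource prefers to stay where it is.
    """
    best_node, best_score = None, -INF
    for node in sorted(location_scores):      # sorted for a deterministic tie-break
        score = location_scores[node]
        if score == -INF:                     # -inf: never run on this node
            continue
        if node == current_node:
            score += stickiness
        if score > best_score:
            best_node, best_score = node, score
    return best_node

scores = {"snn.abc.com": 100, "datanode4.abc.com": 50}
print(pick_node(scores))                                         # snn.abc.com
print(pick_node(scores, "datanode4.abc.com", stickiness=200))    # datanode4.abc.com
print(pick_node({"snn.abc.com": -INF, "datanode4.abc.com": 0}))  # datanode4.abc.com
```

In the second call the resource stays on datanode4 because 50 + 200 outweighs the location score of 100 on snn, which is the effect stickiness is meant to have.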

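3. How to keep the three resources of a web service (VIP, httpd and filesystem) on the same node

1. Colocation constraints

2. A resource group

4. What to do with the resources running on a node once that node is no longer a cluster member

stop: stop the resources

ignore: ignore the membership loss and keep running

freeze: keep existing connections but accept no new requests

suicide: kill the server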

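5. Whether a resource starts as soon as it has been configured

This is controlled by target-role.

6. RA (resource agent) types

heartbeat legacy

LSB

OCF

STONITH

7. Resource types

primitive (native): a primary resource, which can run on only one node at a time

group: a resource group

clone: a cloned resource

Defined by the total number of clones and the maximum number of clones each node may run

Typical uses: stonith, cluster filesystem

master/slave: a master/slave resource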
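
The two clone limits mentioned above (total number of clones, and maximum clones per node) can be sketched as a simple round-robin placement. This is an illustrative model only; the function `place_clones` and its parameter names `clone_max` and `clone_node_max` are chosen here to echo those two limits and do not reproduce how the cluster actually allocates instances.

```python
def place_clones(nodes, clone_max, clone_node_max):
    """Spread up to clone_max instances over nodes, at most clone_node_max per node."""
    placement = {node: 0 for node in nodes}
    remaining = clone_max
    for _ in range(clone_node_max):           # one round-robin pass per allowed slot
        for node in nodes:
            if remaining == 0:
                return placement
            placement[node] += 1
            remaining -= 1
    return placement

nodes = ["snn.abc.com", "datanode4.abc.com"]
print(place_clones(nodes, clone_max=2, clone_node_max=1))
# {'snn.abc.com': 1, 'datanode4.abc.com': 1}
print(place_clones(nodes, clone_max=3, clone_node_max=1))
# {'snn.abc.com': 1, 'datanode4.abc.com': 1}  (the third instance cannot be placed)
```

With one clone allowed per node, a STONITH or cluster filesystem clone ends up with exactly one instance on every node, which is the usual reason for using a clone.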

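8. Distributed locks:

/usr/lib64/heartbeat

haresources2cib.py

9. Graphical configuration

ha.cf

crm on

/usr/lib64/heartbeat/ha_propagate: copies the configuration files to the other nodes

10. Installing the GUI

heartbeat v2 uses crm as its cluster resource manager; to enable it, add the following to ha.cf:

crm on

crm listens on 5560/tcp through the integrated mgmtd

On the host where hb_gui will be started, set a password for the hacluster user, then launch the GUI with hb_gui

with quorum: the partition holds the required number of votes

without quorum: the partition does not hold the required number of votes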
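
The "with quorum" and "without quorum" states above can be illustrated with a plain majority rule. This is a minimal sketch under the assumption that quorum means holding more than half of all votes; the function name `has_quorum` is hypothetical, and any special handling a real cluster applies to two-node setups is not modelled.

```python
def has_quorum(votes_held, total_votes):
    """Simple majority rule: quorum requires more than half of all votes."""
    return votes_held * 2 > total_votes

print(has_quorum(2, 3))   # True:  2 of 3 votes is a majority
print(has_quorum(1, 2))   # False: 1 of 2 votes is exactly half, not a majority
```

Under this rule a lone surviving node in a two-node cluster is "without quorum", which is why the policy for resources on a node that has lost quorum matters in small clusters.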

11. Defining a highly available web service

VIP

httpd

from

to: the resource that serves as the base

web service

VIP

httpd

NFS

Note: haresources is incompatible with crm and is not read by crm

II. Configuration

1. ha.cf

[root@snn heartbeat]# vim /etc/ha.d/ha.cf

mcast eth0 225.0.100.19 694 1 0

crm on

[root@snn heartbeat]# /usr/lib64/heartbeat/ha_propagate

Propagating HA configuration files to node datanode4.abc.com.

ha.cf 100% 10KB 10.4KB/s 00:00

authkeys 100% 694 0.7KB/s 00:00

Setting HA startup configuration on node datanode4.abc.com.
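
Before propagating ha.cf it can be useful to confirm that the crm directive is really present. The following is a minimal sketch, assuming ha.cf consists of whitespace-separated `directive arguments` lines with `#` starting a comment; the function `parse_ha_cf` is hypothetical, and the check only recognises the literal `on` value used in this article.

```python
def parse_ha_cf(text):
    """Return a dict mapping each directive name to a list of its argument lists."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and surrounding blanks
        if not line:
            continue
        name, *args = line.split()
        directives.setdefault(name, []).append(args)
    return directives

sample = """\
# multicast heartbeat on eth0
mcast eth0 225.0.100.19 694 1 0
crm on
"""
conf = parse_ha_cf(sample)
crm_enabled = conf.get("crm", [[]])[-1] == ["on"]
print(conf["mcast"])   # [['eth0', '225.0.100.19', '694', '1', '0']]
print(crm_enabled)     # True
```

To check the real file, the same function could be given the contents of /etc/ha.d/ha.cf instead of the sample string.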

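2. Note: haresources is incompatible with crm and is not read by crm, so move it out of the way on both nodes

[root@snn heartbeat]# mv /etc/ha.d/haresources /root

The mv below is run on the datanode4 host

[root@datanode4 ha.d]# mv haresources /root/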

[root@snn heartbeat]# service heartbeat start

logd is already running

Starting High-Availability services:

Done.

[root@snn heartbeat]# ssh datanode4 'service heartbeat start'

logd is already running

Starting High-Availability services:

Done.

3. Check the logs

[root@snn heartbeat]# tail -f /var/log/messages

Jun 19 16:00:29 snn crmd: [2223]: notice: populate_cib_nodes: Node: datanode4.abc.com (uuid: 0862d824-047e-4826-9e26-21a7603f53c8)

Jun 19 16:00:30 snn crmd: [2223]: notice: populate_cib_nodes: Node: snn.abc.com (uuid: 6009ca6a-56eb-4d35-872e-3b8dc0fc9851)

Jun 19 16:00:30 snn crmd: [2223]: info: do_ha_control: Connected to Heartbeat

Jun 19 16:00:30 snn crmd: [2223]: info: do_ccm_control: CCM connection established... waiting for first callback

Jun 19 16:00:30 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected

Jun 19 16:00:30 snn crmd: [2223]: info: crmd_init: Starting crmd's mainloop

Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client snn.abc.com/crmd now has status [online]

Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client datanode4.abc.com/crmd now has status [online]

Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm

Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: datanode4.abc.com

Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: snn.abc.com

Jun 19 16:00:31 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected

Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm

Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:31 snn crmd: [2223]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=5)

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: NEW MEMBERSHIP: trans=5, nodes=2, new=2, lost=0 n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: datanode4.abc.com [nodeid=0, born=3]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: snn.abc.com [nodeid=1, born=5]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW: datanode4.abc.com [nodeid=0, born=3]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW: snn.abc.com [nodeid=1, born=5]

Jun 19 16:00:31 snn crmd: [2223]: info: do_started: The local CRM is operational

Jun 19 16:00:31 snn crmd: [2223]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_CCM_CALLBACK origin=do_started ]

4. Check the cluster status

[root@snn heartbeat]# crm_mon

Refresh in 6s...

============

Last updated: Fri Jun 19 16:11:34 2015

Current DC: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851)

2 Nodes configured.

0 Resources configured.

============

Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online

Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
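
When the status has to be checked from a script rather than read by eye, the node lines of the output above can be parsed. This is a minimal sketch that assumes the exact `Node: name (uuid): state` layout shown in this article; the function `node_states` is hypothetical, and other crm_mon versions may print a different format.

```python
import re

# Matches lines such as "Node: snn.abc.com (6009ca6a-...): online"
NODE_LINE = re.compile(r"^Node: (\S+) \(([0-9a-f-]+)\): (\w+)$")

def node_states(crm_mon_text):
    """Return a dict mapping node name -> reported state."""
    states = {}
    for line in crm_mon_text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            name, _uuid, state = match.groups()
            states[name] = state
    return states

output = """\
Current DC: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851)
2 Nodes configured.
Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online
Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
"""
states = node_states(output)
print(states)
# {'datanode4.abc.com': 'online', 'snn.abc.com': 'online'}
print(all(state == "online" for state in states.values()))   # True
```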

5. The crm command-line tool

[root@snn heartbeat]# crm_sh

/usr/sbin/crm_sh:31: DeprecationWarning: The popen2 module is deprecated. Use the subprocess module.

from popen2 import Popen3

crm # help

Usage: crm (nodes|config|resources)

crm # nodes

crm nodes # help

Usage: nodes (status|list)

crm nodes # list

crm nodes #

6. Installing heartbeat automatically creates a hacluster user, but without a password, so one has to be set

[root@snn heartbeat]# cat /etc/passwd | grep hacluster

hacluster:x:498:498:heartbeat user:/var/lib/heartbeat/cores/hacluster:/sbin/nologin

[root@snn heartbeat]# passwd hacluster

Changing password for user hacluster.

New password:

BAD PASSWORD: it is WAY too short

BAD PASSWORD: is too simple

Retype new password:

passwd: all authentication tokens updated successfully.

7. Run hb_gui directly

[root@snn ~]# hb_gui

Traceback (most recent call last):

File "/usr/bin/hb_gui", line 41, in <module>

import gtk, gtk.glade, gobject

File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 64, in <module>

_init()

File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 52, in _init

_gtk.init_check()

RuntimeError: could not open display

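The traceback above shows that hb_gui could not open an X display.

Downloading and installing Xmanager on the client machine fixes this.

Then run the command again.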
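
The "could not open display" failure can be anticipated by checking for the DISPLAY variable before launching the GUI. This is a minimal sketch; the helper `display_available` is hypothetical, and it only tests whether DISPLAY is set, not whether the X server behind it is actually reachable.

```python
import os

def display_available(environ=None):
    """Return True if an X display is configured in the given environment."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY"))

print(display_available({"DISPLAY": "localhost:10.0"}))   # True
print(display_available({}))                              # False

if not display_available():
    print("DISPLAY is not set; start an X server on the client before running hb_gui")
```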

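III. Defining resources in hb_gui

1. Define the name of the primary resource

2. Continue defining the primary resource

3. Make the two resources run on the same node. There are two ways: (1) define a colocation constraint, (2) define a resource group

(1) Define a colocation constraint

4. Make the snn node the standby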

IV. Defining the resources as a group

web server:

vip: 192.168.1.8

httpd

nfs: 192.168.1.4:/web/htdocs mounted on /var/www/html
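
A resource group keeps its members on one node, starts them in the order they are listed and stops them in reverse. The sketch below models only that ordering rule; the function `group_actions` and the member order `vip, nfs, httpd` are assumptions for illustration (the order is chosen so that the document root is mounted before the web server starts) and are not taken from the configuration used in this article.

```python
def group_actions(members):
    """Return (start_order, stop_order) for an ordered resource group."""
    start_order = list(members)               # started first to last
    stop_order = list(reversed(members))      # stopped last to first
    return start_order, stop_order

start, stop = group_actions(["vip", "nfs", "httpd"])
print(start)   # ['vip', 'nfs', 'httpd']
print(stop)    # ['httpd', 'nfs', 'vip']
```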

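1. Delete the original primary resources

2. Define the group resource

3. httpd fails to start; checking the logs

Judging from the logs, NFS is mounted correctly on host 4, but httpd starts and is then stopped again. I have not found the cause yet; if anyone on the forum knows, please point me in the right direction. Thank you all.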

This article comes from the "散人" blog; please keep this attribution: http://zouqingyun.blog.51cto.com/782246/1663710
