Symptom of the clock skew fault (see the [WRN] clock skew line at the end of the output):
[root@node5 ~]# ceph -w
cluster b516386f-cb9d-49d5-bf48-07f0dac29e97
health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
monmap e1: 3 mons at {node1=10.240.217.101:6789/0,node4=10.240.217.104:6789/0,node5=10.240.217.105:6789/0}, election epoch 18, quorum 0,1,2 node1,node4,node5
osdmap e63: 3 osds: 2 up, 2 in
pgmap v249: 192 pgs, 3 pools, 0 bytes data, 0 objects
10314 MB used, 2063 GB / 2073 GB avail
192 active+degraded
2014-06-19 10:46:24.736860 mon.0 [WRN] mon.1 10.240.217.104:6789/0 clock skew 0.060021s > max 0.05s
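The warning fires when the skew a monitor measures is larger than the allowed maximum. As a minimal sketch, the two numbers can be pulled out of a warning line and compared as below. The helper names parse_clock_skew and skew_exceeds are hypothetical, not part of Ceph, and the pattern assumes the exact "clock skew Xs > max Ys" wording shown in the log line above.

```shell
# Print "<skew> <max>" (in seconds) for a monitor clock skew warning line,
# or print nothing if the line is not such a warning.
parse_clock_skew() {
    printf '%s\n' "$1" |
        sed -n 's/.*clock skew \([0-9.]*\)s > max \([0-9.]*\)s.*/\1 \2/p'
}

# Exit status 0 if the skew ($1) is greater than the allowed maximum ($2).
skew_exceeds() {
    awk -v s="$1" -v m="$2" 'BEGIN { exit !(s + 0 > m + 0) }'
}

line='2014-06-19 10:46:24.736860 mon.0 [WRN] mon.1 10.240.217.104:6789/0 clock skew 0.060021s > max 0.05s'
set -- $(parse_clock_skew "$line")
if skew_exceeds "$1" "$2"; then
    echo "skew ${1}s exceeds max ${2}s"
fi
```

The comparison is done in awk because plain POSIX shell arithmetic does not handle fractional values.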
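How to resolve the problem above:
Ceph's default allowed clock drift between monitors is 0.05s. This threshold is very tight, so the measured offset between monitors easily exceeds it, as in the 0.060021s reported above. To resolve this,
edit ceph.conf on each monitor node and add the following setting: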
[root@node1 ~]# vi /etc/ceph/ceph.conf
[mon]
mon clock drift allowed = .50
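One way to confirm the setting landed in the [mon] section on each node is to read it back from the file. The sketch below is a hypothetical helper (get_mon_drift_allowed is not a Ceph command). It assumes simple "key = value" lines and treats spaces and underscores in the option name as equivalent, which is how Ceph itself reads option names.

```shell
# Print the value of "mon clock drift allowed" from the [mon] section
# of the given ceph.conf; print nothing if the option is not set there.
get_mon_drift_allowed() {
    awk -F'=' '
        # A section header: remember whether we are now inside [mon].
        /^[ \t]*\[/ { in_mon = ($0 ~ /^[ \t]*\[mon\][ \t]*$/); next }
        in_mon {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
            gsub(/[ \t_]+/, " ", key)          # "a_b" and "a b" compare equal
            if (key == "mon clock drift allowed") {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
            }
        }
    ' "$1"
}

# Usage:
#   get_mon_drift_allowed /etc/ceph/ceph.conf
```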
After making the change, restart the ceph processes:
[root@node1 ~]# service ceph restart
=== mon.node1 ===
=== mon.node1 ===
Stopping Ceph mon.node1 on node1...kill 4723...done
=== mon.node1 ===
Starting Ceph mon.node1 on node1...
Starting ceph-create-keys on node1...
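After the restart, it may be worth checking that the running monitor actually picked up the new value. The following is a sketch only: it assumes the default admin socket path /var/run/ceph/ceph-mon.<id>.asok, that it is run on the monitor's own host, and that the installed Ceph version supports "config get" on the admin socket. The function name show_mon_drift is hypothetical.

```shell
# Ask a running monitor which clock drift limit it is currently using.
# $1 is the monitor id, e.g. node1 for mon.node1.
show_mon_drift() {
    ceph --admin-daemon "/var/run/ceph/ceph-mon.$1.asok" \
        config get mon_clock_drift_allowed
}

# Usage on node1:
#   show_mon_drift node1
```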
For a more detailed treatment, see the official documentation:
http://ceph.com/docs/master/rados/configuration/mon-config-ref/#monitor-store-synchronization
This article comes from the "zhangdh开放空间" blog; please be sure to keep this attribution: http://linuxblind.blog.51cto.com/7616603/1710173