I. Environment
Node roles:
    mon: ceph-node01, ceph-node02, ceph-node03
    osd: ceph-node01, ceph-node02, ceph-node03
    mds: ceph-node01, ceph-node02
Operating system: Ubuntu 14.10
Each OSD host runs one OSD with 15 GB of usable capacity.
II. Test Procedure
1. Check the cluster overview with ceph -s
root@ceph-node01:~# ceph -s
    cluster 9ae8eb40-4f71-49ec-aa77-eda1cb6edbc3
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-node01=192.168.239.161:6789/0,ceph-node02=192.168.239.162:6789/0,ceph-node03=192.168.239.163:6789/0}, election epoch 50, quorum 0,1,2 ceph-node01,ceph-node02,ceph-node03
     mdsmap e19: 1/1/1 up {0=ceph-node01=up:active}, 1 up:standby
     osdmap e24: 3 osds: 3 up, 3 in
      pgmap v202: 192 pgs, 3 pools, 672 kB data, 21 objects
            3204 MB used, 45917 MB / 49122 MB avail
                 192 active+clean
2. Check available space with df
root@ceph-node01:~# df -Pm
Filesystem                                        1048576-blocks  Used Available Capacity Mounted on
/dev/sda1                                                  12277  1474     10804      13% /
none                                                           1     0         1       0% /sys/fs/cgroup
udev                                                        1959     1      1959       1% /dev
tmpfs                                                        394     2       393       1% /run
none                                                           5     0         5       0% /run/lock
none                                                        1970     0      1970       0% /run/shm
none                                                         100     0       100       0% /run/user
/dev/sdb                                                   16374  1067     15308       7% /data
192.168.239.161,192.168.239.162,192.168.239.163:/          49120  3208     45912       7% /mnt/cephfs
The available space reported above is clearly wrong: under Ceph's three-replica policy, the real usable space should be less than 15 GB. Below, this is verified by writing 16 GB of files.
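The expected usable capacity can be sanity-checked with simple arithmetic (a sketch using this cluster's numbers; a replica count of 3 is Ceph's default pool size here):

```shell
#!/bin/bash
# Rough usable-capacity estimate for a replicated Ceph pool.
# Values taken from this cluster: 3 OSDs, ~15 GB each, size=3.
osd_count=3
osd_capacity_gb=15
replica_size=3

raw_gb=$(( osd_count * osd_capacity_gb ))
usable_gb=$(( raw_gb / replica_size ))
echo "Raw capacity:           ${raw_gb} GB"
echo "Usable (size=${replica_size}):       ${usable_gb} GB"
```

With three replicas, every byte written consumes three bytes of raw space, so the 45 GB "avail" shown by df overstates what can actually be stored by roughly a factor of three.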
3. Write files with dd
With the filesystem mounted at /mnt/cephfs, a script generates 8 files with dd, 2 GB each, with the aim of filling the OSDs to capacity.
The script is simple:
#!/bin/bash
count=0
max=8
while [ $count -lt $max ]; do
    printf "Writing test${count}.dat\n"
    dd if=/dev/zero bs=1M count=2048 of=test${count}.dat
    ((count++))
done
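The loop can be dry-run with tiny files first to confirm the logic before pointing it at the cluster (a sketch writing 8 × 1 MB files into a temp directory; `count=$((count+1))` is used instead of `((count++))` so the loop is also safe under `set -e`):

```shell
#!/bin/bash
# Dry-run of the fill script: same loop shape, 1 MB files, temp dir.
dir=$(mktemp -d)
count=0
max=8
while [ $count -lt $max ]; do
    dd if=/dev/zero bs=1M count=1 of="${dir}/test${count}.dat" 2>/dev/null
    count=$((count+1))
done
ls "$dir" | wc -l    # prints 8
rm -rf "$dir"
```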
III. Test Results
While the last file was being written, the ceph-mon processes on all three nodes disappeared from ps. The log files contain the following:
2015-01-03 21:13:55.066943 7f0da98ce700  0 mon.ceph-node01@0(leader).data_health(48) update_stats avail 5% total 16766976 used 15915768 avail 851208
2015-01-03 21:13:55.067245 7f0da98ce700 -1 mon.ceph-node01@0(leader).data_health(48) reached critical levels of available space on local monitor storage -- shutdown!
2015-01-03 21:13:55.067266 7f0da98ce700  0 ** Shutdown via Data Health Service **
2015-01-03 21:13:55.067292 7f0da7ec9700 -1 mon.ceph-node01@0(leader) e1 *** Got Signal Interrupt ***
2015-01-03 21:13:55.067300 7f0da7ec9700  1 mon.ceph-node01@0(leader) e1 shutdown
2015-01-03 21:13:55.067338 7f0da7ec9700  0 quorum service shutdown
2015-01-03 21:13:55.067339 7f0da7ec9700  0 mon.ceph-node01@0(shutdown).health(48) HealthMonitor::service_shutdown 1 services
2015-01-03 21:13:55.067340 7f0da7ec9700  0 quorum service shutdown
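The totals in the first log line confirm the "avail 5%" figure (a quick check; the log values are in KB):

```shell
#!/bin/bash
# Recompute the percentage from the update_stats line in the mon log.
total=16766976    # KB, from the log
avail=851208      # KB, from the log
pct=$(( avail * 100 / total ))
echo "avail = ${pct}%"    # prints: avail = 5%
```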
So the ceph-mon process shut itself down. When a local filesystem is nearly full, it is normally the application that hits an error, so why is ceph-mon designed to kill itself?
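The shutdown is driven by the monitor's data-health check: the settings `mon data avail warn` and `mon data avail crit` (real Ceph options, with defaults of 30% and 5%) control when the monitor raises a warning and when it stops itself to avoid corrupting its local store on a full disk. The "avail 5%" in the log matches the default critical threshold. A sketch of the relevant ceph.conf section, defaults shown:

```ini
[mon]
# Warn when the mon data partition drops below this percent free (default 30)
mon data avail warn = 30
# Shut the monitor down below this percent free (default 5)
mon data avail crit = 5
```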
Original article: http://my.oschina.net/cytan/blog/363302