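Create several Linux VMs in VMware Workstation, build an OpenStack environment across them, and then run instance (cloud host) migration experiments.
For example, the following experiment:
Hosts used
Host IP        Hostname  Role
192.168.0.11   YUN-11    Controller node
192.168.0.12   YUN-12    Additional compute node
The steps below use the controller node as the example, but they must be carried out on every host involved in the migration.
1) Passwordless access between nodes for the nova account
1.1) On every node that needs mutual passwordless access, do the following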
# usermod -s /bin/bash nova
# su nova
$ cd
$ ssh-keygen
$ touch .ssh/authorized_keys
1.2) Copy the other nodes' public keys over and append them to the local authorized_keys file
Taking the controller node as the example:
$ scp root@192.168.0.12:/var/lib/nova/.ssh/id_rsa.pub .
$ cat id_rsa.pub >> .ssh/authorized_keys
$ scp root@192.168.0.126:/var/lib/nova/.ssh/id_rsa.pub .
$ cat id_rsa.pub >> .ssh/authorized_keys
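After this, the two additional nodes can reach the controller node as the nova user without a password.
Repeat the same procedure on the other nodes, and eventually the nova user will have passwordless access between all nodes.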
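The key exchange in step 1.2 can be made repeatable. The following is a minimal sketch, assuming a single-line public key file has already been copied over from the peer; `add_pubkey` is a hypothetical helper name, not part of OpenSSH or nova. It appends the key only when it is not already present and tightens the permissions on the directory and file.

```shell
#!/usr/bin/env bash
# add_pubkey PUBKEY_FILE SSH_DIR
# Hypothetical helper: append a single-line public key to SSH_DIR/authorized_keys
# unless that exact line is already there, then restrict permissions.
add_pubkey() {
    local pubkey_file="$1" ssh_dir="$2"
    local auth="$ssh_dir/authorized_keys"
    local key
    key="$(cat "$pubkey_file")"
    mkdir -p "$ssh_dir"
    touch "$auth"
    # -x whole-line match, -F literal string: skip the append if the key exists
    if ! grep -qxF -- "$key" "$auth"; then
        printf '%s\n' "$key" >> "$auth"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth"
}
```

As the nova user, this could be called as `add_pubkey ./id_rsa.pub ~/.ssh` after each scp, so that running the step twice does not duplicate entries.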
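2) [Optional, just verify] Guides found online change these settings, but this cluster keeps the defaults
To allow setting the root password from the Dashboard: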
inject_password=true
To allow resizing an instance on the same host, without migrating it:
allow_resize_to_same_host=true
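(Optional)
To skip manual confirmation after a migration or resize: the value 1 gives you 1 second to confirm, after which the operation is confirmed automatically.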
resize_confirm_window=1
Restart the service
service openstack-nova-compute restart
3) Live migration (block migration)
3.1) Edit nova.conf on all nodes
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE
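This enables live migration.
3.2) [Just verify; the system defaults are kept here as well]
Next, configure passwordless virsh connections by editing /etc/libvirt/libvirtd.conf
Uncomment: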
listen_tls = 0
listen_tcp = 1
Uncomment and change the value:
auth_tcp = "none" # Must be set to none here; otherwise authentication is required.
Test it:
virsh --connect qemu+tcp://192.168.0.12/system list
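If all virtual machines are listed without a prompt for a username and password, the configuration works.
Restart the nova-compute and libvirt-bin services on all compute nodes.
Migration can now be done with the nova client. For example, to migrate vm1 from the test host to YUN-12: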
nova live-migration --block-migrate vm1 YUN-12
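Note that the --block-migrate option is required; without it the migration defaults to the shared-storage method. In addition, /etc/hosts on the controller node must resolve the hostnames to their IPs.
4) Test the migration
Before testing, turn off the firewall on both VMs.
Testing migration on VMs differs from testing on physical machines: on VMs the steps above are enough to complete the migration and the cloud platform as a whole keeps working, whereas physical machines need extra configuration, because the firewall on a physical machine cannot be turned off.
Another difference is resizing instances: in the VM environment resources can be both added to and removed from an instance, while in the physical environment resources can only be added.
On an OpenStack platform running on physical machines, once the firewall on the controller node is turned off, no compute node can create instances. The firewall then has to be turned back on and the physical machine rebooted, and the required rules added to the firewall configuration file instead.
SELinux also needs to be disabled; edit /etc/sysconfig/selinux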
SELINUX=enforcing
to
SELINUX=disabled
In the VM environment the firewall only needs the following configuration; edit /etc/sysconfig/iptables
Add:
-A INPUT -p tcp -m multiport --dports 16509 -m comment --comment "libvirt" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 49152:49216 -m comment --comment "migration" -j ACCEPT
On physical machines, however, the following changes to the firewall configuration file are needed.
State before the change
YUN-11 firewall
-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 5900:5999,16509 -m comment --comment "001 nova compute incoming nova_compute" -j ACCEPT
-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.11_192.168.0.11" -j ACCEPT
YUN-12 firewall
-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 5900:5999,16509 -m comment --comment "001 nova compute incoming nova_compute" -j ACCEPT
-A INPUT -s 192.168.0.12/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.12_192.168.0.12" -j ACCEPT
Changes
Add to the YUN-11 firewall configuration:
-A INPUT -s 192.168.0.12/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.11_192.168.0.12" -j ACCEPT
Add to the YUN-12 firewall configuration:
-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.12_192.168.0.11" -j ACCEPT
Following the example above, add rules for any other physical machines according to the actual environment.
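With more than two physical hosts, writing one rule per peer by hand becomes error-prone. The sketch below prints the lines a given host would need, one for every other host in the list. `migration_rules` is a hypothetical helper; the port list and comment text are copied from the rules above, and the output is meant to be reviewed before it is pasted into /etc/sysconfig/iptables.

```shell
#!/usr/bin/env bash
# migration_rules SELF_IP PEER_IP...
# Hypothetical helper: print one iptables line per peer (skipping SELF_IP) that
# admits libvirt (16509) and the QEMU migration port range (49152:49215).
migration_rules() {
    local self="$1"; shift
    local peer
    for peer in "$@"; do
        [ "$peer" = "$self" ] && continue
        printf '%s\n' "-A INPUT -s ${peer}/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment \"001 nova qemu migration incoming nova_qemu_migration_${self}_${peer}\" -j ACCEPT"
    done
}

# Example: the rule to add on YUN-11 in the two-host setup described above
migration_rules 192.168.0.11 192.168.0.11 192.168.0.12
```

Passing the full host list every time, including the host itself, keeps the call identical on each machine except for the first argument.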
View the image information
[root@YUN-11 51222f9c-5074-440d-92c6-fccaeadc8032_resize(keystone_admin)]# qemu-img info disk
image: disk
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 868K
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/87ae9a3ca6476837c0cb656bd99ee1dcca238134
[root@YUN-11 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# ll
total 1508
-rw-rw----. 1 root root 16618 Apr 23 10:01 console.log
-rw-r--r--. 1 root root 1572864 Apr 23 14:08 disk
-rw-r--r--. 1 nova nova 79 Apr 23 10:01 disk.info
-rw-r--r--. 1 nova nova 1634 Apr 23 10:01 libvirt.xml
[root@YUN-11 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# chmod o+r console.log
[root@YUN-11 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# ll
total 1508
-rw-rw-r--. 1 root root 16618 Apr 23 10:01 console.log
-rw-r--r--. 1 root root 1572864 Apr 23 14:08 disk
-rw-r--r--. 1 nova nova 79 Apr 23 10:01 disk.info
-rw-r--r--. 1 nova nova 1634 Apr 23 10:01 libvirt.xml
# su nova
[nova@YUN-11 instances(keystone_admin)]$ cp -r 51222f9c-5074-440d-92c6-fccaeadc8032 51222f9c-5074-440d-92c6-fccaeadc8032_resize
[nova@YUN-11 instances(keystone_admin)]$ cd 51222f9c-5074-440d-92c6-fccaeadc8032_resize
[nova@YUN-11 51222f9c-5074-440d-92c6-fccaeadc8032_resize(keystone_admin)]$ ls -la
total 900
drwxr-xr-x. 2 nova nova 4096 Apr 23 16:35 .
drwxr-xr-x. 8 nova nova 4096 Apr 23 16:35 ..
-rw-r--r--. 1 nova nova 16618 Apr 23 16:35 console.log
-rw-r--r--. 1 nova nova 1572864 Apr 23 16:35 disk
-rw-r--r--. 1 nova nova 79 Apr 23 16:35 disk.info
-rw-r--r--. 1 nova nova 1634 Apr 23 16:35 libvirt.xml
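The final result of this procedure does not seem to have been recorded; it remains to be tested later.
Another cold migration method works much like moving a virtual machine in VMware Workstation.
Instance preparation:
Choose an instance running Linux. Inside the system,
create directories and edit files, then after the migration check that the directories and modified files are intact.
Shut down the instance before migrating it.
View the files in the instance's directory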
[root@YUN-17 2dccde39-31a4-48d5-8f62-0f963ffec481_copy]# ll
total 6896
-rw-r-----. 1 root root 1 Apr 30 10:18 console.log
-rw-r--r--. 1 root root 7536640 Apr 30 10:18 disk
-rw-r--r--. 1 root root 79 Apr 30 10:18 disk.info
-rw-r--r--. 1 root root 1635 Apr 30 10:18 libvirt.xml
[root@YUN-17 2dccde39-31a4-48d5-8f62-0f963ffec481_copy]# qemu-img convert -O qcow disk disk4
Copy the image disk4 to YUN-11 (YUN-11 is the controller node of the cluster that YUN-17 belongs to)
Add the image
[root@YUN-11 ~(keystone_admin)]# glance add name=test-26 is_public=true container_format=bare disk_format=raw < /root/disk4
Added new image with ID: 3573cf89-7697-48cd-b07c-51344f416156
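Launch an instance from the test-26 image in the dashboard.
It starts successfully. Entering the system's console shows that the directories and files in the instance are intact.
If the instance cannot be pinged after a floating IP is associated, as in the following case:
An instance launched from the test-27 image could not be pinged from external machines after a floating IP was associated.
Solution:
Inside the instance, the NIC is eth1 but the NIC configuration file is ifcfg-eth0. The file contains no MAC or IP information, only BOOTPROTO=dhcp.
Compare with a healthy instance:
In a healthy instance the NIC is eth0 and the configuration file is ifcfg-eth0; that file also contains no MAC or IP information, only BOOTPROTO=dhcp.
Also, instances using the cirros image did not show this problem after migration: once a floating IP was associated, external machines could ping them.
Fix applied
On the test-27 instance, rename the NIC configuration file: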
mv ifcfg-eth0 ifcfg-eth1
Edit the parameter in the file
vi ifcfg-eth1
DEVICE=eth0
to
DEVICE=eth1
Save the change and restart the network.
External machines can now ping the instance.
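The rename-and-edit fix can be expressed as one small function. This is a sketch only: `rename_ifcfg` is a hypothetical helper, it assumes GNU sed (`-i`), and it assumes the configuration files live in a directory such as /etc/sysconfig/network-scripts. On CentOS 6 style guests the eth1 name often comes from /etc/udev/rules.d/70-persistent-net.rules still remembering the MAC address of the original instance; that cause is an assumption here, since the guest OS is not recorded above.

```shell
#!/usr/bin/env bash
# rename_ifcfg DIR OLD NEW
# Hypothetical helper: move DIR/ifcfg-OLD to DIR/ifcfg-NEW and point its
# DEVICE= line at NEW. Requires GNU sed for in-place editing.
rename_ifcfg() {
    local dir="$1" old="$2" new="$3"
    mv "$dir/ifcfg-$old" "$dir/ifcfg-$new"
    # Replace the whole DEVICE line, whatever value it had before
    sed -i "s/^DEVICE=.*/DEVICE=$new/" "$dir/ifcfg-$new"
}
```

Inside the affected instance this would be run as root, for example `rename_ifcfg /etc/sysconfig/network-scripts eth0 eth1`, followed by a network restart.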
The kernel upgrade steps are as follows
Check the current kernel version
[root@YUN-15 ~]# uname -r
2.6.32-431.el6.x86_64
Upload the kernel source package linux-3.19.3.tar.xz
Extract it
# tar -Jxvf linux-3.19.3.tar.xz -C /usr/src
Install the required packages
# yum install -y gcc
# yum install -y ncurses ncurses-devel
# yum install -y bc
Adjust the configuration
# make menuconfig
Build and install
# make
# make modules_install install
Errors:
sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \
System.map "/boot"
ERROR: modinfo: could not find module ipt_MASQUERADE
ERROR: modinfo: could not find module iptable_nat
ERROR: modinfo: could not find module crc_t10dif
ERROR: modinfo: could not find module scsi_tgt
Change the default boot entry
# vi /boot/grub/grub.conf
default=1
to
default=0
# reboot
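The OpenStack environment already existed before the kernel upgrade.
Symptoms after the system came back up:
Instances fail to start
Resetting the state fails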
[root@YUN-11 ~(keystone_admin)]# nova reset-state test
[root@YUN-11 ~(keystone_admin)]# nova stop test
[root@YUN-11 ~(keystone_admin)]# nova start test
ERROR: Instance 93efe724-7288-4269-92d7-0346a00a724a in vm_state error. Cannot start while the instance is in this state. (HTTP 409) (Request-ID: req-d080c99c-e18a-4013-b677-d0ac98bf4575)
Creating an instance fails
[root@YUN-11 ~]# nova boot --image test-mini --flavor 1 test-1 --availability-zone nova:YUN-15 --nic net-id=e49ae481-4
ERROR: You must provide a username via either --os-username or env[OS_USERNAME]
In the dashboard under "Admin" > "Host Aggregates", the services on YUN-15 are shown as down.
On host YUN-15:
# service openstack-nova-compute restart
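Create the instance again
Error message:
Error: Failed to create instance "test-1": Please try again later [Error: Unexpected vif_type=binding_failed].
Use a command to compare the services on the faulty host YUN-15 with those on a healthy host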
[root@YUN-14 ~]# openstack-status
== Nova services ==
openstack-nova-api: dead (disabled on boot)
openstack-nova-compute: active
openstack-nova-network: dead (disabled on boot)
openstack-nova-scheduler: dead (disabled on boot)
== neutron services ==
neutron-server: inactive (disabled on boot)
neutron-dhcp-agent: inactive (disabled on boot)
neutron-l3-agent: inactive (disabled on boot)
neutron-metadata-agent: inactive (disabled on boot)
neutron-lbaas-agent: inactive (disabled on boot)
neutron-openvswitch-agent: active
== Ceilometer services ==
openstack-ceilometer-api: dead (disabled on boot)
openstack-ceilometer-central: dead (disabled on boot)
openstack-ceilometer-compute: active
openstack-ceilometer-collector: dead (disabled on boot)
== Support services ==
libvirtd: active
openvswitch: active
messagebus: active
Warning novarc not sourced
[root@YUN-15 ~]# openstack-status
== Nova services ==
openstack-nova-api: dead (disabled on boot)
openstack-nova-compute: active
openstack-nova-network: dead (disabled on boot)
openstack-nova-scheduler: dead (disabled on boot)
== neutron services ==
neutron-server: inactive (disabled on boot)
neutron-dhcp-agent: inactive (disabled on boot)
neutron-l3-agent: inactive (disabled on boot)
neutron-metadata-agent: inactive (disabled on boot)
neutron-lbaas-agent: inactive (disabled on boot)
neutron-openvswitch-agent: dead
== Ceilometer services ==
openstack-ceilometer-api: dead (disabled on boot)
openstack-ceilometer-central: dead (disabled on boot)
openstack-ceilometer-compute: active
openstack-ceilometer-collector: dead (disabled on boot)
== Support services ==
libvirtd: active
openvswitch: dead
messagebus: active
Warning novarc not sourced
The difference between YUN-15 and YUN-14 is the state of the openvswitch and neutron-openvswitch-agent services: on YUN-15 they are stopped.
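Spotting the two differing lines by eye in two long listings is easy to get wrong. A minimal sketch of an automated comparison follows, assuming the output of `openstack-status` on each host has been saved to a text file; `compare_status` is a hypothetical helper that relies only on awk.

```shell
#!/usr/bin/env bash
# compare_status FILE_A FILE_B
# Hypothetical helper: print every "service: state" entry whose state differs
# between two saved openstack-status outputs (A is the healthy reference).
compare_status() {
    awk -F': *' '
        NR == FNR { state[$1] = $2; next }   # first file: remember each state
        ($1 in state) && state[$1] != $2 {   # second file: report mismatches
            printf "%s: %s -> %s\n", $1, state[$1], $2
        }
    ' "$1" "$2"
}
```

For instance, after `ssh YUN-14 openstack-status > yun14.txt` and `ssh YUN-15 openstack-status > yun15.txt`, running `compare_status yun14.txt yun15.txt` would list only the services whose state changed, with the reference state on the left.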
Restart the services
[root@YUN-15 ~]# service openvswitch restart
Killing ovsdb-server (4862) [ OK ]
Starting ovsdb-server [ OK ]
Configuring Open vSwitch system IDs [ OK ]
Starting ovs-vswitchd [ OK ]
Enabling remote OVSDB managers [ OK ]
[root@YUN-15 ~]# service neutron-openvswitch-agent restart
Stopping neutron-openvswitch-agent: [FAILED]
Starting neutron-openvswitch-agent: [ OK ]
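Creating an instance on YUN-15 again, the instance shows its assigned address but stays in the building state indefinitely.
On another physical machine
The procedure is basically the same as above; only the following step differs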
# make menuconfig
When selecting the "IPv4" modules, ipt_MASQUERADE underneath was left unchecked.
Errors during build and install
# make modules_install install
sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \
System.map "/boot"
ERROR: modinfo: could not find module crc_t10dif
ERROR: modinfo: could not find module scsi_tgt
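After the reboot the physical machine hung.
A second reboot brought the system up successfully.
Add this machine, now running the upgraded kernel, as an OpenStack compute node
The extension fails
Error message: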
192.168.0.11_neutron.pp: [ DONE ]
192.168.0.16_neutron.pp: [ ERROR ]
Applying Puppet manifests [ ERROR ]
ERROR : Error appeared during Puppet run: 192.168.0.16_neutron.pp
Error: sysctl -p /etc/sysctl.conf returned 255 instead of one of [0]
You will find full trace in log /var/tmp/packstack/20150421-110210-5TXrme/manifests/192.168.0.16_neutron.pp.log
Please check log file /var/tmp/packstack/20150421-110210-5TXrme/openstack-setup.log for more information
Check the log file /var/tmp/packstack/20150421-110210-5TXrme/openstack-setup.log
2015-04-21 11:09:22::ERROR::run_setup::921::root:: Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 916, in main
_main(confFile)
File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 605, in _main
runSequences()
File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 584, in runSequences
controller.runAllSequences()
File "/usr/lib/python2.6/site-packages/packstack/installer/setup_controller.py", line 68, in runAllSequences
sequence.run(config=self.CONF, messages=self.MESSAGES)
File "/usr/lib/python2.6/site-packages/packstack/installer/core/sequences.py", line 98, in run
step.run(config=config, messages=messages)
File "/usr/lib/python2.6/site-packages/packstack/installer/core/sequences.py", line 44, in run
raise SequenceError(str(ex))
SequenceError: Error appeared during Puppet run: 192.168.0.16_neutron.pp
Error: sysctl -p /etc/sysctl.conf returned 255 instead of one of [0]^[[0m
You will find full trace in log /var/tmp/packstack/20150421-110210-5TXrme/manifests/192.168.0.16_neutron.pp.log
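Based on information found online, the preliminary conclusion is that the kernel is missing modules (a kernel build problem).
This raises the question of the order between building the kernel and adding the node.
Previously, on YUN-15, the node was added first and the kernel upgraded afterwards, and instances would not start once the node was in place. This time the kernel was upgraded first and the node added afterwards, and the node could not be added because the bridge module was missing.
1) Check the kernel version before the upgrade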
[root@YUN-17 ~]# uname -r
2.6.32-431.el6.x86_64
2) Prepare the environment
# yum install -y xz
# tar -Jxvf linux-3.19.3.tar.xz -C /usr/src (the Linux version chosen here is linux-3.19.3.tar.xz)
# yum install -y hmaccalc zlib-devel binutils-devel elfutils-libelf-devel
# yum install -y bc
# cd /usr/src/linux-3.19.3
# yum groupinstall -y "Development Tools"
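3) Kernel configuration file for the upgrade
The old kernel's configuration is reused here, because editing the kernel configuration by hand leads to missing-module errors at build and install time.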
[root@YUN-17 linux-3.19.3]# cp /boot/config-2.6.32-431.el6.x86_64 .config
# sh -c 'yes "" | make oldconfig'
# make oldconfig
# make
# make modules_install install
sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \
System.map "/boot"
ERROR: modinfo: could not find module crc_t10dif
ERROR: modinfo: could not find module scsi_tgt
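The errors shown here can be ignored.
4) Add YUN-17 as a node and create an instance
The instance is created successfully
and runs normally.
Notes:
Mind the order of the kernel upgrade and the node extension: if the node is added first and the kernel upgraded afterwards, the compute node misbehaves and can no longer create instances.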
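Since the failures above trace back to options missing from the kernel configuration, it can help to check the .config before spending time on the build. The sketch below is a hypothetical helper, `missing_kconfig`, that prints every listed option not enabled as built-in or module. Which options an OpenStack compute node actually needs is an assumption here; the symbols in the usage note are examples and not a complete list.

```shell
#!/usr/bin/env bash
# missing_kconfig CONFIG_FILE SYMBOL...
# Hypothetical helper: print each SYMBOL that is not set to y or m in CONFIG_FILE.
missing_kconfig() {
    local config="$1"; shift
    local sym
    for sym in "$@"; do
        # An enabled option appears as SYMBOL=y (built in) or SYMBOL=m (module)
        grep -qE "^${sym}=(y|m)\$" "$config" || printf '%s\n' "$sym"
    done
}
```

In the source tree this might be called as `missing_kconfig .config CONFIG_BRIDGE CONFIG_OPENVSWITCH CONFIG_IP_NF_TARGET_MASQUERADE`; empty output means all of the listed options are enabled.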
On the deploy node
192.168.1.200 admin-node
192.168.1.201 node1
192.168.1.202 node2
192.168.1.203 node3
2.4.3 Configure the local YUM repositories
163, ceph and EPEL repositories
[ceph]
name=Ceph packages for $basearch
gpgkey=http://192.168.1.199/ceph.com/release.asc
enabled=1
baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/$basearch
priority=1
gpgcheck=1
type=rpm-md
[ceph-source]
name=Ceph source packages
gpgkey=http://192.168.1.199/ceph.com/release.asc
enabled=1
baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/SRPMS
priority=1
gpgcheck=1
type=rpm-md
[ceph-noarch]
name=Ceph noarch packages
gpgkey=http://192.168.1.199/ceph.com/release.asc
enabled=1
baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/noarch
priority=1
gpgcheck=1
type=rpm-md
Do this on every node
[root@admin-node ~]# useradd -d /home/ceph -m ceph
[root@admin-node ~]# passwd ceph
[root@admin-node ~]# echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
ceph ALL = (root) NOPASSWD:ALL
[root@admin-node ~]# chmod 0440 /etc/sudoers.d/ceph
[root@admin-node ~]# su ceph
[ceph@admin-node root]$ cd
[ceph@admin-node ~]$ ssh-keygen
[ceph@admin-node ~]$ ssh-copy-id ceph@node1
bash: ssh-copy-id: command not found
[ceph@admin-node ~]$ sudo yum install openssh-clients -y
[ceph@admin-node ~]$ ssh-copy-id ceph@node1
[ceph@admin-node ~]$ ssh-copy-id ceph@node2
[ceph@admin-node ~]$ ssh-copy-id ceph@node3
Test
[ceph@admin-node ~]$ ssh ceph@node1
[ceph@node1 ~]$ exit
logout
Connection to node1 closed.
[ceph@admin-node ~]$ ssh ceph@node2
[ceph@node2 ~]$ exit
logout
Connection to node2 closed.
[ceph@admin-node ~]$ ssh ceph@node3
[ceph@node3 ~]$ exit
logout
Connection to node3 closed.
[ceph@admin-node ~]$ pwd
/home/ceph
[ceph@admin-node ~]$ vi .ssh/config
Host node1
Hostname node1
User ceph
Host node2
Hostname node2
User ceph
Host node3
Hostname node3
User ceph
Verifying again fails
[ceph@admin-node ~]$ ssh ceph@node1
Bad owner or permissions on /home/ceph/.ssh/config
Fixing the permissions solves the problem
[ceph@admin-node .ssh]$ ll
total 16
-rw-rw-r--. 1 ceph ceph 135 Mar 2 18:47 config
-rw-------. 1 ceph ceph 1675 Mar 2 18:31 id_rsa
-rw-r--r--. 1 ceph ceph 397 Mar 2 18:31 id_rsa.pub
-rw-r--r--. 1 ceph ceph 1203 Mar 2 18:42 known_hosts
[ceph@admin-node .ssh]$ ssh ceph@node1
Bad owner or permissions on /home/ceph/.ssh/config
[ceph@admin-node .ssh]$ chmod 600 *
[ceph@admin-node .ssh]$ ll
total 16
-rw-------. 1 ceph ceph 135 Mar 2 18:47 config
-rw-------. 1 ceph ceph 1675 Mar 2 18:31 id_rsa
-rw-------. 1 ceph ceph 397 Mar 2 18:31 id_rsa.pub
-rw-------. 1 ceph ceph 1203 Mar 2 18:42 known_hosts
[ceph@admin-node .ssh]$ ssh ceph@node1
Last login: Mon Mar 2 20:40:38 2015 from 192.168.1.120
[ceph@node1 ~]$ exit
logout
Connection to node1 closed.
[ceph@node1 ~]$ ifconfig
eth2 Link encap:Ethernet HWaddr 00:0C:29:59:C7:57
inet addr:192.168.1.201 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe59:c757/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:45308 errors:0 dropped:0 overruns:0 frame:0
TX packets:7103 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:59108143 (56.3 MiB) TX bytes:648018 (632.8 KiB)
eth3 Link encap:Ethernet HWaddr 00:0C:29:59:C7:61
inet addr:172.16.1.201 Bcast:172.16.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe59:c761/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2317 errors:0 dropped:0 overruns:0 frame:0
TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:219673 (214.5 KiB) TX bytes:538 (538.0 b)
The NICs are the same on every node
mon port
[ceph@admin-node .ssh]$ sudo iptables -A INPUT -i eth2 -p tcp -s 192.168.1.201/24 --dport 6789 -j ACCEPT
osd ports
[ceph@admin-node .ssh]$ sudo iptables -A INPUT -i eth2 -m multiport -p tcp -s 192.168.1.202/24 --dports 6800:7100 -j ACCEPT
[ceph@admin-node .ssh]$ sudo iptables -A INPUT -i eth2 -m multiport -p tcp -s 192.168.1.203/24 --dports 6800:7100 -j ACCEPT
TTY
sudo visudo
Defaults requiretty
to
Defaults:ceph !requiretty
selinux
[ceph@admin-node .ssh]$ sudo setenforce 0
[ceph@admin-node ~]$ cd my-cluster/
[ceph@admin-node my-cluster]$ ceph-deploy new node1
Error:
[ceph_deploy][ERROR ] RuntimeError: remote connection got closed, ensure ``requiretty`` is disabled for node1
Error in sys.exitfunc:
Solution: on each of node1, node2 and node3, run sudo visudo as the ceph user and change
Defaults requiretty to Defaults:ceph !requiretty
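Editing sudoers interactively on three nodes is easy to get wrong. One alternative, offered here as an assumption and not as what was done above, is a drop-in file under /etc/sudoers.d, the same directory already used for the ceph NOPASSWD entry. `write_requiretty_dropin` is a hypothetical helper, and the drop-in only takes effect if the main sudoers file includes that directory.

```shell
#!/usr/bin/env bash
# write_requiretty_dropin USER FILE
# Hypothetical helper: write a sudoers fragment that exempts USER from
# requiretty, with the 0440 mode sudo expects for its configuration files.
write_requiretty_dropin() {
    local user="$1" file="$2"
    printf 'Defaults:%s !requiretty\n' "$user" > "$file"
    chmod 0440 "$file"
}
```

On each node this could be run as root with a path such as /etc/sudoers.d/ceph-notty (an example name), and the result checked with `visudo -cf` on that file before relying on it.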
Remove the configuration
[ceph@admin-node my-cluster]$ ceph-deploy purgedata node1
[ceph@admin-node my-cluster]$ ceph-deploy forgetkeys
[ceph@admin-node my-cluster]$ ceph-deploy purge node1
[ceph@admin-node my-cluster]$ ceph-deploy new node1
Generated files:
ceph.conf ceph.log ceph.mon.keyring
vi ceph.conf
osd pool default size = 2
[ceph@admin-node my-cluster]$ ceph-deploy install node1 node2 node3
Error:
[node1][INFO ] Running command: sudo rpm --import https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
[node1][WARNIN] curl: (6) Couldn't resolve host 'ceph.com'
[node1][WARNIN] error: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: import read failed(2).
[node1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm --import https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
Solution:
Find the files that contain the URL
[root@admin-node my-cluster]# find / -type f -name "*.py" | xargs grep "https://ceph.com/git"
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/fedora/install.py: "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/fedora/install.py: "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key),
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py: "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py: "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key),
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/debian/install.py: 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc'.format(key=key),
/usr/lib/python2.6/site-packages/ceph_deploy/conf/cephdeploy.py:# gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc
/usr/lib/python2.6/site-packages/ceph_deploy/conf/cephdeploy.py:# gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc
/usr/lib/python2.6/site-packages/ceph_deploy/install.py: gpg_fallback = 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
cd /usr/lib/python2.6/site-packages/ceph_deploy
There are three files: install.py, install.pyc and install.pyo.
A .pyc file is a binary file generated by compiling a .py file; it holds byte code.
vi install.py
#gpg_fallback = 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
gpg_fallback = 'http://192.168.1.199/ceph.com/release.asc'
It still fails
[ceph@admin-node centos]$ pwd
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos
[ceph@admin-node centos]$ ls
__init__.py __init__.pyo install.pyc mon pkg.pyc uninstall.py uninstall.pyo
__init__.pyc install.py install.pyo pkg.py pkg.pyo uninstall.pyc
Edit the file again
vi /usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py
if adjust_repos:
if version_kind != 'dev':
remoto.process.run(
distro.conn,
[
'rpm',
'--import',
#"https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)
"http://192.168.1.199/ceph.com/release.asc"
]
)
if version_kind == 'stable':
url = 'http://192.168.1.199/ceph.com/rpm-{version}/{repo}/'.format(
version=version,
repo=repo_part,
)
elif version_kind == 'testing':
url = 'http://192.168.1.199/ceph.com/rpm-testing/{repo}/'.format(repo=repo_part)
#remoto.process.run(
# distro.conn,
# [
# 'rpm',
# '-Uvh',
# '--replacepkgs',
# '{url}noarch/ceph-release-1-0.{dist}.noarch.rpm'.format(url=url, dist=dist),
# ],
#)
Run ceph-deploy install node1 node2 node3 again
It succeeds
End of the output:
[node3][DEBUG ] Complete!
[node3][INFO ] Running command: sudo ceph --version
[node3][DEBUG ] ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
Error in sys.exitfunc:
[root@YUN17 ~]# umount /home
[root@YUN17 ~]# e2fsck -f /dev/mapper/vg_YUN2-lv_home
e2fsck 1.41.12 (17-May-2010)
e2fsck: No such file or directory while trying to open /dev/mapper/vg_YUN2-lv_home
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
[root@YUN17 ~]# e2fsck -f /dev/mapper/vg_YUN17-lv_home
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/vg_YUN17-lv_home is in use.
e2fsck: Cannot continue, aborting.
[root@YUN17 ~]# resize2fs -p /dev/mapper/vg_YUN17-lv_home 2G
resize2fs 1.41.12 (17-May-2010)
resize2fs: Device or resource busy while trying to open /dev/mapper/vg_YUN17-lv_home
Couldn't find valid filesystem superblock.
[root@YUN17 ~]# mount /home
[root@YUN17 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_YUN17-lv_root
50G 9.9G 37G 22% /
tmpfs 253G 4.0K 253G 1% /dev/shm
/dev/sda1 477M 62M 387M 14% /boot
/srv/loopback-device/swiftloopback
1.9G 3.1M 1.7G 1% /srv/node/swiftloopback
/dev/mapper/vg_YUN17-lv_home
769G 69M 730G 1% /home
[root@YUN17 ~]# lvreduce -L 2G /dev/mapper/vg_YUN17-lv_home
WARNING: Reducing active and open logical volume to 2.00 GiB
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce lv_home? [y/n]: y
Size of logical volume vg_YUN17/lv_home changed from 780.90 GiB (199911 extents) to 2.00 GiB (512 extents).
Logical volume lv_home successfully resized
[root@YUN17 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_YUN17-lv_root
50G 9.9G 37G 22% /
tmpfs 253G 4.0K 253G 1% /dev/shm
/dev/sda1 477M 62M 387M 14% /boot
/srv/loopback-device/swiftloopback
1.9G 3.1M 1.7G 1% /srv/node/swiftloopback
/dev/mapper/vg_YUN17-lv_home
769G 69M 730G 1% /home
[root@YUN17 ~]# vgdisplay
--- Volume group ---
VG Name vg_YUN17
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 5
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 3
Max PV 0
Cur PV 1
Act PV 1
VG Size 834.90 GiB
PE Size 4.00 MiB
Total PE 213735
Alloc PE / Size 14336 / 56.00 GiB
Free PE / Size 199399 / 778.90 GiB
VG UUID UY5pX2-BCLJ-x4ig-Cr0z-sAS1-mUEc-nixVw9
--- Volume group ---
VG Name cinder-volumes
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 20.60 GiB
PE Size 4.00 MiB
Total PE 5273
Alloc PE / Size 0 / 0
Free PE / Size 5273 / 20.60 GiB
VG UUID ND2yf7-BRxo-sYm1-VB6A-t0nt-aVPg-Fq3MPH
[root@YUN17 ~]# lvextend -L +750G /dev/mapper/vg_YUN17-lv_root
Size of logical volume vg_YUN17/lv_root changed from 50.00 GiB (12800 extents) to 800.00 GiB (204800 extents).
Logical volume lv_root successfully resized
[root@YUN17 ~]# resize2fs -p /dev/mapper/vg_YUN17-lv_root
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/vg_YUN17-lv_root is mounted on /; on-line resizing required
old desc_blocks = 4, new_desc_blocks = 50
Performing an on-line resize of /dev/mapper/vg_YUN17-lv_root to 209715200 (4k) blocks.
The filesystem on /dev/mapper/vg_YUN17-lv_root is now 209715200 blocks long.
[root@YUN17 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_YUN17-lv_root
788G 10G 745G 2% /
tmpfs 253G 4.0K 253G 1% /dev/shm
/dev/sda1 477M 62M 387M 14% /boot
/srv/loopback-device/swiftloopback
1.9G 3.1M 1.7G 1% /srv/node/swiftloopback
/dev/mapper/vg_YUN17-lv_home
769G 69M 730G 1% /home
[root@YUN17 ~]# fdisk -l
Disk /dev/sda: 897.0 GB, 896998047744 bytes
255 heads, 63 sectors/track, 109053 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0000292d
Device Boot Start End Blocks Id System
/dev/sda1 * 1 64 512000 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 64 109054 875461632 8e Linux LVM
Disk /dev/mapper/vg_YUN17-lv_root: 859.0 GB, 858993459200 bytes
255 heads, 63 sectors/track, 104433 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
Disk /dev/mapper/vg_YUN17-lv_swap: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
Disk /dev/mapper/vg_YUN17-lv_home: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
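In the session above, e2fsck and resize2fs both refused to run because the device was still in use, and lvreduce was then applied to a volume whose filesystem had not been shrunk, which is the case its warning describes; df still reports 769G for /home afterwards. A commonly used order for shrinking an ext filesystem on LVM is sketched below as a dry run, assuming /home can really be unmounted and that its data fits in the new size. `shrink_plan` is a hypothetical helper that only prints the commands and executes nothing.

```shell
#!/usr/bin/env bash
# shrink_plan DEVICE SIZE MOUNTPOINT
# Hypothetical dry-run helper: print, in order, the commands for shrinking an
# ext filesystem and then its logical volume. Nothing is executed.
shrink_plan() {
    local dev="$1" size="$2" mnt="$3"
    printf 'umount %s\n' "$mnt"                  # the filesystem must be offline
    printf 'e2fsck -f %s\n' "$dev"               # resize2fs wants a clean check first
    printf 'resize2fs %s %s\n' "$dev" "$size"    # shrink the filesystem
    printf 'lvreduce -L %s %s\n' "$size" "$dev"  # then shrink the volume to match
    printf 'mount %s\n' "$mnt"
}

shrink_plan /dev/mapper/vg_YUN17-lv_home 2G /home
```

Shrinking the filesystem before the volume is the point of the ordering: the logical volume should never become smaller than the filesystem it holds.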
Summary of problems encountered in an OpenStack project, part 2 (instance migration, Ceph and extending partitions)
Original article: http://blog.51cto.com/xiaoxiaozhou/2113356