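1. Background
In OpenStack, glance provides the image service. It can also convert a running virtual machine into a snapshot, and both images and snapshots are stored in glance. Glance supports several storage backends, including the local filesystem, HTTP, GlusterFS, Ceph, and Swift.
By default, glance stores images on the local filesystem under /var/lib/glance/images. As images accumulate over time, they take up more and more space on the root filesystem, so the glance storage path needs to be planned in advance: either set aside a dedicated partition for images, or store them on a distributed backend such as Ceph or Swift. In my environment, nobody planned for glance when the system first went live, so it used the default path /var/lib/glance/images. Later we ran out of space and had to change the path, and in the process of changing it we caused a "bloodbath".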
2. The scene of the crime
1. Gather the IDs needed to create a virtual machine
# Get the image ID
[root@controller ~]# glance image-list
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| ID                                   | Name          | Disk Format | Container Format | Size        | Status |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| 37aaedc7-6fe6-4fc8-b110-408d166b8e51 | cirrors       | qcow2       | bare             | 13200896    | active |
# Get the network ID
[root@controller ~]# neutron net-list
+--------------------------------------+---------------+-------------------------------------------------------+
| id                                   | name          | subnets                                               |
+--------------------------------------+---------------+-------------------------------------------------------+
| 99c68a93-336a-4605-aa78-343d41ca1206 | vmTest        | 79cb82a1-eac1-4311-8e6d-badcabd22e44 192.168.100.0/24 |
+--------------------------------------+---------------+-------------------------------------------------------+
# Get the flavor ID
[root@controller ~]# nova flavor-list
+--------------------------------------+------------------+-----------+------+-----------+------+-------+-------------+-----------+
| ID                                   | Name             | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+--------------------------------------+------------------+-----------+------+-----------+------+-------+-------------+-----------+
| 1                                    | m1.large         | 8192      | 100  | 10        |      | 4     | 1.0         | True      |
2. Create the instance
[root@controller ~]# nova boot --flavor m1.large --image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 --nic net-id=99c68a93-336a-4605-aa78-343d41ca1206 glance_image_error_test
+--------------------------------------+------------------------------------------------+
| Property                             | Value                                          |
+--------------------------------------+------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                         |
| OS-EXT-AZ:availability_zone          | nova                                           |
| OS-EXT-SRV-ATTR:host                 | -                                              |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                              |
| OS-EXT-SRV-ATTR:instance_name        | instance-000001ff                              |
| OS-EXT-STS:power_state               | 0                                              |
| OS-EXT-STS:task_state                | scheduling                                     |
| OS-EXT-STS:vm_state                  | building                                       |
| OS-SRV-USG:launched_at               | -                                              |
| OS-SRV-USG:terminated_at             | -                                              |
| accessIPv4                           |                                                |
| accessIPv6                           |                                                |
| adminPass                            | X39vzn4RKwrL                                   |
| config_drive                         |                                                |
| created                              | 2016-01-27T11:14:46Z                           |
| flavor                               | m1.large (1)                                   |
| hostId                               |                                                |
| id                                   | b143fd7d-b1b7-49b4-ba20-7968777460bc           |
| image                                | cirrors (37aaedc7-6fe6-4fc8-b110-408d166b8e51) |
| key_name                             | -                                              |
| metadata                             | {}                                             |
| name                                 | glance_image_error_test                        |
| os-extended-volumes:volumes_attached | []                                             |
| progress                             | 0                                              |
| security_groups                      | default                                        |
| status                               | BUILD                                          |
| tenant_id                            | 842ab3268a2c47e6a4b0d8774de805ae               |
| updated                              | 2016-01-27T11:14:46Z                           |
| user_id                              | bc5e46fc4204497185ae3ca6f8b7affb               |
+--------------------------------------+------------------------------------------------+
3. The build fails
[root@controller ~]# nova list |grep b143fd7d-b1b7-49b4-ba20-7968777460bc
| b143fd7d-b1b7-49b4-ba20-7968777460bc | glance_image_error_test | ERROR | - | NOSTATE | | ChuangYiYuan_10_16_2_21 |
3. Getting to the bottom of it
1. Check the glance logs, both glance-api and glance-registry
[root@controller ~]# tail -n 2 /var/log/glance/api.log
2016-01-27 19:15:22.917 2664 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): controller
2016-01-27 19:15:22.948 2664 INFO glance.wsgi.server [89d3f8c3-9d66-4d75-b88c-eafe746f9a6b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] 10.16.2.8 - - [27/Jan/2016 19:15:22] "HEAD /v1/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51 HTTP/1.1" 200 856 0.031628
[root@controller ~]# tail -n 2 /var/log/glance/registry.log
2016-01-27 19:15:22.946 2763 INFO glance.registry.api.v1.images [cca31ae2-f412-4605-a5db-0cc0a507955b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] Successfully retrieved image 37aaedc7-6fe6-4fc8-b110-408d166b8e51
2016-01-27 19:15:22.946 2763 INFO glance.wsgi.server [cca31ae2-f412-4605-a5db-0cc0a507955b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] 127.0.0.1 - - [27/Jan/2016 19:15:22] "GET /images/37aaedc7-6fe6-4fc8-b110-408d166b8e51 HTTP/1.1" 200 847 0.017350
# Nothing abnormal found here!
2. Check the nova logs: nova-api, nova-scheduler, nova-conductor, and the logs on the nova-compute nodes
2016-01-09 17:42:09.653 2872 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 9.578928 sec
2016-01-09 17:47:25.755 2872 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 5.842983 sec
2016-01-27 19:14:49.762 2872 ERROR nova.scheduler.filter_scheduler [req-46235a89-6ed4-47e5-ac06-85f6dedc8985 bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae] [instance: b143fd7d-b1b7-49b4-ba20-7968777460bc] Error from last host: ChuangYiYuan_10_16_2_22 (node ChuangYiYuan_10_16_2_22): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1328, in _build_instance\n set_access_ip=set_access_ip)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 393, in decorated_function\n return function(self, context, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1740, in _spawn\n LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n', u' File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1737, in _spawn\n block_device_info)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 2287, in spawn\n admin_pass=admin_password)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 2656, in _create_image\n project_id=instance[\'project_id\'])\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/imagebackend.py", line 192, in cache\n *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/imagebackend.py", line 383, in create_image\n prepare_template(target=base, max_size=size, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 249, in inner\n return f(*args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/imagebackend.py", line 182, in fetch_func_sync\n fetch_func(target=target, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/utils.py", line 653, in fetch_image\n max_size=max_size)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/images.py", line 78, in fetch_to_raw\n max_size=max_size)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/images.py", line 72, in fetch\n image_service.download(context, image_id, dst_path=path)\n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 331, in download\n _reraise_translated_image_exception(image_id)\n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 329, in download\n image_chunks = self._client.call(context, 1, \'data\', image_id)\n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 209, in call\n return getattr(client.images, method)(*args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/glanceclient/v1/images.py", line 127, in data\n % urllib.quote(str(image_id)))\n', u' File "/usr/lib/python2.6/site-packages/glanceclient/common/http.py", line 289, in raw_request\n return self._http_request(url, method, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/glanceclient/common/http.py", line 249, in _http_request\n raise exc.from_response(resp, body_str)\n', u'ImageNotFound: Image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 could not be found.\n']
# Both the nova-scheduler and nova-compute logs show the error "ImageNotFound: Image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 could not be found"!
3. Check the status of the glance services
[root@controller ~]# /etc/init.d/openstack-glance-api status
openstack-glance-api (pid 2222) is running...
[root@controller ~]# /etc/init.d/openstack-glance-registry status
openstack-glance-registry (pid 2694) is running...
# Status is normal
[root@controller ~]# glance image-list
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| ID                                   | Name          | Disk Format | Container Format | Size        | Status |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| 37aaedc7-6fe6-4fc8-b110-408d166b8e51 | cirrors       | qcow2       | bare             | 13200896    | active |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
# Listing works, and uploading a test image works too. So what is going on?
4. Catching the culprit
During routine operations, the glance storage path had been changed from the default /var/lib/glance/images to /data1/glance, and all the images under /var/lib/glance/images had been moved (mv) to /data1/glance. Although the data had been migrated, the image metadata was still firmly recorded in glance's image_locations table, as a query shows:
mysql> select * from glance.image_locations where image_id='37aaedc7-6fe6-4fc8-b110-408d166b8e51'\G;
*************************** 1. row ***************************
        id: 37
  image_id: 37aaedc7-6fe6-4fc8-b110-408d166b8e51
     value: file:///var/lib/glance/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51   # the culprit
created_at: 2015-12-21 06:10:24
updated_at: 2015-12-21 06:10:24
deleted_at: NULL
   deleted: 0
 meta_data: {}
    status: active
1 row in set (0.00 sec)
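The truth: the images under the original directory /var/lib/glance/images had all been moved to /data1/glance, but the database still recorded the old path. That leads to the following failure. When nova starts an instance, it first looks for the image in its local image cache (by default /var/lib/nova/instances/_base). If the image is not there, nova sends a REST API request to glance to download the image to the compute node. Glance then looks for the image file at the location stored in image_locations, does not find it there, and the request fails.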
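The lookup described above suggests a simple consistency check: for every file:// location recorded in the database, confirm that the file really exists on disk. The following is a minimal sketch under stated assumptions, not part of glance. The function name find_stale_locations is hypothetical, and the rows are assumed to be (image_id, value) pairs already read from glance.image_locations on the host that stores the images.

```python
import os
from urllib.parse import urlparse


def find_stale_locations(rows):
    """Return (image_id, path) pairs whose file:// location is missing on disk.

    `rows` is an iterable of (image_id, value) tuples, i.e. the image_id and
    value columns of glance.image_locations (hypothetical input format).
    """
    stale = []
    for image_id, value in rows:
        parsed = urlparse(value)
        if parsed.scheme != "file":
            # Only the local filesystem backend can be checked this way.
            continue
        if not os.path.isfile(parsed.path):
            stale.append((image_id, parsed.path))
    return stale


if __name__ == "__main__":
    # The row from the query above; run this on the glance host.
    rows = [
        ("37aaedc7-6fe6-4fc8-b110-408d166b8e51",
         "file:///var/lib/glance/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51"),
    ]
    for image_id, path in find_stale_locations(rows):
        print("missing on disk: %s -> %s" % (image_id, path))
```

Running a check like this right after moving the image directory would have flagged the stale path before any instance failed to build.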
The fix: update the location recorded in glance's metadata
mysql> update glance.image_locations set value='file:///data1/glance/37aaedc7-6fe6-4fc8-b110-408d166b8e51' where image_id='37aaedc7-6fe6-4fc8-b110-408d166b8e51'\G;
Query OK, 1 row affected (0.05 sec)
Rows matched: 1 Changed: 1 Warnings: 0
# Rebuild the virtual machine, and the problem is solved!
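The statement above fixes a single image. When many images were moved, the same rewrite has to be applied to every row that still points at the old directory. The sketch below only builds the statements and does not touch any database; rewrite_location and build_updates are hypothetical helper names, and the %s placeholders assume a MySQL driver that uses that parameter style (PyMySQL and MySQLdb do). Back up the glance database before applying anything like this.

```python
def rewrite_location(value, old_dir, new_dir):
    """Map a file:// URL under old_dir to the same file name under new_dir.

    Returns None when value does not point into old_dir, so rows for other
    backends or other directories are left alone.
    """
    old_prefix = "file://" + old_dir.rstrip("/") + "/"
    if not value.startswith(old_prefix):
        return None
    return "file://" + new_dir.rstrip("/") + "/" + value[len(old_prefix):]


def build_updates(rows, old_dir, new_dir):
    """Return (sql, params) pairs for every row that needs a new location.

    `rows` is an iterable of (id, value) tuples from glance.image_locations.
    """
    sql = "UPDATE glance.image_locations SET value = %s WHERE id = %s"
    updates = []
    for row_id, value in rows:
        new_value = rewrite_location(value, old_dir, new_dir)
        if new_value is not None:
            updates.append((sql, (new_value, row_id)))
    return updates


if __name__ == "__main__":
    rows = [
        (37, "file:///var/lib/glance/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51"),
    ]
    for sql, params in build_updates(rows, "/var/lib/glance/images", "/data1/glance"):
        print(sql, params)
```

Keying the update on the row id rather than on image_id keeps each statement tied to exactly one location row.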
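5. Digging deeper
Two tables in the glance database matter most here: images and image_locations. The images table stores the attributes of each image, and image_locations records the URL where each image is stored.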
1. The images table
mysql> select * from glance.images limit 2\G;
*************************** 1. row ***************************
              id: 0267dcbf-9f72-4ce8-9976-7106e38ee948
            name: cirror1
            size: 6899532
          status: deleted
       is_public: 1
      created_at: 2015-12-02 01:45:13
      updated_at: 2015-12-02 01:46:41
      deleted_at: 2015-12-02 01:46:41
         deleted: 1
     disk_format: qcow2
container_format: bare
        checksum: 7c607794659403b970a5d0a00fb2c311
           owner: 842ab3268a2c47e6a4b0d8774de805ae
        min_disk: 0
         min_ram: 0
       protected: 0
    virtual_size: NULL
*************************** 2. row ***************************
              id: 2437cede-d03a-4680-b704-6d27c4d7198e
            name: test1
            size: 0
          status: deleted
       is_public: 0
      created_at: 2015-12-21 09:02:41
      updated_at: 2015-12-21 09:06:02
      deleted_at: 2015-12-21 09:06:02
         deleted: 1
     disk_format: qcow2
container_format: bare
        checksum: d41d8cd98f00b204e9800998ecf8427e
           owner: 842ab3268a2c47e6a4b0d8774de805ae
        min_disk: 0
         min_ram: 0
       protected: 0
    virtual_size: NULL
2 rows in set (0.00 sec)
# The table records the attributes set when the image was created. Note the deleted field: both rows above belong to deleted images, which are marked deleted=1 (with deleted_at set) but still remain in the table.
2. The image_locations table
mysql> select * from image_locations;
+----+--------------------------------------+--------------------------------------------------------------------+---------------------+---------------------+---------------------+---------+-----------+--------+
| id | image_id                             | value                                                              | created_at          | updated_at          | deleted_at          | deleted | meta_data | status |
+----+--------------------------------------+--------------------------------------------------------------------+---------------------+---------------------+---------------------+---------+-----------+--------+
|  1 | 437d860f-1c9f-4bb2-a3ca-8ec062441909 | file:///var/lib/glance/images/437d860f-1c9f-4bb2-a3ca-8ec062441909 | 2015-06-24 10:40:39 | 2015-12-01 11:52:20 | 2015-12-01 11:52:20 |       1 | {}        | active |
|  2 | 5ce414b0-660a-46e1-ad0a-b842b2afc0b7 | file:///var/lib/glance/images/5ce414b0-660a-46e1-ad0a-b842b2afc0b7 | 2015-06-25 02:49:33 | 2015-06-25 02:49:33 | NULL
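To see how the two tables fit together, the sketch below joins them and keeps only rows that are not marked deleted, which is the set of locations that still matters when an instance is built. It is an illustration only: it uses an in-memory SQLite database with heavily simplified tables instead of the real MySQL schema, the sample rows are made up for the demo (loosely modeled on the output above), and active_image_locations and demo_connection are hypothetical names.

```python
import sqlite3


def active_image_locations(conn):
    """Return (image_id, name, value) for images and locations not marked deleted."""
    query = """
        SELECT i.id, i.name, l.value
        FROM images AS i
        JOIN image_locations AS l ON l.image_id = i.id
        WHERE i.deleted = 0 AND l.deleted = 0
        ORDER BY i.id
    """
    return conn.execute(query).fetchall()


def demo_connection():
    """Build an in-memory stand-in for the two tables (simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE images (
            id TEXT PRIMARY KEY, name TEXT, status TEXT, deleted INTEGER);
        CREATE TABLE image_locations (
            id INTEGER PRIMARY KEY, image_id TEXT, value TEXT, deleted INTEGER);
    """)
    # Illustrative rows: one soft-deleted image and one active image.
    conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?)", [
        ("0267dcbf-9f72-4ce8-9976-7106e38ee948", "cirror1", "deleted", 1),
        ("37aaedc7-6fe6-4fc8-b110-408d166b8e51", "cirrors", "active", 0),
    ])
    conn.executemany("INSERT INTO image_locations VALUES (?, ?, ?, ?)", [
        (1, "0267dcbf-9f72-4ce8-9976-7106e38ee948",
         "file:///var/lib/glance/images/0267dcbf-9f72-4ce8-9976-7106e38ee948", 1),
        (37, "37aaedc7-6fe6-4fc8-b110-408d166b8e51",
         "file:///data1/glance/37aaedc7-6fe6-4fc8-b110-408d166b8e51", 0),
    ])
    return conn


if __name__ == "__main__":
    for image_id, name, value in active_image_locations(demo_connection()):
        print(image_id, name, value)
```

Filtering on deleted = 0 in both tables mirrors the soft-delete flag visible in the query output: deleted rows stay in the database and have to be excluded explicitly.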
6. Appendix
1. Structure of the images table:
mysql> desc glance.images;
+------------------+--------------+------+-----+---------+-------+
| Field            | Type         | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| id               | varchar(36)  | NO   | PRI | NULL    |       |
| name             | varchar(255) | YES  |     | NULL    |       |
| size             | bigint(20)   | YES  |     | NULL    |       |
| status           | varchar(30)  | NO   |     | NULL    |       |
| is_public        | tinyint(1)   | NO   | MUL | NULL    |       |
| created_at       | datetime     | NO   |     | NULL    |       |
| updated_at       | datetime     | YES  |     | NULL    |       |
| deleted_at       | datetime     | YES  |     | NULL    |       |
| deleted          | tinyint(1)   | NO   | MUL | NULL    |       |
| disk_format      | varchar(20)  | YES  |     | NULL    |       |
| container_format | varchar(20)  | YES  |     | NULL    |       |
| checksum         | varchar(32)  | YES  | MUL | NULL    |       |
| owner            | varchar(255) | YES  | MUL | NULL    |       |
| min_disk         | int(11)      | NO   |     | NULL    |       |
| min_ram          | int(11)      | NO   |     | NULL    |       |
| protected        | tinyint(1)   | YES  |     | NULL    |       |
| virtual_size     | bigint(20)   | YES  |     | NULL    |       |
+------------------+--------------+------+-----+---------+-------+
17 rows in set (0.00 sec)
2. Structure of the image_locations table:
mysql> desc image_locations;
+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| image_id   | varchar(36) | NO   | MUL | NULL    |                |
| value      | text        | NO   |     | NULL    |                |
| created_at | datetime    | NO   |     | NULL    |                |
| updated_at | datetime    | YES  |     | NULL    |                |
| deleted_at | datetime    | YES  |     | NULL    |                |
| deleted    | tinyint(1)  | NO   | MUL | NULL    |                |
| meta_data  | text        | YES  |     | NULL    |                |
| status     | varchar(30) | NO   |     | active  |                |
+------------+-------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)
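This article comes from the "Happy实验室" blog. Please do not repost.
OpenStack operations in practice series (13): the "bloodbath" caused by changing the glance path
Original article: http://happylab.blog.51cto.com/1730296/1739368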