ceph——rgw服务启不起来

时间：2018-09-06 18:19:08 阅读：2311 评论：0 收藏：0 [点我收藏+]

标签：set uid require table 80端口被占用 disabled ons failed tcp active

环境：SUSE SESv5版本——对应社区ceph的L版本（12.2）

故障背景：在给ceph集群扩充第四个节点的时候，运行到stage4，报错：

sesadmin:~ # salt-run state.orch ceph.stage.4

openattic : valid

[ERROR ] Run failed on minions: sesnode3.ses5.com

Failures:

sesnode3.ses5.com:

----------

ID: wait for rgw processes

Function: module.run

Name: cephprocesses.wait

Result: False

Comment: Module function cephprocesses.wait executed

Started: 15:51:13.725345

Duration: 135585.3 ms

Changes:

----------

ret:

False

Summary for sesnode3.ses5.com

------------

Succeeded: 0 (changed=1)

Failed: 1

------------

Total states run: 1

Total run time: 135.585 s

sesadmin.ses5.com_master:

Name: mine.send - Function: salt.function - Result: Changed Started: - 15:49:35.968349 Duration: 1490.601 ms

Name: igw config - Function: salt.state - Result: Changed Started: - 15:49:37.459622 Duration: 13781.381 ms

Name: auth - Function: salt.state - Result: Changed Started: - 15:49:51.241587 Duration: 5595.701 ms

Name: keyring - Function: salt.state - Result: Clean Started: - 15:49:56.837756 Duration: 743.186 ms

Name: sysconfig - Function: salt.state - Result: Clean Started: - 15:49:57.581125 Duration: 748.137 ms

Name: iscsi import - Function: salt.state - Result: Changed Started: - 15:49:58.329640 Duration: 3324.991 ms

Name: iscsi apply - Function: salt.state - Result: Clean Started: - 15:50:01.655630 Duration: 3166.979 ms

Name: wait until sesnode2.ses5.com with role igw can be restarted - Function: salt.state - Result: Changed Started: - 15:50:04.823089 Duration: 6881.116 ms

Name: check if igw processes are still running on sesnode2.ses5.com after restarting igws - Function: salt.state - Result: Changed Started: - 15:50:11.704379 Duration: 1696.99 ms

Name: restarting igw on sesnode2.ses5.com - Function: salt.state - Result: Changed Started: - 15:50:13.401560 Duration: 6705.876 ms

Name: wait until sesnode3.ses5.com with role igw can be restarted - Function: salt.state - Result: Changed Started: - 15:50:20.107781 Duration: 6779.647 ms

Name: check if igw processes are still running on sesnode3.ses5.com after restarting igws - Function: salt.state - Result: Changed Started: - 15:50:26.887635 Duration: 773.069 ms

Name: restarting igw on sesnode3.ses5.com - Function: salt.state - Result: Changed Started: - 15:50:27.660939 Duration: 6693.656 ms

Name: cephfs pools - Function: salt.state - Result: Changed Started: - 15:50:34.354917 Duration: 6740.383 ms

Name: mds auth - Function: salt.state - Result: Changed Started: - 15:50:41.095912 Duration: 4639.248 ms

Name: mds - Function: salt.state - Result: Clean Started: - 15:50:45.735744 Duration: 1607.979 ms

Name: mds restart noop - Function: test.nop - Result: Clean Started: - 15:50:47.344850 Duration: 0.525 ms

Name: rgw auth - Function: salt.state - Result: Changed Started: - 15:50:47.345595 Duration: 4674.037 ms

Name: rgw users - Function: salt.state - Result: Changed Started: - 15:50:52.019892 Duration: 4806.13 ms

Name: rgw - Function: salt.state - Result: Changed Started: - 15:50:56.826321 Duration: 3165.723 ms

Name: setup prometheus rgw exporter - Function: salt.state - Result: Changed Started: - 15:50:59.992275 Duration: 2910.979 ms

Name: rgw demo buckets - Function: salt.state - Result: Clean Started: - 15:51:02.903422 Duration: 3480.446 ms

Name: wait until sesnode3.ses5.com with role rgw can be restarted - Function: salt.state - Result: Changed Started: - 15:51:06.384012 Duration: 6823.43 ms

----------

ID: check if rgw processes are still running on sesnode3.ses5.com after restarting rgws

Function: salt.state

Result: False

Comment: Run failed on minions: sesnode3.ses5.com

Failures:

sesnode3.ses5.com:

----------

ID: wait for rgw processes

Function: module.run

Name: cephprocesses.wait

Result: False

Comment: Module function cephprocesses.wait executed

Started: 15:51:13.725345

Duration: 135585.3 ms

Changes:

----------

ret:

False

Summary for sesnode3.ses5.com

------------

Succeeded: 0 (changed=1)

Failed: 1

------------

Total states run: 1

Total run time: 135.585 s

Started: 15:51:13.207613

Duration: 136141.45 ms

Changes:

Summary for sesadmin.ses5.com_master

-------------

Succeeded: 23 (changed=17)

Failed: 1

-------------

Total states run: 24

Total run time: 233.372 s

sesadmin:~ #

排查思路：

在该节点上，手动启动RGW服务：

sesnode3:~ # systemctl start ceph-radosgw@rgw.sesnode3.service

sesnode3:~ # systemctl status ceph-radosgw@rgw.sesnode3.service

● ceph-radosgw@rgw.sesnode3.service - Ceph rados gateway

Loaded: loaded (/usr/lib/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)

Active: failed (Result: start-limit) since Wed 2018-08-29 16:26:35 CST; 1h 39min ago

Process: 103198 ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name client.%i --setuser ceph --setgroup ceph (code=exited, status=5)

Main PID: 103198 (code=exited, status=5)

Aug 29 16:26:34 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Mai...ED

Aug 29 16:26:34 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Uni...e.

Aug 29 16:26:34 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Fai...‘.

Aug 29 16:26:35 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Ser...t.

Aug 29 16:26:35 sesnode3 systemd[1]: Stopped Ceph rados gateway.

Aug 29 16:26:35 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Sta...y.

Aug 29 16:26:35 sesnode3 systemd[1]: Failed to start Ceph rados gateway.

Aug 29 16:26:35 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Uni...e.

Aug 29 16:26:35 sesnode3 systemd[1]: ceph-radosgw@rgw.sesnode3.service: Fai...‘.

Hint: Some lines were ellipsized, use -l to show in full.

但是依旧起不来，去查看日志

sesnode3:~ # tail -f /var/log/ceph/ceph-client.rgw.sesnode3.log

2018-08-29 16:26:33.330111 7f23d5511e00 0 deferred set uid:gid to 167:167 (ceph:ceph)

2018-08-29 16:26:33.330338 7f23d5511e00 0 ceph version 12.2.5-419-g8cbf63d997 (8cbf63d997fb5cdc783fe7bfcd4f5032ee140c0c) luminous (stable), process (unknown), pid 103128

2018-08-29 16:26:33.929287 7f23d5511e00 0 starting handler: civetweb

2018-08-29 16:26:33.929609 7f23d5511e00 0 civetweb: 0x5636c0c9e2e0: cannot listen to 80: 98 (Address already in use)

2018-08-29 16:26:33.936396 7f23d5511e00 -1 ERROR: failed run

2018-08-29 16:26:34.363426 7f5d467d9e00 0 deferred set uid:gid to 167:167 (ceph:ceph)

2018-08-29 16:26:34.363551 7f5d467d9e00 0 ceph version 12.2.5-419-g8cbf63d997 (8cbf63d997fb5cdc783fe7bfcd4f5032ee140c0c) luminous (stable), process (unknown), pid 103198

2018-08-29 16:26:34.817776 7f5d467d9e00 0 starting handler: civetweb

2018-08-29 16:26:34.818183 7f5d467d9e00 0 civetweb: 0x55c06880c2e0: cannot listen to 80: 98 (Address already in use)

2018-08-29 16:26:34.818351 7f5d467d9e00 -1 ERROR: failed run

查看ceph.conf：

sesnode3:/etc/ceph # cat ceph.conf

# DeepSea default configuration. Changes in this file will be overwritten on

# package update. Include custom configuration fragments in

# /srv/salt/ceph/configuration/files/ceph.conf.d/[global,osd,mon,mgr,mds,client].conf

[global]

fsid = 82499237-e7fe-32cf-b47f-4117d2b8e63a

mon_initial_members = sesnode2, sesnode3, sesnode1

mon_host = 192.168.120.82, 192.168.120.83, 192.168.120.81

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

filestore_xattr_use_omap = true

public_network = 192.168.120.0/24

cluster_network = 192.168.125.0/24

# enable old ceph health format in the json output. This fixes the

# ceph_exporter. This option will only stay until the prometheus plugin takes

# over

mon_health_preluminous_compat = true

mon health preluminous compat warning = false

rbd default features = 3

[client.rgw.sesnode3]

rgw frontends = "civetweb port=80"

rgw dns name = sesnode3.ses5.com

rgw enable usage log = true

[osd]

[mon]

[mgr]

[mds]

[client]

注：很奇怪，用ps命令去查看，80端口没有被使用，但是报错是80端口被占用了，为了省事所以直接一点将端口改掉：

sesnode3:/etc/ceph # ps -ef |grep 80

root 380 1 0 Aug28 ? 00:00:01 /usr/lib/systemd/systemd-journald

root 2062 2 0 Aug28 ? 00:00:00 [cfg80211]

root 108046 107895 0 18:05 pts/1 00:00:00 tail -f /var/log/ceph/ceph-client.rgw.sesnode3.log

root 111617 107698 0 18:42 pts/0 00:00:00 grep --color=auto 80

sesnode3:/etc/ceph # netstat -ano |grep 80

tcp 0 0 192.168.125.83:6800 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 192.168.120.83:6800 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 192.168.120.83:6801 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 192.168.125.83:6801 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 0.0.0.0:8480 0.0.0.0:* LISTEN off (0.00/0/0)

tcp 0 0 192.168.120.83:6801 192.168.120.81:42728 ESTABLISHED off (0.00/0/0)

tcp 0 0 192.168.120.83:35610 192.168.120.80:4506 TIME_WAIT timewait (56.80/0/0)

tcp 0 0 192.168.120.83:35608 192.168.120.80:4506 TIME_WAIT timewait (56.06/0/0)

tcp 0 0 192.168.120.83:45428 192.168.120.81:6803 ESTABLISHED off (0.00/0/0)

[client.rgw.sesnode3]

rgw frontends = "civetweb port=8480"

rgw dns name = sesnode3.ses5.com

rgw enable usage log = true

启动服务，服务正常。

ceph——rgw服务启不起来

标签：set uid require table 80端口被占用 disabled ons failed tcp active

原文地址：https://www.cnblogs.com/Janessa-ting/p/9599306.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行