mirror of https://github.com/ceph/ceph-ansible.git
Commit 04f4991648:
After restarting each OSD, restart_osd_daemon.sh checks that the cluster is in a good state before moving on to the next one. One of the checks it does is that the number of pgs in the state "active+clean" is equal to the total number of pgs in the cluster.

On large clusters (e.g. we have 173,696 pgs), it is likely that at least one pg will be scrubbing and/or deep-scrubbing at any one time. These pgs are in state "active+clean+scrubbing" or "active+clean+scrubbing+deep", so the script was erroneously not including them in the "good" count. Similar concerns apply to "active+clean+snaptrim" and "active+clean+snaptrim_wait".

Fix this by considering as good any pg whose state contains active+clean. Do this as an integer comparison to num_pgs in pgmap.

(could this be backported to at least stable-3.0 please?)

Closes: #2008
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
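For illustration only, here is a minimal sketch of the check described above, not the actual code from restart_osd_daemon.sh. It assumes `ceph -s --format json` reports `pgmap.num_pgs` and a `pgs_by_state` array of `{state_name, count}` objects, and that `jq` is available on the host:

```sh
#!/usr/bin/env bash
# Hedged sketch of the "cluster is good" check: count every pg whose
# state string contains "active+clean" and compare it, as an integer,
# against pgmap.num_pgs.
set -euo pipefail

status_json="$(ceph -s --format json)"

# Total number of pgs in the cluster (pgmap.num_pgs).
num_pgs="$(echo "$status_json" | jq '.pgmap.num_pgs')"

# Count every pg whose state contains "active+clean", so that
# active+clean+scrubbing, active+clean+scrubbing+deep,
# active+clean+snaptrim and active+clean+snaptrim_wait all count as good.
good_pgs="$(echo "$status_json" | jq '[.pgmap.pgs_by_state[]
        | select(.state_name | contains("active+clean"))
        | .count] | add // 0')"

# Integer comparison against num_pgs, as described in the commit message.
if [ "$good_pgs" -eq "$num_pgs" ]; then
    echo "cluster good: $good_pgs/$num_pgs pgs are active+clean (possibly scrubbing or snaptrimming)"
    exit 0
else
    echo "waiting: only $good_pgs/$num_pgs pgs contain active+clean"
    exit 1
fi
```

A plain equality test on "active+clean" alone would stall the rolling restart on any cluster busy scrubbing; matching on the substring keeps the check strict (every pg must still be active and clean) without being tripped up by benign extra state flags.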
ceph-agent
ceph-client
ceph-common
ceph-common-coreos
ceph-config
ceph-defaults
ceph-docker-common
ceph-fetch-keys
ceph-iscsi-gw
ceph-mds
ceph-mgr
ceph-mon
ceph-nfs
ceph-osd
ceph-rbd-mirror
ceph-restapi
ceph-rgw
ceph-validate/tasks
ceph.ceph-common
ceph.ceph-docker-common