mirror of https://github.com/ceph/ceph-ansible.git
04f4991648
After restarting each OSD, restart_osd_daemon.sh checks that the cluster is in a good state before moving on to the next one. One of the checks it performs is that the number of pgs in the state "active+clean" equals the total number of pgs in the cluster.

On large clusters (e.g. we have 173,696 pgs), it is likely that at least one pg will be scrubbing and/or deep-scrubbing at any given time. These pgs are in the state "active+clean+scrubbing" or "active+clean+scrubbing+deep", so the script was erroneously not including them in the "good" count. Similar concerns apply to "active+clean+snaptrim" and "active+clean+snaptrim_wait".

Fix this by considering as good any pg whose state contains "active+clean", and compare that count as an integer against num_pgs in the pgmap.

(Could this be backported to at least stable-3.0, please?)

Closes: #2008
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
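The corrected check described above can be sketched as follows. This is an illustrative Python sketch, not the actual script: the function name `all_pgs_active_clean` is hypothetical, and the JSON field names (`pgmap`, `num_pgs`, `pgs_by_state`, `state_name`, `count`) are assumed to match the Luminous-era output of `ceph -s -f json`.

```python
import json

def all_pgs_active_clean(status_json: str) -> bool:
    """Return True when every pg's state contains 'active+clean'.

    Expects the JSON emitted by `ceph -s -f json` (field names assumed):
    the pgmap section carries num_pgs and a pgs_by_state list of
    {state_name, count} entries.
    """
    pgmap = json.loads(status_json)["pgmap"]
    # Count as "good" any state that contains active+clean, so that
    # scrubbing/deep-scrubbing/snaptrim variants are not treated as bad.
    good = sum(int(s["count"])
               for s in pgmap["pgs_by_state"]
               if "active+clean" in s["state_name"])
    # Integer comparison against the cluster-wide pg total.
    return good == int(pgmap["num_pgs"])

# Example: scrubbing pgs no longer block the rolling restart.
status = json.dumps({"pgmap": {
    "num_pgs": 4,
    "pgs_by_state": [
        {"state_name": "active+clean", "count": 2},
        {"state_name": "active+clean+scrubbing+deep", "count": 1},
        {"state_name": "active+clean+snaptrim", "count": 1},
    ],
}})
print(all_pgs_active_clean(status))  # True
```

With the old exact-match check, the cluster above would have reported only 2 of 4 pgs as good and stalled the restart loop indefinitely.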
.github/ISSUE_TEMPLATE
contrib
docker
docs
group_vars
infrastructure-playbooks
library
plugins
profiles
roles
tests
.gitignore
.mergify.yml
CONTRIBUTING.md
LICENSE
Makefile
README-MULTISITE.md
README.rst
Vagrantfile
ansible.cfg
ceph-ansible.spec.in
dummy-ansible-hosts
example-ansible-role-requirements.yml
generate_group_vars_sample.sh
requirements.txt
rhcs_edits.txt
rundep.sample
rundep_installer.sh
site-docker.yml.sample
site.yml.sample
test.yml
tox.ini
vagrant_variables.yml.sample
README.rst
ceph-ansible
============

Ansible playbooks for Ceph, the distributed filesystem.

Please refer to our hosted documentation here: http://docs.ceph.com/ceph-ansible/master/

You can view documentation for our ``stable-*`` branches by substituting ``master`` in the link above for the name of the branch. For example: http://docs.ceph.com/ceph-ansible/stable-3.0/