purge-docker-cluster must remove all osd_disk_prepare logs in
`{{ ceph_osd_docker_run_script_path }}`. Otherwise, if you purge your
cluster and try to redeploy it, OSDs will fail to start because they
will try to find a partition uuid which doesn't exist.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510470
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The ansible inventory could have more than just ceph-ansible hosts, so
we shouldn't use "hosts: all". Also, only grab one file when getting
the ceph cluster name instead of failing when there is more than one
file in /etc/ceph. Also fix the location of the ceph.conf template.
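As a rough illustration of "only grab one file", here is a minimal Python sketch. It assumes the cluster name is the basename of a `<cluster>.conf` file, and the helper name `cluster_name` is hypothetical; it is not the playbook's actual task.

```python
import os

def cluster_name(conf_paths):
    """Derive the cluster name from the first *.conf path, or None if there is none."""
    # Sort so the choice is deterministic when several files are present.
    confs = sorted(p for p in conf_paths if p.endswith(".conf"))
    if not confs:
        return None
    # Keep only one file instead of failing on multiple matches.
    return os.path.splitext(os.path.basename(confs[0]))[0]

print(cluster_name(["/etc/ceph/ceph.conf", "/etc/ceph/other.conf"]))  # ceph
```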
Rebooting servers is really intrusive and perhaps this is not what the
operator wants. So we disable the reboot by default now. Note that the
reboot might not happen all the time.
It can be enabled by running the purge playbook with -e
reboot_osd_node=True
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1505011
Signed-off-by: Sébastien Han <seb@redhat.com>
During the osd purge, the containers are not stopped because of a typo.
As a result, the devices can't be unmounted later.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
stable-3.0 brought numerous changes in ceph-ansible variables. This PR
aims to maintain backward compatibility for someone upgrading from
stable-2.2 to stable-3.0 while keeping their group_vars untouched.
We will then determine the right options to make sure the upgrade works,
but we expect the new variables to be used.
We will drop this in the near future, maybe 3.1 or 3.2.
Signed-off-by: Sébastien Han <seb@redhat.com>
nfs nodes can't be upgraded from jewel to luminous because the ceph-nfs
role is skipped by the condition `when:
"ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. The
package is upgraded in the `ceph-nfs` role itself, so `ceph_release` is
still set to the old version when the condition is evaluated. This means
the `when` can't be satisfied.
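The failing condition can be pictured with a small Python sketch. The release map below is an illustrative subset whose values are assumed to follow Ceph's major version numbers, and `role_runs` is a hypothetical helper, not part of the role.

```python
# Illustrative subset of a release-name -> number map (assumed values).
ceph_release_num = {"jewel": 10, "kraken": 11, "luminous": 12}

def role_runs(ceph_release: str) -> bool:
    """Mirror of: ceph_release_num[ceph_release] >= ceph_release_num.luminous"""
    return ceph_release_num[ceph_release] >= ceph_release_num["luminous"]

# Before the package upgrade the detected release is still jewel,
# so the role that would perform the upgrade is skipped.
print(role_runs("jewel"))     # False
print(role_runs("luminous"))  # True
```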
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
mgr nodes can't be upgraded from jewel to luminous because the ceph-mgr
role is skipped by the condition `when:
"ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. The
ceph-mgr package is upgraded in the `ceph-mgr` role itself, so
`ceph_release` is still set to the old version when the condition is
evaluated. This means the `when` can't be satisfied.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 302e563601cd6820b1ae44fabdfb1506688c7c9b)
- Add upgrade support for rbd mirror and nfs daemons.
- Only works with systemd (remove sysvinit and upstart occurrences)
- A bit of cleanup
Signed-off-by: Sébastien Han <seb@redhat.com>
This patch changes the `when:` keys so that they have no jinja2
delimiters. This avoids Ansible warnings which could turn into
errors in a future Ansible release.
Ansible throws warnings when using yum/dnf/rpm with the command
module:
[WARNING]: Consider using yum module rather than running yum
This patch adds the `warn: no` argument to suppress the warnings
in the Ansible output.
This playbook can replace a failed OSD in containerized and
non-containerized environments.
The current limitation is that it won't allow you to choose between
filestore and bluestore, and it will always do collocation.
Signed-off-by: Sébastien Han <seb@redhat.com>
Using a condition when osd_scenario == 'non-collocated' was wrong since
these partitions can also be collocated on a single device. Removing the
check makes sure these partitions are purged.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1499871
Signed-off-by: Sébastien Han <seb@redhat.com>
The current inclusion of purge-iscsi-gateways.yml in purge-cluster.yml
is not working well and is blocking the CI too. So remove it from
purge-cluster.yml and re-add the original purge-iscsi-gateways.yml.
Signed-off-by: Sébastien Han <seb@redhat.com>
Use the pg check before doing the pg check, not on the quorum check.
Also, never quote an int when doing a comparison.
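A minimal Python sketch of why quoting matters, assuming the comparison follows Python string semantics once both sides are strings. The variable names and values are hypothetical stand-ins for counts read from the cluster status.

```python
# Hypothetical values standing in for PG counts.
num_pgs = 100
active_clean = 9

# Quoted: both sides are strings, so the comparison is lexicographic.
print(str(active_clean) < str(num_pgs))  # False, because "9" sorts after "1"

# Unquoted: both sides are integers, so the comparison is numeric.
print(active_clean < num_pgs)            # True
```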
Signed-off-by: Sébastien Han <seb@redhat.com>
The shell wildcard expansion of non-existing paths fails on zsh, making
the whole script fail. We can use the file module with with_fileglob to
avoid the problem instead.
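The same idea can be sketched in Python: expanding the pattern in the program rather than in the shell gives an empty list when nothing matches, so there is nothing to fail on. `remove_matching` is a hypothetical helper used only for illustration.

```python
import glob
import os
import tempfile

def remove_matching(pattern: str) -> int:
    """Remove every file matching pattern and return how many were removed."""
    removed = 0
    for path in glob.glob(pattern):  # empty list when nothing matches
        os.remove(path)
        removed += 1
    return removed

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.log"), "w").close()
    print(remove_matching(os.path.join(tmp, "*.log")))      # 1
    print(remove_matching(os.path.join(tmp, "missing-*")))  # 0
```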
Signed-off-by: Boris Ranto <branto@redhat.com>
systemd can't stop services if the unit files were removed before
the cluster was purged. We should just ignore these failures.
Signed-off-by: Boris Ranto <branto@redhat.com>
Also, we now play ceph-config so that everything is generated for
bootstrapping new daemons during the upgrade.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1497959
Signed-off-by: Sébastien Han <seb@redhat.com>
Even if the task is skipped, ansible registers the var as 'skipped', so
this breaks the task that uses this variable next.
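The behaviour described above can be pictured in Python: a skipped task still registers a variable, but with a `skipped` marker instead of the usual result keys. The dictionaries are simplified, hypothetical shapes of a registered result, and `safe_rc` is an illustrative helper.

```python
# Simplified, hypothetical shapes of a registered result.
ran = {"changed": True, "rc": 0, "stdout": "ceph"}
skipped = {"changed": False, "skipped": True}

def safe_rc(result: dict):
    """Return rc only when the task actually ran, else None."""
    if result.get("skipped", False):
        return None
    return result.get("rc")

print(safe_rc(ran))      # 0
print(safe_rc(skipped))  # None
```

A consumer that reads `rc` or `stdout` directly from the skipped result would find those keys missing, which is why the later task has to account for the skipped case.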
Signed-off-by: Sébastien Han <seb@redhat.com>
rbd-mirror containers are not stopped in the purge-docker-cluster
playbook because the wrong name is used.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Until now, mgr nodes are not managed by purge-cluster.yml, which
breaks scenarios like purge_cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The old name is used in `rolling_update.yml` and
`purge-docker-cluster.yml`, which breaks the
`test_rgw_service_is_running()` test.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>