On systems running docker there is an issue with lxfs that results in
the find command returning 1 but actually did the job.
e.g: on a system with docker runnning find /var will give us the
following error:
find:
'/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/systemd-update-utmp.service/devices.deny':
Permission denied
find:
'/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/dev-random.mount/devices.allow':
Permission denied
...
...
However ceph files got deleted so we ignore the error.
Signed-off-by: Sébastien Han <seb@redhat.com>
We now rely on the cli tool ceph-detect-init which will tell us the init
system in used on the distribution. We do this instead of the previous
lookup for systemd unit files to call the right task depending on the
init system.
Signed-off-by: Sébastien Han <seb@redhat.com>
with_items is evaluated before the when so in a second run where the
variable is empty if will fail with "'dict object' has no attribute
'stdout_lines'". To fix this we had a default array so with_items does
not fail and the task is skipped with the when.
Signed-off-by: Sébastien Han <seb@redhat.com>
Because the purge-cluster.yml playbook does not have access to the roles
default vars then we can be sure that raw_multi_journal is defined. For
example, if this was purging a dmcrypt journal then raw_multi_journal
might not be defined at all in group_vars/all.yml or
group_vars/osds.yml.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
When running encrypted OSDs, an encrypted device mapper is used (because
created by the crypsetup tool). So before attempting to remove all the
partitions on a device we must delete all the encrypted device mappers,
then we can delete all the partitions.
Signed-off-by: Sébastien Han <seb@redhat.com>
Please enter the commit message for your changes. Lines starting
The name of this variable was a bit confusing since its activation will
zap all the block devices no matter which osd scenario we are using.
Removing this variable and applying a condition on the OSD scenario is
now feasible and easier since we import group_vars variable files for
OSDs.
Signed-off-by: Sébastien Han <seb@redhat.com>
When purging OSDs we do not need to include these defaults as nothing in
the following tasks uses them. Also, it has the side effect of
overwriting any variables defined in group_vars files that are relative
to the inventory you are using with the default values. That behavior
was causing the CI tests to fail.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
In my testing zapping the osd disks deleted the journal
partitions, making the 'zap ceph journal partitions' task fail because
the partitions it found previously do not exist anymore.
This moves the task that finds the journal partitions after 'zap osd disks'
to catch any partitions ceph-disk might have missed.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Using failed_when will still throw an exception and stop the playbook if
the file you're trying to include doesn't exist.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
the libcephfs version was bumped to 2, so we need to check for that as
well when we're removing all ceph packages
Signed-off-by: Casey Bodley <cbodley@redhat.com>
- Separated out one large playbook into multiple playbooks to run
host-type by host-type i.e. mdss, rgws, rbdmirrors, nfss, osds, mons.
- Combined common tasks into one shared task for all hosts where
applicable
- Fixed various bugs
Signed-off-by: Ivan Font <ivan.font@redhat.com>
Prior to this change we were purging all the partitions on the device
when using the raw_journal_devices scenario.
This was breaking deployments where other partitions are used for other
purposes (ie: OS system).
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we have a couple of infrastructure related playbooks
(additionnally to the roles we are using to deploy Ceph), it makes sense
to have them located in a separate directory.
Signed-off-by: Sébastien Han <seb@redhat.com>