This is causing unknown issues when trying to start a dmcrypt container.
Basically the container is stuck at mount opening the LUKS device. This
is still unknown why this is causing trouble but we need to move
forward. Also, this doesn't seem to help in any ways to fix the race
condition we've seen.
Here is the log for dmcrypt:
cryptsetup 1.7.4 processing "cryptsetup --debug --verbose --key-file
key luksClose fbf8887d-8694-46ca-b9ff-be79a668e2a9"
Running command close.
Locking memory.
Installing SIGINT/SIGTERM handler.
Unblocking interruption on signal.
Allocating crypt device context by device
fbf8887d-8694-46ca-b9ff-be79a668e2a9.
Initialising device-mapper backend library.
dm version [ opencount flush ] [16384] (*1)
dm versions [ opencount flush ] [16384] (*1)
Detected dm-crypt version 1.14.1, dm-ioctl version 4.35.0.
Device-mapper backend running with UDEV support enabled.
dm status fbf8887d-8694-46ca-b9ff-be79a668e2a9 [ opencount flush ]
[16384] (*1)
Releasing device-mapper backend.
Trying to open and read device /dev/sdc1 with direct-io.
Allocating crypt device /dev/sdc1 context.
Trying to open and read device /dev/sdc1 with direct-io.
Initialising device-mapper backend library.
dm table fbf8887d-8694-46ca-b9ff-be79a668e2a9 [ opencount flush
securedata ] [16384] (*1)
Trying to open and read device /dev/sdc1 with direct-io.
Crypto backend (gcrypt 1.5.3) initialized in cryptsetup library
version 1.7.4.
Detected kernel Linux 3.10.0-693.el7.x86_64 x86_64.
Reading LUKS header of size 1024 from device /dev/sdc1
Key length 32, device size 1943016847 sectors, header size 2050
sectors.
Deactivating volume fbf8887d-8694-46ca-b9ff-be79a668e2a9.
dm status fbf8887d-8694-46ca-b9ff-be79a668e2a9 [ opencount flush ]
[16384] (*1)
Udev cookie 0xd4d14e4 (semid 32769) created
Udev cookie 0xd4d14e4 (semid 32769) incremented to 1
Udev cookie 0xd4d14e4 (semid 32769) incremented to 2
Udev cookie 0xd4d14e4 (semid 32769) assigned to REMOVE task(2) with
flags (0x0)
dm remove fbf8887d-8694-46ca-b9ff-be79a668e2a9 [ opencount flush
retryremove ] [16384] (*1)
fbf8887d-8694-46ca-b9ff-be79a668e2a9: Stacking NODE_DEL [verify_udev]
Udev cookie 0xd4d14e4 (semid 32769) decremented to 1
Udev cookie 0xd4d14e4 (semid 32769) waiting for zero
Signed-off-by: Sébastien Han <seb@redhat.com>
Use an intermediate variable to build the final `dedicated_devices` list
to avoid duplicate entry in that array. (We need a 1:1 relation between
`dedicated_devices` and `devices` since we are using a `with_together`
later.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
All keyring are getting copied to all nodes.
This commit fixes a leftover from a previous code refactor.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498583
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This is to ensure `docker_exec_cmd` fact is set with the correct value
in case of daemons collocation
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Using systemd module allows us to do in one task what we did in three
tasks:
- enable unit file,
- issue a `daemon-reload`,
- start the service
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This fixes the error :
```
The conditional check 'sestatus.stdout != 'Disabled'' failed.
```
that occurs when running on non rhel based system since the
`sestatus` fact is registered only on rhel based distribution.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
It's sad but we can not rely on the prepare container anymore since the
log are flushed after reboot. So inpecting the container does not return
anything.
Now, instead we use a ephemeral container to look up for the
journal/block.db/block.wal (depending if filestore or bluestore) and
build the activate command accordingly.
Signed-off-by: Sébastien Han <seb@redhat.com>
We generate the ceph.conf on all the nodes through the
ceph-docker-common so there is no need to push it to the Ansible file.
Also this is breaking the ceph.conf template generation since we only
generate sections based on the host the ansible task is running on.
For example, what's typically happening, we bootstrap the monitor, we
get a ceph.conf generated for a mon only, we go on an osd, we generate
the ceph.conf with osd section (done by ceph-docker-common) but this
gets overwritten by the copy_config task of the ceph-osd role.
Signed-off-by: Sébastien Han <seb@redhat.com>
When Ansible is not run with verbose options it's difficult to see which
include and/or set_fact does what. So adding a name for each clarifies.
Signed-off-by: Sébastien Han <seb@redhat.com>
All keys are copied to all nodes.
This commit split that task in each roles so keys are copied to their
respective nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Prior to this patch this activation sequence for autodetection was
always skipped because we were asking to activate on device without
partitions, which doesn't make sense.
We also fix the way we lookup for a device, since the data partition is
always numbered 1, we take the min element of the dict.
Closes: https://github.com/ceph/ceph-ansible/issues/1782
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The script can fail to get the osd id because the osds are activated by
udev and it can take a while for them to activate. This commit fixes
that by trying to get all the osds per node in a loop.
This commit also makes the osd services enabled so that they are
available after reboot.
Signed-off-by: Boris Ranto <branto@redhat.com>
The lvm_volumes variable is now a list of dictionaries that represent
each OSD you'd like to deploy using ceph-volume. Each dictionary must
have the following keys: data, journal and data_vg. Each dictionary also
can optionaly provide a journal_vg key.
The 'data' key represents the lv name used for the OSD and the 'data_vg'
key is the vg name that the given lv resides on. The 'journal' key is
either an lv, device or partition. The 'journal_vg' key is optional and
must be the vg name for the journal lv if given. This key is mainly used
for purging of the journal lv if purge-cluster.yml is run.
For example:
lvm_volumes:
- data: data_lv1
journal: journal_lv1
data_vg: vg1
journal_vg: vg2
- data: data_lv2
journal: /dev/sdc
data_vg: vg1
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
To be properly evaluated the "skipped" conditions must always have the
first place on the list of condition, otherwise the other conditions are
evaluated before and make the task fail.
Closes: https://github.com/ceph/ceph-ansible/issues/1733
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph services can fail to start under certain circumstances (for
example, when running in a container) because the default systemd
service configuration causes namespace issues.
To work around this we can override the system service settings by
placing an overrides file in the ceph-<service>@.service.d directory.
This can be generic so as to allow any potential changes required to
the ceph-<service> service files.
The overrides file is only setup when the
"ceph_<service>_systemd_overrides" config_template override variable is
specified.
The available service systemd override files are as follows:
ceph_mds_systemd_overrides
ceph_mgr_systemd_overrides
ceph_mon_systemd_overrides
ceph_osd_systemd_overrides
ceph_rbd_mirror_systemd_overrides
ceph_rgw_systemd_overrides
This scenario will create OSDs using ceph-volume and is only available
in ceph releases greater than Luminous.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
There is only two main scenarios now:
* collocated: everything remains on the same device:
- data, db, wal for bluestore
- data and journal for filestore
* non-collocated: dedicated device for some of the component
Signed-off-by: Sébastien Han <seb@redhat.com>
This will give us more flexibility and avoid a lot of useless when
skipping all tasks from a non-desired role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Merge `ceph-docker-common` and `ceph-common` defaults vars in
`ceph-defaults` role.
Remove redundant variables declaration in `ceph-mon` and `ceph-osd` roles.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
ceph-disk is responsable for enabling the unit file if needed. Actually
since https://github.com/ceph/ceph/pull/12241 it seems that it's not
even needed. On an event of a restart, udev rules will be trigger and
they will ceph-disk activate the device too so the 'enabled' is not
needed.
Closes: https://github.com/ceph/ceph-ansible/issues/1142
Signed-off-by: Sébastien Han <seb@redhat.com>