This commit ensure all ceph-ansible modules pass flake8 properly.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
(cherry picked from commit beda1fe773)
This function makes the `ceph_volume` module be not idempotent in
containerized context because it tries to run a container and bindmount
directories that no longer exist.
In that case, the `lvs` command being executed returns something
different than `0` so we can't call `json.loads(out)['report'][0]['lv']`
since it might throw an python error.
The idea is to return `True` only if `rc` is equal to `0` and
`len(result)` is greater than `0`, which means the command matched an
LV.
Fixes: #6284
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed79bc7a4e)
When asking `ceph-volume` to report only in `lvm batch` context, there's
a bug described in bz1896803 [1] when `--yes` is passed (which by the
way isn't necessary with `--report`).
This commit ensure `--yes` isn't passed to `ceph-volume` when `--report`
is used.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1896803
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896803
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe6d6ba622)
The ceph-volume module relies on environment variables to determine if
the command should be executed within a container or not.
The containerized parameter isn't used anymore and we can remove it.
Fixes: #6153
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since the action values are already defined as a list of choices in
ansible then we will never enter into this condition.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This adds a new `module_utils` namespace in order to avoid defining same
functions in each module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Move the import at the top of the file and remove unused module import.
- E402 module level import not at top of file
- F401 'xxxx' imported but unused
This also removes the '# noqa E402' statement from the code.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When running rhel8 containers on a rhel7 host, after zapping an OSD
there's a discrepancy with the lvmetad cache that needs to be refreshed.
Otherwise, the host still sees the lv and can makes the user confused.
If user tries to redeploy an OSD, it will fail because the LV isn't
present and need to be recreated.
ie:
```
stderr: lsblk: ceph-block-8/block-8: not a block device
stderr: blkid: error: ceph-block-8/block-8: No such file or directory
stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm prepare [-h] --data DATA [--data-size DATA_SIZE]
[--data-slots DATA_SLOTS] [--filestore]
[--journal JOURNAL]
[--journal-size JOURNAL_SIZE] [--bluestore]
[--block.db BLOCK_DB]
[--block.db-size BLOCK_DB_SIZE]
[--block.db-slots BLOCK_DB_SLOTS]
[--block.wal BLOCK_WAL]
[--block.wal-size BLOCK_WAL_SIZE]
[--block.wal-slots BLOCK_WAL_SLOTS]
[--osd-id OSD_ID] [--osd-fsid OSD_FSID]
[--cluster-fsid CLUSTER_FSID]
[--crush-device-class CRUSH_DEVICE_CLASS]
[--dmcrypt] [--no-systemd]
ceph-volume lvm prepare: error: Unable to proceed with non-existing device: ceph-block-8/block-8
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886534
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
ceph-volume recently introduced a breaking change because of a `lvm
batch` refactor.
when rerunning `lvm batch --report --format json` on existing OSDs, it
doesn't output a valid json on stdout.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit ensure all ceph-ansible modules pass flake8 properly.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Running the `ceph_crush.py`, `ceph_key.py` or `ceph_volume.py` modules in check
mode resulted in the following error:
```
New-style module did not handle its own exit
```
This was due to the fact that they simply returned a `dict` in that case,
instead of calling `module.exit_json()`.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since Ceph Octopus is python3 only we don't need to specify the max open
files anymore with the container engine.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When using the lvm batch ceph-volume subcommand with dedicated devices
for filestore (journal) or bluestore (db/wal) then the list of devices
is convert to a string instead of being extended via an iterable.
This was working with only one dedicated device but starting with more
then the ceph_volume module fails.
TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] **
fatal: [xxxxxx]: FAILED! => changed=true
cmd:
- ceph-volume
- --cluster
- ceph
- lvm
- batch
- --bluestore
- --yes
- --prepare
- --osds-per-device
- '4'
- /dev/nvme2n1
- /dev/nvme3n1
- /dev/nvme4n1
- /dev/nvme5n1
- /dev/nvme6n1
- --db-devices
- /dev/nvme0n1 /dev/nvme1n1
- --report
- --format=json
msg: non-zero return code
rc: 2
stderr: |2-
stderr: lsblk: /dev/nvme0n1 /dev/nvme1n1: not a block device
stderr: error: /dev/nvme0n1 /dev/nvme1n1: No such file or directory
stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
[--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
[--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
[--no-auto] [--bluestore] [--filestore]
[--report] [--yes] [--format {json,pretty}]
[--dmcrypt]
[--crush-device-class CRUSH_DEVICE_CLASS]
[--no-systemd]
[--osds-per-device OSDS_PER_DEVICE]
[--block-db-size BLOCK_DB_SIZE]
[--block-wal-size BLOCK_WAL_SIZE]
[--journal-size JOURNAL_SIZE] [--prepare]
[--osd-ids [OSD_IDS [OSD_IDS ...]]]
[DEVICES [DEVICES ...]]
ceph-volume lvm batch: error: Unable to proceed with non-existing device: /dev/nvme0n1 /dev/nvme1n1
So the dedicated device list is considered as a single string.
This commit also adds the journal_devices, block_db_devices and
wal_devices documentation to the ceph_volume module.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816713
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit adds the filestore to bluestore migration support in
ceph_volume module.
We must append to the executed command only the relevant options
according to what is passed in `osd_objectostore`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The zap action from ceph_volume module always implies `--destroy`.
This commit adds the destroy option support so we can ask ceph-volume to
not use `--destroy` when zapping a device.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit adds the `wal_devices` option support to the
ceph_volume module.
passing a devices list in `bluestore_wal_devices` will make ceph-volume
creating 1 vg using these devices to create block.wal partitions.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit adds the `block_db_devices` option support to the
ceph_volume module.
passing a devices list in `dedicated_devices` will make ceph-volume
creating 1 vg using these devices to create block.db partitions for data
devices.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
On containerized deployment, the OSD entrypoint runs some ceph-volume
commands (lvm/simple scan and/or activate) which perform badly without
the ulimit option.
This option was added for all previous ceph-volume commands but not on
the ceph-osd container startup.
Also updating hard limit value to 4096 to reflect default baremetal
value.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The ceph nodes couldn't have the python six library installed which
could lead to error during the ceph_volume custom module execution.
ImportError: No module named six
The six library isn't useful in this module if we're sure that all
action variables passed to the build_ceph_volume_cmd function are a
list and not a string.
Resolves: #4071
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The ceph-volume lvm list command takes ages to complete when having
a lot of LV devices on containerized deployment.
For instance, with 25 OSDs on a node it takes 3 mins 44s to list the
OSD.
Adding the max open files limit to the container engine cli when
executing the ceph-volume command seems to improve a lot thee
execution time ~30s.
This was impacting the OSDs creation with ceph-volume (both filestore
and bluestore) when using multiple LV devices.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
this is needed to properly handle semaphore synchronization for udev
actions via dmcrypt/cryptsetup.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683770
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
If you deploy with 2 HDDs and 1 SDD then each subsequent deploy both
HDD drives will be filtered out, because they're already used by ceph.
ceph-volume will report this as a 'strategy change' because the device
list went from a mixed type of HDD and SDD to a single type of only SDD.
This situation results in a non-zero exit code from ceph-volume. We want
to handle this situation gracefully and report that nothing will be changed.
A similar json structure to what would have been given by ceph-volume is
returned in the 'stdout' key.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650306
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
In order to be able to retrieve udev information, we must expose its
socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will
start consuming udev output.
Signed-off-by: Sébastien Han <seb@redhat.com>
osds-per-device needs to be passed to run_command as a string.
Otherwise, expandvars method will try to iterate over an integer.
Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
This commit does a couple of things:
* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself to the author of the module
Signed-off-by: Sébastien Han <seb@redhat.com>
The batch option got recently added, while rebasing this patch it was
necessary to implement it. So now, the batch option can work on
containerized environments.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630977
Signed-off-by: Sébastien Han <seb@redhat.com>
This handles the case gracefully where --report does not return any JSON
because a validator might have failed.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The command is run with --report first to see if any OSDs will be
created or not. If they will be, then the command is run. If not, then
changed is set to False and the module exits.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>