If a disk is referenced by a symlink, it gets added to the devices list
twice: once with the resolved path and once with the path as defined.
We can rebuild the list from the readlink output, since readlink always
returns the canonical path for every disk.
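A minimal sketch of the idea, assuming a `devices` list variable as in ceph-ansible (task names are illustrative):
```
- name: resolve device paths
  command: "readlink -f {{ item }}"
  register: resolved_devices
  loop: "{{ devices }}"
  changed_when: false

- name: rebuild the devices list from the readlink output
  set_fact:
    devices: "{{ resolved_devices.results | map(attribute='stdout') | unique | list }}"
```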
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Exclude disks that are defined in dedicated_devices and bluestore_wal_devices when osd_auto_discovery is enabled.
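A minimal sketch of the intent (variable names follow ceph-ansible, the task itself is illustrative):
```
- name: exclude dedicated and wal devices from the auto-discovered list
  set_fact:
    devices: "{{ devices | difference(dedicated_devices | default([]) + bluestore_wal_devices | default([])) }}"
  when: osd_auto_discovery | bool
```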
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
With podman version podman-3:4.2.0-4.module+el8.7.0+17064+3b31f55c and
later, the mgr fails to start if the mon is already running.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2169767
Signed-off-by: Teoman ONAY <tonay@ibm.com>
We need to make sure `rgw_instances` is set before `ceph.conf` is
rendered. Otherwise, the `ceph-crash` play in the main playbook updates
(via ceph-handler) the `ceph.conf` on rgw nodes and removes rgw instances
sections.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2141604
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When creating an RBD pool, it needs to be initialized as per the documentation [1].
Modified (pre_)generate_ceph_cmd to make it usable with any command that
takes the same parameters as the ceph command.
[1]https://docs.ceph.com/en/latest/rbd/rados-rbd-cmds/#create-a-block-device-pool
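A minimal sketch of the initialization step from [1] (pool and cluster names are illustrative, not the exact ceph-ansible task):
```
- name: initialize the RBD pool
  command: "rbd --cluster {{ cluster }} pool init {{ rbd_pool_name }}"
  changed_when: false
```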
Signed-off-by: Teoman ONAY <tonay@redhat.com>
The recent rbdmirror refactor introduced a regression in the
cephadm-adopt playbook.
Given that the rbd-mirror peer addition is now done by using the monitor
config-key store method during the cluster deployment, we can drop this
play from the cephadm-adopt.yml playbook.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2140569
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Changed the when condition so the fact is only set on RGW nodes.
Previously it ran on all nodes and failed when the node was not on the
same network range as the RGW.
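A minimal sketch of the guard (the fact itself is illustrative; `rgw_group_name` defaults to 'rgws' in ceph-ansible):
```
- name: set_fact related to the rgw address
  set_fact:
    _rgw_address: "{{ ansible_facts['default_ipv4']['address'] }}"
  when: inventory_hostname in groups.get(rgw_group_name | default('rgws'), [])
```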
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2131150
Signed-off-by: Teoman ONAY <tonay@redhat.com>
When the following conditions are met:
- rgw is deployed,
- dashboard is deployed,
- playbook is called with --limit,
- the node being processed is collocated with either a mon or a mgr.
The playbook fails because `rgw_instances` is undefined.
The idea here is to make sure this variable is always defined.
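A minimal sketch of the idea (not the exact change): give `rgw_instances` a default so nodes processed under --limit don't hit an undefined variable.
```
- name: ensure rgw_instances is always defined
  set_fact:
    rgw_instances: "{{ rgw_instances | default([]) }}"
```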
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When these variables are defined in the inventory host file, all tasks
are skipped because the node being played isn't aware of the values
from the rgw nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`--state=enabled` isn't a valid filter, so the unit from the packaging
never gets removed.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2134917
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Removes the case where display_name was defined previously but was not
provided when modifying. Without this change, the module would change
display_name to name even if display_name was not name
originally. See #7296
Signed-off-by: John Karasev <john.karasev@intel.com>
As the conf is always set in the config file, there is no need to also set it with `ceph config`.
Doing so also makes it hard to run the playbook with the `ceph_update_config` tag, as the `ceph config` call won't run and an inconsistency is created between the cluster's config management methods.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
When the value is overridden in `ceph_conf_overrides`, there is no need to calculate and set `osd_memory_target` again, since we want the conf to be overridden with our desired value.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
When osd_auto_discovery is true, the `devices` var will be empty (as the disks have holders).
More generally, there is no need to check for devices before listing them with ceph-volume, as we apply `default({})` to the stdout in the `num_osds` set_fact in the next task.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
This should be set when rolling_update is true as well; otherwise, it resets to the default during the upgrade.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
in order to avoid the following error:
```
multiple RX peers are not currently supported
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037646
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This `run_once: true` breaks multiple rbd-mirror daemons support
as it would make all rbd-mirror daemons use the same keyring.
Each rbd-mirror daemon needs its own keyring in order to start.
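A minimal sketch of what dropping `run_once` achieves, using the keyring caps documented for rbd-mirror (the task is illustrative, not the exact ceph-ansible one):
```
- name: create a dedicated keyring per rbd-mirror daemon
  command: >
    ceph --cluster {{ cluster }} auth get-or-create
    client.rbd-mirror.{{ ansible_facts['hostname'] }}
    mon 'profile rbd-mirror' osd 'profile rbd'
  changed_when: false
```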
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037646
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add the cluster name to the CEPH_ARGS env var to support custom cluster names.
This can be switched to a ceph-crash argument once https://github.com/ceph/ceph/pull/47836 is released.
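A minimal sketch of the mechanism (the drop-in path is an assumption, not what ceph-ansible actually templates):
```
- name: create the ceph-crash drop-in directory
  file:
    path: /etc/systemd/system/ceph-crash.service.d
    state: directory
    mode: "0755"

- name: pass the custom cluster name to ceph-crash via CEPH_ARGS
  copy:
    dest: /etc/systemd/system/ceph-crash.service.d/override.conf
    content: |
      [Service]
      Environment=CEPH_ARGS=--cluster {{ cluster }}
    mode: "0644"
```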
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
If `osd_memory_target` is set in group_vars, the default value (4GB)
should be overridden.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2118544
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When `osd_memory_target` is overridden in ceph_conf_overrides, the task
that sets the `osd_memory_target` fact in the ceph-config role should be
skipped.
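A minimal sketch of the guard (the 'osd' section key and the default value are assumptions for illustration):
```
- name: set_fact osd_memory_target
  set_fact:
    osd_memory_target: "{{ _calculated_osd_memory_target | default(4294967296) }}"
  when: "'osd_memory_target' not in ceph_conf_overrides.get('osd', {})"
```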
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2056675#c11
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
There's no service to stop/mask when the node being upgraded is
a 'primary node' only (one-way replication).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When Ansible collections are installed, they should be isolated.
Otherwise, they will be shared across scheduled jobs.
This might cause issues when running different branch versions, for instance.
This also replaces `ANSIBLE_CALLBACK_WHITELIST` with `ANSIBLE_CALLBACK_ENABLED`,
as the former is going to be deprecated in Ansible 2.15.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Having the failure logged can save us some time, as we then don't have
to reproduce it every time a failure shows up in the CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
"set_fact container_run_cmd" is not set when using --limit on MDS as facts
were not run on first MON.
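A minimal sketch of the idea (not the exact fix): gather facts from the first MON even when the play is limited to MDS nodes, so `container_run_cmd` can be built from them.
```
- name: gather facts from the first monitor
  setup:
  delegate_to: "{{ groups[mon_group_name | default('mons')][0] }}"
  delegate_facts: true
  when: groups.get(mon_group_name | default('mons'), []) | length > 0
```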
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2111017
Signed-off-by: Teoman ONAY <tonay@redhat.com>
When using the legacy group name 'grafana-server', this playbook runs but
doesn't properly remove all monitoring resources as expected.
Fixes: #7265
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add the missing `--cluster {{ cluster }}` to the `set osd_memory_target`
task in the main.yml file of the ceph-config role.
Also move the task so it runs after the ceph configuration file is
actually written.
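A minimal sketch of the corrected invocation (the command wrapper and the delegation target are illustrative, not the exact ceph-config task):
```
- name: set osd_memory_target
  command: "ceph --cluster {{ cluster }} config set osd osd_memory_target {{ osd_memory_target }}"
  delegate_to: "{{ groups[mon_group_name | default('mons')][0] }}"
  changed_when: false
```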
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035
Signed-off-by: Teoman ONAY <tonay@redhat.com>
Since the bump to ansible-core 2.12, we have to drop testing against
py36 and py37, given that ansible-core 2.12 requires python >= 3.8.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Due to some changes [1] in nfs-ganesha-4, we now have to use `/var/run/ganesha/ganesha.pid`.
[1] 52e15c30d0
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>