If the mapping between physical disks and device paths has changed
since the last ceph-volume simple scan (e.g. disks were added or
removed), the wrong disk could be deleted.
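
A minimal sketch of the kind of check this implies, assuming each disk
can be tracked by some stable identifier. The `scanned` and `current`
dicts and the `uuid-*` keys are hypothetical and are not the format
ceph-volume actually stores:

```python
def changed_mappings(scanned, current):
    """Return the entries whose device path differs from the scanned one.

    Both arguments are hypothetical dicts mapping a stable disk
    identifier to a device path.
    """
    changed = {}
    for stable_id, old_path in scanned.items():
        new_path = current.get(stable_id)  # None if the disk is gone
        if new_path != old_path:
            changed[stable_id] = (old_path, new_path)
    return changed


# State recorded at scan time versus state after a disk was inserted.
scanned = {"uuid-a": "/dev/sdb", "uuid-b": "/dev/sdc"}
current = {"uuid-a": "/dev/sdc", "uuid-b": "/dev/sdd"}

for stable_id, (old, new) in changed_mappings(scanned, current).items():
    print(f"{stable_id}: {old} -> {new}")
```

Here `/dev/sdc` belonged to `uuid-b` at scan time but belongs to
`uuid-a` afterwards, which is how acting on a recorded path can hit the
wrong disk.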
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 64e08f2c0b)
f6b49f78a9 changed a call back to `ipwrap`.
This commit fixes that.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a99812aa92)
Use `include_tasks` instead of `import_tasks`.
With `import_tasks`, statements are preprocessed before the task that
defines the variable has run, so the play fails with an error like the
following:
```
The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_interface'
```
Using `include_tasks` instead fixes this.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 434793e2fe)
- preserve mode and ownership on main directories
- make sure the directories are present before restoring files.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 047af3a3f6)
The main branch requires it; otherwise the playbook won't run.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e223630cf0)
shrink_osd has its own tox config file (tox-shrink_osd.ini)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6623f34679)
If the user doesn't pass a valid name (one present in the inventory),
the playbook fails as follows:
```
fatal: [localhost -> {{ target_node }}]: FAILED! =>
msg: |-
The task includes an option with an undefined variable. The error was: "hostvars['10.70.46.40']" is undefined
```
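
The intended check can be sketched as below. This is only an
illustration in Python, not the playbook code; `validate_target_node`
and the host list are hypothetical, only `target_node` comes from the
error above:

```python
def validate_target_node(target_node, inventory_hosts):
    """Fail early with a clear message if the name is not in the inventory."""
    if target_node not in inventory_hosts:
        raise ValueError(
            f"'{target_node}' is not present in the inventory; "
            f"known hosts: {sorted(inventory_hosts)}"
        )
    return target_node


inventory_hosts = {"mon0", "mon1", "osd0"}  # hypothetical inventory
print(validate_target_node("mon0", inventory_hosts))
```

Failing before any task dereferences `hostvars[target_node]` gives the
user an actionable message instead of an undefined variable error.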
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b18a1aa3ca)
Typical failure:
```
fatal: [localhost]: FAILED! =>
msg: |-
The conditional check 'mode not in ['backup', 'restore']' failed. The error was: error while evaluating conditional (mode not in ['backup', 'restore']): 'mode' is undefined
```
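
A minimal sketch of the validation order that avoids this failure,
written in Python for illustration only. `validate_mode` and the
`extra_vars` dict are hypothetical; the two accepted values come from
the conditional above:

```python
VALID_MODES = ("backup", "restore")


def validate_mode(extra_vars):
    """Check that 'mode' is defined before checking its value."""
    mode = extra_vars.get("mode")
    if mode is None:
        raise ValueError("'mode' must be set to 'backup' or 'restore'")
    if mode not in VALID_MODES:
        raise ValueError(f"invalid mode '{mode}', expected one of {VALID_MODES}")
    return mode


print(validate_mode({"mode": "backup"}))
```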
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 848dd03fa6)
The default pid file path has moved from `/var/run/ganesha.pid` to
`/var/run/ganesha/ganesha.pid`.
This updates the restart script accordingly.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When target_size_ratio is set to 'warn', the administrator wants to get
suggestions from the mgr module but apply them manually at a time of
their choosing. This is the same approach as 'on' mode, only triggered
by hand.
So there is no need to set pg_num when target_size_ratio is 'warn': the
mgr module calculates the correct pg_num and the administrator adjusts
it when desired.
This is the same approach as in #6471.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit bb849a5586)
This adds a task that sets `autotune_memory_target_ratio` depending on the
value of `is_hci`.
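
The selection logic can be sketched as follows. The commit message does
not state the ratios, so the 0.2 and 0.7 values below are assumptions
used for illustration only:

```python
# Assumed ratios, not taken from this commit message.
HCI_RATIO = 0.2
NON_HCI_RATIO = 0.7


def autotune_memory_target_ratio(is_hci):
    """Pick the ratio from the value of `is_hci`."""
    return HCI_RATIO if is_hci else NON_HCI_RATIO


print(autotune_memory_target_ratio(True))
print(autotune_memory_target_ratio(False))
```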
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028693
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 41d62596fc)
When the upgrade from Ceph 4 to 5 is performed in the OpenStack context,
ceph-ansible triggers the rolling_update playbook, which is supposed to
roll out new Ceph containers. The ceph-infra role tries to take care
of the firewall, NTP configuration and logrotate; however, TripleO
manages them through tripleo-heat-templates. This patch adds an
additional tag to skip the ceph-infra role in the OpenStack context.
Closes: https://bugzilla.redhat.com/2090456
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 0e9b3902b0)
When this directory is left behind after the OSD adoption, it leads to the following error:
```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```
This is caused by unexpected behavior in config inference when a legacy directory is present in /var/lib/ceph.
Note: this is a workaround; it doesn't fix the root cause.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6e2ebe857d)
This updates the default value for the vagrant_box variable
in all vagrant_variables.yml files
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ef0455a0b1)
With the bump of the Python version, let's use a newer version of pytest.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dee49779c9)
This commit makes podman bind-mount `/:/rootfs:ro` so the container can
collect data from the host.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028775
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit fixes a templating error that occurs when using auto OSD
discovery. Getting the length before converting the result to a list
causes an "object of type generator has no len()" error.
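
The underlying Python behavior can be reproduced with a short sketch.
The `facts` dict and the `discovered_devices` filter are hypothetical
stand-ins for the discovery result, not the template itself:

```python
def discovered_devices(facts):
    """Return a generator of device names without partitions (hypothetical filter)."""
    return (name for name, info in facts.items() if not info.get("partitions"))


facts = {
    "sda": {"partitions": {"sda1": {}}},
    "sdb": {"partitions": {}},
    "sdc": {"partitions": {}},
}

# Calling len() directly on the generator raises TypeError.
try:
    len(discovered_devices(facts))
except TypeError as exc:
    error = str(exc)
    print(error)

# Converting to a list first makes len() valid.
devices = list(discovered_devices(facts))
count = len(devices)
print(count, devices)
```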
Signed-off-by: pinotelio <ahmadreza.mollapour@gmail.com>
Since the ISO install method was removed, ceph-ansible isn't able
to detect whether the user is deploying in a 'disconnected environment'.
Besides, given that ceph-ansible is available only for upgrading to
RHCS 5, this check doesn't make sense anymore, so let's drop it.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2062147
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When deploying with --skip-tags=package-install (when there is no
access to a repository), the playbook still tries to update the package
cache, which makes it fail.
This change prevents the playbook from trying to update the cache when
the package-install tag is skipped.
Signed-off-by: Florent CARLI <florent.carli@rte-france.com>
When running the playbook with `--limit`, if the targeted play doesn't
match hosts present in the mgr group, the playbook can fail.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Initially, MONs and RGWs bind-mounted /etc/pki/ca-trust/extracted using
the :z flag (introduced to solve an OSP TripleO issue on RHEL - #3638),
but this flag prevents local services (like sssd) running on the host
from accessing the certificates/files in that folder.
Signed-off-by: Teoman ONAY <tonay@redhat.com>
When using groups of groups, the playbook applies undesired
labels on nodes.
This commit fixes it by applying only the expected labels.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When using custom cluster names, cephadm commands are executed using
the default admin keyring name, which fails.
Signed-off-by: Teoman ONAY <tonay@redhat.com>
By default, cephadm uses the root account to connect remotely
to other nodes in the cluster. This change allows choosing
another account.
This commit also allows using a dedicated subnet for cephadm management.
Signed-off-by: Teoman ONAY <tonay@redhat.com>
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because, in Ceph
Nautilus, ceph-crash doesn't handle the `SIGTERM` signal.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This playbook doesn't support fewer than 3 monitors in the inventory.
Just like the rolling_update playbook, let's fail if fewer than
3 monitors are present.
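
The guard can be sketched in Python as below, for illustration only.
The `inventory` dict and the `mons` group name are assumptions; the
threshold of 3 comes from the text above:

```python
MIN_MONITORS = 3  # threshold stated in this commit message


def check_monitor_count(inventory, mon_group="mons"):
    """Return the monitor count, or raise if fewer than MIN_MONITORS are present.

    `inventory` is a hypothetical dict mapping group names to host lists.
    """
    mons = inventory.get(mon_group, [])
    if len(mons) < MIN_MONITORS:
        raise ValueError(
            f"{len(mons)} monitor(s) in group '{mon_group}', "
            f"at least {MIN_MONITORS} are required"
        )
    return len(mons)


print(check_monitor_count({"mons": ["mon0", "mon1", "mon2"]}))
```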
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2049132
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>