Commit Graph

835 Commits (be9b45852459434184b576d85ad80f505ee7ea3f)

Author SHA1 Message Date
Teoman ONAY 2c88ecc784 cephadm-adopt: fix "Update the placement of radosgw hosts" task
networks was at the wrong level in the spec file. Failed with
"got an unexpected keyword argument 'networks'"

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2024-08-13 09:51:27 +02:00
Teoman ONAY 0bf3398774 cephadm-adopt: Fixes binding network for alertmanager
Alertmanager was bind to default * network instead of grafana_server_addr
as it was before. Now on if grafana_server_addr is defined, it will be
bind to that network.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2024-08-05 11:29:39 +02:00
Teoman ONAY e85060cb67 cephadm-adopt: fix "Update the placement of radosgw hosts" task
spec file template conditions were incorrect

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2024-07-12 16:42:07 +02:00
Guillaume Abrioux 59198f5bcd Revert "nfs-ganesha support removal"
This reverts commit 675667e1d6.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-06-20 11:54:36 +02:00
Teoman ONAY f6fd034e7e ceph_orch_spec: Add ceph orch apply spec feature
Add new module ceph_orch_spec which applies ceph spec files.
This feature was needed to bind extra mount points to the RGW
container (/etc/pki/ca-trust/).

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2262133

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2024-05-21 10:04:47 +02:00
Seena Fallah 1121e6d98a ceph-rgw: introduce rgw zone to the name schema
This is needed by ceph-exporter as it is parsing the socket by the number of dots.
Although the rgw_zone variable is only using for constructing the client name
and has nothing to do with multisiting.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2024-05-13 13:26:02 +02:00
Guillaume Abrioux 675667e1d6 nfs-ganesha support removal
nfs-ganesha support will be implemented in a separate playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-03-20 20:22:34 +01:00
Guillaume Abrioux d9e2de1d1d remove legacy task
this was for backward compatibility concerns, it's time to drop this
task.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-03-20 20:22:34 +01:00
Guillaume Abrioux 8871de6d99 simplify monitor address setting
this drops the following parameters:

- monitor_address_block
- monitor_interface
- monitor_address

The monitor address will be automatically set from `public_network` parameter.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-03-08 13:02:44 +01:00
Seena Fallah 2b72ea991d ceph-exporter: add installation role
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2024-03-07 20:22:44 +01:00
Guillaume Abrioux ebd0c6fce3 kickoff squid
This adds the few required changes in order to fully support
Ceph Squid.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-16 13:17:44 +01:00
Guillaume Abrioux 1af387621d drop rhcs references
RHCS moved away from ceph-ansible. All RHCS references should be
removed.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-16 11:12:01 +01:00
Guillaume Abrioux 03f1e3f48e drop iscsigw support
This service is no longer maintained.
Let's drop its support within ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-16 08:59:05 +01:00
Guillaume Abrioux 18da10bb7a address Ansible linter errors
This addresses all errors reported by the Ansible linter.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-16 00:38:19 +01:00
Guillaume Abrioux 7d25a5d565 drop rgw multisite deployment support
The current approach is extremely complex and introduced a lot
of spaghetti code. This doesn't offer a good user experience at all.

It's time to think to another approach (dedicated playbook) and drop
the current implementation in order to clean up the code.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-16 00:38:19 +01:00
Guillaume Abrioux c58529fc04 update: update ceph release pre-check
update this check in order to check for Ceph Squid

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Guillaume Abrioux 644fac8f2d update: update the osd require-osd-release to squid
This updates the `osd require-osd-release` call with `squid`
instead of `reef`.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Guillaume Abrioux e2cbffc1c7 drop variables from 'hosts' key
The linter complains about that.
It doesn't work anyway so it doesn't make sense to leave these variables
here.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Guillaume Abrioux 58758bca28 cephadm-adopt: fix an issue in mgr adoption play
this drops variables in 'hosts' field of mgr adoption play.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Guillaume Abrioux 7909778d0e add CentOS stream 9 support
This adds the resquired changes in order to support
CentOS stream 9.

Also, this bumps the Ansible version support to 2.15

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Teoman ONAY db2f3e42dc cephadm-adopt: Fixes hosts addition to be managed by cephadm
The tasks "manage nodes with cephadm - ipv4/6" are skipped when
cephadm_mgmt_network contains more than one ip network which prevent
cephadm from managing the host.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2023-08-16 18:14:29 +02:00
Teoman ONAY 18cd35bad5 cephadm-adopt: Fixes rbd-mirror regression
779523f86f introduced a regression
related to rbdmirrors tasks. They were executed while
ceph_rbd_mirror_remot_* variables were not set.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2023-08-07 17:21:37 +02:00
Teoman ONAY bc54290718 cephadm-adopt: Add --networks parameter support to ceph orch apply rgw
When radosgw_address_block was defined, it was not taken into account
during rgw adoption process

depends on: https://tracker.ceph.com/issues/62185
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224351

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2023-08-07 17:21:37 +02:00
Guillaume Abrioux 23a8bbc6c5 purge: remove /var/lib/ceph
This directory should be removed when the cluster is purged.
most of the services are started with the `--security-opt label=disable`
option. If the directory is not removed, it can cause SElinux issues
when the cluster is redeployed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2023-05-31 23:07:13 +02:00
Guillaume Abrioux 5cd692dcdc update: fix mgr upgrade issue
for some reason, this task has to be done in 2 steps otherwise it fails.
1/ stop and disable the service
2/ mask it

when done with with a single task, the module says the service has been
stopped while this isn't the case (Ansible systemd module bug?).

it possibly relates to https://github.com/ansible/ansible/issues/68680

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2023-05-31 23:07:13 +02:00
Guillaume Abrioux 896d82877f osd: drop filestore support
filestore objectstore will be gone in the next Ceph release.the
This drops the filestore support in ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2023-05-31 23:07:13 +02:00
Lukas Bezdicka 5622a033a9 Replace ip_version check with ansible test
Instead of checking ip_version variable we should check the input
address for ip version and select code path based on that.

This solves ceph adoption with mixed ipv6 and ipv4 networks.

Resolves: rhbz#2186226
Signed-off-by: Lukas Bezdicka <lbezdick@redhat.com>
2023-04-24 14:21:24 +02:00
Teoman ONAY 49da07df68 shrink-osd fails when the OSD container is stopped
ceph-volume simple scan cannot be executed as it is meant to be
run inside the OSD container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2164414

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2023-03-15 16:00:22 +01:00
Guillaume Abrioux 15b91cef90 osd: drop filestore support
filestore is about to be removed. This commit removes the filestore
support in ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2023-03-03 15:00:29 +01:00
Guillaume Abrioux c3fae04b8d cephadm-adopt: fix rbd-mirror adoption
The recent rbdmirror refactor introduced a regression in the
cephadm-adopt playbook.
Given that the rbd-mirror peer addition is now done by using the monitor
config-key store method during the cluster deployment, we can drop this
play from the cephadm-adopt.yml playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2140569

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-11-14 15:45:00 +01:00
Guillaume Abrioux a158d0d53b switch-to-containers: ignore errors when stopping service
There might be cases where it can break idempotency.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-10-17 14:10:42 +02:00
Guillaume Abrioux 7664da58da switch-to-containers: fix rbd-mirror migration
`--state=enabled` isn't a valid filter so the unit from the packaging
never gets removed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2134917

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-10-15 07:07:47 +02:00
Guillaume Abrioux 371592a8fb common: v18/reef kickoff
align with ceph/ceph/pull/47458 since it has been merged.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-10-07 16:39:56 +02:00
Guillaume Abrioux 82e0ae7e75 rolling_update: fix rbd-mirror play
There's no service to stop/mask when the node being upgraded is
a 'primary node' only (1 way replication).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-08-03 13:09:42 +02:00
Guillaume Abrioux 30c7e88d81 adopt: fix placement update calls for rgw
The commands called here are not built correctly.
This commit fixes it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2058038#c27

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-08-02 12:51:14 +02:00
Guillaume Abrioux 041435e1e3 rbd-mirror: follow up on recent rbd-mirror refactor
- ensure /var/lib/ceph/bootstrap-rbd-mirror exists
- always install ceph-base on rbdmirror nodes (otherwise, ceph-crash
  isn't present)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-08-02 10:35:33 +02:00
Guillaume Abrioux a9cb444be1 purge-dashboard: check for legacy group name 'grafana-server'
When using the legacy group name 'grafana-server', this playbook will run but
won't remove properly all monitoring resources as expected.

Fixes: #7265

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-29 18:21:33 +02:00
Teoman ONAY 64e08f2c0b Refresh /etc/ceph/osd json files content before zapping the disks
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-07-11 09:14:40 +02:00
Guillaume Abrioux dffe7b47de backup-and-restore: use archive/unarchive approach
current approach is too complex and causes too many issues permission
issues.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-07 17:11:19 +02:00
Guillaume Abrioux 047af3a3f6 backup-and-restore: various fixes
- preserve mode and ownership on main directories
- make sure the directories are well present prior to restoring files.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-05 14:35:03 +02:00
Guillaume Abrioux b18a1aa3ca backup-and-restore: fix check on 'target_node' variable
If the user doesn't pass a valid name (present in the inventory)
the playbook will fail like following:

```
fatal: [localhost -> {{ target_node }}]: FAILED! =>
  msg: |-
    The task includes an option with an undefined variable. The error was: "hostvars['10.70.46.40']" is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 09:05:19 +02:00
Guillaume Abrioux 848dd03fa6 backup-and-restore: fix check on 'mode' variable
Typical failure:

```
fatal: [localhost]: FAILED! =>
  msg: |-
    The conditional check 'mode not in ['backup', 'restore']' failed. The error was: error while evaluating conditional (mode not in ['backup', 'restore']): 'mode' is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 08:47:54 +02:00
Guillaume Abrioux 7d848fa19e Revert "upgrade: block upgrade when rgw multisite is active"
This reverts commit 51bc8cb636.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 06:55:31 +02:00
Guillaume Abrioux e28c486e52 backup-and-restore: fix a typo
Typo introduced during initial implementation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 10:51:54 +02:00
Guillaume Abrioux aa68b06c99 ansible: bump to ansible 2.12
Add required changes to support ansible 2.12

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 08:09:10 +02:00
Guillaume Abrioux 41d62596fc cephadm_adopt: set autotune_memory_target_ratio
This adds a task that sets `autotune_memory_target_ratio` depending on the
value of `is_hci`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028693

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-30 14:56:42 +02:00
Francesco Pantano 0e9b3902b0 Add ceph_infra tag to rolling_update
When the upgrade from Ceph 4 to 5 is performed in the OpenStack context,
ceph-ansible triggers the rolling_update playbook, which is supposed to
rollout new Ceph containers.  The ceph-infra role tries to take care
about firewall, ntp config and logrotate; however, TripleO manages them
through tripleo-heat-templates.  This patch just add an additional tag
to skip the ceph-infra role in the OpenStack context.

Closes: https://bugzilla.redhat.com/2090456
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
2022-05-27 15:05:16 +02:00
Guillaume Abrioux 5ab46f836d purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-23 10:37:42 +02:00
Guillaume Abrioux c1649862a9 common: move to `ansible.utils.ipwrap`
ipwrap has moved to ansible.utils

see
db4920ebf6

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux 6e2ebe857d cephadm-adopt: remove legacy directory after adoption
When this directory is left after the osd adoption, it leads to the following error:

```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```

this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.

Note: this doesn't fix the root cause, this is a workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 09:58:14 +02:00