818 Commits (efa91ad8896c5e2fabc8ddf94a95baee1801b425)

Author SHA1 Message Date
Teoman ONAY efa91ad889 cephadm-adopt: Alertmanager placement count missing
Regression from #7576. Alertmanager placement count was missing
after migration to ceph_orch_apply module

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 31be495061)
2024-10-16 16:33:22 +02:00
Teoman ONAY 974c4ec040 cephadm-adopt: custom prometheus port lost after adoption
If a custom Prometheus port was used before adoption, it was not
taken into account and the default port 9095 was set instead. The custom
port is now re-applied.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2242346

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit b41b7bf869)

# Conflicts:
#	infrastructure-playbooks/cephadm-adopt.yml
2024-10-16 16:33:22 +02:00
Teoman ONAY a49580b46b ceph_orch_apply: fix yaml error when multiple rgw deployed
ceph orch ls rgw --format=yaml returns multiple documents
when multiple rgw are installed, this was not handled
correctly.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 823700bc1b)
2024-08-23 23:15:20 +02:00
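The multi-document behaviour described in a49580b46b can be illustrated with a small sketch. This is not the code of the ceph_orch_apply module: the function name and the sample field values are hypothetical, and a real implementation would more likely rely on a YAML parser's multi-document loader. The stdlib-only version below only splits the stream on bare `---` marker lines, which is enough to show why a single-document parse is the wrong assumption.

```python
from __future__ import annotations


def split_yaml_documents(stream: str) -> list[str]:
    """Split a YAML stream into per-document strings on bare '---' lines.

    Simplification: only a '---' marker alone on its line (at column 0)
    is treated as a separator; '--- key: value' forms are not handled.
    """
    documents: list[str] = []
    current: list[str] = []
    for line in stream.splitlines():
        if line.rstrip() == "---":
            # A marker closes the previous document if it had any content.
            if any(part.strip() for part in current):
                documents.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        documents.append("\n".join(current))
    return documents


# Hypothetical output for two RGW services; the values are made up.
sample = """\
service_type: rgw
service_id: zone-a
---
service_type: rgw
service_id: zone-b
"""

docs = split_yaml_documents(sample)
for index, doc in enumerate(docs):
    print(f"document {index}:")
    print(doc)
```

Each returned string can then be parsed and compared on its own, instead of assuming the whole output is one spec.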
Teoman ONAY 5ff72a35ef cephadm-adopt: fix "Update the placement of radosgw hosts" task
networks was at the wrong level in the spec file. Failed with
"got an unexpected keyword argument 'networks'"

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 2c88ecc784)
2024-08-13 12:28:25 +02:00
Teoman ONAY 51f98abcdb cephadm-adopt: fix "Update the placement of radosgw hosts" task
spec file template conditions were incorrect

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit e85060cb67)
2024-08-06 13:31:23 +02:00
Teoman ONAY c114822a81 cephadm-adopt: Fixes binding network for alertmanager
Alertmanager was bound to the default * network instead of grafana_server_addr
as it was before. From now on, if grafana_server_addr is defined, it will be
bound to that network.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 0bf3398774)
2024-08-06 09:18:50 +02:00
Teoman ONAY b80b68bcd4 ceph_orch_spec: Add ceph orch apply spec feature
Add new module ceph_orch_spec which applies ceph spec files.
This feature was needed to bind extra mount points to the RGW
container (/etc/pki/ca-trust/).

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2262133

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit f6fd034e7e)
2024-05-22 13:43:23 +02:00
Guillaume Abrioux eb8904e795 core: bump to ansible 2.15
2.12 is EOL since May 2023.

Let's bump to 2.15.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-03-11 15:48:19 +01:00
Guillaume Abrioux 68ce8e11f8 linter: address syntax errors
This fixes the following error:
```
syntax-check[specific]: The field 'hosts' has an invalid value, which
includes an undefined variable.
```

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-03-11 15:48:19 +01:00
Teoman ONAY 10c7fd72b1 cephadm-adopt: Fixes hosts addition to be managed by cephadm
The tasks "manage nodes with cephadm - ipv4/6" are skipped when
cephadm_mgmt_network contains more than one IP network, which prevents
cephadm from managing the host.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit db2f3e42dc)
2023-08-17 22:56:05 +02:00
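The situation in 10c7fd72b1, where cephadm_mgmt_network holds more than one network, can be sketched as follows. This is a Python analogy and not the playbook's logic: it assumes the networks arrive as a comma-separated string, and the function name and addresses are hypothetical.

```python
import ipaddress


def first_address_in_networks(addresses, networks_csv):
    """Return the first address that belongs to any of the listed networks."""
    networks = [
        ipaddress.ip_network(item.strip())
        for item in networks_csv.split(",")
        if item.strip()
    ]
    for raw in addresses:
        address = ipaddress.ip_address(raw)
        for network in networks:
            # Only compare addresses and networks of the same IP version.
            if address.version == network.version and address in network:
                return raw
    return None


# Hypothetical host addresses and management networks.
host_addresses = ["192.168.1.10", "fd00::10"]
mgmt_networks = "10.0.0.0/24, fd00::/64"

match = first_address_in_networks(host_addresses, mgmt_networks)
print(match)
```

Treating the value as a list from the start means a second network cannot cause the host to be skipped silently.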
Teoman ONAY 0a9a91b662 cephadm-adopt: Fixes rbd-mirror regression
779523f86f introduced a regression
related to rbdmirrors tasks. They were executed while
ceph_rbd_mirror_remote_* variables were not set.

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 18cd35bad5)
2023-08-16 10:36:34 +02:00
Teoman ONAY 0b2826132e cephadm-adopt: Add --networks parameter support to ceph orch apply rgw
When radosgw_address_block was defined, it was not taken into account
during rgw adoption process

depends on: https://tracker.ceph.com/issues/62185
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224351

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit bc54290718)
2023-08-16 10:36:34 +02:00
Lukas Bezdicka d5e55516f4 Replace ip_version check with ansible test
Instead of checking ip_version variable we should check the input
address for ip version and select code path based on that.

This solves ceph adoption with mixed ipv6 and ipv4 networks.

Resolves: rhbz#2186226
Signed-off-by: Lukas Bezdicka <lbezdick@redhat.com>
(cherry picked from commit 5622a033a9)
2023-05-02 14:56:33 +02:00
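The idea in d5e55516f4, deriving the IP version from the input address rather than from a global ip_version setting, can be sketched in Python with the standard `ipaddress` module. The commit itself uses an Ansible test; the helper name and sample values here are hypothetical.

```python
import ipaddress


def group_by_ip_version(values):
    """Group addresses or CIDR strings by the IP version each one encodes."""
    grouped = {4: [], 6: []}
    for value in values:
        # ip_interface accepts both plain addresses and address/prefix forms.
        version = ipaddress.ip_interface(value).version
        grouped[version].append(value)
    return grouped


# Hypothetical mixed IPv4/IPv6 input.
mixed = ["192.168.1.0/24", "fd00::/64", "10.0.0.5"]
grouped = group_by_ip_version(mixed)
print(grouped)
```

Because the version is read per value, a mixed IPv4 and IPv6 deployment selects the right code path for each address.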
Guillaume Abrioux db3b8c271e cephadm-adopt: fix rbd-mirror adoption
The recent rbdmirror refactor introduced a regression in the
cephadm-adopt playbook.
Given that the rbd-mirror peer addition is now done by using the monitor
config-key store method during the cluster deployment, we can drop this
play from the cephadm-adopt.yml playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2140569

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c3fae04b8d)
2022-12-15 15:45:20 +01:00
Guillaume Abrioux 0004891aad switch-to-containers: ignore errors when stopping service
There might be cases where it can break idempotency.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a158d0d53b)
2022-10-19 16:16:51 +02:00
Guillaume Abrioux b98d8b9535 switch-to-containers: fix rbd-mirror migration
`--state=enabled` isn't a valid filter so the unit from the packaging
never gets removed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2134917

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7664da58da)
2022-10-17 10:52:03 +02:00
Guillaume Abrioux 2699a484a2 rolling_update: fix rbd-mirror play
There's no service to stop/mask when the node being upgraded is
a 'primary node' only (1 way replication).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 82e0ae7e75)
2022-08-03 19:59:10 +02:00
Guillaume Abrioux 77574fbd05 adopt: fix placement update calls for rgw
The commands called here are not built correctly.
This commit fixes it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2058038#c27

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 30c7e88d81)
2022-08-03 09:53:00 +02:00
Guillaume Abrioux 78f85e6e84 rbd-mirror: follow up on recent rbd-mirror refactor
- ensure /var/lib/ceph/bootstrap-rbd-mirror exists
- always install ceph-base on rbdmirror nodes (otherwise, ceph-crash
  isn't present)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 041435e1e3)
2022-08-03 06:44:51 +02:00
Guillaume Abrioux 712b3c4e29 purge-dashboard: check for legacy group name 'grafana-server'
When using the legacy group name 'grafana-server', this playbook will run but
won't remove properly all monitoring resources as expected.

Fixes: #7265

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a9cb444be1)
2022-08-01 20:32:25 +02:00
Teoman ONAY af0624150d Refresh /etc/ceph/osd json files content before zapping the disks
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 64e08f2c0b)
2022-07-11 13:43:27 +02:00
Guillaume Abrioux 7b531514ce backup-and-restore: use archive/unarchive approach
The current approach is too complex and causes too many permission
issues.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dffe7b47de)
2022-07-07 17:16:31 +02:00
Guillaume Abrioux 42bd198a91 backup-and-restore: various fixes
- preserve mode and ownership on main directories
- make sure the directories are well present prior to restoring files.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 047af3a3f6)
2022-07-05 14:45:56 +02:00
Guillaume Abrioux 800da79617 Revert "upgrade: block upgrade when rgw multisite is active"
This reverts commit 51bc8cb636.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7d848fa19e)
2022-07-03 07:28:01 +02:00
Guillaume Abrioux 220a4a1369 backup-and-restore: fix check on 'target_node' variable
If the user doesn't pass a valid name (present in the inventory)
the playbook will fail as follows:

```
fatal: [localhost -> {{ target_node }}]: FAILED! =>
  msg: |-
    The task includes an option with an undefined variable. The error was: "hostvars['10.70.46.40']" is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b18a1aa3ca)
2022-06-29 09:09:06 +02:00
Guillaume Abrioux 9965bf6cd6 backup-and-restore: fix check on 'mode' variable
Typical failure:

```
fatal: [localhost]: FAILED! =>
  msg: |-
    The conditional check 'mode not in ['backup', 'restore']' failed. The error was: error while evaluating conditional (mode not in ['backup', 'restore']): 'mode' is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 848dd03fa6)
2022-06-29 08:52:27 +02:00
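The failure quoted in 9965bf6cd6 comes from a membership test evaluated against an undefined variable. A minimal Python analogy of the safer ordering, checking that the value is defined before checking that it is allowed, might look like this. The function and the `extra_vars` dictionary are hypothetical stand-ins for the playbook's variables.

```python
ALLOWED_MODES = ("backup", "restore")


def validate_mode(extra_vars):
    """Return the requested mode, or raise a clear error if missing/invalid."""
    mode = extra_vars.get("mode")
    if mode is None:
        # Handle the undefined case first so the user gets a readable message.
        raise ValueError("'mode' must be set to one of: backup, restore")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"invalid mode {mode!r}; expected backup or restore")
    return mode


print(validate_mode({"mode": "backup"}))
```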
Guillaume Abrioux 6615d97015 backup-and-restore: fix a typo
Typo introduced during initial implementation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e28c486e52)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux fd2279e75d cephadm_adopt: set autotune_memory_target_ratio
This adds a task that sets `autotune_memory_target_ratio` depending on the
value of `is_hci`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028693

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 41d62596fc)
2022-06-22 07:17:32 +02:00
Francesco Pantano abfb5385c1 Add ceph_infra tag to rolling_update
When the upgrade from Ceph 4 to 5 is performed in the OpenStack context,
ceph-ansible triggers the rolling_update playbook, which is supposed to
roll out new Ceph containers. The ceph-infra role tries to take care
of firewall, ntp config and logrotate; however, TripleO manages them
through tripleo-heat-templates. This patch just adds an additional tag
to skip the ceph-infra role in the OpenStack context.

Closes: https://bugzilla.redhat.com/2090456
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 0e9b3902b0)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux 4dd57379bb purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ab46f836d)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux aaf3dff30c cephadm-adopt: remove legacy directory after adoption
When this directory is left after the osd adoption, it leads to the following error:

```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```

this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.

Note: this doesn't fix the root cause, this is a workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6e2ebe857d)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux 57fb213f29 contrib: add a playbook
This playbook can back up or restore some ceph files.
(/etc/ceph, /var/lib/ceph, ...)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed0bba4d77)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux 63b3dadfbf common: move to `ansible.utils.ipwrap`
ipwrap has moved to ansible.utils

see
db4920ebf6

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c1649862a9)
2022-06-22 07:17:32 +02:00
Guillaume Abrioux ef4991910d ansible: bump to ansible 2.12
Add required changes to support ansible 2.12

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aa68b06c99)
2022-06-22 07:17:32 +02:00
Teoman ONAY f851d3232c Using another user than root for cephadm ssh connections fails
Fixes commit da42f3d139

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2048734

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-20 12:51:16 +01:00
Guillaume Abrioux 51bc8cb636 upgrade: block upgrade when rgw multisite is active
With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-20 08:13:38 +01:00
Guillaume Abrioux 266b6e739c adopt: fix node labelling
When using groups of groups, the playbook applies undesired
labels to nodes.
This commit fixes it by applying only the expected labels.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-03 15:52:00 +01:00
Teoman ONAY f8c6bba657 Add cluster custom name support
When using cluster custom names, cephadm commands are executed using
the default admin keyring name which fails.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-03 15:52:00 +01:00
Teoman ONAY da42f3d139 Enable user to change the account used for ssh connection
By default, cephadm uses the root account to connect remotely
to other nodes in the cluster. This change allows choosing
another account.
This commit also allows using a dedicated subnet for cephadm mgmt.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-03 15:52:00 +01:00
Guillaume Abrioux 2f11982590 purge: ceph-crash purge fixes
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because in Ceph
Nautilus, ceph-crash doesn't handle `SIGTERM` signal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-02 10:08:35 +01:00
Guillaume Abrioux f08129edf2 switch2containers: fail if less than 3 monitors
This playbook doesn't support fewer than 3 monitors present in the inventory.
Just like the rolling_update playbook, let's fail if fewer than
3 monitors are present.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2049132

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-02-21 21:07:27 +01:00
Guillaume Abrioux 94e51d5c14 adopt: fix rbd-mirror adoption
We can't use `{{ cephadm_cmd }}` here because the monitors aren't yet adopted.
We must use `{{ ceph_cmd }}` instead.
This also fixes some filters `| default()` (they must be moved before `| from_json()`)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-02-09 20:58:27 +01:00
Guillaume Abrioux f30767432b adopt: fix bug in mon_ip_list set_fact
`default('{}')` must be before `| from_json`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-02-09 11:32:00 +01:00
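The ordering rule from f30767432b (apply the default to the raw text, then parse it) can be shown with a Python analogy. This is not the playbook's Jinja2 expression: `None` stands in for an undefined variable, and both function names are hypothetical.

```python
import json


def default_then_parse(raw):
    """Apply the fallback to the raw text first, then parse it."""
    text = raw if raw is not None else "{}"
    return json.loads(text)


def parse_then_default(raw):
    """Parse first: the fallback is never reached when raw is missing."""
    parsed = json.loads(raw)  # raises TypeError when raw is None
    return parsed if parsed is not None else {}


print(default_then_parse(None))
print(default_then_parse('{"mons": ["a", "b"]}'))

try:
    parse_then_default(None)
except TypeError as error:
    print(f"parse-first ordering failed: {error}")
```

The parser runs before the fallback in the second function, so a missing value fails inside the parse step and the default never applies.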
Guillaume Abrioux ddae06e1a2 adopt: check for POOL_APP_NOT_ENABLED warning
This commit makes the cephadm-adopt playbook fail if the cluster
has the `POOL_APP_NOT_ENABLED` warning raised.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2040243

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-02-09 11:32:00 +01:00
jowsiewski 1dfd195c7e Remove the remaining packages
Signed-off-by: jowsiewski <owsiewski@gmail.com>
2022-02-04 10:00:44 +01:00
Francesco Pantano 12dd8b5df1 Add with_pkg tag on package related tasks
In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO skipping
both the repositories and packages installation.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
2022-02-01 16:04:10 +01:00
Guillaume Abrioux 7f517cdd22 adopt: create nfs exports at the user level
The current implementation is wrong.
ceph-ansible lists all existing buckets and tries to create
an export for each of them.
Instead, it's easier to create the export at the user level.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-28 15:16:30 +01:00
Dmitriy Rabotyagov 2eb0a88a67 Use upstream config_template collection
To reduce the need for internal maintenance of the module
and to join forces on plugin development,
it is proposed to switch to the upstream version of the
config_template module.

As it is shipped as a collection, its installation for end users
is trivial and aligns with the general approach of shipping extra modules.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
2022-01-18 20:22:10 +01:00
Guillaume Abrioux aee1f06497 cephadm-adopt: use named args in rgw export creation
In order to avoid breaking changes, let's use named argument
instead of positional argument syntax in the command line
used to create rgw export.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-06 15:56:07 +01:00
Guillaume Abrioux 817c03bc0e update: speed up client play
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-15 08:42:23 +01:00