Commit Graph

5742 Commits (stable-5.0)
 

Author SHA1 Message Date
Teoman ONAY 5d1a7d60b3 Collocated mgr with mon fails to start on RHEL 8.7
With podman version podman-3:4.2.0-4.module+el8.7.0+17064+3b31f55c and
later, when mgr fails to start if mon is already running.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2169767

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 637ca81c9c)
(cherry picked from commit be184e440f)
2023-02-21 04:59:51 -05:00
Teoman ONAY 46df5fe9ab use quay.io instead of quay.ceph.io
Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-12-19 16:15:56 +01:00
Teoman ONAY 1e6256ee0f use quay.io instead of quay.ceph.io
Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-12-19 11:01:09 +01:00
Guillaume Abrioux 1b49988e4a do not use dev repo
The branch 'master' of ceph/ceph has been renamed to 'main'

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-08-03 21:26:22 +02:00
Teoman ONAY 18064032a9 Refresh /etc/ceph/osd json files content before zapping the disks
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 64e08f2c0b)
2022-07-11 14:20:21 +02:00
Guillaume Abrioux fd8aca866d facts: fix set_radosgw_address.yml
use `include_tasks` instead of `import_tasks`.
Given that with `import_tasks` statements are preprocessed
and the tasks that defines it hasn't been run yet, it will fail
and complain like following:

```
The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_interface'
```

Using `include_tasks` instead fixes this.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 434793e2fe)
2022-07-06 03:52:54 +02:00
Guillaume Abrioux 07e6762abf facts: fix deployments with different net interface names
Deployments when radosgws don't have the same names for
network interface.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2095605

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f6b49f78a9)
2022-07-06 03:52:54 +02:00
Guillaume Abrioux a382236da0 tests: add yes_i_know=true in tox-shrink_osd.ini
main branch requires it. Otherwise the playbook won't run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e223630cf0)
2022-07-05 10:32:54 +02:00
Guillaume Abrioux fc766e160e tests: drop shrink_osd from tox.ini
shrink_osd has its own tox config file (tox-shrink_osd.ini)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6623f34679)
2022-07-05 10:32:54 +02:00
Seena Fallah 32eeaedcff ceph_pool: set target size ratio on both 'on' and 'warn' mode
when we set target_size_ratio to warn it means that the administrator wants to get suggestion from the mgr module but apply it manually when he/she wants. So it's in the same approach as 'on' mode just triggered by hand.
So there is no need to set pg_num when target_size_ratio is 'warn' and the mgr module will calculate the correct pg_num and the administrator will adjust it whenever he/she wants.

It is the same approach that was in #6471

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit bb849a5586)
2022-06-02 09:13:22 +02:00
Guillaume Abrioux 6c4bb2204a purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ab46f836d)
2022-05-30 15:16:07 +02:00
Guillaume Abrioux 4d2855414e facts: follow up on aa0cc93
when these variables are defined in the inventory host file,
all tasks are skipped then because the node being played isn't
aware about the values from the rgw nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 328bd7c975)
2022-04-21 13:38:23 +02:00
Guillaume Abrioux 1dcc072978 facts: fix mgr/mon collocation
`service dump` hangs when no active mgr is available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-21 08:48:47 +02:00
Guillaume Abrioux af5b3f51cc dashboard: fix regression
introduced by ceph/ceph-ansible/pull/7150

when no rgw is present, it fails.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2076192

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1a56fd6a21)
2022-04-21 08:48:47 +02:00
Guillaume Abrioux f224782326 dashboard: support --limit execution with rgw
When the following conditions are met:

- rgw is deployed,
- dashboard is deployed,
- playbook is called with --limit,
- a node being processed is collocated on either a mon or mgr.

The playbook fails because `rgw_instances` is undefined.
The idea here is to make sure this variable is always defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aa0cc9381d)
2022-04-14 10:38:34 +02:00
Guillaume Abrioux 86ac9a8c48 dashboard: allow collecting stats from the host
This commit makes podman bindmount `/:/rootfs:ro` so the container can
collect data from the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028775

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0f34cd16d8)
2022-04-14 00:37:38 +02:00
Mathias Chapelain 36248edc1f library/ceph_pool: Fix potential null value when creating pools
Before, creating a pool by providing *only* `pg_num` would result in an
error as it would produce `--pgp-num null`.

This commit fix this behavior by defaulting `pgp_num` value to `pg_num`.

Signed-off-by: Mathias Chapelain <mathias.chapelain@proton.ch>
(cherry picked from commit f0f1dd986a)
2022-04-14 00:35:13 +02:00
insatomcat df8674a1c5 do not update Debian cache when package-install is disabled
When deploying with --skip-tags=package-install (when there is no access to a repository), the playbook is still trying to update the package cache, which makes the playbook fail.
This change prevents the playbook to try to update the cache when the package-install tag is skipped.

Signed-off-by: Florent CARLI <florent.carli@rte-france.com>
(cherry picked from commit 63f20f5941)
2022-04-04 13:49:19 +02:00
Guillaume Abrioux 3dd918db23 dashboard: always set `dashboard_server_addr`
When running the playbook with `--limit`, if the play targeted doesn't match
hosts present in the mgr group the playbook can fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 72e4654aae)
2022-03-26 21:57:20 +01:00
Guillaume Abrioux c0e6b64379 tests: update the system before deploying
Having a system up-to-date is usually a good idea.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3e87df5e8f)
(cherry picked from commit 5a1e8620bc)
2022-03-21 07:02:01 +01:00
Teoman ONAY de447d168e Turn off SELinux separation for containers MON and RGW
Initially MONs and RGW binded /etc/pki/ca-trust/extracted using the :z flag
(introduced to solve an OSP TripleO issue on RHEL - #3638) but using
this flag prevents local services (like sssd) running on the host from accessing
the certificates/files in that folder.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 7e8ce2567e)
2022-03-10 16:17:35 +01:00
Guillaume Abrioux fdf201686e purge: ceph-crash purge fixes
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because in Ceph
Nautilus, ceph-crash doesn't handle `SIGTERM` signal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f11982590)
2022-03-04 12:51:51 +01:00
Guillaume Abrioux 5618405b60 adopt: fix node labelling
When using group of group, the playbook will apply undesired
labels on nodes.
This commit fixes it by applying only the expected labels.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 266b6e739c)
2022-03-04 12:51:14 +01:00
Teoman ONAY be241058d4 Add cluster custom name support
When using cluster custom names, cephadm commands are executed using
the default admin keyring name which fails.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit f8c6bba657)
2022-03-04 12:51:14 +01:00
Teoman ONAY 10a5e54f8f Enable user to change the account used for ssh connection
By default cephadm uses root account to connect remotely
to other nodes in the cluster. This change allows to choose
another account.
This commit also allows to use a dedicated subnet for cephadm mgmt.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit da42f3d139)
2022-03-04 12:51:14 +01:00
Guillaume Abrioux d787836350 switch2containers: fail if less than 3 monitors
This playbook doesn't support less than 3 monitors present in the inventory.
Just like the rolling_update playbook, let's fail if less than
3 monitors are present.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2049132

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f08129edf2)
2022-02-22 09:23:54 +01:00
Benoît Knecht 4487e41a1e ceph-facts: Fix get_def_crush_rule_name.yml in check mode
This construct doesn't work as intended since ansible/ansible#74212:

```
item.stdout | default('{}') | from_json
```

That PR made the `command` module return `stdout` even in check mode (setting
it to the empty string), so `default()` has no effect in that case and
`from_json()` fails to parse an empty string.

Instead, `default()` needs to be invoked with its second argument set to
`True`, so that it replaces any `False` value (such as an empty string) with
its first argument:

```
item.stdout | default('{}', True) | from_json
```

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 7684d892c0)
2022-02-16 09:50:53 +01:00
Benoît Knecht 3ba0e4bdca ceph-osd: Fix crush_rules.yml in check mode
Set a default value for `item.stdout` before passing it to `from_json()`. The
`when` condition doesn't prevent this template from being evaluated in check
mode, so it fails if `item.stdout` doesn't contain a valid JSON string.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit ef05e9a313)
2022-02-16 09:50:53 +01:00
Benoît Knecht 9df27fc5c5 ceph-osd: Fix start_osds.yml in check mode
This construct doesn't work as intended since ansible/ansible#74212:

```
ceph_osd_ids.stdout | default('{}') | from_json
```

That PR made the `command` module return `stdout` even in check mode (setting
it to the empty string), so `default()` has no effect in that case and
`from_json()` fails to parse an empty string.

Instead, `default()` needs to be invoked with its second argument set to
`True`, so that it replaces any `False` value (such as an empty string) with
its first argument:

```
ceph_osd_ids.stdout | default('{}', True) | from_json
```

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 0b3a608216)
2022-02-16 09:50:53 +01:00
Francesco Pantano 0431746d32 Add with_pkg tag on package related tasks
In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO skipping
both the repositories and packages installation.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 12dd8b5df1)
2022-02-15 18:18:22 +01:00
John Karasev 2e2d23c79e ceph-grafana: Add proxy env vars to grafana service template
When installing grafana plugins, the container will make http requests.
This requires http proxy otherwise installation cannot be performed. Passed
the proxy vars from all.yml as env args.
Fixes: ceph#6484, ceph#6481

Signed-off-by: John Karasev <john.karasev@intel.com>
(cherry picked from commit 79ca442d53)
2022-02-09 11:35:17 +01:00
Dmitriy Rabotyagov f948b08934 Fix rich version for ansible-lint
Ansible-lint prior to v5.3.1 has issue with reach version >=11.0.0.
In order to cherry-pick fix to stable branches we fix rich version.

This should be reverted with ansible-lint version bump.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
(cherry picked from commit 583e60af84)
2022-02-07 13:53:47 +01:00
jowsiewski 139289bdcd Remove the remaining packages
Signed-off-by: jowsiewski <owsiewski@gmail.com>
(cherry picked from commit 1dfd195c7e)
2022-02-07 13:53:47 +01:00
Guillaume Abrioux df8fe6335f tests: use centos stream-8 instead of centos 8
CentOS 8 is EOL as of December 2021.
Let's use CentOS stream 8 instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bc36f60e8d)
2022-02-04 10:49:45 +01:00
Danny Webb 000e93f608 make grafana network a configurable option
Signed-off-by: Danny Webb <danny.webb@thehutgroup.com>
(cherry picked from commit 189ff93372)
2022-01-19 10:08:28 +01:00
Guillaume Abrioux 555d0c8037 purge: remove ceph directories on client nodes
Otherwise any ceph directories are left over on client nodes
after the purge.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 20035852a4)
2022-01-06 10:33:47 +01:00
Guillaume Abrioux e083d9f62a container: align systemd units with rpm
Update `After=` and `Wants=` parameters in container systemd units
and make them be aligned with the systemd units that come
from the packaging.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f01536ea19)
2021-12-15 13:49:49 +01:00
Guillaume Abrioux 87b24a7e3b update: speed up client play
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 817c03bc0e)
2021-12-15 13:49:21 +01:00
Guillaume Abrioux feffbba9d4 cephadm-adopt: ensure /etc/ceph is present on monitoring node
When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ece59b41d)
2021-12-07 22:56:51 +01:00
Guillaume Abrioux 8f26939da4 cephadm-adopt: bindmount /var/lib/ceph with 'ro'
When collocating osds with iscsigw daemons, cephadm bindmounts the
following:

```
-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config
```

this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z`

since 'ro' is enough in this playbook, let's replace the ':z' option on
this bindmount with ':ro'

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c4fdf956bd)
2021-12-02 08:52:05 +01:00
Guillaume Abrioux f2eab356d6 ceph_volume: support overriding bind-mounts
This makes it possible to call `podman run` with custom bind-mounts.

cephadm-adopt.yml playbook needs it for a very specific use case:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b02d71c307)

# Conflicts:
#	library/ceph_volume.py
2021-12-02 08:52:05 +01:00
Guillaume Abrioux 9423ec3eb6 adopt: fix ceph_origin and ceph_repository defaults
This is overriding those variables because the precedence at the 'block
var' level is greater than the group_vars/host_vars.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5ea2ece99)
2021-11-30 10:57:34 +01:00
Guillaume Abrioux 53dc75d29c validate: fix bug when using vault
since a variable encrypted with vault is no longer a string but a
encrypted object we can't use the filter | length, we have to convert it
to a string before.

Fixes: #6991

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6ad7e52869)
2021-11-29 13:42:24 +01:00
Guillaume Abrioux efc93f5669 cephadm: support adding hosts with ipv6
The current implementation doesn't support adding hosts when using ipv6
addresses.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4f2c2af9b4)
2021-11-08 10:36:27 +01:00
Guillaume Abrioux d06c856fca cephadm: use public_network when adding hosts
When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f34531304)
2021-11-08 10:36:27 +01:00
Guillaume Abrioux 5f7ad182f9 update: move a set_fact
ceph-facts roles makes decisions based on the fact `rolling_update` so
it must be called before we run this role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5edcc4214)
2021-11-03 11:50:38 +01:00
Guillaume Abrioux e63df909af update: support --limit on monitor nodes
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:

```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 82eee4303b)
2021-11-03 08:48:51 +01:00
Guillaume Abrioux 9526425111 rolling_update: modify default health_osd_check_*
let's do more retries with a shorter delay.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 50a21d695e)
2021-10-25 20:38:09 +02:00
Guillaume Abrioux 0d1c0c2813 rolling_update: fix pre and post osd upgrade play
when using --limit osds, the play before and after osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`
using `hosts: "{{ osds_group_name | default('osds') }}" with
`delegate_to` to the first monitor addresses this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc9f87c45f)
2021-10-25 20:15:17 +02:00
Guillaume Abrioux 120ed2b7f3 tests: add new scenario subset_update
new scenario in order to test the subset upgrade approach using tags.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb8a66149b)
2021-10-25 20:15:17 +02:00