Commit Graph

5842 Commits (12c700c1b02ea2574b0211bd016df64a05b47a13)
 

Author SHA1 Message Date
Guillaume Abrioux 12c700c1b0 common: fix a typo
s/of/or ..

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2099828#c25

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2e823b117e)
(cherry picked from commit c36bac3903)
2022-07-08 08:27:44 +02:00
Guillaume Abrioux 2541a7e067 purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ab46f836d)
2022-05-23 10:44:48 +02:00
Guillaume Abrioux a9ca4ea2b9 cephadm-adopt: remove legacy directory after adoption
When this directory is left after the osd adoption, it leads to the following error:

```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```

this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.

Note: this doesn't fix the root cause, this is a workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 364fe36e92)
2022-05-11 16:27:07 +02:00
Guillaume Abrioux 4339875f29 common: config rhcs tools repo on all nodes
Otherwise `cephadm` can't be installed during cephadm-adopt.yml
playbook execution.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2073480

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 380382a3c1)
2022-04-28 10:51:50 +02:00
Guillaume Abrioux 6766b09acd validate: drop a check
Since the ISO install method removal, ceph-ansible isn't able
to detect wheter the user is deploying in a 'disconnected environment'.
By the way, given that ceph-ansible is available only for upgrading to RHCS 5,
this check doesn't make sense anymore, let's drop it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2062147

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1cd1fa0560)
(cherry picked from commit 0d6763d4ef)
2022-04-07 16:56:33 +02:00
Teoman ONAY 7e6d9511e5 Using another user than root for cephadm ssh connections fails
Fixes commit da42f3d139

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2048734

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit f851d3232c)
(cherry picked from commit 274a780237)
2022-03-21 11:40:04 +01:00
Guillaume Abrioux 2b3a9882d2 upgrade: block upgrade when rgw multisite is active
With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 51bc8cb636)
(cherry picked from commit f7b7ba30d9)
2022-03-21 11:39:58 +01:00
Guillaume Abrioux 92901d313b update: allow qe testing with rgw_multisite
This allows QE testing with rgw multisite enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-16 12:59:28 +01:00
Guillaume Abrioux 880e6cd0b8 upgrade: block upgrade when rgw multisite is deployed
See BZ for details.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-15 11:19:45 +01:00
Guillaume Abrioux 0138d63583 Revert "rolling_update: block upgrade for RHCS deployments"
This reverts commit 686de20756.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-15 11:19:45 +01:00
Guillaume Abrioux 148b1bad92 Revert "update: allow qe testing"
This reverts commit 445acc99f7.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-15 11:19:44 +01:00
Guillaume Abrioux 7509242c14 adopt: fix node labelling
When using group of group, the playbook will apply undesired
labels on nodes.
This commit fixes it by applying only the expected labels.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 266b6e739c)
(cherry picked from commit bcab0d7a55)
2022-03-03 17:03:09 +01:00
Teoman ONAY 5ffb0fb1a6 Add cluster custom name support
When using cluster custom names, cephadm commands are executed using
the default admin keyring name which fails.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit f8c6bba657)
(cherry picked from commit 839ad5927d)
2022-03-03 17:03:09 +01:00
Teoman ONAY 11677d6177 Enable user to change the account used for ssh connection
By default cephadm uses root account to connect remotely
to other nodes in the cluster. This change allows to choose
another account.
This commit also allows to use a dedicated subnet for cephadm mgmt.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit da42f3d139)
(cherry picked from commit c3ce6fc41a)
2022-03-03 17:03:09 +01:00
Guillaume Abrioux 445acc99f7 update: allow qe testing
Enable QE to test upgrade from 4.x to 5.1.

pass `-e qe_testing=true` to get around the upgrade blocking.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-03 15:48:57 +01:00
Guillaume Abrioux 686de20756 rolling_update: block upgrade for RHCS deployments
Specific to RHCS deployments. See BZ for details.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2052614

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-02-17 09:57:42 +01:00
Guillaume Abrioux 314ba6e3e9 adopt: fix rbd-mirror adoption
We can't use `{{ cephadm_cmd }}` here because the monitors aren't yet adopted.
We must use `{{ ceph_cmd }}` instead.
This also fixes some filters `| default()` (they must be moved before `| from_json()`)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 94e51d5c14)
2022-02-10 08:49:43 +01:00
Guillaume Abrioux 371c25f0ef adopt: fix bug in mon_ip_list set_fact
`default('{}')` must be before `| from_json`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f30767432b)
2022-02-09 12:40:09 +01:00
Guillaume Abrioux cb197575dd adopt: check for POOL_APP_NOT_ENABLED warning
This commit makes the cephadm-adopt playbook fail if the cluster
has the `POOL_APP_NOT_ENABLED` warning raised.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2040243

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddae06e1a2)
2022-02-09 12:40:09 +01:00
John Karasev 9d7e3bf2bb ceph-grafana: Add proxy env vars to grafana service template
When installing grafana plugins, the container will make http requests.
This requires http proxy otherwise installation cannot be performed. Passed
the proxy vars from all.yml as env args.
Fixes: ceph#6484, ceph#6481

Signed-off-by: John Karasev <john.karasev@intel.com>
(cherry picked from commit 79ca442d53)
2022-02-09 11:34:57 +01:00
jowsiewski 3c38b1e410 Remove the remaining packages
Signed-off-by: jowsiewski <owsiewski@gmail.com>
(cherry picked from commit 1dfd195c7e)
2022-02-04 11:14:38 +01:00
Francesco Pantano 8f15179d57 Add with_pkg tag on package related tasks
In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO skipping
both the repositories and packages installation.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 12dd8b5df1)
2022-02-04 09:52:07 +01:00
Guillaume Abrioux 9916c2d752 tests: use centos stream-8 instead of centos 8
CentOS 8 is EOL as of December 2021.
Let's use CentOS stream 8 instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bc36f60e8d)
2022-01-31 18:35:08 +01:00
Guillaume Abrioux f7389ecf25 nfs-ganesha: fix debian based OS deployments
Let's use ppa repositories in order to deploy nfs-ganesha on Debian based OS.

Fixes: #7031

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c491e67486)
2022-01-31 09:39:26 +01:00
Guillaume Abrioux fa281c7538 adopt: create nfs exports at the user level
The current implementation is wrong.
ceph-ansible lists all existing buckets and try to create
an export for each of them.
Instead, it's easier to create the export at the user level.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7f517cdd22)
2022-01-29 15:25:46 +01:00
Dmitriy Rabotyagov 07fe7dd03d Fix rich version for ansible-lint
Ansible-lint prior to v5.3.1 has issue with reach version >=11.0.0.
In order to cherry-pick fix to stable branches we fix rich version.

This should be reverted with ansible-lint version bump.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
(cherry picked from commit 583e60af84)
2022-01-29 08:47:15 +01:00
Danny Webb 29763efdc3 make grafana network a configurable option
Signed-off-by: Danny Webb <danny.webb@thehutgroup.com>
(cherry picked from commit 189ff93372)
2022-01-19 10:08:05 +01:00
Guillaume Abrioux 17d8351971 cephadm-adopt: use named args in rgw export creation
In order to avoid breaking changes, let's use named argument
instead of positional argument syntax in the command line
used to create rgw export.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aee1f06497)
2022-01-06 16:52:05 +01:00
Guillaume Abrioux e676502c8f purge: remove ceph directories on client nodes
Otherwise any ceph directories are left over on client nodes
after the purge.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 20035852a4)
2022-01-06 10:33:31 +01:00
Guillaume Abrioux 9f04949ba0 container: align systemd units with rpm
Update `After=` and `Wants=` parameters in container systemd units
and make them be aligned with the systemd units that come
from the packaging.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f01536ea19)
2021-12-15 13:49:37 +01:00
Guillaume Abrioux 7791fac222 update: speed up client play
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 817c03bc0e)
2021-12-15 13:48:14 +01:00
Guillaume Abrioux 993e4822c3 common: remove legacy repositories
As of rhceph-5, those repositories don't longer exist.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2032790

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dc8940fe1c)
2021-12-15 13:36:02 +01:00
Guillaume Abrioux 8a32576d20 cephadm-adopt: ensure /etc/ceph is present on monitoring node
When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ece59b41d)
2021-12-07 23:09:42 +01:00
Guillaume Abrioux b16d9fc289 cephadm-adopt: bindmount /var/lib/ceph with 'ro'
When collocating osds with iscsigw daemons, cephadm bindmounts the
following:

```
-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config
```

this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z`

since 'ro' is enough in this playbook, let's replace the ':z' option on
this bindmount with ':ro'

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c4fdf956bd)
2021-11-30 21:04:31 +01:00
Guillaume Abrioux e3b2514e2b ceph_volume: support overriding bind-mounts
This makes it possible to call `podman run` with custom bind-mounts.

cephadm-adopt.yml playbook needs it for a very specific use case:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b02d71c307)
2021-11-30 21:04:31 +01:00
Guillaume Abrioux 1628347253 adopt: fix ceph_origin and ceph_repository defaults
This is overriding those variables because the precedence at the 'block
var' level is greater than the group_vars/host_vars.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5ea2ece99)
2021-11-30 13:02:24 +01:00
Guillaume Abrioux ef9921c215 validate: fix bug when using vault
since a variable encrypted with vault is no longer a string but a
encrypted object we can't use the filter | length, we have to convert it
to a string before.

Fixes: #6991

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6ad7e52869)
2021-11-29 13:40:59 +01:00
Guillaume Abrioux 6bdaa9e3d5 cephadm: support adding hosts with ipv6
The current implementation doesn't support adding hosts when using ipv6
addresses.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4f2c2af9b4)
2021-11-08 10:36:14 +01:00
Guillaume Abrioux 0097cb09f1 cephadm: use public_network when adding hosts
When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f34531304)
2021-11-08 10:36:14 +01:00
Dimitri Savineau 041e8b0eaa cephadm-adopt: remove logrotate configuration
cephadm uses its own logrotate configuration file so ceph-ansible needs
to remove that custom file during the cephadm-adopt playbook.

Closes: #6944

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c41241244e)
2021-11-03 11:51:03 +01:00
Guillaume Abrioux 19dadc98da update: move a set_fact
ceph-facts roles makes decisions based on the fact `rolling_update` so
it must be called before we run this role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5edcc4214)
2021-11-03 11:50:27 +01:00
Guillaume Abrioux 8f648269ec update: support --limit on monitor nodes
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:

```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 82eee4303b)
2021-11-03 08:48:38 +01:00
Guillaume Abrioux a752edbd29 Revert "update: block upgrade when nfs+rgw is deployed"
This reverts commit 93f1765259.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2017508

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 08:13:05 +02:00
Guillaume Abrioux f7d67f7669 rolling_update: modify default health_osd_check_*
let's do more retries with a shorter delay.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 50a21d695e)
2021-10-25 21:08:44 +02:00
Guillaume Abrioux e5ef104c57 adopt: fix rbd mirror adoption
The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore.
It means the keyrings and ceph config file aren't available after the
migration.
The idea here is to remove the current rbd mirror peer and add it back
to the mon config store so we aren't bound to the /etc/ceph directory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9c794aa9bc)
2021-10-25 20:14:07 +02:00
Guillaume Abrioux b1bdb708d0 adopt: use mgr/nfs volume
use the mgr 'nfs' module to recreate nfs exports.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4257410dcd)
2021-10-25 17:16:15 +02:00
Guillaume Abrioux efc6979db5 rolling_update: fix pre and post osd upgrade play
when using --limit osds, the play before and after osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`
using `hosts: "{{ osds_group_name | default('osds') }}" with
`delegate_to` to the first monitor addresses this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc9f87c45f)
2021-10-25 15:33:18 +02:00
Guillaume Abrioux e29defef7d tests: add new scenario subset_update
new scenario in order to test the subset upgrade approach using tags.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb8a66149b)
2021-10-25 15:33:18 +02:00
Guillaume Abrioux ca25ebb323 update: support upgrading a subset of nodes
It can be useful in a large cluster deployment to split the upgrade and
only upgrade a group of nodes at a time.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5cf9db2b0)
2021-10-25 15:33:18 +02:00
Per Abildgaard Toft 3edc6ac5f2 shrink-osd: fix regression because of a wrong regex
968891f449 introduced a regression.
The regex is wrong because it doesn't allow to shrink osds with id
greater than 9

Fixes: #6950

Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
(cherry picked from commit 84118a3063)
2021-10-21 12:38:25 +02:00