Commit Graph

5997 Commits (896d82877fec586e76561720761b31cc9a4b68d5)
 

Author SHA1 Message Date
Guillaume Abrioux 3f923d69ac tests: temporarily disable nfs-ganesha
This commit [1] seems to have broken a selinux policy preventing nfs-ganesha from
starting properly.

Since we can't address the issue in ceph-ansible, let's disable temporarily nfs-ganesha testing.

[1] dae2da63d5

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-01-05 15:41:44 +01:00
Guillaume Abrioux dc8940fe1c common: remove legacy repositories
As of rhceph-5, those repositories don't longer exist.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2032790

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-15 13:28:50 +01:00
Guillaume Abrioux 817c03bc0e update: speed up client play
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-15 08:42:23 +01:00
Guillaume Abrioux f01536ea19 container: align systemd units with rpm
Update `After=` and `Wants=` parameters in container systemd units
and make them be aligned with the systemd units that come
from the packaging.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-14 13:46:27 +01:00
Guillaume Abrioux 7ece59b41d cephadm-adopt: ensure /etc/ceph is present on monitoring node
When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-07 22:06:46 +01:00
Danny Webb 189ff93372 make grafana network a configurable option
Signed-off-by: Danny Webb <danny.webb@thehutgroup.com>
2021-12-02 08:53:58 +01:00
Guillaume Abrioux 20035852a4 purge: remove ceph directories on client nodes
Otherwise any ceph directories are left over on client nodes
after the purge.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-02 08:53:24 +01:00
Guillaume Abrioux 64196ce3a3 validate: support obs repository
Otherwise, installation on SuSe fails.

Fixes: #6996

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-12-02 08:51:45 +01:00
Benoît Knecht b29a6b18f8 roles/ceph-rgw: Support CRUSH device class
The pools created by `ceph-rgw` (listed in `rgw_create_pools`) now support a
`ec_crush_device_class` option to specify which device class the EC pool should
use.

It default to being omitted, which means it will use OSDs from any device class
by default.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2021-12-01 08:39:14 +01:00
Benoît Knecht 09b56a8890 library/ceph_ec_profile.py: Support CRUSH device class
The `crush_device_class` option of the `ceph_ec_profile` module was documented
but not implemented.

This commit adds it and ensures its value is updated on the corresponding EC
profile.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2021-12-01 08:39:14 +01:00
Guillaume Abrioux c4fdf956bd cephadm-adopt: bindmount /var/lib/ceph with 'ro'
When collocating osds with iscsigw daemons, cephadm bindmounts the
following:

```
-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config
```

this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z`

since 'ro' is enough in this playbook, let's replace the ':z' option on
this bindmount with ':ro'

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-11-30 18:39:31 +01:00
Guillaume Abrioux b02d71c307 ceph_volume: support overriding bind-mounts
This makes it possible to call `podman run` with custom bind-mounts.

cephadm-adopt.yml playbook needs it for a very specific use case:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-11-30 18:39:31 +01:00
Guillaume Abrioux e5ea2ece99 adopt: fix ceph_origin and ceph_repository defaults
This is overriding those variables because the precedence at the 'block
var' level is greater than the group_vars/host_vars.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-11-29 13:40:00 +01:00
Guillaume Abrioux 6ad7e52869 validate: fix bug when using vault
since a variable encrypted with vault is no longer a string but a
encrypted object we can't use the filter | length, we have to convert it
to a string before.

Fixes: #6991

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-11-16 16:23:27 +01:00
Dimitri Savineau c41241244e cephadm-adopt: remove logrotate configuration
cephadm uses its own logrotate configuration file so ceph-ansible needs
to remove that custom file during the cephadm-adopt playbook.

Closes: #6944

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2021-11-03 08:49:24 +01:00
Guillaume Abrioux e5edcc4214 update: move a set_fact
ceph-facts roles makes decisions based on the fact `rolling_update` so
it must be called before we run this role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-11-03 08:48:09 +01:00
Guillaume Abrioux 82eee4303b update: support --limit on monitor nodes
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:

```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 21:47:01 +02:00
Guillaume Abrioux 4f2c2af9b4 cephadm: support adding hosts with ipv6
The current implementation doesn't support adding hosts when using ipv6
addresses.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 16:37:22 +02:00
Guillaume Abrioux 2f34531304 cephadm: use public_network when adding hosts
When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 16:37:22 +02:00
Guillaume Abrioux 9aa9b4dda2 Revert "cephadm: use public_network when adding host"
This reverts commit 7a12b854c4.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 16:37:22 +02:00
Guillaume Abrioux 7a12b854c4 cephadm: use public_network when adding host
When adding host, using `ansible_facts['default_ipv4']['address']` might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-26 21:09:06 +02:00
Guillaume Abrioux 9c794aa9bc adopt: fix rbd mirror adoption
The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore.
It means the keyrings and ceph config file aren't available after the
migration.
The idea here is to remove the current rbd mirror peer and add it back
to the mon config store so we aren't bound to the /etc/ceph directory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-25 15:45:17 +02:00
Guillaume Abrioux 4257410dcd adopt: use mgr/nfs volume
use the mgr 'nfs' module to recreate nfs exports.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-25 15:44:53 +02:00
Guillaume Abrioux 50a21d695e rolling_update: modify default health_osd_check_*
let's do more retries with a shorter delay.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-25 15:44:17 +02:00
Guillaume Abrioux fc9f87c45f rolling_update: fix pre and post osd upgrade play
when using --limit osds, the play before and after osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`
using `hosts: "{{ osds_group_name | default('osds') }}" with
`delegate_to` to the first monitor addresses this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-25 14:23:00 +02:00
Guillaume Abrioux 923af40527 tests: followup on pr6951
destroy VMs at the end of the testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-25 14:23:00 +02:00
Guillaume Abrioux e5cf9db2b0 update: support upgrading a subset of nodes
It can be useful in a large cluster deployment to split the upgrade and
only upgrade a group of nodes at a time.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-21 20:51:14 +02:00
Guillaume Abrioux fb8a66149b tests: add new scenario subset_update
new scenario in order to test the subset upgrade approach using tags.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-21 20:51:14 +02:00
Per Abildgaard Toft 84118a3063 shrink-osd: fix regression because of a wrong regex
968891f449 introduced a regression.
The regex is wrong because it doesn't allow to shrink osds with id
greater than 9

Fixes: #6950

Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
2021-10-21 10:01:23 +02:00
Seena Fallah ae6be71b08 cephadm: set ssh configs at bootstrap step
Add support ssh_user and ssh_config to cephadm bootstrap plugin

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-15 14:40:51 +02:00
Guillaume Abrioux 968891f449 shrink-osd: check osd id format
This adds a check early in order to ensure the format of osd ids passed
is correct.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-12 18:26:18 +02:00
Seena Fallah 5822936252 cephadm: install cephadm from repository
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-08 16:56:47 +02:00
Seena Fallah 339212a7c6 cephadm-adopt: configure repository for cephadm installation
Configure repository for cephadm installation and use package install in both containerized and non containerized deployment

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-08 16:56:47 +02:00
Seena Fallah 4f6da9d92f ceph-validate: export validate repository vars as a task
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-08 16:56:47 +02:00
Seena Fallah e79bda9a05 ceph-common: export repository configuration to a single task
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-08 16:56:47 +02:00
Seena Fallah 0b78faa723 cephadm: use cephadm_ssh_user for ssh user
Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-10-01 21:08:13 +02:00
Francesco Pantano b7299f258b Add ceph_nfs_adopt tag to the cephadm-adopt playbook
There are existing OpenStack scenarios where nfs is still not managed
by cephadm. For this reason sometimes is useful skip the nfs part of
the adoption playbook and leave this daemon unmanaged.
The purpose of this patch is providing a tag to enable the OpenStack
operators to skip this playbook section.

Closes: https://bugzilla.redhat.com/2009212
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
2021-10-01 21:03:02 +02:00
Guillaume Abrioux b555f1d1cd cephadm: add admin label on mon nodes
This is needed if you want a copy of the admin keyring on the admin
nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-01 17:17:01 +02:00
Guillaume Abrioux f277a39dfe tests: remove all references to ceph_stable_release
this is legacy and not needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-30 16:13:55 +02:00
Seena Fallah fb99626987 ceph-defaults: set ceph_stable_release default to the stable branch release
ceph_stable_release is a legacy from the time where a single branch of ceph-ansible supported more than one release of ceph

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2021-09-30 16:13:55 +02:00
Guillaume Abrioux c2e46fe5a5 tests: set rgw_instances in collect-logs.yml
in order to gather rgw logs, we need rgw_instances to be set.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-30 16:13:34 +02:00
Guillaume Abrioux b2ccc7234a tests: update collect-logs.yml playbook
- change `ceph -s` output to json-pretty.
- gather rgw logs
- add `health detail` command

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-30 08:31:22 +02:00
Guillaume Abrioux 702564518b tests: move collect-logs.yml to ceph-ansible repo
related ceph-build PR: ceph/ceph-build#1914

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-29 15:18:45 +02:00
Alex Lambert a9680ab17f dashboard: allow disabling of unused features
Unconfigured dashboard features can lead to empty tabs in the dashboard
containing no meaningful content. Allow users to disable dashboard features
they know will not be used.

A list of features to be disabled allows the user to define a streamlined
dashboard as standard across deployments. Defaults to disabling no features,
ensuring that users are sure they do not need the dashboard feature before
disabling it.

Signed-off-by: Alex Lambert <lamberta@microsoft.com>
2021-09-29 12:02:16 +02:00
Guillaume Abrioux f8d49827a4 dashboard: retry setting rgw-credentials
for some reason, this task can fail in the CI.
Adding a retry can help to avoid this failure.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-29 10:29:42 +02:00
Guillaume Abrioux b6c470c7e2 tests: add osd node in collocation
we update the pool size from 1 to 2 in idempotency test
but only 1 node is available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-29 10:29:42 +02:00
Guillaume Abrioux 0a3b916ee7 cephadm-adopt: add no_log: true
Let's add a `no_log: true` on the `cephadm registry-login` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-28 08:11:03 +02:00
Guillaume Abrioux d12efa1ab4 adopt: stop iscsi services in the first place
If old containers are still running, it can make tcmu-runner process
unable to open devices and there's nothing else to do than restarting
the container.

Also, as per discussion with iscsi experts, iscsi should be migrated before
OSDs. (the client should be closed before the server)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-27 19:46:37 +02:00
Dimitri Savineau 9125bba48d tests: auth_allow_insecure_global_id_reclaim false
Otherwise the clients won't be able to reconnect after the reboot in the
all_daemons and collocation jobs.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2021-09-17 07:34:40 +02:00
Guillaume Abrioux 66f3eb377c tests: fix container-cephadm job
add missing variable `containerized_deployment` in group_vars

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-09-16 16:57:16 +02:00