Commit Graph

5663 Commits (e332051b46561bed91cdf58e7750474f26f382fd)
 

Author SHA1 Message Date
Guillaume Abrioux e332051b46 switch-to-containers: only chown corresponding files
When collocating daemons, if we chown all files under `/var/lib/ceph` it
can cause issues for the collocated daemons that wouldn't have been
migrated yet.

This commit makes the playbook chown only the files corresponding to the
daemon being migrated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddbc11c4a9)
2021-04-15 05:24:12 +02:00
Guillaume Abrioux 5f6050bed1 container/systemd: ensure /var/log/ceph exists
This adds a `ExecStartPre=-/usr/bin/mkdir -p /var/log/ceph` in all
systemd service templates for all ceph daemon.
This is specific to RHCS after a Leapp upgrade is done. Indeed, the
`/var/log/ceph` seems to be removed after the upgrade.
In order to work around this issue let's ensure the directory is present
before trying to start the containers with podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1949489

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bab403b603)
2021-04-14 20:04:54 +02:00
Guillaume Abrioux fd0da6f43c fs2bs: add a final play
This removes the fact `skipped_nodes` which is useless when we run with
`--limit` since it gets reset when a new iteration is made.

Instead, let's print within a final play which node has been skipped
reusing the `skip_this_node` fact.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3d4267051f)
2021-04-14 16:46:31 +02:00
Guillaume Abrioux 64815ce7ac rbdmirror: add retries/until when configuring mirroring
`configure_mirroring.yml` is called right after the daemon is started.
Sometimes, it can happen the first task in `configure_mirroring.yml` is
run while the daemon isn't yet ready, adding a retries/until on that
task should help to avoid causing the playbook to fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944996

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b1e7e1ad0f)
2021-04-14 16:12:38 +02:00
Guillaume Abrioux 6b87d8c95c cephadm_adopt: support nfs-ganesha adoption
This commit adds the nfs-ganesha adoption support in the
`cephadm-adopt.yml` playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944504

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a9220654f5)
2021-04-12 15:32:22 +02:00
Guillaume Abrioux 174df8c18e nfs: remove legacy task
This fact is never used, let's remove the task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0772b3d28d)
2021-04-12 15:32:22 +02:00
Guillaume Abrioux 71a6e6ec33 nfs: rename two tasks
set the name of those tasks accordingly with the fact name being set.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d3d3d01528)
2021-04-12 15:32:22 +02:00
Guillaume Abrioux 5aa9d0dfb4 cephadm_adopt: modify placement policy for rgw
the adoption playbook should use `radosgw_num_instances` in order to
determine how much rgw instance it should set recreate.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1943170

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1ffc4df6b6)
2021-04-12 15:32:22 +02:00
Guillaume Abrioux c2d40d4383 cephadm_adopt: fix a typo
This play doesn't nothing else than stopping/removing rgw daemons.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ee44d86072)
2021-04-12 15:32:22 +02:00
Guillaume Abrioux e84c42b33f docker2podman: skip some role imports from handler
when running docker-to-podman playbook, there's no need to call
`ceph-config` and `ceph-rgw` from the role `ceph-handler`.
It can even have side effects when coming from a baremetal cluster that
was previously migrated using the switch-to-containers playbook. Indeed
it might complain about missing .target systemd unit since they are
removed during that migration.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 70f19be367)
2021-04-12 13:30:09 +02:00
Guillaume Abrioux 03793da772 docker2podman: add documentation/header
this adds a small documentation in the header of the playbook in order
to explain what is the goal of this playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 36b4227dcd)
2021-04-12 09:44:14 +02:00
Guillaume Abrioux 9ab9b741f3 switch_to_containers: support iscsigws migration
This adds the iscsigws migration to containers.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=<bz-number>

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2c74c27321)
2021-04-09 15:28:06 +02:00
Guillaume Abrioux 69c3d6ea83 common: selinux tasks related refactor
This moves some task from the `ceph-nfs` role in `ceph-common` since
some of them are needed in `ceph-rgwloadbalancer` role.
This avoids duplicated tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0442d81b9)
2021-04-06 15:08:38 +02:00
Guillaume Abrioux cc6a10bd02 rgw-loadbalancers: add all rgw_ports to http_port_t type
This adds all rgw ports to the http_port_t selinux type so it
allows haproxy to connect to those ports in order to avoid AVC.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6bbb90198b)
2021-04-06 15:08:38 +02:00
kalebskeithley ef99ac623e rgw-loadbalancer: Update haproxy.cfg.j2
haproxy gets an AVC when configured to connect to port 8081

This commit adds a snippet regarding haproxy in a selinux environment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
(cherry picked from commit 9e7f22a071)
2021-04-06 15:08:38 +02:00
Dimitri Savineau 21fa7f31b4 container/registry: use password from stdin
Pass the password variable via stdin for the registry login
authentication.
This allows to remove the no_log statement and see the task output
without displaying the password value.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a0e1a450d3)
2021-04-02 09:46:01 +02:00
Guillaume Abrioux 5210bd16df rgw: supports pg_autoscale_mode option for pool creation
Support enabling/disabling the pg autoscaler for rgw pools.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f03a527ba)
2021-04-01 15:32:40 +02:00
Guillaume Abrioux 66235feca3 dashboard: support prometheus storage.tsdb.retention.time parameter
This commit adds the parameter `--storage.tsdb.retention.time` to the
prometheus systemd unit template.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1928000

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b60c61ce45)
2021-04-01 14:52:37 +02:00
Guillaume Abrioux efddbdf909 nfs: set idmap config for Ceph-NFS
Currently NFS Ganesha (ceph-nfs) consumes /etc/idmapd.conf, which
controls mapping of user/owner identities under NFSv4+. With
containerized service deployment, this file is an immutable part of the
container image and cannot be modified.

Here we provide group variables, and a taskk and templates for the
ceph-nfs role, to set the path of the idmap configuration file and
to make the most common adjustment to the contents of that file --
namely to set the 'Domain'. We default the path to /etc/ganesha/idmap.conf
so that we will not conflict with /etc/idmapd.conf on the controller nodes
where ganesha runs. NFSv4 clients, as used for example by the Cinder NFS
driver, consume /etc/idmapd.conf and may require different settings than
what is wanted for NFS Ganesha. Additionally, because we already bind
/etc/ganesha from the host into the ceph-nfs container, the file NFS
Ganesha consumes will no longer be an immutable part of the container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925646

Signed-off-by: Tom Barron tpb@dyncloud.net
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2db2208e40)
2021-04-01 14:52:12 +02:00
Guillaume Abrioux 94d227149d defaults: add a comment about `igw_network`
This add a quick documentation in ceph-defaults about `igw_network`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c5728bdc63)
2021-03-29 11:23:49 +02:00
Guillaume Abrioux 000b203ebf update: followup on 07029e1
Playbook must fail anyway, the `rescue` block has been introduced for
unmasking the unit after the playbook has failed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e9ddb972fe)
2021-03-29 10:54:44 +02:00
VasishtaShastry 4a10f6ee72 Peer addition won't be skipped if remote is not in peer
rbd-mirroring is not configured as adding peer is getting skipped.
Peer addition should not get skipped if its not added already

Closes - https://bugzilla.redhat.com/show_bug.cgi?id=1942444

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 006998e804)
2021-03-26 21:24:59 +01:00
Guillaume Abrioux 2b19dfdaae dashboard: support igw nodes with dedicated subnet
This adds the possibility to deploy the dashboard with igw nodes using
a dedicated subnet.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1926170

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c33de174f1)
2021-03-26 19:38:37 +01:00
Guillaume Abrioux 1fd0661d3e rolling_update: unmask monitor service after a failure
if for some reason the playbook fails after the service was
stopped, disabled and masked and before it got restarted, enabled and
unmasked, the playbook leaves the service masked and which can make users
confused and forces them to unmask the unit manually.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1917680

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 07029e1bf1)
2021-03-26 15:20:35 +01:00
Ali Maredia fcd9544048 docs: rgw multisite docs with new rgw_instances config
Docs reflect that each instance of `rgw_instances`
can now take rgw_zonemaster, rgw_zonesecondary,
rgw_zonegroupmaster, rgw_multisite_proto.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit a59bc2da3b)
2021-03-26 07:42:35 +01:00
Guillaume Abrioux 8aa0dc2868 library: drop ceph_facts
This is never called in the playbook and seems unmaintained.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b01f16e835)
2021-03-26 00:07:18 +01:00
Ken Dreyer 37088d8c4f README-MULTISITE: fix typos
This commit fixes some typos in MULTISITE documentation.

Signed-off-by: Ken Dreyer <ktdreyer@redhat.com>
(cherry picked from commit 63a246db41)
2021-03-26 00:06:21 +01:00
Guillaume Abrioux 8a4fd99db7 convert some missed `ansible_*`` calls to `ansible_facts['*']`
This converts some missed calls to `ansible_*` that were missed in
initial PR #6312

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0163ecc924)
2021-03-26 00:04:49 +01:00
Alex Schultz 181924db7b Disable facts by default in ansible.cfg
As a continuation of a7f2fa73e6, this
change switches fact injection to off by default in the provided
ansible.cfg.

Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit db031a4993)
2021-03-26 00:04:49 +01:00
Alex Schultz 56aac327dd Use ansible_facts
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant.  In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.

Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit a7f2fa73e6)
2021-03-26 00:04:49 +01:00
Guillaume Abrioux ab857d8b54 tests: use master build for iscsigws
pacific builds for iscsi pkgs aren't available, as a workaround we can
use builds from master.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-03-24 21:36:24 +01:00
Guillaume Abrioux 723efc8576 tests: switch to quay.ceph.io for dashboard images
for some reason, `quay.io/app-sre/grafana` no longer exist.
as a workaround, all dashboard related images have been mirrored on
quay.ceph.io.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c90b0985e5)
2021-03-24 21:36:24 +01:00
Guillaume Abrioux 65d1cfd634 iscsi: fetch right repo from shaman
due to recent changes in shaman, we must fetch the right repo by
filtering on the desired architecture.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5801171b37)
2021-03-24 21:36:24 +01:00
Guillaume Abrioux 439cb79e3e tests: fix `test_rgw_is_up` test
The data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b8080bac41)
2021-03-24 21:36:24 +01:00
Guillaume Abrioux fb75fce4fa tests: fix `test_nfs_is_up` test
the data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e1db0b599)
2021-03-24 21:36:24 +01:00
Guillaume Abrioux 43c7c20fa9 ceph_volume: fix bug in `is_lv()`
This function makes the `ceph_volume` module be not idempotent in
containerized context because it tries to run a container and bindmount
directories that no longer exist.

In that case, the `lvs` command being executed returns something
different than `0` so we can't call `json.loads(out)['report'][0]['lv']`
since it might throw an python error.

The idea is to return `True` only if `rc` is equal to `0` and
`len(result)` is greater than `0`, which means the command matched an
LV.

Fixes: #6284

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed79bc7a4e)
2021-03-24 21:36:24 +01:00
Guillaume Abrioux a4d4f53080 fix 'command -v' tasks
`command -v` is a bash script which needs a shell to run.

Fixes: #6325

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 14c472707c)
2021-03-22 13:52:39 +01:00
Guillaume Abrioux 8d25b4305e adopt: convert legacy grafana-server groupname early
This is a follow up on PR #6332

cephadm-adopt.yml playbook is affected by the same bug

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1938658

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit af95595c82)
2021-03-18 08:56:44 +01:00
Guillaume Abrioux 5893a17886 tests: remove 1 client VM in external_clients job
We only use 2 client in this scenario, there's no need to fire up a
third VM.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb1a5f071a)
2021-03-18 08:54:33 +01:00
Guillaume Abrioux 05ab3a7d50 validate: update `ceph_repository_community` check
this updates the `ceph_repository_community` check in `ceph-validate`
with the right ceph release expected.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47b9b75ace)
2021-03-18 08:54:33 +01:00
Guillaume Abrioux 01939808b0 nfs: bump nfs-ganesha version
This commit updates the default version of nfs-ganesha to V3.5 which is the
latest version available upstream.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c78388e580)
2021-03-18 08:54:33 +01:00
Guillaume Abrioux c296824ae0 cephadm_adopt: fetch and write ceph minimal config
This commit makes the playbook fetch the minimal current ceph
configuration and write it later on monitoring nodes so `cephadm` can
proceed with the adoption.
When a monitoring stack was deployed on a dedicated node, it means no
`ceph.conf` file was written, `cephadm` requires a `ceph.conf` in order
to adopt the daemon present on the node.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1939887

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b445df0479)
2021-03-18 08:51:59 +01:00
Guillaume Abrioux 688e432c32 facts: fix nfs/external cluster scenario
These tasks shouldn't be run when at least 1 monitor isn't present in
the inventory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1937997

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccd1cbb732)
2021-03-18 06:40:33 +01:00
Guillaume Abrioux d65c7b4035 config: reset num_osds
When collocating OSDs with other daemon, `num_osds` is incorrectly calculated
because `ceph-config` is called multiple times.

Indeed, the following code:
```
num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}"
```

makes `num_osds` be incremented each time `ceph-config` is called.

We have to reset it in order to get the correct number of expected OSDs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 31a0f2653d)
2021-03-17 17:35:19 +01:00
Guillaume Abrioux 732e5b10b8 update: convert legacy grafana-server groupname early
If the legacy name `grafana-server` is still being used when upgrading
from Nautilus to Pacific, the task that sets the fact `rolling_update`
to `true` doesn't run on the node(s) included in that group. Indeed the
play where we set this fact (`rolling_update`) only runs on the group
`monitoring_group_name | default('monitoring')`.
As a workaround, we can run earlier the task which converts the
`grafana-server` group name to `monitoring`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935554

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6ccc8b4722)
2021-03-16 14:33:40 +01:00
Matthew Vernon 3c8191194d docs: Document the prepare_osd tag
There are times where being able to skip OSD creation is useful to the
admin (see #1777 for example), and skipping the prepare_osd tag is a
way to achieve this. Document this fact.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit e66b7b7449)
2021-03-12 09:19:55 +01:00
Matthew Vernon 6deb88d8fb ceph-osd: add prepare_osd tag to lvm-batch scenario
Sometimes it's useful to be able to skip the OSD creation step when
running ceph-ansible (cf #1777). The lvm scenario has a prepare_osd
tag on the relevant play. This commit adds the same tag to the
lvm-batch scenario.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 88d119e95a)
2021-03-12 09:19:55 +01:00
Matthew Vernon 6a23be19f4 Docs: fix some typos
While working on the previous PR, I found a couple of typos in the
docs. This fixes those.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 8b1474ab75)
2021-03-11 22:04:53 +01:00
Matthew Vernon 1a67f59789 Fix typo and broken link for documenting RGW frontends
http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace
it with a working "pacific" docs link, and correct the spelling of
"additional" while I'm at it.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 847611048e)
2021-03-03 14:17:31 +01:00
Guillaume Abrioux 6832c8d7a5 tests: increase nb of rerun in pytest
In order to avoid false positive in the CI that I've been unable to
reproduce.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f7fd1c2298)
2021-03-03 14:12:46 +01:00