Commit Graph

5552 Commits (ae452a86dc3ee8f79a548c08cc2a9121d307403e)
 

Author SHA1 Message Date
Guillaume Abrioux ae452a86dc common: selinux tasks related refactor
This moves some task from the `ceph-nfs` role in `ceph-common` since
some of them are needed in `ceph-rgwloadbalancer` role.
This avoids duplicated tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0442d81b9)
2021-04-06 15:08:51 +02:00
Guillaume Abrioux b02c5e8db7 rgw-loadbalancers: add all rgw_ports to http_port_t type
This adds all rgw ports to the http_port_t selinux type so it
allows haproxy to connect to those ports in order to avoid AVC.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6bbb90198b)
2021-04-06 15:08:51 +02:00
kalebskeithley e63e3a65b4 rgw-loadbalancer: Update haproxy.cfg.j2
haproxy gets an AVC when configured to connect to port 8081

This commit adds a snippet regarding haproxy in a selinux environment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
(cherry picked from commit 9e7f22a071)
2021-04-06 15:08:51 +02:00
Dimitri Savineau 501e33bc6a container/registry: use password from stdin
Pass the password variable via stdin for the registry login
authentication.
This allows to remove the no_log statement and see the task output
without displaying the password value.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a0e1a450d3)
2021-04-02 13:14:08 +02:00
Guillaume Abrioux b7a699f75d rgw: supports pg_autoscale_mode option for pool creation
Support enabling/disabling the pg autoscaler for rgw pools.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f03a527ba)
2021-04-01 15:33:09 +02:00
Guillaume Abrioux 15a0591615 dashboard: support prometheus storage.tsdb.retention.time parameter
This commit adds the parameter `--storage.tsdb.retention.time` to the
prometheus systemd unit template.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1928000

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b60c61ce45)
2021-04-01 14:52:50 +02:00
Guillaume Abrioux c8afa768ab nfs: set idmap config for Ceph-NFS
Currently NFS Ganesha (ceph-nfs) consumes /etc/idmapd.conf, which
controls mapping of user/owner identities under NFSv4+. With
containerized service deployment, this file is an immutable part of the
container image and cannot be modified.

Here we provide group variables, and a taskk and templates for the
ceph-nfs role, to set the path of the idmap configuration file and
to make the most common adjustment to the contents of that file --
namely to set the 'Domain'. We default the path to /etc/ganesha/idmap.conf
so that we will not conflict with /etc/idmapd.conf on the controller nodes
where ganesha runs. NFSv4 clients, as used for example by the Cinder NFS
driver, consume /etc/idmapd.conf and may require different settings than
what is wanted for NFS Ganesha. Additionally, because we already bind
/etc/ganesha from the host into the ceph-nfs container, the file NFS
Ganesha consumes will no longer be an immutable part of the container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925646

Signed-off-by: Tom Barron tpb@dyncloud.net
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2db2208e40)
2021-04-01 14:52:25 +02:00
Guillaume Abrioux 93fd5532ba defaults: add a comment about `igw_network`
This add a quick documentation in ceph-defaults about `igw_network`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c5728bdc63)
2021-03-29 11:24:14 +02:00
Guillaume Abrioux f60516d920 update: followup on 07029e1
Playbook must fail anyway, the `rescue` block has been introduced for
unmasking the unit after the playbook has failed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e9ddb972fe)
2021-03-29 10:54:56 +02:00
Guillaume Abrioux 52a0b222c1 dashboard: support igw nodes with dedicated subnet
This adds the possibility to deploy the dashboard with igw nodes using
a dedicated subnet.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1926170

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c33de174f1)
2021-03-26 21:25:57 +01:00
VasishtaShastry af6abb7125 Peer addition won't be skipped if remote is not in peer
rbd-mirroring is not configured as adding peer is getting skipped.
Peer addition should not get skipped if its not added already

Closes - https://bugzilla.redhat.com/show_bug.cgi?id=1942444

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 006998e804)
2021-03-26 19:14:49 +01:00
Guillaume Abrioux f42ee9f940 cephadm_adopt: fetch and write ceph minimal config
This commit makes the playbook fetch the minimal current ceph
configuration and write it later on monitoring nodes so `cephadm` can
proceed with the adoption.
When a monitoring stack was deployed on a dedicated node, it means no
`ceph.conf` file was written, `cephadm` requires a `ceph.conf` in order
to adopt the daemon present on the node.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1939887

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b445df0479)
2021-03-26 15:20:50 +01:00
Ali Maredia 80bf7030f7 docs: rgw multisite docs with new rgw_instances config
Docs reflect that each instance of `rgw_instances`
can now take rgw_zonemaster, rgw_zonesecondary,
rgw_zonegroupmaster, rgw_multisite_proto.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit a59bc2da3b)
2021-03-26 07:42:50 +01:00
Guillaume Abrioux dac1a284f6 library: drop ceph_facts
This is never called in the playbook and seems unmaintained.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b01f16e835)
2021-03-26 00:07:29 +01:00
Ken Dreyer 173b5599f9 README-MULTISITE: fix typos
This commit fixes some typos in MULTISITE documentation.

Signed-off-by: Ken Dreyer <ktdreyer@redhat.com>
(cherry picked from commit 63a246db41)
2021-03-26 00:06:39 +01:00
Guillaume Abrioux 50b95baa32 convert some missed `ansible_*`` calls to `ansible_facts['*']`
This converts some missed calls to `ansible_*` that were missed in
initial PR #6312

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0163ecc924)
2021-03-26 00:05:33 +01:00
Guillaume Abrioux d6fcd78e72 clients: build filtered clients group early
when the group `_filtered_clients` is built, the order can change from
the original `clients` group which can cause issues since we run
`ceph-container-engine` on the first client only. It means later in the
playbook we can make call to the container CLI on a node where the
container engine wasn't installed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a112572734)
2021-03-26 00:05:33 +01:00
Alex Schultz 7c6783acb1 Disable facts by default in ansible.cfg
As a continuation of a7f2fa73e6, this
change switches fact injection to off by default in the provided
ansible.cfg.

Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit db031a4993)
(cherry picked from commit 5fa4ff5ed3)
2021-03-26 00:05:33 +01:00
Alex Schultz 815ea7765f Use ansible_facts
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant.  In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.

Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit a7f2fa73e6)
2021-03-26 00:05:33 +01:00
Guillaume Abrioux a72ce4c04b tests: switch to quay.ceph.io for dashboard images
for some reason, `quay.io/app-sre/grafana` no longer exist.
as a workaround, all dashboard related images have been mirrored on
quay.ceph.io.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c90b0985e5)
2021-03-24 09:20:24 +01:00
Guillaume Abrioux 1fe44154de iscsi: fetch right repo from shaman
due to recent changes in shaman, we must fetch the right repo by
filtering on the desired architecture.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5801171b37)
2021-03-24 09:20:24 +01:00
Guillaume Abrioux 48c2db97dc tests: fix `test_rgw_is_up` test
The data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b8080bac41)
2021-03-24 09:20:24 +01:00
Guillaume Abrioux 84a3f807e1 tests: fix `test_nfs_is_up` test
the data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e1db0b599)
2021-03-24 09:20:24 +01:00
Guillaume Abrioux 3c5f5f1503 ceph_volume: fix bug in `is_lv()`
This function makes the `ceph_volume` module be not idempotent in
containerized context because it tries to run a container and bindmount
directories that no longer exist.

In that case, the `lvs` command being executed returns something
different than `0` so we can't call `json.loads(out)['report'][0]['lv']`
since it might throw an python error.

The idea is to return `True` only if `rc` is equal to `0` and
`len(result)` is greater than `0`, which means the command matched an
LV.

Fixes: #6284

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed79bc7a4e)
2021-03-24 09:20:24 +01:00
Guillaume Abrioux be7cfb9ccc fix 'command -v' tasks
`command -v` is a bash script which needs a shell to run.

Fixes: #6325

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 14c472707c)
2021-03-22 13:52:59 +01:00
Guillaume Abrioux 3fd6457c1d cephadm_adopt: fetch and write ceph minimal config
This commit makes the playbook fetch the minimal current ceph
configuration and write it later on monitoring nodes so `cephadm` can
proceed with the adoption.
When a monitoring stack was deployed on a dedicated node, it means no
`ceph.conf` file was written, `cephadm` requires a `ceph.conf` in order
to adopt the daemon present on the node.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1939887

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b445df0479)
2021-03-18 15:33:02 +01:00
Guillaume Abrioux dbd53a2ef2 tests: remove sleep commands from tox ini files
Since we use the rerun plugin in tox, we shouldn't need to add these
`sleep` commands.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e835c77a0e)
2021-03-18 09:25:20 +01:00
Guillaume Abrioux 802705ff9b facts: fix nfs/external cluster scenario
These tasks shouldn't be run when at least 1 monitor isn't present in
the inventory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1937997

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccd1cbb732)
2021-03-18 06:40:44 +01:00
Guillaume Abrioux 8e30a3c9f8 config: reset num_osds
When collocating OSDs with other daemon, `num_osds` is incorrectly calculated
because `ceph-config` is called multiple times.

Indeed, the following code:
```
num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}"
```

makes `num_osds` be incremented each time `ceph-config` is called.

We have to reset it in order to get the correct number of expected OSDs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 31a0f2653d)
2021-03-17 17:35:37 +01:00
Matthew Vernon 6ace9bd9e5 docs: Document the prepare_osd tag
There are times where being able to skip OSD creation is useful to the
admin (see #1777 for example), and skipping the prepare_osd tag is a
way to achieve this. Document this fact.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit e66b7b7449)
2021-03-12 15:44:32 +01:00
Matthew Vernon d449d15d4d ceph-osd: add prepare_osd tag to lvm-batch scenario
Sometimes it's useful to be able to skip the OSD creation step when
running ceph-ansible (cf #1777). The lvm scenario has a prepare_osd
tag on the relevant play. This commit adds the same tag to the
lvm-batch scenario.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 88d119e95a)
2021-03-12 15:44:32 +01:00
Guillaume Abrioux 5816a6ebe8 tests: increase nb of rerun in pytest
In order to avoid false positive in the CI that I've been unable to
reproduce.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f7fd1c2298)
2021-03-12 09:38:46 +01:00
Guillaume Abrioux 8b69451652 dashboard: add missing parameter in `ceph_cmd`
the `ceph_cmd` fact is missing the `--net=host` parameter.

Some tasks consuming this fact can fail like following:

```
Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f143b1a647)
2021-03-12 09:38:46 +01:00
Matthew Vernon 54cf6b4c77 Docs: fix some typos
While working on the previous PR, I found a couple of typos in the
docs. This fixes those.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 8b1474ab75)
2021-03-12 09:35:54 +01:00
Guillaume Abrioux 32ad0f6fe7 common: ensure shaman returns right repo
Due to recent changes in shaman, there's a chance it returns the wrong
repository from architecture point of view.
We can query shaman and ask for the correct architecture to get around
this.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 39649f0ce8)
2021-03-10 17:17:33 -05:00
Dimitri Savineau 09d6706697 debian/uca: remove the handler notification
The "update apt cache" in the ceph-handler role was never called and the
handler trigger after adding the uca repository doesn't exist at all.
Instead of using a handler for that we can just set the update_cache
parameter to true like the other apt_repository tasks.

Resolve merge conflict from cherry-picking this commit.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2021-03-03 14:50:45 +01:00
Matthew Vernon 42b571b11f Fix typo and broken link for documenting RGW frontends
http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace
it with a working "latest" docs link, and correct the spelling of
"additional" while I'm at it.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 847611048e)
2021-03-03 14:19:52 +01:00
Florian Haas 21e2675adb requirements.txt: Move the six dependency into the general requirements
config_template.py depends on six, which isn't listed in the default
requirements.txt. This previously frequently wasn't a problem, because
six used to be a standard package being installed into a venv, and
lots of other projects depended on it.

It also does get installed for unit and integration tests via
tests/requirements.txt, so any broken dependency on six wouldn't be
detected by tox runs.

However, as other projects and distributions have phased out Python
2.7 support the dependency on six becomes less common. Thus, as long
as ceph-ansible does require it for config_template.py, add it to the
base requirements.

Signed-off-by: Florian Haas <florian@citynetwork.eu>
(cherry picked from commit d49ea9818b)
2021-03-01 15:17:10 +01:00
Guillaume Abrioux 6c61240637 defaults: update rhcs dashboard images versions
The current dashboard images deployed have a bad health index.
Updating to a newer version fixes this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925350

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a16ae693d8)
2021-02-18 18:22:15 +01:00
Guillaume Abrioux 9359ee913f library: do not always add --yes in batch mode
When asking `ceph-volume` to report only in `lvm batch` context, there's
a bug described in bz1896803 [1] when `--yes` is passed (which by the
way isn't necessary with `--report`).
This commit ensure `--yes` isn't passed to `ceph-volume` when `--report`
is used.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1896803

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896803

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe6d6ba622)
2021-02-14 06:29:30 +01:00
Guillaume Abrioux 2bf1b6b64d purge: rm service-cid files
This commit makes sure purge playbooks remove those file if for any reason they
have been left.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1920900

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b9dd253a4f)
2021-02-12 18:33:37 +01:00
Guillaume Abrioux 5bedb04585 switch2container: do not serialize the ceph-crash migration
There's no need to slow down the playbook execution time by migrating
all the `ceph-crash` instances in a serial way. Let's remove the
`serial: 1` so the migration is achieved in a parallel way.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 980a5a7df4)
2021-02-12 14:06:30 +01:00
Guillaume Abrioux 149d76da9e tests: use V3.3-stable branch for nfs-ganesha
This is the latest stable release available for octopus.
Let's use it instead of using master builds.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-02-12 09:51:27 +01:00
Guillaume Abrioux 3e58f1ea6e doc: add a note about "latest" tags
See the change for details.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4e95180c80)
2021-02-11 16:48:05 +01:00
Dimitri Savineau ffedb78aa7 doc: update containerized deployment
This adds more documentation to the configuration and usage of
containerizerd deployment.

Closes: #6198

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d42d584085)
2021-02-11 16:48:05 +01:00
Dimitri Savineau a1ca7f3daa rolling_update: enforce ceph-container-engine
When running the rolling_update.yml playbook and adding the dashboard
component in the same time then the requirement (like container packages)
aren't installed.
This could lead to a failure in case of using authentication on the
container registry because the playbook will try to login on the registry
but podman/docker aren't yet installed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1903504
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 48a456dc8c)
2021-02-10 09:58:03 +01:00
Dimitri Savineau 5572c907ee ceph-common: enable rhcs tools repo for monitoring
The monitoring node running grafana needs the rhcs tools repostory
enabled in non containerized deployment to be able to install the
ceph-grafana-dashboards rpm package.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e4dd0067c6)
2021-02-10 09:58:03 +01:00
Guillaume Abrioux 32a84c9080 tests: pin ansible-lint version
This commit pins the ansible-lint version to 4.3.7 as ceph-ansible isn't
compatible with recent changes in 5.0.0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f1d287b1c)
2021-02-10 08:21:41 +01:00
Guillaume Abrioux 0e9b93db69 tests: set `mon_max_pg_per_osd` in rgw_multisite
Otherwise, the job fails when it tries to create a bucket with `s3cmd mb`
command because we have too many PGs per OSD.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 54bae480d2)
2021-02-10 08:21:41 +01:00
Guillaume Abrioux dd204d9e2f rgw: fix a typo in multisite
if `rgw_zonegroupmaster` is not defined at the rgw instance level in
`rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 931b87e830)
2021-02-10 08:21:41 +01:00