Commit Graph

2706 Commits (9237a98965004e2198d7672302f1459d7b1da1d8)

Author SHA1 Message Date
Guillaume Abrioux 9237a98965 debug
debug

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-04-16 09:43:08 +02:00
Guillaume Abrioux 3ef9690cd1 docker2podman: skip some role imports from handler
when running docker-to-podman playbook, there's no need to call
`ceph-config` and `ceph-rgw` from the role `ceph-handler`.
It can even have side effects when coming from a baremetal cluster that
was previously migrated using the switch-to-containers playbook. Indeed
it might complain about missing .target systemd unit since they are
removed during that migration.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 70f19be367)
2021-04-12 13:30:31 +02:00
Guillaume Abrioux f47da73a8a common: selinux tasks related refactor
This moves some task from the `ceph-nfs` role in `ceph-common` since
some of them are needed in `ceph-rgwloadbalancer` role.
This avoids duplicated tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0442d81b9)
2021-04-06 15:09:00 +02:00
Guillaume Abrioux 3bfa0772e2 rgw-loadbalancers: add all rgw_ports to http_port_t type
This adds all rgw ports to the http_port_t selinux type so it
allows haproxy to connect to those ports in order to avoid AVC.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6bbb90198b)
2021-04-06 15:09:00 +02:00
kalebskeithley 3ea39b5db3 rgw-loadbalancer: Update haproxy.cfg.j2
haproxy gets an AVC when configured to connect to port 8081

This commit adds a snippet regarding haproxy in a selinux environment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890

Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
(cherry picked from commit 9e7f22a071)
2021-04-06 15:09:00 +02:00
Guillaume Abrioux b79c2070f4 nfs: set idmap config for Ceph-NFS
Currently NFS Ganesha (ceph-nfs) consumes /etc/idmapd.conf, which
controls mapping of user/owner identities under NFSv4+. With
containerized service deployment, this file is an immutable part of the
container image and cannot be modified.

Here we provide group variables, and a taskk and templates for the
ceph-nfs role, to set the path of the idmap configuration file and
to make the most common adjustment to the contents of that file --
namely to set the 'Domain'. We default the path to /etc/ganesha/idmap.conf
so that we will not conflict with /etc/idmapd.conf on the controller nodes
where ganesha runs. NFSv4 clients, as used for example by the Cinder NFS
driver, consume /etc/idmapd.conf and may require different settings than
what is wanted for NFS Ganesha. Additionally, because we already bind
/etc/ganesha from the host into the ceph-nfs container, the file NFS
Ganesha consumes will no longer be an immutable part of the container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925646

Signed-off-by: Tom Barron tpb@dyncloud.net
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2db2208e40)
2021-04-02 13:18:52 +02:00
Guillaume Abrioux b2cf677b71 dashboard: support prometheus storage.tsdb.retention.time parameter
This commit adds the parameter `--storage.tsdb.retention.time` to the
prometheus systemd unit template.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1928000

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b60c61ce45)
2021-04-02 13:17:59 +02:00
Guillaume Abrioux 653d180ec0 defaults: add a comment about `igw_network`
This add a quick documentation in ceph-defaults about `igw_network`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c5728bdc63)
2021-03-29 11:24:28 +02:00
Guillaume Abrioux fe47a02134 dashboard: support igw nodes with dedicated subnet
This adds the possibility to deploy the dashboard with igw nodes using
a dedicated subnet.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1926170

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c33de174f1)
2021-03-26 21:26:14 +01:00
VasishtaShastry 58a28656ff Peer addition won't be skipped if remote is not in peer
rbd-mirroring is not configured as adding peer is getting skipped.
Peer addition should not get skipped if its not added already

Closes - https://bugzilla.redhat.com/show_bug.cgi?id=1942444

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 006998e804)
2021-03-26 19:14:35 +01:00
Guillaume Abrioux 9780490b2f convert some missed `ansible_*`` calls to `ansible_facts['*']`
This converts some missed calls to `ansible_*` that were missed in
initial PR #6312

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0163ecc924)
2021-03-26 00:16:58 +01:00
Alex Schultz 6229b3bdba Disable facts by default in ansible.cfg
As a continuation of a7f2fa73e6, this
change switches fact injection to off by default in the provided
ansible.cfg.

Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit db031a4993)
2021-03-26 00:16:58 +01:00
Alex Schultz 7ddbe74712 Use ansible_facts
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant.  In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.

Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit a7f2fa73e6)
2021-03-26 00:16:58 +01:00
Guillaume Abrioux 7fd332e7fe iscsi: fetch right repo from shaman
due to recent changes in shaman, we must fetch the right repo by
filtering on the desired architecture.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5801171b37)
2021-03-25 14:11:11 +01:00
Guillaume Abrioux bbf8b2fdf6 facts: fix nfs/external cluster scenario
These tasks shouldn't be run when at least 1 monitor isn't present in
the inventory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1937997

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccd1cbb732)
2021-03-18 06:41:00 +01:00
Guillaume Abrioux dc2a11ce3f config: reset num_osds
When collocating OSDs with other daemon, `num_osds` is incorrectly calculated
because `ceph-config` is called multiple times.

Indeed, the following code:
```
num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}"
```

makes `num_osds` be incremented each time `ceph-config` is called.

We have to reset it in order to get the correct number of expected OSDs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 31a0f2653d)
2021-03-17 17:35:52 +01:00
Dimitri Savineau 6921aafb2b debian/uca: remove the handler notification
The "update apt cache" in the ceph-handler role was never called and the
handler trigger after adding the uca repository doesn't exist at all.
Instead of using a handler for that we can just set the update_cache
parameter to true like the other apt_repository tasks.

Resolve merge conflict from cherry-picking this commit.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 09d6706697)
2021-03-11 22:06:11 +01:00
Dimitri Savineau 735965ef9c ceph-common: enable rhcs tools repo for monitoring
The monitoring node running grafana needs the rhcs tools repostory
enabled in non containerized deployment to be able to install the
ceph-grafana-dashboards rpm package.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e4dd0067c6)
2021-03-11 13:52:21 +01:00
Tyler Bishop ba76102952 facts: support device aliases for (dedicated|bluestore_wal)_devices
Just likve `devices`, this commit adds the support for linux device aliases for
`dedicated_devices` and `bluestore_wal_devices`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1919084

Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>
(cherry picked from commit ee4b8804ae)
2021-03-11 13:51:19 +01:00
Guillaume Abrioux e3165f9a07 mon: fix cephx disabled deployment
Due to missing condition on `cephx` variable, cephx disabled deployments
are broken.
This commit fixes this.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1910151

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4af0845702)
2021-03-11 13:51:04 +01:00
Guillaume Abrioux 241418409d common: ensure shaman returns right repo
Due to recent changes in shaman, there's a chance it returns the wrong
repository from architecture point of view.
We can query shaman and ask for the correct architecture to get around
this.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 39649f0ce8)
2021-03-10 16:43:04 +01:00
Matthew Vernon fdf437743c Fix typo and broken link for documenting RGW frontends
http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace
it with a working "latest" docs link, and correct the spelling of
"additional" while I'm at it.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 847611048e)
2021-03-03 14:20:26 +01:00
Guillaume Abrioux c3304c213b dashboard: add missing parameter in `ceph_cmd`
the `ceph_cmd` fact is missing the `--net=host` parameter.

Some tasks consuming this fact can fail like following:

```
Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f143b1a647)
2021-03-03 12:57:08 +01:00
Guillaume Abrioux b5d082c4bc rgw: fix a typo in multisite
if `rgw_zonegroupmaster` is not defined at the rgw instance level in
`rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 931b87e830)
2021-02-10 08:32:24 +01:00
Guillaume Abrioux 920f07514a rgw: quick fix in create_zone_user.yml
typical error:

```
2021-02-01 03:11:09,809 p=93834 u=cephuser n=ansible | TASK [ceph-rgw : check if the realm system user already exists] ***************************************************************************************************************************************************
2021-02-01 03:11:09,809 p=93834 u=cephuser n=ansible | Monday 01 February 2021  03:11:09 -0500 (0:00:00.084)       0:14:38.607 *******
2021-02-01 03:11:09,836 p=93834 u=cephuser n=ansible | fatal: [ceph-kvm-ms2-1611241931591-node7-rgw]: FAILED! =>
  msg: |-
    The task includes an option with an undefined variable. The error was: 'None' has no attribute 'realm'
```

This task should be skipped when `zone_users` is undefined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1922998

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-02-01 11:28:57 -05:00
Dimitri Savineau 6278c5a4e3 ceph-mon: add ExecStartPre docker stop to systemd
We already do that in the other systemd templates (mgr, mds, etc..)
and would present to add workaround in other orchestration tool.
This change is for containerized deployment only.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1882724

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3749d297c7)
2021-01-29 12:00:14 -05:00
Guillaume Abrioux aeee3471e3 rgw: avoid useless call to ceph-rgw
since `ceph-rgw` may be called from `ceph-handler` in some contexts we
should avoid rerunning it unnecessarily.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8617081664)
2021-01-28 16:37:50 -05:00
Guillaume Abrioux b903446fa4 containers: use --cpus instead --cpu-quota
When using docker 1.13.1, the current condition:

```
{% if (container_binary == 'docker' and ceph_docker_version.split('.')[0] is version_compare('13', '>=')) or container_binary == 'podman' -%}
```

is wrong because it compares the first digit (1) whereas it should
compare the second one.
It means we always use `--cpu-quota` although documentation recommend
using `--cpus` when docker version is 1.13.1 or higher.

From the doc:
> --cpu-quota=<value>	Impose a CPU CFS quota on the container. The number of
> microseconds per --cpu-period that the container is limited to before
> throttled. As such acting as the effective ceiling.
> If you use Docker 1.13 or higher, use --cpus instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3e262e072b)
2021-01-28 16:37:50 -05:00
Guillaume Abrioux 14267fe0c4 rgw: multisite refact
Add the possibility to deploy rgw multisite configuration with a mix of
secondary and primary zones on a same rgw node.
Before that, on a same node, all instances were either primary
zones *OR* secondary.

Now you can define a rgw instance like following:

```
rgw_instances:
  - instance_name: 'rgw0'
    rgw_zonemaster: false
    rgw_zonesecondary: true
    rgw_zonegroupmaster: false
    rgw_realm: 'france'
    rgw_zonegroup: 'zonegroup-france'
    rgw_zone: paris-00
    radosgw_address: "{{ _radosgw_address }}"
    radosgw_frontend_port: 8080
    rgw_zone_user: jacques.chirac
    rgw_zone_user_display_name: "Jacques Chirac"
    system_access_key: P9Eb6S8XNyo4dtZZUUMy
    system_secret_key: qqHCUtfdNnpHq3PZRHW5un9l0bEBM812Uhow0XfB
    endpoint: http://192.168.101.12:8080
```

Basically it's now possible to define `rgw_zonemaster`,
`rgw_zonesecondary` and `rgw_zonegroupmaster` at the intsance
level instead of the whole node level.

Also, this commit adds an option `deploy_secondary_zones` (default True)
which can be set to `False` in order to explicitly ask the playbook to
not deploy secondary zones in case where the corresponding endpoint are
not deployed yet.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915478

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 71a5e666e3)
2021-01-28 16:37:50 -05:00
Dimitri Savineau 07d2160421 dashboard: manage password backward compatibility
The ceph dashboard changed the way the password are provided via the
CLI.
This breaks the backward compatibility when using a recent ceph-ansible
version with ceph release without that feature.
This patch adds tasks for legacy workflow (ceph release without that
feature) in ceph-dashboard role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915506

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2021-01-19 18:05:02 +01:00
Guillaume Abrioux 623ca14682 dashboard: configure passwords via stdin
Due to recent changes in ceph, the few dashboard passwors
must be passed via `-i`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ef975ef5ea)
2021-01-19 18:05:02 +01:00
Mike Currin 360a2d2b30 Path for ceph config missing in crash template
The path where ceph.conf is located (/etc/ceph) missing in the Docker container bind mounts, this throws errors

Signed-off-by: Mike Currin <currin@gmail.com>
(cherry picked from commit 4cbc9a48c9)
2021-01-06 16:55:39 +01:00
Guillaume Abrioux 290d3ef369 rgw: support switching from single-site to multisite
When collocating rgw with either a mon, mgr or osd, switching from
single site to a multisite rgw setup failed because of the handlers
triggered between the ansible play of the collocated daemon and the play
of the rgw. Since the multisite changes are not yet applied the handlers
fail.
The idea here is to ensure we run the multisite configuration from the
ceph-handler role before the restart happens, this way it won't complain
because of non existing multisite configuration.

(Note: this is also valid when simply changing a multisite configuration)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1888630

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 513c8cfe55)
2021-01-06 10:38:50 -05:00
Guillaume Abrioux 607ef5a7d2 common: do not use pipefail when not needed
Let's discard the ansible lint error 306 and add a "# noqa 306" on tasks
where we don't need `set -o pipefail`

Fixes: #6090

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86a8889ee3)
2020-12-16 14:05:45 +01:00
Guillaume Abrioux 6855feb604 ceph-osd: refact `docker_exec_start_osd`
This commit drops nested jinja construction in this set_fact task.
It also rename it to `container_exec_start_osd`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ff95fa9c32)
2020-12-16 14:05:45 +01:00
Dimitri Savineau 24a5b1bbb5 ceph-config: fix ceph-volume lvm batch report
Since the major ceph-volume lvm batch refactoring, the report value
is different.
Before the refact, the report was a dict with the OSDs list to be created
under the "osds" key.
After the refact, the report is a list of dict.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 827b23353f)
2020-12-15 17:26:01 -05:00
Dimitri Savineau 3f16132e44 library: add ceph_osd_flag module
This adds ceph_osd_flag ansible module for replacing the command module
usage with the ceph osd set/unset commands.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5da593604a)
2020-12-15 17:36:28 +01:00
Dimitri Savineau e51f68fdbb ceph-iscsi: set the pool name in the config file
When using a custom pool for iSCSI gateway then we need to set the pool
name in the configuration otherwise the default rbd pool name will be
used.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 40a87c4b92)
2020-12-15 17:33:24 +01:00
Guillaume Abrioux 63fa4c9484 containers: modify bindmount option
This commit changes the bind mount option for the mount point
`/var/lib/ceph` in the systemd template for mon and mgr containers. This
is needed in case of collocating mon/mgr with osds using dmcrypt
scenario.
Once mon/mgr got converted to containers, the dmcrypt layer sub mount is
still seen in `/var/lib/ceph`. For some reason it makes the
corresponding devices busy so any other container can't open/close it.
As a result, it prevents osds from starting properly.

Since it only happens on the nodes converted before the OSD play, the idea is
to bind mount `/var/lib/ceph` on mon and mgr with the `rshared` option
so once the sub mount is unmounted, it is propagated inside the
container so it doesn't see that mount point.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896392

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f5ba6d9b01)
2020-12-15 17:33:11 +01:00
Dimitri Savineau fa06752e4b alertmanager/prometheus: fix owner/group
Set the owner/group on alertmanager and prometheus directories and
files to nobody and nogroup (uid and gid 65534) to avoid permission
issues.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1901543

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit eb452d35bc)
2020-12-15 17:32:50 +01:00
Guillaume Abrioux 69b5b96f2d osd: add tag on 'wait for all osd to be up' task
This allows skipping this task if really desired.
Use it carefully. Use it at your own risk.

Fixes: #6073

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5c4ae5356d)
2020-12-15 17:32:09 +01:00
Jukka Nousiainen dca1534ee6 ceph-mon: No become during gen mon initial keyring
Since the backing generate_secret() just hands out urandom output,
running as privileged doesn't seem to be required. It's not
desireable to provide sudo in some Ansible runner environments.

Signed-off-by: Jukka Nousiainen <jukka.nousiainen@csc.fi>
(cherry picked from commit eb7473491b)
2020-12-15 17:31:37 +01:00
Dimitri Savineau 9858d61a57 Revert "config: Always use osd_memory_target if set"
This reverts commit 4d1fdd2b05.

This breaks the backward compatibility with previous osd_memory_target
calculation and we could have a value lower than the minimum value allowed
(896M) which causes some ceph commands to fail (like ceph assimilate-conf).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit aa6e1f20ea)
2020-12-15 17:31:09 +01:00
Seena Fallah 2485b35825 ceph-osd: use global crush_device_class in lvm_volumes
Use global crush_device_class variable if it's not set per OSD

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 5e9444fa5c)
2020-12-15 17:30:55 +01:00
Karl-Heinz Preuß 00793c9221 fix broken ceph-fetch-keys role
set fetch_directory variable in default/main.yml instead of using the
defaults jinja filter in tasks/main.yml.

Fixes: #6072

Signed-off-by: Karl-Heinz Preuß <karl-heinz.preuss@cms.hu-berlin.de>
(cherry picked from commit 6ce34ef59f)
2020-12-15 17:30:42 +01:00
Dimitri Savineau f18142fc2e group_vars: remove useless files
Delete legacy files that aren't used anymore.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e790b0851d)
2020-12-15 17:30:42 +01:00
Guillaume Abrioux 1fcf71dc33 common: drop `fetch_directory` feature
This commit drops the `fetch_directory` feature.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1cc9666c09)
2020-12-15 17:30:42 +01:00
Guillaume Abrioux dc7b9519f4 ceph-config: ceph.conf rendering refactor
This commit cleans up the `main.yml` task file of `ceph-config`.
It drops the local ceph.conf generation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 900c0f4492)
2020-12-15 17:30:42 +01:00
Guillaume Abrioux d14723d5b4 mon: refact initial keyring generation
adding monitor is no longer possible because we generate a new mon
keyring each time the playbook is run.

Fixes: #5864
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902281

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 970c6a4ee6)
2020-12-01 09:53:26 -05:00
Dimitri Savineau f917bb015c ceph_key: set state as optional
Most ansible module using a state parameter default to the present
value (when available) instead of using it as a mandatory option.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit abb4023d76)
2020-12-01 09:53:26 -05:00