With OpenSSL version prior 1.1.1 (like CentOS 7 with 1.0.2k), the -addext
doesn't exist.
As a solution, this uses the default openssl.cnf configuration file as a
template and add the subjectAltName in the v3_ca section. This temp openssl
configuration file is removed after the TLS certificate creation.
This patch also move the run_once statement at the block level.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5e0ace7e54)
the current way the variable is built results in:
```
2021-08-03 04:18:23,020 - ceph.ceph - INFO - ok: [ceph-sangadi-4x-indpt6-node1-installer] => changed=false
ansible_facts:
subj_alt_names: |-
subjectAltName=ceph-sangadi-4x-indpt6-node1-installer/subjectAltName=10.0.210.223/subjectAltName=ceph-sangadi-4x-indpt6-node1-installersubjectAltName=ceph-sangadi-4x-indpt6-node2/subjectAltName=10.0.210.252/subjectAltName=ceph-sangadi-4x-indpt6-node2/
```
which is incorrect.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6f1a0634f7)
sufficient for the default value (512) of rgw thread pool size.
But if its value is increased near to the pids-limit value,
it does not leave place for the other processes to spawn and run within
the container and the container crashes.
pids-limit set to unlimited regardless of the container engine.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987041
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 9b5d97adb9)
The ceph osd pool ls detail command is a subset of the ceph osd dump
command.
$ ceph osd dump --format json|wc -c
10117
$ ceph osd pool ls detail --format json|wc -c
4740
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 06471a4b82)
Run the Ceph commands that only gather information (without making any changes
to the cluster) when running Ansible in check mode.
This allows the tasks that depend on the variables set by those tasks to
succeed in check mode.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 498acd7527)
We currently download the grafana dashboars from the ceph@master branch
for all ceph releases.
We should use the right ceph branch according to the ceph release.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The radosgw-sync-overview and rbd-details grafana dashboars were missing
from the list.
Closes: #6758
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f0ccf3ebf0)
When using self-signed/untrusted CA certificates, alertmanager displays
an error in logs. With this commit this should make those messages
disappear.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1936299
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9f77b929d1)
When the rgw_multisite_proto variable is set to https then we shoudn't use
the IP address in the zone endpoints list but the node FQDN to match the
TLS certificate CN.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1965504
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ad05a08160)
When using python 2 and the task with a loop is skipped then it generates
an error.
Unexpected templating type error occurred on
({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cf6e33346e)
The PG autoscaler can disrupt the PG checks so the idea here is to
disable it and re-enable it back after the restart is done.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 13036115e2)
Populating the ceph_mgr_modules list in the mgr_modules doesn't make sense
since that file is only executed if the list isn't empty or we're using the
dashboard.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cd06e7c046)
We already have config override variables for existing block (like
ganesha_ceph_export_overrides, ganesha_log_overrides, etc...) or a
global one (ganesha_conf_overrides) but redefining the NFS_CORE_PARAM
block in that variable will erase all previous values (currently only
Bind_Addr).
ganesha_core_param_overrides: |
Enable_UDP = false;
NFS_Port = 2050;
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1941775
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9817d29543)
When deploying dashboard with ssl certificates generated by
ceph-ansible, we enforce the CN to 'ceph-dashboard' which can makes
application such alertmanager complain like following:
`err="Post https://mgr0:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not mgr0" context_err="context deadline exceeded"`
The idea here is to add alternative names matching all mgr/mon instances
in the certificate so this error won't appear in logs.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 72a0336c71)
This introduces a new variable `dashboard_network` in order to support
deploying the dashboard on a different subnet.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1927574
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f4f73b6197)
Instead of reusing the condition 'inventory_hostname in groups[osds]'
on each device facts tasks then we can move all the tasks into a
dedicated file and set the condition on the import_tasks statement.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d704b05e52)
We currently don't check if the logical volume used in lvm_volumes list
for either bluestore data/db/wal or filestore data/journal exist.
We're only doing this on raw devices for batch scenario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 55bca07cb6)
When using dedicated devices for db/journal/wal objecstore with
ceph-volume lvm batch then we should also validate that those devices
exist and don't use a gpt partition table in addition of the devices
and lvm_volume.data variables.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 808e7106de)
Instead of using findmnt command to find the device associated to the
root mount point then we can use the ansible_mounts fact.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7e50380f7f)
Instead of doing two parted calls we can check first if the device exist
and then test the partition table.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 14d458b3b4)
2888c08 introduced a regression as the check_devices tasks file was
only included based on the devices variable.
But that file also validate some devices from the lvm_volumes variable.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1906022
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ac0342b72e)
All ceph daemons need to have the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
environment variable set to 128MB by default in container setup.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970913
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9758e3c513)
When calling the `ceph_key` module with `state: info`, if the ceph
command called fails, the actual error is hidden by the module which
makes it pretty difficult to troubleshoot.
The current code always states that if rc is not equal to 0 the keyring
doesn't exist.
`state: info` should always return the actual rc, stdout and stderr.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1964889
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d58500ade0)
Starting RHCS 5, there's no ISO available anymore.
This removes all ISO variables and the ceph_repository_type variable.
Closes: #6626
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a05730b38a)
It was requested for us to update our alerting definitions to include a
slow OSD Ops health check.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1951664
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 2491d4e004)
There's no need to copy this keyring when using nfs with mds
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8dbee99882)
When running the switch-to-containers playbook with multisite enabled,
the fact "rgw_instances" is only set for the node being processed
(serial: 1), the consequence of that is that the set_fact of
'rgw_instances_all' can't iterate over all rgw node in order to look up
each 'rgw_instances_host'.
Adding a condition checking whether hostvars[item]["rgw_instances_host"]
is defined fixes this issue.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967926
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8279d14d32)
When monitors and rgw are collocated with multisite enabled, the
rolling_update playbook fails because during the workflow, we run some
radosgw-admin commands very early on the first mon even though this is
the monitor being upgraded, it means the container doesn't exist since
it was stopped.
This block is relevant only for scaling out rgw daemons or initial
deployment. In rolling_update workflow, it is not needed so let's skip
it.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970232
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f7166cccbf)
When osd nodes are collocated in the clients group (HCI context for
instance), the current logic will exclude osd nodes since they are
present in the client group.
The best fix would be to exclude clients node only when they are not
member of another group but for now, as a workaround, we can enforce
the addition of osd nodes to fix this specific case.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1947695
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 664dae0564)
Enabling lvmetad in containerized deployments on el7 based OS might
cause issues.
This commit make it possible to disable this service if needed.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1955040
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we need to revert 33bfb10, this is an alternative to initial approach.
We can avoid maintaining this file since it is present in container
image. The idea is to simply get it from the image container and write
it to the host.
Fixes: #6501
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e6d8b058ba)
The pg_autoscale_mode for rgw pools introduced in 9f03a52 was wrong
and was missing a `value` keyword because `rgw_create_pools` is a
dict.
Fixes: #6516
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a670982a38)
Skip the `get initial keyring when it already exists` task when both commands
whose `stdout` output it requires have been skipped (e.g. when running in check
mode).
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 2437f14581)
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is for both bluestore and filestore
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 41295f0ef6)
We need to filter with the OS architecture in order to fetch the right
dev repository in shaman
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f87754b76)
When dashboard_frontend_vip is provided, all the services should be
configured using the related VIP. A new VIP variable is added for
both prometheus and alertmanager: we're already able to properly
config the grafana vip using dashboard_frontend_vip variable.
This change adds the same variable for both prometheus and
alertmanager.
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 441651638d)
The `set_fact rgw_ports` task was failing due to a templating error, because
`hostvars[item].rgw_instances` is a list, but it was treated as if it was a
dictionary.
Another issue was the fact that the `unique` filter only applied to the list
being appended to `rgw_ports` instead of the entire list, which means it was
possible to have duplicate items.
Lastly, `rgw_ports` would have been a list of integers, but the `seport` module
expects a list of strings.
This commit fixes all of the issues above, allowing the `ceph-rgw-loadbalancer`
role to work on systems with SELinux enabled.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit c078513475)
This adds a `ExecStartPre=-/usr/bin/mkdir -p /var/log/ceph` in all
systemd service templates for all ceph daemon.
This is specific to RHCS after a Leapp upgrade is done. Indeed, the
`/var/log/ceph` seems to be removed after the upgrade.
In order to work around this issue let's ensure the directory is present
before trying to start the containers with podman.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1949489
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bab403b603)
`configure_mirroring.yml` is called right after the daemon is started.
Sometimes, it can happen the first task in `configure_mirroring.yml` is
run while the daemon isn't yet ready, adding a retries/until on that
task should help to avoid causing the playbook to fail.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944996
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b1e7e1ad0f)
set the name of those tasks accordingly with the fact name being set.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d3d3d01528)
when running docker-to-podman playbook, there's no need to call
`ceph-config` and `ceph-rgw` from the role `ceph-handler`.
It can even have side effects when coming from a baremetal cluster that
was previously migrated using the switch-to-containers playbook. Indeed
it might complain about missing .target systemd unit since they are
removed during that migration.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 70f19be367)
This moves some task from the `ceph-nfs` role in `ceph-common` since
some of them are needed in `ceph-rgwloadbalancer` role.
This avoids duplicated tasks.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0442d81b9)
This adds all rgw ports to the http_port_t selinux type so it
allows haproxy to connect to those ports in order to avoid AVC.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6bbb90198b)
haproxy gets an AVC when configured to connect to port 8081
This commit adds a snippet regarding haproxy in a selinux environment
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
(cherry picked from commit 9e7f22a071)
Pass the password variable via stdin for the registry login
authentication.
This allows to remove the no_log statement and see the task output
without displaying the password value.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a0e1a450d3)
Support enabling/disabling the pg autoscaler for rgw pools.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f03a527ba)
This commit adds the parameter `--storage.tsdb.retention.time` to the
prometheus systemd unit template.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1928000
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b60c61ce45)
Currently NFS Ganesha (ceph-nfs) consumes /etc/idmapd.conf, which
controls mapping of user/owner identities under NFSv4+. With
containerized service deployment, this file is an immutable part of the
container image and cannot be modified.
Here we provide group variables, and a taskk and templates for the
ceph-nfs role, to set the path of the idmap configuration file and
to make the most common adjustment to the contents of that file --
namely to set the 'Domain'. We default the path to /etc/ganesha/idmap.conf
so that we will not conflict with /etc/idmapd.conf on the controller nodes
where ganesha runs. NFSv4 clients, as used for example by the Cinder NFS
driver, consume /etc/idmapd.conf and may require different settings than
what is wanted for NFS Ganesha. Additionally, because we already bind
/etc/ganesha from the host into the ceph-nfs container, the file NFS
Ganesha consumes will no longer be an immutable part of the container.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925646
Signed-off-by: Tom Barron tpb@dyncloud.net
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2db2208e40)
This add a quick documentation in ceph-defaults about `igw_network`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c5728bdc63)
rbd-mirroring is not configured as adding peer is getting skipped.
Peer addition should not get skipped if its not added already
Closes - https://bugzilla.redhat.com/show_bug.cgi?id=1942444
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 006998e804)
This adds the possibility to deploy the dashboard with igw nodes using
a dedicated subnet.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1926170
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c33de174f1)
This converts some missed calls to `ansible_*` that were missed in
initial PR #6312
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0163ecc924)
As a continuation of a7f2fa73e6, this
change switches fact injection to off by default in the provided
ansible.cfg.
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit db031a4993)
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant. In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.
Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit a7f2fa73e6)
due to recent changes in shaman, we must fetch the right repo by
filtering on the desired architecture.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5801171b37)
this updates the `ceph_repository_community` check in `ceph-validate`
with the right ceph release expected.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47b9b75ace)
This commit updates the default version of nfs-ganesha to V3.5 which is the
latest version available upstream.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c78388e580)
When collocating OSDs with other daemon, `num_osds` is incorrectly calculated
because `ceph-config` is called multiple times.
Indeed, the following code:
```
num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}"
```
makes `num_osds` be incremented each time `ceph-config` is called.
We have to reset it in order to get the correct number of expected OSDs.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 31a0f2653d)
Sometimes it's useful to be able to skip the OSD creation step when
running ceph-ansible (cf #1777). The lvm scenario has a prepare_osd
tag on the relevant play. This commit adds the same tag to the
lvm-batch scenario.
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 88d119e95a)
http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace
it with a working "pacific" docs link, and correct the spelling of
"additional" while I'm at it.
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 847611048e)
the `ceph_cmd` fact is missing the `--net=host` parameter.
Some tasks consuming this fact can fail like following:
```
Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f143b1a647)
Given there's no pacific packages available at
https://download.ceph.com, let's use shaman in order to test against
Ceph Pacific
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The monitoring node running grafana needs the rhcs tools repostory
enabled in non containerized deployment to be able to install the
ceph-grafana-dashboards rpm package.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
if `rgw_zonegroupmaster` is not defined at the rgw instance level in
`rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Due to recent changes in shaman, there's a chance it returns the wrong
repository from architecture point of view.
We can query shaman and ask for the correct architecture to get around
this.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
There's no need to set the rgw_instances_all fact for each node. We can
rely on run_once for that one.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We already do that in the other systemd templates (mgr, mds, etc..)
and would present to add workaround in other orchestration tool.
This change is for containerized deployment only.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1882724
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
since `ceph-rgw` may be called from `ceph-handler` in some contexts we
should avoid rerunning it unnecessarily.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add the possibility to deploy rgw multisite configuration with a mix of
secondary and primary zones on a same rgw node.
Before that, on a same node, all instances were either primary
zones *OR* secondary.
Now you can define a rgw instance like following:
```
rgw_instances:
- instance_name: 'rgw0'
rgw_zonemaster: false
rgw_zonesecondary: true
rgw_zonegroupmaster: false
rgw_realm: 'france'
rgw_zonegroup: 'zonegroup-france'
rgw_zone: paris-00
radosgw_address: "{{ _radosgw_address }}"
radosgw_frontend_port: 8080
rgw_zone_user: jacques.chirac
rgw_zone_user_display_name: "Jacques Chirac"
system_access_key: P9Eb6S8XNyo4dtZZUUMy
system_secret_key: qqHCUtfdNnpHq3PZRHW5un9l0bEBM812Uhow0XfB
endpoint: http://192.168.101.12:8080
```
Basically it's now possible to define `rgw_zonemaster`,
`rgw_zonesecondary` and `rgw_zonegroupmaster` at the intsance
level instead of the whole node level.
Also, this commit adds an option `deploy_secondary_zones` (default True)
which can be set to `False` in order to explicitly ask the playbook to
not deploy secondary zones in case where the corresponding endpoint are
not deployed yet.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915478
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This update the grafana container tag to 6.7.4.
The RHCS version is now based on the RHCS 5 container image which is
also based on 6.7.4.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The "latest" ceph container tag references the latest stable release
(octopus at the moment). "latest" is an alias on "latest-octopus".
On the devel branch we should use "latest-master" tag instead.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Due to missing condition on `cephx` variable, cephx disabled deployments
are broken.
This commit fixes this.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1910151
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit checks the length of `virtual_ips` doesn't exceed the length
of `groups[rgwloadbalancer_group_name]`.
It also ensure this variable is defined when
`groups[rgwloadbalancer_group_name]` contains at least one node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
While 2ca33641 fixed a bug in the way the `keepalived.conf.j2` template matched
hostnames to set the VRRP `MASTER`/`BACKUP` states, it also introduced a
regression in the case where `virtual_ips` is a list of more than one IP
address.
The previous behavior would result in each host in the `rgwloadbalancers` group
to be `MASTER` for one of the `virtual_ips`, but the new behavior caused the
first host to be `MASTER` for all the IP address in `virtual_ips`.
This commit restores the original behavior.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Instead of using the command module for retrieving a sysctl value then
we can use the slurp module and read the value directly from /proc.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The path where ceph.conf is located (/etc/ceph) missing in the Docker container bind mounts, this throws errors
Signed-off-by: Mike Currin <currin@gmail.com>
When collocating rgw with either a mon, mgr or osd, switching from
single site to a multisite rgw setup failed because of the handlers
triggered between the ansible play of the collocated daemon and the play
of the rgw. Since the multisite changes are not yet applied the handlers
fail.
The idea here is to ensure we run the multisite configuration from the
ceph-handler role before the restart happens, this way it won't complain
because of non existing multisite configuration.
(Note: this is also valid when simply changing a multisite configuration)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1888630
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When creating a new pool, target_size_ratio was ignored by ansible module ceph_pool.py.
target_size_ratio is now used when pg_autoscale_mode is on.
Tests added to library tests.
This adds too the use in the role ceph-rgw.
Signed-off-by: Fabien Brachere <fabien.brachere@celeste.fr>
Since the major ceph-volume lvm batch refactoring, the report value
is different.
Before the refact, the report was a dict with the OSDs list to be created
under the "osds" key.
After the refact, the report is a list of dict.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
set fetch_directory variable in default/main.yml instead of using the
defaults jinja filter in tasks/main.yml.
Fixes: #6072
Signed-off-by: Karl-Heinz Preuß <karl-heinz.preuss@cms.hu-berlin.de>
This reverts commit 4d1fdd2b05.
This breaks the backward compatibility with previous osd_memory_target
calculation and we could have a value lower than the minimum value allowed
(896M) which causes some ceph commands to fail (like ceph assimilate-conf).
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The alertmanager, grafana and prometheus configuration file are
generated with the template module which doesn't allow for using
config overrides.
Instead we could use the config_template plugin action and add a
new variable for overrides (one for each component).
With this patch, one should be able to add configuration to
prometheus with the following:
---
alertmanager_conf_overrides:
global:
smtp_smarthost: 'localhost:25'
...
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902999
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
81233dd introduced a regression with the ceph_ec_profile module call in
the ceph-rgw role due the missing cluster module parameter.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The conversion fact task was only executed when the grafana_server_group_name
variable was explicitly set in the user configuration. If an user was using
the default value then the conversion wasn't executed.
This also adds back the default grafana_server_group_name value in case user
was using the default value and to avoid undefined variable error.
Instead of hardcoding the "monitoring" group name then we can reuse the
monitoring_group_name variable.
There's no need to override the monitoring_group_name variable, it's either
using the default value or the one defined by the user.
Finally removing the delegate_to statement on the add_host task since it's
always executed on the ansible controller.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1903732
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since the backing generate_secret() just hands out urandom output,
running as privileged doesn't seem to be required. It's not
desireable to provide sudo in some Ansible runner environments.
Signed-off-by: Jukka Nousiainen <jukka.nousiainen@csc.fi>
Let's discard the ansible lint error 306 and add a "# noqa 306" on tasks
where we don't need `set -o pipefail`
Fixes: #6090
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We should always use the ceph_volume ansible module when possible.
This patch replace the ceph-volume inventory and lvm {list,zap} commands
called via the command/shell modules by the corresponding call with the
ceph_volume module.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>