When using a quote in the registry password then we have the following
error:
The error was: ValueError: No closing quotation
To fix this we need to use the quote filter.
Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
the current condition doesn't work, as soon as the first iteration is
done the condition makes next iterations skip since `rgw_instances` got
set with the first iteration.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Looks like nfs-ganesha 3.3 and 4.-dev doesn't work with recent changes
in librgw 16.0.0.
The nfs-ganesha daemon is segfaulting and restart in a loop.
See https://tracker.ceph.com/issues/47520
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since [1] The bytes_used pool counter in prometheus has been renamed
to stored.
Closes: #5781
[1] https://github.com/ceph/ceph/commit/71fe9149
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
1/ docker doesn't rely en the environment variables with the CLI.
Thos are needed by the docker daemon via systemd.
2/ podman uses the environment variables so we need to add them to
the login/pull tasks.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When running the switch2container playbook on a Debian based system
then the systemd unit path isn't the same than Red Hat based system.
Because the systemd unit files aren't removed then the new container
systemd unit isn't take in count.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Most ansible module using a state parameter default to the present
value (when available) instead of using it as a mandatory option.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Instead of using run_once: true on each tasks in a block section, we
can use the run_once statement at the block level.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We don't need to install node-exporter on client node because there's
no ceph services running on them.
This also makes sure we use the group name variables in the prometheus
service template instead of hardcoding the values.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Most ansible module using a state parameter default to the present
value (when available) instead of using it as a mandatory option.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This adds the ceph_dashboard_user ansible module for replacing the
command module usage with the ceph dashboard ac-user-xxx command.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Before [1] we were using default value for
- size
- min_size
- rule_name
when the key wasn't present in the pool dict.
The commit [1] changed this by defaulting to omit.
This patch restores the original workflow by using facts:
- osd_pool_default_size
- osd_pool_default_min_size
- ceph_osd_pool_default_crush_rule_name
[1] af9f6684f2
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Even if the non containerized collocation scenario deploys ceph with
RPMs then we also deploy the dashboard/monitoring but with containers.
This requires to set the registry variable to ceph's quay.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We already do this in the site-container.yml playbook because we don't
need docker/podman installed on all client nodes and having the
container image only on the first client node.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When running the rolling_update playbook with an inventory without
monitor nodes defined (like external scenario) then we can't retrieve
the cluster fsid from the running monitor.
In this scenario we have to pass this information manually (group_vars
or host_vars).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since [1] we can use the ceph_pool module instead of using the command
module combined with ceph osd pool commands.
[1] bddcb439ce
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This changes the grafana container image regitry from docker.io to
quay.io to avoid rate limit.
This also adds the missing container image values for docker2podman
and podman scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
In the OSP context, during the rolling update the playbook fails
with the following error:
'''
ERROR! The field 'hosts' has an invalid value, which includes an
undefined variable. The error was: list object has no element 0
'''
This PR just change the hosts field providing a valid mons group
value.
Closes: https://bugzilla.redhat.com/1876803
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
On DCN environments, or when multiple ceph cluster are configured,
we need to specify the cluster name before running the command or
the rolling_update playbook will fail during minor updates.
Closes: https://bugzilla.redhat.com/1876447
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
This commit diables nfs-ganesha testing on master for non-containerized
deployment because the dev repos are broken at the moment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This re-adds the ceph-iscsi testing for both non containerized and
containerized deployment since the rados connection error on ceph
dev has been fixed [1].
[1] https://tracker.ceph.com/issues/47002
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add the `check` option to server definitions to enable basic HAProxy health
checks for Ceph RADOS gateway backends.
Currently traffic will be forwarded to unhealthly `radosgw.service` servers.
These changes resolve the issue.
Signed-off-by: Niko Smeds nikosmeds@gmail.com
There's no need to use `ignore_errors: true` on these tasks.
Using a loop on the task stopping mon daemons allows us to avoid
duplicating this task, the `ignore_errors` isn't needed here because it
won't fail the playbook if one of the ID doesn't exist (shortname vs. fqdn)
Using the right condition on the task starting the mgr daemon allows us
to avoid using an `ignore_errors: true` as well.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit refactors the code to remove a duplicate condition and it
makes the `state: absent` code idempotent
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since there is a check if ceph_custom_key is defined, there is no reason
to define it by default.
Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of osd nodes we spawn so the
OVH Jenkins slaves aren't less overwhelmed when a `all_daemons` based
scenario is being tested.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Regardless of the outcome of Ansible 2.9.12 issue 71200
we can set a default permission for these files.
Closes: https://github.com/ceph/ceph-ansible/issues/5677
Signed-off-by: John Fulton <fulton@redhat.com>
When using fqdn in inventory host file, this task will fail because the
mds is registered with its shortname.
It means we must use `mds_to_kill_hostname` in this task.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1869837
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
For intsance, there is no need to install logrotate on clients nodes.
This also ensure logrotate is installed only for containerized
deployments since the packaging has an explicit dependency to logrotate
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we've dropped ubuntu testing, we don't need these inventories
anymore. Let's remove this leftover.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Temporarily disable iscsigw testing for containerized deployments
because it's broken upstream on ceph@master.
non-containerized deployments use stable build for iscsigw to get around
this issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We already support specifiying a custom crush rule during pool creation
in ceph-osd role but not in ceph-rgw role.
This patch adds the missing code to implement this feature.
Note this is only available for replicated pool not erasure. The rule
must also exist prior the pool creation.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We only need the container engine to be installed on the first clients
node in order to execute the pools/keys operation. We already do the
same worflow with the ceph-container-common role which pull the ceph
container image.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>