The only useful ansible group for the grafana/prometheus stack is
grafana-server, so none of those files are actually needed.
The default values for all dashboard roles are present in the ceph-defaults
role, so they are also present in group_vars/all.yml.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The registry.redhat.io registry requires authentication so before pulling
the RHCS 4 container images from the registry we need to do the login
step.
This is done via the new ceph_docker_registry_auth variable. The
default value is false, but it is set to true for RHCS setups.
When set to true, you need to provide the username and password
for the registry via the associated variables.
This patch also updates the ceph_docker_registry value for RHCS setup.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1748911
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
On RHEL 8 systems we should check the /usr/libexec/platform-python path
instead of installing python36 package.
[DEPRECATION WARNING]: Distribution redhat 8.0 on host xxxxx should use
/usr/libexec/platform-python, but is using /usr/bin/python for backward
compatibility with prior Ansible releases. A future Ansible release will
default to using the discovered platform python for this host.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
If we're looking at the mon hostname in the ceph status output then
there are some scenarios where this could be true.
If we collocate some services (mons, mgrs, etc..) then the hostname of
the monitor to shrink will still be present in the ceph status (like
in mgrs or other).
Instead we should check the hostname only in the mon part of the output.
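A minimal Python sketch of the difference between the two checks. The sample data is hypothetical and heavily trimmed; the `quorum_names` and `mgrmap` keys are assumed to resemble the JSON status layout, and the helper names are illustrative only, not taken from the playbook.

```python
import json

# Hypothetical, trimmed-down status: mon2 has left the mon quorum
# but still runs a collocated mgr.
STATUS = {
    "quorum_names": ["mon0", "mon1"],
    "mgrmap": {"active_name": "mon2"},
}


def hostname_anywhere(status, hostname):
    """Naive check: search the whole serialized status for the hostname."""
    return hostname in json.dumps(status)


def hostname_in_mons(status, hostname):
    """Stricter check: only look at the monitor part of the status."""
    return hostname in status["quorum_names"]


print(hostname_anywhere(STATUS, "mon2"))  # True: matched through the mgr entry
print(hostname_in_mons(STATUS, "mon2"))   # False: no longer a monitor
```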
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since the pg_autoscaler has been enabled recently in ceph, this check
should only validate that the requested pools are created.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In containerized deployments, the restart OSD handler couldn't be
triggered in most ansible executions.
This is due to the usage of run_once + a condition on the inventory
hostname and the last filter.
The run_once is triggered first so ansible will pick a node in the
osd group to execute the restart task. But if this node isn't the
last one in the osd group then the task is ignored. The task is
more likely to be ignored than executed.
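The effect can be sketched in Python, assuming a hypothetical four-node osd group and treating every node as a possible pick (the function names are illustrative and do not come from the role):

```python
def restart_task_runs(osd_group, picked_host):
    """The task runs only if the host picked by run_once is the last in the group."""
    return picked_host == osd_group[-1]


def fraction_of_picks_that_run(osd_group):
    """Share of possible picks for which the restart task is actually executed."""
    runs = sum(restart_task_runs(osd_group, host) for host in osd_group)
    return runs / len(osd_group)


osds = ["osd0", "osd1", "osd2", "osd3"]  # hypothetical inventory group
print(fraction_of_picks_that_run(osds))  # 0.25
```

With N nodes in the group, only one pick out of N lets the handler run.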
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The ceph-rbd-mirror role allows copying the admin keyring via the
copy_admin_key variable but there's actually no task in that role
doing the job.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The admin keyring isn't present by default on the rbd mirror nodes so
the rbd commands related to the mirroring configuration will fail.
Instead we can use the rbd mirror client keyring.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
During an upgrade we're installing the platform with the stable-3.2
branch. But the ansible configuration is still using the file from the
current branch which could have some differences.
Instead we can override the ANSIBLE_CONFIG environment variable for
the stable-3.2 commands.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Ganesha cannot be operated active/active. In deployments
where it is managed by pacemaker, the container name can be
different from the default.
This change uses "ceph_nfs_service_suffix" where previously
missing to ensure tasks will work with customized names.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
The ANSIBLE_CONFIG value wasn't set correctly for two scenarios. This
environment variable doesn't use '-F'.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The rbd mirror configuration was only available for non containerized
deployments and was also incomplete.
We now enable the mirroring on the pool and add the remote peer in both
scenarios.
The default mirroring mode is set to 'pool' but can be configured via
the ceph_rbd_mirror_mode variable.
This commit also fixes an issue on the rbd mirror command if the ceph
cluster name isn't using the default value (ceph) due to a missing
--cluster parameter to the command.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We should pin down the versions of downstream containers for dashboard
instead of using upstream containers.
Signed-off-by: Boris Ranto <branto@redhat.com>
This change fixes the discovered_interpreter_python variable
name that was "discovered_python_interpreter" and caused a
failure in OSP deployments.
Signed-off-by: fmount <fpantano@redhat.com>
When upgrading from stable to devel release with redhat community
packages, the rpm packages are not updated due to priority introduced
via a7b1e35 (starting nautilus).
We need to remove the ceph stable repositories when configuring the
dev repositories.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We have no reason to make grafana container
listen on *:<port>, so this change adds the
http_addr option to the grafana config file
and adds the related option on the wait_for
tasks.
Since grafana_server_addr should exist, we
shouldn't rely on the _current_monitor_addr
default on prometheus/grafana templates.
This change also removes this default value,
which is not necessary anymore.
Signed-off-by: fmount <fpantano@redhat.com>
This commit also removes the notify on the newly added debian repo,
forces update_cache to yes and defines sample ceph_custom_key vars.
Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
Instead of hardcoding `luminous`, use the `ceph_stable_release` variable
to point to the correct repository.
This is now uncommented in roles/ceph-defaults/defaults/main.yml to be
available, as it is only used if ceph_repository is set to 'obs'.
group_vars/*.sample files have been regenerated using the
./generate_group_vars_sample.sh script.
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
roles/ceph-common/tasks/installs/suse_obs_repository.yml:
ansible's zypper_repository module does not have a 'uri' parameter; it is
called 'repo' instead
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
We don't use multiple grafana nodes for the moment on the other
scenarios and I don't think this is supposed to be working.
We can often see grafana failures on that scenario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Test the switch_to_containers job against the latest ceph@master
ceph-container image tag available.
This ensures that the ceph release deployed in the first step (non
containerized deployment) isn't newer than the tag used for the
containerized migration (which would mean we try to downgrade the
version).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If the user has set the `ansible_python_interpreter`, ansible will not try to
discover python, so `discovered_python_interpreter` will not be set.
Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter`
if `ansible_python_interpreter` is defined
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
On containerized deployments, the mon container sometimes needs to
access the radosgw endpoint (via the radosgw-admin command). When
using TLS on the radosgw with self-signed certificates, we need to
access the CA certificate from the mon container.
The CA certificate needs to be added on the host and then the directory
will be bind mounted into the container.
Resolves: #4358
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Like the OpenStack keyrings, we can use the profile rbd for the clients
keyring (both mon and osd).
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This reverts commit 2d955757ee.
The "osd blacklist" isn't an osd cap but should be used with mon caps.
Also the correct cap for this is: 'allow command "osd blacklist"'.
The current change is breaking the openstack and clients keyrings.
By using the profile rbd (which is already used) we already rely on the
ability to blacklist dead clients.
Resolves: #4385
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit makes it possible to parametrize the ceph directories modes.
It changes the hardcoded mode for ceph related directories from 0755 to
a value customizable with the `ceph_directories_mode` variable.
Closes: #2920
Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
since the following commit:
commit 1ac94c048f
rgw: add support for multiple rgw instances on a single host
we have multi-instance rgw support on a single host and
the config section name of the rgw changed from
[client.rgw.$(hostname)] -> [client.rgw.$(hostname).rgwX]
where X is the sequence number: 0,1,2,...
So we should assign 'rgw_zone' item to the exact rgw instance
config section in ceph.conf
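As an illustration of the naming scheme only, a short Python sketch that builds the per-instance section names. The function name and the sample hostname are hypothetical:

```python
def rgw_section_names(hostname, instance_count):
    """Build one ceph.conf section name per rgw instance on a host."""
    return [f"client.rgw.{hostname}.rgw{i}" for i in range(instance_count)]


print(rgw_section_names("node1", 2))
# ['client.rgw.node1.rgw0', 'client.rgw.node1.rgw1']
```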
Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
This commit fixes the error [301]:
`[301] Commands should not change things if nothing needs doing`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit fixes the error [306]:
`[306] Shells that use pipes should set the pipefail option`
using `/bin/bash` as executable because Debian/Ubuntu systems use `dash`
by default which doesn't support `-o pipefail`. (See:
https://github.com/ansible/ansible-lint/issues/497#issue-424623501)
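What pipefail changes can be seen with a small Python sketch, assuming `/bin/bash` is available on the machine running it (the helper name is illustrative):

```python
import subprocess


def exit_code(command, shell="/bin/bash"):
    """Run a shell command line with the given shell and return its exit status."""
    return subprocess.run(command, shell=True, executable=shell).returncode


# Without pipefail, the pipeline status is that of the last command only.
without_pipefail = exit_code("false | true")

# With pipefail, a failure anywhere in the pipeline is reported.
with_pipefail = exit_code("set -o pipefail; false | true")

print(without_pipefail, with_pipefail)  # 0 1
```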
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Allow the use of 'obs' as a valid value for ceph_repository, and validate that
- OS is openSUSE
- ceph_obs_repo is defined
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
Move the validation from roles/ceph-common/tasks/installs/install_on_suse.yml
to roles/ceph-validate/ and fix the syntax.
There are two valid combinations of `ceph_origin` and `ceph_repository` on
SUSE/openSUSE:
- ceph_origin == 'distro'
- ceph_origin == 'repository' and ceph_repository == 'obs'
The current when condition would fail even in the valid second combination,
as ceph_origin != distro would be true then
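The two valid combinations, and why the old condition rejected the second one, can be sketched in Python. The function names are hypothetical and the old condition is reconstructed only from the description above:

```python
def old_check_fails(ceph_origin, ceph_repository=None):
    """Old condition as described: anything other than 'distro' fails."""
    return ceph_origin != "distro"


def is_valid_on_suse(ceph_origin, ceph_repository=None):
    """Accept 'distro', or 'repository' combined with the 'obs' repository."""
    if ceph_origin == "distro":
        return True
    return ceph_origin == "repository" and ceph_repository == "obs"


# The valid second combination was rejected by the old condition.
print(old_check_fails("repository", "obs"))   # True
print(is_valid_on_suse("repository", "obs"))  # True
```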
Fixes: #4362
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
The "run 'ceph-volume lvm batch --report' to see how many osds are to be
created" and "run 'ceph-volume lvm list' to see how many osds have already been
created" statements only register the lvm_batch_report and lvm_list variables.
Running those ceph-volume commands should never produce a change on the system.
Adding changed_when: false prevents irrelevant change messages from Ansible.
Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
As SUSE 15.x and openSUSE Leap 15.x share the same base, make clear
that both are targeted by the respective tasks
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
On containerized deployment, the OSD entrypoint runs some ceph-volume
commands (lvm/simple scan and/or activate) which perform badly without
the ulimit option.
This option was added for all previous ceph-volume commands but not on
the ceph-osd container startup.
Also updating hard limit value to 4096 to reflect default baremetal
value.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
By changing the set ownership command from using the file module in combination with a with_items loop to a raw chown command, we can achieve a 98% performance increase here.
On a ceph cluster with a significant number of directories and files in /var/lib/ceph, the file module has to run checks on ownership of all those directories and files to determine whether a change is needed.
In this case, we just want to explicitly set the ownership of all these directories and files to the ceph_uid.
Added a context note to all "set proper ownership" tasks.
Signed-off-by: Kevin Jones <kevinjones@redhat.com>
roles/ceph-validate/tasks/check_nfs.yml: fail on openSUSE Leap
using `ceph_origin = distro`, as the ganesha packages are not available from
the distribution repositories
Fixes: #4342
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
install packages on SUSE/openSUSE distributions, using the
same logic as on RedHat-based distributions
Fixes: #4340
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
roles/ceph-common/tasks/installs/install_on_suse.yml: remove the task that
installs the dependencies, as this is done later in install_suse_packages.yml
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>