The `osd_pool_default_crush_rule` is set based on `crush_rule_variable`, which
is the output of a `grep` command.
However, two consecutive tasks can set that variable, and if the second task is
skipped, it still overwrites the `crush_rule_variable`, leading the
`osd_pool_default_crush_rule` to be set to `ceph_osd_pool_default_crush_rule`
instead of the output of the first task.
This commit ensures that the fact is set right after the `crush_rule_variable`
is assigned, before it can be overwritten.
Closes#5912
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit c5f7343a2f)
The ceph status command returns a lot of information stored in variables
and/or facts which could consume resources for nothing.
When checking the rgw/rbdmirror services status, we're only using the
servicmap structure in the ceph status output.
To optimize this, we could use the ceph service dump command which contains
the same needed information.
This command returns less information and is slightly faster than the ceph
status command.
$ ceph status -f json | wc -c
2001
$ ceph service dump -f json | wc -c
1105
$ time ceph status -f json > /dev/null
real 0m0.557s
user 0m0.516s
sys 0m0.040s
$ time ceph service dump -f json > /dev/null
real 0m0.454s
user 0m0.434s
sys 0m0.020s
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3f9081931f)
In case of deploying new monitor node to an existing cluster,
osd_pool_default_crush_rule should be taken from running monitor because
ceph-osd role won't be run and the new monitor will have different
osd_pool_default_crush_role from other monitors.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ff9f4d138f)
delegate to its own host after checking mon socket to findout if mon socket is in-use or not.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 69f7e35382)
There's no need to define a variable via a fact if we can do it via a
default value. Using a fact could be interesseting to override the
default value on some condition.
- ceph_uid could be set to 167 by default because it's only different on
non containerized deployment on Debian/Ubuntu.
- rbd_client_directory_{owner,group,mode} could be set to ceph,ceph,0770
by default install of null as we are doing in the facts.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1875058
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7f997e623a)
When running the rolling_update playbook with an inventory without
monitor nodes defined (like external scenario) then we can't retrieve
the cluster fsid from the running monitor.
In this scenario we have to pass this information manually (group_vars
or host_vars).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f63022dfec)
This commit fixes these tasks when --limit is used.
It makes sure the fact is set on right nodes even when the playbook is
run with `--limit`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8a951f50c)
always set these facts on monitor nodes whatever we run with `--limit`.
Otherwise, playbook will fail when using `--limit` on nodes where these
facts are used on a delegated task to monitor.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5e81843e9)
This commit fixes a bug when trying to scale out osd nodes with
`crush_rule_config` is enabled.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4bcc52cb2a)
When `rule_name` isn't set in `crush_rules` the osd pool creation will
fail.
This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with
the default crush rule name.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bb9860dfd)
It looks like that the service module doesn't support wildcard anymore
for stopping/disabling multiple services.
fatal: [rgw0]: FAILED! => changed=false
msg: 'This module does not currently support using glob patterns,
found ''*'' in service name: ceph-radosgw@*'
...ignoring
Instead we should iterate over the rgw_instances list.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9d3b49293d)
When using different name between the inventory_hostname and the
ansible_hostname then the _container_exec_cmd fact will get a wrong
value based on the inventory_hostname instead of the ansible_hostname.
This happens when the ceph cluster is already running (update/upgrade).
Later the container exec commands will fail because the container name
is wrong.
We should always set the _container_exec_cmd based on the
ansible_hostname fact.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1fcafffdad)
fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0'
Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com>
(cherry picked from commit ed1eaa1f38)
We don't need to executed the grafana fact everytime but only during
the dashboard deployment.
Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f940e695ab)
We must pick up a mon which actually exists in ceph-facts in order to
detect if a cluster is running. Otherwise, it will state no cluster is
already running which will end up deploying a new monitor isolated in a
new quorum.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86f3eeb717)
that task is delegated on the first mon so we should always use the
`discovered_interpreter_python` from that node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5adb735c78)
When using `osd_auto_discovery`, `devices` is built multiple times due
to multiple runs of `ceph-facts` role. It end up with duplicate
instances of a same device in the list.
Using `unique` filter when building the list fixes this issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 23b1f43897)
in order to be able to call container_binary without having to run the
whole ceph-facts role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe5ffe589e)
This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and
removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook
to avoid duplicated code.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fa9b42e98e)
During the rolling_update scenario, the fsid value is retrieve from the
current ceph cluster configuration via the ceph daemon config command.
This command tries first to resolve the admin socket path via the
ceph-conf command.
Unfortunately this command won't work if you have a duplicate key in the
ceph configuration even if it only produces a warning. As a result the
task will fail.
Can't get admin socket path: unable to get conf option admin_socket for
mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined
Instead of using ceph daemon we can use the --admin-daemon option
because we already know what the socket admin path value based on the
ceph cluster and mon hostname values.
Closes: #4492
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ec3b687dc4)
e695efc hasn't been updated with the changes introduced in 9bb11c7 so
the ips_in_ranges filter isn't used for an external grafana instance.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 20b1a464ec)
This change implements a filter_plugin that is used in the
ceph-facts, ceph-validate roles and infrastucture-playbooks.
The new filter plugin will return a list of all IP address
that reside in any one of the given IP ranges. The new filter
replaces the use of the ipaddr filter.
ceph.conf already support a comma separated list of CIDRs
for the public_network and cluster_network options.
Changes: [1] and [2] introduced a regression in ceph-ansible
where public_network can no longer be a comma separated list
of cidrs.
With this change a comma separated list of subnet CIDRs can
also be used for monitor_address_block and radosgw_address_block.
[1] commit: d67230b2a2
[2] commit: 20e4852888
Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030
Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283Closes: #4333
Please backport to stable-4.0
Signed-off-by: Harald Jensås <hjensas@redhat.com>
(cherry picked from commit e695efcaf7)
This change just adds the task to inject from the
ceph dashboard mgr module the required layouts
to show all the cluster metrics on the grafana
instance.
Since we're now able to push grafana layouts through
the ceph mgr module command, the dashboards configuration
template is no longer needed on containerized environments.
This commit also fixes the Vagrantfile IP static assigment
in the grafana section because it generates an issue (it's
the same of the mgr instance).
Finally, considering some deployments that use an external
grafana server instance, we reworked the 'grafana_server_addr'
assignment to address these requirements.
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 9bb11c7b2a)
This change fixes the discovered_interpreter_python variable
name that was "discovered_python_interpreter" and caused a
failure in OSP deployments.
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 81eb091533)
If the user has set the `ansible_python_interpreter`, ansible will not try to
discover python, so `discovered_python_interpreter` will not be set.
Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter`
if `ansible_python_interpreter` is defined
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
(cherry picked from commit bd507fa147)
This commit fixes a typo in roles/ceph-facts/tasks/facts.yml
Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
(cherry picked from commit e1b9312084)
there is no need to use `shell` in these tasks. Let's use `command`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4df92152c0)
in order to use the right binary name when using python cli in command
or shell module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 13815ad3ca)
There's no need to add complexity and trying to fallback on other group.
Let's deploy dashboard on all nodes present in grafana-server group.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d67230b2a2)
This commit moves some old variables into ceph-defaults so we can move
the `use_new_ceph_iscsi` fact in ceph-facts role in order.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a781ce881c)
As the bz1721914 describes, the grafana_server_addr
fact is not defined if ip_version used is ipv6.
This commit adds the ip_version condition to set
correctly this fact.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721914
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit e655038743)
If no grafana-server group is defined while an mgr group is, that task
will fail because `hostvars[groups[grafana_server_group_name][0]` can't
return anything since `groups['grafana-server']` will be a non existing
key.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 366b309c12)
Since timesyncd is not available on RHEL-based OSs, change the default
to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so
set the Ansible fact accordingly.
Fixes: https://github.com/ceph/ceph-ansible/issues/3628
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9d88d3199f)
Few fixes on systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables resulted undefined. This commit adds the
ability to check if the group exist and create, accordingly,
different job groups in prometheus template.
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 069076bbfd)
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```
Now appended ``| bool`` on a lot of the affected variables.
Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.
Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec)
The fsid generation is done via a python command. When the ansible
controller node only have python3 available (like RHEL 8) then the
python command isn't necessarily present causing the fsid generation
to fail.
We already do some resource creation (like ceph keyring secret) with
the python command too but from the mon node so we should do the same
for fsid.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1714631
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit daf92a9e1f)
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e74d80e72f)
This is needed for the ceph-dashboard implementation since it requires
to run containerized application which aren't packaged as RPMs.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9dbb1d39)
running an external ceph cluster deployment with (obviously) no
monitors defined in inventory breaks with an undefined error because
`_monitor_addresses` never get defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 936c6fca78)
Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.
Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
(cherry picked from commit 3c8987c7a5)
Keywords requiring only one item shouldn't express it by creating a
list with single item.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c80)
Conflicts:
roles/ceph-mon/tasks/ceph_keys.yml
roles/ceph-validate/tasks/check_devices.yml
Otherwise the reader is forced to search for "when" when blocks are too
long.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e0beaf123a)
Conflicts:
roles/ceph-config/tasks/main.yml
roles/ceph-container-common/tasks/pre_requisites/prerequisites.yml
roles/ceph-validate/tasks/check_devices.yml
ceph-volume didn't work when the devices where passed by path.
Since it now support it, let's allow this feature in ceph-ansible
Closes: #3812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e0adca7a4)
Add a couple of fixes to allow containerized deployments upgrade support
to upgrade from luminous/mimic to nautilus.
- pass CEPH_CONTAINER_IMAGE and CEPH_CONTAINER_BINARY environment
variable to the ceph_key module,
- fix the docker exec command in 'waiting for the containerized monitor
to join the quorum' task according to the `delegate_to` parameter,
- override `docker_exec_cmd` in `ceph-facts` with `mon_host` when
rolling_update is `True`,
- do not run unnecessarily `create_mds_filesystems.yml` when performing an
upgrade.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>