Move dashboard, grafana/prometheus and node-exporter plays into a
dedicated playbook in infrastructure-playbook directory.
To avoid using 'dashboard_enabled | bool' condition multiple time
in the main playbook we can just import the dashboard playbook or
not.
This patch also allows to use an unique dashboard playbook for
both baremetal and container playbooks.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 43135840b1)
Even if dashboard feature is disabled then the installer status will
still report dashboard, grafana and node-exporter roles timing.
INSTALLER STATUS **********************************
Install Ceph Monitor : Complete (0:01:21)
Install Ceph Manager : Complete (0:00:49)
Install Ceph Dashboard : Complete (0:00:00)
Install Ceph Grafana : Complete (0:00:02)
When need to set the dashboard_enabled condition on those installer
phase pre/post tasks.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0604275c11)
This commits adds the support of the installer phase for dashboard,
grafana and node-exporter roles.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c7a5967a6f)
The ceph-agent role was used only for RHCS 2 (jewel) so it's not
usefull anymore.
The current code will fail on CentOS distribution because the rhscon
package is only avaible on Red Hat with the RHCS 2 repository and
this ceph release is supported on stable-3.0 branch.
Resolves: #4020
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7503098ca0)
ceph-dashboard should be deployed on either a dedicated mgr node or a
mon if they are collocated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bdc870cbf5)
Few fixes on systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables resulted undefined. This commit adds the
ability to check if the group exist and create, accordingly,
different job groups in prometheus template.
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 069076bbfd)
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```
Now appended ``| bool`` on a lot of the affected variables.
Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.
Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec)
This commit splits the current `ceph-container-common` role.
This introduces a new role `ceph-container-engine` which handles the
tasks specific to the installation of containers tools (docker/podman).
This is needed for the ceph-dashboard implementation for 2 main reasons:
1/ Since the ceph-dashboard stack is only containerized, we must install
everything needed to run containers even in non containerized
deployments. Splitting this role allows us to not have to call the full
`ceph-container-common` role which would run a bunch of unneeded tasks
that would have been skipped anyway.
2/ The current implementation would have required to run
`ceph-container-common` on all ceph-clients nodes which would have been
conflicting with 9d3517c670 (we don't want
to run ceph-container-common on all client nodes, see mentioned commit
for more details)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 55420d6253)
use a block in grafana-server section to avoid duplicate condition.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit be4a565612)
use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the differents applications needed for ceph-dashboard.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cdff0da7d4)
This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to setup ceph-dashboard
and the underlying technologies like prometheus and grafana server.
Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f141a6e80)
Keywords requiring only one item shouldn't express it by creating a
list with single item.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c80)
Conflicts:
roles/ceph-mon/tasks/ceph_keys.yml
roles/ceph-validate/tasks/check_devices.yml
Let's bootstrap mgrs on monitors only if there's no mgrs section in
inventory hostfile.
Closes: #3613
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
is_atomic is defined in ceph-facts or very early in main playbook.
In non containerized deployment, is_atomic is only set in ceph-facts
which is played after ceph-validate.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
to avoid duplicating code in `site.yml.sample`, `site-docker.yml.sample`
and `setup.yml`, let's isolate this part of the code and simply include
it each time we need it.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
configure_firewall tests the is_atomic variable if the firewalld package
is not present. is_atomic is defined in ceph_facts so include that.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add `any_errors_fatal: true` to all host sections in `site.yml.sample`
and `site-container.yml.sample` so that the playbook execution
ceases spontaneously and instantaneously when errors occurs.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speedup
the playbook.
Closes: #3282
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
We need to apply any_errors_fatal: true to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since import_role and include_role are more readable, explicit (about
the nature of inclusion) and flexible (allows placibf inclusion
anywhere) amongst the tasks, use them instead of using roles or role
keyword. Besides, these keywords also allow more arguments.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If someone's cluster name is 'ceph' then the playbook will fail (with no
errors because of ignore_errors) saying it can not find the variable. So
let's declare the default. If the cluster name is different then it'll
be in group_vars and thus there won't be any failre.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636962
Signed-off-by: Sébastien Han <seb@redhat.com>
The role contains all the handlers for Ceph services. We decided to
leave ceph-defaults role with variables and a few facts only. This is
useful when organizing the site.yml files and also adding the known
variables to infrastructure-playbooks.
Signed-off-by: Sébastien Han <seb@redhat.com>
Same problem again... ceph_release_num[ceph_release] is only set in
ceph-docker-common/common roles so putting the condition on that role
will never work. Removing the condition.
The downside of this is we will be installing packages and then skip the
role on the node.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622210
Signed-off-by: Sébastien Han <seb@redhat.com>
Recently we renamed the group_name for iscsi iscsigws where previously
it was named iscsi-gws. Existing deployments with a host file section
with iscsi-gws must continue to work.
This commit adds the old group name as a backoward compatility, no error
from Ansible should be expected, if the hostgroup is not found nothing
is played.
Close: https://bugzilla.redhat.com/show_bug.cgi?id=1619167
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-rest-api binary has been removed in mimic so we cannot deploy it
anymore. We just keep the role and the compability for existing users.
Signed-off-by: Sébastien Han <seb@redhat.com>
Check `systempython2.stat` instead of `systempython2.stat.exists`.
Without this change, in the case that python2 is not installed, the `stat`
task fails without defining `systempython2.stat`. It leads that the next
installation tasks fail because of undefined `systempython2.stat`.
An example error output (edited for readability):
```
TASK [check for python2] ***********************************************
Wednesday 25 July 2018 14:52:47 +0900 (0:00:00.182) 0:00:00.182 *
fatal: [ceph-osd1.vlan221.vtj]: FAILED! => {
"changed": false, "module_stderr": "/bin/sh: 1: /usr/bin/python: not
found\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 127}
...ignoring
TASK [install python2 for debian based systems] ************************
Wednesday 25 July 2018 14:51:00 +0900 (0:00:01.742) 0:00:01.926 *
fatal: [ceph-mon2]: FAILED! => {
"msg": "The conditional check 'systempython2.stat.exists is undefined or
systempython2.stat.exists == false' failed. The error was: error while
evaluating conditional (systempython2.stat.exists is undefined or
systempython2.stat.exists == false): 'dict object' has no attribute 'stat'
\n\n The error appears to have been in
'/Users/arata/git/ceph-ansible/site.yml.sample': line 36, column 7, but
may\n be elsewhere in the file depending on the exact syntax problem.\n\n
The offending line appears to be:\n\n\n
- name: install python2 for debian based systems\n
^ here\n
"}
...ignoring
```
Fixes: #2930
Signed-off-by: Arata Notsu <arata776@gmail.com>
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes,
using an underscore fixes the issue.
Signed-off-by: Sébastien Han <seb@redhat.com>
This needs to be in it's own play with ceph-defaults included
so that I can validate things that might be defaulted in that
role.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
there is no need to gather facts with O(N^2) way.
Only one node should gather facts from other node.
Fixes: #2553
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This was taken from the openshift ansible repository here:
https://github.com/leseb/openshift-ansible/tree/master/roles/installer_checkpoint
Rationale:
A complete OpenShift cluster installation is comprised of many different
components which can take 30 minutes to several hours to complete. If
the installation should fail, it could be confusing to understand at
which component the failure occurred. Additionally, it may be desired to
re-run only the component which failed instead of starting over from the
beginning. Components which came after the failed component would also
need to be run individually.
Ceph has a similar situation so we can benefit from that
callback_plugin.
Signed-off-by: Sébastien Han <seb@redhat.com>