75 Commits (b1f8518ef9b705223554251474be44bf8091151e)

Author SHA1 Message Date
fmount 138fa19ccf Fix units and add ability to have a dedicated instance
A few fixes to the systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer to the grafana group and keep it consistent with the
other groups.
During the integration with TripleO some grafana/prometheus
template variables turned out to be undefined. This commit adds
the ability to check whether the group exists and create
different job groups in the prometheus template accordingly.

Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 069076bbfd)
2019-06-12 11:48:12 +02:00
L3D 1daca1ba83 ansible: use 'bool' filter on boolean conditionals
Running ceph-ansible produces a lot of ``[DEPRECATION WARNING]`` messages like this one:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

This commit appends ``| bool`` to many of the affected variables.

In some places the coding style also changed from ``variable|bool`` to ``variable | bool`` *(with spaces around the pipe)*.

Closes: #4022

Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec)
2019-06-07 16:05:51 +02:00
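As a rough illustration of the change in 1daca1ba83, a conditional on a bare variable and its `| bool` form might look like the sketch below. The task itself is hypothetical; only `containerized_deployment` is taken from the warning text above.

```yaml
# Hypothetical task, for illustration only.
- name: show a message on containerized deployments
  debug:
    msg: "containerized deployment detected"
  # Before: `when: containerized_deployment` (bare variable, triggers the warning)
  when: containerized_deployment | bool
```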
guihecheng 606b2e2082 Add section for rgw loadbalancer in site.yml
This makes the ceph rgw loadbalancer tasks run.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
(cherry picked from commit 96c346743b)
2019-06-06 19:44:30 +00:00
Guillaume Abrioux 16c6d530c6 roles: introduce `ceph-container-engine` role
This commit splits the current `ceph-container-common` role.

This introduces a new role `ceph-container-engine` which handles the
tasks specific to the installation of container tools (docker/podman).

This is needed for the ceph-dashboard implementation for 2 main reasons:

1/ Since the ceph-dashboard stack is only containerized, we must install
everything needed to run containers even in non-containerized
deployments. Splitting this role means we do not have to call the full
`ceph-container-common` role, which would run a bunch of unneeded tasks
that would have been skipped anyway.

2/ The current implementation would have required running
`ceph-container-common` on all ceph-clients nodes, which would have
conflicted with 9d3517c670 (we don't want to run ceph-container-common
on all client nodes, see the mentioned commit for more details).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 55420d6253)
2019-05-22 15:24:11 -04:00
Guillaume Abrioux 406dd2880c playbook: use blocks for grafana-server section
Use a block in the grafana-server section to avoid duplicating the condition.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit be4a565612)
2019-05-17 16:05:58 +02:00
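The idea behind 406dd2880c can be sketched as follows: a `block` carries the condition once instead of repeating it on every task. The variable and role names here are placeholders, not the ones used in the playbook.

```yaml
# Sketch only: variable and role names are placeholders.
- name: tasks that share a single condition
  when: example_dashboard_enabled | bool
  block:
    - import_role:
        name: example-role-one
    - import_role:
        name: example-role-two
```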
Guillaume Abrioux fe5bcc2f9f dashboard: do not call ceph-container-common from other role
Use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments, since there's no RPM available to
deploy the different applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cdff0da7d4)
2019-05-17 16:05:58 +02:00
Boris Ranto 5ac7559736 Merge cephmetrics/dashboard-ansible repo
This commit merges the dashboard-ansible installation scripts into
ceph-ansible. This includes several new roles to set up ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f141a6e80)
2019-05-17 16:05:58 +02:00
Rishabh Dave 06b3ab2a6b improve coding style
Keywords that take only one item shouldn't express it as a
single-item list.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c80)

Conflicts:
	roles/ceph-mon/tasks/ceph_keys.yml
	roles/ceph-validate/tasks/check_devices.yml
2019-05-06 15:09:06 +00:00
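A minimal before/after sketch of the style rule in 06b3ab2a6b, using a made-up task and variable:

```yaml
# Before: a list holding a single condition.
- name: example task
  command: /bin/true
  when:
    - example_flag | bool

# After: the single condition written as a scalar.
- name: example task
  command: /bin/true
  when: example_flag | bool
```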
Guillaume Abrioux 655bdb189c Revert "site.yml: run ceph-validate before facts/defaults roles"
This commit didn't make any sense and should never have been merged.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-15 16:17:15 +00:00
Guillaume Abrioux 299c7b670e site.yml: do not bootstrap mgrs on monitors by default
Let's bootstrap mgrs on monitors only if there's no mgrs section in
the inventory host file.

Closes: #3613

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 13:19:06 +00:00
Guillaume Abrioux 7f7f3769b3 main: add a retry/until for python installation
Add a retry/until in raw_install_python.yml to avoid unexpected
repository failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
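For readers unfamiliar with the pattern named in 7f7f3769b3, a retry/until loop on a `raw` task could look roughly like this. The install command, retry count and delay are assumptions for illustration, not the values from `raw_install_python.yml`.

```yaml
# Sketch only: command, retries and delay are assumed values.
- name: install python (illustrative)
  raw: yum -y install python
  register: result
  until: result is succeeded
  retries: 5
  delay: 10
```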
Guillaume Abrioux 500256cdab validate: fix ntp_daemon_type check in validate
is_atomic is defined in ceph-facts or very early in main playbook.

In non-containerized deployments, is_atomic is only set in ceph-facts,
which is played after ceph-validate.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 9c10affb69 site.yml: run ceph-validate before facts/defaults roles
ceph-validate must be run before any other role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux b94290af43 refact the 'raw' installation of python
to avoid duplicating code in `site.yml.sample`, `site-docker.yml.sample`
and `setup.yml`, let's isolate this part of the code and simply include
it each time we need it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-16 10:16:11 +01:00
Brad Hubbard 55fab6f547 site: Make sure is_atomic is defined
configure_firewall tests the is_atomic variable if the firewalld package
is not present. is_atomic is defined in ceph_facts so include that.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2019-01-15 15:22:43 +01:00
Guillaume Abrioux e9188cd202 ceph-default: rm useless condition
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-14 14:41:13 +00:00
Rishabh Dave 5f43dae593 set any_errors_fatal to true for all host sections
Add `any_errors_fatal: true` to all host sections in `site.yml.sample`
and `site-container.yml.sample` so that the playbook execution
stops immediately when an error occurs.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-20 14:04:11 +01:00
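A host section with the flag from 5f43dae593 might look like the sketch below. The group and role names are assumed for illustration; the point is that `any_errors_fatal` is a play-level keyword and has to be set on each play.

```yaml
# Sketch only: group and role names are assumptions.
- hosts: mons
  any_errors_fatal: true
  become: true
  tasks:
    - import_role:
        name: ceph-mon
```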
Guillaume Abrioux 0eb56e36f8 introduce new role ceph-facts
Sometimes we play the whole `ceph-defaults` role just to access the
default value of some variables. It means we play the `facts.yml` part
of this role even though it's not desired. Splitting this role will
speed up the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-12 11:18:01 +01:00
Rishabh Dave 2fb12ae554 use pre_tasks and post_tasks when necessary
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-05 08:17:10 +00:00
Rishabh Dave e4f0af2b78 don't use private option for import_role
Sharing variables among roles has been the default since Ansible 2.6
and the private option has been deprecated, so stop using it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 23:45:59 +00:00
Sébastien Han 1c760904b0 site: collocated mon and mgr by default
This will speed up the deployment and also deploy mon and mgr collocated,
just as recommended.
This won't prevent you from adding more dedicated machines for mgr if
needed.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han acc92626f6 sites: fail the playbook on any failure
We need to apply `any_errors_fatal: true` to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 87e90a0893 lint: Don't compare to literal True/False
Use `when: var` rather than `when: var == True`

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-08 10:22:02 +00:00
Rishabh Dave 3f62fc585f don't use "role" or "roles" to include roles
Since import_role and include_role are more readable, explicit (about
the nature of inclusion) and flexible (they allow placing the inclusion
anywhere among the tasks), use them instead of the roles or role
keyword. Besides, these keywords also allow more arguments.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-10-31 09:38:59 +01:00
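The switch described in 3f62fc585f can be sketched as below. The group and role names are assumed for illustration.

```yaml
# Before: roles listed with the play-level `roles` keyword.
- hosts: mons
  roles:
    - ceph-defaults
    - ceph-mon

# After: the same roles imported as ordinary tasks, which can be
# interleaved with other tasks and accept extra arguments.
- hosts: mons
  tasks:
    - import_role:
        name: ceph-defaults
    - import_role:
        name: ceph-mon
```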
Guillaume Abrioux d8d3e55006 remove restapi role
As of `mimic`, restapi is no longer available because of the manager daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-30 14:19:13 +01:00
Guillaume Abrioux 40b7747af7 remove jewel support
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-12 23:38:17 +00:00
Guillaume Abrioux b3a71eeb08 ceph-infra: add new role ceph-infra
this role manages ceph infra services such as ntp, firewall, ...

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-10 15:44:33 +00:00
Sébastien Han 82ec5a29f2 site: use default value for 'cluster' variable
If someone's cluster name is 'ceph' then the playbook will fail (with no
errors because of ignore_errors) saying it cannot find the variable. So
let's declare the default. If the cluster name is different then it'll
be in group_vars and thus there won't be any failure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636962
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-08 20:31:32 +00:00
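One common way to give a variable a fallback value is the Jinja2 `default` filter, sketched below. This is an assumption about the general technique, not necessarily how 82ec5a29f2 declares the default; the task is hypothetical.

```yaml
# Sketch only: hypothetical task showing a fallback for `cluster`.
- name: get ceph status (illustrative)
  command: "ceph --cluster {{ cluster | default('ceph') }} -s"
  register: ceph_status
  changed_when: false
  ignore_errors: true
```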
Sébastien Han 4db6a213f7 add ceph-handler role
The role contains all the handlers for Ceph services. We decided to
leave ceph-defaults role with variables and a few facts only. This is
useful when organizing the site.yml files and also adding the known
variables to infrastructure-playbooks.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-09-28 15:15:49 +00:00
Sébastien Han ae5ebeeb00 sites: fix conditional
Same problem again... ceph_release_num[ceph_release] is only set in
ceph-docker-common/common roles so putting the condition on that role
will never work. Removing the condition.

The downside of this is we will be installing packages and then skip the
role on the node.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622210
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-08-27 22:11:15 +02:00
Sébastien Han 77a3a682f3 iscsi group name preserve backward compatibility
Recently we renamed the iscsi group_name to iscsigws; previously it
was named iscsi-gws. Existing deployments with an iscsi-gws section
in the host file must continue to work.

This commit adds the old group name for backward compatibility. No error
from Ansible should be expected: if the host group is not found, nothing
is played.

Close: https://bugzilla.redhat.com/show_bug.cgi?id=1619167
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-08-20 23:52:19 +02:00
Sébastien Han b334cdcbe5 restapi: disable it when ceph version > luminous
The ceph-rest-api binary has been removed in mimic so we cannot deploy it
anymore. We just keep the role and the compatibility for existing users.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-30 13:18:30 +00:00
Sébastien Han 1f341e69d1 site: report ceph -s status at the end of the deployment
We now show the output of 'ceph -s'. Example output below:

TASK [display post install message] **********************************************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": [
        "  cluster:",
        "    id:     753212df-f32a-4cc9-a097-2db6fe89a251",
        "    health: HEALTH_OK",
        " ",
        "  services:",
        "    mon: 1 daemons, quorum ceph-nano-lul-faa32aebf00b",
        "    mgr: ceph-nano-lul-faa32aebf00b(active)",
        "    osd: 1 osds: 1 up, 1 in",
        " ",
        "  data:",
        "    pools:   4 pools, 32 pgs",
        "    objects: 224 objects, 2546 bytes",
        "    usage:   1027 MB used, 9212 MB / 10240 MB avail",
        "    pgs:     32 active+clean",
        " "
    ]
}

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1602910
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-27 14:49:42 +00:00
Guillaume Abrioux a1ca2c8fd3 iscsigw: do not run common roles when deploying jewel
Let's not deploy common roles on iscsigw nodes in jewel deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-26 18:47:10 +00:00
Arata Notsu 2bbb4acca6 site.yml.sample: fix install python2
Check `systempython2.stat` instead of `systempython2.stat.exists`.

Without this change, when python2 is not installed, the `stat`
task fails without defining `systempython2.stat`. As a result, the
subsequent installation tasks fail because `systempython2.stat` is undefined.

An example error output (edited for readability):

```
TASK [check for python2] ***********************************************
Wednesday 25 July 2018  14:52:47 +0900 (0:00:00.182)       0:00:00.182 *
fatal: [ceph-osd1.vlan221.vtj]: FAILED! => {
"changed": false, "module_stderr": "/bin/sh: 1: /usr/bin/python: not
found\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 127}
...ignoring

TASK [install python2 for debian based systems] ************************
Wednesday 25 July 2018  14:51:00 +0900 (0:00:01.742)       0:00:01.926 *
fatal: [ceph-mon2]: FAILED! => {
"msg": "The conditional check 'systempython2.stat.exists is undefined or
systempython2.stat.exists == false' failed. The error was: error while
evaluating conditional (systempython2.stat.exists is undefined or
systempython2.stat.exists == false): 'dict object' has no attribute 'stat'
\n\n The error appears to have been in
'/Users/arata/git/ceph-ansible/site.yml.sample': line 36, column 7, but
may\n be elsewhere in the file depending on the exact syntax problem.\n\n
The offending line appears to be:\n\n\n
    - name: install python2 for debian based systems\n
      ^ here\n
"}
...ignoring
```

Fixes: #2930
Signed-off-by: Arata Notsu <arata776@gmail.com>
2018-07-25 16:59:37 +00:00
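Based on the error output quoted in 2bbb4acca6, the fixed conditional might look like the sketch below. The `stat` path and the install command are assumptions; the task names and the `systempython2` variable come from the log above.

```yaml
# Sketch only: the path and the install command are assumed.
- name: check for python2
  stat:
    path: /usr/bin/python
  ignore_errors: yes
  register: systempython2

- name: install python2 for debian based systems
  raw: sudo apt-get -y install python-simplejson
  ignore_errors: yes
  # Test `systempython2.stat` itself: when the stat module cannot run
  # (no python on the host), the registered result has no `stat` key.
  when: systempython2.stat is undefined or systempython2.stat.exists == false
```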
Sébastien Han 20c8065e48 ceph-iscsi: rename group iscsi_gws
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes,
using an underscore fixes the issue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Andrew Schoen c40ed1c66b site.yml: combine validate play with fact gathering play
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen a80a109ac9 site.yml: the validation play must use become: true
The ceph-defaults role expects this.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen d83bdce8a9 site.yml: abort playbook when it fails during config validation
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen b2b905f47e site.yml: remove the testing task that fails the playbook run
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Andrew Schoen 4008d700a4 site.yml: move validate task to its own play
This needs to be in its own play with ceph-defaults included
so that I can validate things that might be defaulted in that
role.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-05-18 17:58:24 +02:00
Alfredo Deza 0ace2e9534 site: add validation task
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-05-18 17:58:24 +02:00
Guillaume Abrioux 75733daf23 playbook: improve facts gathering
There is no need to gather facts in an O(N^2) way.
Only one node should gather facts from the other nodes.

Fixes: #2553

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-04 14:28:19 +02:00
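One way to have a single node collect facts for all the others is to combine `run_once` with delegated facts, as in the sketch below. This illustrates the general technique under the assumption that every host is in `groups['all']`; it is not a copy of the playbook change in 75733daf23.

```yaml
# Sketch only: gathers facts for each host once, instead of every
# host gathering facts about every other host.
- hosts: all
  gather_facts: false
  tasks:
    - name: gather facts for every host from a single node
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: true
      with_items: "{{ groups['all'] }}"
      run_once: true
```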
Guillaume Abrioux ac41efd3c2 site: make it more readable
These conditions introduced by d981c6bd2 were insane.
This should be a bit easier to read.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-14 00:37:41 +02:00
Sébastien Han f2e0ceed78 add support for installation checkpoint
This was taken from the openshift ansible repository here:
https://github.com/leseb/openshift-ansible/tree/master/roles/installer_checkpoint

Rationale:

A complete OpenShift cluster installation is comprised of many different
components which can take 30 minutes to several hours to complete. If
the installation should fail, it could be confusing to understand at
which component the failure occurred. Additionally, it may be desired to
re-run only the component which failed instead of starting over from the
beginning. Components which came after the failed component would also
need to be run individually.

Ceph has a similar situation so we can benefit from that
callback_plugin.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-06 15:21:40 +00:00
Sébastien Han ff90661033 site: ability to only generate a ceph.conf on the machines
Now, by running the playbook like this:

ansible-playbook site.yml --tags='ceph_update_config'

you can generate only the ceph configuration file on the nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1543434
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-02-09 14:07:58 +01:00
Markos Chandras 849786967a ceph-common: Add initial support for openSUSE Leap distributions
openSUSE Leap 42.3 provides support for Ceph Luminous in both the
distribution package and the latest available version in the OBS
repository so add these as the only available installation methods for
openSUSE.

Signed-off-by: Markos Chandras <mchandras@suse.de>
2017-11-14 10:51:22 +00:00
Guillaume Abrioux 4596fbaac1 common: make the delegate_facts feature optional
Since we encountered issues with this on Ansible 2.2, this commit provides
the ability to enable or disable it depending on which Ansible version we
are running.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-31 16:44:26 +01:00
Major Hayden c01851325e Remove jinja2 delimiters from `when` keys
This patch changes the `when:` keys so that they have no jinja2
delimiters. This avoids Ansible warnings which could turn into
errors in a future Ansible release.
2017-10-12 11:27:42 -05:00
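A small before/after sketch of the `when:` change in c01851325e, with a made-up variable and task:

```yaml
# Before (Ansible warns about jinja2 delimiters in conditionals):
#   when: "{{ example_var == 'value' }}"

# After: `when` is already evaluated as a jinja2 expression.
- name: example task
  debug:
    msg: "condition met"
  when: example_var == 'value'
```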
Sébastien Han b6b24a5ca9 iscsi: fix wrong group name for iscsi
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 17:25:32 +02:00