Commit Graph

2887 Commits (63f20f59418a668261c91e408ad7ae5ac67a63d1)

Author SHA1 Message Date
Dimitri Savineau d4fd38c967 ceph-nfs: change ganesha CentOS repository
Since nfs-ganesha builds aren't available for CentOS 8 on shaman at the
moment, we can use the alternative repository at [1]

[1] https://download.nfs-ganesha.org/3/LATEST/CentOS

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Guillaume Abrioux 217d95abb2 common: add centos8 support
Ceph octopus only supports CentOS 8.

This commit adds CentOS 8 support:
  - update vagrant image in tox configurations.
  - add CentOS 8 repository for el8 dependencies.
  - CentOS 8 container engine is podman (same as RHEL 8).
  - don't use the epel mirror on sepia because it's epel7 only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Stanley Lam 2ca3364109 ceph-rgw-loadbalancer: Modify keepalived master selection
Currently the keepalived template only works when system hostnames exactly match the Ansible inventory name. If these are different, all generated templates become BACKUP without a MASTER assigned. Using the inventory_hostname in the template file resolves this issue.

Signed-off-by: Stanley Lam stanleylam_604@hotmail.com
2020-01-06 09:25:04 -05:00
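A minimal sketch of that template logic, assuming a hypothetical group variable name (`rgwloadbalancer_group_name`) and a simplified keepalived.conf.j2; the point is only that the comparison uses `inventory_hostname` rather than the system hostname:

```
{# illustrative: elect MASTER from the Ansible inventory name, not ansible_hostname #}
{% if inventory_hostname == groups[rgwloadbalancer_group_name][0] %}
    state MASTER
{% else %}
    state BACKUP
{% endif %}
```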
Dimitri Savineau 2c06678cde ceph-infra: replace hardcoded grafana group name
The grafana-server group name was hardcoded for the grafana/prometheus
firewalld tasks condition.
We should use the associated variable: grafana_server_group_name

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau f4c261ef90 ceph-infra: move dashboard into a dedicated file
Instead of using multiple dashboard_enabled conditions in the
configure_firewall file, we can have the condition once and include the
dedicated tasks list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau 4535985188 ceph-infra: open dashboard port on monitor
When there's no mgr group defined in the ansible inventory, the mgrs
are deployed implicitly on the mon nodes.
If the dashboard is enabled, we need to open the dashboard port on the
node that is running the ceph mgr process (mgr or mon).
The current code only allows opening that port on mgr nodes that are
present explicitly in the inventory, not implicitly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783520

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau 6f0556f015 ceph-defaults: exclude rbd devices from discovery
The RBD devices aren't excluded from the devices list in the LVM auto
discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783908

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 09:03:19 +01:00
Guillaume Abrioux fc02fc98eb defaults: change monitor|radosgw_address default values
To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think setting `0.0.0.0` will make the daemon bind on all
interfaces.

Fixes: #4827

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-12 09:58:33 +01:00
Philip Brown 9021c29b61 Add comment on auto-SSL cert generation
Fixes: #4830

Signed-off-by: Philip Brown <phil@bolthole.com>
2019-12-11 10:57:28 +01:00
Dimitri Savineau 68c6f39349 ceph-facts: set use_new_ceph_iscsi on iscsi nodes
We don't need to set the use_new_ceph_iscsi fact on nodes other than
those present in the iscsigws group.
Also remove the duplicate iscsi_gw_group_name condition already present
on the include_task.
Finally, validate the ansible distribution as the first task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-10 23:57:03 +01:00
Guillaume Abrioux 8d0dc34ebe defaults: fix a typo
s/above/below

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-10 09:32:02 -05:00
Guillaume Abrioux a234338eff defaults: add a comment
This commit isolates and adds an explicit comment about variables not
intended to be modified by the user.

Fixes: #4828

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-09 13:50:43 -05:00
Guillaume Abrioux d245eb7e7d dashboard: run node_export as privileged container
Typical error:

```
type=AVC msg=audit(1575367499.582:3210): avc:  denied  { search } for  pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0
```

node_exporter needs to run as privileged to avoid the AVC denial since
it gathers a lot of information on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-09 09:40:13 -05:00
Dimitri Savineau 1a77dd7e91 ceph-validate: start with ansible version test
It doesn't make sense to start validating the configuration if the
ansible version isn't the correct one.
This commit moves the check_system task to be the first task in the
ceph-validate role.
The ansible version test tasks are moved to the top of this file.
Also move the iscsi kernel tests from the check_system to the check_iscsi
file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-09 09:35:03 +01:00
Dimitri Savineau 12aa8f4025 ceph-facts: move ntp/chrony facts to ceph-infra
The ntp/chrony facts are only used in the ceph-infra role so we don't
really need to set them in the ceph-facts roles.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-05 19:46:59 +01:00
Guillaume Abrioux 0756fa467d defaults: change default value for dashboard_admin_password
A recent change in ceph/ceph prevents the password from containing the
username:

`Error EINVAL: Password cannot contain username.`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-05 13:02:06 -05:00
Dimitri Savineau 014f51c2a4 ceph-defaults: exclude md devices from discovery
The md devices (RAID software) aren't excluded from the devices list in
the auto discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-05 10:14:25 +01:00
Guillaume Abrioux a8d76d72d7 dashboard: use fqdn url for active alert
When using the shortname, the URL for an active alert is built with the
short hostname and fails to connect to the server.

This commit changes the template in order to use the fqdn.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 14:30:32 +01:00
Guillaume Abrioux fe5ffe589e facts: isolate container_binary facts
in order to be able to call container_binary without having to run the
whole ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 13:29:52 +01:00
Guillaume Abrioux d23383a820 purge: remove docker_* task
All containers are removed when systemd stops them.
There is no need to call this module in the purge container playbook.

This commit also removes all docker_image tasks and removes all container
images in the final cleanup play.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 13:29:52 +01:00
Stanley Lam ad7a5dad3f Add option for HAProxy to act as an SSL frontend termination point for loadbalanced RGW instances.
Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com>
2019-12-02 16:54:33 -05:00
Guillaume Abrioux a43a872105 docker2podman: import ceph-handler role
This is needed to avoid the following error:

```
ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-02 09:11:12 -05:00
Dimitri Savineau 5bd1cf40eb ceph-osd: wait for all osds once
cf8c6a3 moves the 'wait for all osds' task from openstack_config to the
main tasks list.
But the openstack_config code was executed only on the last OSD node.
We don't need to do this check on all OSD nodes, so we set run_once to
true on that task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-27 13:05:42 -05:00
Guillaume Abrioux 23b1f43897 facts: avoid duplicated element in devices list
When using `osd_auto_discovery`, `devices` is built multiple times due
to multiple runs of the `ceph-facts` role. It ends up with duplicate
instances of the same device in the list.

Using `unique` filter when building the list fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-27 16:35:41 +01:00
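A minimal sketch of the fix (the task name is illustrative): pass the rebuilt list through Ansible's `unique` filter so repeated `ceph-facts` runs cannot grow it.

```
# illustrative: de-duplicate the devices list rebuilt on every ceph-facts run
- name: ensure devices list has no duplicated entries
  set_fact:
    devices: "{{ devices | unique }}"
```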
Guillaume Abrioux cc0c1ce301 dashboard: only print dashboard url of the grafana-server node
This commit makes the ceph-dashboard role only print the ceph-dashboard
URL of the nodes present in the grafana-server group.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-27 10:28:23 -05:00
Guillaume Abrioux f19a2aef1a Revert "tox-podman: use centos 8 vagrant image"
This reverts commit 19e9a06ab1.
2019-11-27 16:19:58 +01:00
Dimitri Savineau cf8c6a3849 ceph-osd: wait for all osd before crush rules
When creating crush rules with the device class parameter we need to be sure
that all OSDs are up and running because the device class list is
populated with this information.
This is now enabled for all scenarios, not only openstack_config.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-27 07:43:07 +01:00
Dimitri Savineau 55adc10be3 ceph-grafana: remove ipv6 brakets on wait_for
The wait_for ansible module doesn't support brackets around an IPv6
address, so we need to remove them.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-26 10:08:17 +01:00
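A hedged sketch of the shape of that change (the regex and task name are illustrative; `grafana_server_addr` and `grafana_port` are the role variables): strip the brackets before handing the address to `wait_for`.

```
# illustrative: wait_for expects a bare IPv6 address, without surrounding brackets
- name: wait for grafana to start
  wait_for:
    host: "{{ grafana_server_addr | regex_replace('[][]', '') }}"
    port: "{{ grafana_port }}"
    timeout: 300
```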
Guillaume Abrioux 33bfb10af9 nfs: remove legacy file
this file is provided by the packaging (nfs-ganesha) so there's no need
to maintain it in ceph-ansible

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-22 05:11:41 +01:00
Guillaume Abrioux d06158e9d9 nfs: do not run privileged nfs container
At the moment, we bind mount the dbus socket from the host, which requires
running the container with --privileged.
Since we now run a dedicated dbus daemon inside the same container, we
can stop running privileged nfs-ganesha containers.

Related ceph-container PR : ceph/ceph-container#1517

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1725254

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-22 05:11:41 +01:00
Dimitri Savineau 19e9a06ab1 tox-podman: use centos 8 vagrant image
Switch the podman scenario from atomic centos 7 to centos 8 (not atomic)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-20 10:34:34 +01:00
VasishtaShastry 72c43cc5d9 Fixes failure of cephfs configuration using --limit
Configuration of cephfs with an existing cluster using --limit used to fail
at different tasks while running with site-docker.yml.
This commit addresses both of those tasks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
2019-11-18 16:44:47 +01:00
Dimitri Savineau ef2cb99f73 ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:25:46 +01:00
Dimitri Savineau ed36a11eab move crush rule creation from mon to osd role
If we want to create crush rules with the create-replicated sub command
and device class then we need to have the OSD created before the crush
rules otherwise the device classes won't exist.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:25:46 +01:00
Dimitri Savineau 3e29b8d5ff ceph-defaults: pin prometheus container tags
In addition to the grafana container tag change, we need to do the same
for the prometheus container stack based on the release present in the
OSE 4.1 container image.

$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:11:14 +01:00
VasishtaShastry 9a1f1626c3 Evades validation of ceph_repository_type in containerized scenario
This will prevent site-docker.yml from failing with the configurations
described in the documentation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-14 15:53:22 +01:00
Dimitri Savineau 4a065cebd7 ceph-validate: add rbdmirror validation
When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 08:57:43 -05:00
Dimitri Savineau 60cbfdc2a6 ceph-handler: Use /proc/net/unix for rgw socket
If for some reason, there's an old rgw socket file present in the
/var/run/ceph/ directory then the test command could fail with

test: xxxxxxxxx.asok: binary operator expected

$ ls -hl /var/run/ceph/
total 0
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok

We can check the radosgw socket in /proc/net/unix to avoid using a wildcard
in the socket name.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 14:41:11 +01:00
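A hedged sketch of that check (the grep pattern and task name are illustrative): a stale .asok file on disk does not appear in /proc/net/unix, so only a live socket makes the lookup succeed.

```
# illustrative: look the radosgw admin socket up in /proc/net/unix instead of
# globbing /var/run/ceph/*.asok
- name: check for a running rgw admin socket
  command: grep -q "client.rgw.{{ ansible_hostname }}" /proc/net/unix
  register: rgw_socket
  changed_when: false
  failed_when: false
```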
Dimitri Savineau ece46d33be ceph-osd: fix fs.aio-max-nr sysctl condition
[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
The string output "(osd_objectstore == 'bluestore')" is always truthy
because the item.enable condition only checks for a non-empty string, so
the sysctl value was applied for both the filestore and bluestore backends.

[2] added the bool filter to the condition, but the filter always returns
false on a string, so the sysctl value wasn't applied at all.

This commit fixes the enable key value by evaluating the expression
instead of using the literal string.

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 13:51:48 +01:00
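A minimal sketch of the difference (illustrative task, not the exact ceph-osd code): let Ansible evaluate the expression so the condition is a real boolean instead of the literal string "(osd_objectstore == 'bluestore')".

```
# illustrative: the expression is evaluated, so the tuning only applies to bluestore
- name: apply fs.aio-max-nr for bluestore only
  sysctl:
    name: fs.aio-max-nr
    value: "1048576"
  when: osd_objectstore == 'bluestore'
```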
Dimitri Savineau 2037fb87b6 ceph-defaults: pin grafana container tag to 5.2.4
The latest grafana container tag is using the grafana 6.x release, which
could cause issues with the ceph dashboard integration.
Considering that the grafana container in RHCS 3 is based on 5.x, we
should use the same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 18:44:51 -04:00
Dimitri Savineau 9a996aef7f ceph-osd: Remove ulimit nofile on container start
Even if this improves ceph-disk/ceph-volume performance, it also
impacts the ceph-osd process.
The ceph-osd process shouldn't use the 1024:4096 value for the max open
files.
Remove the ulimit option from the container engine and do this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 10:42:09 -04:00
fmount 41b8c17356 Set grafana-server user and password in ceph-dashboard role
This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails,
showing an authorization error ("HTTPError: 401 Client Error:
Unauthorized").

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
2019-10-31 10:29:57 -04:00
Mihai Plasoianu d3f67d63ae ceph-mon: use --admin-daemon to set default crush rule
Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de>
2019-10-29 20:59:32 -04:00
Radu Toader f2573c9e6b nfs: support specific keys for rgw nfs user
This brings the possibility to modify the rgw nfs user to use specific
keys when those are defined.

Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-10-29 14:59:26 -04:00
Dimitri Savineau 15f7c7195a ceph-nfs: add nfs-ganesha-rados-grace explicitly
Since nfs-ganesha V3.0-rc4 and [1] we need to explicitly install the
nfs-ganesha-rados-grace package.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/0fea990

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 16:27:36 -04:00
Dimitri Savineau b33c476f16 defaults: add user/pass auth registry variables
Add ceph_docker_registry_username and ceph_docker_registry_password
variables in ceph-defaults role so they will be present in the group_vars
samples but commented.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-24 15:11:45 -04:00
Guillaume Abrioux 3d28773da5 mon: call mon_status from asok
since c09b82a80a392ccd0da7677c7b424ce5cd3fa5d6 in ceph/ceph we must call
mon_status from asok instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-24 10:19:16 -04:00
Dimitri Savineau d050391cbb dashboard: add ceph iscsi management
When deploying with ceph-iscsi nodes and dashboard enabled, we need to
add the ceph iscsi gateway endpoints to the dashboard configuration and
add the mgr ip address in the trusted list in the iscsi gateway
configuration file.

Closes: #4638
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173

https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau f2cb937193 ceph-iscsi: add ceph-iscsi stable repositories
This commit adds support for the ceph-iscsi stable repository when
using ceph_repository community instead of always using the devel
repositories.
We're still using the devel repositories for rtslib and tcmu-runner in
both cases (dev and community).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau fd8d47da98 Revert "iscsigw: install python-requests"
We don't need this since [1]. Also this was only working for python2 and
not supporting python3.

[1] https://github.com/ceph/ceph-iscsi/commit/00f198a

This reverts commit 167737dd3d.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau 9ad000618f container/dashboard: run the registry auth task
When deploying with packages then the ceph-container-common role isn't
executed so the registry authentication task is ignored.

Closes: #4636

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:23:32 +02:00
Guillaume Abrioux da4215e9c0 validate: fix credentials validation
This task is failing when `ceph_docker_registry_auth` is enabled and
`ceph_docker_registry_username` is undefined with an ansible error
instead of the expected message.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-21 13:26:55 -04:00
Dimitri Savineau 3969470fca travis: fail on ansible-lint errors
If ansible-lint reports an error then it's skipped. We should fail in
this case.

This patch also fixes the pipefail lint in the rbd mirror role

[306] Shells that use pipes should set the pipefail option

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-21 13:26:02 -04:00
Guillaume Abrioux 4e9504c939 common: do not override ceph_release when using custom repo
Otherwise it fails like following:

```
TASK [ceph-mds : allow multimds] **************************************************************************************************************************************************
Monday 22 July 2019  16:37:38 +0800 (0:00:03.269)       0:13:25.651 ***********
fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n  ^ here\n"}
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-17 22:58:16 +02:00
Mike Christie ba141298d7 iscsi-gw: Fix rtslib installation
When using python3 the name of the rtslib rpm is python3-rtslib. The
packages that use rtslib already have code that detects the python
version and distro deps, so drop it from the ceph iscsi gw task list and
let the ceph-iscsi rpm dependency handle it.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-10-16 12:59:31 -04:00
Guillaume Abrioux 71cebf80a6 update: follow new recommendation to upgrade mds cluster
Refact the mds cluster upgrade code in order to follow the documented
recommendation.
See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-16 11:23:12 -04:00
Guillaume Abrioux b63bd13073 nfs: remove unnecessary set_fact in main.yml
this task is a leftover and no longer needed.
It even causes a bug when collocating nfs with mon.

Closes: #4609

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-16 11:23:02 -04:00
Dimitri Savineau 0b1e9c0737 rbd-mirror: fail if the peer is not added
Due to the 'failed_when: false' statement present in the peer task, the
playbook continues to run even if the peer task fails (e.g. an incorrect
remote peer format):

"stderr": "rbd: invalid spec 'admin@cluster1'"

This patch adds a task to list the peers present and adds the peer only if
it's not already added. With this we don't need the failed_when statement
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-16 16:27:46 +02:00
Dimitri Savineau bc701860d5 ceph-iscsi: notify rbd target services
When the iscsi gateway or the ceph configuration file change then we
need to notify the rbd target api/gw services to be restarted.
This patch also merges the rbd-target-api and rbd-target-gw handlers
into a single file using listen.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-16 16:25:40 +02:00
Guillaume Abrioux cb80231725 mgr: do not copy all keyrings on all mgr
There is no need to loop over all mgr nodes to set this fact; it even
breaks deployments because it tries to copy all mgr keyrings on all
mgrs.

Closes: #4602

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-15 15:06:46 -04:00
Dimitri Savineau 0f978d969b Remove validate action and notario dependency
The current ceph-validate role is using both validate action and fail
module tasks to validate the ceph configuration.
The validate action is based on the notario python library. When one of
the notario validation fails then a python stack trace is reported to the
ansible task. This output isn't understandable by users.

This patch removes the validate action and the notario dependency. The
validation is now done with only the fail ansible module.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-15 11:34:49 +02:00
Dimitri Savineau f7fd0b6d4f lint: fix error [303,602,701,702]
[303] mktemp used in place of tempfile module
 [602] Don't compare to empty string
 [701] No 'galaxy_info' found
 [702] Use 'galaxy_tags' rather than 'categories'

This patch also changes the ansible log_path value via the
ANSIBLE_LOG_PATH environment variable in the travis configuration to
avoid warnings.

[WARNING]: log file at /home/travis/ansible/ansible.log is not writeable
and we cannot create it, aborting

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-15 10:07:52 +02:00
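For lint rule [303], a hedged example of the replacement: the `tempfile` module instead of a raw `mktemp` call in a shell task.

```
# illustrative: tempfile replaces "shell: mktemp -d" and exposes the path as tmpdir.path
- name: create a temporary directory
  tempfile:
    state: directory
  register: tmpdir
```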
Dimitri Savineau fe9c5b8c68 ceph-handler: group listen topics and condition
We are using multiple listen topics with the handlers. That means that
we are notifying 4 tasks for each handler.
Instead we can group them behind a single include_tasks per daemon type,
gated by the group condition.

Before:

NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0

After:

NOTIFIED HANDLER ceph-handler : mons handler for mon0
NOTIFIED HANDLER ceph-handler : osds handler for mon0
NOTIFIED HANDLER ceph-handler : mdss handler for mon0
NOTIFIED HANDLER ceph-handler : rgws handler for mon0
NOTIFIED HANDLER ceph-handler : mgrs handler for mon0
NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-11 15:43:58 -04:00
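A hedged sketch of the grouped handler shape (the file name and listen topic are illustrative): one handler per daemon type that includes the detailed restart tasks and reacts to a single listen topic.

```
# illustrative: a single notified entry point per daemon type
- name: mons handler
  include_tasks: handler_mons.yml
  when: mon_group_name in group_names
  listen: "restart ceph mons"
```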
Guillaume Abrioux 161170524d mgr: improve mgr keyring creation
Delegating on remote node isn't necessary here since we are already
iterating over the right nodes.

Closes: #4518

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-11 09:40:07 -04:00
Guillaume Abrioux 273413186a common: do not reset `container_exec_cmd`
This commit removes some legacy tasks.

These tasks aren't needed, they cause the playbook to fail when
collocating daemons.

Closes: #4553

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-10 14:38:30 -04:00
Guillaume Abrioux 80e2d00b16 validate: prevent from installing OSD on same disk as the OS
This commit adds a validation task to prevent from installing an OSD on
the same disk as the OS.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-10 08:33:20 -04:00
Dimitri Savineau 3f6ff240b7 dashboard: update layouts before the restart
If the mgr dashboard doesn't restart fast enough, the inject
dashboard task will fail with an HTTP 400 error.

Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/dashboard/module.py", line 450, in handle_command
    push_local_dashboards()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 132, in push_local_dashboards
    retry()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 89, in call
    result = self.func(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 127, in push
    grafana.push_dashboard(body)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 54, in push_dashboard
    response.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
HTTPError: 400 Client Error: Bad Request

Instead we can trigger this task before the module restart.

Closes: #4565

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-09 09:10:27 +02:00
Guillaume Abrioux 6c6a512a72 nfs: stop nfs server service in all context
This commit moves this task in order to stop the nfs server service
regardless the deployment type desired (containerized or non
containerized).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-07 10:24:33 -04:00
Guillaume Abrioux 47034effe0 nfs: stop nfs server service
The syntax here wasn't working, this refact fixes this task.
Also, remove the `ignore_errors: true` which was hiding the failure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-07 10:24:33 -04:00
Guillaume Abrioux fa9b42e98e switch_to_containers: do not re-set `ceph_uid`
This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and
removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook
to avoid duplicated code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-07 14:15:56 +02:00
Dimitri Savineau b9e93ad7a6 ceph-dashboard: remove rgw api host,port,scheme
We don't need dedicated variables for the RGW integration into the
Ceph Dashboard that have to be filled manually.
Instead we can use the current values from the RGW nodes by using the
IP and port from the first RGW instance of the first RGW node via the
radosgw_address and radosgw_frontend_port variables.
We don't need to specify all RGW nodes; this will be done automatically
with one node.
The RGW api scheme uses the radosgw_frontend_ssl_certificate variable
to determine whether the value is http or https. This variable is also
reused as a condition for the ssl verify task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-07 11:22:44 +02:00
Dimitri Savineau 249764047b ceph-dashboard: Improve https configuration
This patch moves the https dashboard configuration into a dedicated
block to avoid the multiple occurence of the dashboard_protocol
condition.
It also fixes the dashboard certificate and key variables handling in
the condition introduced by ab54fe2. Those variables aren't boolean but
strings so we can test them via the length filter.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-07 09:08:16 +02:00
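A hedged example of the resulting condition (the destination path is illustrative): since `dashboard_crt` and `dashboard_key` are strings, test them with the `length` filter rather than `| bool`.

```
# illustrative: a non-empty string means the user supplied their own certificate
- name: copy the dashboard TLS certificate
  copy:
    src: "{{ dashboard_crt }}"
    dest: /etc/ceph/ceph-dashboard.crt
  when: dashboard_crt | length > 0
```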
Guillaume Abrioux ccc11cfc93 handler: followup on #4519
This commit adds some missing `| bool` filters.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-04 10:49:15 -04:00
Dimitri Savineau dd526cfe4e ceph-dashboard: add cluster parameter to ceph cmd
The ceph dashboard tasks didn't use the cluster option if the cluster
name isn't the default value.

Closes: #4529

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-04 16:10:22 +02:00
Guillaume Abrioux 411bd07d54 handlers: refact osd handler
This commit merges the two restart tasks into a single one, this way
it's one task less to notify.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-04 09:42:20 -04:00
Dimitri Savineau 0346871fb5 ceph-handler: don't restart all OSDs with limit
When using the ansible --limit option on one or a few OSD nodes, if the
handler is triggered then we restart the OSD service on all OSD
nodes instead of only the hosts covered by the limit value.
Even if the play is limited by the --limit value, we are using all OSD
nodes from the OSD group.

  with_items: '{{ groups[osd_group_name] }}'

Instead we should iterate only on the nodes present in both OSD group and
limit list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-03 14:52:27 -04:00
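A hedged sketch of the iteration change (the restart script path is illustrative): intersect the OSD group with `ansible_play_batch` so only hosts covered by --limit are restarted.

```
# illustrative: honor --limit by restricting the loop to hosts in the current play
- name: restart ceph osds daemon(s)
  command: /usr/bin/env bash /tmp/restart_osd_daemon.sh
  with_items: "{{ groups[osd_group_name] | intersect(ansible_play_batch) }}"
  delegate_to: "{{ item }}"
```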
Dimitri Savineau 780cf36a59 ceph-facts: fix _radosgw_address with block
e695efc introduced a regression in the _radosgw_address fact when using
the radosgw_address_block variable.
There's no item there because we don't use the items lookup. This is
only used for _monitor_address with monitor_address_block.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-03 16:15:45 +02:00
Guillaume Abrioux 9bad239d77 common: improve keyrings generation
There is no need to fetch the different keyrings n times (once per node).
Adding a `run_once: true` here avoids running a ceph command too many
times, which could impact large cluster deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-02 13:09:50 +02:00
Dimitri Savineau ec3b687dc4 ceph-facts: use --admin-daemon to get fsid
During the rolling_update scenario, the fsid value is retrieved from the
current ceph cluster configuration via the ceph daemon config command.
This command first tries to resolve the admin socket path via the
ceph-conf command.
Unfortunately this command won't work if you have a duplicate key in the
ceph configuration even if it only produces a warning. As a result the
task will fail.

Can't get admin socket path: unable to get conf option admin_socket for
mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined

Instead of using ceph daemon we can use the --admin-daemon option
because we already know the admin socket path based on the
ceph cluster and mon hostname values.

Closes: #4492

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-02 10:07:13 +02:00
Guillaume Abrioux 272d16e101 validate: fix gpt header check
Check for gpt header when osd scenario is lvm or lvm batch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 11:46:17 -04:00
Guillaume Abrioux ed8616aa66 rbdmirror: rename a file
rename this file to be more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux e08194dd67 rgw: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux c69816c6b7 rbdmirror: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux 4636f3f7e2 iscsigw: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux bd64167469 container: isolate systemd tasks
This commit isolates the systemd unit file generation for containers into
separate yml files in order to be able to import each corresponding role
without playing all tasks.
This is needed so we can run ceph-ansible to render systemd unit files
so they call podman instead of docker.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Dimitri Savineau 20b1a464ec ceph-facts: update external grafana fact filter
e695efc hasn't been updated with the changes introduced in 9bb11c7 so
the ips_in_ranges filter isn't used for an external grafana instance.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-01 10:47:14 +02:00
Guillaume Abrioux e4444d29e0 Revert "ceph-common: install only necesarry ceph-* packages on debian"
This reverts commit 58b27ef0b3.
This is breaking debian based OS deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-28 08:04:12 +02:00
Boris Ranto b96c6da832 ceph-defaults: Change the default prometheus port
The old default prometheus port 9090 clashes with cockpit in rhel 8. The
9090 port is reserved for web service administration of machines. We
should change the default to something that does not clash with other
ports used in rhel 8, at least by default. The port 9092 seems like a
good choice in my testing.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-09-28 04:40:42 +02:00
Johannes Kastl 5cf22e9b31 move python-xml to raw_install_python.yml
The package python-xml is needed for ansible's zypper module to interact with
the zypper package management tool.

roles/ceph-defaults/defaults/main.yml:
Remove python-xml from variable suse_package_dependencies to only
install python-xml on SUSE/openSUSE if python is not found.
raw_install_python.yml already contains all the logic needed to check
if there is a valid python installation, so this is better suited there.

openSUSE Leap 15.x / SLES 15.x no longer have /usr/bin/python,
only /usr/bin/python3, which already contains the xml module, so
nothing needs to be installed in that case.

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-09-27 14:19:32 +02:00
Harald Jensås e695efcaf7 Replace ipaddr() with ips_in_ranges()
This change implements a filter_plugin that is used in the
ceph-facts and ceph-validate roles and the infrastructure-playbooks.
The new filter plugin will return a list of all IP addresses
that reside in any one of the given IP ranges. The new filter
replaces the use of the ipaddr filter.

ceph.conf already supports a comma separated list of CIDRs
for the public_network and cluster_network options.

Changes: [1] and [2] introduced a regression in ceph-ansible
where public_network can no longer be a comma separated list
of cidrs.

With this change a comma separated list of subnet CIDRs can
also be used for monitor_address_block and radosgw_address_block.

[1] commit: d67230b2a2
[2] commit: 20e4852888

Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030
Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283

Closes: #4333
Please backport to stable-4.0

Signed-off-by: Harald Jensås <hjensas@redhat.com>
2019-09-27 10:11:53 +02:00
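A hedged usage sketch of the new filter (the fact name and source list are illustrative): unlike the previous `ipaddr` usage, `ips_in_ranges` takes a list of ranges, so a comma separated network definition can be split and passed directly.

```
# illustrative: keep only the host addresses that fall inside any configured CIDR
- name: set _monitor_address from the public network(s)
  set_fact:
    _monitor_address: "{{ ansible_all_ipv4_addresses | ips_in_ranges(public_network.split(',')) | first }}"
```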
Dimitri Savineau 74ab59c4f3 ceph-dashboard: Add prometheus api host
The set-prometheus-api-host ceph dashboard subcommand was missing in
ceph-dashboard role. Only grafana and alermanager were present.
This commit also remove the trailing slash at the end of the host/url
values.

Closes: #4453

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-27 09:16:12 +02:00
Anthony Rusdi 58b27ef0b3 ceph-common: install only necesarry ceph-* packages on debian
Currently, the ceph package is only a meta-package that does not contain
actual software but simply depends on other packages. It has been like
this for a few releases, since debian stretch (official), ubuntu bionic
(official), the ubuntu uca repository and upstream debian-jewel.
As we only support nautilus and higher releases for the master branch,
I propose to drop the ceph package and use ceph-base instead for repository
models other than rhcs, so the debian ceph install will be more minimal.

Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
2019-09-27 01:11:22 +02:00
Dimitri Savineau ca77d7bd31 ceph-nfs: Allow to configure SecType value
Depending on the infrastructure (with or without kerberos auth), the
SecType value could be different.
Currently this value is hardcoded in the NFS Ganesha template. Instead
we can use a variable.
The default value is still the same to avoid breaking backward
compatibility.

Closes: #4459

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-27 00:33:18 +02:00
liuxu 195f70897c dashboard: add grafana dashboard support on Debian based OS
download grafana dashboard files from github when running on Debian based OS

Signed-off-by: liuxu <liuxu623@gmail.com>
2019-09-26 18:49:56 +02:00
fmount 9bb11c7b2a Inject ceph grafana dashboard layouts
This change just adds the task to inject from the
ceph dashboard mgr module the required layouts
to show all the cluster metrics on the grafana
instance.
Since we're now able to push grafana layouts through
the ceph mgr module command, the dashboards configuration
template is no longer needed on containerized environments.
This commit also fixes the Vagrantfile static IP assignment
in the grafana section because it generates an issue (it's
the same as the mgr instance's).
Finally, considering some deployments that use an external
grafana server instance, we reworked the 'grafana_server_addr'
assignment to address these requirements.

Signed-off-by: fmount <fpantano@redhat.com>
2019-09-26 11:12:20 -04:00
Guillaume Abrioux 167737dd3d iscsigw: install python-requests
Typical error at rbd-target-api startup:

```
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: Traceback (most recent call last):
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/bin/rbd-target-api", line 39, in <module>
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: from gwcli.utils import (APIRequest, valid_gateway, valid_client,
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/lib/python2.7/site-packages/gwcli/utils.py", line 1, in <module>
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: import requests
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: ImportError: No module named requests
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 5bb6a4da42 tests: set copy_admin_key at group_vars level
setting it at the extra vars level prevents setting it per node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux ab370b6ad8 global: remove fetch_directory dependency
This commit drops the fetch_directory dependency.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 09e04a9197 osd: add wal_devices option support to ceph_volume module
This commit adds the `wal_devices` option support to the
ceph_volume module.
passing a devices list in `bluestore_wal_devices` will make ceph-volume
create 1 vg using these devices to create block.wal partitions.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 70f1b37097 osd: update doc text in defaults/main.yml
This commit removes ceph-disk references.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 7b836eaa47 osd: add block_db_devices option support to ceph_volume module
This commit adds the `block_db_devices` option support to the
ceph_volume module.
passing a devices list in `dedicated_devices` will make ceph-volume
create 1 vg using these devices to create block.db partitions for data
devices.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
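A hedged group_vars sketch combining this commit and the wal_devices one above (device paths are hypothetical): `dedicated_devices` maps to block.db partitions and `bluestore_wal_devices` to block.wal partitions.

```
# illustrative group_vars/osds.yml snippet
osd_objectstore: bluestore
devices:
  - /dev/sdb
  - /dev/sdc
dedicated_devices:
  - /dev/sdd
bluestore_wal_devices:
  - /dev/nvme0n1
```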
Guillaume Abrioux 2b97ac921b validate: check ceph_docker_registry_* length
This commit adds a condition to check whether these variables are empty.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-18 16:03:18 +02:00
Dimitri Savineau 9f4a99fb24 container: Allow to use registry authentication
The registry.redhat.io registry requires authentication, so before pulling
the RHCS 4 container images from the registry we need to do the login
step.
This is done via the new ceph_docker_registry_auth variable. The
default value is false but true for RHCS setup.
When set to true, you need to provide the username and password
for the registry via the associated variables.
This patch also updates the ceph_docker_registry value for RHCS setup.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1748911

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-18 16:03:18 +02:00
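A hedged sketch of the login step (the variable names follow the commit message; the task shape is illustrative):

```
# illustrative: authenticate before pulling images from a protected registry
- name: login to the container registry
  command: >
    {{ container_binary }} login
    -u {{ ceph_docker_registry_username }}
    -p {{ ceph_docker_registry_password }}
    {{ ceph_docker_registry }}
  when: ceph_docker_registry_auth | bool
  changed_when: false
  no_log: true
```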
Dimitri Savineau 5b1c15653f ceph-handler: Fix osd restart condition
In containerized deployments, the restart OSD handler couldn't be
triggered in most ansible executions.
This is due to the usage of run_once + a condition on the inventory
hostname and the last filter.
The run_once is triggered first so ansible will pick a node in the
osd group to execute the restart task. But if this node isn't the
last one in the osd group then the task is ignored. The task is more
likely to be ignored than executed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-10 15:56:53 -04:00
Dimitri Savineau 1f505628dd rbd-mirror: Allow to copy the admin keyring
The ceph-rbd-mirror role allows copying the admin keyring via the
copy_admin_key variable, but there's actually no task in that role
doing the job.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-10 15:44:04 -04:00
Dimitri Savineau a3d36df025 rbd-mirror: Use the rbd mirror client keyring
The admin keyring isn't present by default on the rbd mirror nodes so
the rbd commands related to the mirroring configuration will fail.
Instead we can use the rbd mirror client keyring.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-10 15:44:04 -04:00
Giulio Fidente d2a2bd7c42 Look for additional names when checking ceph-nfs container status
Ganesha cannot be operated active/active, in those deployments
where it is managed by pacemaker the container name can be
different than the default.

This change uses "ceph_nfs_service_suffix" where previously
missing to ensure tasks will work with customized names.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
2019-09-09 15:27:37 -04:00
Harald Jensås d94229204d Support comma-delimited subnets in firewall
ceph.conf supports a comma separated list of
subnet CIDRs for the public_network and the
cluster network. ceph-ansible should support
setting up the firewall for this configuration.

Closes: #4425
Related: #4333
https://docs.ceph.com/docs/nautilus/rados/configuration/network-config-ref/#network-config-settings

Signed-off-by: Harald Jensås <hjensas@redhat.com>
2019-09-09 15:20:58 -04:00
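A hedged sketch of the firewalld loop (zone and service names are illustrative): iterate over every CIDR in a comma separated public_network.

```
# illustrative: one firewalld source entry per subnet in the comma separated list
- name: open monitor ports for every public network subnet
  firewalld:
    service: ceph-mon
    zone: public
    source: "{{ item }}"
    permanent: true
    immediate: true
    state: enabled
  with_items: "{{ public_network.split(',') }}"
```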
Dimitri Savineau 7e5e21741e rbd-mirror: configure pool and peer
The rbd mirror configuration was only available for non containerized
deployments and was also incomplete.
We now enable the mirroring on the pool and add the remote peer in both
scenarios.

The default mirroring mode is set to 'pool' but can be configured via
the ceph_rbd_mirror_mode variable.

This commit also fixes an issue on the rbd mirror command if the ceph
cluster name isn't using the default value (ceph) due to a missing
--cluster parameter to the command.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-06 11:00:55 -04:00
fmount 81eb091533 Fix discovered_interpreter_python variable
This change fixes the discovered_interpreter_python variable
name that was "discovered_python_interpreter" and caused a
failure in OSP deployments.

Signed-off-by: fmount <fpantano@redhat.com>
2019-09-04 09:55:30 -04:00
Dimitri Savineau 42082c0a27 lint: fix error [201,206]
[201] Trailing whitespace
 [206] Variables should have spaces before and after: {{ var_name }}

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-29 14:28:35 -04:00
Dimitri Savineau 65089a7fc3 ceph-common: remove ceph_stable repo on dev
When upgrading from stable to devel release with redhat community
packages, the rpm packages are not updated due to priority introduced
via a7b1e35 (starting nautilus).
We need to remove the ceph stable repositories when configuring the
dev repositories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-29 14:05:13 -04:00
Dimitri Savineau 5e5d5c2d87 Add octopus release
Add the 15th ceph release: octopus.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-29 14:05:13 -04:00
fmount 8a666bfd15 Add http_addr option to grafana config
We have no reason to make the grafana container
listen on *:<port>, so this change adds the
http_addr option to the grafana config file
and adds the related option to the wait_for
tasks.
Since grafana_server_addr should exist, we
shouldn't rely on the _current_monitor_addr
default in the prometheus/grafana templates.
This change also removes this default value,
which is not necessary anymore.

Signed-off-by: fmount <fpantano@redhat.com>
2019-08-29 13:00:22 -04:00
Anthony Rusdi 4c592066b7 ceph_custom_repo: define apt and rpm key for custom repo
This commit also removes the notify on the newly added debian repo,
forces update_cache to yes and defines sample ceph_custom_key vars.

Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
2019-08-29 10:25:10 -04:00
Johannes Kastl 0cedc4d303 openSUSE OBS repo using ceph_stable_release
Instead of hardcoding `luminous`, use the `ceph_stable_release` variable
to point to the correct repository.

This is now uncommented in roles/ceph-defaults/defaults/main.yml to be
available, as it is only used if ceph_repository is set to 'obs'.

group_vars/*.sample files have been regenerated using the
./generate_group_vars_sample.sh script.

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-29 10:23:56 -04:00
Johannes Kastl 4711a7d626 fix openSUSE OBS repo creation
roles/ceph-common/tasks/installs/suse_obs_repository.yml:
ansible's zypper_repository module does not know a parameter 'uri', this is
called 'repo' instead

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-29 10:23:07 -04:00
Nick Erdmann 7953ee1b81 ceph-infra: open ceph iscsi/prometheus port
Signed-off-by: Nick Erdmann <n@nirf.de>
2019-08-28 16:09:55 -04:00
Johannes Kastl bd507fa147 set discovered_python_interpreter if ansible_python_interpreter is defined
If the user has set the `ansible_python_interpreter`, ansible will not try to
discover python, so `discovered_python_interpreter` will not be set.

Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter`
if `ansible_python_interpreter` is defined

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-27 20:54:59 +02:00
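A minimal sketch of that fallback:

```
# illustrative: when the interpreter is pinned, discovery is skipped, so reuse it
- name: set_fact discovered_interpreter_python
  set_fact:
    discovered_interpreter_python: "{{ ansible_python_interpreter }}"
  when: ansible_python_interpreter is defined
```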
Dimitri Savineau 2b0616ecca ceph-mon: Bind mount the ca-trust directory
On containerized deployments, the mon container sometimes needs to
access the radosgw endpoint (via the radosgw-admin command). When
using TLS on the radosgw with self-signed certificates, we need
access to the CA certificate from the mon container.
The CA certificate needs to be added on the host and then the directory
will be bind mounted into the container.

Resolves: #4358

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-27 20:53:45 +02:00
Dimitri Savineau 49aa05b96c ceph-client: Use profile rbd in keyring caps
Like the OpenStack keyrings, we can use the profile rbd for the clients
keyring (both mon and osd).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-27 20:52:23 +02:00
Dimitri Savineau 717af83475 Revert "osd: add 'osd blacklist' cap for osp keyrings"
This reverts commit 2d955757ee.

The "osd blacklist" isn't an osd caps but should be used with mon caps.
Also the correct caps for this is: 'allow command "osd blacklist"'.
The current change is breaking the openstack and clients keyrings.
By using the profile rbd (which is already used) we already rely on the
ability to blacklist dead client.

Resolves: #4385

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-27 20:52:23 +02:00
Guillaume Abrioux 5986b26a01 global: add newline at end of file
This commit re-adds a newline at the end of files when it's missing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-23 15:56:47 +02:00
Artur Fijalkowski 011270ca69 global: make directories mode parameterizable
This commit makes it possible to parametrize the ceph directories modes.
So it changes the hardcoded mode for ceph related directories from 0755 to
a value customizable with the `ceph_directories_mode` variable.

Closes: #2920

Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-23 09:38:17 +02:00
guihecheng a0590cae9d rgw/multisite: assign 'rgw_zone' to the exact section in ceph.conf
since the following commit:
  commit 1ac94c048f
  rgw: add support for multiple rgw instances on a single host

we have multi-instance rgw support on a single host and
the config section name of the rgw changed from
[client.rgw.$(hostname)] -> [client.rgw.$(hostname).rgwX]
where X is the sequence number: 0,1,2,...
So we should assign the 'rgw_zone' item to the exact rgw instance
config section in ceph.conf.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
2019-08-23 08:14:10 +02:00
Guillaume Abrioux 327d564106 lint: fix error [301], add `changed_when: false` when needed
This commit fixes the error [301]:

`[301] Commands should not change things if nothing needs doing`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-23 00:23:47 +02:00
Guillaume Abrioux 102edaeb61 lint: fix error [306], add pipefail on shell command using pipe
This commit fixes the error [306]:

`[306] Shells that use pipes should set the pipefail option`

using `/bin/bash` as executable because Debian/Ubuntu systems use `dash`
by default which doesn't have the `-o pipefail`. (See:
https://github.com/ansible/ansible-lint/issues/497#issue-424623501)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-23 00:23:47 +02:00
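A hedged example of the resulting pattern (the command itself is illustrative): force bash as the executable so `set -o pipefail` is available even where /bin/sh is dash.

```
# illustrative: without pipefail, the grep exit code would hide a docker failure
- name: count running ceph-mon containers
  shell: |
    set -o pipefail
    docker ps | grep -c ceph-mon
  args:
    executable: /bin/bash
  register: mon_containers
  changed_when: false
  failed_when: false
```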
Johannes Kastl efd38ecc88 ceph-validate: Refactor check for installation check on SUSE/openSUSE
Move the validation from roles/ceph-common/tasks/installs/install_on_suse.yml
to roles/ceph-validate/ and fix the syntax.

There are two valid combinations of `ceph_origin` and `ceph_repository` on
SUSE/openSUSE:
- ceph_origin == 'distro'
- ceph_origin == 'repository' and ceph_repository == 'obs'

The current when condition would fail even in the valid second combination,
as ceph_origin != distro would be true in that case.

Fixes: #4362

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-22 20:22:13 +02:00
Johannes Kastl e1b9312084 facts: fix a typo
This commit fixes a typo in roles/ceph-facts/tasks/facts.yml

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-22 18:08:28 +02:00
Kevin Coakley e11cbbbcb1 ceph-config: Set changed_when to false on fact gathering statements
The "run 'ceph-volume lvm batch --report' to see how many osds are to be
created" and "run 'ceph-volume lvm list' to see how many osds have already been
created" statements only register the lvm_batch_report and lvm_list variables.
Running those ceph-volume commands should never produce a change on the system.
Adding changed_when: false prevents irrelevant change messages from Ansible.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-08-22 17:27:58 +02:00
Johannes Kastl 8e3511ddc7 fix SUSE/openSUSE naming
As SUSE 15.x and openSUSE Leap 15.x share the same base, make clear
that both are targeted by the respective tasks

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-22 17:20:21 +02:00
Johannes Kastl cdbe958e55 roles/ceph-validate/tasks/check_system.yml: fail on unsupported SUSE versions
Fail if SUSE distributions other than 15.x are found, similar to what we have
for openSUSE

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-22 17:17:21 +02:00
Dimitri Savineau 9a4ac46d19 ceph-osd: Add ulimit nofile on container start
On containerized deployment, the OSD entrypoint runs some ceph-volume
commands (lvm/simple scan and/or activate) which perform badly without
the ulimit option.
This option was added for all previous ceph-volume commands but not on
the ceph-osd container startup.
Also updating hard limit value to 4096 to reflect default baremetal
value.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-22 16:59:08 +02:00
Johannes Kastl 11aa5dbb58 ceph-nfs: fail on openSUSE Leap using distro packages
roles/ceph-validate/tasks/check_nfs.yml: fail on openSUSE Leap
using `ceph_origin = distro`, as the ganesha packages are not available from
the distribution repositories

Fixes: #4342

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-21 09:58:54 +02:00
Johannes Kastl c721cb99cb install ceph-mds packages on SUSE/openSUSE
install packages on SUSE/openSUSE distributions, using the
same logic as on RedHat-based distributions

Fixes #4340

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-21 09:57:56 +02:00
Guillaume Abrioux 9329bbb3af handler: do not validate the server certificate against the CA
Otherwise rgw handler ends up with an error when using https.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-20 13:52:15 +02:00
Johannes Kastl 504017d562 remove duplicate task installing suse dependencies
roles/ceph-common/tasks/installs/install_on_suse.yml: remove the task that
installs the dependencies, as this is done later in install_suse_packages.yml

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-20 12:59:25 +02:00
Guillaume Abrioux 70cf2a5846 osd: remove useless condition
just like `ceph_osd_pool_default_size`, a pool size might change after an
initial deployment. Having this condition prevents customizing the
pool in that case.
This is not needed so let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-19 16:17:22 +02:00
Guillaume Abrioux 4df92152c0 common: replace shell module
there is no need to use `shell` in these tasks. Let's use `command`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-14 16:42:02 +02:00
Guillaume Abrioux 687087fd43 osd: refact 'wait for all osd to be up' task
let's use `until` instead of testing in bash with a python one-liner.
Also, use `command` instead of `shell`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-14 16:42:02 +02:00
Guillaume Abrioux 13815ad3ca common: use discovered_interpreter_python fact
in order to use the right binary name when using python cli in command
or shell module.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-14 16:42:02 +02:00
Guillaume Abrioux a5e359ee80 osd: update the check for 'all osd to be up'
the data structure has changed in octopus.
eg: the path to `num_osds` is now `["osdmap"]["num_osds"]`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-14 16:42:02 +02:00
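Combining the `until` refactor and the updated octopus data path from the 'all osd to be up' commits above, such a check could look roughly like this (variable names and retry values are illustrative, not the exact ceph-ansible code):

```yaml
- name: waiting for all osds to be up
  command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s -f json"
  register: ceph_status
  changed_when: false
  retries: 60
  delay: 10
  until: >-
    (ceph_status.stdout | from_json).osdmap.num_osds > 0 and
    (ceph_status.stdout | from_json).osdmap.num_osds ==
    (ceph_status.stdout | from_json).osdmap.num_up_osds
```
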
Guillaume Abrioux 5b9b841108 mgr: refact 'wait for all mgr to be up' task
There's no need to use `shell` module here.
Instead of using `| python -c`, let's use `from_json` filter.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-07 10:33:54 +02:00
Dimitri Savineau 4c6ec1dccb mgr/dashboard: Fix grafana/prometheus url config
When configuring the grafana/prometheus stack embedded in the mgr/dashboard, we need
to use the address of the grafana-server node and not the current
hostname because mgr/dashboard and grafana/prometheus could be present
on different hosts.
We should instead rely on the grafana_server_addr variable and remove
the dashboard_url.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-06 09:34:20 +02:00
Dimitri Savineau f545b5be0d ceph-dashboard: Add run_once on delegate tasks
Because we need to execute commands from a monitor node (the first one
in the mons list) we are using the delegate_to option.
If there's multiple nodes running the ceph-dashboard role then the
delegated task will be executed multiple times.
Also remove a mgr config-key option not present for nautilus+ releases.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-06 09:34:20 +02:00
Johannes Kastl 5ee3d96fb4 only support openSUSE Leap 15.x, fail on 42.x
openSUSE switched from 'openSUSE 13.x' to 'openSUSE Leap 42.x' and then to
'openSUSE Leap 15.x' to align with SLES15 development.
The previous logic did not correctly allow the current release, as 15.x matched
the 'less than 42.3' condition.

For now only support openSUSE Leap 15.x, and extend support once 16.x is
released (or whatever the exact version will be)

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-08-05 09:46:31 -04:00
Dimitri Savineau 771f25b1f8 ceph-infra: Apply firewall rules with container
We don't have a reason to not apply firewall rules on the host when
using a containerized deployment.
The TripleO environments already manage the ceph firewall rules outside
ceph-ansible and set the configure_firewall variable to false.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733251

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-01 15:16:49 +02:00
Dimitri Savineau 34036c667c ceph-grafana: Set grafana uid/gid on files
We don't need to create a grafana system user (in fact we don't even
set the right uid for this user) because we're using a container setup.
Instead we just need to be sure to set the owner/group to 472 (grafana
user/group from the container) like we do for ceph/167.
We don't need to set the user/group recursively on the /etc/grafana
directory in a dedicated task.
Also, on Ubuntu systems the ceph-grafana-dashboards package isn't present, so on
non containerized deployments we won't have the
/etc/grafana/dashboards/ceph-dashboard directory present (it comes with
the package), so we need to be sure it exists.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-01 10:10:56 +02:00
Guillaume Abrioux c9d80af4e0 dashboard: fix timeout usage on rgw user creation command
For some reason, this makes the playbook fail as follows:

```
TASK [ceph-dashboard : create radosgw system user] ************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
task path: /home/guits/ceph-ansible/roles/ceph-dashboard/tasks/configure_dashboard.yml:106
Tuesday 30 July 2019  10:04:54 +0200 (0:00:01.910)       0:11:22.319 **********
FAILED - RETRYING: create radosgw system user (3 retries left).
FAILED - RETRYING: create radosgw system user (2 retries left).
FAILED - RETRYING: create radosgw system user (1 retries left).
fatal: [mgr0 -> mon0]: FAILED! => changed=true
  attempts: 3
  cmd: timeout 20 podman exec ceph-mon-mon0 radosgw-admin user create --uid=ceph-dashboard --display-name='Ceph dashboard' --system
  delta: '0:00:20.021973'
  end: '2019-07-30 08:06:32.656066'
  msg: non-zero return code
  rc: 124
  start: '2019-07-30 08:06:12.634093'
  stderr: 'exec failed: container_linux.go:336: starting container process caused "process_linux.go:82: copying bootstrap data to pipe caused \"write init-p: broken pipe\""'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

using `timeout -f -s KILL` fixes this issue.

Also, there is no need to use `shell` module here, let's switch to
`command`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-30 13:52:44 +02:00
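A sketch of the resulting task, assuming `container_exec_cmd` expands to the `podman exec ...` prefix shown in the log above:

```yaml
- name: create radosgw system user
  command: >-
    timeout --foreground -s KILL 20 {{ container_exec_cmd }}
    radosgw-admin user create --uid=ceph-dashboard --display-name='Ceph dashboard' --system
  register: rgw_dashboard_user
  delegate_to: "{{ groups[mon_group_name][0] }}"
  retries: 3
  delay: 5
  until: rgw_dashboard_user.rc == 0
```
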
Guillaume Abrioux 2d955757ee osd: add 'osd blacklist' cap for osp keyrings
This commits adds the `osd blacklist` cap on all OSP clients keyrings.

Fixes: #2296

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 09:57:25 -04:00
Dimitri Savineau d549fffdd2 ceph-osd: check container engine rc for pools
When creating OpenStack pools, we only check if the return code from
the pool list command isn't 0 (ie: if it doesn't exist). In that case,
the return code will be 2. That's why the next condition is rc != 0 for
the pool creation.
But in containerized deployment, the return code could be different if
there's a failure on the container engine command (like the container not
running). In that case, the return code could be either 1 (docker) or
125 (podman), so we should fail at this point and not in the next tasks.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-29 15:55:04 +02:00
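A hedged sketch of how the return code handling can be expressed; pool and variable names are hypothetical:

```yaml
- name: check if an openstack pool already exists
  command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} osd pool get {{ item.name }} size"
  register: openstack_pool_exists
  changed_when: false
  failed_when: openstack_pool_exists.rc not in [0, 2]   # 2 = pool missing; 1 (docker) / 125 (podman) = engine failure
  loop: "{{ openstack_pools }}"
```
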
Guillaume Abrioux 02beb00916 validate: add checks for grafana-server group definition
this commit adds two checks:
- check that the `[grafana-server]` group is defined
- check that the `[grafana-server]` contains at least one node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Guillaume Abrioux ec33ee7574 mgr: fix a typo
this task isn't using the right container_exec_cmd and is delegating
to the wrong node.
Let's use the right fact to fix this command.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Guillaume Abrioux b9cdf341be dashboard: remove cfg80211 module installation
According to this comment [1], this seems to be needed to detect wifi
devices.

In node exporter we can see this:

```
--collector.wifi          Enable the wifi collector (default: disabled).
```

since it's disabled by default and we don't even enable it in our
systemd templates for node-exporter, we can safely assume in the end
it's not needed. Therefore, let's remove this.

[1] dbf81b6b5b (diff-961545214e21efed3b84a9e178927a08L21-L23)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Guillaume Abrioux d67230b2a2 dashboard: use dedicated group only
There's no need to add complexity by trying to fall back on another group.
Let's deploy the dashboard on all nodes present in the grafana-server group.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Guillaume Abrioux fb1b5b3251 dashboard: enable dashboard by default
This commit enables dashboard deployment by default.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Dimitri Savineau 07c6695d16 Remove NBSP characters
Some NBSP characters are still present in the yaml files.
Adding a test in Travis CI.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-26 16:09:23 -04:00
Guillaume Abrioux 19950b5170 container: rename docker directories
Those 2 directories should be renamed to be more generic (docker vs.
podman).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-24 16:31:46 +02:00
fmount fac1b030cb Avoid to setup provisioners in a fully containerized environment
This commit adds a when clause to avoid the setup of grafana
provisioners in a fully containerized scenario.
This is needed when the ceph-grafana-dashboards package is not
installed, since this task could otherwise result in a wrong grafana
configuration that lets the container crash.

Signed-off-by: fmount <fpantano@redhat.com>
2019-07-23 09:06:50 +02:00
Giulio Fidente edd1420217 Fix backward compat with old cephfs_pools format
Previously cephfs_pools items used to have a pgs: key but neither
pgp_num: nor pg_num:

Signed-off-by: Giulio Fidente <gfidente@redhat.com>
2019-07-19 11:56:58 -04:00
Guillaume Abrioux 618dbf271d handler: fix bug in osd handlers
fbf4ed42ae introduced a bug when the
container binary is podman.
podman doesn't support ps -f using a regular expression, so the container id
is never set in the restart script, causing the handler to fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721536

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-18 16:22:51 +02:00
Guillaume Abrioux 487d701685 validate: fail if gpt header found on unprepared devices
ceph-volume will complain if gpt headers are found on devices.
This commit checks whether a gpt header is present on devices passed in
the `devices` variable and fails early.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730541

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-18 07:43:55 +02:00
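One possible shape for such a check, sketched with `blkid` (the real ceph-ansible task may differ):

```yaml
- name: read the partition table type of each device
  command: blkid -s PTTYPE -o value {{ item }}
  register: partition_table_type
  changed_when: false
  failed_when: false
  loop: "{{ devices | default([]) }}"

- name: fail early when a gpt header is found
  fail:
    msg: "GPT header found on {{ item.item }}, wipe the device before running ceph-volume"
  when: item.stdout == 'gpt'
  loop: "{{ partition_table_type.results }}"
```
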
Dimitri Savineau 5383c2f7f3 ceph-dashboard: enable rgw options conditionally
The dashboard rgw frontend options only need to be applied when there
are nodes present in the rgw ansible group.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-18 07:22:13 +02:00
Dimitri Savineau 8ab9b719fa dashboard: use variables for port value
The current port value for alertmanager, grafana, node-exporter and
prometheus is hardcoded in the roles so it's not possible to change the
port binding of those services.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-18 07:22:13 +02:00
Dimitri Savineau 0ae0193144 ceph-infra: update handler with daemon variable
Both the ntp and chrony daemons use a variable for the service name because it
could be different depending on the GNU/Linux distribution.
This has been updated in 9d88d3199 for chrony but only for the start part,
not for the handler.
This commit fixes this for both ntp and chrony.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-12 09:14:33 -04:00
Dimitri Savineau 41b44dde85 ceph-infra: Open prometheus port
The Prometheus port 9090 isn't open in the firewall configuration.
Also the dashboard task on the grafana node was not required because
it's already present on the mgr node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-11 13:40:22 +02:00
Guillaume Abrioux ee29f7370a handler: remove legacy condition
since everything is already in a block with the same condition, there's no
need to keep the condition on each of these tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-10 09:42:00 -04:00
Guillaume Abrioux e6dc3ebd8c validate: improve message printed in check_devices.yml
The message prints the whole content of the registered variable in the
playbook; this is not needed and makes the message pretty unclear and
unreadable.

```
"msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!"
```

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-10 09:32:11 -04:00
Dimitri Savineau 1f2a4f1910 ceph-iscsi: Update gateway config/template
- Remove gateway_keyring from the configuration file because it's
not used in ceph-iscsi 3.x release.
- Use config_template instead of template module for iscsi-gateway
configuration file. Because the file is an ini file and we might want
to override more parameters than those present in ceph-ansible.
- Because we can now set the pool name in the configuration, we should
use a variable for that. This is refactored with the iscsi_pool_* variables,
which are also used to configure the pool size.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-10 09:44:40 +02:00
Dimitri Savineau 5413274412 ceph-dashboard: remove bool filter for rgw vars
Some dashboard_rgw_api_* variables are using the bool filter but those
variables are strings with an empty string as default value.
So we should test the variable against an empty string instead of a
bool.

dashboard_rgw_api_host: ''
dashboard_rgw_api_port: ''
dashboard_rgw_api_scheme: ''
dashboard_rgw_api_admin_resource: ''

Resolves: #4179

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-10 09:42:37 +02:00
Boris Ranto 21758fcee8 dashboard: Use upstream default port
We are currently using an incorrect dashboard default port. The upstream
uses 8443 instead of 8234 by default. This should get us closer to the
upstream project.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-07-10 09:17:36 +02:00
Dimitri Savineau de7f948b75 ceph-handler: fix cluster name in socket path
c90f605b5 introduces the default ceph cluster name value in the rgw
socket path for the rgw restart script. But this should use the
`cluster` variable instead.
This commit also fixes this in the osd restart script.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-08 13:55:35 -04:00
fmount 95bd002b35 Add package-install tag on ceph-grafana-dashboard pkg install.
According to the OSP pattern, we need the package-install tag
to control what is installed on the host. This commit just adds
the missing tag to meet the TripleO requirements.

See: /issues/4197 for details

Fixes: #4197

Signed-off-by: fmount <fpantano@redhat.com>
2019-07-08 10:54:30 +02:00
Dimitri Savineau 91bef94b6c ceph-iscsi-gw: Update log directories bind mount
On containerized deployment we need to bind mount the ceph-iscsi
directory to avoid writing the logs in the container.
The /var/log/ceph directory isn't used by the rbd-target-api/gw services
because they have their own log directories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-07 07:25:33 +02:00
ilyashestopalov 904532c5e2 ceph-mon: Fix cluster name parameter
This fixes the ability to add nodes with the monitor role to an existing
cluster whose name differs from the default name.

Signed-off-by: ilyashestopalov <usr.tester@yandex.ru>
2019-07-07 07:21:29 +02:00
Guillaume Abrioux a781ce881c iscsi: refact deprecated variables
This commit moves some old variables into ceph-defaults so we can move
the `use_new_ceph_iscsi` fact into the ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie 08a6d10c32 igw: Add check for missing iqn
If the user is still using the older packages and does not set up
the target iqn, you will just get a vague error message later on.
This adds a check during the validate task, so it is clear to the
user.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie 75fee55d19 igw: Update iscsigws.yml.sample for ceph-iscsi support
Update iscsigws.yml.sample to document that we cannot use ansible to
setup iSCSI objects and use the new ceph-iscsi package.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie cbe66cec52 igw: Support ceph-iscsi package for install
This adds support for the ceph-iscsi package during install. ceph-iscsi
does not support setting up targets/gws, luns and clients with the
current library/igw_* code. Going forward those tasks should be done with
gwcli or dashboard. ceph-iscsi will only be used if the user has no iscsi
objects setup so we do not break existing setups.

The next patch will update the iscsigws.yml.sample to document that
users must not setup any iscsi object if they want to use the new
package and tools.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie b7b2213be1 igw: drop gateway_ip_list for container setups
The gateway_ip_list is not used in container setups, so drop it
for that case.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie d89d3e7cd6 igw: move gateway_ip_list check to validate role
Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Dimitri Savineau c90f605b51 ceph-handler: Fix rgw socket in restart script
Since Mimic the radosgw socket has two extra fields in the socket
name (before the .asok suffix): <pid>.<ctid>

Before:
  /var/run/ceph/ceph-client.rgw.cephaio-1.asok
After:
  /var/run/ceph/ceph-client.rgw.cephaio-1.16913.23928832.asok

The radosgw restart script doesn't handle this and could fail during
an upgrade.
If the SOCKETS variable isn't defined in the script then the test
command won't fail because the return code is 0

$ test -S
$ echo $?
0

There are multiple issues in that script:
  - The default SOCKETS value isn't defined due to a typo
SOCKET vs SOCKETS.
  - Because the socket name uses the pid then we need to check the
socket name after the service restart.
  - After restarting the radosgw service we need to wait a few seconds,
otherwise the socket won't be created.
  - Update the wget parameters because the command is doing a loop.
We now use the same options as curl.
  - The check_rest function doesn't test the radosgw at all due to
a wrong test command (test against a string) and always returns 0.
This needs to use the DOCKER_EXECS variable in order to execute the
command.

$ test 'wget http://192.168.100.11:8080'
$ echo $?
0

Also remove the test based on the ansible_fqdn because we only use
the ansible_hostname + rgw instance name.

Finally, group all for loops into a single one.

Resolves: #3926

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-03 09:30:33 +02:00
Giulio Fidente d526803c6c Add radosgw_frontend_ssl_certificate parameter
This is necessary when configuring RGW with SSL because
in addition to passing specific frontend options, civetweb
appends the 's' character to the binding port and beast uses
ssl_endpoint instead of endpoint.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1722071
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
2019-07-02 14:14:37 -04:00
Guillaume Abrioux b725b3077e nfs: clean template
remove legacy options

```
ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:13): Unknown parameter (Dir_Max)
ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:14): Unknown parameter (Cache_FDs)

```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-28 15:09:19 -04:00
Guillaume Abrioux 33eed78d17 containers: improve logging
bindmount /var/log/ceph on all containers so it's possible to retrieve
logs from the host.

related ceph-container PR: ceph/ceph-container#1408

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710548

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-28 13:30:36 -04:00
Dimitri Savineau 02fbe76e62 ceph-osd: Add CONTAINER_IMAGE env variable
This environment variable was added in cb381b4 but was removed in
4d35e9e.
This commit reintroduces the change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-27 16:38:02 +02:00
fmount e655038743 Set grafana_server_addr fact for ipv6 scenarios.
As bz1721914 describes, the grafana_server_addr
fact is not defined if the ip_version used is ipv6.
This commit adds the ip_version condition to set
this fact correctly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721914

Signed-off-by: fmount <fpantano@redhat.com>
2019-06-26 15:47:22 +02:00
Guillaume Abrioux 366b309c12 facts: fix bug in grafana_server_addr fact setting
If no grafana-server group is defined while an mgr group is, that task
will fail because `hostvars[groups[grafana_server_group_name][0]]` can't
return anything since `groups['grafana-server']` will be a non-existent
key.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-26 10:49:30 +02:00
Guillaume Abrioux 2b9fb377a8 nfs: add missing | bool filters
To address this warning:
```
[DEPRECATION WARNING]: evaluating nfs_ganesha_dev as a bare variable, this
behaviour will go away and you might need to add |bool to the expression in the
 future
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-26 08:58:51 +02:00
Guillaume Abrioux edb8d42596 nfs: remove duplicate task
This task is already present in pre_requisite_non_container.yml

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-26 08:58:51 +02:00
Dimitri Savineau 45d46541cb ceph-handler: Fix OSD restart script
There's two big issues with the current OSD restart script.

1/ We try to test if the ceph osd daemon socket exists but we use a
wildcard for the socket name : /var/run/ceph/*.asok.
This fails because we usually have multiple ceph osd sockets (or
other collocated ceph daemons) present in the /var/run/ceph directory.
Currently the test fails with:

bash: line xxx: [: too many arguments

But it doesn't stop the script execution.
Instead we can specify the full ceph osd socket name because we
already know the OSD id.

2/ The container filter pattern is wrong and could match multiple
containers, causing the script to fail.
We use the filter with two different patterns. One is with the device
name (sda, sdb, ..) and the other one is with the OSD id (ceph-osd-0,
ceph-osd-15, ..).
In both cases we could match more than needed.

$ docker container ls
CONTAINER ID IMAGE              NAMES
958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda
589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb
46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa
877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab
$ docker container ls -q -f "name=sda"
958121a7cc7d
46c7240d71f3
877985ec3aca

$ docker container ls
CONTAINER ID IMAGE              NAMES
2db399b3ee85 ceph-daemon:latest ceph-osd-5
099dc13f08f1 ceph-daemon:latest ceph-osd-13
5d0c2fe8f121 ceph-daemon:latest ceph-osd-17
d6c7b89db1d1 ceph-daemon:latest ceph-osd-1
$ docker container ls -q -f "name=ceph-osd-1"
099dc13f08f1
5d0c2fe8f121
d6c7b89db1d1

Adding an extra '$' character at the end of the pattern solves the
problem.

Finally removing the get_container_osd_id function because it's not
used in the script at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-21 19:54:15 +02:00
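The exact-match idea from the second point can be illustrated outside the restart script as well; this task is purely hypothetical but shows the trailing `$` anchor:

```yaml
- name: find the container of a single OSD without also matching ceph-osd-1x
  command: "{{ container_binary }} ps -q -f name=ceph-osd-{{ osd_id }}$"
  register: osd_container_id
  changed_when: false
```
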
Dimitri Savineau dc187ea6fa Change ansible_lsb by ansible_distribution_release
The ansible_lsb fact is based on the lsb package (lsb-base,
lsb-release or redhat-lsb-core).
If the package isn't installed on the remote host then the fact isn't
populated.

--------
"ansible_lsb": {},
--------

Switching to the ansible_distribution_release fact instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-21 11:55:05 -04:00
fpantano ba73dc7b21 Add higher retry/delay defaults to check the quorum status.
As per bz1718981, this commit adds higher values to check
the quorum status. This is helpful for several OSP deployments
that fail during the scale up.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1718981

Signed-off-by: fpantano <fpantano@redhat.com>
2019-06-20 22:39:57 +02:00
Dimitri Savineau b987534881 ceph-volume: Set max open files limit on container
The ceph-volume lvm list command takes ages to complete when having
a lot of LV devices on containerized deployment.
For instance, with 25 OSDs on a node it takes 3 mins 44s to list the
OSD.
Adding the max open files limit to the container engine cli when
executing the ceph-volume command seems to improve the execution time
a lot (~30s).

This was impacting the OSDs creation with ceph-volume (both filestore
and bluestore) when using multiple LV devices.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-20 22:37:40 +02:00
Guillaume Abrioux 46a2683944 facts: add a retry on get current fsid task
sometimes it can happen that the following task fails:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-centos-container-update/roles/ceph-facts/tasks/facts.yml:78
Wednesday 19 June 2019  18:12:49 +0000 (0:00:00.203)       0:02:39.995 ********
fatal: [mon2 -> mon1]: FAILED! => changed=true
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.239339'
  end: '2019-06-19 18:12:49.812099'
  msg: non-zero return code
  rc: 22
  start: '2019-06-19 18:12:49.572760'
  stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

not sure exactly why, since just before this task mon1 seems to be well
up, otherwise it wouldn't have passed the task `waiting for the
containerized monitor to join the quorum`.

As a quick fix/workaround, let's add a retry which allows us to get
around this situation:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-scenario/roles/ceph-facts/tasks/facts.yml:78
Thursday 20 June 2019  15:35:07 +0000 (0:00:00.201)       0:03:47.288 *********
FAILED - RETRYING: get current fsid (3 retries left).
changed: [mon2 -> mon1] => changed=true
  attempts: 2
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.290252'
  end: '2019-06-20 15:35:13.960188'
  rc: 0
  start: '2019-06-20 15:35:13.669936'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    {
        "fsid": "153e159d-7ade-42a7-842c-4d04348b901e"
    }
  stdout_lines: <omitted>
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-20 13:13:04 -04:00
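A rough sketch of the retry pattern added here, with hypothetical variable names:

```yaml
- name: get current fsid
  command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} daemon mon.{{ monitor_name }} config get fsid"
  register: current_fsid
  retries: 3
  delay: 10
  until: current_fsid.rc == 0
  delegate_to: "{{ groups[mon_group_name][0] }}"
```
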
Dimitri Savineau 7c3640177b roles: Remove useless become (true) flag
We already set the become flag to true at a play level in the site*
playbooks so we don't need to set it at a task level.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-19 10:31:32 +02:00
Guillaume Abrioux eece362b38 osd: remove legacy task
`parted_results` isn't used anymore in the playbook.

By the way, `parted` seems to cause issues because it changes the
ownership of devices:

```
[root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:53 /dev/sdc
brw-rw----. 1 ceph ceph 8, 33 Jun 11 08:53 /dev/sdc1
brw-rw----. 1 ceph ceph 8, 34 Jun 11 08:53 /dev/sdc2

[root@osd0 ~]# parted -s /dev/sdc print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdc: 53.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name           Flags
 1      1049kB  1075MB  1074MB               ceph block.db
 2      1075MB  2149MB  1074MB               ceph block.db

[root@osd0 ~]# #We can see ownerships have changed from ceph:ceph to root:disk:
[root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:57 /dev/sdc
brw-rw----. 1 root disk 8, 33 Jun 11 08:57 /dev/sdc1
brw-rw----. 1 root disk 8, 34 Jun 11 08:57 /dev/sdc2
[root@osd0 ~]#
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-18 12:45:01 -04:00
Dimitri Savineau 34f9d51178 tests: Update ansible ssh_args variable
Because we're using vagrant, an ssh config file will be created for
each node with options like user, host, port, identity, etc...
But via tox we override ANSIBLE_SSH_ARGS to use this file. This
removes the default value set in ansible.cfg.

Also adding PreferredAuthentications=publickey because CentOS/RHEL
servers are configured with GSSAPIAuthentication enabled for the ssh
server, forcing the client to make a PTR DNS query.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-17 09:24:24 +02:00
Rishabh Dave 9d88d3199f ceph-infra: make chronyd default NTP daemon
Since timesyncd is not available on RHEL-based OSs, change the default
to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so
set the Ansible fact accordingly.

Fixes: https://github.com/ceph/ceph-ansible/issues/3628
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-06-13 14:53:22 -04:00
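A minimal sketch of per-distribution defaults like the ones described above (fact names are illustrative, not the exact ceph-ansible code):

```yaml
- name: default to chronyd on RHEL-based systems
  set_fact:
    ntp_daemon_type: chronyd
  when: ansible_os_family == 'RedHat'

- name: chronyd is packaged as 'chrony' on Ubuntu
  set_fact:
    chrony_daemon_name: "{{ 'chrony' if ansible_distribution == 'Ubuntu' else 'chronyd' }}"
```
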
Rishabh Dave d1c266e6c7 ceph-infra: update cache for Ubuntu
Ubuntu-based CI jobs often fail with error code 404 while installing
NTP daemons. Updating cache beforehand should fix the issue.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-06-13 14:00:29 +02:00
Rishabh Dave 67071c3169 align cephfs pool creation
The definitions of cephfs pools should match openstack pools.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
2019-06-13 09:44:05 +02:00
Guillaume Abrioux 4cf17a6fdd iscsi: assign application (rbd) to pool 'rbd'
if we don't assign the rbd application tag to this pool,
the cluster will get a `HEALTH_WARN` state like the following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-13 07:35:39 +02:00
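The warning can be cleared with a single command; as an Ansible task it could look like this (sketch only):

```yaml
- name: assign the rbd application to the 'rbd' pool
  command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool application enable rbd rbd"
  run_once: true
  delegate_to: "{{ groups[mon_group_name][0] }}"
```
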
Dimitri Savineau da9891da1e ceph-handler: replace fuser by /proc/net/unix
We're using the fuser command to see if a process is using a ceph unix
socket file. But the fuser command runs through every PID present in
/proc to see if one of them is using the file.
On a system running thousands processes, the fuser command can take
a long time to finish.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-12 19:31:21 +02:00
Guillaume Abrioux 905c2256bd mon: enforce mon0 delegation for initial_mon_key register
since this task is designed to always run on the first monitor, let's
enforce the container name accordingly, otherwise it could fail like
the following:

```
fatal: [mon1 -> mon0]: FAILED! => changed=true
  cmd:
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - --name
  - mon.
  - -k
  - /var/lib/ceph/mon/ceph-mon0/keyring
  - auth
  - get-key
  - mon.
  delta: '0:00:00.085025'
  end: '2019-06-12 06:12:27.677936'
  msg: non-zero return code
  rc: 1
  start: '2019-06-12 06:12:27.592911'
  stderr: 'Error response from daemon: No such container: ceph-mon-mon1'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-12 11:21:19 -04:00
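A sketch of what pinning both the delegation target and the container name to the first monitor looks like; paths and variable names are illustrative:

```yaml
- name: fetch the initial mon key from the first monitor
  command: >-
    {{ hostvars[groups[mon_group_name][0]]['container_exec_cmd'] }}
    ceph --cluster {{ cluster }} --name mon.
    -k /var/lib/ceph/mon/{{ cluster }}-{{ hostvars[groups[mon_group_name][0]]['ansible_hostname'] }}/keyring
    auth get-key mon.
  register: initial_mon_key
  run_once: true
  delegate_to: "{{ groups[mon_group_name][0] }}"
```
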
Guillaume Abrioux 27856cc499 dashboard: add allow_embedding support
Add a variable to support allow_embedding.

See ceph/ceph-ansible/issues/4084 for details.

Fixes: #4084

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-12 16:00:32 +02:00
Guillaume Abrioux 2c9cd9d9e7 dashboard: fix dashboard_url setting
This setting must be set to something resolvable.

See: ceph/ceph-ansible/issues/4085 for details

Fixes: #4085

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-12 15:59:58 +02:00
Dimitri Savineau d0840217f3 ceph-node-exporter: Fix systemd template
069076b introduced a bug in the systemd unit script template. This
commit fixes the options used by the node-exporter container.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-11 21:48:40 +02:00
Dimitri Savineau dbf81b6b5b ceph-node-exporter: use modprobe ansible module
Instead of using the modprobe command from the path in the systemd
unit script, we can use the modprobe ansible module.
That way we don't have to manage the binary path based on the linux
distribution.

Resolves: #4072

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-11 21:40:50 +02:00
fmount 069076bbfd Fix units and add ability to have a dedicated instance
A few fixes on the systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables turned out to be undefined. This commit adds the
ability to check if the group exists and create, accordingly,
different job groups in the prometheus template.

Signed-off-by: fmount <fpantano@redhat.com>
2019-06-10 18:18:46 +02:00
Guillaume Abrioux 771648304d validate: fail in check_devices at the right task
see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-07 16:14:18 +02:00
Dimitri Savineau f49090df7e podman: Add systemd dependency on network.target
When using podman, the systemd unit scripts don't have a dependency
on the network. So we're not sure that the network is up and running
when the containers are starting.
With docker this behaviour is already handled because the systemd
unit scripts depend on docker service which is started after the
network.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-07 09:28:58 +02:00
guihecheng 35d40c65f8 Add role definitions of ceph-rgw-loadbalancer
This add support for rgw loadbalancer based on HAProxy and Keepalived.
We define a single role ceph-rgw-loadbalancer and include HAProxy and
Keepalived configurations all in this.

A single haproxy backend is used to balance all RGW instances and
a single frontend is exported via a single port, default 80.

Keepalived is used to maintain the high availability of all haproxy
instances. You are free to use any number of VIPs. Each VIP is
shared across all keepalived instances and there will be one
master per VIP, selected sequentially, while the others serve as
backups.
This assumes that each keepalived instance is on the same node as
one haproxy instance and we use a simple check script to detect
the state of each haproxy instance and trigger the VIP failover
upon its failure.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
2019-06-06 17:12:04 +02:00
L3D ab54fe20ec ansible: use 'bool' filter on boolean conditionals
By running ceph-ansible there are a lot of ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

Now ``| bool`` has been appended to a lot of the affected variables.

Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.

Closes: #4022

Signed-off-by: L3D <l3d@c3woc.de>
2019-06-06 10:21:17 +02:00
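Before and after, on a made-up task, the change looks like this:

```yaml
# before: bare variable, triggers the deprecation warning
- name: do something only on containerized deployments (example)
  debug:
    msg: "containerized"
  when: containerized_deployment

# after: explicit boolean evaluation
- name: do something only on containerized deployments (example)
  debug:
    msg: "containerized"
  when: containerized_deployment | bool
```
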
Dimitri Savineau 518ab794fb container-common: support podman on Ubuntu
Currently we're only able to use podman on ubuntu if podman's
installation is done manually before the ceph-ansible execution
because the deb package is present in an external repository.
We already manage the docker-ce installation via an external
repository so we should be able to allow the podman installation
with the same mechanism too.

https://github.com/containers/libpod/blob/master/install.md#ubuntu

Resolves: #3947

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-05 14:07:34 +02:00
Guillaume Abrioux 80875adba7 ceph-osd: do not relabel /run/udev in containerized context
Otherwise content in /run/udev is mislabeled and prevent some services
like NetworkManager from starting.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-04 11:32:41 -04:00
Guillaume Abrioux a78fb209b1 tests: test podman against atomic os instead rhel8
the rhel8 image used is an outdated beta version, it is not worth it to
maintain this image upstream, since it's possible to test podman with a
newer version of centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-04 11:32:41 -04:00
Dimitri Savineau 616c484698 ceph-nfs: use template module for configuration
789cef7 introduces a regression in the ganesha configuration file
generation. The new config_template module version broke it.
But the ganesha.conf file isn't an ini file and doesn't really
need to use the config_template module. Instead we can use the
classic template module.

Resolves: #4045

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-04 09:11:52 +02:00
Guillaume Abrioux 6e2e30db54 dashboard: move ceph-grafana-dashboards package installation
This commit moves the package installation into the ceph-dashboard role.
This is needed to install the ceph dashboard json file in
`/etc/grafana/dashboards/ceph-dashboard/`.

Closes: #4026

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-03 13:36:38 +02:00
Guillaume Abrioux 14f5fc3c86 infra: refact dashboard firewall rules
- There is no need to open ports 3000, 8234, 9283 on all nodes.
- Add missing rule for alertmanager (port 9093)

Closes: #4023

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-03 13:36:38 +02:00
Guillaume Abrioux a2b6f44665 dashboard: append mgr modules to ceph_mgr_modules
when `dashboard_enabled` is `True`, let's append `dashboard` and
`prometheus` modules to `ceph_mgr_modules` so they are automatically
loaded.

Closes: #4026

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-03 13:36:38 +02:00
Dimitri Savineau 7503098ca0 remove ceph-agent role and references
The ceph-agent role was used only for RHCS 2 (jewel) so it's not
useful anymore.
The current code will fail on CentOS distributions because the rhscon
package is only available on Red Hat with the RHCS 2 repository and
this ceph release is supported on the stable-3.0 branch.

Resolves: #4020

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-03 13:35:50 +02:00
Guillaume Abrioux 003aeea45a validate: add a check for nfs standalone
if `nfs_obj_gw` is True when deploying an internal ganesha with an
external ceph cluster, `ceph_nfs_rgw_access_key` and
`ceph_nfs_rgw_secret_key` must be provided so the
ganesha configuration file can be generated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-03 13:34:38 +02:00
Guillaume Abrioux 6a6785b719 nfs: support internal Ganesha with external ceph cluster
This commit allows deploying an internal ganesha with an external ceph
cluster.

This requires to define `external_cluster_mon_ips` with a comma
separated list of external monitors.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710358

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-03 13:34:38 +02:00
Dimitri Savineau daf92a9e1f ceph-facts: generate fsid on mon node
The fsid generation is done via a python command. When the ansible
controller node only has python3 available (like RHEL 8) then the
python command isn't necessarily present, causing the fsid generation
to fail.
We already do some resource creation (like ceph keyring secret) with
the python command too but from the mon node so we should do the same
for fsid.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1714631

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-03 10:11:32 +02:00
Guillaume Abrioux 55420d6253 roles: introduce `ceph-container-engine` role
This commit splits the current `ceph-container-common` role.

This introduces a new role `ceph-container-engine` which handles the
tasks specific to the installation of containers tools (docker/podman).

This is needed for the ceph-dashboard implementation for 2 main reasons:

1/ Since the ceph-dashboard stack is only containerized, we must install
everything needed to run containers even in non containerized
deployments. Splitting this role allows us to not have to call the full
`ceph-container-common` role which would run a bunch of unneeded tasks
that would have been skipped anyway.

2/ The current implementation would have required to run
`ceph-container-common` on all ceph-clients nodes which would have been
conflicting with 9d3517c670 (we don't want
to run ceph-container-common on all client nodes, see mentioned commit
for more details)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-22 13:02:10 +02:00
Dimitri Savineau f37edfa113 ceph-mgr: install python-routes for dashboard
The ceph mgr dashboard requires routes python library to be installed
on the system.

Resolves: #3995

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-22 08:46:16 +02:00
Dimitri Savineau 622d9feae9 common: use gnupg instead of gpg
The gpg package isn't available for all Debian/Ubuntu distributions but
gnupg is.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-21 17:53:58 +02:00
Dimitri Savineau 29b0d47c8c ceph-prometheus: fix error in templates
- remove trailing double quotes in jinja templates
- add jinja filename without .j2 suffix

Resolves: #4011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-21 17:53:39 +02:00
Guillaume Abrioux 6ca7372a2d config: fix ipv6
As of nautilus, if you set `ms bind ipv6 = True` you must explicitly set
`ms bind ipv4 = False` too, otherwise OSDs will still try to pick up an
IPv4 address.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710319

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-21 10:38:00 -04:00
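In ceph-ansible terms, the generated configuration needs to end up with both options; expressed via ceph_conf_overrides it would look roughly like this (the role handles it in its own templates, this is only illustrative):

```yaml
ceph_conf_overrides:
  global:
    ms bind ipv6: true
    ms bind ipv4: false
```
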
Dimitri Savineau 0ee833432e ceph-nfs: apply selinux fix anyway
Because ansible_distribution_version doesn't return the minor version on
CentOS with ansible 2.8, we can apply the selinux fix anyway, but only for
CentOS/RHEL 7.
Starting RHEL 8, there's a dedicated package for selinux called
nfs-ganesha-selinux [1].

Also replace the command module + semanage by the selinux_permissive
module.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-20 13:04:58 +02:00
Dimitri Savineau 0c7fd79865 ceph-validate: use kernel validation for iscsi
Ceph iSCSI gateway requires Red Hat Enterprise Linux or CentOS 7.5
or later.
Because we can not check the ansible_distribution_version fact for
CentOS with ansible 2.8 (it returns only the major version) we can
fall back to checking the kernel options.

  - CONFIG_TARGET_CORE=m
  - CONFIG_TCM_USER2=m
  - CONFIG_ISCSI_TARGET=m

http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-20 13:04:58 +02:00
Guillaume Abrioux 72d8315299 switch to ansible 2.8
- remove private attribute with import_role.
- update documentation.
- update rpm spec requirement.
- fix MagicMock python import in unit tests.

Closes: #3765

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-20 13:04:58 +02:00
Dimitri Savineau 494746b7a6 common: install dependencies for apt modules
When using a minimal Debian/Ubuntu distribution there are no
ca-certificates and gpg packages installed, so the apt modules will
fail:

Failed to find required executable gpg in paths:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

apt.cache.FetchFailedException:
W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease:
No system certificates available. Try installing ca-certificates.

Resolves: #3994

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-20 10:02:43 +02:00
Guillaume Abrioux 9f0d4d6847 dashboard: move defaults variables to ceph-defaults
There is no need to have default values for these variables in each role
since there are no corresponding host groups.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux e74d80e72f rename docker_exec_cmd variable
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux cc285c417a dashboard: align the way containers are managed
This commit aligns the way the different containers are managed with how
it's currently done with the other ceph daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux cd5f3fca64 dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool
make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to
be used as one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 8bbcc46ae4 dashboard: remove legacy file
this file seems to be no longer used, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 14f381200d dashboard: set less permissive permissions on dashboard certificate/key
use `0440` instead of `0644` is enough

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 4405f50c85 dashboard: simplify config-key command
since stable-4.0 isn't intended to deploy ceph releases prior to nautilus,
there's no need to add this complexity here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux cdff0da7d4 dashboard: do not call ceph-container-common from other role
use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the different applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 742bb6214c dashboard: use existing variable to detect containerized deployment
there is no need to add more complexity for this, let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 6d9dbb1d39 facts: set container_binary fact in non-containerized deployment
This is needed for the ceph-dashboard implementation since it requires
to run containerized application which aren't packaged as RPMs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 3578d576a4 dashboard: rename template files
add .j2 to all templates file related to dashboard roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Boris Ranto b4d1c3693b dashboard: Support podman
This adds support for podman in dashboard-related roles. It also drops
the creation of a custom network for the dashboard-related roles as this
functionality works in a different way with podman.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-05-16 16:39:13 +02:00
Boris Ranto e737a1f83e dashboard: Set ssl_server_port if it is supported
We cannot use the old-fashioned config-key way here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-05-16 16:39:13 +02:00
Boris Ranto 8f77caa932 dashboard: Add and copy alerting rules
This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-05-16 16:39:13 +02:00
Boris Ranto 2f141a6e80 Merge cephmetrics/dashboard-ansible repo
This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to setup ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Dimitri Savineau d2ad191eca container-common: allow podman for other distros
Currently the podman installation is very tied to RHEL 8 even if we're
able to install it on Debian/Ubuntu distributions.
This patch changes the way we decide whether or not to start the (fat)
container daemon. Before, the condition was based on the distribution
release; now it is based on the container_service_name variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-13 16:24:00 +02:00
Bruceforce c3b0ee30a1 ceph-nfs: fixed with_items
If we do this in one line we get the error described in #3968

fixes #3968

Signed-off-by: Bruceforce <markus.greis@gmx.de>
2019-05-13 16:23:43 +02:00
Bruceforce 29f2c953b4 ceph-nfs: fixed condition for "stable repos specific tasks"
The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"

now it is
"when": [
          "nfs_ganesha_stable",
          "ceph_repository == 'community'"
        ]

Please backport to stable-4.0

Signed-off-by: Bruceforce <markus.greis@gmx.de>
2019-05-13 09:53:54 +02:00
Dimitri Savineau ba49225eab Update RHCS version with Nautilus
RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Updated the default ceph_rhcs_version to 4 and updated the rhcs
repositories to rhcs 4 with RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-13 09:53:18 +02:00
Kevin Coakley 381c58ca3e Set the rgw_create_pools pools application to rgw
Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the health is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-05-13 09:48:25 +02:00
Rishabh Dave 121b5e4184 ceph-rbd-mirror: refactor tasks/main.yml
Use blocks for similar tasks in main.yml. And move when keywords before
block keywords.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-10 09:21:54 +02:00
Rishabh Dave 1a4dccdbb9 ceph-mds: group similar tasks in create_mds_filesystem.yml
Group similar tasks together using block keyword.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-10 09:21:05 +02:00
Guillaume Abrioux 936c6fca78 facts: fix external cluster bug
running an external ceph cluster deployment with (obviously) no
monitors defined in the inventory breaks with an undefined variable error
because `_monitor_addresses` never gets defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-07 17:53:53 +02:00
Rishabh Dave 56bfec7c58 ceph-mgr: create keys for MGRs
Add code in ceph-mgr for creating a keyring for the manager so that
managers can be deployed on a separate node too.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-07 14:13:06 +02:00
Rishabh Dave 89748d579a don't access other node's docker_exec_cmd variable
Except for some corner case, it's not correct to access some other
node's copy of variable docker_exec_cmd. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by
"docker_exec_cmd".

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-07 12:37:48 +02:00
Gaudenz Steinlin 3c8987c7a5 Fix check mode support
Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
2019-05-07 09:49:20 +02:00
Dimitri Savineau ae266c6f2b ansible: remove private and static attribute
This will be removed in ansible 2.8 and breaks the playbook execution
with this release.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-02 14:25:17 -04:00
Dimitri Savineau 1999cf3d19 ceph-mds: Increase cpu limit to 4
In containerized deployment the default mds cpu quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-24 20:33:02 +02:00
Dimitri Savineau c17106874c ceph-osd: Increase cpu limit to 4
In containerized deployment the default osd cpu quota is too low
for production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-24 17:59:42 +02:00
Dimitri Savineau 4ae5ce399b ceph-iscsi: start tcmu-runner for non-container
Only rbd-target-api and rbd-target-gw were started/enabled for non
containerized deployment.
The issue doesn't happen with containerized setup.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-24 10:03:25 +02:00
Rishabh Dave 739a662c80 improve coding style
Keywords requiring only one item shouldn't express it by creating a
list with a single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-23 15:37:07 +02:00
Radu Toader b2f242660e Allow CephFS pool to be created with specific rule_name, erasure_profile just like rbd pools
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-04-20 02:26:05 +00:00
Dimitri Savineau 8105a1cefb ceph-container-common: modify requirement flow
Until now it was not possible to install a specific container package
because it was somehow hardcoded.
This patch allows overriding the container package name (docker.io
vs docker-ce) and refactors the package installation. This can be
achieved via the container_package_name variable.
Instead of using one task per distribution we can set the package and
service name in vars. This allows to have a unified package task.
Also refactor the debian_prerequisites tasks because the content
was outdated.

https://docs.docker.com/install/linux/docker-ce/debian/
https://docs.docker.com/install/linux/docker-ce/ubuntu/

Resolves: #3609

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-18 16:18:01 +02:00
Guillaume Abrioux 58f3851573 mds: remove legacy task
This task has no purpose in stable-4.0 and later.
Let's remove it since stable-4.0 and later aren't intended to deploy
luminous.

Closes: #3873

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-18 15:55:45 +02:00
Kyle Bader 0bee90b201 rgw: add cpuset support
1/ The OSD already supports cpuset for containerized deployments
through the use of the ceph_osd_docker_cpuset_cpus variable. This adds similar
support to the RGW service for containerized deployments by setting a new
variable named ceph_rgw_docker_cpuset_cpus. Like the OSD, there are times where
using distinct cores has advantages over using the CFS kernel scheduler.

ceph_rgw_docker_cpuset_cpus accepts a comma delimited set of CPU ids

2/ Add support for specifying the --cpuset-mems option to restrict the cgroup's memory
allocations to a particular numa node, which should typically correspond with
the cpu ids of that numa node that were provided with --cpuset-cpus. To ensure
the correct cpu ids are used, one can run `numactl --hardware` to list the nodes
and which cpu ids correspond to each.
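A sketch of the corresponding group_vars entry; ceph_rgw_docker_cpuset_cpus is
named in this commit, while the *_cpuset_mems variable name and the ids below
are illustrative:

```
# pin the containerized RGW to the cores and memory of NUMA node 0
# (ids would come from the `numactl --hardware` output)
ceph_rgw_docker_cpuset_cpus: "0,1,2,3"
ceph_rgw_docker_cpuset_mems: "0"
```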

Signed-off-by: Kyle Bader <kbader@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-18 15:55:19 +02:00
Dimitri Savineau 86315272c7 ceph-mgr: Add extra module packages
Since Nautilus there are extra mgr modules that aren't shipped in the
ceph-mgr package but in dedicated packages.

Resolves: #3860

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-18 15:31:22 +02:00
Guillaume Abrioux a4bc7bda51 update: refact msgr2 migration
This commit refactors the msgr2 protocol introduction.

If it's a fresh install, let's go with v2 only.
If we upgrade to nautilus, we should go with the v2+v1 syntax to ensure
nothing breaks.
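For illustration only, the two address syntaxes look roughly like this
(addresses are placeholders; mon host is normally generated by the
ceph-config template, the override below is just to show the format):

```
ceph_conf_overrides:
  global:
    # fresh Nautilus install: v2 only
    mon host: "[v2:192.168.1.10:3300]"
    # upgraded cluster: keep both v2 and v1 so nothing breaks
    # mon host: "[v2:192.168.1.10:3300,v1:192.168.1.10:6789]"
```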

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-18 11:16:11 +02:00
Andrew Schoen 67453853ff rolling_update: set num_osds to the number of running osds
We do this so that the ceph-config role can most accurately
report the number of osds for the generation of the ceph.conf
file.

We don't want to use ceph-volume to determine the number of
osds because in an upgrade to nautilus ceph-volume won't be able to
accurately count osds created by ceph-disk.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-04-18 10:55:11 +02:00
Andrew Schoen 5e3dfe5021 ceph-osd: do not run lvm batch tasks during update
When performing a rolling update do not try to create
any new osds with `ceph-volume lvm batch`. This is troublesome
because when upgrading to nautilus the devices list might contain
devices that are currently being used by ceph-disk and have GPT
headers on them, which will cause ceph-volume to fail when
trying to use such a device. Any devices originally created
by ceph-disk will need to be removed from the devices list
before any new osds can be created.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-04-18 10:55:11 +02:00
Dimitri Savineau c8814d1331 ceph-iscsi-gw: Remove library directory
The library directory that contains the custom ceph modules is present
in the ceph-ansible root directory.
All igw_* modules are already present there, so we don't need the copy
in roles/ceph-iscsi-gw/library.
Also remove the associated spec file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-18 10:37:57 +02:00
Dimitri Savineau e471bce76b allow using ansible 2.8
Currently we only support Ansible 2.7.
We plan to use 2.8 when it is released, so we have to support both
2.7 and 2.8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1700548

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-17 16:57:37 +02:00
Guillaume Abrioux edfa4310d3 defaults: refact package dependencies installation.
Because 5c98e361df could be seen as a non
backward compatible change, this commit reverts it and brings back package
dependency installation support.
Let's just modify the default value instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-16 11:07:59 -04:00
Guillaume Abrioux 83df60cbc3 defaults: remove some package dependencies
These packages aren't needed anymore.
They were needed for ceph-init-detect, but ceph-init-detect doesn't exist
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683885

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-15 11:28:58 -04:00
Rishabh Dave 96c180cc0e check if mon daemon is installed before restarting it
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-15 10:00:50 +02:00
Guillaume Abrioux edf1ee2073 mon: check if an initial monitor keyring already exists
When adding a new monitor, we must reuse the existing initial monitor
keyring. Otherwise, the new monitor will issue its 'mkfs' with a new
monitor keyring and it will result in a mismatch between them. The
new monitor will be unable to join the quorum in the end.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Rishabh Dave <ridave@redhat.com>
2019-04-15 10:00:50 +02:00
Guillaume Abrioux f899da3172 osd: remove legacy file
this file is not used anymore, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Guillaume Abrioux 4f68462009 osd: remove ceph-disk scenarios files
These files aren't needed anymore since we only use the lvm scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Guillaume Abrioux f0416c8892 osd: remove dedicated_devices variable
This variable was related to ceph-disk scenarios.
Since we are entirely dropping ceph-disk support as of stable-4.0, let's
remove this variable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Guillaume Abrioux 4d35e9eeed osd: remove variable osd_scenario
As of stable-4.0, the only valid scenario is `lvm`.
Thus, this makes this variable useless.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Guillaume Abrioux 4d5637fd8a osd: remove legacy file
ceph_disk_cli_options_facts.yml is not used anymore, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Sébastien Han 2888c0825f validate: only check device when they are devices
We only validate the devices that are passed if there is a list of
devices to validate.

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-04-11 11:57:02 -04:00
Sébastien Han 52df15895b osd: default osd_scenario to lvm
osd_scenario has become obsolete and defaults to lvm. With lvm there are
no such things as collocated and non-collocated.

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-04-11 11:57:02 -04:00
Sébastien Han 9ea1e49407 validate: print a message for old scenarios
ceph-disk is not supported anymore, so all the newly created OSDs will
be configured using ceph-volume.

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-04-11 11:57:02 -04:00
Sébastien Han e2a5aa062e osd: remove ceph-disk support
We don't support the preparation of OSDs with ceph-disk anymore; only
ceph-volume is supported. However, the start operation of OSDs is still
supported. So if you change a config option, the handlers will be able to
restart all the OSDs via their respective systemd unit files.

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00
Dimitri Savineau d2efb7f02b ceph-mds: Set application pool to cephfs
We don't need to use the cephfs variable for the application pool
name because it's always cephfs.
If the cephfs variable is set to something other than the default
value, it will break the application pool task.

Resolves: #3790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-11 15:15:41 +02:00
Guillaume Abrioux 7e0adca7a4 osds: allow passing devices by path
ceph-volume didn't work when the devices were passed by path.
Since it now supports it, let's allow this feature in ceph-ansible.
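A minimal sketch of an OSD host_vars entry using persistent by-path names
(the paths are placeholders):

```
devices:
  - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0
  - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0
```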

Closes: #3812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-10 13:22:30 -04:00
Guillaume Abrioux 631e5d3144 mon: remove useless delegate_to
Let's use a condition to run this task only on the first mon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-10 00:39:25 +00:00
Dimitri Savineau d17b1b48b6 rgw: change default frontend on nautilus
As discussed in ceph/ceph#26599, beast is now the default frontend
for the rados gateway with the nautilus release.
Add the rgw_thread_pool_size variable with 512 as the default value and keep
backward compatibility with the num_threads option when using civetweb.
Update radosgw_civetweb_num_threads to reflect the rgw_thread_pool_size
change.
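A sketch of the related settings; rgw_thread_pool_size and
radosgw_civetweb_num_threads are named in this commit, while the
radosgw_frontend_type variable name is an assumption:

```
radosgw_frontend_type: beast                 # nautilus default
rgw_thread_pool_size: 512
# kept for backward compatibility when civetweb is still used:
radosgw_civetweb_num_threads: "{{ rgw_thread_pool_size }}"
```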

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-09 17:21:51 +02:00
Dimitri Savineau 37816570c6 container-common: Enable docker on boot for ubuntu
docker daemon is automatically started during package installation
but the service isn't enabled on boot.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-09 16:50:10 +02:00
Matthew Vernon 9dd913cf8a UCA: Uncomment UCA variables in defaults, fix consequent breakage
The Ubuntu Cloud Archive-related (UCA) defaults in
roles/ceph-defaults/defaults/main.yml were commented out, which means
if you set `ceph_repository` to "uca", you get undefined variable
errors, e.g.

```
The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined

The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: add ubuntu cloud archive repository
  ^ here

```

Unfortunately, uncommenting these results in some other breakage,
because further roles were written that use the fact of
`ceph_stable_release_uca` being defined as a proxy for "we're using
UCA", and so try to install packages from the bionic-updates/queens
release, for example, which doesn't work. So there are a few `apt` tasks
that need modifying to not use `ceph_stable_release_uca` unless
`ceph_origin` is `repository` and `ceph_repository` is `uca`.

Closes: #3475
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
2019-04-09 13:44:00 +02:00
Dimitri Savineau fd4b0ec7eb ceph-facts: use last ipv6 address for mon/rgw
When using the monitor_address_block or radosgw_address_block variables
to configure the mon/rgw address, we're getting the first ip address
from the ansible facts present in that cidr.
When there's a VIP on that network, the first filter could return the
wrong value.
This seems to affect only IPv6 setups because the VIP addresses are
added to the ansible facts at the beginning of the list. This is the
opposite (at the end) when using IPv4.
This causes the mon/rgw processes to bind to the VIP address.
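A sketch of the selection logic described above, using the ipaddr filter
(the fact name _monitor_address is illustrative):

```
- name: set monitor address from monitor_address_block (ipv6)
  set_fact:
    _monitor_address: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | last }}"
```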

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680155

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-09 06:17:05 +02:00
François Lafont 4c3e77d869 ceph-rgw: Fix bad paths which depend on the clustername
The path of the RGW environment file (in the /var/lib/ceph/radosgw/
directory) depends on the Ceph clustername. It was not taken into
account in the Ansible role `ceph-rgw`.

Signed-off-by: flaf <francois.lafont.1978@gmail.com>
2019-04-09 06:16:31 +02:00
Guillaume Abrioux cbfdbab177 mgr: manage mgr modules when mgr and mon are collocated
When mgrs are implicitly collocated on monitors (no mgrs in the mgrs
group), that include was skipped because of this condition:

`inventory_hostname == groups[mgr_group_name][0]`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-09 06:12:29 +02:00
Guillaume Abrioux f596cc1711 mgr: wait for all mgr to be available
Before managing mgr modules, we must ensure all mgrs are available,
otherwise we can hit a failure like the following:

```
stdout:Error ENOENT: all mgr daemons do not support module 'restful', pass --force to force enablement
```

It happens because all mgrs are not yet available when trying to manage
mgr modules.

Closes: #3100

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-09 06:12:29 +02:00
Rishabh Dave c0dfa9b61a allow adding a MDS to already deployed cluster
Add a tox scenario that adds a new MDS node to an already
deployed Ceph cluster and deploys an MDS there.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-08 13:33:28 +02:00
Ali Maredia 37f46a8c5d rgw multisite: add more than 1 rgw to the master or secondary zone
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2019-04-06 08:01:19 +02:00
Dimitri Savineau d3ae9fd05f radosgw: Raise cpu limit to 8
In containerized deployments the default radosgw cpu quota is too low
for a production environment.
This causes performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680171

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-04 18:50:48 +02:00
fpantano afbb90e4ac Check ceph_health_raw.stdout value as string during mon bootstrap
According to rdo testing https://review.rdoproject.org/r/#/c/18721
a check on the output of the ceph_health value is added to
allow the playbook to make several attempts (according to the
retry/delay variables) when waiting for the cluster quorum or
when the container bootstrap has not finished.
It avoids the command execution failing when it doesn't
receive a valid json object to decode (because the cluster is too
slow to bootstrap compared to the ceph-ansible task execution).

Signed-off-by: fpantano <fpantano@redhat.com>
2019-04-03 20:55:05 +00:00
Dimitri Savineau 7e5e4229b7 ceph-volume: Add PYTHONIOENCODING env variable
Since https://github.com/ceph/ceph/commit/77912c0 ceph-volume uses
stdout encoding based on the LC_CTYPE and PYTHONIOENCODING environment
variables.
Those variables aren't set when using ansible.
This currently breaks non-containerized deployments on Ubuntu.

TASK [use ceph-volume to create bluestore osds] ********************
  cmd:
  - ceph-volume
  - --cluster
  - ceph
  - lvm
  - create
  - --bluestore
  - --data
  - /dev/sdb
  rc: 1
  stderr: |-
    Traceback (most recent call last):
    (...)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in
    position 132: ordinal not in range(128)

Note that the task fails on the ansible side due to the stdout
decoding, but the osd creation is successful.
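A sketch of the kind of fix described above, exporting a UTF-8 capable
stdout encoding for the ceph-volume call (the task is shortened for
illustration):

```
- name: use ceph-volume to create bluestore osds
  command: "ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb"
  environment:
    PYTHONIOENCODING: utf-8
```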

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-02 12:41:55 +02:00
Rishabh Dave 4241b6403f merge task blocks if their execution is based on same conditions
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-03-29 16:16:04 +00:00
Rishabh Dave e0beaf123a "when" keyword should precede "block" keyword
Otherwise the reader is forced to search for "when" when blocks are too
long.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-03-29 16:16:04 +00:00
Guillaume Abrioux f55e2b08be remove all NBSPs on master branch
Similar to #3658

Since there are too many changes between the master and stable branches,
let's commit directly in each branch instead of trying to backport this
commit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-28 11:57:55 +00:00
Dimitri Savineau 40a8e1160c container: Add python3-docker on Ubuntu bionic
When installing python-minimal on Ubuntu bionic, this will add the
/usr/bin/python symlink to the default python interpreter.
On bionic, this isn't python2 but python3.

$ /usr/bin/python --version
Python 3.6.7

The python docker library is only installed for python2, which causes
issues when running the purge-docker-cluster playbook. This playbook
uses the ansible docker modules and requires the python bindings to be
installed on the remote host.
Without the bindings we can see python errors reported by the docker
module.

msg: Failed to import docker or docker-py - No module named 'docker'.
Try `pip install docker` or `pip install docker-py` (Python 2.6)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-28 08:03:58 +00:00
Guillaume Abrioux 6f47c20c3a rgw: fix a typo
ee2d52d33d introduced a typo.
This commit fixes it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 3c4f464c54 rgw: cleanup legacy task
this task was here for backward compatibility.
It's time to remove it in the next release.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 9134624578 rgw: add a retry on pool related tasks
Sometimes those tasks might fail because of a timeout.
I've been facing this several times in the CI; adding this retry might
help and won't hurt in any case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux f6e0185146 update: add containerized deployment upgrade support (L->N)
Add a couple of fixes to allow containerized deployments upgrade support
to upgrade from luminous/mimic to nautilus.

- pass CEPH_CONTAINER_IMAGE and CEPH_CONTAINER_BINARY environment
variable to the ceph_key module,
- fix the docker exec command in 'waiting for the containerized monitor
to join the quorum' task according to the `delegate_to` parameter,
- override `docker_exec_cmd` in `ceph-facts` with `mon_host` when
rolling_update is `True`,
- do not run unnecessarily `create_mds_filesystems.yml` when performing an
upgrade.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 7386249c71 facts: retrieve fsid during rolling_update playbook
Otherwise it generates a new cluster fsid and makes the upgrade fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 5c3ce4ca77 mon: fetch initial keyring even when running rolling_update
Otherwise, the task to copy the mgr keyring fails during the rolling_update.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux afdaa70a63 update: enable msgr2 protocol
This commit enables the msgr2 protocol when the cluster is fully upgraded
to nautilus.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux 82764afe8d update: mask systemd service units during upgrade
This prevents the packaging from restarting services before we need
to restart them in the rolling update sequence.
We want to handle service restarts in the rolling_update playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux b4f14aba8e ceph_key: `lookup_ceph_initial_entities` shouldn't fail on update
As of nautilus, the initial keyrings list has changed. It means that when
upgrading from Luminous or Mimic, it is expected there's a mismatch
between what is found on the cluster and the expected initial keyring
list hardcoded in the ceph_key module. We shouldn't fail when upgrading to
nautilus.

str_to_bool() is taken from ceph-volume.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-Authored-by: Alfredo Deza <adeza@redhat.com>
2019-03-25 16:02:56 -04:00
Guillaume Abrioux e99305c684 handlers: do not trigger handlers on rolling_update
The rolling_update playbook already takes care of stopping/starting
services during the sequence. There's no need to trigger potentially
unwanted service restarts.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-25 16:02:56 -04:00
Dimitri Savineau 179fdfbc19 ceph-osd: Ensure lvm2 is installed
When using the lvm osd_scenario, we never check if the lvm2 package is
present on the host.
When using a containerized deployment with docker on CentOS/RedHat this
package will be automatically installed as a dependency, but not on
Ubuntu.
OSDs deployed via ceph-volume require the lvmetad.socket to be active
and running.

Resolves: #3728

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-20 22:26:45 +00:00
Guillaume Abrioux 987bdac963 osd: backward compatibility with old disk_list.sh location
Since all files in the container image have moved to `/opt/ceph-container`,
this check must look for the new AND the old path so it's backward
compatible. Otherwise it could end up templating an inconsistent
`ceph-osd-run.sh`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-18 17:25:51 +00:00
Dimitri Savineau 5c39735be5 ceph-validate: fail if there's no ipaddr available in monitor_address_block subnet
When using monitor_address_block to determine the ip address of the
monitor node, we need an ip address available in that cidr to be
present in the ansible facts (ansible_all_ipv[46]_addresses).
Currently we don't check if there's an ip address available during
the ceph-validate role.
As a result, the ceph-config role fails due to an empty list during
ceph.conf template creation but the error isn't explicit.

TASK [ceph-config : generate ceph.conf configuration file] *****
fatal: [0]: FAILED! => {"msg": "No first item, sequence was empty."}

With this patch we will fail before the ceph deployment with an
explicit failure message.
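A minimal sketch of such a pre-flight check (the real task wording differs;
the ipaddr filter needs the python netaddr library on the controller):

```
- name: fail if no ip address is available in monitor_address_block
  fail:
    msg: "No IP address from {{ monitor_address_block }} found in the ansible facts"
  when:
    - monitor_address_block is defined
    - ansible_all_ipv4_addresses | ipaddr(monitor_address_block) | length == 0
```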

Resolves: rhbz#1673687

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-18 16:35:36 +00:00
Dimitri Savineau a7b1e35a16 ceph-common: Install yum plugin priorities
When using the community repository we need to set the priority on the
ceph repositories because we could have some conflicts with EPEL
packages.
In order to set the priority on the ceph repositories, we need to
install the yum-plugin-priorities package.

http://docs.ceph.com/docs/master/install/get-packages/#rpm-packages

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-16 06:24:55 +00:00
wumingqiao 31617afca9 ceph-mgr: run mgr_modules.yml only on the first mgr host
the task will be delegated to mons[0] for all mgr hosts, so we can just run it on the first host and have the same effect.

Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
2019-03-14 20:16:33 +00:00
Dimitri Savineau d8538ad4e1 Set the default crush rule in ceph.conf
Currently the default crush rule value is added to the ceph config
on the mon nodes as an extra configuration applied after the template
generation via the ansible ini module.

This implies two behaviors:

1/ On each ceph-ansible run, the ceph.conf will be regenerated via
ceph-config+template and then ceph-mon+ini_file. This leads to an
unnecessary daemon restart.

2/ When other ceph daemons are collocated on the monitor nodes
(like mgr or rgw), the default crush rule value will be erased by
the ceph.conf template (mon -> mgr -> rgw).

This patch adds the osd_pool_default_crush_rule config to the ceph
template and only for the monitor nodes (like crush_rules.yml).
The default crush rule id is read (if it exists) from the current ceph
configuration.
The default configuration is -1 (ceph default).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1638092

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-14 08:56:52 +00:00
Dimitri Savineau b7f4e3e7c7 ceph-osd: Install numactl package when needed
With 3e32dce we can run OSD containers with numactl support.
When using the numactl command in a containerized deployment we need to
be sure that the corresponding package is installed on the host.
The package installation is only executed when the
ceph_osd_numactl_opts variable isn't empty.
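A sketch of the conditional installation described above:

```
- name: install numactl when osd numactl options are set
  package:
    name: numactl
    state: present
  when: ceph_osd_numactl_opts | default('') | length > 0
```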

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-12 07:43:06 +00:00
Guillaume Abrioux b3eb9206fa osd: support numactl options on OSD activate
This commit adds numactl support to OSD container activation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1684146

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-11 10:14:50 +01:00
Dimitri Savineau a089e1ec23 systemd/service: Set docker.service conditionally
We don't need to set After=docker.service when the container_binary
variable isn't set to docker.
It doesn't break anything currently but it could be confusing when
using podman.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 20:56:11 +00:00
Dimitri Savineau d6e71d769c common: Use rhsm_repository module for RHCS
Instead of using subscription-manager with the command module, we can use
the rhsm_repository ansible module.
This module already uses the repos list feature to determine whether a
repository is enabled or not. That way the module is idempotent, so
we don't need changed_when: false anymore.
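A sketch of what a task based on the rhsm_repository module looks like
(the repository name is illustrative):

```
- name: enable red hat ceph storage monitor repository
  rhsm_repository:
    name: rhel-7-server-rhceph-3-mon-rpms
    state: enabled
```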

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 19:15:42 +00:00
Dimitri Savineau 53514a5b50 common: Add noarch to community repository
The ceph stable community repository only enables the basearch
packages url.
Add the noarch url because, starting with the nautilus release, some
packages are added there that are useful for mgr or grafana.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-06 00:25:11 +00:00
Dimitri Savineau 4d32ecc980 Force osd pool min_size value to integer
After b8d580b and e9e5d5a we could have either item.min_size or
osd_pool_default_min_size be a string instead of an int, causing the
condition to be true when it should be false.
As a result, the task could try to set the pool min_size value to
0, which leads to:

Error EINVAL: pool min_size must be between 1 and 1
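A standalone illustration of the pitfall: string comparison is lexical, so
the `| int` cast is needed for a numeric comparison:

```
- debug:
    msg: "as strings: {{ '2' > '10' }} / as integers: {{ '2' | int > '10' | int }}"
  # prints "as strings: True / as integers: False"
```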

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 19:48:09 +00:00
Dimitri Savineau cb381b41fe Add CONTAINER_IMAGE env var to ceph daemons
Ceph daemons will set the CONTAINER_IMAGE environment variable value
in the daemon metadata.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 15:07:05 +00:00
Guillaume Abrioux e9e5d5a39a fix pool min_size customization
b8d580b3f4 introduced a bug when
`min_size` isn't set (defaults to 0).

Typical error:

```
Error EINVAL: pool min_size must be between 1 and 1
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-05 13:29:34 +00:00
Radu Toader b8d580b3f4 Customize pools min_size
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 10:57:15 +00:00
Radu Toader 2048255f61 When creating a pool, read pool.application and call `ceph osd pool application enable`
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 09:16:03 +00:00
Kevin Coakley b11dc13476 Updated 7 ansible-lint issues in the ceph-mon, ceph-osd, and ceph-rgw roles
The following lint issues have been resolved:

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2

[305] Use shell only when shell functionality is required
/home/travis/build/ceph/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:47

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:2

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:7

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:14

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:19

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:24

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-03-04 22:25:35 +00:00
Guillaume Abrioux 359f8a9a4a nfs: fix systemd template service for ubuntu
`mkdir` is located in `/bin` on Ubuntu.
Let's use some jinja to support Ubuntu.
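One way to express that jinja switch, shown here as a fact rather than
inline in the unit template (the variable name is illustrative):

```
- name: pick the mkdir path for the ganesha systemd unit
  set_fact:
    nfs_mkdir_bin: "{{ '/bin/mkdir' if ansible_distribution == 'Ubuntu' else '/usr/bin/mkdir' }}"
```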

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 19:54:25 +00:00
Dimitri Savineau 45a7082712 lint: Fix spaces before and after variables
ansible-lint reports:

[206] Variables should have spaces after {{ and before }}

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-01 17:22:24 +00:00
VasishtaShastry 34c25ef49b Extends check_devices tasks to non-collocated and lvm-batch scenarios
Tuned the name of a task and the error message to make them more understandable for users

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
2019-03-01 02:13:51 +00:00
Kevin Coakley 038401fef2 Add changed_when: false to the "get osd ids" statement
The "get osd ids" statement only registers the osd_ids_non_container variable. Running "ls /var/lib/ceph/osd/ | sed 's/.*-//'" should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-28 22:46:19 +00:00
ToprHarley 573adce7dd Convert interface names to underscores
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881

Signed-off-by: Tomas Petr <tpetr@redhat.com>
2019-02-28 17:07:34 +00:00
Guillaume Abrioux d5be83e504 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 13:14:09 +00:00
fpantano 21fad7ced3 Removed not needed mountpoint and removed ubuntu section
Referring to BZ#1683290, as dsavineau suggests: since this bug is
tripleO specific, removed the ubuntu section and the useless
mountpoints.

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
fpantano 0c1944236b Added to the ceph-radosgw service template the ca-trust
volume, avoiding the exposure of useless information.
This bug refers to the following bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
Dimitri Savineau 58a9d310d5 mon: Move client admin variable to defaults
There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Dimitri Savineau dd7b7604de mon: Add mds permissions to client.admin
The administrator keyring needs full capabilities on mds, like for mon,
osd and mgr.
Without this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Guillaume Abrioux 4ab02d2cd1 tests: set ceph_origin and ceph_repository for non_container-collocation
those variables are mandatory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Guillaume Abrioux f68ad10bc9 mon: do not create unnecessarily mgr keyrings
there's no need to generate mgr keyrings 'mgr.monX' when mgrs aren't
collocated with monitors.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Kevin Coakley d327681b99 Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Set directories to 755 and files to 644 in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively, instead of setting files and directories to 755 recursively. The ceph mon process writes files to this path with permissions 644. This update stops ansible from updating the permissions in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes a file, and increases idempotency.
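A sketch of the recursive permission handling with a symbolic mode; `X` only
adds execute on directories (and already-executable files), so the 644 files
written by ceph-mon are left alone:

```
- name: set mon directory ownership and permissions
  file:
    path: "/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}"
    state: directory
    owner: ceph
    group: ceph
    mode: "u=rwX,g=rX,o=rX"
    recurse: true
```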

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-27 10:48:19 +00:00
Dimitri Savineau dc1c0dcee2 ceph-osd: Drop memory flag with bluestore
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-26 07:27:06 +00:00
Guillaume Abrioux 8f42007272 facts: fix auto_discovery exclude
The previous approach was wrong.
Checking if `item.key` is in `osd_auto_discovery_exclude` (`['dm-',
'loop']`) is incorrect because it will obviously never match. Therefore,
the condition returned `True` whatever device we were checking.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 03:16:33 +00:00
Guillaume Abrioux 83d7ef777e osd: add possibility to exclude device in osd_auto_discovery
Add a new `osd_auto_discovery_exclude` variable to give the possibility of
excluding some devices in the auto_discovery scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-25 10:05:34 +00:00
Dimitri Savineau b7338d438a ceph-infra: Remove restart firewalld handler
There's no need to restart the firewalld service when a new rule is
added, thanks to the usage of the immediate flag.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-22 18:15:47 +00:00
Guillaume Abrioux 2b60a35634 common: do not override ceph_release when ceph_repository is 'rhcs'
We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-21 11:58:49 +01:00
Guillaume Abrioux 21e5db8982 osd: make the 'wait for all osd to be up' task configurable
Introduce two new variables to make the 'wait for all osd to be up'
check configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.
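A sketch of the now-configurable retry loop; the two variable names below
are placeholders rather than the ones introduced by this commit, and the
JSON keys of `ceph osd stat` are assumed:

```
- name: wait for all osd to be up
  command: "ceph --cluster {{ cluster }} osd stat --format json"
  register: osd_stat
  changed_when: false
  retries: "{{ wait_for_osds_up_retries | default(60) }}"
  delay: "{{ wait_for_osds_up_delay | default(10) }}"
  until: (osd_stat.stdout | from_json).num_up_osds | int == (osd_stat.stdout | from_json).num_osds | int
```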

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-20 16:06:04 +00:00
Patrick Donnelly ed40c5237d delegate key creation to first mon
Otherwise keys get scattered over the mons and the mgr key is not copied properly.

With ansible_inventory:

    [mdss]
            mds-000 ansible_ssh_host=192.168.129.110 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [clients]
            client-000 ansible_ssh_host=192.168.143.94 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [mgrs]
            mgr-000 ansible_ssh_host=192.168.222.195 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
    [mons]
            mon-000 ansible_ssh_host=192.168.139.173 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.139.173
            mon-002 ansible_ssh_host=192.168.212.114 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.212.114
            mon-001 ansible_ssh_host=192.168.167.177 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa' monitor_address=192.168.167.177
    [osds]
            osd-001 ansible_ssh_host=192.168.178.128 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
            osd-000 ansible_ssh_host=192.168.138.233 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'
            osd-002 ansible_ssh_host=192.168.197.23 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_private_key_file='/root/.ssh/id_rsa'

We get this failure:

    TASK [ceph-mon : include_tasks ceph_keys.yml] **********************************************************************************************************************************************************************
    included: /root/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml for mon-000, mon-002, mon-001

    TASK [ceph-mon : waiting for the monitor(s) to form the quorum...] *************************************************************************************************************************************************
    changed: [mon-000] => {
        "attempts": 1,
        "changed": true,
        "cmd": [
            "ceph",
            "--cluster",
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li1166-30/keyring",
            "mon_status",
            "--format",
            "json"
        ],
        "delta": "0:00:01.897397",
        "end": "2019-02-14 17:08:09.340534",
        "rc": 0,
        "start": "2019-02-14 17:08:07.443137"
    }

    STDOUT:

    {"name":"li1166-30","rank":0,"state":"leader","election_epoch":4,"quorum":[0,1,2],"quorum_age":0,"features":{"required_con":"2449958747315912708","required_mon":["kraken","luminous","mimic","osdmap-prune","nautilus"],"quorum_con":"4611087854031667199","quorum_mon":["kraken","luminous","mimic","osdmap-prune","nautilus"]},"outside_quorum":[],"extra_probe_peers":[{"addrvec":[{"type":"v2","addr":"192.168.167.177:3300","nonce":0},{"type":"v1","addr":"192.168.167.177:6789","nonce":0}]},{"addrvec":[{"type":"v2","addr":"192.168.212.114:3300","nonce":0},{"type":"v1","addr":"192.168.212.114:6789","nonce":0}]}],"sync_provider":[],"monmap":{"epoch":1,"fsid":"bb401e2a-c524-428e-bba9-8977bc96f04b","modified":"2019-02-14 17:08:05.012133","created":"2019-02-14 17:08:05.012133","features":{"persistent":["kraken","luminous","mimic","osdmap-prune","nautilus"],"optional":[]},"mons":[{"rank":0,"name":"li1166-30","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.139.173:3300","nonce":0},{"type":"v1","addr":"192.168.139.173:6789","nonce":0}]},"addr":"192.168.139.173:6789/0","public_addr":"192.168.139.173:6789/0"},{"rank":1,"name":"li985-128","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.167.177:3300","nonce":0},{"type":"v1","addr":"192.168.167.177:6789","nonce":0}]},"addr":"192.168.167.177:6789/0","public_addr":"192.168.167.177:6789/0"},{"rank":2,"name":"li895-17","public_addrs":{"addrvec":[{"type":"v2","addr":"192.168.212.114:3300","nonce":0},{"type":"v1","addr":"192.168.212.114:6789","nonce":0}]},"addr":"192.168.212.114:6789/0","public_addr":"192.168.212.114:6789/0"}]},"feature_map":{"mon":[{"features":"0x3ffddff8ffacffff","release":"luminous","num":1}],"client":[{"features":"0x3ffddff8ffacffff","release":"luminous","num":1}]}}

    TASK [ceph-mon : fetch ceph initial keys] **************************************************************************************************************************************************************************
    changed: [mon-001] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li985-128/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.179584",
        "end": "2019-02-14 17:08:14.305348",
        "rc": 0,
        "start": "2019-02-14 17:08:11.125764"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw
    changed: [mon-002] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li895-17/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.706169",
        "end": "2019-02-14 17:08:14.041698",
        "rc": 0,
        "start": "2019-02-14 17:08:10.335529"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw
    changed: [mon-000] => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "mon.",
            "-k",
            "/var/lib/ceph/mon/ceph-li1166-30/keyring",
            "--cluster",
            "ceph",
            "auth",
            "get",
            "client.bootstrap-rgw",
            "-f",
            "plain",
            "-o",
            "/var/lib/ceph/bootstrap-rgw/ceph.keyring"
        ],
        "delta": "0:00:03.916467",
        "end": "2019-02-14 17:08:13.803999",
        "rc": 0,
        "start": "2019-02-14 17:08:09.887532"
    }

    STDERR:

    exported keyring for client.bootstrap-rgw

    TASK [ceph-mon : create ceph mgr keyring(s)] ***********************************************************************************************************************************************************************
    skipping: [mon-000] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-000)  => {
        "changed": false,
        "item": "mon-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-002)  => {
        "changed": false,
        "item": "mon-002",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=mon-001)  => {
        "changed": false,
        "item": "mon-001",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-000)  => {
        "changed": false,
        "item": "mon-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-002)  => {
        "changed": false,
        "item": "mon-002",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mon-001)  => {
        "changed": false,
        "item": "mon-001",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=mgr-000) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li547-145.keyring"
        ],
        "delta": "0:00:05.822460",
        "end": "2019-02-14 17:08:21.422810",
        "item": "mgr-000",
        "rc": 0,
        "start": "2019-02-14 17:08:15.600350"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-000) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li1166-30.keyring"
        ],
        "delta": "0:00:05.814039",
        "end": "2019-02-14 17:08:27.663745",
        "item": "mon-000",
        "rc": 0,
        "start": "2019-02-14 17:08:21.849706"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-002) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li895-17.keyring"
        ],
        "delta": "0:00:05.787291",
        "end": "2019-02-14 17:08:33.921243",
        "item": "mon-002",
        "rc": 0,
        "start": "2019-02-14 17:08:28.133952"
    }

    STDERR:

    imported keyring
    changed: [mon-001] => (item=mon-001) => {
        "changed": true,
        "cmd": [
            "ceph",
            "-n",
            "client.admin",
            "-k",
            "/etc/ceph/ceph.client.admin.keyring",
            "--cluster",
            "ceph",
            "auth",
            "import",
            "-i",
            "/etc/ceph//ceph.mgr.li985-128.keyring"
        ],
        "delta": "0:00:05.782064",
        "end": "2019-02-14 17:08:40.138706",
        "item": "mon-001",
        "rc": 0,
        "start": "2019-02-14 17:08:34.356642"
    }

    STDERR:

    imported keyring

    TASK [ceph-mon : copy ceph mgr key(s) to the ansible server] *******************************************************************************************************************************************************
    skipping: [mon-000] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=mgr-000)  => {
        "changed": false,
        "item": "mgr-000",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=mgr-000) => {
        "changed": true,
        "checksum": "aa0fa40225c9e09d67fe7700ce9d033f91d46474",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/etc/ceph/ceph.mgr.li547-145.keyring",
        "item": "mgr-000",
        "md5sum": "cd884fb9ddc9b8b4e3cd1ad6a98fb531",
        "remote_checksum": "aa0fa40225c9e09d67fe7700ce9d033f91d46474",
        "remote_md5sum": null
    }

    TASK [ceph-mon : copy keys to the ansible server] ******************************************************************************************************************************************************************
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-000] => (item=/etc/ceph/ceph.client.admin.keyring)  => {
        "changed": false,
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring)  => {
        "changed": false,
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => (item=/etc/ceph/ceph.client.admin.keyring)  => {
        "changed": false,
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "skip_reason": "Conditional result was False"
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-osd/ceph.keyring) => {
        "changed": true,
        "checksum": "095c7868a080b4c53494335d3a2223abbad12605",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-osd/ceph.keyring",
        "md5sum": "d8f4c4fa564aade81b844e3d92c7cac6",
        "remote_checksum": "095c7868a080b4c53494335d3a2223abbad12605",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rgw/ceph.keyring) => {
        "changed": true,
        "checksum": "ce7a2d4441626f22e995b37d5131b9e768f18494",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rgw/ceph.keyring",
        "md5sum": "271e4f90c5853c74264b6b749650c3f2",
        "remote_checksum": "ce7a2d4441626f22e995b37d5131b9e768f18494",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-mds/ceph.keyring) => {
        "changed": true,
        "checksum": "e35e8613076382dd3c9d89b5bc2090e37871aab7",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-mds/ceph.keyring",
        "md5sum": "ed7c32277914c8e34ad5c532d8293dd2",
        "remote_checksum": "e35e8613076382dd3c9d89b5bc2090e37871aab7",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rbd/ceph.keyring) => {
        "changed": true,
        "checksum": "ac43101ad249f6b6bb07ceb3287a3693aeae7f6c",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rbd/ceph.keyring",
        "md5sum": "1460e3c9532b0b7b3a5cb329d77342cd",
        "remote_checksum": "ac43101ad249f6b6bb07ceb3287a3693aeae7f6c",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring) => {
        "changed": true,
        "checksum": "01d74751810f5da621937b10c83d47fc7f1865c5",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "item": "/var/lib/ceph/bootstrap-rbd-mirror/ceph.keyring",
        "md5sum": "979987f10fd7da5cff67e665f54bfe4d",
        "remote_checksum": "01d74751810f5da621937b10c83d47fc7f1865c5",
        "remote_md5sum": null
    }
    changed: [mon-001] => (item=/etc/ceph/ceph.client.admin.keyring) => {
        "changed": true,
        "checksum": "482f702cf861b41021d76de655ecf996fe9a4a4a",
        "dest": "/root/ceph-ansible/fetch/bb401e2a-c524-428e-bba9-8977bc96f04b/etc/ceph/ceph.client.admin.keyring",
        "item": "/etc/ceph/ceph.client.admin.keyring",
        "md5sum": "7581c187044fd4e0f7a5440244a6b306",
        "remote_checksum": "482f702cf861b41021d76de655ecf996fe9a4a4a",
        "remote_md5sum": null
    }

    TASK [ceph-mon : include secure_cluster.yml] ***********************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mon : crush_rules.yml] **********************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : set_fact docker_exec_cmd] *************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : include common.yml] *******************************************************************************************************************************************************************************
    included: /root/ceph-ansible/roles/ceph-mgr/tasks/common.yml for mon-000, mon-002, mon-001

    TASK [ceph-mgr : create mgr directory] *****************************************************************************************************************************************************************************
    changed: [mon-000] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li1166-30",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }
    changed: [mon-002] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li895-17",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }
    changed: [mon-001] => {
        "changed": true,
        "gid": 167,
        "group": "ceph",
        "mode": "0755",
        "owner": "ceph",
        "path": "/var/lib/ceph/mgr/ceph-li985-128",
        "secontext": "unconfined_u:object_r:ceph_var_lib_t:s0",
        "size": 4096,
        "state": "directory",
        "uid": 167
    }

    TASK [ceph-mgr : fetch ceph mgr keyring] ***************************************************************************************************************************************************************************
    skipping: [mon-000] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-002] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }
    skipping: [mon-001] => {
        "changed": false,
        "skip_reason": "Conditional result was False"
    }

    TASK [ceph-mgr : copy ceph keyring(s) if needed] *******************************************************************************************************************************************************************
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-002] (item={'name': '/etc/ceph/ceph.mgr.li895-17.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li895-17/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li895-17/keyring",
            "name": "/etc/ceph/ceph.mgr.li895-17.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li895-17.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-002] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-001] (item={'name': '/etc/ceph/ceph.mgr.li985-128.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li985-128/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li985-128/keyring",
            "name": "/etc/ceph/ceph.mgr.li985-128.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li985-128.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-001] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }
    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
    failed: [mon-000] (item={'name': '/etc/ceph/ceph.mgr.li1166-30.keyring', 'dest': '/var/lib/ceph/mgr/ceph-li1166-30/keyring', 'copy_key': True}) => {
        "changed": false,
        "item": {
            "copy_key": true,
            "dest": "/var/lib/ceph/mgr/ceph-li1166-30/keyring",
            "name": "/etc/ceph/ceph.mgr.li1166-30.keyring"
        }
    }

    MSG:

    Could not find or access 'fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring'
    Searched in:
     /root/ceph-ansible/roles/ceph-mgr/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/roles/ceph-mgr/tasks/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/files/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring
     /root/ceph-ansible/fetch//bb401e2a-c524-428e-bba9-8977bc96f04b//etc/ceph/ceph.mgr.li1166-30.keyring on the Ansible Controller.
    If you are using a module and expect the file to exist on the remote, see the remote_src option
    skipping: [mon-000] => (item={'name': '/etc/ceph/ceph.client.admin.keyring', 'dest': '/etc/ceph/ceph.client.admin.keyring', 'copy_key': False})  => {
        "changed": false,
        "item": {
            "copy_key": false,
            "dest": "/etc/ceph/ceph.client.admin.keyring",
            "name": "/etc/ceph/ceph.client.admin.keyring"
        },
        "skip_reason": "Conditional result was False"
    }

    NO MORE HOSTS LEFT *************************************************************************************************************************************************************************************************
     to retry, use: --limit @/root/ceph-linode/linode.retry

    PLAY RECAP *********************************************************************************************************************************************************************************************************
    client-000                 : ok=30   changed=2    unreachable=0    failed=0
    mds-000                    : ok=32   changed=4    unreachable=0    failed=0
    mgr-000                    : ok=32   changed=4    unreachable=0    failed=0
    mon-000                    : ok=89   changed=21   unreachable=0    failed=1
    mon-001                    : ok=84   changed=20   unreachable=0    failed=1
    mon-002                    : ok=81   changed=17   unreachable=0    failed=1
    osd-000                    : ok=32   changed=4    unreachable=0    failed=0
    osd-001                    : ok=32   changed=4    unreachable=0    failed=0
    osd-002                    : ok=32   changed=4    unreachable=0    failed=0

Also, create all keys on the first mon and copy those to the other mons to be
consistent.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-02-20 11:19:44 +01:00
David Waiting 3930791cb7 ensure at least one osd is up
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.

The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).

This is related to Bugzilla 1578086.

In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.
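
A minimal sketch of such a check, assuming the usual ceph-ansible variables
(cluster, container_exec_cmd) and that `ceph osd stat -f json` exposes
num_osds/num_up_osds at the top level (the JSON layout can differ between
Ceph releases):

```
- name: wait for all osds to be up
  command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd stat -f json"
  register: osd_stat
  changed_when: false
  retries: 60
  delay: 10
  until:
    # at least one OSD must have been discovered ...
    - (osd_stat.stdout | from_json).num_osds | int > 0
    # ... and every discovered OSD must be up
    - (osd_stat.stdout | from_json).num_osds == (osd_stat.stdout | from_json).num_up_osds
```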

Signed-off-by: David Waiting <david_waiting@comcast.com>
2019-02-19 18:31:05 +00:00
Guillaume Abrioux c98fd0b9e0 facts: ensure ceph_uid is set when running rhel
when the host is running RHEL, let's enforce ceph_uid to 167.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux e7034402a4 container: fix tmpfiles.d ceph files
- fix uid/gid in ceph tmpfiles
- move file to `/etc/tmpfiles.d`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux d4e31b90a6 Revert "osd: container remove --pid=host"
This reverts commit bb2bbeb941.

Looks like when not passing `--pid=host` we are facing some issues when
deploying more than 2 OSDs in a containerized environment.

At the moment, we are still troubleshooting this issue but we prefer to
revert this commit so it doesn't block any PR in the CI.

As soon as we have a fix, we will push a new PR to remove `--pid=host`
(a revert of the revert...).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 500256cdab validate: fix ntp_daemon_type check in validate
is_atomic is defined in ceph-facts or very early in the main playbook.

In non-containerized deployments, is_atomic is only set in ceph-facts,
which is played after ceph-validate.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 76303b457c container: create ceph-common.conf tmpfiles.d if it doesn't exist
Otherwise the task will fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux b24202f6a4 facts: move two set_fact into ceph-facts
those two set_fact tasks should be moved to ceph-facts.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux 8c8ec63633 container: use tmpfiles.d to create /run/ceph
instead of using the `RuntimeDirectory` parameter in systemd unit files,
let's use a systemd `tmpfiles.d` configuration to ensure `/run/ceph` exists.

Explanation:

`podman` doesn't create `/var/run/ceph` at container run time if it
doesn't already exist, whereas `docker` used to create it.
In the `switch_to_containers` scenario, `/run/ceph` gets created by
a tmpfiles.d systemd file; when switching to containers, the systemd
unit file complains because `/run/ceph` already exists.

The better fix would be to ensure `/usr/lib/tmpfiles.d/ceph-common.conf`
is removed and rely only on the `RuntimeDirectory` systemd unit parameter,
but we come from a non-containerized environment that is already running,
which means `/run/ceph` already exists. When starting the unit for the
container, systemd will still complain, and we can't simply
remove the directory if daemons are collocated.
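
A minimal sketch of the tmpfiles.d approach, assuming a Red Hat based host
where the ceph uid/gid is 167 (ceph-ansible derives the real value from
`ceph_uid`):

```
# write the tmpfiles.d entry so /run/ceph is recreated at every boot
- name: create tmpfiles.d entry for /run/ceph
  copy:
    dest: /etc/tmpfiles.d/ceph-common.conf
    content: "d /run/ceph 0770 167 167 -\n"
    owner: root
    group: root
    mode: "0644"

# apply it immediately instead of waiting for the next reboot
- name: create /run/ceph now
  command: systemd-tmpfiles --create /etc/tmpfiles.d/ceph-common.conf
  changed_when: false
```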

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 7e0a70f7a8 switch_to_containers: do not try to redeploy monitors
`ceph-mon` tries to redeploy monitors because it assumes they were not yet
deployed, since `mon_socket_stat` and `ceph_mon_container_stat` are
undefined (indeed, we stop the daemon before calling `ceph-mon` in the
switch_to_containers playbook).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Rishabh Dave 05ea783eff fix mistake in task that aborts when ntpd is chosen on Atomic
Since it's already confusing whether ntp_daemon_type should be "ntp" or
"ntpd", fix the mistake in the title of the task that aborts if
ntp_daemon_type is set to "ntpd" and the OS being used is Atomic.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-02-12 09:09:27 +01:00
Rishabh Dave bdff3e48fd don't install NTPd on Atomic
Since Atomic doesn't allow any installations and NTPd is not present
on the Atomic image we are using, abort when ntp_daemon_type is set to ntpd.

https://github.com/ceph/ceph-ansible/issues/3572
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-02-11 12:02:30 +01:00
Sébastien Han c69c8c9ac1 mon: do not hardcode ceph uid
167 is the ceph uid on Red Hat based systems, thus trying to deploy a
monitor on Debian fails since the ceph user id on that system is 64045.
This commit uses the ceph_uid variable which contains the right uid
based on system/container detection.
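
A minimal sketch of the idea, assuming only the distro family matters (the
real detection in ceph-facts also accounts for containers and image flavour):

```
- name: set ceph_uid
  set_fact:
    # 167 is the packaged ceph uid on Red Hat based systems,
    # 64045 on Debian based ones
    ceph_uid: "{{ 167 if ansible_os_family == 'RedHat' else 64045 }}"
```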

Closes: https://github.com/ceph/ceph-ansible/issues/3589
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-02-11 09:09:40 +00:00
Leah Neukirchen 4fe7f37849 Fix uses of default(omit) with string concatenation
When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.

Regression introduced in d53f55e807.
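
An illustrative before/after with hypothetical variable names (data_device,
extra_args):

```
# broken: when extra_args is undefined the command ends with
# __omit_place_holder__<hash> instead of nothing
- command: "ceph-volume lvm create --data {{ data_device }} {{ extra_args | default(omit) }}"

# fixed: default to an empty string when the value is only ever
# concatenated into a larger string
- command: "ceph-volume lvm create --data {{ data_device }} {{ extra_args | default('') }}"
```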

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
2019-02-08 16:18:15 +00:00
Patrick C. F. Ernzer c605ff6a68 setup_ntp: call handler to disable ntpd if chronyd used
The 'setup chronyd' task called the 'disable chronyd' handler, which of
course defeats the purpose.

Changing the task to disable ntpd instead fixes the issue of chronyd
being disabled after it got enabled.

Fixes: #3582

Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com
2019-02-08 12:04:44 +01:00
Guillaume Abrioux d4b3c1d409 iscsi-gws: remove a leftover
remove a leftover introduced by 9d590f4

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-08 01:11:42 +01:00
Guillaume Abrioux 9d590f4339 iscsi: fix permission denied error
Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`,

since it's already set in the main playbook (site.yml/site-container.yml).

The thing is that the files get generated in the 'fetch_directory' as the
root user because there is a 'delegate_to' and we run the playbook with
`become: True` (from the main playbook).

The idea here is to create the files as the ansible user so we can read
them later when copying them to the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-07 17:57:22 +01:00
Sébastien Han bb2bbeb941 osd: container remove --pid=host
Let's try again with the Nautilus release.

Closes: https://github.com/ceph/ceph-ansible/issues/1297
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-02-07 12:13:51 +00:00
John Fulton cc0bf197e1 Fix CNI error when net=host is not used on OSD calls
Follow-up fix for a case that 410abd7 missed.

Related: ceph#3561

Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 22:49:01 +00:00
John Fulton 719a25b571 Create Ceph Initial Dirs earlier
Include tasks from create_ceph_initial_dirs earlier in the ceph-config
role.

Fixes: #3568
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 18:38:05 +00:00
John Fulton dab3f6ee3f Fix CNI error when net=host is not used in some podman calls
With 'podman version 1.0.0' on RHEL8 beta the 'get ceph version' and
'ceph monitor mkfs' commands fail [1] with "error configuring network
namespace for container Missing CNI default network".

When net=host is added these errors are resolved. net=host is used in
many other calls (grep -R net=host | wc -l --> 38).
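
A hedged sketch of the kind of short-lived call involved; the exact
'get ceph version' task differs, and the image variables are the usual
ceph-ansible ones:

```
- name: get ceph version
  command: >
    {{ container_binary }} run --rm --net=host
    --entrypoint /usr/bin/ceph
    {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
    --version
  register: ceph_version
  changed_when: false
```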

Fixes: #3561
Signed-off-by: John Fulton <fulton@redhat.com>
(cherry picked from commit 410abd7745)
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 914d94cae8 set RuntimeDirectory in all systemd unit templates
/var/run/ceph resides in a non-persistent filesystem (tmpfs).
After a reboot, none of the daemons will start because this directory
will be missing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 7ade032807 osd: bind mount /var/run/udev/
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux fdca29f2a7 facts: set timeout_command fact in ceph-defaults
- also add `--foreground` which seems to fix some issue we are facing when
using timeout with `podman`.
- use this fact in the `is ceph running already?` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 16efdbc59b podman: support podman installation on rhel8
Add required changes to support podman on rhel8

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1667101

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
John Fulton 37b5d1084a Make python print statements python3 compatible
The restart_osd_daemon.sh generated from the j2 template
contains a python call which uses 'print x' instead of
'print(x)'. Add the missing parentheses to make this call
compatible with both Python 2 and 3.

Also add parentheses to other python print calls found
in roles/ceph-client/defaults/main.yml and
infrastructure-playbooks/cluster-os-migration.yml.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1671721
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-01 15:23:27 +00:00
Andrew Schoen 70a4368bc5 ceph-config: do not always assume containers when calculating num_osds
CEPH_CONTAINER_IMAGE should be None if containerized_deployment
is False.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 88eda479a9 ceph-facts: generate devices when osd_auto_discovery is true
This task used to live in ceph-osd, but we need it defined here so that
ceph-config can use it when trying to determine the number of osds.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
John Fulton cba9b23363 Do not timeout podman/docker pull if timeout value is '0'
If the user sets "docker_pull_timeout: '0'" then do not use the
timeout command when running podman/docker pull. Also, use
"timeout -s KILL"; without KILL, podman on RHEL8 beta does
not time out and the deployment can hang.
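
A hedged sketch of the behaviour described above; variable names follow the
usual ceph-ansible ones, but the real task is more involved:

```
- name: pull the ceph container image
  command: >
    {{ ('timeout -s KILL ' ~ docker_pull_timeout ~ ' ') if docker_pull_timeout != '0' else '' }}
    {{ container_binary }} pull
    {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
  register: docker_image
  until: docker_image.rc == 0
  retries: 3
  delay: 10
  changed_when: false
```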

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1670625
Signed-off-by: John Fulton <fulton@redhat.com>
2019-01-31 15:35:09 +00:00
Guillaume Abrioux fe1528adb4 config: support num_osds fact setting in containerized deployment
This part of the code must be supported in containerized deployment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664112

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 14:30:23 +00:00
Ramana Raja dfff89ce67 Install nfs-ganesha stable v2.7
nfs-ganesha v2.5 and 2.6 have hit EOL. Install the stable nfs-ganesha
v2.7, which is currently being maintained.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-01-30 14:57:26 +01:00
Guillaume Abrioux 9f16501747 common: clean monitor initial keyring code
let's not be blocked by the fact we don't have the initial keyring in
`{{ fetch_directory }}`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 10:36:02 +01:00
Guillaume Abrioux f8aa8cdf60 facts: clean fsid generation code
clean some leftover and duplicate code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 10:36:02 +01:00
Patrick Donnelly 080ce7dd72 do not set ceph_release to dummy
When we're deploying a dev branch.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-01-29 16:04:44 +00:00
Zack Cerza 82897c76fb Fix ceph.conf generation
877979c78 in #3486 broke ceph.conf generation entirely. Remove the stray
curly brace.

Signed-off-by: Zack Cerza <zack@redhat.com>
2019-01-25 09:19:24 +01:00
Guillaume Abrioux 0bfefdd5bc override ceph_release with ceph_stable_release
when `ceph_origin` is set to `'repository'` and `ceph_repository` to
`'community'`, we need to ensure `ceph_release` reflects
`ceph_stable_release`.

4a3f180f9d simply removed the override, while it should only be run when
the condition mentioned above is satisfied.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-22 15:53:32 +01:00
Guillaume Abrioux 877979c787 config: make sure ceph_release is set for all client node
`ceph_release` is set in `ceph-container-common` but this role is
played only on the first client node; this means ceph-config will fail
on all client nodes except the first one.

This commit ensures ceph_release is set for all client nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-22 13:45:38 +01:00
Sébastien Han 5babc1b4eb mon: enable msgr2
Enabling msgr2 style declaration for Nautilus and above. Prior releases
will keep the right syntax.
When upgrading from Mimic to Nautilus we must maintain something in the
form of:

mon_host = [v1:127.0.0.1:6789/0,v2:127.0.0.1:3300/0]

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-22 13:45:38 +01:00
Sébastien Han fc34fb1bd9 mon: ability to change mon listening port on container
You can now use 'ceph_mon_container_listen_port' to change the port the
monitor will listen on.
Setting the default to 3300 (assigned by IANA) since Nautilus has released the messenger2
transport protocol.

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-22 13:45:38 +01:00
Sébastien Han 41a7cc878c Revert "mon: force peer addition"
This reverts commit ee08d1f89a, which was mostly there to work around a
bug in ceph@master. Now that ceph@master is fixed, we are reverting
this. Thanks to https://github.com/ceph/ceph/pull/25900

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-22 13:45:38 +01:00
guihecheng 1ac94c048f rgw: add support for multiple rgw instances on a single host
With this, we can have multiple rgw instances on a single host in a
single run, and we no longer have to use rgw-standalone.yml, which does
not seem able to bind ports separately.
If you want multiple rgw instances, just change 'radosgw_instances'
(which defaults to 1) to the number you want.
Not compatible with Multi-Site yet.
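
For example, a hedged group_vars snippet (the per-instance port layout is
handled by the role):

```
# group_vars/rgws.yml
radosgw_instances: 2
```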

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
2019-01-18 11:12:28 +01:00
Guillaume Abrioux 1bbdde272f config: remove code related to ceph release prior to luminous
This part of the code is not needed since ceph-ansible@master is
intended to deploy ceph@master only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-14 14:41:13 +00:00
Guillaume Abrioux e9188cd202 ceph-default: rm useless condition
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-14 14:41:13 +00:00
Sébastien Han ee08d1f89a mon: force peer addition
Something changed with the introduction of msgr2 and we have to
add each node as a peer so the monitors can form a quorum. This might be
due to our CI environment, although adding this is completely harmless
and solves monitors not being able to form a quorum.

It seems that the initial monitor map didn't contain the right
information about the peers (addresses like 0.0.0.0/0r1 for each rank).

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-09 13:15:52 +01:00
Bruceforce 446f3c9fae nfs-ganesha: fixed nfs_ganesha_dev_apt_repo variable
The nfs_ganesha_dev_apt_repo variable was set incorrectly in the task
"fetch nfs-ganesha development repository".

Signed-off-by: Bruceforce <Bruceforce@users.noreply.github.com>
2019-01-05 16:04:05 +01:00
Sébastien Han 9abf9dba0b rgw: do not create mandatory directories
The packages are responsible for this, currently tracked in ceph https://github.com/ceph/ceph/pull/25503

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-04 13:57:40 +00:00
Sébastien Han 2af624dc5b rbd-mirror: copy bootstrap key after package install
If we don't copy the key after the package install, the directory
/var/lib/ceph/bootstrap-rbd-mirror will not exist and the copy will fail.

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-04 13:57:40 +00:00
Sébastien Han b1dfe3f03e config: only pre-create ceph dirs on containers
We don't need to create the directories on non-containers, they are
created by the packages.

Closes: https://github.com/ceph/ceph-ansible/issues/3430
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-04 13:57:40 +00:00
Rishabh Dave 6fa757d343 ceph-infra: disable unrequired NTP services
When one of the currently supported NTP services has been set up,
disable the rest of the NTP services on Ceph nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-04 14:01:05 +01:00
Rishabh Dave b03ab60742 ceph-infra: merge ntp_debian.yml and ntp_rpm.yml
Merge ntp_debian.yml and ntp_rpm.yml into one (the new file is called
setup_ntp.yml) since they are almost identical. Also avoid repetition
of the common setup step for ntpd and chronyd services.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-04 14:01:05 +01:00
Rishabh Dave a14bfa282a copy certificates as root user
Since the current user on the controller node might not have permission
to read the TLS certificate and related files, copy these files to the
Ceph nodes as the root user.

Fixes: https://github.com/ceph/ceph-ansible/issues/3465
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-03 09:49:06 +01:00
Kai Wembacher 1dd26f76bf document missing support for non-containerized deployment
Signed-off-by: Kai Wembacher <kai@ktwe.de>
2018-12-21 15:37:55 +00:00
jtudelag 23ad5fd9cb Clarify RGWs configuration when using ceph_conf_overrides.
To avoid future misconfigurations, clarify that the only valid
scheme is [client.rgw.*] instead of [client.radosgw.*].
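
An illustrative snippet; the instance name and frontend value are made up:

```
ceph_conf_overrides:
  # valid: follows the client.rgw.* scheme
  client.rgw.rgw0:
    rgw_frontends: "beast port=8080"
  # a client.radosgw.* section would not match the daemon name,
  # so its options would not be applied
```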
2018-12-20 13:55:03 +00:00
Kai Wembacher a273ed7f60 add support for rocksdb and wal on the same partition in non-collocated
Signed-off-by: Kai Wembacher <kai@ktwe.de>
2018-12-20 14:19:46 +01:00
Sébastien Han f99a875b7f lint: Remote package tasks should have a retry
Make linter happy and add more robustness to remote tasks by retrying 3
times (the default) before failing.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-20 11:06:09 +01:00
Guillaume Abrioux d7e77012ef retry on packages and repositories failures
add register/until to all packaging-related tasks to avoid invalid CI
failures.
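
A minimal sketch of the pattern; the package list variable is illustrative:

```
- name: install ceph packages
  package:
    name: "{{ ceph_pkgs }}"
    state: present
  register: result
  until: result is succeeded
  retries: 3
  delay: 10
```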

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-19 14:48:27 +00:00
Sébastien Han d9e7835086 mon: remove ceph aliases for containers
These aliases have led to several issues, making users believe that ceph
binaries are actually present on the host when running the command.
However, it wasn't explicit that the commands only ran inside a
container.
This has brought too much confusion, so we decided to remove them.

Closes: https://github.com/ceph/ceph-ansible/issues/3445
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-17 11:10:03 +01:00
Guillaume Abrioux 0eb56e36f8 introduce new role ceph-facts
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speed up
the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-12 11:18:01 +01:00
Guillaume Abrioux 1b8b5e0aac meta: set the right minimum ansible version required for galaxy
ceph-ansible@master requires the latest stable ansible version.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-11 09:59:25 +01:00
Sébastien Han 8e2585b6c7 ceph-defaults: do not use podman only on atomic
We want to test podman on f29 non-atomic; atomic is not a hard
requirement. However, if you want podman then you will have to
install it first before running the playbook.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-07 15:37:07 +01:00
Sébastien Han 51ca4f883b mgr: little refact
This commit removes the default module, so ceph-ansible does not enable
any manager module.
To enable a module, you need to set a value for 'ceph_mgr_modules'; you
can pass a list of modules like this:

ceph_mgr_modules:
  - status
  - dashboard

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-06 14:55:56 +00:00
Noah Watkins 3cf5fd2c3e start_osds: use list instead of keys (re-introduce)
the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was undone a few days later by:

  82a6b5adec

and this patch fixes it again :)
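
An illustrative example of the pattern (osd_ids is a hypothetical dict):

```
- name: start osd services
  service:
    name: "ceph-osd@{{ item }}"
    state: started
  # dict.keys() returns a view object under python3,
  # cast it to a list before looping
  with_items: "{{ osd_ids.keys() | list }}"
```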

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-12-05 23:25:35 +00:00
Guillaume Abrioux 1cff1f9806 revert infra: don't restart firewalld if unit is masked
If the firewalld unit is masked, setting `configure_firewall: false` is
enough.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-04 15:52:08 +00:00
Sébastien Han 896676ee80 fix json data type
JSON is always typed as a string, whereas before this we were declaring
a dict, which is not a valid JSON structure.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-04 12:34:54 +01:00
Sébastien Han 82a6b5adec osd: manage legacy ceph-disk non-container startup
The code is now able (again) to start osds that were configured with
ceph-disk in a non-container scenario.

Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 452069cb3a)
2018-12-03 16:01:57 +01:00
Sébastien Han ec2d1f502d osd: re-introduce disk_list check
This commit
4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)
accidentally removed this check.

This is a must have for ceph-disk based containerized OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 9b5a93e3a5)
2018-12-03 16:01:57 +01:00
Sébastien Han 4c51130198 osd: discover osd_objectstore on the fly
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.

Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activation in this case detects the osd objectstore,
which prevents failures like the one described above.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bef522627e ceph-osd: change jinja condition
If an existing cluster runs this config and has ceph-disk OSDs,
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:11:47 +00:00
Sébastien Han bf375327a0 ceph-mgr: refact role for containers
Now we simplify the start invocation and remove some code as well as
the 'docker' directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 14fc5bad12 mon: do not serialized container bootstrap
This commit unifies the container and non-container code, which in the
meantime gives us the ability to deploy N mon containers at the same
time without having to serialize the deployment. This drastically
reduces the time needed to bootstrap the cluster.
Note, this is only possible since Nautilus because the monitors
bootstrap their initial keys on their own once they reach quorum. In the
Nautilus version of the ceph-container mon, we stopped generating the
keys 'manually' from inside the container; for more detail see: https://github.com/ceph/ceph-container/pull/1238

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 61082b3b32 mgr: only copy keys with dedicated mgr
When collocating mon and mgr, the mgr container will attempt to create
its own key since it has the admin key at its disposal. Also at this
point there is nothing to fetch since the key is not created by the
mons; as mentioned above, the mgr creates the key on its own.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 1c760904b0 site: collocated mon and mgr by default
This will speed up the deployment and also deploy mon and mgr
collocated, just as recommended.
This won't prevent you from adding more dedicated machines for mgr if
needed.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han ee1905ad31 mon: add missing include_tasks instead of import_tasks
This was probably a leftover/mistake so let's fix this and make the file
consistent.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7cb1040440 config: add missing bootstrap mgr directory
This directory is needed so we can fetch the bootstrap mgr key in it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 8d4de44f5d mon: default ceph_health_raw to json
During the first iteration, the command won't return anything, or can
simply fail and might not return a valid json structure. Ansible will
fail parsing it in the filter `from_json`, so let's default that
variable to an empty dictionary.
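
A hedged sketch of the idea: default the possibly empty output to an empty
JSON object before piping it through from_json (the real task and its quorum
check are more involved):

```
- name: waiting for the monitor to join the quorum...
  command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} quorum_status --format json"
  register: ceph_health_raw
  changed_when: false
  retries: 30
  delay: 10
  until: >
    ansible_hostname in
    (ceph_health_raw.stdout | default('{}', True) | from_json).get('quorum_names', [])
```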

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han cfac79bec4 container-common: remove old check
This removes a bit of unnecessary code; the check was always wrong
because of the condition 'not ceph_current_status.get('rc', 1) == 0'.
It will never match since `not` operates on a bool while we are checking
an rc.
Also, even if the check did work, it would be a major blocker after a
complete meltdown: if the whole platform is shut down then nothing will
be up, but the files will still be present, so this check is definitely wrong.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 7ac73202f7 fw: update rules for mon/mgr collocation
Since we now deploy mgr on mon nodes, we need to open firewall rules so
the mgr can reach the osds.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han 5b9d8f9737 mon: remove old ubuntu login status
We don't support Ubuntu Precise, so this feature does not exist
anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Sébastien Han a0e5ef8516 mon: secure cluster on container
Add the ability to protect pools on containerized clusters.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-03 14:39:43 +01:00
Guillaume Abrioux ccc0c9c24c osd: remove a leftover
this file is never included in ceph-osd; it looks like a leftover, so let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 09:12:02 +01:00
Guillaume Abrioux 0187166926 osd: remove an incorrect information
This is false: `./defaults/main.yml` is not supposed to be modified
directly. group_vars and/or host_vars should always be preferred.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-03 08:11:35 +00:00
Guillaume Abrioux fead0813b4 remove kv store support
the next stable release will drop this feature.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-30 13:45:12 +00:00
Christian Berendt 1f73a9900f Add missing space before }}
This will fix the following yamllint warning:

Variables should have spaces after {{ and before }}

Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
2018-11-29 16:04:05 +01:00
Guillaume Abrioux a86c2b8526 config: write jinja comment with appropriate syntax
jinja comment should be written using the jinja syntax `{# ... #}`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-29 15:48:23 +01:00
Guillaume Abrioux e4869ac8bd validate: change default value for `radosgw_address`
change default value of `radosgw_address` to keep consistency with
`monitor_address`.
Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine
if it has to run `check_eth_rgw.yml`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-28 23:13:38 +01:00
Sébastien Han bc2daaeb71 ceph-osd fix batch with container binary
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han 80ba45793d fix template generation
Position the right condition on ceph_docker_version and activate it
when the container_binary is 'docker'.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han 00ebdeff78 container-common: remove leftover
ntp installation is managed by the ceph-infra role.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Guillaume Abrioux 3684d421e4 defaults: play set_radosgw_address.yml only on rgw nodes
There is no need to play these tasks on nodes that are not in the rgw
group.

Always playing this code makes `shrink_mon.yml` fail.

Typical error:

```
TASK [ceph-defaults : set_fact _radosgw_address to radosgw_interface - ipv4] ***
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-shrink_mon/roles/ceph-defaults/tasks/set_radosgw_address.yml:21
Thursday 22 November 2018  12:34:51 +0000 (0:00:00.154)       0:00:12.371 *****
fatal: [localhost]: FAILED! => {}

MSG:

The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1'
```

Indeed, `radosgw_interface` is the network interface on rgw only. It is
expected that this same interface doesn't exist on `localhost`, so, when
running `shrink_mon.yml`, the role `ceph-defaults` is called in
`hosts: localhost` and causes the playbook to fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han 4f57e44f9c defaults: declare container_binary
Always declare container_binary and assign it a correct value.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han ac3e18e4c1 ceph-defaults: use podman on Fedora only
It seems Atomic 7.5 already has podman; however, this is an old version
(0.4). The podman integration is targeting RHEL 8, so Fedora is
currently the closest to that.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han f203031f88 iscsi: expose /dev/log in the container
During its initialisation both rbd-target-api and rbd-target-gw try to
open /dev/log for their syslog handler. If the device is not present the
service fails to start. Thus, exposing /dev/log from the host in the
container solves that problem.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han f192bc92a2 ceph_key: use the right container runtime binary
Rework all the ceph_key invocations to use either the docker or podman
binary.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han 6cca37b683 client: do not use a dummy container anymore
Since 84fcf4639140c390a7f1fcd790ba190503713f86 we now use the container
binary cli to create ceph keys instead of creating a container and
'docker execing' into it.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han a96e910114 Add new container scenario
Test with podman instead of docker and also support for python 3 only.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Sébastien Han a9b337ba66 handler: show unit logs on error
This will tremendously help debugging daemons that fail on restart by
showing the systemd unit logs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 11:00:37 +00:00
Sébastien Han 997667a873 osd: expose udev into the container
In order to be able to retrieve udev information, we must expose its
socket. As per https://github.com/ceph/ceph/pull/25201, ceph-volume
will start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-26 18:57:12 +00:00
Guillaume Abrioux ed42262b37 client: change default pool size
default pool size should match the real default that is defined in ceph
itself.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-21 18:23:07 +00:00