Commit Graph

2481 Commits (8d311a537da051950a240fb3ecf6615156883fd9)

Author SHA1 Message Date
Dmitriy Rabotyagov 8d311a537d Fix undefined running_mon
Since commit [1] running_mon introduced, it can be not defined
which results in fatal error [2]. This patch defines default value which
was used before patch [1]

Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com>

[1] 8dcbcecd71
[2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011

(cherry picked from commit 2478a7b948)
2020-01-16 18:28:12 -05:00
Guillaume Abrioux cae24dd85a remove container_exec_cmd_mgr fact
Iterating over all monitors in order to delegate a `
{{ container_binary }}` fails when collocating mgrs with mons, because
ceph-facts reset `container_exec_cmd` to point to the first member of
the monitor group.

The idea is to force `container_exec_cmd` to be reset in ceph-mgr.
This commit also removes the `container_exec_cmd_mgr` fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8dcbcecd71)
2020-01-15 21:10:54 +01:00
Dimitri Savineau 09a71e4a8c ceph-iscsi: don't use bracket with trusted_ip_list
The trusted_ip_list parameter for the rbd-target-api service doesn't
support ipv6 address with bracket.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bd87d69183)
2020-01-14 12:48:04 -05:00
Dimitri Savineau ff3a3ee5e9 container: move lvm2 package installation
Before this patch, the lvm2 package installation was done during the
ceph-osd role.
However we were running ceph-volume command in the ceph-config role
before ceph-osd. If lvm2 wasn't installed then the ceph-volume command
fails:

error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or
directory

This wasn't visible before because lvm2 was automatically installed as
docker dependency but it's not the same for podman on CentOS 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit de8f2a9f83)
2020-01-14 12:47:55 -05:00
Guillaume Abrioux a81830ddc0 osd: use _devices fact in lvm batch scenario
since fd1718f379, we must use `_devices`
when deploying with lvm batch scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5558664f37)
2020-01-14 15:33:15 +01:00
Guillaume Abrioux ffdfa634ac osd: do not run openstack_config during upgrade
There is no need to run this part of the playbook when upgrading the
cluter.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit af6875706a)
2020-01-14 09:12:34 -05:00
Guillaume Abrioux 2d85fab02d osd: support scaling up using --limit
This commit lets add-osd.yml in place but mark the deprecation of the
playbook.
Scaling up OSDs is now possible using --limit

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3496a0efa2)
2020-01-14 09:12:34 -05:00
Dimitri Savineau dc797971ce ceph-facts: move grafana fact to dedicated file
We don't need to executed the grafana fact everytime but only during
the dashboard deployment.
Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f940e695ab)
2020-01-13 16:28:23 -05:00
Guillaume Abrioux 266c4c7763 facts: fix osp/ceph external use case
d6da508a9b broke the osp/ceph external use case.

We must skip these tasks when no monitor is present in the inventory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2592a1e1e8)
2020-01-13 21:07:01 +01:00
Guillaume Abrioux 532abbb9b2 defaults: change monitor|radosgw_address default values
To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think setting `0.0.0.0` will make the daemon binding on all
interfaces.

Fixes: #4827

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc02fc98eb)
2020-01-13 14:55:23 -05:00
Guillaume Abrioux 9ed540da7e osd: ensure osd ids collected are well restarted
This commit refact the condition in the loop of that task so all
potential osd ids found are well started.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 58e6bfed2d)
2020-01-13 14:31:03 -05:00
Dimitri Savineau bc0f16f270 ceph-validate: add rbdmirror validation
When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4a065cebd7)
2020-01-10 11:11:37 -05:00
Dimitri Savineau f2e1941ef1 ceph-osd: wait for all osds once
cf8c6a3 moves the 'wait for all osds' task from openstack_config to the
main tasks list.
But the openstack_config code was executed only on the last OSD node.
We don't need to do this check on all OSD node so we need to add set
run_once to true on that task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5bd1cf40eb)
2020-01-10 11:07:25 -05:00
Dimitri Savineau 9fa5b296ca ceph-osd: wait for all osd before crush rules
When creating crush rules with device class parameter we need to be sure
that all OSDs are up and running because the device class list is
is populated with this information.
This is now enable for all scenario not openstack_config only.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cf8c6a3849)
2020-01-10 11:07:25 -05:00
Dimitri Savineau bd016960cf ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ef2cb99f73)
2020-01-10 11:07:25 -05:00
Dimitri Savineau 661b2c013a move crush rule creation from mon to osd role
If we want to create crush rules with the create-replicated sub command
and device class then we need to have the OSD created before the crush
rules otherwise the device classes won't exist.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ed36a11eab)
2020-01-10 11:07:25 -05:00
Guillaume Abrioux d6921f798d config: exclude ceph-disk prepared osds in lvm batch report
We must exclude the devices already used and prepared by ceph-disk when
doing the lvm batch report. Otherwise it fails because ceph-volume
complains about GPT header.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fd1718f379)
2020-01-09 20:15:27 -05:00
Guillaume Abrioux d6da508a9b mon: support replacing a mon
We must pick up a mon which actually exists in ceph-facts in order to
detect if a cluster is running. Otherwise, it will state no cluster is
already running which will end up deploying a new monitor isolated in a
new quorum.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86f3eeb717)
2020-01-09 15:02:03 -05:00
Dimitri Savineau a3c2259bde ceph-iscsi: manage ipv6 in trusted_ip_list
Only the ipv4 addresses from the nodes running the dashboard mgr module
were added to the trusted_ip_list configuration file on the iscsigws
nodes.
This also add the iscsi gateways with ipv6 configuration to the ceph
dashboard.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 70eba66182)
2020-01-08 21:29:02 -05:00
Benoît Knecht 0cb1235962 ceph-rgw: Fix custom pool size setting
RadosGW pools can be created by setting

```yaml
rgw_create_pools:
  .rgw.root:
    pg_num: 512
    size: 2
```

for instance. However, doing so would create pools of size
`osd_pool_default_size` regardless of the `size` value. This was due to
the fact that the Ansible task used

```
{{ item.size | default(osd_pool_default_size) }}
```

as the pool size value, but `item.size` is always undefined; the
correct variable is `item.value.size`.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 3c31b19ab3)
2020-01-08 21:28:03 -05:00
Guillaume Abrioux e001ded6f6 handler: fix bug
411bd07d54 introduced a bug in handlers

using `handler_*_status` instead of `hostvars[item]['handler_*_status']`
causes handlers to be triggered in anycase even though
`handler_*_status` was set to `False` on a specific node.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 30200802d9)
2020-01-08 19:46:11 -05:00
Dimitri Savineau 56f1b232a5 ceph-nfs: add ganesha_t type to selinux
Since RHEL 8.1 we need to add the ganesha_t type to the permissive
SELinux list.
Otherwise the nfs-ganesha service won't start.
This was done on RHEL 7 previously and part of the nfs-ganesha-selinux
package on RHEL 8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786110

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d758125290)
2020-01-08 16:23:41 -05:00
Dimitri Savineau 701ade88c3 ceph-defaults: exclude rbd devices from discovery
The RBD devices aren't excluded from the devices list in the LVM auto
discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783908

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6f0556f015)
2020-01-08 16:15:26 -05:00
Dimitri Savineau af3c1b4c1a ceph-infra: replace hardcoded grafana group name
The grafana-server group name was hardcoded for the grafana/prometheus
firewalld tasks condition.
We should we the associated variable : grafana_server_group_name

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2c06678cde)
2020-01-08 16:15:09 -05:00
Dimitri Savineau 27530b1d3f ceph-infra: move dashboard into a dedicated file
Instead of using multiple dashboard_enabled condition in the
configure_firewall file we could just have the condition once
and include the dedicated tasks list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f4c261ef90)
2020-01-08 16:15:09 -05:00
Dimitri Savineau 43ffcd7d28 ceph-infra: open dashboard port on monitor
When there's no mgr group defined in the ansible inventory then the
mgrs are deployed implicitly on the mons nodes.
If the dashboard is enabled then we need to open the dashboard port on
the node that is running the ceph mgr process (mgr or mon).
The current code only allow to open that port on the mgr nodes when they
are present explicitly in the inventory but not implicitly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783520

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4535985188)
2020-01-08 16:15:09 -05:00
Guillaume Abrioux 3caba2c31c dashboard: use fqdn in external url
Force fqdn to be used in external url for prometheus and alertmanager.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 498bc45859)
2020-01-08 16:14:33 -05:00
Guillaume Abrioux 4cf5c08cd8 facts: use correct python interpreter
that task is delegated on the first mon so we should always use the
`discovered_interpreter_python` from that node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5adb735c78)
2020-01-08 11:18:45 -05:00
Guillaume Abrioux ba4787817e defaults: change default value for dashboard_admin_password
A recent change in ceph/ceph prevent from having username in the
password:

`Error EINVAL: Password cannot contain username.`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0756fa467d)
2019-12-11 08:48:34 -05:00
Guillaume Abrioux 00bdb60663 defaults: add a comment
This commit isolates and adds an explicit comment about variables not
intended to be modified by the user.

Fixes: #4828

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a234338eff)
2019-12-10 13:45:18 -05:00
Guillaume Abrioux 6295a33912 dashboard: run node_export as privileged container
Typical error:

```
type=AVC msg=audit(1575367499.582:3210): avc:  denied  { search } for  pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0
```

node_exporter needs to be run as privileged to avoid avc denied error
since it gathers lot of information on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d245eb7e7d)
2019-12-09 17:27:51 +01:00
Dimitri Savineau 0340929ed3 ceph-defaults: exclude md devices from discovery
The md devices (RAID software) aren't excluded from the devices list in
the auto discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 014f51c2a4)
2019-12-09 09:32:21 +01:00
Guillaume Abrioux cfc10a8142 facts: avoid duplicated element in devices list
When using `osd_auto_discovery`, `devices` is built multiple times due
to multiple runs of `ceph-facts` role. It end up with duplicate
instances of a same device in the list.

Using `unique` filter when building the list fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 23b1f43897)
2019-12-05 14:51:18 +01:00
Dimitri Savineau f4e5f3ee9e ceph-grafana: remove ipv6 brakets on wait_for
The wait_for ansible module doesn't support the backets on IPv6 address
so need to remove them.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 55adc10be3)
2019-12-03 17:12:36 +01:00
Dimitri Savineau a3bfed88c9 ceph-defaults: pin prometheus container tags
In addition to the grafana container tag change, we need to do the same
for the prometheus container stack based on the release present in the
OSE 4.1 container image.

$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3e29b8d5ff)
2019-12-03 16:19:16 +01:00
Guillaume Abrioux 6592caab08 facts: isolate container_binary facts
in order to be able to call container_binary without having to run the
whole ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe5ffe589e)
2019-12-03 09:57:11 -05:00
Guillaume Abrioux 1f30327688 purge: remove docker_* task
All containers are removed when systemd stops them.
There is no need to call this module in purge container playbook.

This commit also removes all docker_image task and remove all container
images in the final cleanup play.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d23383a820)
2019-12-03 09:57:11 -05:00
Guillaume Abrioux ba2925df32 dashboard: use fqdn url for active alert
When using the shortname, the URL for active alert launches with short
hostname and fails to connect to the server.

This commit changes the template in order to use the fqdn.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a8d76d72d7)
2019-12-03 09:44:17 -05:00
Guillaume Abrioux e8ed36fdae dashboard: only print dashboard url of the grafana-server node
This commit makes the ceph-dashboard role only printing ceph-dashboard
URL of the nodes present in grafana-server group

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cc0c1ce301)
2019-12-03 14:49:12 +01:00
Guillaume Abrioux 88d060f6e1 docker2podman: import ceph-handler role
This is needed to avoid following error:

```
ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a43a872105)
2019-12-03 10:44:48 +01:00
VasishtaShastry 4edef59feb Fixes failure of cephfs configuration using --limit
Configuration of cephfs with an existing cluster using --limit used to fail
at different tasks while running with site-docker.yml
This commit addresses both of those tasks

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 72c43cc5d9)
2019-11-20 09:41:38 -05:00
VasishtaShastry e54b6be74e Evades validation of ceph_repository_type in containerized scenario
This will prevent failure of site-docker.yml with configs in doc.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9a1f1626c3)
2019-11-18 16:41:10 +01:00
Dimitri Savineau 3ebd71c8c2 ceph-osd: fix fs.aio-max-nr sysctl condition
[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
This string output "(osd_objectstore == 'bluestore')" is always true
because item.enable condition only matches non empty string. So the
sysctl value was applyied for both filestore and bluestore backend.

[2] added the bool filter to the condition but the filter always returns
false on string and the sysctl wasn't applyed at all.

This commit fixes the enable key value by evaluating the value instead
of using the string.

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ece46d33be)
2019-11-07 20:31:02 +01:00
Dimitri Savineau 0dcaec64ec ceph-defaults: pin grafana container tag to 5.2.4
The latest grafana container tag is using grafana 6.x release which could
cause issue with the ceph dashboard integration.
Considering that the grafana container in RHCS 3 is based on 5.x then we
should use the same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2037fb87b6)
2019-10-31 19:10:04 -04:00
Dimitri Savineau 27eb40714c ceph-osd: Remove ulimit nofile on container start
Even if this improves ceph-disk/ceph-volume performances then it also
impact the ceph-osd process.
The ceph-osd process shouldn't use 1024:4096 value for the max open
files.
Removing the ulimit option from the container engine and doing this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9a996aef7f)
2019-10-31 14:42:30 -04:00
fmount 20b4234ddc Set grafana-server user and password in ceph-dashboard role
This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails
showing an authorization error (HTTPError: 401 Client Error:
Unauthorized").

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 41b8c17356)
2019-10-31 11:43:54 -04:00
Mihai Plasoianu 6015a6ca40 ceph-mon: use --admin-daemon to set default crush rule
Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de>
(cherry picked from commit d3f67d63ae)
2019-10-29 22:26:53 -04:00
Dimitri Savineau ffd05ca8df defaults: add user/pass auth registry variables
Add ceph_docker_registry_username and ceph_docker_registry_password
variables in ceph-defaults role so they will be present in the group_vars
samples but commented.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b33c476f16)
2019-10-24 16:24:54 -04:00
Dimitri Savineau b3ee07b242 dashboard: add ceph iscsi management
When deploying with ceph-iscsi nodes and dashboard enabled, we need to
add the ceph iscsi gateway endpoints to the dashboard configuration and
add the mgr ip address in the trusted list in the iscsi gateway
configuration file.

Closes: #4638
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173

https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d050391cbb)
2019-10-23 09:47:46 +02:00
Dimitri Savineau 567e90cd2e ceph-iscsi: add ceph-iscsi stable repositories
This commit adds the support of the ceph-iscsi stable repository when
use ceph_repository community instead of always using the devel
repositories.
We're still using the devel repositories for rtslib and tcmu-runner in
both cases (dev and community).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2cb937193)
2019-10-23 09:47:46 +02:00