Commit Graph

2495 Commits (3c31b19ab39f297635c84edb9e8a5de6c2da7707)

Author SHA1 Message Date
Benoît Knecht 3c31b19ab3 ceph-rgw: Fix custom pool size setting
RadosGW pools can be created by setting

```yaml
rgw_create_pools:
  .rgw.root:
    pg_num: 512
    size: 2
```

for instance. However, doing so would create pools of size
`osd_pool_default_size` regardless of the `size` value. This was due to
the fact that the Ansible task used

```
{{ item.size | default(osd_pool_default_size) }}
```

as the pool size value, but `item.size` is always undefined; the
correct variable is `item.value.size`.
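
A minimal sketch of the lookup, assuming the task iterates over `rgw_create_pools` with `with_dict` (so each `item` carries a `key` and a `value`). The `debug` module only stands in for the real pool-creation command, which is not shown in this commit message:

```yaml
# Illustrative only: prints the size each pool would get.
- name: show the size each rgw pool would get
  ansible.builtin.debug:
    msg: >-
      pool={{ item.key }}
      size={{ item.value.size | default(osd_pool_default_size) }}
  with_dict: "{{ rgw_create_pools }}"
```

With the `rgw_create_pools` example above, `.rgw.root` resolves to a size of 2 instead of falling back to `osd_pool_default_size`.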

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-01-08 16:16:38 -05:00
Dimitri Savineau 70eba66182 ceph-iscsi: manage ipv6 in trusted_ip_list
Only the ipv4 addresses from the nodes running the dashboard mgr module
were added to the trusted_ip_list configuration file on the iscsigws
nodes.
This also adds the iscsi gateways with ipv6 configuration to the ceph
dashboard.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 13:54:04 -05:00
Guillaume Abrioux 5adb735c78 facts: use correct python interpreter
That task is delegated to the first mon, so we should always use the
`discovered_interpreter_python` from that node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-08 10:06:43 -05:00
Guillaume Abrioux 498bc45859 dashboard: use fqdn in external url
Force fqdn to be used in external url for prometheus and alertmanager.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-08 09:06:49 -05:00
Guillaume Abrioux fca6f788a0 Revert "nfs: do not run privileged nfs container"
This reverts commit d06158e9d9.

Otherwise ganesha consumers can't dynamically update exports using dbus.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1784562
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-08 14:18:21 +01:00
Dimitri Savineau 254ab54f80 ceph-iscsi: remove python rtslib shaman repository
The rtslib python library is now available in the distribution so we
shouldn't have to use the shaman repository

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Dimitri Savineau d758125290 ceph-nfs: add ganesha_t type to selinux
Since RHEL 8.1 we need to add the ganesha_t type to the permissive
SELinux list.
Otherwise the nfs-ganesha service won't start.
This was done on RHEL 7 previously and is part of the nfs-ganesha-selinux
package on RHEL 8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786110

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Dimitri Savineau de8f2a9f83 container: move lvm2 package installation
Before this patch, the lvm2 package installation was done during the
ceph-osd role.
However, we were running the ceph-volume command in the ceph-config role
before ceph-osd. If lvm2 isn't installed, then the ceph-volume command
fails:

error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or
directory

This wasn't visible before because lvm2 was automatically installed as
a docker dependency, but that is not the case for podman on CentOS 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Dimitri Savineau d4fd38c967 ceph-nfs: change ganesha CentOS repository
Since nfs-ganesha builds aren't available on shaman for CentOS 8 at the
moment, we can use the alternative repository at [1]

[1] https://download.nfs-ganesha.org/3/LATEST/CentOS

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Guillaume Abrioux 217d95abb2 common: add centos8 support
Ceph octopus only supports CentOS 8.

This commit adds CentOS 8 support:
  - update vagrant image in tox configurations.
  - add CentOS 8 repository for el8 dependencies.
  - CentOS 8 container engine is podman (same as RHEL 8).
  - don't use the epel mirror on sepia because it's epel7 only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Stanley Lam 2ca3364109 ceph-rgw-loadbalancer: Modify keepalived master selection
Currently the keepalived template only works when system hostnames exactly match the Ansible inventory name. If these are different, all generated templates become BACKUP without a MASTER assigned. Using the inventory_hostname in the template file resolves this issue.
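
A hedged sketch of selecting the state from the inventory name rather than the system hostname. The group name `rgwloadbalancers`, the `keepalived_state` fact and the "first host is MASTER" rule are assumptions made for illustration, not the actual template logic:

```yaml
# Assumption: the first host of the rgwloadbalancers group becomes MASTER.
- name: compute the keepalived state from the inventory name
  ansible.builtin.set_fact:
    keepalived_state: "{{ 'MASTER' if inventory_hostname == groups['rgwloadbalancers'][0] else 'BACKUP' }}"
```

Because `inventory_hostname` always matches an entry in `groups[...]`, exactly one host gets MASTER even when system hostnames differ from inventory names.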

Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com>
2020-01-06 09:25:04 -05:00
Dimitri Savineau 2c06678cde ceph-infra: replace hardcoded grafana group name
The grafana-server group name was hardcoded for the grafana/prometheus
firewalld tasks condition.
We should use the associated variable: grafana_server_group_name

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau f4c261ef90 ceph-infra: move dashboard into a dedicated file
Instead of using multiple dashboard_enabled conditions in the
configure_firewall file we could just have the condition once
and include the dedicated tasks list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau 4535985188 ceph-infra: open dashboard port on monitor
When there's no mgr group defined in the ansible inventory, the
mgrs are deployed implicitly on the mon nodes.
If the dashboard is enabled, we need to open the dashboard port on
the node that is running the ceph mgr process (mgr or mon).
The current code only allows opening that port on the mgr nodes when they
are present explicitly in the inventory, but not when they are implicit.
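
The condition described above could be sketched as follows. The group names `mgrs` and `mons`, the `dashboard_port` variable and its 8443 fallback are assumptions for illustration:

```yaml
# Open the port on mgr nodes, or on mon nodes when no mgr group is defined.
- name: open the dashboard port on the node running ceph-mgr
  ansible.posix.firewalld:
    port: "{{ dashboard_port | default(8443) }}/tcp"
    permanent: true
    immediate: true
    state: enabled
  when: >-
    inventory_hostname in groups.get('mgrs', [])
    or (groups.get('mgrs', []) | length == 0
        and inventory_hostname in groups.get('mons', []))
```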

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783520

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 16:09:14 +01:00
Dimitri Savineau 6f0556f015 ceph-defaults: exclude rbd devices from discovery
The RBD devices aren't excluded from the devices list in the LVM auto
discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783908

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-18 09:03:19 +01:00
Guillaume Abrioux fc02fc98eb defaults: change monitor|radosgw_address default values
To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think that setting `0.0.0.0` makes the daemon bind to all
interfaces.

Fixes: #4827

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-12 09:58:33 +01:00
Philip Brown 9021c29b61 Add comment on auto-SSL cert generation
Fixes: #4830

Signed-off-by: Philip Brown <phil@bolthole.com>
2019-12-11 10:57:28 +01:00
Dimitri Savineau 68c6f39349 ceph-facts: set use_new_ceph_iscsi on iscsi nodes
We don't need to set the use_new_ceph_iscsi fact on other nodes than
those present in the iscsigws group.
Also remove the duplicate iscsi_gw_group_name condition already present
on the include_task.
Finally validate the ansible distribution as the first task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-10 23:57:03 +01:00
Guillaume Abrioux 8d0dc34ebe defaults: fix a typo
s/above/below

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-10 09:32:02 -05:00
Guillaume Abrioux a234338eff defaults: add a comment
This commit isolates and adds an explicit comment about variables not
intended to be modified by the user.

Fixes: #4828

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-09 13:50:43 -05:00
Guillaume Abrioux d245eb7e7d dashboard: run node_exporter as privileged container
Typical error:

```
type=AVC msg=audit(1575367499.582:3210): avc:  denied  { search } for  pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0
```

node_exporter needs to be run as privileged to avoid avc denied errors
since it gathers a lot of information on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-09 09:40:13 -05:00
Dimitri Savineau 1a77dd7e91 ceph-validate: start with ansible version test
It doesn't make sense to start validating the configuration if the ansible
version isn't the right one.
This commit moves the check_system task to be the first task in the
ceph-validate role.
The ansible version test tasks are moved to the top of this file.
The iscsi kernel tests are also moved from check_system to the check_iscsi
file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-09 09:35:03 +01:00
Dimitri Savineau 12aa8f4025 ceph-facts: move ntp/chrony facts to ceph-infra
The ntp/chrony facts are only used in the ceph-infra role so we don't
really need to set them in the ceph-facts roles.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-05 19:46:59 +01:00
Guillaume Abrioux 0756fa467d defaults: change default value for dashboard_admin_password
A recent change in ceph/ceph prevents the password from containing the
username:

`Error EINVAL: Password cannot contain username.`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-05 13:02:06 -05:00
Dimitri Savineau 014f51c2a4 ceph-defaults: exclude md devices from discovery
The md devices (software RAID) aren't excluded from the devices list in
the auto discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-05 10:14:25 +01:00
Guillaume Abrioux a8d76d72d7 dashboard: use fqdn url for active alert
When using the shortname, the URL for active alert launches with short
hostname and fails to connect to the server.

This commit changes the template in order to use the fqdn.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 14:30:32 +01:00
Guillaume Abrioux fe5ffe589e facts: isolate container_binary facts
in order to be able to call container_binary without having to run the
whole ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 13:29:52 +01:00
Guillaume Abrioux d23383a820 purge: remove docker_* task
All containers are removed when systemd stops them.
There is no need to call this module in purge container playbook.

This commit also removes all docker_image tasks and removes all container
images in the final cleanup play.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-03 13:29:52 +01:00
Stanley Lam ad7a5dad3f Add option for HAproxy to act as an SSL frontend termination point for loadbalanced RGW instances.
Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com>
2019-12-02 16:54:33 -05:00
Guillaume Abrioux a43a872105 docker2podman: import ceph-handler role
This is needed to avoid the following error:

```
ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-02 09:11:12 -05:00
Dimitri Savineau 5bd1cf40eb ceph-osd: wait for all osds once
cf8c6a3 moves the 'wait for all osds' task from openstack_config to the
main tasks list.
But the openstack_config code was executed only on the last OSD node.
We don't need to do this check on all OSD nodes, so we need to set
run_once to true on that task.
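
A minimal sketch of such a task, assuming a cluster named `ceph`. The real task's retry and parsing logic is not shown in this commit message and is left out here:

```yaml
# run_once makes Ansible execute the check on a single host of the play.
- name: wait for all osds to be up (sketch)
  ansible.builtin.command: ceph --cluster ceph osd stat -f json
  register: osd_stat
  changed_when: false
  run_once: true
```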

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-27 13:05:42 -05:00
Guillaume Abrioux 23b1f43897 facts: avoid duplicated element in devices list
When using `osd_auto_discovery`, `devices` is built multiple times due
to multiple runs of the `ceph-facts` role. It ends up with duplicate
instances of the same device in the list.

Using `unique` filter when building the list fixes this issue.
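
A small sketch of the idea. The `discovered` variable and its device paths are hypothetical stand-ins for the auto-discovery result:

```yaml
# Running this task several times leaves `devices` unchanged after the first run.
- name: append discovered devices without duplicating entries
  ansible.builtin.set_fact:
    devices: "{{ (devices | default([]) + discovered) | unique }}"
  vars:
    discovered: ["/dev/sdb", "/dev/sdc"]   # hypothetical discovery result
```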

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-27 16:35:41 +01:00
Guillaume Abrioux cc0c1ce301 dashboard: only print dashboard url of the grafana-server node
This commit makes the ceph-dashboard role print only the ceph-dashboard
URL of the nodes present in the grafana-server group.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-27 10:28:23 -05:00
Guillaume Abrioux f19a2aef1a Revert "tox-podman: use centos 8 vagrant image"
This reverts commit 19e9a06ab1.
2019-11-27 16:19:58 +01:00
Dimitri Savineau cf8c6a3849 ceph-osd: wait for all osd before crush rules
When creating crush rules with device class parameter we need to be sure
that all OSDs are up and running because the device class list is
populated with this information.
This is now enabled for all scenarios, not only openstack_config.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-27 07:43:07 +01:00
Dimitri Savineau 55adc10be3 ceph-grafana: remove ipv6 brackets on wait_for
The wait_for ansible module doesn't support brackets around an IPv6
address, so we need to remove them.
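
One way to strip the brackets, shown as a sketch. The `grafana_host` variable and its value are hypothetical, and 3000 is simply Grafana's default port:

```yaml
# wait_for expects a bare address, so "[2001:db8::10]" becomes "2001:db8::10".
- name: wait for grafana to start
  ansible.builtin.wait_for:
    host: "{{ grafana_host | replace('[', '') | replace(']', '') }}"
    port: 3000
  vars:
    grafana_host: "[2001:db8::10]"   # hypothetical bracketed IPv6 address
```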

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-26 10:08:17 +01:00
Guillaume Abrioux 33bfb10af9 nfs: remove legacy file
this file is provided by the packaging (nfs-ganesha) so there's no need
to maintain it in ceph-ansible

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-22 05:11:41 +01:00
Guillaume Abrioux d06158e9d9 nfs: do not run privileged nfs container
At the moment, we bind-mount the dbus socket from the host, which requires
running the container with --privileged.
Since we now run a dedicated dbus daemon inside the same container, we
can stop running privileged nfs-ganesha containers

Related ceph-container PR : ceph/ceph-container#1517

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1725254

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-22 05:11:41 +01:00
Dimitri Savineau 19e9a06ab1 tox-podman: use centos 8 vagrant image
Switch the podman scenario from atomic centos 7 to centos 8 (not atomic)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-20 10:34:34 +01:00
VasishtaShastry 72c43cc5d9 Fixes failure of cephfs configuration using --limit
Configuration of cephfs with an existing cluster using --limit used to fail
at different tasks while running with site-docker.yml
This commit addresses both of those tasks

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
2019-11-18 16:44:47 +01:00
Dimitri Savineau ef2cb99f73 ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.
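
The selection between the two sub commands might look like the sketch below. The rule names and the inline loop items are hypothetical; only the `class` key and the two `ceph osd crush rule` sub commands come from the message above:

```yaml
# Rules with a class key use create-replicated, the others create-simple.
- name: create crush rules (sketch)
  ansible.builtin.command: >-
    ceph osd crush rule
    {{ 'create-replicated' if item.class is defined else 'create-simple' }}
    {{ item.name }} {{ item.root }} {{ item.type }} {{ item.class | default('') }}
  loop:
    - { name: replicated_hdd, root: default, type: host, class: hdd }
    - { name: legacy_rule, root: default, type: host }
```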

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:25:46 +01:00
Dimitri Savineau ed36a11eab move crush rule creation from mon to osd role
If we want to create crush rules with the create-replicated sub command
and device class then we need to have the OSD created before the crush
rules otherwise the device classes won't exist.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:25:46 +01:00
Dimitri Savineau 3e29b8d5ff ceph-defaults: pin prometheus container tags
In addition to the grafana container tag change, we need to do the same
for the prometheus container stack based on the release present in the
OSE 4.1 container image.

$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:11:14 +01:00
VasishtaShastry 9a1f1626c3 Evades validation of ceph_repository_type in containerized scenario
This will prevent failure of site-docker.yml with configs in doc.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-14 15:53:22 +01:00
Dimitri Savineau 4a065cebd7 ceph-validate: add rbdmirror validation
When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 08:57:43 -05:00
Dimitri Savineau 60cbfdc2a6 ceph-handler: Use /proc/net/unix for rgw socket
If for some reason, there's an old rgw socket file present in the
/var/run/ceph/ directory then the test command could fail with

test: xxxxxxxxx.asok: binary operator expected

$ ls -hl /var/run/ceph/
total 0
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok

We can check the radosgw socket in /proc/net/unix to avoid using wildcard
in the socket name.
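
A hedged sketch of such a check, assuming the socket path contains `ceph-client.rgw` as in the listing above. Since the kernel only lists sockets that are currently bound, stale `.asok` files left on disk do not match:

```yaml
# grep exits 0 on a match, 1 on no match and 2 on a real error.
- name: check whether a radosgw admin socket is bound (sketch)
  ansible.builtin.command: grep -q 'ceph-client.rgw' /proc/net/unix
  register: rgw_socket
  changed_when: false
  failed_when: rgw_socket.rc not in [0, 1]
```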

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 14:41:11 +01:00
Dimitri Savineau ece46d33be ceph-osd: fix fs.aio-max-nr sysctl condition
[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
The string "(osd_objectstore == 'bluestore')" is always true because the
item.enable condition matches any non-empty string. So the sysctl value
was applied for both the filestore and bluestore backends.

[2] added the bool filter to the condition, but the filter always returns
false on that string, so the sysctl wasn't applied at all.

This commit fixes the enable key value by evaluating the value instead
of using the string.
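
The difference can be sketched as follows. The task layout and the sysctl value are illustrative assumptions; the point is that `enable` is templated with `{{ }}` so that it carries the result of the comparison rather than its text:

```yaml
# enable is evaluated to true/false before the when condition reads it.
- name: apply sysctl tuning only when enabled (sketch)
  ansible.posix.sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
  when: item.enable | default(true) | bool
  loop:
    - name: fs.aio-max-nr
      value: "1048576"   # illustrative value
      enable: "{{ osd_objectstore == 'bluestore' }}"
```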

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 13:51:48 +01:00
Dimitri Savineau 2037fb87b6 ceph-defaults: pin grafana container tag to 5.2.4
The latest grafana container tag uses the grafana 6.x release, which could
cause issues with the ceph dashboard integration.
Since the grafana container in RHCS 3 is based on 5.x, we should use the
same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 18:44:51 -04:00
Dimitri Savineau 9a996aef7f ceph-osd: Remove ulimit nofile on container start
Even though this improves ceph-disk/ceph-volume performance, it also
impacts the ceph-osd process.
The ceph-osd process shouldn't use the 1024:4096 value for the max open
files.
This commit removes the ulimit option from the container engine; this kind
of change is done on the container side instead [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 10:42:09 -04:00
fmount 41b8c17356 Set grafana-server user and password in ceph-dashboard role
This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails
showing an authorization error (HTTPError: 401 Client Error:
Unauthorized).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
2019-10-31 10:29:57 -04:00