Commit Graph

5186 Commits (ccfa249919b338197daec353cb5d4e535b3fb734)
 

Author SHA1 Message Date
Guillaume Abrioux 206ee589d6 update: reset flags before and after each osd node upgrade
It might be possible at some point even with osd flags `noout` and
`norebalance` set the PGs states can change depending on the amount of data
written meantime. It means the check for PGs state will fail.

This commit changes the way we set those flags:
we set them before an OSD node upgrade and unset them before the PGs
state check so they can recover.

Fixes: #3961

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-08 09:10:52 -05:00
Javier Pena 19a43ff261 Fixes for Makefile
- Set default mock configuration to epel-8-x86_64, to match the
  default dist value.
- Add support for alpha tags, like the recently added v5.0.0alpha1

Signed-off-by: Javier Pena <jpena@redhat.com>
2019-11-08 09:09:30 -05:00
Guillaume Abrioux db77fbda15 tests: add coverage on purge playbook
This commit adds a playbook to be played before we run purge playbook,
it first creates an rbd image then map an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-08 09:06:11 -05:00
Guillaume Abrioux 3cfcc7a105 purge: use sysfs to unmap rbd devices
in containerized context, using the binary provided in atomic os won't
work because it's an old version provided by ceph-common based on
10.2.5.
Using a container could be an idea but for large cluster with hundreds
of client nodes, that would require to pull the image of each of them
just to unmap the rbd devices.

Let's use the sysfs method in order to avoid any issue related to ceph
version that is shipped on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-08 09:06:11 -05:00
Dimitri Savineau 4a065cebd7 ceph-validate: add rbdmirror validation
When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 08:57:43 -05:00
Dimitri Savineau 60cbfdc2a6 ceph-handler: Use /proc/net/unix for rgw socket
If for some reason, there's an old rgw socket file present in the
/var/run/ceph/ directory then the test command could fail with

test: xxxxxxxxx.asok: binary operator expected

$ ls -hl /var/run/ceph/
total 0
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok

We can check the radosgw socket in /proc/net/unix to avoid using wildcard
in the socket name.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 14:41:11 +01:00
Dimitri Savineau 34b03d1873 add-{mon,osd}: run raw install python tasks
If the new mon/osd node doesn't have python installed then we need to
execute the tasks from raw_install_python.yml.

Closes: #4368

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 14:04:26 +01:00
Dimitri Savineau ece46d33be ceph-osd: fix fs.aio-max-nr sysctl condition
[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
This string output "(osd_objectstore == 'bluestore')" is always true
because item.enable condition only matches non empty string. So the
sysctl value was applyied for both filestore and bluestore backend.

[2] added the bool filter to the condition but the filter always returns
false on string and the sysctl wasn't applyed at all.

This commit fixes the enable key value by evaluating the value instead
of using the string.

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-07 13:51:48 +01:00
Dimitri Savineau 02df2ab5ea tests/requirements: bump testinfra and pytest
The ansible ssh connections are now using the ssh backend instead of
paramiko starting testinfra 3.1 and persistent connections too.
pytest 4.6 is the latest release to be supported by python 2.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-04 09:09:49 -05:00
Dimitri Savineau 2037fb87b6 ceph-defaults: pin grafana container tag to 5.2.4
The latest grafana container tag is using grafana 6.x release which could
cause issue with the ceph dashboard integration.
Considering that the grafana container in RHCS 3 is based on 5.x then we
should use the same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 18:44:51 -04:00
Dimitri Savineau 9a996aef7f ceph-osd: Remove ulimit nofile on container start
Even if this improves ceph-disk/ceph-volume performances then it also
impact the ceph-osd process.
The ceph-osd process shouldn't use 1024:4096 value for the max open
files.
Removing the ulimit option from the container engine and doing this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-31 10:42:09 -04:00
fmount 41b8c17356 Set grafana-server user and password in ceph-dashboard role
This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails
showing an authorization error (HTTPError: 401 Client Error:
Unauthorized").

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
2019-10-31 10:29:57 -04:00
Javier Pena ed8341568b Allow setting dist and mock configuration in Makefile
Curently, the dist and mock configurations are hardcoded in the
Makefile to be el8 and epel-7-x86_64, respectively. This commit
allows the user to override those settings using the DIST and
MOCK_CONFIG environment variables, falling back to the current
defaults if not set.

This provides additional flexibility when building the RPM directly
from the repository.

Signed-off-by: Javier Peña <jpena@redhat.com>
2019-10-29 20:59:50 -04:00
Mihai Plasoianu d3f67d63ae ceph-mon: use --admin-daemon to set default crush rule
Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de>
2019-10-29 20:59:32 -04:00
Radu Toader f2573c9e6b nfs: support specific keys for rgw nfs user
This brings the possibility to modify the rgw nfs user to use specific
keys when those are defined.

Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-10-29 14:59:26 -04:00
Guillaume Abrioux e9823f319b update: add default values when setting fact
This commit adds a default value in the `with_dict` because when using
python 2.7, if a task using a `with_dict` has a condition, it is
evaluated anyway whereas in python 3 it isn't.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766499

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-29 14:45:28 -04:00
Dimitri Savineau 15f7c7195a ceph-nfs: add nfs-ganesha-rados-grace explicitly
Since nfs-ganesha V3.0-rc4 and [1] we need to explicitly install the
nfs-ganesha-rados-grace package.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/0fea990

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 16:27:36 -04:00
Dimitri Savineau 2ca79fcc99 rolling_update: remove default filter on mds group
There's no need to use the default filter on active/standby groups
because if the group doesn't exist then the play is just skipped.

Currently this generates warnings like:

[WARNING]: Could not match supplied host pattern, ignoring: |
[WARNING]: Could not match supplied host pattern, ignoring: default([])

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 15:02:50 +01:00
Dimitri Savineau f1f2352c79 rolling_update: fix active mds host value
The active mds host should be based on the inventory hostname and not on
the ansible hostname.
The value returns under the mdsmap structure is based on the OS hostname
so we need to find the right node in the inventory with this value when
doing operation on inventory nodes.

Othewise we could see error like:

The task includes an option with an undefined variable. The error was:
"hostvars[foobar]" is undefined

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 15:02:50 +01:00
Dimitri Savineau 8a0a13f67a ipaddrs_in_ranges: fix python indent
pycodestyle returns:

 E111 indentation is not a multiple of four

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 09:23:17 +01:00
Dimitri Savineau 6ce4fde820 move library/plugins tests files under tests dir
To avoid unnecessary ansible warnings during playbook execution we can
move the library and plugins test files under a different directory.

[WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as
it seems to be invalid:
cannot import name 'ipaddrs_in_ranges'

Closes: #4656

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-28 09:23:17 +01:00
Dimitri Savineau 77b212833e add-mon: add missing become flag
Without the become flag set to true, we can't executed the roles
successfully.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-25 23:05:25 +02:00
Dimitri Savineau 650bc0c3f0 rolling_update: fix reset mon_host variable
mon_host should use the inventory hostname and not the node hostname.
Fix creates an issue when the inventory and node hostname are different.

Closes: #4670

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-25 23:04:35 +02:00
Dimitri Savineau bfb1d6be12 add-{mon,osd}: add ceph-container-engine role
The ceph-container-engine role is missing from both playbooks so the
container engine (docker, podman) isn't install resulting in a failure
on the added nodes.

fatal: [xxxxx]: FAILED! => changed=false
  cmd: docker --version
  msg: '[Errno 2] No such file or directory'
  rc: 2

Closes: #4634

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-24 16:24:02 -04:00
Guillaume Abrioux d06057ebd2 update: use right node when creating active mds group
This must be consistent with what is used in `name` parameter.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-24 15:15:51 -04:00
Dimitri Savineau b33c476f16 defaults: add user/pass auth registry variables
Add ceph_docker_registry_username and ceph_docker_registry_password
variables in ceph-defaults role so they will be present in the group_vars
samples but commented.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-24 15:11:45 -04:00
Guillaume Abrioux 3d28773da5 mon: call mon_status from asok
since c09b82a80a392ccd0da7677c7b424ce5cd3fa5d6 in ceph/ceph we must call
mon_status from asok instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-24 10:19:16 -04:00
Guillaume Abrioux 1122da7f4a update: avoid skipping single mds deployment upgrade
otherwise a single MDS would never be updated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-24 09:28:36 +02:00
Guillaume Abrioux 5ec906c3af update: skip mds deactivation when no mds in inventory
Let's skip this part of the code if there's no mds node in the
inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-23 11:06:13 -04:00
Dimitri Savineau d050391cbb dashboard: add ceph iscsi management
When deploying with ceph-iscsi nodes and dashboard enabled, we need to
add the ceph iscsi gateway endpoints to the dashboard configuration and
add the mgr ip address in the trusted list in the iscsi gateway
configuration file.

Closes: #4638
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173

https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau f2cb937193 ceph-iscsi: add ceph-iscsi stable repositories
This commit adds the support of the ceph-iscsi stable repository when
use ceph_repository community instead of always using the devel
repositories.
We're still using the devel repositories for rtslib and tcmu-runner in
both cases (dev and community).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau 82495eaf97 rhcs: set ceph_iscsi_config_dev to false
We don't have to use ceph_iscsi_config_dev (default true) on RHCS
because all iscsi packages are already included in the RHCS
repositories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau fd8d47da98 Revert "iscsigw: install python-requests"
We don't need this since [1]. Also this was only working for python2 and
not supporting python3.

[1] https://github.com/ceph/ceph-iscsi/commit/00f198a

This reverts commit 167737dd3d.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:24:17 +02:00
Dimitri Savineau 9ad000618f container/dashboard: run the registry auth task
When deploying with packages then the ceph-container-common role isn't
executed so the registry authentication task is ignored.

Closes: #4636

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 23:23:32 +02:00
Guillaume Abrioux b5a61fe2e3 tests: use osd ids instead of device name in ooo_collocation
on master, it doesn't make sense anymore to use device name, we should
use osd id instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-22 13:45:19 +02:00
Guillaume Abrioux 384161edcd tests: fix keyring creation in ooo_collocation
This commit removes the backslash in allow command parameter, this was
needed before the ceph_key module integration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-22 13:45:19 +02:00
Dimitri Savineau 3c2840da03 tests: update container tag for ooo_collocation
It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 13:45:19 +02:00
Guillaume Abrioux da4215e9c0 validate: fix credentials validation
This task is failing when `ceph_docker_registry_auth` is enabled and
`ceph_docker_registry_username` is undefined with an ansible error
instead of the expected message.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-21 13:26:55 -04:00
Dimitri Savineau 3969470fca travis: fail on ansible-lint errors
If ansible-lint reports an error then it's skipped. We should fail in
this case.

This patch also fixes the pipefail lint in the rbd mirror role

[306] Shells that use pipes should set the pipefail option

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-21 13:26:02 -04:00
Guillaume Abrioux 8d72ff8e5e update: add missing quotes
Add missing quote in order to keep consistency.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-21 09:19:34 -04:00
Guillaume Abrioux 25b98b2ce3 tests: add multimds coverage
This commit makes the all_daemons scenario deploying 3 mds in order to
cover the multimds case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-18 13:43:13 -04:00
Guillaume Abrioux c4fc8cc878 upgrade: fix standby_mdss group creation
This commit fixes the standby_mdss group creation by using `{{ item }}`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-18 13:43:13 -04:00
Dimitri Savineau 47ce9c0d22 playbooks: show cluster status after dashboard
We should show the ceph cluster status as the last task of the playbook.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-17 21:54:09 -04:00
Dimitri Savineau 8426856262 Move the dashboard playbook in the main directory
The [group|host]_vars directories are ignored for the dashboard playbook
when the inventory file directory doesn't contain those directories.

Closes: #4601
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1761612

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-17 21:54:09 -04:00
Guillaume Abrioux 4e9504c939 common: do not override ceph_release when using custom repo
Otherwise it fails like following:

```
TASK [ceph-mds : allow multimds] **************************************************************************************************************************************************
Monday 22 July 2019  16:37:38 +0800 (0:00:03.269)       0:13:25.651 ***********
fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n  ^ here\n"}
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-17 22:58:16 +02:00
Dimitri Savineau 2c03c6fcd3 tests: fix the size on the second data LV
The commit replaces the pv/vg/lv commands used with the ansible command
module by the lvg and lvol modules.
This also fixes the size of the second data LV because we were only using
50% of the remaining space instead of 100%.

With a 50G device, the result was:
  - data-lv1 was 25G
  - data-lv2 was 12.5G
Instead of:
  - data-lv1 was 25G
  - data-lv2 was 25G

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-17 15:49:15 -04:00
Mike Christie ba141298d7 iscsi-gw: Fix rtslib installation
When using python3 the name of the rtslib rpm is python3-rtslib. The
packages that use rtslib already have code that detects the python
version and distro deps, so drop it from the ceph iscsi gw task list and
let the ceph-iscsi rpm dependency handle it.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-10-16 12:59:31 -04:00
Guillaume Abrioux 71cebf80a6 update: follow new recommandation to upgrade mds cluster
Refact the mds cluster upgrade code in order to follow the documented
recommandation.
See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-16 11:23:12 -04:00
Guillaume Abrioux b63bd13073 nfs: remove unnecessary set_fact in main.yml
this task is a leftover and no longer needed.
It even causes bug when collocating nfs with mon.

Closes: #4609

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-16 11:23:02 -04:00
Dimitri Savineau 0b1e9c0737 rbd-mirror: fail if the peer is not added
Due the 'failed_when: false' statement present in the peer task then
the playbook continues to ran even if the peer task was failing (like
incorrect remote peer format.

"stderr": "rbd: invalid spec 'admin@cluster1'"

This patch adds a task to list the peer present and add the peer only if
it's not already added. With this we don't need the failed_when statement
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-16 16:27:46 +02:00