503 Commits (8a907cb1caff23ee3698afc72dd6b8e12280b755)

Author SHA1 Message Date
Guillaume Abrioux d5dca5087a tests: add 'all_in_one' scenario
Add new scenario 'all_in_one' in order to catch more collocated related
issues.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3e7dbb4b16)
2020-01-27 17:54:39 -05:00
Dimitri Savineau 0abea70e29 filestore-to-bluestore: fix osd_auto_discovery
When osd_auto_discovery is set, we need to refresh the
ansible_devices fact after the filestore OSD purge,
otherwise the devices fact won't be populated.
Also remove the gpt header on ceph_disk_osds_devices because
the devices list is empty at this point for osd_auto_discovery.
Also add the bool filter where needed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bb3eae0c80)
2020-01-22 10:06:17 +01:00
Dimitri Savineau e4965e9ea9 filestore-to-bluestore: --destroy with raw devices
We still need --destroy when using a raw device otherwise we won't be
able to recreate the lvm stack on that device with bluestore.

Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd
 stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  /dev/sdd: physical volume not initialized.
--> Was unable to complete a new OSD, will rollback changes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f995b079a6)
2020-01-21 18:26:55 +01:00
Guillaume Abrioux fc7212b192 tests: add time command in vagrant_up.sh
monitor how long it takes to get all VMs up and running

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 16bcef4f28)
2020-01-10 17:41:27 +01:00
Guillaume Abrioux 2c96155c32 tests: retry to fire up VMs on vagrant failure
Add a script to retry several times to fire up VMs to avoid vagrant
failures.
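
The retry idea can be sketched as follows. This is a minimal illustration, not the script added by this commit: the `retry` helper, its parameters and the attempt count are hypothetical.

```python
import time


def retry(action, attempts=3, delay=0.0):
    """Call action() until it returns True or the attempts run out.

    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # give the hypervisor time to settle before retrying
    return None


# Stand-in for a flaky command: fails twice, then succeeds.
# A real action could be:
#   lambda: subprocess.run(["vagrant", "up"]).returncode == 0
outcomes = iter([False, False, True])
succeeded_on = retry(lambda: next(outcomes), attempts=5)
print(succeeded_on)  # 3
```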

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1ecb3a9352)
2020-01-10 17:41:27 +01:00
Guillaume Abrioux 7c2918d684 tests: add a docker2podman scenario
This commit adds a new scenario in order to test docker-to-podman.yml
migration playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dc672e86ec)
2020-01-10 17:41:27 +01:00
Dimitri Savineau bd016960cf ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.
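
The selection between the two sub commands can be sketched as below. This is an illustration only: the function name and the `name`, `root` and `type` keys are assumptions, and only the `class` key is named in this message.

```python
def crush_rule_command(rule):
    """Build the argv for creating one crush rule from a rule dict."""
    base = ["ceph", "osd", "crush", "rule"]
    if rule.get("class"):
        # Device class given: create-replicated takes it as an extra argument.
        return base + ["create-replicated", rule["name"], rule["root"],
                       rule["type"], rule["class"]]
    # No class key: fall back to create-simple for backward compatibility.
    return base + ["create-simple", rule["name"], rule["root"], rule["type"]]


with_class = crush_rule_command(
    {"name": "fast", "root": "default", "type": "host", "class": "ssd"})
without_class = crush_rule_command(
    {"name": "legacy", "root": "default", "type": "host"})
print(" ".join(with_class))
print(" ".join(without_class))
```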

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ef2cb99f73)
2020-01-10 11:07:25 -05:00
Dimitri Savineau 09ccf22052 tests: use community repository
We don't need to use dev repository on stable branches.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-09 21:39:23 +01:00
Dimitri Savineau d625cefbac tests: use ceph iscsi stable repository
The ceph iscsi repository was still set to dev (shaman) instead of
using the stable ceph-iscsi repository.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 19:29:59 +01:00
Guillaume Abrioux 78799ecf55 tests: add filestore_to_bluestore job
This commit adds a new job in order to test the
filestore-to-bluestore.yml infrastructure playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 40de34fb5e)
2019-12-11 16:37:21 +01:00
Dimitri Savineau acf2476f09 tests: reduce max_mds from 3 to 2
Having the max_mds value equal to the number of mds nodes generates a
warning in the ceph cluster status:

cluster:
id:     6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a
health: HEALTH_WARN
        insufficient standby MDS daemons available
(...)
services:
  mds:     cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}

Let's use 2 active and 1 standby mds.
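
The arithmetic behind the warning, as a small sketch. It assumes every mds daemon is up and each active rank uses exactly one daemon; the helper name is hypothetical.

```python
def standby_mds(mds_daemons, max_mds):
    """Daemons left over as standbys once max_mds ranks are active."""
    return max(mds_daemons - max_mds, 0)


print(standby_mds(3, 3))  # 0 standbys: triggers the warning
print(standby_mds(3, 2))  # 1 standby
```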

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4a6d19dae2)
2019-12-04 17:49:33 -05:00
Guillaume Abrioux 99cdcf9d29 tests: add coverage on purge playbook
This commit adds a playbook to be played before we run the purge playbook.
It first creates an rbd image, then maps an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit db77fbda15)
2019-11-14 10:49:38 -05:00
Dimitri Savineau 54365389f8 tests/requirements: bump testinfra and pytest
Starting with testinfra 3.1, the ansible ssh connections use the ssh
backend instead of paramiko, along with persistent connections.
pytest 4.6 is the last release to support python 2.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 02df2ab5ea)
2019-11-04 12:58:37 -05:00
Dimitri Savineau 359abedbb1 move library/plugins tests files under tests dir
To avoid unnecessary ansible warnings during playbook execution we can
move the library and plugins test files under a different directory.

[WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as
it seems to be invalid:
cannot import name 'ipaddrs_in_ranges'

Closes: #4656

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6ce4fde820)
2019-10-28 15:54:31 +01:00
Guillaume Abrioux 2fd51f8097 tests: use osd ids instead of device name in ooo_collocation
On master, it no longer makes sense to use the device name; we should
use the osd id instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b5a61fe2e3)
2019-10-23 17:17:24 +02:00
Guillaume Abrioux cd397590f6 tests: fix keyring creation in ooo_collocation
This commit removes the backslash in the allow command parameter. It was
only needed before the ceph_key module integration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 384161edcd)
2019-10-23 17:17:24 +02:00
Dimitri Savineau bb2f139a1d tests: update container tag for ooo_collocation
It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3c2840da03)
2019-10-23 17:17:24 +02:00
Dimitri Savineau f8b84ce108 tests: fix the size on the second data LV
This commit replaces the pv/vg/lv commands run through the ansible command
module with the lvg and lvol modules.
It also fixes the size of the second data LV because we were only using
50% of the remaining space instead of 100%.

With a 50G device, the result was:
  - data-lv1 was 25G
  - data-lv2 was 12.5G
Instead of:
  - data-lv1 was 25G
  - data-lv2 was 25G
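
The numbers above can be reproduced with the sketch below. It assumes the first LV takes 50% of the device and the second is sized as a percentage of the space left afterwards; the variable names are illustrative.

```python
device_gb = 50.0

data_lv1 = device_gb * 0.50          # 50% of the whole device
remaining = device_gb - data_lv1     # space left for the second LV

data_lv2_before = remaining * 0.50   # old behaviour: 50% of the remaining space
data_lv2_after = remaining * 1.00    # fixed behaviour: 100% of the remaining space

print(data_lv1, data_lv2_before, data_lv2_after)  # 25.0 12.5 25.0
```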

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2c03c6fcd3)
2019-10-18 17:26:18 -04:00
Guillaume Abrioux 9bc7f8a7d7 tests: add multimds coverage
This commit makes the all_daemons scenario deploy 3 mds in order to
cover the multimds case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 25b98b2ce3)
2019-10-18 22:09:04 +02:00
Dimitri Savineau c6f1ef893d tests: reduce handler mon and osd delay
We don't need a high handler delay in the CI, so reduce it to
10 seconds.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 04ec1ad3cc)
2019-10-18 22:09:04 +02:00
Dimitri Savineau 8117ed34d4 Remove validate action and notario dependency
The current ceph-validate role is using both validate action and fail
module tasks to validate the ceph configuration.
The validate action is based on the notario python library. When one of
the notario validations fails, a python stack trace is reported to the
ansible task. This output isn't understandable by users.

This patch removes the validate action and the notario dependency. The
validation is now done with the ansible fail module only.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0f978d969b)
2019-10-15 10:21:54 -04:00
Dimitri Savineau 067aa3aabd tests: fix rgw multisite vagrant variables
The secondary vagrant variables didn't have the grafana vm variable
set, which creates a vagrant error:

There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.

This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.

[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information

It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.

Finally, change the IP addresses due to the Vagrant refactor done in
commit 778c51a

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 010158ff84)
2019-10-04 16:48:00 -04:00
Guillaume Abrioux 4f806786da tests: remove debug log verbosity
This was added for debugging purpose.
It's generating very large log output, let's remove this now.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 01f6dd52b3)
2019-09-30 09:16:33 +02:00
Guillaume Abrioux 99e6807f51 tests: pin jinja2 version
ensure we get the latest jinja2 version.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 006df148d0)
2019-09-26 16:21:54 +02:00
Guillaume Abrioux b1e61be9c6 tests: set copy_admin_key at group_vars level
Setting it at the extra vars level prevents setting it per node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5bb6a4da42)
2019-09-26 16:21:54 +02:00
Dimitri Savineau 0d55eeba79 tests: use a single grafana node on podman
We don't use multiple grafana nodes for the moment on the other
scenarios and I don't think this is supposed to be working.
We often see grafana failures on that scenario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 825045f6b4)
2019-08-28 17:48:12 +00:00
Dimitri Savineau 6a5308fa7f tests/shrink_rgw: Disable dashboard
The shrink_rgw scenario was merged just after the PR that enables the
ceph dashboard by default.
So right now the shrink_rgw scenario doesn't have nodes in the grafana
group and fails.
We just need to set dashboard_enabled to false.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 867583d5dd)
2019-07-31 15:25:15 -04:00
Rishabh Dave 06c0a06122 tests/functional: add a test for shrink-rgw.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 236b081a3a)

# Conflicts:
#	tox.ini
2019-07-31 15:25:15 -04:00
Guillaume Abrioux d2ef85b615 tests: add more memory in podman job
Typical error :

```
fatal: [mon1 -> mon0]: FAILED! => changed=true
  cmd:
  - podman
  - exec
  - ceph-mon-mon0
  - ceph
  - config
  - set
  - mgr
  - mgr/dashboard/ssl
  - 'false'
  delta: '0:00:00.644870'
  end: '2019-07-30 10:17:32.715639'
  msg: non-zero return code
  rc: 1
  start: '2019-07-30 10:17:32.070769'
  stderr: |-
    Traceback (most recent call last):
      File "/usr/bin/ceph", line 140, in <module>
        import rados
    ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory
    Error: exit status 1
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

Let's add more memory to get around this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0f620b2584)
2019-07-30 15:08:46 +02:00
Guillaume Abrioux d7d661d5d7 tests: deploy dashboard on mons
There are no dedicated nodes for mgr, so let's use monitor nodes.
The mgr0 instance spawned isn't used, so if this node is part of the
inventory for this scenario, testinfra will complain because there's no
ceph.conf on this node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d649e00893)
2019-07-30 15:08:46 +02:00
Guillaume Abrioux 432257b6dd tests: test dashboard deployment with podman scenario
This commit adds a grafana-server section in order to test dashboard
deployment with podman.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c2fd337d9)
2019-07-29 15:46:58 +02:00
Guillaume Abrioux 93826e061d dashboard: enable dashboard by default
This commit enables dashboard deployment by default.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb1b5b3251)

# Conflicts:
#	tox-dashboard.ini
2019-07-29 15:46:58 +02:00
Guillaume Abrioux 12839a3f66 tests: bump nfs-ganesha version
in stable-4.0, nfs-ganesha 2.8 should be used.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-23 15:02:19 +02:00
Dimitri Savineau c0abaec019 tests/dashboard: use the dedicated grafana node
The Vagrant dashboard scenario creates a dedicated grafana node but
it was not used in the ansible inventory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a9a1f633a9)
2019-07-19 20:33:42 +00:00
Rishabh Dave 41a4ded2b5 tests/functional: add a test for shrink-rbdmirror.yml
Add a new functional test that deploys a Ceph cluster with three nodes for
MON, OSD and RBD Mirror and then runs shrink-rbdmirror.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f80521f773)

# Conflicts:
#	tox.ini
2019-07-16 15:02:49 +02:00
Rishabh Dave 1b6d8f9b45 tests/functional: add a test for shrink-mgr.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5c95c34d4b)

# Conflicts:
#	tox.ini
2019-07-09 15:00:56 +00:00
Rishabh Dave e213163b63 tests/functional: add a test for shrink-mds.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 324b3b4a6c)

# Conflicts:
#	tox.ini
2019-07-09 12:07:47 +02:00
Mike Christie db65b82a42 igw: Update tests to use ceph-iscsi package
gateway_ip_list is deprecated and is only used when using the old
ceph-iscsi-config/cli packages that are no longer being developed
(GH repos are archived). Because ceph-iscsi-config/cli is no longer
being worked on, this modifies the tests to stress the ceph-iscsi
based installs.

Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit 1e64efc2f0)
2019-07-04 00:04:04 +00:00
Mike Christie f180eccb84 igw: drop gateway_ip_list for container setups
The gateway_ip_list is not used in container setups, so drop it
for that case.

Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit b7b2213be1)
2019-07-04 00:04:04 +00:00
Guillaume Abrioux 3ae024f404 tests: clean nfs_ganesha variables
- clean some leftover.
- move nfs_ganesha_[stable|dev] in group_vars so dev_setup.yml can modify them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 45041f52fd)
2019-06-26 13:13:11 +02:00
Guillaume Abrioux a328ead69a tests: test nfs-ganesha deployment
Add back the nfs-ganesha deployment testing which was removed because of
broken dependencies.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 013ae62177)
2019-06-26 13:13:11 +02:00
Guillaume Abrioux 27dbac7396 tests: deploy nfs-ganesha in container-all_daemons
This commit brings back the nfs-ganesha testing in containerized
deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9201674b5b)
2019-06-24 13:20:50 +02:00
Dimitri Savineau aa197f77fc remove ceph restapi references
The ceph restapi configuration was only available until Luminous
release so we don't need those leftovers for nautilus+.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit da8b7ab7fb)
2019-06-20 15:15:10 -04:00
Rishabh Dave c51e0b51d2 align cephfs pool creation
The definitions of cephfs pools should match openstack pools.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
(cherry picked from commit 67071c3169)
2019-06-18 09:17:13 +02:00
Guillaume Abrioux 82ab98326c tests: increase docker pull timeout
CI is facing issues where docker pull reaches the timeout. Let's increase
it to avoid CI failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1019e3b3dc)
2019-06-14 17:51:32 +00:00
Guillaume Abrioux 6805eb3184 iscsi: assign application (rbd) to pool 'rbd'
If we don't assign the rbd application tag on this pool,
the cluster will get into a `HEALTH_WARN` state like the following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4cf17a6fdd)
2019-06-13 14:51:19 -04:00
fmount 138fa19ccf Fix units and add ability to have a dedicated instance
A few fixes on systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer to the grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables ended up undefined. This commit adds the
ability to check if the group exists and create, accordingly,
different job groups in the prometheus template.

Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 069076bbfd)
2019-06-12 11:48:12 +02:00
L3D 1daca1ba83 ansible: use 'bool' filter on boolean conditionals
When running ceph-ansible there are a lot of ``[DEPRECATION WARNING]`` messages like this:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

``| bool`` is now appended to many of the affected variables.

In some places the coding style changed from ``variable|bool`` to ``variable | bool`` *(with spaces around the pipe)*.
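
Why an explicit conversion matters can be sketched in plain Python. The `to_bool` helper below is a hypothetical approximation written for illustration, not Ansible's actual filter code, and the variable value is an assumed example of a setting that arrives as a string.

```python
# Hypothetical approximation of a "bool"-style conversion, for illustration only.
TRUTHY = {"yes", "on", "1", "true"}


def to_bool(value):
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.lower() in TRUTHY
    return value == 1


containerized_deployment = "false"  # assumed: value supplied as a string

print(bool(containerized_deployment))     # True: any non-empty string is truthy
print(to_bool(containerized_deployment))  # False: the text itself is interpreted
```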

Closes: #4022

Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec)
2019-06-07 16:05:51 +02:00
Guillaume Abrioux 3b40380870 tests: test podman against atomic os instead rhel8
The rhel8 image used is an outdated beta version. It is not worth
maintaining this image upstream, since it's possible to test podman with a
newer version of the centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a78fb209b1)
2019-06-04 22:09:27 +00:00
Guillaume Abrioux 769e0d2f5c tests: add retries on failing tests in testinfra
This commit adds `pytest-rerunfailures` in requirements.txt so we can
retry failing tests in testinfra to avoid false positives (e.g. a service
sometimes takes too long to start).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4708b7615f)
2019-05-22 15:24:57 -04:00