Commit Graph

228 Commits (d14723d5b47be85f05e3a8febb04aeddbf62b5c9)

Author SHA1 Message Date
Dimitri Savineau 33f74771d2 switch2container: disable ceph-osd enabled-runtime
When deploying the ceph OSD via the packages then the ceph-osd@.service
unit is configured as enabled-runtime.
This means that each ceph-osd service will inherit from that state.
The enabled-runtime systemd state doesn't survive after a reboot.
For non containerized deployment the OSD are still starting after a
reboot because there's the ceph-volume@.service and/or ceph-osd.target
units that are doing the job.

$ systemctl list-unit-files|egrep '^ceph-(volume|osd)'|column -t
ceph-osd@.service     enabled-runtime
ceph-volume@.service  enabled
ceph-osd.target       enabled

When switching to containerized deployment we are stopping/disabling
ceph-osd@XX.servive, ceph-volume and ceph.target and then removing the
systemd unit files.
But the new systemd units for containerized ceph-osd service will still
inherit from ceph-osd@.service unit file.

As a consequence, if an OSD host is rebooting after the playbook execution
then the ceph-osd service won't come back because they aren't enabled at
boot.

This patch also adds a reboot and testinfra run after running the switch
to container playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881288

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fa2bb3af86)
2020-11-12 21:08:32 +01:00
Guillaume Abrioux b13f0d12e7 tests: reboot and test idempotency on collocation
test reboot and idempotency on collocation scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f83f798206)
2020-10-06 09:21:58 -04:00
Guillaume Abrioux 2001039c0e tests: migrate to quay.ceph.io registry
in order to avoid docker.io rate limiting

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 218aedaab6)
2020-09-10 21:37:06 +02:00
Guillaume Abrioux 2095df3397 tests: add docker hub authentication in jobs
This commit makes all jobs authenticating to docker hub in order to
avoid the rate limit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 40307f810c)
2020-07-15 09:44:51 +02:00
Guillaume Abrioux 66bdd585da test: set sitepackages=false in tox
Otherwise it might try to use the system installed version of ansible
when there's one available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9acb5e6d)
2020-05-14 11:35:08 -04:00
Dimitri Savineau a472064cb8 tox: replace testinfra by pytest for add-mgrs
The add-mgrs scenario is still using the testinfra command instead of
pytest so the tests exectution are failling.

ERROR: InvocationError for command could not find executable testinfra

This also adds the missing --ssh-config option to testinfra.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 92f538f1af)
2020-04-03 10:43:21 -04:00
Dimitri Savineau ca35aa355a tox: update shrink scenario configuration
The shrink scenarios don't need the docker variables (except for OSD).
Removing pytest for shrink-mgr.
Adding environment variables for xxx_to_kill ansible variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2f4413f5ce)
2020-03-03 15:19:45 +01:00
Guillaume Abrioux d5dca5087a tests: add 'all_in_one' scenario
Add new scenario 'all_in_one' in order to catch more collocated related
issues.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3e7dbb4b16)
2020-01-27 17:54:39 -05:00
Guillaume Abrioux 51596e8b32 tests: use main playbook for add_osds job
This commit replaces the playbook used for add_osds job given
accordingly to the add-osd.yml playbook removal

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fef1cd4c4b)
2020-01-14 09:12:34 -05:00
Guillaume Abrioux 2c96155c32 tests: retry to fire up VMs on vagrant failure
Add a script to retry several times to fire up VMs to avoid vagrant
failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1ecb3a9352)
2020-01-10 17:41:27 +01:00
Guillaume Abrioux 1c03d2b526 purge: rename playbook (container)
Since we now support podman, let's rename the playbook so it's more
generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7bc7e3669d)
2019-12-04 09:12:41 -05:00
Guillaume Abrioux 99cdcf9d29 tests: add coverage on purge playbook
This commit adds a playbook to be played before we run purge playbook,
it first creates an rbd image then map an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit db77fbda15)
2019-11-14 10:49:38 -05:00
Dimitri Savineau bb2f139a1d tests: update container tag for ooo_collocation
It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3c2840da03)
2019-10-23 17:17:24 +02:00
Guillaume Abrioux 4e42d085f7 tests: update tox due to pipeline removal
This commit reflects the recent changes in ceph/ceph-build#1406

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bcaf8cedee)
2019-10-08 14:03:14 -04:00
Dimitri Savineau 067aa3aabd tests: fix rgw multisite vagrant variables
The secondary vagrant variables didn't have the grafana vm variable
set which create an vagrant error.

There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.

This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.

[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information

It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.

Finally changing the IP addresses due to the Vagrant refact done in the
commit 778c51a

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 010158ff84)
2019-10-04 16:48:00 -04:00
Harald Jensås 5fea830414 Replace ipaddr() with ips_in_ranges()
This change implements a filter_plugin that is used in the
ceph-facts, ceph-validate roles and infrastucture-playbooks.
The new filter plugin will return a list of all IP address
that reside in any one of the given IP ranges. The new filter
replaces the use of the ipaddr filter.

ceph.conf already support a comma separated list of CIDRs
for the public_network and cluster_network options.

Changes: [1] and [2] introduced a regression in ceph-ansible
where public_network can no longer be a comma separated list
of cidrs.

With this change a comma separated list of subnet CIDRs can
also be used for monitor_address_block and radosgw_address_block.

[1] commit: d67230b2a2
[2] commit: 20e4852888

Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030
Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283

Closes: #4333
Please backport to stable-4.0

Signed-off-by: Harald Jensås <hjensas@redhat.com>
(cherry picked from commit e695efcaf7)
2019-09-27 17:49:46 +02:00
Guillaume Abrioux b1e61be9c6 tests: set copy_admin_key at group_vars level
setting it at extra vars level prevent from setting it per node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5bb6a4da42)
2019-09-26 16:21:54 +02:00
Rishabh Dave 06c0a06122 tests/functional: add a test for shrink-rgw.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 236b081a3a)

# Conflicts:
#	tox.ini
2019-07-31 15:25:15 -04:00
Guillaume Abrioux 2c64166eac tests: remove useless setting
this setting is not needed here since we explicitely set it for
container and non container context.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 87b173d022)
2019-07-17 09:04:20 +00:00
Rishabh Dave 41a4ded2b5 tests/functional: add a test for shrink-rbdmirror.yml
Add a new functional test that deploys Ceph cluster with three nodes for
MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f80521f773)

# Conflicts:
#	tox.ini
2019-07-16 15:02:49 +02:00
Rishabh Dave 1b6d8f9b45 tests/functional: add a test for shrink-mgr.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5c95c34d4b)

# Conflicts:
#	tox.ini
2019-07-09 15:00:56 +00:00
Rishabh Dave e213163b63 tests/functional: add a test for shrink-mds.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 324b3b4a6c)

# Conflicts:
#	tox.ini
2019-07-09 12:07:47 +02:00
Guillaume Abrioux 61213d77d9 tests: wait 30sec before running testinfra
adding back a sleep 30s after nodes have rebooted before running
testinfra.
This was removed accidentally by d5be83e

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ca84a5359f)
2019-07-03 14:17:33 -04:00
Dimitri Savineau b1f8518ef9 tests: Update ansible ssh_args variable
Because we're using vagrant, a ssh config file will be created for
each nodes with options like user, host, port, identity, etc...
But via tox we're override ANSIBLE_SSH_ARGS to use this file. This
remove the default value set in ansible.cfg.

Also adding PreferredAuthentications=publickey because CentOS/RHEL
servers are configured with GSSAPIAuthenticationis enabled for ssh
server forcing the client to make a PTR DNS query.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 34f9d51178)
2019-06-17 16:45:38 +02:00
Guillaume Abrioux 3b40380870 tests: test podman against atomic os instead rhel8
the rhel8 image used is an outdated beta version, it is not worth it to
maintain this image upstream, since it's possible to test podman with a
newer version of centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a78fb209b1)
2019-06-04 22:09:27 +00:00
Guillaume Abrioux 769e0d2f5c tests: add retries on failing tests in testinfra
This commit adds `pytest-rerunfailures` in requirements.txt so we can
retry failing test in testinfra to avoid false positive. (eg: sometimes it
can happen for some reason a service takes too much time to start)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4708b7615f)
2019-05-22 15:24:57 -04:00
Dimitri Savineau 63f99fb965 tests: update testinfra release
In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding ssh-config option to py.test.
Also bumping the pytest and xdist version.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit de147469d7)
2019-05-21 09:17:46 +02:00
Dimitri Savineau 32e966db73 tox: Don't copy infrastructure playbook
Since a1a871c we don't need to copy the infrastructure playbooks
under the ceph-ansible root directory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0f89a3f7a5)
2019-05-20 09:38:35 +02:00
Dimitri Savineau 975987d043 tox: Refact lvm_osds scenario
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 52b9f3fb28)
2019-05-09 13:11:33 +02:00
Rishabh Dave ab50486dd9 allow adding a manager to a deployed cluster
Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d2cfd8b780)
2019-05-07 15:12:29 +02:00
Rishabh Dave a56bbb46fe allow adding a RGW to already deployed cluster
Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f201222447)

Conflicts:
	tox.ini
replaced "dev" and "nautilus" during cherry-pick.
2019-05-07 14:03:29 +02:00
Rishabh Dave ae9fa7ca09 allow adding a RBD mirror to already deployed cluster
Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 221b2b4988)

Conflicts:
	tox.ini
"dev" was to replaced by "nautilus" in "envlist"
2019-05-07 11:37:53 +02:00
Dimitri Savineau 748605293e tox: Remove update scenario reference
update scenario is now handled by tox-update.ini file so we shoudn't
have update reference in tox.ini file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8ab6a3391f)
2019-04-24 15:12:24 -04:00
Andrew Schoen ba7d4c4954 tests: adds the migrate_ceph_disk_to_ceph_volume scenario
This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 399a821439)
2019-04-18 19:12:13 +02:00
Rishabh Dave 72309b49fe allow adding a monitor to a deployed cluster
Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d5967af7fb)
2019-04-16 11:14:21 +02:00
Dimitri Savineau c7d0fbbd19 tests: Add debug to ceph-override.json
It's usefull to have logs in debug mode enabled in order to have
more information for developpers.
Also reindent to json file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d25af1b872)
2019-04-11 17:47:21 +02:00
Rishabh Dave c60915733a allow adding a MDS to already deployed cluster
Add a tox scenario that adds an new MDS node as a part of already
deployed Ceph cluster and deploys MDS there.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c0dfa9b61a)
2019-04-09 16:48:59 +02:00
Guillaume Abrioux cd120baaba tests: add back testinfra testing
136bfe0 removed testinfra testing on all scenario excepted all_daemons

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8d106c2c58)
2019-04-04 13:11:33 +00:00
Guillaume Abrioux 3fd4354aaa tests: switch rhel-container-podman to nautilus
in stable-4.0 this should be set to nautilus.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-03 11:27:46 +02:00
Guillaume Abrioux 655ac5eb93 tests: test idempotency only on all_daemons job
there's no need to test this on all scenarios.
testing idempotency on all_daemons should be enough and allow us to save
precious resources for the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 136bfe096c)
2019-04-03 11:27:46 +02:00
Dimitri Savineau 47d6e505a0 tox: Set nautilus as default release
On stable-4.0 branch we don't want to use dev setup but stable
release (nautilus).
Also update the container image tag to reflect this change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-01 17:00:36 +02:00
Dimitri Savineau bd0869cd01 tox: Fix container purge jobs
On containerized CI jobs the playbook executed is purge-cluster.yml
but it should be set to purge-docker-cluster.yml

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-26 21:36:56 +00:00
Guillaume Abrioux b42250332a tests: pin testinfra version
As of testinfra 2.0.0, the binary name is `py.test`.

But let's pin the version to 1.19.0.
Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit.
Since we don't have the bandwidth ATM for this, it's better to simply
keep testing with testinfra 1.19.0.

Note that I've replaced all `testinfra` occurences by `py.test` anyway.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 14:44:27 +01:00
Guillaume Abrioux d5be83e504 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 13:14:09 +00:00
Guillaume Abrioux 3d66e913f4 tests: switch ubuntu image to bionic
I didn't use the `ceph/ubuntu-bionic` image because it's broken at the
time of writing this commit. I'll switch back to `ceph/ubuntu-bionic` as
soon as it will be fixed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-22 09:20:16 +01:00
Guillaume Abrioux 7f7f3769b3 main: add a retry/until for python installation
Add a retry/until in raw_install_python.yml to avoid unexpected
repository failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux 85296c25c4 tests: add more verbosity when running testinfra
Could be useful when troubleshooting testinfra/pytest issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-19 16:40:08 +01:00
Guillaume Abrioux 4a1bafdc21 tests: use memory backend for cache fact
force ansible to generate facts for each run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-14 10:34:37 +00:00
Guillaume Abrioux ad3a489847 tests: rename switch_to_containers
rename switch_to_containers scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-13 09:42:27 +01:00
Guillaume Abrioux 0d72fe9b30 tests: add a rhel8 scenario testing
test upstream with rhel8 vagrant image

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00