Commit Graph

258 Commits (473673ab414418c3fa20e6833174958a3f238891)

Author SHA1 Message Date
Sébastien Han a53aa9e8b4 ci: new osd scenarios
This commit add new osd scenarios, it aims to simplify the CI setup and
brings a better coverage on the OSD scenarios.
We decided to differentiate between filestore and bluestore, thinking
ahead when filestore won't be supported anymore.
So we now have two classes of tests:

* Filestore
* Bluestore

In each of those classes we have container and non-container.
Then for each we test the following:

* collocated
* collocated dmcrypt
* non-collocated
* non-collocated dmcrypt
* auto discovery collocated
* auto discovery collocated dmcrypt

This gives us a nice coverage and also reduces the footprint on the CI.
We are now up to 4 scenarios, each containing 6 OSD VMs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-18 09:26:06 +02:00
Guillaume Abrioux 7ee9aa94b5 Merge pull request #1963 from ceph/pull-in-para
site-docker.yml try to fetch images in //
2017-10-13 19:35:11 +02:00
Sébastien Han 71d819620c mds: fix fs pool creation
1. add the variables to docker_collocation
2. trigger the check when a MDS is part of the inventory file, not when
we run on an MDS...

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 16:03:04 +02:00
Sébastien Han 90ce4276ca ci: use a container client VM
The client won't run on centos7 anymore but on Atomic host just like the
rest of the daemons.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 15:26:03 +02:00
Sébastien Han 3e058bff06 ci: reboot with ansible instead of vagrant reload
vagrant is serialized and takes a lot of time compare to simple reboot.
See the benchmarks below for 3 VMs:

[leseb@rick docker]$ time ANSIBLE_SSH_ARGS="-F
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/vagrant_ssh_config"  ansible-playbook -i /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/hosts reboot.yml

PLAY [mons]
****************************************************************************************************************************************************************************************************

TASK [Gathering Facts]
*****************************************************************************************************************************************************************************************
ok: [mon1]
ok: [mon2]
ok: [mon0]

TASK [restart machine]
*****************************************************************************************************************************************************************************************
changed: [mon2]
changed: [mon1]
changed: [mon0]

TASK [wait for server to boot]
*********************************************************************************************************************************************************************************
ok: [mon2 -> localhost]
ok: [mon0 -> localhost]
ok: [mon1 -> localhost]

TASK [uptime]
**************************************************************************************************************************************************************************************************
changed: [mon2]
changed: [mon0]
changed: [mon1]

PLAY RECAP
*****************************************************************************************************************************************************************************************************
mon0                       : ok=4    changed=2    unreachable=0
failed=0
mon1                       : ok=4    changed=2    unreachable=0
failed=0
mon2                       : ok=4    changed=2    unreachable=0
failed=0

real    0m35.112s
user    0m5.737s
sys     0m1.849s

[leseb@rick docker]$ time vagrant reload
==> mon0: Halting domain...
==> mon0: Starting domain.
==> mon0: Waiting for domain to get an IP address...
==> mon0: Waiting for SSH to become available...
==> mon0: Creating shared folders metadata...
==> mon0: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon0: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon0: flag to force provisioning. Provisioners marked to run always
will still run.
==> mon1: Halting domain...
==> mon1: Starting domain.
==> mon1: Waiting for domain to get an IP address...
==> mon1: Waiting for SSH to become available...
==> mon1: Creating shared folders metadata...
==> mon1: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon1: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon1: flag to force provisioning. Provisioners marked to run always
will still run.
==> mon2: Halting domain...
==> mon2: Starting domain.
==> mon2: Waiting for domain to get an IP address...
==> mon2: Waiting for SSH to become available...
==> mon2: Creating shared folders metadata...
==> mon2: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon2: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon2: flag to force provisioning. Provisioners marked to run always
will still run.

real    1m31.850s
user    0m7.387s
sys     0m0.796s

Reboot via Ansible: 0m35.112s
Reboot via vagrant: 1m31.850s

We save 1/3 time.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 09:04:26 +02:00
Guillaume Abrioux 17623a2157 Merge pull request #2036 from ceph/cephfs-pool
mds: precisely define cephfs pool
2017-10-12 17:47:10 +02:00
Sébastien Han 6bd152d555 Merge pull request #2037 from major/remove-always-run
Avoid deprecated always_run
2017-10-12 17:15:28 +02:00
Sébastien Han b49f9bda21 mds: precisely define cephfs pool
We now have a variable called ceph_pools that is mandatory when
deploying a MDS.
It's a dictionnary that contains a pool name and a PG count. PG count is
mandatory and must be set, the playbook will fail otherwise.

Closes: https://github.com/ceph/ceph-ansible/issues/2017
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-12 15:56:04 +02:00
Major Hayden 620fb37dd4
Avoid deprecated always_run
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:

    [DEPRECATION WARNING]: always_run is deprecated.

This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
2017-10-12 08:29:44 -05:00
Guillaume Abrioux a179e312fd tests: add missing override for collocation scenario
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-12 14:43:25 +02:00
Guillaume Abrioux a2880e6345 tests: rbd/rgw adapt testinfra for jewel
- the rbd-mirror unit systemd name is not the same when running jewel vs
luminous.
- servicemap is not available on jewel.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-12 00:06:08 +02:00
Guillaume Abrioux a1ea6e7f59 tests: adapt current testing for collocation scenario
Since we introduced collocation testing scenario, we need to adapt
current tests to this new scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-09 17:25:45 +02:00
Sébastien Han 6d7b73fa91 ci: re-add osd_pool_default_size to 1 with the override
If we don't do this the client will create pools with a replica 3 since
osd_pool_default_size was gone in ceph-override.json. This was making
switch_to_containers failing.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-09 17:25:45 +02:00
Sébastien Han abb8c374cf ci: use by-id instead of by-path
by-id relies on the disk WWID which is more reliable then by-path
(pointing to the PCI info)

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:39:09 +02:00
Guillaume Abrioux 680ec8758e tests: skip tests for nfs nodes when release is jewel
nfs nodes are not deployed on jewel so we should skip the tests on them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-06 12:49:39 +02:00
Sébastien Han b6b24a5ca9 iscsi: fix wrong group name for iscsi
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 17:25:32 +02:00
Guillaume Abrioux 53a69640c9 tests: disable shared folder
Shared folder is not required for tests.
We should avoid hitting the error :
```
uninitialized constant VagrantPlugins::ProviderLibvirt::Action::ShareFolders
```
Also, disabling it might reduce the needed time in certains cases for the VMs
to be started.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 15:07:38 +02:00
Guillaume Abrioux 6aa7050acd tests: make all subnet uniq per scenario
If two environments are using the same subnet, we will get trouble
because of ips addresses conflicts.
This commit ensures each scenario has a uniq subnet for both public and cluster
network so we can setup several test environment at a time on a same hypervisor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 15:07:38 +02:00
Guillaume Abrioux 635111bf6a tests: add ceph-override.json for ubuntu/cluster
in addition to 18e2ab4d this commit adds the same file for ubuntu
testing scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 12:59:29 +02:00
Guillaume Abrioux 4135091c98 tests: fix broken osd test for xenial_cluster
the path `/dev/disk/by-path/pci-0000:00:01.1-ata-1.0` doesn't exist.
it has to be changed to `/dev/disk/by-path/pci-0000:00:01.1-ata-1`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 11:03:41 +02:00
Guillaume Abrioux cdb5023d84 tests: fix brokens tests for mds
5968cf0 broke the test on mds because of leftover.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 16:48:23 +02:00
Guillaume Abrioux 2c4258a0fd Refact code for set_osd_pool_default_*
This commit refacts the code regarding all `set_osd_pool_default_*`
related tasks by avoiding usage of useless `set_fact` to determine
whether a key is present in `ceph_conf_overrides`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 15:40:10 +02:00
Sébastien Han 5968cf09b1 ci: add collocation scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-04 11:19:12 +02:00
Sébastien Han 3bd341f6c0 osd: container use id instead of dev name
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1494127
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:44:00 +02:00
Sébastien Han 18e2ab4d07 test: add handler support
Add idempotency and handler test.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:44:00 +02:00
Sébastien Han 39ee25637b test: add test for device with 'by-path'
We now test devices to be passed like:
/dev/disk/by-path/pci-0000:00:01.1-ata-1.0

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:43:57 +02:00
Sébastien Han b4bec52442 tests: add tests for rgw-nfs
rgw-nfs is part of servicemap so we should use it to make sure the
process is up and running.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-29 02:38:24 +02:00
Sébastien Han 77fc8ba87f Merge pull request #1931 from ceph/re-enable-iscsi
iscsi: re-enable the scenario
2017-09-28 19:44:52 +02:00
Sébastien Han 67c78da056 iscsi: re-enable the scenario
CentOS 7.4 vagrant box is now available so re-enabling this scenario.
For more info:
https://seven.centos.org/2017/09/updated-centos-vagrant-images-available-v1708-01/

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-28 18:46:28 +02:00
Ali Maredia ae18cf24d2 test: add test making sure rgw http endpoints are enabled
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-25 14:41:18 -04:00
Sébastien Han d5bfc6f85d mgr: always bootstrap mgr right after the mon
If we don't bootstrap the mgr after the mon and the osds handler are
called, we will never be able to reach a clean state since the pgs
stats are handled by the mgr. This also happens when doing daemon
collocation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-22 17:26:28 +02:00
Sébastien Han c7d9838ad4 tests: add nfs container test
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-21 11:07:14 +02:00
Guillaume Abrioux a069a6fe63 tests: temporary disable `test_nfs_rgw_fsal_export`
This test doesn't work at the moment and need to be fixed.
Disabling it temporary to avoid errors in the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-21 09:56:37 +02:00
Guillaume Abrioux f4fc3bbfea ci: add precise tests to valide daemons are up
Add daemon health check for rgw, mds, mgr, rbd mirror.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-21 09:56:37 +02:00
Sébastien Han 66d41f342d Merge pull request #1889 from ceph/client-containers
client: ability to create keys and pool with no ceph binaries
2017-09-18 17:27:32 +02:00
Sébastien Han 85d73e3be2 client: ability to create keys and pool with no cpeh binaries
On a container env, machines don't have any ceph binaries so we need to
use a container to run the commands.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-18 14:41:52 +02:00
Andrew Schoen 5eff7e24b0 Merge pull request #1890 from ceph/lvm-setup
tests: fix lvm_setup.yml for purge_cluster.yml
2017-09-14 11:38:13 -05:00
Sébastien Han 2f51f0de28 Merge pull request #1880 from ceph/wip-rgw-nfs
nfs: configure RGW FSAL to start up correctly
2017-09-13 14:20:14 -06:00
Andrew Schoen 57f2ad7ef1 tests: delete journal partitions in lvm_setup.yml
Delete these before creating them incase they are left around in a purge
cluster testing scenario. The purge-cluster.yml playbook does not
currently remove partitions used for journals.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-09-13 15:02:54 -05:00
Sébastien Han f67b47d056 Merge pull request #1882 from ceph/multi-journal
osd: drop support for device partition
2017-09-13 11:43:48 -06:00
Sébastien Han aa364264cd resync ceph-iscsi-gw with old upstream
Taken from https://github.com/pcuzner/ceph-iscsi-ansible/tree/tcmu-fixes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1454945 and
https://bugzilla.redhat.com/show_bug.cgi?id=1484083
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-12 18:06:10 -06:00
Sébastien Han fdf924401f osd: drop support for device partition
We have been struggling with this, it's still broken and breaking other
things too now.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490283
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-12 17:42:07 -06:00
Ali Maredia 52efe92a87 nfs: configure RGW FSAL to start up correctly
- Add RGW keyring to nfs node
- Add RGW section to ganesha.conf
- Add RGW section to ceph.conf onf nfs node

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-12 16:27:16 -04:00
Andrew Schoen 61357c8e20 tests: no need to create a filesystem on /dev/sdc1 for lvm tests
The partition only needs created and given a gpt label so that a
PARTUUID will exist on the partition.

This task also makes the purge_lvm_osds scenario fail on the second
deployment after purging.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-09-12 15:14:21 -05:00
Sébastien Han 7054615551 ci: deploy rbd mirror
Deploy rbd mirorr in cluster scenario

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-09 01:17:10 +02:00
Sébastien Han 4f325c7ebe ci: remove scenario bluestore_docker_cluster
We don't need to bootstrap a full cluster to bootstrap bluestore. We
have individual scenarios for that.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 19:33:24 +02:00
Ali Maredia f8171e8b4a nfs: rename host to have ceph- prefix
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 11:38:05 -04:00
Ali Maredia f3e2235b3a nfs-ganesha: add config overrides section
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 11:37:58 -04:00
Ali Maredia c907ec41ae nfs: add automated testing for nfs-ganesha roles
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 09:14:01 -04:00
Sébastien Han 3753e6cfa7 ceph-osd: fix autodetection activation
Prior to this patch this activation sequence for autodetection was
always skipped because we were asking to activate on device without
partitions, which doesn't make sense.

We also fix the way we lookup for a device, since the data partition is
always numbered 1, we take the min element of the dict.

Closes: https://github.com/ceph/ceph-ansible/issues/1782
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-07 17:47:37 +02:00