Since 306ce82 we explicitly fail when there's no mgr node present in the
inventory.
fatal: [mon0]: FAILED! => {
"changed": false
}
MSG:
Please add a mgr host to your inventory.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add a script that retries firing up the VMs several times in order to
avoid Vagrant failures.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1ecb3a9352)
Add a new scenario 'all_in_one' in order to catch more collocation-related
issues.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3e7dbb4b16)
This commit adds a playbook to be played before we run the purge playbook.
It first creates an rbd image and then maps an rbd device on client0 so
the purge playbook will have to unmap it.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit db77fbda15)
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated subcommand.
If the class key isn't specified then we use the create-simple subcommand
for backward compatibility.
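A minimal group_vars sketch of what such a rule dict could look like (key names
assumed, values illustrative):
```
crush_rules:
  - name: replicated_hdd_rule
    root: default
    type: host
    class: hdd    # class key present -> 'ceph osd crush rule create-replicated'
  - name: replicated_rule
    root: default
    type: host    # no class key -> falls back to 'create-simple'
```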
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ef2cb99f73)
This commit reintroduces testing against ceph-disk deployed OSDs.
In stable-3.2, which is the version most commonly used by customers
(downstream PoV), a lot of OSDs are still deployed using ceph-disk.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit adds support for the ceph-iscsi stable repository when
ceph_repository is set to community, instead of always using the devel
repositories.
We still use the devel repositories for rtslib and tcmu-runner in
both cases (dev and community).
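For reference, a minimal sketch of the group_vars settings this applies to
(standard ceph-ansible variables):
```
ceph_origin: repository
ceph_repository: community   # pulls the ceph-iscsi stable packages;
                             # rtslib and tcmu-runner still come from devel
```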
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Having the max_mds value equal to the number of mds nodes generates a
warning in the ceph cluster status:
cluster:
  id:     6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a
  health: HEALTH_WARN
          insufficient standby MDS daemons available
(...)
services:
  mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}
Let's use 2 active and 1 standby mds.
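A hedged sketch of the corresponding setting (assuming the mds_max_mds variable
drives it):
```
# with 3 mds nodes, keep one daemon as standby
mds_max_mds: 2
# equivalent at runtime: ceph fs set cephfs max_mds 2
```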
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4a6d19dae2)
This commit replaces the pv/vg/lv commands run via the ansible command
module with the lvg and lvol modules.
This also fixes the size of the second data LV, because we were only using
50% of the remaining space instead of 100%.
With a 50G device, the result was:
- data-lv1 was 25G
- data-lv2 was 12.5G
Instead of:
- data-lv1 was 25G
- data-lv2 was 25G
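A minimal sketch of the lvg/lvol usage with the 100%FREE fix (device, VG and LV
names are illustrative):
```
- name: create the volume group
  lvg:
    vg: test_group
    pvs: /dev/sdb

- name: create the first data LV with half of the VG
  lvol:
    vg: test_group
    lv: data-lv1
    size: 50%VG

- name: create the second data LV with all the remaining space
  lvol:
    vg: test_group
    lv: data-lv2
    size: 100%FREE
```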
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2c03c6fcd3)
The current ceph-validate role uses both the validate action and fail
module tasks to validate the ceph configuration.
The validate action is based on the notario python library. When one of
the notario validations fails, a python stack trace is reported to the
ansible task. This output isn't understandable by users.
This patch removes the validate action and the notario dependency. The
validation is now done using only the fail ansible module.
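A minimal sketch of the fail-module style of validation (the checked variable is
only an example):
```
- name: validate osd_objectstore
  fail:
    msg: "osd_objectstore must be either 'bluestore' or 'filestore'"
  when: osd_objectstore not in ['bluestore', 'filestore']
```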
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The secondary vagrant variables didn't have the grafana vm variable
set, which created a Vagrant error.
There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.
This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we see warnings
from ansible and some tasks fail.
[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information
It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.
Finally, change the IP addresses due to the Vagrant refactor done in
commit 778c51a.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 010158ff84)
5b29144 changed the mgr node to a dedicated node instead of the first
monitor node.
But the change didn't update the switch-to-containers inventory, which
causes this playbook to fail.
Also update the ubuntu inventory to have the same configuration.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Let's deploy the mgr on a dedicated node.
The update job is failing on the stable-4.0 branch since there's a
mismatch between the two inventories.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The ooo-collocation scenario was still using an old container image and
didn't match the requirements of the latest stable-3.2 code. We need to use
at least the v3.2.5 container image.
Also update the OSD tests to reflect the changes introduced by commit
bedc0ab, because the OSD systemd unit script no longer uses the device
name.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
nfs-ganesha v2.5 and v2.6 have hit EOL. Install nfs-ganesha v2.7
stable, which is currently being maintained.
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit dfff89ce67)
This was removed because of broken repositories which made the CI
fail. That isn't the case anymore, so let's add it back.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If we don't assign the rbd application tag on this pool,
the cluster will report a `HEALTH_WARN` state like the following:
```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
application not enabled on pool 'rbd'
```
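A minimal sketch of the fix, assuming it is done with a plain command task run
against a monitor:
```
- name: assign the rbd application tag to the rbd pool
  command: ceph osd pool application enable rbd rbd
```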
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4cf17a6fdd)
The current lvm_osds scenario only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore, with or without
dmcrypt, on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean for the
dmcrypt validation via notario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 52b9f3fb28)
ceph-volume didn't work when the devices were passed by path.
Since it now supports it, let's allow this feature in ceph-ansible.
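A hedged example of what this allows in group_vars (the by-path value is purely
illustrative):
```
devices:
  - /dev/sdb
  - /dev/disk/by-path/pci-0000:00:01.1-ata-1.0
```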
Closes: #3812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f2c45dfd3)
It's useful to have logs in debug mode enabled in order to have
more information for developers.
Also reindent the json file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d25af1b872)
We don't need multiple copies of ceph-override.json. We already have a
symlink to all_daemons/ceph-override.json, so we can do the same for all
scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a19054be18)
Looks like newer versions of pytest-xdist require pytest>=4.4.0.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ba0a95211c)
As of testinfra 2.0.0, the binary name is `py.test`.
But let's pin the version to 1.19.0.
Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit.
Since we don't have the bandwidth ATM for this, it's better to simply
keep testing with testinfra 1.19.0.
Note that I've replaced all `testinfra` occurrences with `py.test` anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b42250332a)
** configuration seems to be for filestore:
[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes
** Removing `radosgw_interface: eth1` to resolve:
The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'
The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.
The offending line appears to be:
- name: set_fact _radosgw_address to radosgw_interface - ipv4
^ here
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 50255b9640)
Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages and no reliable repository, we disable
nfs-ganesha temporarily.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6c3ef90ebe)
The `osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specific group of nodes, it
will fail when trying to access this variable since it wouldn't be
defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d4c0960f04)
We run an initial deployment with `osd_pool_default_size: 1` in
`ceph_conf_overrides`.
When re-running the playbook to test idempotency and handlers, we reset
`ceph_conf_overrides`; we must append the new value instead of just
overwriting it, otherwise this can lead to errors in the CI.
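For reference, a minimal sketch of the initial override (section and key named as
in ceph.conf):
```
ceph_conf_overrides:
  global:
    osd_pool_default_size: 1
```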
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f290e49df8)
Setting this to 1 makes the CI cover the related code in the playbook
without breaking the upgrade scenarios.
Those scenarios were broken because of the `TASK [waiting for clean pgs...]`
check in rolling_update.yml: since the pool size for `cephfs_metadata` and
`cephfs_data` is updated to `2` in `ceph-override.json` and there are not
enough OSDs to honor this size, some PGs are degraded and make the
mentioned check fail.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3ac6619fb9)
Adding more memory to the VMs for the rgw_multisite scenarios could avoid this
error I have recently hit in the CI (it is worth setting 1024MB since there are
only 2 nodes in those scenarios); a sketch of the corresponding change follows
the traceback below:
```
fatal: [osd0]: FAILED! => {
"changed": false,
"cmd": [
"docker",
"run",
"--rm",
"--entrypoint",
"/usr/bin/ceph",
"docker.io/ceph/daemon:latest-luminous",
"--version"
],
"delta": "0:00:04.799084",
"end": "2018-10-29 17:10:39.136602",
"rc": 1,
"start": "2018-10-29 17:10:34.337518"
}
STDERR:
Traceback (most recent call last):
File "/usr/bin/ceph", line 125, in <module>
import rados
ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory
```
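A minimal sketch of the corresponding vagrant_variables.yml change (key name
assumed):
```
# bump the VM memory for the two-node rgw_multisite scenarios
memory: 1024
```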
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a playbook that will upload a file on the master and then try to get
info from the secondary node; this way we can check whether the replication
is OK.
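A minimal sketch of such a check, assuming s3cmd is available on the client nodes
and using illustrative host, bucket and file names:
```
- hosts: rgw0
  gather_facts: false
  tasks:
    - name: create a bucket and upload an object on the master zone
      command: "{{ item }}"
      with_items:
        - s3cmd mb s3://replication-test
        - s3cmd put /etc/hosts s3://replication-test/hosts

- hosts: rgw1
  gather_facts: false
  tasks:
    - name: check the object is visible from the secondary zone
      command: s3cmd ls s3://replication-test/hosts
```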
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>