ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	779a4a6d71	tests: don't install s3cmd on containerized setup The s3cmd package should only be installed on non containerized deployment. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 11:27:52 +01:00
Guillaume Abrioux	910fc61fdc	tests: remove legacy `osd_scenario` variable As of stable-4.0 most of these references aren't needed anymore. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-04 10:05:33 +01:00
Guillaume Abrioux	641729357e	tests: add external_clients scenario This commit adds a new 'external ceph clients' scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-31 12:02:15 +01:00
Guillaume Abrioux	c040199c8f	tests: set dashboard|grafana_admin_password Set these 2 variables in all test scenarios where `dashboard_enabled` is `True` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-29 08:45:34 +01:00
Guillaume Abrioux	3e7dbb4b16	tests: add 'all_in_one' scenario Add new scenario 'all_in_one' in order to catch more collocated related issues. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-27 15:30:45 -05:00
Dimitri Savineau	bb3eae0c80	filestore-to-bluestore: fix osd_auto_discovery When osd_auto_discovery is set then we need to refresh the ansible_devices fact after the filestore OSD purge otherwise the devices fact won't be populated. Also remove the gpt header on ceph_disk_osds_devices because the devices list is empty at this point for osd_auto_discovery. Adding the bool filter when needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-22 09:36:09 +01:00
Dimitri Savineau	f995b079a6	filestore-to-bluestore: --destroy with raw devices We still need --destroy when using a raw device otherwise we won't be able to recreate the lvm stack on that device with bluestore. Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' /dev/sdd: physical volume not initialized. --> Was unable to complete a new OSD, will rollback changes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-21 11:37:39 -05:00
Dimitri Savineau	a5385e1048	vagrant: temp workaround for CentOS 8 cloud image The CentOS cloud infrastructure storing the vagrant CentOS 8 image changed the directory path and removed the old 8.0 image so the vagrant box add centos/8 fails returning a 404 http error. As a workaround we can pull the image from CentOS instead of letting vagrant do the resolution. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 17:52:35 +01:00
Dimitri Savineau	3900527e16	tests/setup: update mount options on EL 8 The nobarrier mount flag doesn't exist anymore on XFS in the EL 8 kernel. That's why the task wasn't working on those systems. We can still use the other options instead of skipping the task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-11 05:33:01 +01:00
Guillaume Abrioux	dc672e86ec	tests: add a docker2podman scenario This commit adds a new scenario in order to test docker-to-podman.yml migration playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Guillaume Abrioux	4f2baaab8c	tests: disable nfs testing nfs-ganesha makes the CI failing because of issue related to SELinux. See: - https://bugzilla.redhat.com/show_bug.cgi?id=1788563 - https://github.com/nfs-ganesha/nfs-ganesha/issues/527 Until we can get this fixed, let's disable nfs-ganesha testing temporarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-08 11:13:46 +01:00
Dimitri Savineau	7b3e6b932c	tests/functional: change docker to podman Some docker commands were hardcoded in tests playbooks and some conditions were not taking care of the containerized_deployment variable but only the atomic fact. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-08 11:13:46 +01:00
Guillaume Abrioux	217d95abb2	common: add centos8 support Ceph octopus only supports CentOS 8. This commit adds CentOS 8 support: - update vagrant image in tox configurations. - add CentOS 8 repository for el8 dependencies. - CentOS 8 container engine is podman (same than RHEL 8). - don't use the epel mirror on sepia because it's epel7 only. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-08 11:13:46 +01:00
Guillaume Abrioux	40de34fb5e	tests: add filestore_to_bluestore job This commit adds a new job in order to test the filestore-to-bluestore.yml infrastructure playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-11 09:04:41 -05:00
Dimitri Savineau	4a6d19dae2	tests: reduce max_mds from 3 to 2 Having max_mds value equals to the number of mds nodes generates a warning in the ceph cluster status: cluster: id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a' health: HEALTH_WARN' insufficient standby MDS daemons available' (...) services: mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}' Let's use 2 active and 1 standby mds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-04 14:07:29 -05:00
Dimitri Savineau	3f29b243ea	tests: fix cluster health status The current ceph cluster health is in warning state: health: HEALTH_WARN 13 pool(s) have no replicas configured 2 pool(s) have non-power-of-two pg_num Because we're using only 1 replica then we need to disable the redundancy check. The pool pg num should be a power of two number (like 16). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-27 16:20:17 +01:00
Dimitri Savineau	ef2cb99f73	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 16:25:46 +01:00
Guillaume Abrioux	16bcef4f28	tests: add time command in vagrant_up.sh monitor how long it takes to get all VMs up and running Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-08 15:47:46 +01:00
Guillaume Abrioux	db77fbda15	tests: add coverage on purge playbook This commit adds a playbook to be played before we run purge playbook, it first creates an rbd image then map an rbd device on client0 so the purge playbook will try to unmap it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-08 09:06:11 -05:00
Dimitri Savineau	02df2ab5ea	tests/requirements: bump testinfra and pytest The ansible ssh connections are now using the ssh backend instead of paramiko starting testinfra 3.1 and persistent connections too. pytest 4.6 is the latest release to be supported by python 2. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-04 09:09:49 -05:00
Dimitri Savineau	6ce4fde820	move library/plugins tests files under tests dir To avoid unnecessary ansible warnings during playbook execution we can move the library and plugins test files under a different directory. [WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as it seems to be invalid: cannot import name 'ipaddrs_in_ranges' Closes: #4656 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-28 09:23:17 +01:00
Guillaume Abrioux	b5a61fe2e3	tests: use osd ids instead of device name in ooo_collocation on master, it doesn't make sense anymore to use device name, we should use osd id instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-22 13:45:19 +02:00
Guillaume Abrioux	384161edcd	tests: fix keyring creation in ooo_collocation This commit removes the backslash in allow command parameter, this was needed before the ceph_key module integration. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-22 13:45:19 +02:00
Dimitri Savineau	3c2840da03	tests: update container tag for ooo_collocation It doesn't make sense to test the old 3.0.x container images with nautilus+ ceph releases. Also disable the dashboard deployment and switch to bluestore backend. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-22 13:45:19 +02:00
Guillaume Abrioux	25b98b2ce3	tests: add multimds coverage This commit makes the all_daemons scenario deploying 3 mds in order to cover the multimds case. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-18 13:43:13 -04:00
Dimitri Savineau	2c03c6fcd3	tests: fix the size on the second data LV The commit replaces the pv/vg/lv commands used with the ansible command module by the lvg and lvol modules. This also fixes the size of the second data LV because we were only using 50% of the remaining space instead of 100%. With a 50G device, the result was: - data-lv1 was 25G - data-lv2 was 12.5G Instead of: - data-lv1 was 25G - data-lv2 was 25G Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-17 15:49:15 -04:00
Dimitri Savineau	0f978d969b	Remove validate action and notario dependency The current ceph-validate role is using both validate action and fail module tasks to validate the ceph configuration. The validate action is based on the notario python library. When one of the notario validation fails then a python stack trace is reported to the ansible task. This output isn't understandable by users. This patch removes the validate action and the notario dependency. The validation is now done with only fail ansible module. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-15 11:34:49 +02:00
Dimitri Savineau	04ec1ad3cc	tests: reduce handler mon and osd delay We don't need to have high handler delay in the CI so reducing to 10 seconds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-09 09:08:20 +02:00
Dimitri Savineau	010158ff84	tests: fix rgw multisite vagrant variables The secondary vagrant variables didn't have the grafana vm variable set which create an vagrant error. There was an error loading a Vagrantfile. The file being loaded and the error message are shown below. This is usually caused by an invalid or undefined variable. This patch also changes the ssh-extra-args parameter to ssh-common-args to get the same values for ssh/sftp/scp. Otherwise we can see warnings from ansible and some tasks are failing. [WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1 to see detailed information It also updates the ssh-common-args value for the rgw-multisite scenario to reflect the ANSIBLE_SSH_ARGS environment variable value. Finally changing the IP addresses due to the Vagrant refact done in the commit `778c51a` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-04 15:12:50 -04:00
Guillaume Abrioux	01f6dd52b3	tests: remove debug log verbosity This was added for debugging purpose. It's generating very large log output, let's remove this now. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-28 11:20:49 +02:00
Guillaume Abrioux	006df148d0	tests: pin jinja2 version ensure we get the latest jinja2 version. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-26 11:35:24 +02:00
Guillaume Abrioux	5bb6a4da42	tests: set copy_admin_key at group_vars level setting it at extra vars level prevent from setting it per node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-26 11:35:24 +02:00
Guillaume Abrioux	da094ac5ee	tests: do not rely on pg_num to validate rgw_tuning_pools Since the pg_autoscaler has been enabled recently in ceph, this check should stick to validate the requested pools are well created only. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-18 14:05:23 +02:00
Dimitri Savineau	825045f6b4	tests: use a single grafana node on podman We don't use multiple grafana nodes for the moment on the others scenarios and I don't think this is supposed to be working. We can often see failure on grafana on that scenario. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-28 11:42:48 -04:00
Guillaume Abrioux	05686509f3	tests: update test_mgr_is_up() the data structure has changed in octopus: ``` "mgrmap": { "available": true, "modules": [ "dashboard", "prometheus" ], "num_standbys": 0, "services": { "prometheus": "http://mgr0:9283/" } }, ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Dimitri Savineau	31bd5e08a6	Revert "tests: disable nfs-ganesha deployment" This reverts commit `83940e624b`. Because nfs-ganesha@master (2.9-dev) build has been fixed by [1] then we can test nfs-ganesha in the CI for master/octopus. [1] https://github.com/ceph/ceph-build/pull/1346 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-07 10:40:43 +02:00
Dimitri Savineau	867583d5dd	tests/shrink_rgw: Disable dashboard The shrink_rgw scenario was merged just after the PR that enabled the ceph dashboard by default. So right now the shrink_rgw scenario doesn't have nodes in the grafana group and fails. We just need to set dashboard_enabled to false. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-31 14:53:05 -04:00
Guillaume Abrioux	0f620b2584	tests: add more memory in podman job Typical error : ``` fatal: [mon1 -> mon0]: FAILED! => changed=true cmd: - podman - exec - ceph-mon-mon0 - ceph - config - set - mgr - mgr/dashboard/ssl - 'false' delta: '0:00:00.644870' end: '2019-07-30 10:17:32.715639' msg: non-zero return code rc: 1 start: '2019-07-30 10:17:32.070769' stderr: |- Traceback (most recent call last): File "/usr/bin/ceph", line 140, in <module> import rados ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory Error: exit status 1 stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` Let's add more memory to get around this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-30 13:52:44 +02:00
Guillaume Abrioux	d649e00893	tests: deploy dashboard on mons there's no dedicated nodes for mgr, let's use monitor nodes. The mgr0 instance spawned isn't used, so if this node is part of the inventory for this scenario, testinfra will complain because there's no ceph.conf on this node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-30 13:52:44 +02:00
Rishabh Dave	236b081a3a	tests/functional: add a test for shrink-rgw.yml Add a new functional test that deploys a Ceph cluster with three nodes for MON, OSD and RGW and then runs shrink-rgw.yml to test it. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-07-30 08:45:57 +02:00
Guillaume Abrioux	3c2fd337d9	tests: test dashboard deployment with podman scenario This commit adds a grafana-server section in order to test dashboard deployment with podman. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	fb1b5b3251	dashboard: enable dashboard by default This commit enables dashboard deployment by default. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Dimitri Savineau	07c6695d16	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-26 16:09:23 -04:00
Guillaume Abrioux	83940e624b	tests: disable nfs-ganesha deployment nfs-ganesha repositories @ dev are broken, this commit disables the nfs-ganesha deployment so the CI isn't stuck. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-24 14:13:06 +02:00
Dimitri Savineau	a9a1f633a9	tests/dashboard: use the dedicated grafana node The Vagrant dashboard scenario creates a dedicated grafana node but was not use in the ansible inventory. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-18 07:22:13 +02:00
Rishabh Dave	f80521f773	tests/functional: add a test for shrink-rbdmirror.yml Add a new functional test that deploys Ceph cluster with three nodes for MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-07-15 11:22:17 +02:00
Rishabh Dave	5c95c34d4b	tests/functional: add a test for shrink-mgr.yml Add a new functional test that deploys a Ceph cluster with three nodes for MON, OSD and MGR and then runs shrink-mgr.yml to test it. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-07-09 14:37:02 +02:00
Rishabh Dave	324b3b4a6c	tests/functional: add a test for shrink-mds.yml Add a new functional test that deploys a Ceph cluster with three nodes for MON, OSD and MDS and then runs shrink-mds.yml to test it. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-07-08 11:05:28 +02:00
Mike Christie	1e64efc2f0	igw: Update tests to use ceph-iscsi package gateway_ip_list is deprecated and is only used when using the old ceph-iscsi-config/cli packages that are no longer being developed (GH repos are archived). Because ceph-iscsi-config/cli is no longer being worked on, this modifies the tests to stress the ceph-iscsi based installs. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	b7b2213be1	igw: drop gateway_ip_list for container setups The gateway_ip_list is not used in container setups, so drop it for that case. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Guillaume Abrioux	45041f52fd	tests: clean nfs_ganesha variables - clean some leftover. - move nfs_ganesha_[stable|dev] in group_vars so dev_setup.yml can modify them. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-26 08:58:51 +02:00
Guillaume Abrioux	013ae62177	tests: test nfs-ganesha deployment Add back the nfs-ganesha deployment testing which was removed because of broken dependencies. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-26 08:58:51 +02:00
Guillaume Abrioux	9201674b5b	tests: deploy nfs-ganesha in container-all_daemons this commit bring back the nfs-ganesha testing in containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-24 10:05:11 +02:00
Dimitri Savineau	da8b7ab7fb	remove ceph restapi references The ceph restapi configuration was only available until Luminous release so we don't need those leftovers for nautilus+. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-18 09:13:19 +02:00
Guillaume Abrioux	1019e3b3dc	tests: increase docker pull timeout CI is facing issues where docker pull reach the timeout, let's increase this to avoid CI failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-14 16:23:24 +02:00
Rishabh Dave	67071c3169	align cephfs pool creation The definitions of cephfs pools should match openstack pools. Signed-off-by: Rishabh Dave <ridave@redhat.com> Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>	2019-06-13 09:44:05 +02:00
Guillaume Abrioux	4cf17a6fdd	iscsi: assign application (rbd) to pool 'rbd' if we don't assign the rbd application tag on this pool, the cluster will get `HEALTH_WARN` state like following: ``` HEALTH_WARN application not enabled on 1 pool(s) POOL_APP_NOT_ENABLED application not enabled on 1 pool(s) application not enabled on pool 'rbd' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-13 07:35:39 +02:00
Guillaume Abrioux	9e4e692c61	tests: remove unused variable `e MGR_DASHBOARD=0` isn't needed anymore here, let's remove this legacy. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-12 10:41:01 -04:00
Guillaume Abrioux	8dd774a99b	tests: update docker image tag used in ooo job ceph-ansible@master isn't intended to deploy luminous. Let's use latest-master on ceph-ansible@master branch Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-12 10:41:01 -04:00
fmount	069076bbfd	Fix units and add ability to have a dedicated instance Few fixes on systemd unit templates for node_exporter and alertmanager container parameters. Added the ability to use a dedicated instance to deploy the dashboard components (prometheus and grafana). This commit also introduces the grafana_group_name variable to refer grafana group and keep consistency with the other groups. During the integration with TripleO some grafana/prometheus template variables resulted undefined. This commit adds the ability to check if the group exist and create, accordingly, different job groups in prometheus template. Signed-off-by: fmount <fpantano@redhat.com>	2019-06-10 18:18:46 +02:00
L3D	ab54fe20ec	ansible: use 'bool' filter on boolean conditionals By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de>	2019-06-06 10:21:17 +02:00
Guillaume Abrioux	a78fb209b1	tests: test podman against atomic os instead rhel8 the rhel8 image used is an outdated beta version, it is not worth it to maintain this image upstream, since it's possible to test podman with a newer version of centos/atomic-host image. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-04 11:32:41 -04:00
Guillaume Abrioux	4708b7615f	tests: add retries on failing tests in testinfra This commit adds `pytest-rerunfailures` in requirements.txt so we can retry failing test in testinfra to avoid false positive. (eg: sometimes it can happen for some reason a service takes too much time to start) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-22 09:47:36 -04:00
Dimitri Savineau	de147469d7	tests: update testinfra release In order to support ansible 2.8 with testinfra we need to use the latest release (3.0.x). Adding ssh-config option to py.test. Also bumping the pytest and xdist version. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-05-20 13:04:58 +02:00
Guillaume Abrioux	72d8315299	switch to ansible 2.8 - remove private attribute with import_role. - update documentation. - update rpm spec requirement. - fix MagicMock python import in unit tests. Closes: #3765 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-20 13:04:58 +02:00
Guillaume Abrioux	17634fc3df	tests: add dashboard scenario testing This commit add a new scenario to test the dashboard deployment via ceph-ansible. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-16 16:39:13 +02:00
Guillaume Abrioux	2798774e96	tests: fix a typo in dev_setup.yml `c907ec41ae` introduced a typo. This commit fixes it. ``` [WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace). ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-15 11:33:26 +02:00
Dimitri Savineau	52b9f3fb28	tox: Refact lvm_osds scenario The current lvm_osds only tests filestore on one OSD node. We also have bs_lvm_osds to test bluestore and encryption. Let's use only one scenario to test filestore/bluestore and with or without dmcrypt on four OSD nodes. Also use validate_dmcrypt_bool_value instead of types.boolean on dmcrypt validation via notario. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-05-09 09:38:20 +02:00
Rishabh Dave	d2cfd8b780	allow adding a manager to a deployed cluster Add a playbook that deploys manager on a new node and adds that node to the already deployed Ceph cluster. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-05-07 14:13:06 +02:00
Rishabh Dave	f201222447	allow adding a RGW to already deployed cluster Add a tox scenario that adds a new RGW node as a part of already deployed Ceph cluster and deploys RGW there. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-05-07 12:36:16 +02:00
Rishabh Dave	221b2b4988	allow adding a RBD mirror to already deployed cluster Add a tox scenario that adds a new RBD mirror node as a part of already deployed Ceph cluster and deploys RBD mirror there. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-05-07 09:45:20 +02:00
Dimitri Savineau	564ec9c992	tests: group and parametrize tests Instead of creating a dedicated test and using the same testinfra module we can group them into a single test to avoid multiple ansible connections and testinfra module execution. This patch also adds parametrize pytest decorator when possible. Finally fixing some flake minor issue. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-24 10:03:25 +02:00
Rishabh Dave	739a662c80	improve coding style Keywords requiring only one item shouldn't express it by creating a list with single item. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-23 15:37:07 +02:00
Andrew Schoen	399a821439	tests: adds the migrate_ceph_disk_to_ceph_volume scenario This test deploys a luminous cluster with ceph-disk created osds and then upgrades to nautilus and migrates those osds to ceph-volume. The nodes are then rebooted and cluster state verified. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-04-18 10:55:11 +02:00
Dimitri Savineau	f601549a8a	test_osds: remove scenario leftover Since there's only one scenario available we don't need lvm_scenario and no_lvm_scenario. Also add missing assert for ceph-volume tests. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-17 17:28:12 +02:00
Dimitri Savineau	9f99f539f7	tests/functional/setup: change mount options In the CI jobs we can change the mount options of the main partition to avoid extra operations on disk. Adding jmespath to tests/requirements.txt due to the json_query filter usage. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-17 08:23:07 +02:00
Dimitri Savineau	c84a74592a	test_mons: test mon listening on port 3300 Since nautilus and msgr2 the monitors also bind on port 3300 in addition of 6789. This patch updates test_mons to reflect that change. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-17 08:19:48 +02:00
Rishabh Dave	d5967af7fb	allow adding a monitor to a deployed cluster Add a playbook that deploys a new monitor on a new node, adds that node to the Ceph cluster and the monitor to the quorum and updates the ceph configuration file on OSD nodes. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-15 10:00:50 +02:00
Guillaume Abrioux	83e84c6a4a	tests: remove test_journal_collocation.py in OSD testing this test is related to ceph-disk which is dropped as of stable-4.0 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-11 11:57:02 -04:00
Guillaume Abrioux	4d35e9eeed	osd: remove variable osd_scenario As of stable-4.0, the only valid scenario is `lvm`. Thus, this makes this variable useless. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-11 11:57:02 -04:00
Dimitri Savineau	d25af1b872	tests: Add debug to ceph-override.json It's useful to have logs in debug mode enabled in order to have more information for developers. Also reindent the json file. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-11 15:15:41 +02:00
Dimitri Savineau	a19054be18	tests/functional: use ceph-override.json symlink We don't need to have multiple ceph-override.json copies. We currently already have symlink to all_daemons/ceph-override.json so we can do it for all scenarios. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-11 15:15:41 +02:00
Guillaume Abrioux	7e0adca7a4	osds: allow passing devices by path ceph-volume didn't work when the devices were passed by path. Since it now supports it, let's allow this feature in ceph-ansible Closes: #3812 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-10 13:22:30 -04:00
Guillaume Abrioux	609c538848	tests: update ceph_release_num in conftest.py add nautilus and octopus releases. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-09 09:55:03 +02:00
Rishabh Dave	c0dfa9b61a	allow adding a MDS to already deployed cluster Add a tox scenario that adds an new MDS node as a part of already deployed Ceph cluster and deploys MDS there. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-08 13:33:28 +02:00
Rishabh Dave	192fea0fec	add-osds: don't hardcode group names Instead of hardcoding group names, import ceph-defaults earlier. Also, rectify a minor mistake in vagrant_variables.yml for containerized version of add_osds. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-08 10:09:00 +02:00
Ali Maredia	37f46a8c5d	rgw multisite: add more than 1 rgw to the master or secondary zone Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2019-04-06 08:01:19 +02:00
Guillaume Abrioux	ba0a95211c	tests: pin pytest-xdist to 1.27.0 looks like newer version of pytest-xdist requires pytest>=4.4.0 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-04 11:12:21 +02:00
Guillaume Abrioux	1ecb3a9352	tests: retry to fire up VMs on vagrant failure Add a script to retry several times to fire up VMs to avoid vagrant failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Andrew Schoen <aschoen@redhat.com>	2019-04-03 08:38:06 +02:00
Dimitri Savineau	7b7f79171a	tests/functional: Use the ansible reboot module Ansible 2.7 introduces the reboot module so we don't need to use the shell/reboot + wait_for tasks. https://docs.ansible.com/ansible/latest/modules/reboot_module.html Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-03-27 08:30:50 +00:00
Rishabh Dave	62abe7068a	use os.path.join() correctly os.path.join adds the separator (i.e. '/') between the provided path components only if needed. Providing a single path component doesn't lead to any checks. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-03-14 22:35:12 +00:00
Guillaume Abrioux	3d3eee8f38	tests: add symlink for ubuntu hosts inventory otherwise a bunch of jobs will fail like following: ``` [WARNING]: Unable to parse /home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-ubuntu-container-stable-3.2-bluestore_lvm_osds/tests/functional/bs-lvm-osds/container/hosts-ubuntu as an inventory source [WARNING]: No inventory was parsed, only implicit localhost is available [WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-04 18:05:26 +01:00
Guillaume Abrioux	b42250332a	tests: pin testinfra version As of testinfra 2.0.0, the binary name is `py.test`. But let's pin the version to 1.19.0. Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit. Since we don't have the bandwidth ATM for this, it's better to simply keep testing with testinfra 1.19.0. Note that I've replaced all `testinfra` occurrences by `py.test` anyway. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-04 14:44:27 +01:00
Guillaume Abrioux	207fae38d4	tests: add lvm bluestore dmcrypt support Add coverage for container / non container lvm bluestore dmcrypt OSDs Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-28 12:01:18 +00:00
Guillaume Abrioux	4ab02d2cd1	tests: set ceph_origin and ceph_repository for non_container-collocation those variables are mandatory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-27 15:58:35 +00:00
Guillaume Abrioux	7fd92348bb	tests: add mgr node for all_daemons scenario add a monitor node to cover in the CI the case where mgrs and monitors are not collocated Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-26 13:19:06 +00:00
Guillaume Abrioux	fa13289c65	tests: fix network interfaces names in conftest.py Set network interfaces names according to the OS distribution in conftest.py Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-22 16:24:18 +01:00
Guillaume Abrioux	2ed203da61	Revert "tests: add ubuntu bionic support" This reverts commit `33c09af250`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-22 16:24:18 +01:00
Guillaume Abrioux	33c09af250	tests: add ubuntu bionic support This commit brings all modifications needed to test against ubuntu bionic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-22 09:20:16 +01:00
Guillaume Abrioux	f80e43a0d8	tests: share fixture instance across the whole module. there's no need to rerun this part of the code for each function test. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-19 16:40:08 +01:00
Guillaume Abrioux	187b2bc9d9	tests: avoid 'Cannot allocate memory' error in testinfra ``` ------------------------------ Captured log setup ------------------------------ display.py 174 INFO Wednesday 13 February 2019 15:54:15 +0000 (0:00:07.787) 0:02:11.607 **** ansible.py 61 INFO RUN Ansible('setup', None, {'check': True, 'become': False}): {'_ansible_no_log': False, '_ansible_parsed': False, '_ansible_verbose_override': True, 'changed': False, 'module_stderr': u'Connection to 192.168.121.87 closed.\r\n', 'module_stdout': u'bash: /bin/sh: Cannot allocate memory\r\n', 'msg': u'MODULE FAILURE\nSee stdout/stderr for the exact error', 'rc': 126} ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	ac4aded4aa	tests: fix ubuntu-container-all_daemons the public_network subnet used for this scenario was wrong. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	ac7f4b3a01	tests: increase amount of memory for all vms double the amount of memory from 512m to 1024m. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	ff5509295a	tests: remove useless test `test_mon_host_line_has_correct_value()` will cover this test in any case. It isn't worth having a dedicated test for this. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	efc051d17c	tests: update test_mon_host_line_has_correct_value() since msgr2 introduction, this test must be updated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	0d72fe9b30	tests: add a rhel8 scenario testing test upstream with rhel8 vagrant image Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Andrew Schoen	e0dcd9f2c7	tests: fix Vagrantfile symlink for lvm-auto-discovery tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	fc9502039d	tests: adds the lvm_auto_discovery container testing scenario This tests osd_auto_discovery: True, containerized_deployment: True and the lvm osd scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Guillaume Abrioux	c0ad91957c	tests: add missing hosts file for ubuntu testing we need it for dev-ubuntu-container-all_daemons job Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-31 08:19:17 +01:00
Guillaume Abrioux	fd8856b80a	tests: do not deploy iscsigw node on ubuntu iSCSI gateways can only be deployed on Red Hat OS family. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-31 08:19:17 +01:00
Guillaume Abrioux	02b18b15c0	tests: run lvm_setup.yml only when osd_scenario is lvm especially for ooo_collocation scenario which is still using ceph-disk testing. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-31 08:19:17 +01:00
Ramana Raja	dfff89ce67	Install nfs-ganesha stable v2.7 nfs-ganesha v2.5 and 2.6 have hit EOL. Install nfs-ganesha v2.7 stable that is currently being maintained. Signed-off-by: Ramana Raja <rraja@redhat.com>	2019-01-30 14:57:26 +01:00
Guillaume Abrioux	25d8198c5d	tests: do not play dev_setup on containerized deployment using `!` mark in tox.ini doesn't work on comma separated list. The idea here is to skip all containerized scenarios in dev_setup.yml and use the `!` for the update scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-29 14:12:05 +00:00
Guillaume Abrioux	312867af56	tests: fix rgw testinfra failure fix the wrong path used in various rgw testinfra tests. set `1` as default value for `radosgw_num_instances`: if `ansible_vars.get(radosgw_num_instances)` returns `None`, we can assume there's only 1 instance since it's the default value in ceph-defaults. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-25 11:11:16 +00:00
Noah Watkins	8a5530ee98	test: add missing test dependency [nwatkins@smash ceph-ansible]$ virtualenv env [nwatkins@smash ceph-ansible]$ env/bin/pip install -r tests/requirements.txt [nwatkins@smash ceph-ansible]$ env/bin/python -c "import mock" Traceback (most recent call last): File "<string>", line 1, in <module> ModuleNotFoundError: No module named 'mock' Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2019-01-24 16:34:13 +01:00
Guillaume Abrioux	973f316595	tests: fix symlink to Vagrantfile fix the symlink for Vagrantfile in rgw-multisite/container/secondary Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	5ccfa95b9a	tests: update dev_setup for nfs_ganesha_stable value since we now set this variable in inventory host, the regexp needs to be updated, the assignment operator is `=`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	d177711f6e	tests: rename all node let's name all nodes the same way to avoid confusion. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	e6205e0070	tests: change ceph_osd_docker_run_script_path for add_osds set `ceph_osd_docker_run_script_path` to /var/tmp for `add_osds` scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	7d705395fb	tests: play lvm_setup.yml on all scenarios We should play lvm_setup.yml on all scenarios except `lvm_batch`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	6293a98a0c	tests: reorganize directories layout This commit reorganizes the testing directory layout. The idea is to have more consistency with the names of scenario and their corresponding path, eg: non-container vs. container: each scenario has a subdirectory for container deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	fd91096215	tox: refact environment naming this commit refactors the way the environments are named by adding a factor `{non_container,container}`. This will avoid a lot of duplicate definitions in tox.ini and bring some consistency. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Guillaume Abrioux	a467fc1613	tests: require six==1.10.0 sometimes it can fail because the version of `six` package is prior to 1.10.0. This commit ensures the version is enforced. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 11:46:32 +01:00
Rishabh Dave	557d44451d	tests: fix identation issue in test_rgw.py The error was introduced by the commit `1ac94c048f`. Fixes: https://github.com/ceph/ceph-ansible/issues/3528 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-01-23 14:50:41 +01:00
guihecheng	1ac94c048f	rgw: add support for multiple rgw instances on a single host With this, we could have multiple rgw instances on a single host with a single run, without having to use rgw-standalone.yml, which does not seem able to bind ports separately. If you want to have multiple rgw instances, just change 'radosgw_instances' to the number you want, which defaults to 1. Not compatible with Multi-Site yet. Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>	2019-01-18 11:12:28 +01:00
Guillaume Abrioux	b94290af43	refact the 'raw' installation of python to avoid duplicating code in `site.yml.sample`, `site-docker.yml.sample` and `setup.yml`, let's isolate this part of the code and simply include it each time we need it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-16 10:16:11 +01:00
Alfredo Deza	9e4ec1a776	testinfra/osds double the amount of ports OSDs listen to Since msgr2 changes got merged, the OSDs in master (to be nautilus) will double the amount of ports they listen to. Signed-off-by: Alfredo Deza <adeza@redhat.com>	2019-01-08 13:40:34 +01:00
Guillaume Abrioux	d7e77012ef	retry on packages and repositories failures add register/until on all packaging related tasks to avoid non valid CI failure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-19 14:48:27 +00:00
Sébastien Han	14cd286e3a	test: disable nfs for containers Based on https://github.com/ceph/ceph-container/pull/1269 and given there are no stable packages or reliable repository, we disable nfs ganesha temporarily. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Sébastien Han	1b6b275229	test: remove leftover [mgrs] Since we now collocate mgrs and mons on the same machine we have to remove the mgrs section, they are not needed anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Sébastien Han	1c760904b0	site: collocated mon and mgr by default This will speed up the deployment and also deploy mon and mgr collocated just as recommended. This won't prevent you from adding more dedicated machines for mgr if needed. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	a502327e52	disable nfs scenario The packages are broken, so let's remove it until this is solved. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Guillaume Abrioux	5d05a09b03	tests: update default pg num and pool size for podman scenario bring the recent refact about `osd_pool_default_pg_num` and `osd_pool_default_size` into podman scenario as well. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 11:22:04 +00:00
Sébastien Han	4e5d862bb7	testinfra: linting Make flake8 happy on the testinfra files. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	dcc765d7c7	testinfra: add support for podman Since we are now testing on docker and podman our functional tests must reflect that. So now, if we detect the podman binary we will use it, otherwise we default to docker. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	a96e910114	Add new container scenario Test with podman instead of docker and also support for python 3 only. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Guillaume Abrioux	f290e49df8	tests: do not fully override previous ceph_conf_overrides We run an initial deployment with `osd_pool_default_size: 1` in `ceph_conf_overrides`. When re-running the playbook to test idempotency and handlers, we reset `ceph_conf_overrides`; we must append a new value instead of just overwriting it, otherwise this can lead to errors in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-26 18:22:20 +01:00
Guillaume Abrioux	5601af8de2	tests: change default pools size default pool size in our test should be explicitly set to 1 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	d4c0960f04	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	3ac6619fb9	tests: set pool size to 1 in ceph-override.json setting this to 1 makes the CI cover the related code in the playbook without breaking the upgrade scenarios. Those scenarios were broken because there is a check `TASK [waiting for clean pgs...]` in rolling_update.yml: since the pool size for `cephfs_metadata` and `cephfs_data` is updated to `2` in `ceph-override.json` and there are not enough OSDs to honor this size, some PGs are degraded and make the mentioned check fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Sébastien Han	e552026418	rbd-mirror: use the new rbd-mirror key Instead of using the old rbd key let's use the new rbd-mirror key to bootstrap the rbd-mirror daemon. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-09 12:45:52 +01:00
Noah Watkins	50255b9640	Fixup shrink_osd[_container] scenario config configuration seems to be for filestore: [ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes Removing `radosgw_interface: eth1` to resolve: The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1' The error appears to have been in '/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml': line 21, column 5, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact _radosgw_address to radosgw_interface - ipv4 ^ here Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2018-11-08 17:45:37 +01:00
Rishabh Dave	8edbda96df	use blocks directives to group tasks Using block directives simplifies the playbooks and makes them more readable. Fixes: https://github.com/ceph/ceph-ansible/issues/2835 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-31 09:37:43 +01:00
Guillaume Abrioux	62c314e2ba	tests: test master against ansible 2.7 Let's test ceph-ansible master against ansible 2.7 to catch early any potential issue with this ansible version. Closes: #3148 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 17:07:05 +01:00
Guillaume Abrioux	d8d3e55006	remove restapi role As of `mimic`, restapi is no longer available because of manager daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:19:13 +01:00
Guillaume Abrioux	f52344300a	tests: add more memory for rgw_multsite scenarios Adding more memory to VMs for rgw_multisite scenarios could avoid this error I have recently hit in the CI: (It is worth it to set 1024Mb since there are only 2 nodes in those scenarios.) ``` fatal: [osd0]: FAILED! => { "changed": false, "cmd": [ "docker", "run", "--rm", "--entrypoint", "/usr/bin/ceph", "docker.io/ceph/daemon:latest-luminous", "--version" ], "delta": "0:00:04.799084", "end": "2018-10-29 17:10:39.136602", "rc": 1, "start": "2018-10-29 17:10:34.337518" } STDERR: Traceback (most recent call last): File "/usr/bin/ceph", line 125, in <module> import rados ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	37970a5b3c	tests: add rgw_multisite functional test Add a playbook that will upload a file on the master then try to get info from the secondary node, this way we can check if the replication is ok. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	4d464c1003	rgw: add testing scenario for rgw multisite This will setup 2 cluster with rgw multisite enabled. First cluster will act as the 'master', the 2nd will be the secondary one. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Sébastien Han	22aed97266	testinfra: change test osds for containers We do not use @<device> anymore so we don't need to perform the readlink check anymore. Also we are making an exception for ooo which is still using ceph-disk. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 18:31:17 +01:00
Sébastien Han	1cdec4069a	test_osd: dynamically get the osd container Do not enforce the container name since this will fail when we have multiple VMs running OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	876f6ced74	test: convert all the tests to use lvm ceph-disk is now deprecated in ceph-ansible so let's convert all the ci tests to use lvm instead of ceph-disk. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	2fd7da12bb	test: remove ceph-disk CI tests Since we are removing the ceph-disk test from the ci in master then there is no need to have the functional tests in master anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Rishabh Dave	ee2d52d33d	allow custom pool size Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-22 16:00:21 +02:00
Guillaume Abrioux	c47aa2e83b	tests: remove unnecessary variables definition since we set `configure_firewall: true` in `ceph-defaults/defaults/main.yml` there is no need to explicitly set it in `centos7_cluster` and `docker_cluster` testing scenarios. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 15:12:45 +02:00
Guillaume Abrioux	1f9090884e	Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes" This approach doesn't work with all scenarios because it's comparing a local OSD number expected to a global OSD number found in the whole cluster. This reverts commit `b8ad35ceb9`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	cb35cac926	tests: set configure_firewall: true in centos7\|docker_cluster This way the CI will cover this part of the code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	b8ad35ceb9	tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes Let's get the osd tree from mons instead of osds. This way we don't have to predict an OSD container name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	b8418ebd17	add-osds: followup on `3632b26` Three fixes: - fix a typo in vagrant_variables that cause a networking issue for containerized scenario. - add containerized_deployment: true - remove a useless block of code: the fact docker_exec_cmd is set in ceph-defaults which is played right after. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	3632b26005	tests: add tests for day-2-operation playbook Adding testing scenarios for day-2-operation playbook. Steps: - deploys a cluster, - run testinfra, - test idempotency, - add a new osd node, - run testinfra Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 11:26:11 +00:00
Guillaume Abrioux	40b7747af7	remove jewel support As of now, we should no longer support Jewel in ceph-ansible. The latest ceph-ansible release supporting Jewel is `stable-3.1`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-12 23:38:17 +00:00
Sébastien Han	fa38b86cf8	test: fix docker test for lvm The CI is still running ceph-disk tests upstream. So until https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will pass anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-12 20:33:01 +00:00
Sébastien Han	31a0438cb2	ceph_volume: refactor This commit does a couple of things: * Avoid code duplication * Clarify the code * add more unit tests * add myself to the author of the module Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	d2ca24eca8	tests: do not install lvm2 on atomic host we need to detect whether we are running on atomic host to not try to install lvm2 package. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	90c66a5848	ci: test lvm in containerized Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	0735d39518	tests: osd adjust osd name Now we use id of the OSD instead of the device name. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	cc6f41f76a	tests: fix lvm2 setup issue not gathering fact causes `package` module to fail because it needs to detect which OS we are running on to select the right package manager. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-09 16:12:54 -04:00
Alfredo Deza	3e488e8298	tests: install lvm2 before setting up ceph-volume/LVM tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-10-09 13:48:50 -04:00
Andrew Schoen	a68c680225	tests: remove journal_size from lvm-batch testing scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-10-09 10:09:50 -04:00
Sébastien Han	9fe86c2268	test: use osd_objecstore default value Do not force filestore on our test but whatever is the default of osd_objecstore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-27 21:23:49 +00:00
Guillaume Abrioux	3285b47703	tests: add an RGW node on osd0 for ooo-collocation get more coverage by adding an RGW daemon collocated on osd0. We've missed a bug in the past which could have been caught earlier in the CI. Let's add this additional daemon in order to have a better coverage. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-24 14:35:25 +02:00
Guillaume Abrioux	3382c5226c	tests: fix monitor_address for shrink_osd scenario `b89cc1746` introduced a typo. This commit fixes it Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-13 18:14:01 +02:00
Andrew Schoen	b36f3e06b5	ceph_volume: adds the osds_per_device parameter If this is set to anything other than the default value of 1 then the --osds-per-device flag will be used by the batch command to define how many osds will be created per device. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-09-12 20:27:14 +00:00
Alfredo Deza	58b2308036	tests: use new 'num_osds' variable in tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-08-31 21:23:20 +00:00
Alfredo Deza	e5fcb0d2d2	tests: allow defining arbitrary number of OSDs Some tests might want to set this since number of devices will not necessarily map to number of OSDs Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-08-31 21:23:20 +00:00
Sébastien Han	7012835d2b	ci: stop using different images on the same run There is no point in using hosts running on atomic AND centos. So let's run containerized scenarios on Atomic only. This solves this error here: ``` fatal: [client2]: FAILED! => { "failed": true } MSG: The conditional check 'ceph_current_status.rc == 0' failed. The error was: error while evaluating conditional (ceph_current_status.rc == 0): 'dict object' has no attribute 'rc' The error appears to have been in '/home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/roles/ceph-defaults/tasks/facts.yml': line 74, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact ceph_current_status (convert to json) ^ here ``` From https://2.jenkins.ceph.com/view/ceph-ansible-stable3.1/job/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/37/consoleFull#1765217701b5dd38fa-a56e-4233-a5ca-584604e56e3a What's happening here is that all the hosts except the clients are running atomic, so here: https://github.com/ceph/ceph-ansible/blob/master/site-docker.yml.sample#L62 The condition will skip all the nodes except the clients, thus when running ceph-defaults, the task "is ceph running already?" is skipped but the task above needs the rc of the skipped task. This is not an error from the playbook, it's a CI setup issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-23 16:13:54 +02:00
Andrew Schoen	810cc47892	tests: adds a testing scenario for lv-create and lv-teardown Using an explicitly named testing environment name allows us to have a specific [testenv] block for this test. This greatly simplifies how it will work as it doesn't really need anything from the ceph cluster tests. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-16 16:38:23 +02:00
Andrew Schoen	647bbd8f1e	tests: adds crush_device_class to lvm-batch scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Andrew Schoen	6d431ec22d	ceph-volume: implement the 'lvm batch' subcommand This adds the action 'batch' to the ceph-volume module so that we can run the new 'ceph-volume lvm batch' subcommand. A functional test is also included. If devices is defined and osd_scenario is lvm then the 'ceph-volume lvm batch' command will be used to create the OSDs. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Sébastien Han	77d4023fbe	test: follow up on osd_crush_location for containers This was fixed by `578aa5c2d5` on non-container, we need to apply the same fix for containers. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Sébastien Han	50be3fd9e8	test: remove osd_crush_location from shrink scenarios This is not needed since this is already covered by docker_cluster and centos_cluster scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Guillaume Abrioux	578aa5c2d5	tests: leave an OSD node in default crush root jewel used to create a default `rbd` pool in the default crush root `default`, we need to have at least 1 osd to satisfy the PGs for this created pool, otherwise the cluster will be in HEALTH_ERR state because of `pgs stuck unclean`/`pgs stuck inactive` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-26 18:47:10 +00:00
Guillaume Abrioux	0a88bccf87	tests: followup on `b89cc1746f` Update network subnets in group_vars/all Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-24 16:55:15 +02:00
Guillaume Abrioux	b89cc1746f	tests: do not deploy all daemons for shrink osds scenarios Let's create a dedicated environment for these scenarios, there is no need to deploy everything. By the way, doing so will save some time. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 18:30:06 +02:00
Guillaume Abrioux	af82e7523d	tests: test master against ansible 2.6 Ansible 2.4 is currently end-of-life. Ansible 2.5 will go end-of-life after Ansible 2.7 is released. Fixes: #2901 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 11:59:15 +00:00
Guillaume Abrioux	0c863a3783	tests: add support of 'ooo-collocation' scenario when testing against ceph dev The group_vars/all file is not available on 'ooo-collocation' scenario, which makes `dev_setup.yml` fail because this path is hardcoded. The idea here is to check if the pattern 'ooo-collocation' is present in `change_dir` variable so we can set this path properly according to the scenario being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:47:33 +02:00
Guillaume Abrioux	d8281e50f1	tests: support update scenarios in test_rbd_mirror_is_up() `test_rbd_mirror_is_up()` is failing on update scenarios because it assumes the `ceph_stable_release` is still set to the value of the original ceph release, it means it won't enter in the right part of the condition and fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:46:41 +02:00
Guillaume Abrioux	cc71bb96cc	tests: followup on #2656 `34f70428` has introduced a fix using `command` module while this could have been achieved by using `lvol` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-13 07:55:14 +00:00
Guillaume Abrioux	9a65ec231d	tests: fix `_get_osd_id_from_host()` in TestOSDs() We must initialize `children` variable in `_get_osd_id_from_host()`, otherwise, if for any reason the deployment has failed and resulted in an osd host with no OSD registered, we won't enter the condition; therefore, `children` is never set and the function tries to return something undefined. Typical error: ``` E UnboundLocalError: local variable 'children' referenced before assignment ``` Fixes: #2860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 13:06:23 +00:00
Guillaume Abrioux	b6d09b510f	tests: refact ci testing master We should test ceph-ansible against the latest ansible stable version on master. This commit also removes the pinning to 1.7.1 version of testinfra because ansible 2.5 requires a newer version. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-06 16:31:49 +00:00
Guillaume Abrioux	09d795b5b7	tests: add mimic support for test_rbd_mirror_is_up() prior to mimic, the data structure returned by `ceph -s -f json` used to gather information about rbd-mirror daemons looked like below: ``` "servicemap": { "epoch": 8, "modified": "2018-07-05 13:21:06.207483", "services": { "rbd-mirror": { "daemons": { "summary": "", "ceph-nano-luminous-faa32aebf00b": { "start_epoch": 8, "start_stamp": "2018-07-05 13:21:04.668450", "gid": 14107, "addr": "172.17.0.2:0/2229952892", "metadata": { "arch": "x86_64", "ceph_version": "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)", "cpu": "Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-nano-luminous-faa32aebf00b", "instance_id": "14107", "kernel_description": "#1 SMP Wed Mar 14 15:12:16 UTC 2018", "kernel_version": "4.9.87-linuxkit-aufs", "mem_swap_kb": "1048572", "mem_total_kb": "2046652", "os": "Linux" } } } } } } ``` This part has changed from mimic and became: ``` "servicemap": { "epoch": 2, "modified": "2018-07-04 09:54:36.164786", "services": { "rbd-mirror": { "daemons": { "summary": "", "14151": { "start_epoch": 2, "start_stamp": "2018-07-04 09:54:35.541272", "gid": 14151, "addr": "192.168.1.80:0/240942528", "metadata": { "arch": "x86_64", "ceph_release": "mimic", "ceph_version": "ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)", "ceph_version_short": "13.2.0", "cpu": "Intel(R) Xeon(R) CPU X5650 @ 2.67GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-rbd-mirror0", "id": "ceph-rbd-mirror0", "instance_id": "14151", "kernel_description": "#1 SMP Wed May 9 18:05:47 UTC 2018", "kernel_version": "3.10.0-862.2.3.el7.x86_64", "mem_swap_kb": "1572860", "mem_total_kb": "1015548", "os": "Linux" } } } } } } ``` This patch modifies the function `test_rbd_mirror_is_up()` in `test_rbd_mirror.py` so it works with `mimic` and keeps backward compatibility with `luminous` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-06 14:39:13 +02:00
Guillaume Abrioux	f2e57a56db	tests: factorize docker tests using docker_exec_cmd logic avoid duplicating tests unnecessarily just because of docker exec syntax. Using the same logic as in the playbook with `docker_exec_cmd` allows us to execute the same test on both containerized and non containerized environments. The idea is to set a variable `docker_exec_cmd` with the 'docker exec <container-name>' string when containerized and set it to '' when non containerized. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-27 07:00:14 +00:00
Guillaume Abrioux	fe79a5d240	tests: refact test_all_*_osds_are_up_and_in these tests are skipped on bluestore osds scenarios. they were going to fail anyway since they are run on mon nodes and `devices` is defined in inventory for each osd node. It means `num_devices * num_osd_hosts` returns `0`. The result is that the test expects to have 0 OSDs up. The idea here is to move these tests so they are run on OSD nodes. Each OSD node checks its respective OSDs to be UP: if an OSD node has 2 devices defined in `devices` variable, it means we are checking for 2 OSDs to be up on that node; if each node has all its OSDs up, we can say all OSDs are up. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	2d560b562a	tests: skip tests for node iscsi-gw when deploying jewel CI is deploying an iscsigw node anyway but it's not deployed, let's skip the tests accordingly Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	1c3dae4a90	tests: skip rgw_tuning_pools_are_set when rgw_create_pools is not defined Since the ooo_collocation scenario is supposed to be the same scenario as the one tested by OSP and they are not passing `rgw_create_pools`, the test `test_docker_rgw_tuning_pools_are_set` will fail: ``` > pools = node["vars"]["rgw_create_pools"] E KeyError: 'rgw_create_pools' ``` Skipping this test if `node["vars"]["rgw_create_pools"]` is not defined fixes this failure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	d83b24d271	tests: fix broken test when collocated daemons scenarios At the moment, a lot of tests are skipped when daemons are collocated. Our tests consider that a node belongs to only 1 group, while in certain scenarios it can belong to multiple groups. Also pinning to pytest 3.6.1 so we can use `request.node.iter_markers()` Co-Authored-by: Alfredo Deza <adeza@redhat.com> Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	f68936ca7e	tests: fix *_has_correct_value tests It might happen that the lists of ips/hosts in the following lines (ceph.conf) - `mon initial members = <hosts>` - `mon host = <ips>` are not ordered the same way depending on deployment. This patch makes the tests look for each ip or hostname in the respective lines. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-20 08:01:57 +02:00
Guillaume Abrioux	481c14455a	tests: add more nodes in ooo testing scenario Adding more nodes in this scenario could help to have better coverage so we can catch more potential bugs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-18 16:44:23 +02:00
Guillaume Abrioux	21894655a7	tests: keep same ceph release during handlers/idempotency test since `latest` points to `mimic`, we need to force the test to keep the same ceph release when testing anything else than `mimic`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-15 11:45:51 -04:00
Guillaume Abrioux	bbb8691335	tests: increase memory to 1024Mb for centos7_cluster scenario We see more and more failures like `fatal: [mon0]: UNREACHABLE! => {}` in the `centos7_cluster` scenario. Since we have 30Gb RAM on hypervisors, we can give monitors a bit more RAM. By the way, nodes in the containerized cluster testing scenario already have 1024Mb memory allocated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-11 23:52:15 +08:00
Sébastien Han	6035978ed9	test: only on containerized iscsi We don't have the same service running on non-container for now, this will change soon but for now let's only run the test on container. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-11 08:34:48 +02:00
Sébastien Han	20c8065e48	ceph-iscsi: rename group iscsi_gws Let's try to avoid using dashes as testinfra needs to be able to read the groups. Typically, with iscsi-gws we can't add a marker for these iscsi nodes, using an underscore fixes the issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Sébastien Han	c00fb12497	ci: add functional tests for iscsi We test if: * packages are installed * services are running * service units are enabled Also fix linting issues Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Sébastien Han	5ff2f03e3f	ci: add iscsi test Add iscsi CI coverage, this will now deploy iscsi gateways in container. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Guillaume Abrioux	28d21b4e9c	tests: update ooo inventory hostfile Update the inventory host for the tripleo testing scenario so it uses the same parameters as in tripleo CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-07 17:26:35 +02:00
Guillaume Abrioux	5eacc8f8d8	tests: add a dummy value for 'dev' release Functional tests are broken when testing against 'dev' release (ceph). Adding a dummy value here will make it possible to run ceph-ansible CI against dev ceph release. Typical error: ``` > if request.node.get_marker("from_luminous") and ceph_release_num[ceph_stable_release] < ceph_release_num['luminous']: E KeyError: 'dev' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fd1487d93f21b609a637053f5b33cd2a4e408d00)	2018-06-07 13:59:17 +02:00
Guillaume Abrioux	c94ada69e8	tests: improve mds tests The expected number of mds daemons consists of the number of daemons that are 'up' + the number of daemons that are 'up:standby'. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-07 14:01:58 +08:00
Guillaume Abrioux	f0cd4b0651	tests: skip disabling fastest mirror detection on atomic host There is no need to execute this task on atomic hosts. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:39:37 +02:00
Guillaume Abrioux	47276764f7	tests: fix rgw tests `41b4632` has introduced a change in functional tests. Since the admin keyring isn't copied on rgw nodes anymore in tests, let's use the rgw keyring to achieve them. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:24:32 +02:00
Sébastien Han	41b4632abc	test: do not always copy admin key The admin key must be copied on the osd nodes only when we test the shrink scenario. Shrink relies on ceph-disk commands that require the admin key on the node where it's being executed. Now we only copy the key when running on the shrink-osd scenario. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-05 09:39:30 +02:00
Guillaume Abrioux	2cf06b515f	rgw: refact rgw pools creation Refact of `8704144e31` There is no need to have duplicated tasks for this. The rgw pools creation should be delegated to a monitor node so we don't have to care if the admin keyring is present on the rgw node. By the way, only one task is needed to create the pools, we just need to use the `docker_exec_cmd` fact already defined in `ceph-defaults` to achieve it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:00:20 +08:00
Erwan Velu	493f615eae	ceph-defaults: Enable local epel repository During the tests, the remote epel repository is generating a lot of errors leading to broken jobs (issue #2666) This patch is about using a local repository instead of a random one. To achieve that, we make a preliminary install of epel-release, remove the metalink and enforce a baseurl to our local http mirror. That should speed up the build process but also avoid the random errors we face. This patch is part of a patch series that tries to remove all possible yum failures. Signed-off-by: Erwan Velu <erwan@redhat.com>	2018-06-04 08:11:35 +02:00
jtudelag	600e1e2c26	rgws: renames create_pools variable with rgw_create_pools. Renamed to be consistent with the role (rgw) and have a meaningful name. Signed-off-by: Jorge Tudela <jtudelag@redhat.com>	2018-06-04 06:23:42 +02:00
jtudelag	8704144e31	Adds RGWs pool creation to containerized installation. ceph command has to be executed from one of the monitor containers if not admin copy present in RGWs. Task has to be delegated then. Adds test to check proper RGW pool creation for Docker container scenarios. Signed-off-by: Jorge Tudela <jtudelag@redhat.com>	2018-06-04 06:23:42 +02:00
Guillaume Abrioux	c68126d6fd	mdss: do not make pg_num a mandatory params When playing ceph-mds role, mon nodes have set a fact with the default pg num for osd pools, we can simply default to this value for cephfs pools (`cephfs_pools` variable). At the moment the variable definition for `cephfs_pools` looks like: ``` cephfs_pools: - { name: "{{ cephfs_data }}", pgs: "" } - { name: "{{ cephfs_metadata }}", pgs: "" } ``` and we have a task in `ceph-validate` to ensure `pgs` has been set to a valid value. We could simply avoid this check by setting the default value of `pgs` to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and let to users the possibility to override this value. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 16:20:34 +02:00
Guillaume Abrioux	6f489015e4	tests: fix broken symlink `requirements2.5.txt` is pointing to `tests/requirements2.4.txt` while it should point to `requirements2.4.txt` since they are in the same directory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 16:13:47 +02:00
Guillaume Abrioux	34f7042852	tests: resize root partition when atomic host For a few moment we can see failures in the CI for containerized scenarios because VMs are running out of space at some point. The default in the images used is to have only 3Gb for root partition which doesn't sound like a lot. Typical error seen: ``` STDERR: failed to register layer: Error processing tar file(exit status 1): open /usr/share/zoneinfo/Atlantic/Canary: no space left on device ``` Indeed, on the machine we can see: ``` Every 2.0s: df -h Tue May 29 17:21:13 2018 Filesystem Size Used Avail Use% Mounted on /dev/mapper/atomicos-root 3.0G 3.0G 14M 100% / ``` The idea here is to expand this partition with all the available space remaining by issuing an `lvresize` followed by an `xfs_growfs`. ``` -bash-4.2# lvresize -l +100%FREE /dev/atomicos/root Size of logical volume atomicos/root changed from <2.93 GiB (750 extents) to 9.70 GiB (2484 extents). Logical volume atomicos/root successfully resized. ``` ``` -bash-4.2# xfs_growfs / meta-data=/dev/mapper/atomicos-root isize=512 agcount=4, agsize=192000 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=0 spinodes=0 data = bsize=4096 blocks=768000, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 data blocks changed from 768000 to 2543616 ``` ``` -bash-4.2# df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/atomicos-root 9.7G 1.4G 8.4G 14% / ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 10:54:35 +02:00
Guillaume Abrioux	98cb6ed8f6	tests: avoid yum failures In the CI we often see failures like the following: `Failure talking to yum: Cannot find a valid baseurl for repo: base/7/x86_64` It seems the fastest mirror detection is sometimes counterproductive and leads yum to fail. This fix has been added in the `setup.yml`. This playbook was used until now only just before playing `testinfra` and could be used before running ceph-ansible so we can add some provisioning tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Erwan Velu <evelu@redhat.com>	2018-05-28 22:04:35 +02:00
Guillaume Abrioux	a10e73d78d	tests: move cephfs_pools variable Let's move this variable to group_vars/all.yml in all testing scenarios, in line with commit `1f15a81c48`, so we keep consistency between the playbook and the tests. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Guillaume Abrioux	564a662baf	osds: move openstack pools creation in ceph-osd When deploying a large number of OSD nodes it can be an issue because the protection check [1] won't pass since it tries to create pools before all OSDs are active. The idea here is to move openstack pools creation at the end of `ceph-osd` role. [1] `e59258943b/src/mon/OSDMonitor.cc (L5673)` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Luigi Toscano	43e96c1f98	ceph-radosgw: disable NSS PKI db when SSL is disabled The NSS PKI database is needed only if radosgw_keystone_ssl is explicitly set to true, otherwise the SSL integration is not enabled. It is worth noting that the PKI support was removed from Keystone starting from the Ocata release, so some code paths should be changed anyway. Also, remove radosgw_keystone, which is not useful anymore. This variable was used until `fcba2c801a`. Now profiles drives the setting of rgw keystone *. Signed-off-by: Luigi Toscano <ltoscano@redhat.com>	2018-05-23 23:24:09 -07:00
Andrew Schoen	dea1ea93d5	tests: use notario>=0.0.13 when testing Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-05-18 17:58:24 +02:00
Andrew Schoen	9f68dad2ff	validate: first pass at validating the install options Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-05-18 17:58:24 +02:00
Guillaume Abrioux	a68091c923	tests: update the type for the rule used in pools As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but an id, therefore an `int` is expected otherwise it will fail with the following error Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-30 08:15:18 +02:00
Sébastien Han	71efa2eaf4	ci: bump client nodes to 2 In order to test the key distribution is correct we must have 2 client nodes. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-23 18:34:58 +02:00
Sébastien Han	203c9af0ac	ci: test ansible 2.5 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-23 10:17:24 +02:00
Guillaume Abrioux	77831ccb7a	tests: update tests for mds to cover multimds case in case of multimds we must check for the number of mds up instead of just checking if the hostname of the node is in the fsmap. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-12 18:20:58 +02:00
Sébastien Han	82589021e0	ci: fix tripleO scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Sébastien Han	2011ec3bcd	ci: client copy admin key If we don't copy the admin key we can't add the key into ceph. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Sébastien Han	cf73647e7a	ci: remove useless tests These are already handled by ceph-client/defaults/main.yml so the keys will be created once user_config is set to True. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Andrew Schoen	98e237d234	tests: no need to remove partitions in lvm_setup.yml Now that we are using ceph_volume_zap the partitions are kept around and should be able to be reused. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-04-10 14:19:21 +02:00
Sébastien Han	f3caee8460	ceph-iscsi: fix certificates generation and distribution Prior to this patch, the certificates were being generated on a single node only (because of the run_once: true). Thus certificates were not distributed on all the gateway nodes. This would require a second ansible run to work. This patch fixes the creation and distribution of the keys on all the nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540845 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-04 09:27:39 +02:00
John Fulton	e6e6bd078a	Refer to expected-num-ojects as expected_num_objects, not size Follow up patch to PR 2432 [1] which replaces "size" (sorry if the original bug used that term, which can be confusing) with expected_num_objects as is used in the Ceph documentation [2]. [1] https://github.com/ceph/ceph-ansible/pull/2432/files [2] http://docs.ceph.com/docs/jewel/rados/operations/pools	2018-03-26 15:41:51 +02:00
Sébastien Han	3ab89ab48c	ci: re-arrange group_vars files We should stop putting everything in 'all'. This is too easy and this is error prone as well for those who are separating variables into host type, things that you should do. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	d5f8cac820	ci: remove left over iscsi_gws file Wrong file that is not used, only iscsi-ggw that is present is correct. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	8000ae342e	remove unused ceph_rgw_civetweb_port variable Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	f119b25bbe	client: implement proper pools creation Just like we did for the monitor and openstack_config we now have the ability to precisely create pools. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	e302c1baae	mon: add support for erasure code pool You can now specify type: erasure and erasure_profile to use when declaring the pool dictionary. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	4806ff4ff8	ci: test pool creation on container On containerized scenario we also want to test pool creation. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	fc0fa48e0d	test: add tests for creating crush tree We now run tests on the newly created ceph_crush module. Now the CI will create a specific hierarchy for the OSD. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-06 15:24:31 +00:00
Sébastien Han	fd94840a6e	ci: add copy_admin_key test to container scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-02 20:59:10 +00:00
Sébastien Han	165d9dec10	remove kernel.pid_max This is now managed by Ceph packages. See: https://github.com/ceph/ceph/pull/18544/files http://tracker.ceph.com/issues/21929 Closes: https://github.com/ceph/ceph-ansible/issues/2410 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-23 13:57:57 +01:00
Guillaume Abrioux	4a8986459f	tests: change ceph_docker_image_tag for 2nd run The ceph-ansible upstream CI runs several tests, including an 'idempotency/handlers' test. It means the playbook is run a first time and then a second time with another container image version to ensure the handlers run properly and the containers are well restarted. This can cause issues. For instance, in that specific case which drove me to submit this commit, I've hit the case where `latest` image ships ceph 12.2.3 while the `stable-3.0` (which is the image used for the second run) ships ceph 12.2.2. The goal of this test is not to verify we can upgrade from a specific version to another but to ensure handlers are working even if it's a valid failure here. It should be caught by a test dedicated to that usecase. We just need to have a container image which has a different id for the upstream CI, we need the same content in container image but a different image id in the registry since the test relies on image id to decide whether the container should be restarted. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-23 13:54:32 +01:00
Guillaume Abrioux	707458c979	ci: add tripleo scenario testing This should help to see earlier any failure in a tripleo deployment scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-23 13:54:32 +01:00
Sébastien Han	7d690878df	test: add test for containers resources changes We change the ceph_mon_docker_memory_limit on the second run, this should trigger a restart of services. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Sébastien Han	79864a8936	test: add test for restart on new container image Since we have a task to test the handlers we can test a new container to validate the service restart on a new container image. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Guillaume Abrioux	deaf273b25	syntax: change local_action syntax Use a nicer syntax for `local_action` tasks. We used to have oneliner like this: ``` local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }} ``` The usual syntax: ``` local_action: module: wait_for port: 22 host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}" state: started delay: 10 timeout: 500 ``` is nicer and kind of way to keep consistency regarding the whole playbook. This also fix a potential issue about missing quotation : ``` Traceback (most recent call last): File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module> main() File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin) File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command File "/usr/lib64/python2.7/shlex.py", line 279, in split return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next token = self.get_token() File "/usr/lib64/python2.7/shlex.py", line 96, in get_token raw = self.read_token() File "/usr/lib64/python2.7/shlex.py", line 172, in read_token raise ValueError, "No closing quotation" ValueError: No closing quotation ``` writing `local_action: shell echo {{ fsid }} \| tee {{ fetch_directory }}/ceph_cluster_uuid.conf` can cause trouble because it's complaining with missing quotes, this fix solves this issue. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-31 10:45:34 +01:00
Andrew Schoen	cfb75b8e29	tests: remove crush_device_class from lvm tests The --crush-device-class flag for ceph-volume is not available in luminous so lets remove this testing option for now until it's more widely available. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-18 15:03:38 +01:00
Andrew Schoen	64f5772140	tests: adds crush_device_class to lvm tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-17 13:49:29 +01:00
Sébastien Han	39f2bfd5d5	fix jewel scenarios on container When deploying Jewel from master we still need to enable this code since the container image has such check. This check still exists because ceph-disk is not able to create a GPT label on a drive that does not have one. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-12-20 13:43:19 +01:00
Guillaume Abrioux	ab1dd3027a	client: don't try to generate keys The entrypoint to generate users keyring is `ceph-authtool`, therefore, it can't expand the `$(ceph-authtool --gen-print-key)` inside the container. Users must generate a keyring themselves. This commit also adds a check to ensure keyrings are properly filled when `user_config: true`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-12-14 17:22:07 +01:00


719 Commits (18da10bb7a22f85fbdd9e80ed965a0268e654f4a)