ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	1b6b275229	test: remove leftover [mgrs] Since mgrs and mons are now collocated on the same machine, we have to remove the mgrs section; it is not needed anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Sébastien Han	1c760904b0	site: collocated mon and mgr by default This will speed up the deployment and also deploy mon and mgr collocated, just as recommended. This won't prevent you from adding more dedicated machines for mgr if needed. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	a502327e52	disable nfs scenario The packages are broken, so let's remove it until this is solved. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Guillaume Abrioux	5d05a09b03	tests: update default pg num and pool size for podman scenario Bring the recent refactor of `osd_pool_default_pg_num` and `osd_pool_default_size` into the podman scenario as well. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 11:22:04 +00:00
Sébastien Han	4e5d862bb7	testinfra: linting Make flake8 happy on the testinfra files. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	dcc765d7c7	testinfra: add support for podman Since we are now testing on docker and podman, our functional tests must reflect that. So now, if we detect the podman binary we will use it; otherwise we default to docker. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	a96e910114	Add new container scenario Test with podman instead of docker and also support for python 3 only. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Guillaume Abrioux	f290e49df8	tests: do not fully override previous ceph_conf_overrides We run an initial deployment with `osd_pool_default_size: 1` in `ceph_conf_overrides`. When re-running the playbook to test idempotency and handlers, we reset `ceph_conf_overrides`. We must append the new value instead of just overwriting it; otherwise, this can lead to errors in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-26 18:22:20 +01:00
Guillaume Abrioux	5601af8de2	tests: change default pools size default pool size in our test should be explicitly set to 1 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	d4c0960f04	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	3ac6619fb9	tests: set pool size to 1 in ceph-override.json Setting this to 1 makes the CI cover the related code in the playbook without breaking the upgrade scenarios. Those scenarios were broken because there is a check `TASK [waiting for clean pgs...]` in rolling_update.yml: since the pool sizes for `cephfs_metadata` and `cephfs_data` are updated to `2` in `ceph-override.json` and there are not enough OSDs to honor this size, some PGs are degraded and make the mentioned check fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Sébastien Han	e552026418	rbd-mirror: use the new rbd-mirror key Instead of using the old rbd key, let's use the new rbd-mirror key to bootstrap the rbd-mirror daemon. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-09 12:45:52 +01:00
Noah Watkins	50255b9640	Fixup shrink_osd[_container] scenario config configuration seems to be for filestore: [ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes Removing `radosgw_interface: eth1` to resolve: The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1' The error appears to have been in '/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml': line 21, column 5, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact _radosgw_address to radosgw_interface - ipv4 ^ here Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2018-11-08 17:45:37 +01:00
Rishabh Dave	8edbda96df	use blocks directives to group tasks Using block directives simplifies the playbooks and makes them more readable. Fixes: https://github.com/ceph/ceph-ansible/issues/2835 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-31 09:37:43 +01:00
Guillaume Abrioux	d8d3e55006	remove restapi role As of `mimic`, restapi is no longer available because of manager daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:19:13 +01:00
Guillaume Abrioux	f52344300a	tests: add more memory for rgw_multisite scenarios Adding more memory to VMs for rgw_multisite scenarios could avoid this error I recently hit in the CI (it is worth setting 1024MB since there are only 2 nodes in those scenarios): ``` fatal: [osd0]: FAILED! => { "changed": false, "cmd": [ "docker", "run", "--rm", "--entrypoint", "/usr/bin/ceph", "docker.io/ceph/daemon:latest-luminous", "--version" ], "delta": "0:00:04.799084", "end": "2018-10-29 17:10:39.136602", "rc": 1, "start": "2018-10-29 17:10:34.337518" } STDERR: Traceback (most recent call last): File "/usr/bin/ceph", line 125, in <module> import rados ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	37970a5b3c	tests: add rgw_multisite functional test Add a playbook that will upload a file on the master then try to get info from the secondary node, this way we can check if the replication is ok. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	4d464c1003	rgw: add testing scenario for rgw multisite This will setup 2 cluster with rgw multisite enabled. First cluster will act as the 'master', the 2nd will be the secondary one. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Sébastien Han	1cdec4069a	test_osd: dynamically get the osd container Do not enforce the container name since this will fail when we have multiple VMs running OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	876f6ced74	test: convert all the tests to use lvm ceph-disk is now deprecated in ceph-ansible so let's convert all the ci tests to use lvm instead of ceph-disk. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	2fd7da12bb	test: remove ceph-disk CI tests Since we are removing the ceph-disk tests from the CI in master, there is no need to keep the functional tests in master anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Rishabh Dave	ee2d52d33d	allow custom pool size Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-22 16:00:21 +02:00
Guillaume Abrioux	c47aa2e83b	tests: remove unnecessary variables definition since we set `configure_firewall: true` in `ceph-defaults/defaults/main.yml` there is no need to explicitly set it in `centos7_cluster` and `docker_cluster` testing scenarios. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 15:12:45 +02:00
Guillaume Abrioux	1f9090884e	Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes" This approach doesn't work with all scenarios because it's comparing a local OSD number expected to a global OSD number found in the whole cluster. This reverts commit `b8ad35ceb9`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	cb35cac926	tests: set configure_firewall: true in centos7\|docker_cluster This way the CI will cover this part of the code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	b8ad35ceb9	tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes Let's get the osd tree from mons instead of osds. This way we don't have to predict an OSD container name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	b8418ebd17	add-osds: followup on `3632b26` Three fixes: - fix a typo in vagrant_variables that causes a networking issue for the containerized scenario. - add containerized_deployment: true - remove a useless block of code: the fact docker_exec_cmd is set in ceph-defaults, which is played right after. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	3632b26005	tests: add tests for day-2-operation playbook Adding testing scenarios for day-2-operation playbook. Steps: - deploys a cluster, - run testinfra, - test idempotency, - add a new osd node, - run testinfra Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 11:26:11 +00:00
Guillaume Abrioux	40b7747af7	remove jewel support As of now, we should no longer support Jewel in ceph-ansible. The latest ceph-ansible release supporting Jewel is `stable-3.1`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-12 23:38:17 +00:00
Sébastien Han	fa38b86cf8	test: fix docker test for lvm The CI is still running ceph-disk tests upstream. So until https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will pass anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-12 20:33:01 +00:00
Sébastien Han	31a0438cb2	ceph_volume: refactor This commit does a couple of things: * Avoid code duplication * Clarify the code * add more unit tests * add myself to the author of the module Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	d2ca24eca8	tests: do not install lvm2 on atomic host we need to detect whether we are running on atomic host to not try to install lvm2 package. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	90c66a5848	ci: test lvm in containerized Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	0735d39518	tests: osd adjust osd name Now we use id of the OSD instead of the device name. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	cc6f41f76a	tests: fix lvm2 setup issue not gathering fact causes `package` module to fail because it needs to detect which OS we are running on to select the right package manager. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-09 16:12:54 -04:00
Alfredo Deza	3e488e8298	tests: install lvm2 before setting up ceph-volume/LVM tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-10-09 13:48:50 -04:00
Andrew Schoen	a68c680225	tests: remove journal_size from lvm-batch testing scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-10-09 10:09:50 -04:00
Sébastien Han	9fe86c2268	test: use osd_objectstore default value Do not force filestore in our tests; use whatever the default of osd_objectstore is. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-27 21:23:49 +00:00
Guillaume Abrioux	3285b47703	tests: add an RGW node on osd0 for ooo-collocation get more coverage by adding an RGW daemon collocated on osd0. We've missed a bug in the past which could have been caught earlier in the CI. Let's add this additional daemon in order to have a better coverage. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-24 14:35:25 +02:00
Guillaume Abrioux	3382c5226c	tests: fix monitor_address for shrink_osd scenario `b89cc1746` introduced a typo. This commit fixes it Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-13 18:14:01 +02:00
Alfredo Deza	58b2308036	tests: use new 'num_osds' variable in tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-08-31 21:23:20 +00:00
Sébastien Han	7012835d2b	ci: stop using different images on the same run There is no point in using both Atomic and CentOS hosts in the same run. So let's run containerized scenarios on Atomic only. This solves this error here: ``` fatal: [client2]: FAILED! => { "failed": true } MSG: The conditional check 'ceph_current_status.rc == 0' failed. The error was: error while evaluating conditional (ceph_current_status.rc == 0): 'dict object' has no attribute 'rc' The error appears to have been in '/home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/roles/ceph-defaults/tasks/facts.yml': line 74, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact ceph_current_status (convert to json) ^ here ``` From https://2.jenkins.ceph.com/view/ceph-ansible-stable3.1/job/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/37/consoleFull#1765217701b5dd38fa-a56e-4233-a5ca-584604e56e3a What's happening here is that all the hosts except the clients are running Atomic, so here: https://github.com/ceph/ceph-ansible/blob/master/site-docker.yml.sample#L62 The condition will skip all the nodes except the clients; thus, when running ceph-defaults, the task "is ceph running already?" is skipped but the task above needs the rc of the skipped task. This is not an error from the playbook, it's a CI setup issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-23 16:13:54 +02:00
Andrew Schoen	810cc47892	tests: adds a testing scenario for lv-create and lv-teardown Using an explicitly named testing environment name allows us to have a specific [testenv] block for this test. This greatly simplifies how it will work, as it doesn't really need anything from the ceph cluster tests. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-16 16:38:23 +02:00
Andrew Schoen	647bbd8f1e	tests: adds crush_device_class to lvm-batch scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Andrew Schoen	6d431ec22d	ceph-volume: implement the 'lvm batch' subcommand This adds the action 'batch' to the ceph-volume module so that we can run the new 'ceph-volume lvm batch' subcommand. A functional test is also included. If devices is defined and osd_scenario is lvm, then the 'ceph-volume lvm batch' command will be used to create the OSDs. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Sébastien Han	77d4023fbe	test: follow up on osd_crush_location for containers This was fixed by `578aa5c2d5` on non-container, we need to apply the same fix for containers. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Sébastien Han	50be3fd9e8	test: remove osd_crush_location from shrink scenarios This is not needed since this is already covered by docker_cluster and centos_cluster scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Guillaume Abrioux	578aa5c2d5	tests: leave an OSD node in default crush root jewel used to create a default `rbd` pool in the default crush root `default`, we need to have at least 1 osd to satisfy the PGs for this created pool, otherwise the cluster will be in HEALTH_ERR state because of `pgs stuck unclean`/`pgs stuck inactive` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-26 18:47:10 +00:00
Guillaume Abrioux	0a88bccf87	tests: followup on `b89cc1746f` Update network subnets in group_vars/all Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-24 16:55:15 +02:00
Guillaume Abrioux	b89cc1746f	tests: do not deploy all daemons for shrink osds scenarios Let's create a dedicated environment for these scenarios; there is no need to deploy everything. By the way, doing so will save some time. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 18:30:06 +02:00
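
Commit dcc765d7c7 above describes picking podman when its binary is present and falling back to docker otherwise. A minimal sketch of that selection, assuming plain Python and the standard library only: the helper names `container_binary` and `ceph_version_cmd` are hypothetical and are not taken from the ceph-ansible test suite, and the command layout simply mirrors the `docker run` invocation quoted in commit f52344300a.

```python
import shutil


def container_binary(which=shutil.which):
    """Return "podman" if a podman binary is found on PATH, else "docker".

    `which` is injectable so the lookup can be faked in tests (hypothetical
    helper, not the project's actual implementation).
    """
    return "podman" if which("podman") else "docker"


def ceph_version_cmd(image, which=shutil.which):
    """Build the `<binary> run ... ceph --version` argument list for `image`."""
    return [
        container_binary(which),
        "run",
        "--rm",
        "--entrypoint",
        "/usr/bin/ceph",
        image,
        "--version",
    ]


if __name__ == "__main__":
    # Prints the command that would be run on this machine.
    print(ceph_version_cmd("docker.io/ceph/daemon:latest-luminous"))
```

Keeping the detection in one small function means every test that shells out to a container runtime shares the same fallback rule.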

336 Commits (896676ee80226121785f44f50d1f01fff5aa2fd7)