ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	e552026418	rbd-mirror: use the new rbd-mirror key Instead of using the old rbd key let's use the new rbd-mirror key to bootstrap the rbd-mirror daemon. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-09 12:45:52 +01:00
Noah Watkins	50255b9640	Fixup shrink_osd[_container] scenario config configuration seems to be for filestore: [ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes Removing `radosgw_interface: eth1` to resolve: The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1' The error appears to have been in '/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml': line 21, column 5, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact _radosgw_address to radosgw_interface - ipv4 ^ here Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2018-11-08 17:45:37 +01:00
Rishabh Dave	8edbda96df	use blocks directives to group tasks Using block directives simplifies the playbooks and makes them more readable. Fixes: https://github.com/ceph/ceph-ansible/issues/2835 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-31 09:37:43 +01:00
Guillaume Abrioux	d8d3e55006	remove restapi role As of `mimic`, restapi is no longer available because of manager daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:19:13 +01:00
Guillaume Abrioux	f52344300a	tests: add more memory for rgw_multisite scenarios Adding more memory to VMs for rgw_multisite scenarios could avoid this error I have recently hit in the CI: (It is worth it to set 1024Mb since there are only 2 nodes in those scenarios.) ``` fatal: [osd0]: FAILED! => { "changed": false, "cmd": [ "docker", "run", "--rm", "--entrypoint", "/usr/bin/ceph", "docker.io/ceph/daemon:latest-luminous", "--version" ], "delta": "0:00:04.799084", "end": "2018-10-29 17:10:39.136602", "rc": 1, "start": "2018-10-29 17:10:34.337518" } STDERR: Traceback (most recent call last): File "/usr/bin/ceph", line 125, in <module> import rados ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	37970a5b3c	tests: add rgw_multisite functional test Add a playbook that will upload a file on the master then try to get info from the secondary node, this way we can check if the replication is ok. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	4d464c1003	rgw: add testing scenario for rgw multisite This will setup 2 cluster with rgw multisite enabled. First cluster will act as the 'master', the 2nd will be the secondary one. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Sébastien Han	1cdec4069a	test_osd: dynamically get the osd container Do not enforce the container name since this will fail when we have multiple VMs running OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	876f6ced74	test: convert all the tests to use lvm ceph-disk is now deprecated in ceph-ansible so let's convert all the ci tests to use lvm instead of ceph-disk. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Sébastien Han	2fd7da12bb	test: remove ceph-disk CI tests Since we are removing the ceph-disk test from the ci in master then there is no need to have the functional tests in master anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-29 15:33:12 +01:00
Rishabh Dave	ee2d52d33d	allow custom pool size Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-22 16:00:21 +02:00
Guillaume Abrioux	c47aa2e83b	tests: remove unnecessary variables definition since we set `configure_firewall: true` in `ceph-defaults/defaults/main.yml` there is no need to explicitly set it in `centos7_cluster` and `docker_cluster` testing scenarios. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 15:12:45 +02:00
Guillaume Abrioux	1f9090884e	Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes" This approach doesn't work with all scenarios because it's comparing a local OSD number expected to a global OSD number found in the whole cluster. This reverts commit `b8ad35ceb9`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	cb35cac926	tests: set configure_firewall: true in centos7|docker_cluster This way the CI will cover this part of the code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-19 00:12:43 +00:00
Guillaume Abrioux	b8ad35ceb9	tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes Let's get the osd tree from mons instead of osds. This way we don't have to predict an OSD container name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	b8418ebd17	add-osds: followup on `3632b26` Three fixes: - fix a typo in vagrant_variables that cause a networking issue for containerized scenario. - add containerized_deployment: true - remove a useless block of code: the fact docker_exec_cmd is set in ceph-defaults which is played right after. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 17:07:25 +02:00
Guillaume Abrioux	3632b26005	tests: add tests for day-2-operation playbook Adding testing scenarios for day-2-operation playbook. Steps: - deploys a cluster, - run testinfra, - test idempotency, - add a new osd node, - run testinfra Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-17 11:26:11 +00:00
Guillaume Abrioux	40b7747af7	remove jewel support As of now, we should no longer support Jewel in ceph-ansible. The latest ceph-ansible release supporting Jewel is `stable-3.1`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-12 23:38:17 +00:00
Sébastien Han	fa38b86cf8	test: fix docker test for lvm The CI is still running ceph-disk tests upstream. So until https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will pass anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-12 20:33:01 +00:00
Sébastien Han	31a0438cb2	ceph_volume: refactor This commit does a couple of things: * Avoid code duplication * Clarify the code * add more unit tests * add myself to the author of the module Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	d2ca24eca8	tests: do not install lvm2 on atomic host we need to detect whether we are running on atomic host to not try to install lvm2 package. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	90c66a5848	ci: test lvm in containerized Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	0735d39518	tests: osd adjust osd name Now we use id of the OSD instead of the device name. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Guillaume Abrioux	cc6f41f76a	tests: fix lvm2 setup issue not gathering fact causes `package` module to fail because it needs to detect which OS we are running on to select the right package manager. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-09 16:12:54 -04:00
Alfredo Deza	3e488e8298	tests: install lvm2 before setting up ceph-volume/LVM tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-10-09 13:48:50 -04:00
Andrew Schoen	a68c680225	tests: remove journal_size from lvm-batch testing scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-10-09 10:09:50 -04:00
Sébastien Han	9fe86c2268	test: use osd_objecstore default value Do not force filestore on our test but whatever is the default of osd_objecstore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-27 21:23:49 +00:00
Guillaume Abrioux	3285b47703	tests: add an RGW node on osd0 for ooo-collocation get more coverage by adding an RGW daemon collocated on osd0. We've missed a bug in the past which could have been caught earlier in the CI. Let's add this additional daemon in order to have a better coverage. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-24 14:35:25 +02:00
Guillaume Abrioux	3382c5226c	tests: fix monitor_address for shrink_osd scenario `b89cc1746` introduced a typo. This commit fixes it Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-09-13 18:14:01 +02:00
Alfredo Deza	58b2308036	tests: use new 'num_osds' variable in tests Signed-off-by: Alfredo Deza <adeza@redhat.com>	2018-08-31 21:23:20 +00:00
Sébastien Han	7012835d2b	ci: stop using different images on the same run There is no point of using hosts running on atomic AND centos hosts. So let's run containerized scenarios on Atomic only. This solves this error here: ``` fatal: [client2]: FAILED! => { "failed": true } MSG: The conditional check 'ceph_current_status.rc == 0' failed. The error was: error while evaluating conditional (ceph_current_status.rc == 0): 'dict object' has no attribute 'rc' The error appears to have been in '/home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/roles/ceph-defaults/tasks/facts.yml': line 74, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact ceph_current_status (convert to json) ^ here ``` From https://2.jenkins.ceph.com/view/ceph-ansible-stable3.1/job/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/37/consoleFull#1765217701b5dd38fa-a56e-4233-a5ca-584604e56e3a What's happening here is all the hosts except the clients are running atomic, so here: https://github.com/ceph/ceph-ansible/blob/master/site-docker.yml.sample#L62 The condition will skip all the nodes except the clients, thus when running ceph-default, the task "is ceph running already?" is skipped but the task above needs the rc of the skipped task. This is not an error from the playbook, it's a CI setup issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-23 16:13:54 +02:00
Andrew Schoen	810cc47892	tests: adds a testing scenario for lv-create and lv-teardown Using an explicitly named testing environment name allows us to have a specific [testenv] block for this test. This greatly simplifies how it will work as it doesn't really need anything from the ceph cluster tests. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-16 16:38:23 +02:00
Andrew Schoen	647bbd8f1e	tests: adds crush_device_class to lvm-batch scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Andrew Schoen	6d431ec22d	ceph-volume: implement the 'lvm batch' subcommand This adds the action 'batch' to the ceph-volume module so that we can run the new 'ceph-volume lvm batch' subcommand. A functional test is also included. If devices is defined and osd_scenario is lvm then the 'ceph-volume lvm batch' command will be used to create the OSDs. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Sébastien Han	77d4023fbe	test: follow up on osd_crush_location for containers This was fixed by `578aa5c2d5` on non-container, we need to apply the same fix for containers. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Sébastien Han	50be3fd9e8	test: remove osd_crush_location from shrink scenarios This is not needed since this is already covered by docker_cluster and centos_cluster scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-07 16:20:13 +00:00
Guillaume Abrioux	578aa5c2d5	tests: leave an OSD node in default crush root jewel used to create a default `rbd` pool in the default crush root `default`, we need to have at least 1 osd to satisfy the PGs for this created pool, otherwise the cluster will be in HEALTH_ERR state because of `pgs stuck unclean`/`pgs stuck inactive` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-26 18:47:10 +00:00
Guillaume Abrioux	0a88bccf87	tests: followup on `b89cc1746f` Update network subnets in group_vars/all Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-24 16:55:15 +02:00
Guillaume Abrioux	b89cc1746f	tests: do not deploy all daemons for shrink osds scenarios Let's create a dedicated environment for these scenarios, there is no need to deploy everything. By the way, doing so will save some time. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-23 18:30:06 +02:00
Guillaume Abrioux	0c863a3783	tests: add support of 'ooo-collocation' scenario when testing against ceph dev The group_vars/all file is not available on 'ooo-collocation' scenario, it's making the `dev_setup.yml` failing because this path is hardcoded. The idea here is to check if the pattern 'ooo-collocation' is present in `change_dir` variable so we can set this path properly according to the scenario being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:47:33 +02:00
Guillaume Abrioux	d8281e50f1	tests: support update scenarios in test_rbd_mirror_is_up() `test_rbd_mirror_is_up()` is failing on update scenarios because it assumes the `ceph_stable_release` is still set to the value of the original ceph release, it means it won't enter in the right part of the condition and fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-20 07:46:41 +02:00
Guillaume Abrioux	cc71bb96cc	tests: followup on #2656 `34f70428` has introduced a fix using `command` module while this could have been achieved by using `lvol` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-13 07:55:14 +00:00
Guillaume Abrioux	9a65ec231d	tests: fix `_get_osd_id_from_host()` in TestOSDs() We must initialize `children` variable in `_get_osd_id_from_host()`, otherwise, if for any reason the deployment has failed and result with an osd host with no OSD registered, we won't enter in the condition, therefore, `children` is never set and the function tries to return something undefined. Typical error: ``` E UnboundLocalError: local variable 'children' referenced before assignment ``` Fixes: #2860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-10 13:06:23 +00:00
Guillaume Abrioux	09d795b5b7	tests: add mimic support for test_rbd_mirror_is_up() prior mimic, the data structure returned by `ceph -s -f json` used to gather information about rbd-mirror daemons looked like below: ``` "servicemap": { "epoch": 8, "modified": "2018-07-05 13:21:06.207483", "services": { "rbd-mirror": { "daemons": { "summary": "", "ceph-nano-luminous-faa32aebf00b": { "start_epoch": 8, "start_stamp": "2018-07-05 13:21:04.668450", "gid": 14107, "addr": "172.17.0.2:0/2229952892", "metadata": { "arch": "x86_64", "ceph_version": "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)", "cpu": "Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-nano-luminous-faa32aebf00b", "instance_id": "14107", "kernel_description": "#1 SMP Wed Mar 14 15:12:16 UTC 2018", "kernel_version": "4.9.87-linuxkit-aufs", "mem_swap_kb": "1048572", "mem_total_kb": "2046652", "os": "Linux" } } } } } } ``` This part has changed from mimic and became: ``` "servicemap": { "epoch": 2, "modified": "2018-07-04 09:54:36.164786", "services": { "rbd-mirror": { "daemons": { "summary": "", "14151": { "start_epoch": 2, "start_stamp": "2018-07-04 09:54:35.541272", "gid": 14151, "addr": "192.168.1.80:0/240942528", "metadata": { "arch": "x86_64", "ceph_release": "mimic", "ceph_version": "ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)", "ceph_version_short": "13.2.0", "cpu": "Intel(R) Xeon(R) CPU X5650 @ 2.67GHz", "distro": "centos", "distro_description": "CentOS Linux 7 (Core)", "distro_version": "7", "hostname": "ceph-rbd-mirror0", "id": "ceph-rbd-mirror0", "instance_id": "14151", "kernel_description": "#1 SMP Wed May 9 18:05:47 UTC 2018", "kernel_version": "3.10.0-862.2.3.el7.x86_64", "mem_swap_kb": "1572860", "mem_total_kb": "1015548", "os": "Linux" } } } } } } ``` This patch modifies the function `test_rbd_mirror_is_up()` in `test_rbd_mirror.py` so it works with `mimic` and keeps backward compatibility with `luminous` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-06 14:39:13 +02:00
Guillaume Abrioux	f2e57a56db	tests: factorize docker tests using docker_exec_cmd logic avoid duplicating test unnecessarily just because of docker exec syntax. Using the same logic than in the playbook with `docker_exec_cmd` allow us to execute the same test on both containerized and non containerized environment. The idea is to set a variable `docker_exec_cmd` with the 'docker exec <container-name>' string when containerized and set it to '' when non containerized. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-27 07:00:14 +00:00
Guillaume Abrioux	fe79a5d240	tests: refact test_all__osds_are_up_and_in these tests are skipped on bluestore osds scenarios. they were going to fail anyway since they are run on mon nodes and `devices` is defined in inventory for each osd node. It means `num_devices num_osd_hosts` returns `0`. The result is that the test expects to have 0 OSDs up. The idea here is to move these tests so they are run on OSD nodes. Each OSD node checks their respective OSD to be UP, if an OSD has 2 devices defined in `devices` variable, it means we are checking for 2 OSD to be up on that node, if each node has all its OSD up, we can say all OSD are up. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	1c3dae4a90	tests: skip rgw_tuning_pools_are_set when rgw_create_pools is not defined since ooo_collocation scenario is supposed to be the same scenario than the one tested by OSP and they are not passing `rgw_create_pools` the test `test_docker_rgw_tuning_pools_are_set` will fail: ``` > pools = node["vars"]["rgw_create_pools"] E KeyError: 'rgw_create_pools' ``` skipping this test if `node["vars"]["rgw_create_pools"]` is not defined fixes this failure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-26 15:23:39 +00:00
Guillaume Abrioux	f68936ca7e	tests: fix *_has_correct_value tests It might happen that the list of ips/hosts in the following lines (ceph.conf) - `mon initial members = <hosts>` - `mon host = <ips>` are not ordered the same way depending on deployment. This patch makes the tests look for each ip or hostname in the respective lines. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-20 08:01:57 +02:00
Guillaume Abrioux	481c14455a	tests: add more nodes in ooo testing scenario adding more node in this scenario could help to have a better coverage so we can catch more potential bugs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-18 16:44:23 +02:00
Guillaume Abrioux	21894655a7	tests: keep same ceph release during handlers/idempotency test since `latest` points to `mimic`, we need to force the test to keep the same ceph release when testing anything else than `mimic`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-15 11:45:51 -04:00
Guillaume Abrioux	bbb8691335	tests: increase memory to 1024Mb for centos7_cluster scenario we see more and more failure like `fatal: [mon0]: UNREACHABLE! => {}` in `centos7_cluster` scenario, Since we have 30Gb RAM on hypervisors, we can give monitors a bit more RAM. By the way, nodes on containerized cluster testing scenario have already 1024Mb memory allocated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-11 23:52:15 +08:00
Sébastien Han	6035978ed9	test: only on containerized iscsi We don't have the same service running on non-container for now, this will change soon but for now let's only run the test on container. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-11 08:34:48 +02:00
Sébastien Han	20c8065e48	ceph-iscsi: rename group iscsi_gws Let's try to avoid using dashes as testinfra needs to be able to read the groups. Typically, with iscsi-gws we can't add a marker for these iscsi nodes, using an underscore fixes the issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Sébastien Han	c00fb12497	ci: add functional tests for iscsi We test if: * packages are installed * services are running * service units are enabled Also fix linting issues Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Sébastien Han	5ff2f03e3f	ci: add iscsi test Add iscsi CI coverage, this will now deploy iscsi gateways in container. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Guillaume Abrioux	28d21b4e9c	tests: update ooo inventory hostfile Update the inventory host for tripleo testing scenario so it's the same parameters than in tripleo CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-07 17:26:35 +02:00
Guillaume Abrioux	c94ada69e8	tests: improve mds tests the expected number of mds daemon consist of number of daemons that are 'up' + number of daemons 'up:standby'. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-07 14:01:58 +08:00
Guillaume Abrioux	f0cd4b0651	tests: skip disabling fastest mirror detection on atomic host There is no need to execute this task on atomic hosts. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:39:37 +02:00
Guillaume Abrioux	47276764f7	tests: fix rgw tests `41b4632` has introduced a change in functional tests. Since the admin keyring isn't copied on rgw nodes anymore in tests, let's use the rgw keyring to achieve them. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:24:32 +02:00
Sébastien Han	41b4632abc	test: do not always copy admin key The admin key must be copied on the osd nodes only when we test the shrink scenario. Shrink relies on ceph-disk commands that require the admin key on the node where it's being executed. Now we only copy the key when running on the shrink-osd scenario. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-05 09:39:30 +02:00
Guillaume Abrioux	2cf06b515f	rgw: refact rgw pools creation Refact of `8704144e31` There is no need to have duplicated tasks for this. The rgw pools creation should be delegated on a monitor node so we don't have to care if the admin keyring is present on rgw node. By the way, only one task is needed to create the pools, we just need to use the `docker_exec_cmd` fact already defined in `ceph-defaults` to achieve it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-05 15:00:20 +08:00
Erwan Velu	493f615eae	ceph-defaults: Enable local epel repository During the tests, the remote epel repository is generating a lots of errors leading to broken jobs (issue #2666) This patch is about using a local repository instead of a random one. To achieve that, we make a preliminary install of epel-release, remove the metalink and enforce a baseurl to our local http mirror. That should speed up the build process but also avoid the random errors we face. This patch is part of a patch series that tries to remove all possible yum failures. Signed-off-by: Erwan Velu <erwan@redhat.com>	2018-06-04 08:11:35 +02:00
jtudelag	600e1e2c26	rgws: renames create_pools variable with rgw_create_pools. Renamed to be consistent with the role (rgw) and have a meaningful name. Signed-off-by: Jorge Tudela <jtudelag@redhat.com>	2018-06-04 06:23:42 +02:00
jtudelag	8704144e31	Adds RGWs pool creation to containerized installation. ceph command has to be executed from one of the monitor containers if not admin copy present in RGWs. Task has to be delegated then. Adds test to check proper RGW pool creation for Docker container scenarios. Signed-off-by: Jorge Tudela <jtudelag@redhat.com>	2018-06-04 06:23:42 +02:00
Guillaume Abrioux	c68126d6fd	mdss: do not make pg_num a mandatory params When playing ceph-mds role, mon nodes have set a fact with the default pg num for osd pools, we can simply default to this value for cephfs pools (`cephfs_pools` variable). At the moment the variable definition for `cephfs_pools` looks like: ``` cephfs_pools: - { name: "{{ cephfs_data }}", pgs: "" } - { name: "{{ cephfs_metadata }}", pgs: "" } ``` and we have a task in `ceph-validate` to ensure `pgs` has been set to a valid value. We could simply avoid this check by setting the default value of `pgs` to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and let to users the possibility to override this value. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 16:20:34 +02:00
Guillaume Abrioux	34f7042852	tests: resize root partition when atomic host For a few moment we can see failures in the CI for containerized scenarios because VMs are running out of space at some point. The default in the images used is to have only 3Gb for root partition which doesn't sound like a lot. Typical error seen: ``` STDERR: failed to register layer: Error processing tar file(exit status 1): open /usr/share/zoneinfo/Atlantic/Canary: no space left on device ``` Indeed, on the machine we can see: ``` Every 2.0s: df -h Tue May 29 17:21:13 2018 Filesystem Size Used Avail Use% Mounted on /dev/mapper/atomicos-root 3.0G 3.0G 14M 100% / ``` The idea here is to expand this partition with all the available space remaining by issuing an `lvresize` followed by an `xfs_growfs`. ``` -bash-4.2# lvresize -l +100%FREE /dev/atomicos/root Size of logical volume atomicos/root changed from <2.93 GiB (750 extents) to 9.70 GiB (2484 extents). Logical volume atomicos/root successfully resized. ``` ``` -bash-4.2# xfs_growfs / meta-data=/dev/mapper/atomicos-root isize=512 agcount=4, agsize=192000 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=0 spinodes=0 data = bsize=4096 blocks=768000, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 data blocks changed from 768000 to 2543616 ``` ``` -bash-4.2# df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/atomicos-root 9.7G 1.4G 8.4G 14% / ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 10:54:35 +02:00
Guillaume Abrioux	98cb6ed8f6	tests: avoid yum failures In the CI we often see failures like the following: `Failure talking to yum: Cannot find a valid baseurl for repo: base/7/x86_64` It seems the fastest mirror detection is sometimes counterproductive and leads yum to fail. This fix has been added in the `setup.yml`. This playbook was used until now only just before playing `testinfra` and could be used before running ceph-ansible so we can add some provisioning tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Erwan Velu <evelu@redhat.com>	2018-05-28 22:04:35 +02:00
Guillaume Abrioux	a10e73d78d	tests: move cephfs_pools variable let's move this variable in group_vars/all.yml in all testing scenarios accordingly to this commit `1f15a81c48` so we keep consistency between the playbook and the tests. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Guillaume Abrioux	564a662baf	osds: move openstack pools creation in ceph-osd When deploying a large number of OSD nodes it can be an issue because the protection check [1] won't pass since it tries to create pools before all OSDs are active. The idea here is to move openstack pools creation at the end of `ceph-osd` role. [1] `e59258943b/src/mon/OSDMonitor.cc (L5673)` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Luigi Toscano	43e96c1f98	ceph-radosgw: disable NSS PKI db when SSL is disabled The NSS PKI database is needed only if radosgw_keystone_ssl is explicitly set to true, otherwise the SSL integration is not enabled. It is worth noting that the PKI support was removed from Keystone starting from the Ocata release, so some code paths should be changed anyway. Also, remove radosgw_keystone, which is not useful anymore. This variable was used until `fcba2c801a`. Now profiles drives the setting of rgw keystone *. Signed-off-by: Luigi Toscano <ltoscano@redhat.com>	2018-05-23 23:24:09 -07:00
Guillaume Abrioux	a68091c923	tests: update the type for the rule used in pools As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but an id, therefore an `int` is expected otherwise it will fail with the following error Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-30 08:15:18 +02:00
Sébastien Han	71efa2eaf4	ci: bump client nodes to 2 In order to test the key distribution is correct we must have 2 client nodes. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-23 18:34:58 +02:00
Guillaume Abrioux	77831ccb7a	tests: update tests for mds to cover multimds case in case of multimds we must check for the number of mds up instead of just checking if the hostname of the node is in the fsmap. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-04-12 18:20:58 +02:00
Sébastien Han	82589021e0	ci: fix tripleO scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Sébastien Han	2011ec3bcd	ci: client copy admin key If we don't copy the admin key we can't add the key into ceph. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Sébastien Han	cf73647e7a	ci: remove useless tests These are already handled by ceph-client/defaults/main.yml so the keys will be created once user_config is set to True. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-11 12:18:34 +02:00
Andrew Schoen	98e237d234	tests: no need to remove partitions in lvm_setup.yml Now that we are using ceph_volume_zap the partitions are kept around and should be able to be reused. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-04-10 14:19:21 +02:00
Sébastien Han	f3caee8460	ceph-iscsi: fix certificates generation and distribution Prior to this patch, the certificates were being generated on a single node only (because of the run_once: true). Thus certificates were not distributed on all the gateway nodes. This would require a second ansible run to work. This patch fixes the creation and distribution of keys on all the nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540845 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-04 09:27:39 +02:00
John Fulton	e6e6bd078a	Refer to expected-num-ojects as expected_num_objects, not size Follow up patch to PR 2432 [1] which replaces "size" (sorry if the original bug used that term, which can be confusing) with expected_num_objects as is used in the Ceph documentation [2]. [1] https://github.com/ceph/ceph-ansible/pull/2432/files [2] http://docs.ceph.com/docs/jewel/rados/operations/pools	2018-03-26 15:41:51 +02:00
Sébastien Han	3ab89ab48c	ci: re-arrange group_vars files We should stop putting everything in 'all'. This is too easy and this is error prone as well for those who are separating variables into host type, things that you should do. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	d5f8cac820	ci: remove left over iscsi_gws file Wrong file that is not used, only iscsi-ggw that is present is correct. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	8000ae342e	remove unused ceph_rgw_civetweb_port variable Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	f119b25bbe	client: implement proper pools creation Just like we did for the monitor and openstack_config we now have the ability to precisely create pools. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	e302c1baae	mon: add support for erasure code pool You can now specify type: erasure and erasure_profile to use when declaring the pool dictionary. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	4806ff4ff8	ci: test pool creation on container On containerized scenario we also want to test pool creation. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-14 14:22:00 +01:00
Sébastien Han	fc0fa48e0d	test: add tests for creating crush tree We now run tests on the newly created ceph_crush module. Now the CI will create a specific hierarchy for the OSD. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-06 15:24:31 +00:00
Sébastien Han	fd94840a6e	ci: add copy_admin_key test to container scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-02 20:59:10 +00:00
Sébastien Han	165d9dec10	remove kernel.pid_max This is now managed by Ceph packages. See: https://github.com/ceph/ceph/pull/18544/files http://tracker.ceph.com/issues/21929 Closes: https://github.com/ceph/ceph-ansible/issues/2410 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-23 13:57:57 +01:00
Guillaume Abrioux	4a8986459f	tests: change ceph_docker_image_tag for 2nd run The ceph-ansible upstream CI runs several tests, including an 'idempotency/handlers' test. It means the playbook is run a first time and then a second time with another container image version to ensure the handlers run properly and the containers are properly restarted. This can cause issues. For instance, in the specific case which drove me to submit this commit, I hit the case where the `latest` image ships ceph 12.2.3 while `stable-3.0` (which is the image used for the second run) ships ceph 12.2.2. The goal of this test is not to verify we can upgrade from a specific version to another but to ensure handlers are working, even if it's a valid failure here. It should be caught by a test dedicated to that use case. We just need a container image which has a different id for the upstream CI: we need the same content in the container image but a different image id in the registry, since the test relies on the image id to decide whether the container should be restarted. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-23 13:54:32 +01:00
Guillaume Abrioux	707458c979	ci: add tripleo scenario testing This should help to see earlier any failure in a tripleo deployment scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-23 13:54:32 +01:00
Sébastien Han	7d690878df	test: add test for containers resources changes We change the ceph_mon_docker_memory_limit on the second run, this should trigger a restart of services. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Sébastien Han	79864a8936	test: add test for restart on new container image Since we have a task to test the handlers we can test a new container to validate the service restart on a new container image. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-02-14 02:01:29 +01:00
Guillaume Abrioux	deaf273b25	syntax: change local_action syntax Use a nicer syntax for `local_action` tasks. We used to have one-liners like this: ``` local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }} ``` The usual syntax: ``` local_action: module: wait_for port: 22 host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}" state: started delay: 10 timeout: 500 ``` is nicer and keeps the whole playbook consistent. This also fixes a potential issue with missing quotation marks: ``` Traceback (most recent call last): File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module> main() File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin) File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command File "/usr/lib64/python2.7/shlex.py", line 279, in split return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next token = self.get_token() File "/usr/lib64/python2.7/shlex.py", line 96, in get_token raw = self.read_token() File "/usr/lib64/python2.7/shlex.py", line 172, in read_token raise ValueError, "No closing quotation" ValueError: No closing quotation ``` Writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf` can cause trouble because it fails with a complaint about missing quotes; this fix solves the issue. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-31 10:45:34 +01:00
Andrew Schoen	cfb75b8e29	tests: remove crush_device_class from lvm tests The --crush-device-class flag for ceph-volume is not available in luminous, so let's remove this testing option for now until it's more widely available. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-18 15:03:38 +01:00
Andrew Schoen	64f5772140	tests: adds crush_device_class to lvm tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-17 13:49:29 +01:00
Sébastien Han	39f2bfd5d5	fix jewel scenarios on container When deploying Jewel from master we still need to enable this code since the container image has such a check. This check still exists because ceph-disk is not able to create a GPT label on a drive that does not have one. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-12-20 13:43:19 +01:00
Guillaume Abrioux	ab1dd3027a	client: don't try to generate keys The entrypoint to generate user keyrings is `ceph-authtool`; therefore, it cannot expand the `$(ceph-authtool --gen-print-key)` inside the container. Users must generate a keyring themselves. This commit also adds a check to ensure keyrings are properly filled when `user_config: true`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-12-14 17:22:07 +01:00
Guillaume Abrioux	aa0b1ed118	tests: remove OSD_FORCE_ZAP variable from tests according to ceph/ceph-container#840, this variable is no longer needed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-14 17:55:01 +01:00
Sébastien Han	d05206236c	Merge pull request #2124 from ceph/lvm-setup-fix test: when creating the /dev/sdc2 partition specify label as gpt	2017-10-31 16:51:16 +01:00
Andrew Schoen	37a48209cc	test: when creating the /dev/sdc2 partition specify label as gpt ansible==2.4 requires that label be set to gpt, or it will be defaulted to msdos. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-10-31 09:38:47 -05:00
Guillaume Abrioux	c28882c1cd	tests: add missing test for rbd Add a missing test `test_rbd_mirror_service_is_running_from_luminous()`. Also using bash -c "<cmd>" to make testinfra aware that later in the upgrade process we are now running `luminous` ceph release so we must skip the rbd tests related to `jewel` ceph release. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-30 19:44:56 +01:00
Sébastien Han	faccd0acf0	Merge pull request #2100 from ceph/lvm-bluestore ceph-volume lvm bluestore support	2017-10-27 17:36:16 +02:00
Major Hayden	f73232caa4	Use check_mode instead of always_run This patch changes the `always_run: yes` task option to `check_mode: no` to avoid Ansible warnings.	2017-10-25 09:53:34 -05:00
Major Hayden	c2b5118c1b	Revert "Avoid deprecated always_run" This reverts commit `620fb37dd4`.	2017-10-25 09:48:09 -05:00
Alfredo Deza	027d57dd29	tests create a bluestore osd scenario Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-10-25 06:46:39 -04:00
Sébastien Han	a53aa9e8b4	ci: new osd scenarios This commit adds new osd scenarios; it aims to simplify the CI setup and bring better coverage of the OSD scenarios. We decided to differentiate between filestore and bluestore, thinking ahead to when filestore won't be supported anymore. So we now have two classes of tests: * Filestore * Bluestore In each of those classes we have container and non-container. Then for each we test the following: * collocated * collocated dmcrypt * non-collocated * non-collocated dmcrypt * auto discovery collocated * auto discovery collocated dmcrypt This gives us a nice coverage and also reduces the footprint on the CI. We are now up to 4 scenarios, each containing 6 OSD VMs. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-18 09:26:06 +02:00
Guillaume Abrioux	7ee9aa94b5	Merge pull request #1963 from ceph/pull-in-para site-docker.yml try to fetch images in //	2017-10-13 19:35:11 +02:00
Sébastien Han	71d819620c	mds: fix fs pool creation 1. add the variables to docker_collocation 2. trigger the check when a MDS is part of the inventory file, not when we run on an MDS... Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-13 16:03:04 +02:00
Sébastien Han	90ce4276ca	ci: use a container client VM The client won't run on centos7 anymore but on Atomic host just like the rest of the daemons. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-13 15:26:03 +02:00
Sébastien Han	3e058bff06	ci: reboot with ansible instead of vagrant reload vagrant is serialized and takes a lot of time compared to a simple reboot. See the benchmarks below for 3 VMs: [leseb@rick docker]$ time ANSIBLE_SSH_ARGS="-F /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/vagrant_ssh_config" ansible-playbook -i /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/hosts reboot.yml PLAY [mons] ************************************************************************************************************************************************************************************************** TASK [Gathering Facts] ************************************************************************************************************************************************************************************* ok: [mon1] ok: [mon2] ok: [mon0] TASK [restart machine] ************************************************************************************************************************************************************************************* changed: [mon2] changed: [mon1] changed: [mon0] TASK [wait for server to boot] ***************************************************************************************************************************************************************************** ok: [mon2 -> localhost] ok: [mon0 -> localhost] ok: [mon1 -> localhost] TASK [uptime] ********************************************************************************************************************************************************************************************** changed: [mon2] changed: [mon0] changed: [mon1] PLAY RECAP *************************************************************************************************************************************************************************************************** mon0 : ok=4 changed=2 unreachable=0 failed=0 mon1 : ok=4 changed=2 unreachable=0 failed=0 mon2 : ok=4 changed=2 unreachable=0 failed=0 real 0m35.112s user 0m5.737s sys 0m1.849s [leseb@rick docker]$ time vagrant reload ==> mon0: Halting domain... ==> mon0: Starting domain. ==> mon0: Waiting for domain to get an IP address... ==> mon0: Waiting for SSH to become available... ==> mon0: Creating shared folders metadata... ==> mon0: Rsyncing folder: /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/ => /home/vagrant/sync ==> mon0: Machine already provisioned. Run `vagrant provision` or use the `--provision` ==> mon0: flag to force provisioning. Provisioners marked to run always will still run. ==> mon1: Halting domain... ==> mon1: Starting domain. ==> mon1: Waiting for domain to get an IP address... ==> mon1: Waiting for SSH to become available... ==> mon1: Creating shared folders metadata... ==> mon1: Rsyncing folder: /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/ => /home/vagrant/sync ==> mon1: Machine already provisioned. Run `vagrant provision` or use the `--provision` ==> mon1: flag to force provisioning. Provisioners marked to run always will still run. ==> mon2: Halting domain... ==> mon2: Starting domain. ==> mon2: Waiting for domain to get an IP address... ==> mon2: Waiting for SSH to become available... ==> mon2: Creating shared folders metadata... ==> mon2: Rsyncing folder: /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/ => /home/vagrant/sync ==> mon2: Machine already provisioned. Run `vagrant provision` or use the `--provision` ==> mon2: flag to force provisioning. Provisioners marked to run always will still run. real 1m31.850s user 0m7.387s sys 0m0.796s Reboot via Ansible: 0m35.112s Reboot via vagrant: 1m31.850s We save 1/3 time. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-13 09:04:26 +02:00
Guillaume Abrioux	17623a2157	Merge pull request #2036 from ceph/cephfs-pool mds: precisely define cephfs pool	2017-10-12 17:47:10 +02:00
Sébastien Han	6bd152d555	Merge pull request #2037 from major/remove-always-run Avoid deprecated always_run	2017-10-12 17:15:28 +02:00
Sébastien Han	b49f9bda21	mds: precisely define cephfs pool We now have a variable called ceph_pools that is mandatory when deploying a MDS. It's a dictionary that contains a pool name and a PG count. PG count is mandatory and must be set, the playbook will fail otherwise. Closes: https://github.com/ceph/ceph-ansible/issues/2017 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-12 15:56:04 +02:00
Major Hayden	620fb37dd4	Avoid deprecated always_run The `always_run` key is deprecated and being removed in Ansible 2.4. Using it causes a warning to be displayed: [DEPRECATION WARNING]: always_run is deprecated. This patch changes all instances of `always_run` to use the `always` tag, which causes the task to run each time the playbook runs.	2017-10-12 08:29:44 -05:00
Guillaume Abrioux	a179e312fd	tests: add missing override for collocation scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-12 14:43:25 +02:00
Guillaume Abrioux	a2880e6345	tests: rbd/rgw adapt testinfra for jewel - the rbd-mirror unit systemd name is not the same when running jewel vs luminous. - servicemap is not available on jewel. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-12 00:06:08 +02:00
Guillaume Abrioux	a1ea6e7f59	tests: adapt current testing for collocation scenario Since we introduced collocation testing scenario, we need to adapt current tests to this new scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-09 17:25:45 +02:00
Sébastien Han	6d7b73fa91	ci: re-add osd_pool_default_size to 1 with the override If we don't do this the client will create pools with a replica count of 3, since osd_pool_default_size was gone from ceph-override.json. This was making switch_to_containers fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-09 17:25:45 +02:00
Sébastien Han	abb8c374cf	ci: use by-id instead of by-path by-id relies on the disk WWID, which is more reliable than by-path (pointing to the PCI info) Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:09 +02:00
Sébastien Han	b6b24a5ca9	iscsi: fix wrong group name for iscsi Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-05 17:25:32 +02:00
Guillaume Abrioux	53a69640c9	tests: disable shared folder Shared folder is not required for tests. We should avoid hitting the error: ``` uninitialized constant VagrantPlugins::ProviderLibvirt::Action::ShareFolders ``` Also, disabling it might reduce the time needed in certain cases for the VMs to be started. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-05 15:07:38 +02:00
Guillaume Abrioux	6aa7050acd	tests: make all subnets unique per scenario If two environments are using the same subnet, we will get into trouble because of IP address conflicts. This commit ensures each scenario has a unique subnet for both the public and cluster networks so we can set up several test environments at a time on the same hypervisor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-05 15:07:38 +02:00
Guillaume Abrioux	635111bf6a	tests: add ceph-override.json for ubuntu/cluster in addition to `18e2ab4d` this commit adds the same file for ubuntu testing scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-05 12:59:29 +02:00
Guillaume Abrioux	4135091c98	tests: fix broken osd test for xenial_cluster the path `/dev/disk/by-path/pci-0000:00:01.1-ata-1.0` doesn't exist. it has to be changed to `/dev/disk/by-path/pci-0000:00:01.1-ata-1` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-05 11:03:41 +02:00
Guillaume Abrioux	cdb5023d84	tests: fix broken tests for mds `5968cf0` broke the test on mds because of a leftover. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-04 16:48:23 +02:00
Guillaume Abrioux	2c4258a0fd	Refactor code for set_osd_pool_default_* This commit refactors the code for all `set_osd_pool_default_*` related tasks by avoiding the use of a useless `set_fact` to determine whether a key is present in `ceph_conf_overrides`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-04 15:40:10 +02:00
Sébastien Han	5968cf09b1	ci: add collocation scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-04 11:19:12 +02:00
Sébastien Han	18e2ab4d07	test: add handler support Add idempotency and handler test. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 14:44:00 +02:00
Sébastien Han	39ee25637b	test: add test for device with 'by-path' We now test devices to be passed like: /dev/disk/by-path/pci-0000:00:01.1-ata-1.0 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 14:43:57 +02:00
Sébastien Han	b4bec52442	tests: add tests for rgw-nfs rgw-nfs is part of servicemap so we should use it to make sure the process is up and running. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 02:38:24 +02:00
Sébastien Han	77fc8ba87f	Merge pull request #1931 from ceph/re-enable-iscsi iscsi: re-enable the scenario	2017-09-28 19:44:52 +02:00
Sébastien Han	67c78da056	iscsi: re-enable the scenario CentOS 7.4 vagrant box is now available so re-enabling this scenario. For more info: https://seven.centos.org/2017/09/updated-centos-vagrant-images-available-v1708-01/ Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-28 18:46:28 +02:00
Ali Maredia	ae18cf24d2	test: add test making sure rgw http endpoints are enabled Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-09-25 14:41:18 -04:00
Sébastien Han	d5bfc6f85d	mgr: always bootstrap mgr right after the mon If we don't bootstrap the mgr after the mon and the osds handler are called, we will never be able to reach a clean state since the pgs stats are handled by the mgr. This also happens when doing daemon collocation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-22 17:26:28 +02:00
Sébastien Han	c7d9838ad4	tests: add nfs container test Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-21 11:07:14 +02:00
Guillaume Abrioux	a069a6fe63	tests: temporarily disable `test_nfs_rgw_fsal_export` This test doesn't work at the moment and needs to be fixed. Disabling it temporarily to avoid errors in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-21 09:56:37 +02:00
Guillaume Abrioux	f4fc3bbfea	ci: add precise tests to validate daemons are up Add daemon health check for rgw, mds, mgr, rbd mirror. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-21 09:56:37 +02:00
Sébastien Han	66d41f342d	Merge pull request #1889 from ceph/client-containers client: ability to create keys and pool with no ceph binaries	2017-09-18 17:27:32 +02:00
Sébastien Han	85d73e3be2	client: ability to create keys and pool with no ceph binaries On a container env, machines don't have any ceph binaries so we need to use a container to run the commands. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-18 14:41:52 +02:00
Andrew Schoen	5eff7e24b0	Merge pull request #1890 from ceph/lvm-setup tests: fix lvm_setup.yml for purge_cluster.yml	2017-09-14 11:38:13 -05:00
Sébastien Han	2f51f0de28	Merge pull request #1880 from ceph/wip-rgw-nfs nfs: configure RGW FSAL to start up correctly	2017-09-13 14:20:14 -06:00
Andrew Schoen	57f2ad7ef1	tests: delete journal partitions in lvm_setup.yml Delete these before creating them in case they are left around in a purge cluster testing scenario. The purge-cluster.yml playbook does not currently remove partitions used for journals. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-09-13 15:02:54 -05:00
Sébastien Han	f67b47d056	Merge pull request #1882 from ceph/multi-journal osd: drop support for device partition	2017-09-13 11:43:48 -06:00
Sébastien Han	aa364264cd	resync ceph-iscsi-gw with old upstream Taken from https://github.com/pcuzner/ceph-iscsi-ansible/tree/tcmu-fixes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1454945 and https://bugzilla.redhat.com/show_bug.cgi?id=1484083 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-12 18:06:10 -06:00
Sébastien Han	fdf924401f	osd: drop support for device partition We have been struggling with this, it's still broken and breaking other things too now. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490283 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-12 17:42:07 -06:00
Ali Maredia	52efe92a87	nfs: configure RGW FSAL to start up correctly - Add RGW keyring to nfs node - Add RGW section to ganesha.conf - Add RGW section to ceph.conf on nfs node Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-09-12 16:27:16 -04:00
Andrew Schoen	61357c8e20	tests: no need to create a filesystem on /dev/sdc1 for lvm tests The partition only needs to be created and given a gpt label so that a PARTUUID will exist on the partition. This task also makes the purge_lvm_osds scenario fail on the second deployment after purging. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-09-12 15:14:21 -05:00
Sébastien Han	7054615551	ci: deploy rbd mirror Deploy rbd mirror in cluster scenario Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-09 01:17:10 +02:00
Sébastien Han	4f325c7ebe	ci: remove scenario bluestore_docker_cluster We don't need to bootstrap a full cluster to bootstrap bluestore. We have individual scenarios for that. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-08 19:33:24 +02:00
Ali Maredia	f8171e8b4a	nfs: rename host to have ceph- prefix Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-09-08 11:38:05 -04:00
Ali Maredia	f3e2235b3a	nfs-ganesha: add config overrides section Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-09-08 11:37:58 -04:00
Ali Maredia	c907ec41ae	nfs: add automated testing for nfs-ganesha roles Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-09-08 09:14:01 -04:00
Sébastien Han	3753e6cfa7	ceph-osd: fix autodetection activation Prior to this patch the activation sequence for autodetection was always skipped because we were asking to activate on a device without partitions, which doesn't make sense. We also fix the way we look up a device: since the data partition is always numbered 1, we take the min element of the dict. Closes: https://github.com/ceph/ceph-ansible/issues/1782 Signed-off-by: Sébastien Han <seb@redhat.com> Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-07 17:47:37 +02:00
Sébastien Han	b04946430d	Merge pull request #1812 from ceph/switch-migration-conta switch-from-non-containerized-to-containerized: mask unit files	2017-09-07 07:30:34 +02:00
Guillaume Abrioux	d987d26719	tests: force docker variable for switch-to-containers scenario we need to force the value of `docker` variable which is initially set to `false` since it's a migration from non-containerized to containerized cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-06 18:03:52 +02:00
Guillaume Abrioux	c265180158	tests: Add mgr node for all scenarios With Luminous we need to have mgr daemon. This commit adds an mgr daemon for all scenarios. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-05 10:47:10 +02:00
Sébastien Han	11122a0101	client: copy admin key so we can create pools and keys Needed when user_config is set to true Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-04 23:07:14 +02:00
Sébastien Han	f7f4a61d7b	Merge pull request #1836 from ceph/shrink-osd-mon shrink mon and osd	2017-09-01 19:57:44 +02:00
Sébastien Han	298a63c437	shrink mon and osd Rework shrinking a monitor and an OSD playbook. Also adding test scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-01 19:12:00 +02:00
Alfredo Deza	1efea767fe	tests parted should create gpt labels on new disk But only for the first partition, so that a new label doesn't blow away the previous partition created Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-31 10:09:42 -04:00
Alfredo Deza	fec030cd27	tests osds units are not enabled in lvm scenarios Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-31 08:47:42 -04:00
Andrew Schoen	3185a287de	tests: add a filesystem on /dev/sdc1 for lvm osd testing Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-30 16:16:04 -05:00
Andrew Schoen	fcba9d17f0	ceph-osd: add support for --journal vg/lv for lvm osds This also updates the tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-30 15:55:16 -05:00
Andrew Schoen	d026e04470	tests: create 2 partitions on /dev/sdc for lvm scenario testing Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-30 15:53:28 -05:00
Sébastien Han	13aac5027a	Merge pull request #1741 from ceph/refactor-installation common: refactor installation method	2017-08-30 17:42:29 +02:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Sébastien Han	ae2fd45994	common: refactor installation method The installation process is now described as follows: * you still have to choose a 'ceph_origin' installation method. The origin can be a 'repository' (add a new repository), distro (it will use the packages provided by the native repo source of your distribution), local (only available on redhat system, it installs locally built packages). This option is not well tested, so use it carefully * if ceph_origin == 'repository' you will have to decide what kind of repository you want to enable: - community: corresponds to the stable upstream/community version - enterprise: corresponds to the stable enterprise/downstream version (basically you are a red hat customer) - dev: it will install ceph from packages built out of the github development branches Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-30 10:52:01 +02:00
Ali Maredia	eab9d52ddb	tests: fix duplicate osd service test Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-08-29 21:24:13 -04:00
Guillaume Abrioux	b264bfece0	tests: Update tests according to `ceph-config` role implementation Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-24 11:33:02 +02:00
Andrew Schoen	594d5e017a	ceph-osd: restructure lvm_volumes variable for more flexibility The lvm_volumes variable is now a list of dictionaries that represent each OSD you'd like to deploy using ceph-volume. Each dictionary must have the following keys: data, journal and data_vg. Each dictionary can also optionally provide a journal_vg key. The 'data' key represents the lv name used for the OSD and the 'data_vg' key is the vg name that the given lv resides on. The 'journal' key is either an lv, device or partition. The 'journal_vg' key is optional and must be the vg name for the journal lv if given. This key is mainly used for purging of the journal lv if purge-cluster.yml is run. For example: lvm_volumes: - data: data_lv1 journal: journal_lv1 data_vg: vg1 journal_vg: vg2 - data: data_lv2 journal: /dev/sdc data_vg: vg1 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-23 10:14:14 -05:00
SirishaGuduru	1359869497	Common: changed civetweb line in rgw section(conf) Resolves issue: Multiple RGW Ceph.conf Issue #1258 In a multi-RGW setup, the RGW sections in ceph.conf contain an identical bind IP in the civetweb line. This modification fixes that issue and puts the right IP for each RGW. Signed-off-by: SirishaGuduru SGuduru@walmartlabs.com Modified ceph-defaults and ran generate_group_vars_sample.sh group_vars/osds.yml.sample and group_vars/rhcs.yml.sample are not part of the changes. But they got modified when generate_group_vars_sample.sh is run to generate group_vars/all.yml.sample. Uncommented added variables in ceph-defaults Updated tests by adding value for radosgw_interface Added radosgw_interface to centos cluster tests Modified ceph-rgw role, rebased and ran generate_group_vars_sample.sh In ceph-rgw role removed check_mandatory_vars.yml. Rebased on master. Ran generate_group_vars_sample.sh and then the below files got modified.	2017-08-23 15:03:37 +05:30
Alfredo Deza	9540f472a7	tests/rgw: update tests to use new host fixture Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-21 16:50:02 -04:00
Alfredo Deza	e3ff46dce2	tests/osd add tests for ceph-volume* executables Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-21 16:50:02 -04:00
Alfredo Deza	75060119c1	tests/osd: update tests to use new host fixture Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-21 16:50:01 -04:00
Alfredo Deza	31a960c323	tests/mons: update tests to use new host fixture Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-21 16:50:01 -04:00
Alfredo Deza	ecf917d354	tests/install: update tests to use new host fixture Signed-off-by: Alfredo Deza <adeza@redhat.com>	2017-08-21 16:50:01 -04:00
Andrew Schoen	7ab3711cf5	tests: do not use /dev/sda in the lvm scenario When you update to the latest version of the centos/7 box it always puts the OS on /dev/sda, so do not use it as an OSD. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-04 15:57:56 -05:00
Andrew Schoen	e597628be9	lvm: update scenario for new osd_scenario variable Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-04 06:38:36 -05:00
Andrew Schoen	66df80d600	tests: do not use sudo with dev_setup.yml This causes problems when the tests are run locally and not in the CI Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-04 06:13:09 -05:00
Andrew Schoen	661de0f3b0	tests: adds an lvm_osds testing scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-04 06:13:09 -05:00
Sébastien Han	30991b1c0a	osd: simplify scenarios There are only two main scenarios now: * collocated: everything remains on the same device: - data, db, wal for bluestore - data and journal for filestore * non-collocated: dedicated device for some of the components Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-03 10:20:39 +02:00
Sébastien Han	8ac7d2e4c9	osd: do not enable osd@id unit file ceph-disk is responsible for enabling the unit file if needed. Actually since https://github.com/ceph/ceph/pull/12241 it seems that it's not even needed. In the event of a restart, udev rules will be triggered and they will ceph-disk activate the device too, so the 'enabled' is not needed. Closes: https://github.com/ceph/ceph-ansible/issues/1142 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-26 17:17:57 +02:00
Alfredo Deza	bc0678e17d	Merge pull request #1687 from ceph/dev-tests tests: run all existing tests with shaman repos	2017-07-18 08:53:25 -04:00
Guillaume Abrioux	0c1352277c	tests: add config in ceph_conf_overrides to journal collocation tests Add config in ceph_conf_overrides options to journal collocation tests. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-18 01:02:02 +02:00
Guillaume Abrioux	0570008718	tests: Add a client node to docker scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-17 22:47:35 +02:00
Andrew Schoen	7293c40c40	tests: run all existing tests with shaman repos If you use the 'dev' factor, the testing scenario will use repos from shaman.ceph.com. You can define CEPH_DEV_BRANCH and CEPH_DEV_SHA1 to specify which repo you'd like to test. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-07-17 14:42:32 -05:00
Guillaume Abrioux	14dbcb122c	tests: fix test_osds_listen_on_* tests The `test_osds_listen_on_*` tests assume OSDs will always listen on consecutive TCP ports starting from `6800`. E.g., if you have 2 OSDs, the tests assume they listen on 2 ports for each network (`public_network` and `cluster_network`), therefore: `6800, 6801, 6802, 6803`. But sometimes it doesn't happen this way and you can get OSDs listening on TCP ports like this: `6800, 6801, 6802, 6805`. The tests then fail when they shouldn't. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-13 12:37:15 +02:00
Andrew Schoen	7029be70fa	tests: remove monitor_interface from centos/7/cluster/group_vars/all This is to ensure that the template must use the values set in the inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-07-12 16:34:41 +02:00
Guillaume Abrioux	1183112ee7	Tests: Add an mgr node to dmcrypt-dedicated-journal Add an mgr node to `dmcrypt-dedicated-journal` scenario testing. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:39 +02:00
Guillaume Abrioux	f16841e09e	Tests: rename tests directories Since we are hitting this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1324587 e.g. `failed: internal error: Monitor path /var/lib/libvirt/qemu/domain-bs-docker-cluster-dmcrypt-journal-collocation_mon0_1499294943_ba9faf7bf296533177f6/monitor.sock too big for destination`, and we can't upgrade libvirt in our CI for some reason, we need to make the directory names shorter in order to work around this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:39 +02:00
Guillaume Abrioux	94c3756167	Tests: Add bluestore scenarios Since we started testing against Luminous, we need to add more testing scenarios. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:32 +02:00
Sébastien Han	476d1677eb	tests: allow bluestore devices Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-11 22:08:35 +02:00
Sébastien Han	ed8a7cf1e0	tests: fix block.db partition size Our devices in the CI are 12GB; they are not big enough for the default size, so reduce it. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-11 22:08:35 +02:00
Sébastien Han	035846217e	Merge pull request #1627 from ceph/ceph-osd-prepare-script osd: docker, refactor ceph-osd-run.sh.j2	2017-07-06 16:08:59 +02:00
Guillaume Abrioux	527eb050fc	Tests: fix scenario for docker-cluster-dmcrypt-journal-collocation The scenario set in `group_vars/all` for docker-cluster-dmcrypt-journal-collocation is not the correct one. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-06 08:10:50 +02:00
Sébastien Han	044802a979	test: fix docker dmcrypt collocated scenario We were setting journal_collocation and used raw_journal_devices which is definitely wrong. We should just stick with devices. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-04 19:34:26 +02:00
Guillaume Abrioux	1c8680ef2d	Tests: Add bluestore tests Add two scenarios bluestore_journal_collocation and bluestore_cluster. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-04 19:07:23 +02:00
Guillaume Abrioux	896d62d78b	Refact: remove ceph_mon_docker_interface variable remove `ceph_mon_docker_interface` and use `monitor_interface` instead for both containerized and non-containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-04 18:08:59 +02:00
Guillaume Abrioux	2a52d5b555	Tests: update tests according to ipv6 support Since ceph.conf.j2 has been updated to add ipv6 support, the different variables in many scenarios need to be updated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-04 10:57:27 +02:00
Guillaume Abrioux	35ad0e1c11	Remove duplicate entry in test Vagrantfile remove some leftover since code has been refactored Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-15 16:42:58 +02:00
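
The port assumption described in commit 14dbcb122c can be illustrated with a small sketch. This is not the code from that commit: the function names are hypothetical, and the relaxed check (counting listening ports inside Ceph's default OSD bind range of 6800 to 7300) is only one possible way to avoid depending on consecutive port numbers.

```python
def consecutive_ports(num_osds, networks=2, start=6800):
    """Ports a strict test would expect: one per OSD per network,
    numbered consecutively from `start` (hypothetical helper)."""
    return list(range(start, start + num_osds * networks))


def strict_check(listening, num_osds):
    """Pass only if every expected consecutive port is being listened on."""
    return set(consecutive_ports(num_osds)) <= set(listening)


def relaxed_check(listening, num_osds, networks=2, port_min=6800, port_max=7300):
    """Pass if enough distinct ports inside the OSD bind range are in use,
    regardless of which exact port numbers were picked.
    Assumption: port_min/port_max mirror Ceph's default bind range."""
    in_range = {p for p in listening if port_min <= p <= port_max}
    return len(in_range) >= num_osds * networks


if __name__ == "__main__":
    # Port list taken from the commit message: 2 OSDs, 2 networks.
    observed = [6800, 6801, 6802, 6805]
    print("strict:", strict_check(observed, 2))
    print("relaxed:", relaxed_check(observed, 2))
```

With the port list quoted in the commit message, the strict check rejects a healthy cluster because 6803 is missing, while the relaxed check accepts it.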

475 Commits (18e3c7a0a2f5ff1f2482e519178a00cec0c81420)