ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	9d2f2108e1	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 20:22:12 +02:00
Dimitri Savineau	957903d561	cephadm: add playbook This adds a new playbook for deploying ceph via cephadm. This also adds a new dedicated tox file for CI purpose. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-16 11:40:45 -04:00
Dimitri Savineau	fc599ed9f5	tests: remove nfs_ganesha_stable_branch variable We don't need to override this variable in the group_vars but use the default value instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-06 16:58:59 +02:00
Guillaume Abrioux	5b6f5486f7	tests: update nfs-ganesha to V3.3-stable not really needed in master, commit intended to be backported in octopus branch. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-05 17:10:40 +02:00
Dimitri Savineau	829990e60d	ceph-osd: remove ceph-osd-run.sh script Since we only have one scenario since nautilus then we can just move the container start command from ceph-osd-run.sh to the systemd unit service. As a result, the ceph-osd-run.sh.j2 template and the ceph_osd_docker_run_script_path variable are removed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-18 17:51:13 +02:00
Jan Fajerski	1fe8e819f9	lvm_setup: lookup device from inventory, default to /dev/sd* names This fixes a long standing fail in ceph-volumes lvm test suite. Otherwise the default behaviour should not change. Signed-off-by: Jan Fajerski <jfajerski@suse.com>	2020-06-16 18:17:34 +02:00
Guillaume Abrioux	83faf94351	tests: update pools definitions setting attributes with empty string is a bad user input. Also, removing `rule_name` attribute when creating a code erasure pool. (this rule isn't intended for code erasure pool type). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-16 07:31:57 +02:00
Dimitri Savineau	252e78b4e4	docker2podman: manage dashboard nodes The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus) were not managed during the docker to podman migration. This adds the systemd container template of those services to a dedicated file (systemd.yml) in order to include it in the docker2podman playbook. This also adds the dashboard container images pull from docker to podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-13 12:02:00 +02:00
Dimitri Savineau	2547ab601a	Readd CentOS 7 with conditions The CentOS 7 distribution could still be used for deploying ceph if - it's a containerized deployment - it's a non containerized deployment without the dashboard (due to missing python3 libraries). The ceph_stable_redhat_distro variable has been removed because we can rely on the ansible_distribution_major_version fact instead. The copr el8 repository configuration is only applied for CentOS 8. The ceph-mgr-dashboard package is only installed when the dashboard_enabled variable is set to true. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-23 13:31:11 +02:00
Guillaume Abrioux	86959abf9b	tests: add back nfs testing on master This commit adds back nfs testing on master branch (containerized scenario only). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-23 13:27:48 +02:00
Guillaume Abrioux	8c1c34b201	tests: add more coverage in external_clients scenario Run create_users_keys.yml in external_clients scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-31 14:49:38 -04:00
Dimitri Savineau	0f0a14772c	tests: update mgr dashboard socket listening test Since `15ed9ee` the ceph-mgr daemon binds on the IP address on the public network instead of binding on all addresses. This commit updates the testinfra code to reflect that change. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-24 21:11:02 +01:00
Dimitri Savineau	f2c6281207	tests: add dashboard testinfra configuration This commit adds basic tests for grafana, prometheus, node-exporter and ceph mgr dashboard services. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-24 15:19:18 +01:00
Dimitri Savineau	fb69f6990c	dashboard: allow to set read-only admin user This commit allows one to set the role for the admin user as read-only. This can be controlled via the dashboard_admin_user_ro variable but the default value is false for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-19 15:34:41 +01:00
Guillaume Abrioux	60a2e28189	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-12 16:44:48 -04:00
Dimitri Savineau	e62532de46	update osd pool set size command Since [1] we can't use osd pool without replicas (size: 1) by default. We now need to set the mon_allow_pool_size_one flag to true in the ceph configuration and add the --yes-i-really-mean-it flag to the osd pool set size cli. [1] https://github.com/ceph/ceph/commit/21508bd Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-11 11:25:42 +01:00
Ali Maredia	71f55bd54d	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove multisite destroy playbook and add --cluster before radosgw-admin commands remove manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in that rgws hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-03-04 12:58:13 -05:00
Guillaume Abrioux	9f0c6df94f	tests: add more osd nodes in all_daemons scenario This commit adds more osd nodes in all_daemons scenario in order to test erasure pool creation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	248978596a	tests: update ooo job This commit changes the value passed for the attribute 'rule_name' in openstack_pools definition. It doesn't make sense to have empty string as passed value here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	8cacba1f54	tests: add erasure pool creation test in CI This commit makes the CI testing an OSD pool erasure creation due to the recent refact of the OSD pool creation tasks in the playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	a3b797e059	tests: enable pg autoscaler on 1 pool This commit enables the pg autoscaler on 1 pool. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	896d00b50e	tests: add lvm batch filestore testing This commit adds an OSD node in lvm-batch scenario in order to test filestore backend. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-03 13:50:19 -05:00
Guillaume Abrioux	0fc99bb6fa	tests: increase journal_size value Looks like we are still seeing issue [1]. Let's increase this value to unlock the CI (however, it still needs to be investigated). Typical error (see [1] for further details) : ``` [root@osd2 ~]# ceph-volume --cluster ceph lvm batch --filestore --yes --journal-size '2048' /dev/sda /dev/sdb --journal-devices /dev/sdc Running command: /sbin/vgcreate --force --yes ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78 /dev/sdc stdout: Physical volume "/dev/sdc" successfully created. stdout: Volume group "ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78" successfully created --> Refusing to continue with configured size for journal --> RuntimeError: journal sizes must be larger than 2GB, detected: 1024.00 MB ``` [1] https://tracker.ceph.com/issues/41374 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-03 13:23:57 -05:00
Dimitri Savineau	ac0f68ccf0	ceph-dashboard: update create/get rgw user tasks Since [1] if a rgw user already exists then the radosgw-admin user create command will return an error instead of modifying the current user. We were already doing separated tasks for create and get operation but only for multisite configuration but it's not enough. Instead we should do the get task first and depending on the result execute the create. This commit also adds missing run_once and delegate_to statement. [1] https://github.com/ceph/ceph/commit/269e9b9 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-18 10:22:21 +01:00
Ali Maredia	1834c1e48d	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 16:07:43 +01:00
Dimitri Savineau	779a4a6d71	tests: don't install s3cmd on containerized setup The s3cmd package should only be installed on non containerized deployment. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 11:27:52 +01:00
Guillaume Abrioux	910fc61fdc	tests: remove legacy `osd_scenario` variable As of stable-4.0 most of these references aren't needed anymore. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-04 10:05:33 +01:00
Guillaume Abrioux	641729357e	tests: add external_clients scenario This commit adds a new 'external ceph clients' scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-31 12:02:15 +01:00
Guillaume Abrioux	c040199c8f	tests: set dashboard\|grafana_admin_password Set these 2 variables in all test scenarios where `dashboard_enabled` is `True` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-29 08:45:34 +01:00
Guillaume Abrioux	3e7dbb4b16	tests: add 'all_in_one' scenario Add new scenario 'all_in_one' in order to catch more collocated related issues. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-27 15:30:45 -05:00
Dimitri Savineau	bb3eae0c80	filestore-to-bluestore: fix osd_auto_discovery When osd_auto_discovery is set then we need to refresh the ansible_devices fact after the filestore OSD purge otherwise the devices fact won't be populated. Also remove the gpt header on ceph_disk_osds_devices because the devices is empty at this point for osd_auto_discovery. Adding the bool filter when needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-22 09:36:09 +01:00
Dimitri Savineau	f995b079a6	filestore-to-bluestore: --destroy with raw devices We still need --destroy when using a raw device otherwise we won't be able to recreate the lvm stack on that device with bluestore. Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' /dev/sdd: physical volume not initialized. --> Was unable to complete a new OSD, will rollback changes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-21 11:37:39 -05:00
Dimitri Savineau	3900527e16	tests/setup: update mount options on EL 8 The nobarrier mount flag doesn't exist anymore on XFS in the EL 8 kernel. That's why the task wasn't working on those systems. We can still use the other options instead of skipping the task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-11 05:33:01 +01:00
Guillaume Abrioux	dc672e86ec	tests: add a docker2podman scenario This commit adds a new scenario in order to test docker-to-podman.yml migration playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-10 10:21:29 -05:00
Guillaume Abrioux	4f2baaab8c	tests: disable nfs testing nfs-ganesha makes the CI failing because of issue related to SELinux. See: - https://bugzilla.redhat.com/show_bug.cgi?id=1788563 - https://github.com/nfs-ganesha/nfs-ganesha/issues/527 Until we can get this fixed, let's disable nfs-ganesha testing temporarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-08 11:13:46 +01:00
Dimitri Savineau	7b3e6b932c	tests/functional: change docker to podman Some docker commands were hardcoded in tests playbooks and some conditions were not taking care of the containerized_deployment variable but only the atomic fact. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-08 11:13:46 +01:00
Guillaume Abrioux	217d95abb2	common: add centos8 support Ceph octopus only supports CentOS 8. This commit adds CentOS 8 support: - update vagrant image in tox configurations. - add CentOS 8 repository for el8 dependencies. - CentOS 8 container engine is podman (same as RHEL 8). - don't use the epel mirror on sepia because it's epel7 only. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-08 11:13:46 +01:00
Guillaume Abrioux	40de34fb5e	tests: add filestore_to_bluestore job This commit adds a new job in order to test the filestore-to-bluestore.yml infrastructure playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-11 09:04:41 -05:00
Dimitri Savineau	4a6d19dae2	tests: reduce max_mds from 3 to 2 Having max_mds value equals to the number of mds nodes generates a warning in the ceph cluster status: cluster: id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a health: HEALTH_WARN insufficient standby MDS daemons available (...) services: mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active} Let's use 2 active and 1 standby mds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-04 14:07:29 -05:00
Dimitri Savineau	3f29b243ea	tests: fix cluster health status The current ceph cluster health is in warning state: health: HEALTH_WARN 13 pool(s) have no replicas configured 2 pool(s) have non-power-of-two pg_num Because we're using only 1 replica then we need to disable the redundancy check. The pool pg num should be a power of two number (like 16). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-27 16:20:17 +01:00
Dimitri Savineau	ef2cb99f73	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 16:25:46 +01:00
Guillaume Abrioux	db77fbda15	tests: add coverage on purge playbook This commit adds a playbook to be played before we run purge playbook, it first creates an rbd image then map an rbd device on client0 so the purge playbook will try to unmap it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-08 09:06:11 -05:00
Guillaume Abrioux	384161edcd	tests: fix keyring creation in ooo_collocation This commit removes the backslash in allow command parameter, this was needed before the ceph_key module integration. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-22 13:45:19 +02:00
Dimitri Savineau	3c2840da03	tests: update container tag for ooo_collocation It doesn't make sense to test the old 3.0.x container images with nautilus+ ceph releases. Also disable the dashboard deployment and switch to bluestore backend. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-22 13:45:19 +02:00
Guillaume Abrioux	25b98b2ce3	tests: add multimds coverage This commit makes the all_daemons scenario deploying 3 mds in order to cover the multimds case. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-18 13:43:13 -04:00
Dimitri Savineau	2c03c6fcd3	tests: fix the size on the second data LV The commit replaces the pv/vg/lv commands used with the ansible command module by the lvg and lvol modules. This also fixes the size of the second data LV because we were only using 50% of the remaining space instead of 100%. With a 50G device, the result was: - data-lv1 was 25G - data-lv2 was 12.5G Instead of: - data-lv1 was 25G - data-lv2 was 25G Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-17 15:49:15 -04:00
Dimitri Savineau	04ec1ad3cc	tests: reduce handler mon and osd delay We don't need to have high handler delay in the CI so reducing to 10 seconds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-09 09:08:20 +02:00
Dimitri Savineau	010158ff84	tests: fix rgw multisite vagrant variables The secondary vagrant variables didn't have the grafana vm variable set which creates a vagrant error. There was an error loading a Vagrantfile. The file being loaded and the error message are shown below. This is usually caused by an invalid or undefined variable. This patch also changes the ssh-extra-args parameter to ssh-common-args to get the same values for ssh/sftp/scp. Otherwise we can see warnings from ansible and some tasks are failing. [WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1 to see detailed information It also updates the ssh-common-args value for the rgw-multisite scenario to reflect the ANSIBLE_SSH_ARGS environment variable value. Finally changing the IP addresses due to the Vagrant refact done in the commit `778c51a` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-04 15:12:50 -04:00
Guillaume Abrioux	01f6dd52b3	tests: remove debug log verbosity This was added for debugging purpose. It's generating very large log output, let's remove this now. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-28 11:20:49 +02:00
Guillaume Abrioux	5bb6a4da42	tests: set copy_admin_key at group_vars level setting it at extra vars level prevent from setting it per node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-26 11:35:24 +02:00


475 Commits (18e3c7a0a2f5ff1f2482e519178a00cec0c81420)