ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	fd0b9491b6	ansible: bump to ansible 2.9 Prior to this commit we were supporting both ansible 2.8 and 2.9. Let's drop 2.8 now. Closes: #5459 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1879178 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-15 13:13:09 -04:00
Guillaume Abrioux	f31258d604	tests: do not run node_exporter test on clients We need to skip these tests on client nodes since we don't deploy node_exporter on them anymore Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5650a6d7d0`)	2020-09-14 16:13:25 -04:00
Dimitri Savineau	47f24ec047	Add CentOS 8 support for rpm deployment We were only supporting CentOS 8 for containerized deployment. Since Nautilus 14.2.10 we now have el8 rpm packages so we should be able to deploy a nautilus ceph cluster with el8. Note that the nfs-ganesha isn't supported because there's no el8 rpm packages for nfs-ganesha V2.8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 20:38:34 -04:00
Dimitri Savineau	0f7da8b9d1	pytest: register ceph_crash mark Otherwise we see some pytest warning. PytestUnknownMarkWarning: Unknown pytest.mark.ceph_crash - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `03d4620269`)	2020-09-10 20:35:04 -04:00
Guillaume Abrioux	66dde0034b	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9d2f2108e1`)	2020-09-10 20:35:04 -04:00
Dimitri Savineau	d461631c86	tests: use grafana from quay.io This changes the grafana container image registry from docker.io to quay.io to avoid rate limit. This also adds the missing container image values for docker2podman and podman scenarios. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dd05d8ba90`)	2020-09-10 21:37:06 +02:00
Guillaume Abrioux	2001039c0e	tests: migrate to quay.ceph.io registry in order to avoid docker.io rate limiting Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `218aedaab6`)	2020-09-10 21:37:06 +02:00
Guillaume Abrioux	2754895b89	tests: move erasure pool testing in lvm_osds This commit moves the erasure pool creation testing from `all_daemons` to `lvm_osds` so we can decrease the number of osd nodes we spawn so the OVH Jenkins slaves are less overwhelmed when an `all_daemons` based scenario is being tested. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8476beb5b1`)	2020-08-20 14:16:57 +02:00
Guillaume Abrioux	bd5cde631b	tests: refact shrink_osd scenario This adds more coverage on the shrink_osd scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7efea219d6`)	2020-08-06 13:10:42 +02:00
Guillaume Abrioux	9e40062570	tests: lvm_setup.yml, add carriage return This commit adds crlf between each task. It makes the playbook more readable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8ef9fb68bc`)	2020-07-22 18:47:27 -04:00
Guillaume Abrioux	53793b352e	tests: (lvm_setup.yml), don't shrink lvol when rerunning lvm_setup.yml on existing cluster with OSDs already deployed, it fails as follows: ``` fatal: [osd0]: FAILED! => changed=false msg: Sorry, no shrinking of data-lv2 to 0 permitted. ``` because we are asking `lvol` module to create a volume on an empty VG with size extents = `100%FREE`. The default behavior of `lvol` is to shrink the volume if the LV's current size is greater than the requested size. Given the requested size is calculated like this: `size_requested = size_percent * this_vg['free'] / 100` in our case, it is similar to: `size_requested = 100 * 0 / 100` which basically means `0` So the current LV size is well greater than the requested size which leads the module to attempt to shrink it to 0 which obviously isn't allowed. Adding `shrink: false` to the module calls fixes this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `218f4ae361`)	2020-07-22 18:47:27 -04:00
Dimitri Savineau	056a4fe866	ceph-dashboard: update create/get rgw user tasks Since [1] if a rgw user already exists then the radosgw-admin user create command will return an error instead of modifying the current user. We were already doing separated tasks for create and get operation but only for multisite configuration but it's not enough. Instead we should do the get task first and depending on the result execute the create. This commit also adds missing run_once and delegate_to statement. [1] https://github.com/ceph/ceph/commit/269e9b9 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ac0f68ccf0`)	2020-07-20 21:21:57 +02:00
Jan Fajerski	14e9672f00	lvm_setup: lookup device from inventory, default to /dev/sd* names This fixes a long standing fail in ceph-volumes lvm test suite. Otherwise the default behaviour should not change. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit `1fe8e819f9`)	2020-06-29 10:25:58 +02:00
Dimitri Savineau	a99c94ea11	ceph-osd: remove ceph-osd-run.sh script Since we only have one scenario since nautilus then we can just move the container start command from ceph-osd-run.sh to the systemd unit service. As a result, the ceph-osd-run.sh.j2 template and the ceph_osd_docker_run_script_path variable are removed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `829990e60d`)	2020-06-23 17:35:01 +02:00
Guillaume Abrioux	8ef3fee41b	ceph_volume: make zap function idempotent This commit makes the zap function idempotent, especially when using lvm_volumes variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3f47236470`)	2020-06-23 10:49:07 +02:00
Dimitri Savineau	a97e24fee9	docker2podman: manage dashboard nodes The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus) were not managed during the docker to podman migration. This adds the systemd container template of those services to a dedicated file (systemd.yml) in order to include it in the docker2podman playbook. This also adds the dashboard container images pull from docker to podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `252e78b4e4`)	2020-06-03 13:20:24 -04:00
Dimitri Savineau	e34c95d28f	tests: update mgr dashboard socket listening test Since `15ed9ee` the ceph-mgr daemon binds on the IP address on the public network instead of binding on all addresses. This commit updates the testinfra code to reflect that change. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0f0a14772c`)	2020-04-07 15:25:45 +02:00
Dimitri Savineau	47be2a2719	tests: register mark in pytest configuration Unregistered marks generate warnings like: PytestUnknownMarkWarning: Unknown pytest.mark.docker - is this a typo? You can register custom marks to avoid this warning https://docs.pytest.org/en/latest/mark.html Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ac4f8763aa`)	2020-04-07 15:25:45 +02:00
Dimitri Savineau	1cfb84ae94	tests: add dashboard testinfra configuration This commit adds basic tests for grafana, prometheus, node-exporter and ceph mgr dashboard services. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f2c6281207`)	2020-04-07 15:25:45 +02:00
Guillaume Abrioux	825aed5ec1	ceph_key: remove 'update' state With this change, the state `present` is enough to update a keyring. If the keyring already exists, it will be updated if caps or secret passed to the module are different. If the keyring doesn't exist, it will be created. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808367 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `553584cbd0`)	2020-04-01 18:08:51 -04:00
Guillaume Abrioux	03355aec8c	tests: add more coverage in external_clients scenario Run create_users_keys.yml in external_clients scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c1c34b201`)	2020-03-31 19:42:40 -04:00
Dimitri Savineau	dcd02e6494	ceph_volume: fix multiple db/wal devices When using the lvm batch ceph-volume subcommand with dedicated devices for bluestore (db/wal) then the list of devices is converted to a string instead of being extended via an iterable. This was working with only one dedicated device, but with more than one the ceph_volume module fails. TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] ** fatal: [xxxxxx]: FAILED! => changed=true cmd: - ceph-volume - --cluster - ceph - lvm - batch - --bluestore - --yes - --prepare - --osds-per-device - '4' - /dev/nvme2n1 - /dev/nvme3n1 - /dev/nvme4n1 - /dev/nvme5n1 - /dev/nvme6n1 - --db-devices - /dev/nvme0n1 /dev/nvme1n1 - --report - --format=json msg: non-zero return code rc: 2 stderr: \|2- stderr: lsblk: /dev/nvme0n1 /dev/nvme1n1: not a block device stderr: error: /dev/nvme0n1 /dev/nvme1n1: No such file or directory stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected. usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]] [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]] [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]] [--no-auto] [--bluestore] [--filestore] [--report] [--yes] [--format {json,pretty}] [--dmcrypt] [--crush-device-class CRUSH_DEVICE_CLASS] [--no-systemd] [--osds-per-device OSDS_PER_DEVICE] [--block-db-size BLOCK_DB_SIZE] [--block-wal-size BLOCK_WAL_SIZE] [--journal-size JOURNAL_SIZE] [--prepare] [--osd-ids [OSD_IDS [OSD_IDS ...]]] [DEVICES [DEVICES ...]] ceph-volume lvm batch: error: Unable to proceed with non-existing device: /dev/nvme0n1 /dev/nvme1n1 So the dedicated device list is considered as a single string. This commit also adds the block_db_devices and wal_devices documentation to the ceph_volume module. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816713 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `760b6cd7b0`)	2020-03-30 10:04:26 -04:00
Guillaume Abrioux	d682cf6de5	tests: add inventory host for 5.0 upgrade job This inventory is intended to be used in the upgrade scenario in stable-5.0 branch. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-26 11:23:23 +01:00
Dimitri Savineau	55c222d088	dashboard: allow to set read-only admin user This commit allows one to set the role for the admin user as read-only. This can be controlled via the dashboard_admin_user_ro variable but the default value is false for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fb69f6990c`)	2020-03-19 13:24:05 -04:00
Guillaume Abrioux	c26e80fdbf	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60a2e28189`)	2020-03-12 19:04:26 -04:00
Dimitri Savineau	3fc4cc9f62	tests/requirements: bump testinfra 3.4 is the latest testinfra release available but python2 is dropped starting 4.0. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ccec67aa6a`)	2020-03-09 13:33:44 +01:00
Guillaume Abrioux	b9e397ebaf	tests: add more osd nodes in all_daemons scenario This commit adds more osd nodes in all_daemons scenario in order to test erasure pool creation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9f0c6df94f`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	0800c60721	tests: update ooo job This commit changes the value passed for the attribute 'rule_name' in openstack_pools definition. It doesn't make sense to have an empty string as the passed value here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `248978596a`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	6cc9c28c5d	tests: add erasure pool creation test in CI This commit makes the CI testing an OSD pool erasure creation due to the recent refact of the OSD pool creation tasks in the playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8cacba1f54`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	01559892d1	tests: enable pg autoscaler on 1 pool This commit enables the pg autoscaler on 1 pool. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a3b797e059`)	2020-03-06 16:10:03 +01:00
Ali Maredia	2c440d4427	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove the multisite destroy playbook and add --cluster before radosgw-admin commands. Remove the manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in each rgw's hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `71f55bd54d`)	2020-03-04 14:39:23 -05:00
Guillaume Abrioux	96b7857347	requirements: enforce ansible version requirement See https://github.com/advisories/GHSA-3m93-m4q6-mc6v Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a2d2e70ac2`)	2020-02-27 09:56:55 -05:00
Ali Maredia	7d2a217270	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1834c1e48d`)	2020-02-17 17:44:53 -05:00
Guillaume Abrioux	e3cd719ebe	tests: add external_clients scenario This commit adds a new 'external ceph clients' scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `641729357e`)	2020-01-31 13:37:10 +01:00
Guillaume Abrioux	2e7d7b70ed	tests: set dashboard\|grafana_admin_password Set these 2 variables in all test scenarios where `dashboard_enabled` is `True` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c040199c8f`)	2020-01-29 14:15:41 +01:00
Guillaume Abrioux	d5dca5087a	tests: add 'all_in_one' scenario Add new scenario 'all_in_one' in order to catch more collocated related issues. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3e7dbb4b16`)	2020-01-27 17:54:39 -05:00
Dimitri Savineau	0abea70e29	filestore-to-bluestore: fix osd_auto_discovery When osd_auto_discovery is set then we need to refresh the ansible_devices fact after the filestore OSD purge otherwise the devices fact won't be populated. Also remove the gpt header on ceph_disk_osds_devices because the device list is empty at this point for osd_auto_discovery. Adding the bool filter when needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bb3eae0c80`)	2020-01-22 10:06:17 +01:00
Dimitri Savineau	e4965e9ea9	filestore-to-bluestore: --destroy with raw devices We still need --destroy when using a raw device otherwise we won't be able to recreate the lvm stack on that device with bluestore. Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151' /dev/sdd: physical volume not initialized. --> Was unable to complete a new OSD, will rollback changes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f995b079a6`)	2020-01-21 18:26:55 +01:00
Guillaume Abrioux	fc7212b192	tests: add time command in vagrant_up.sh monitor how long it takes to get all VMs up and running Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `16bcef4f28`)	2020-01-10 17:41:27 +01:00
Guillaume Abrioux	2c96155c32	tests: retry to fire up VMs on vagrant failure Add a script to retry several times to fire up VMs to avoid vagrant failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `1ecb3a9352`)	2020-01-10 17:41:27 +01:00
Guillaume Abrioux	7c2918d684	tests: add a docker2podman scenario This commit adds a new scenario in order to test docker-to-podman.yml migration playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `dc672e86ec`)	2020-01-10 17:41:27 +01:00
Dimitri Savineau	bd016960cf	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ef2cb99f73`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	09ccf22052	tests: use community repository We don't need to use dev repository on stable branches. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-09 21:39:23 +01:00
Dimitri Savineau	d625cefbac	tests: use ceph iscsi stable repository The ceph iscsi repository was still set to dev (shaman) instead of using the stable ceph-iscsi repository. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-08 19:29:59 +01:00
Guillaume Abrioux	78799ecf55	tests: add filestore_to_bluestore job This commit adds a new job in order to test the filestore-to-bluestore.yml infrastructure playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `40de34fb5e`)	2019-12-11 16:37:21 +01:00
Dimitri Savineau	acf2476f09	tests: reduce max_mds from 3 to 2 Having max_mds value equals to the number of mds nodes generates a warning in the ceph cluster status: cluster: id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a' health: HEALTH_WARN' insufficient standby MDS daemons available' (...) services: mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}' Let's use 2 active and 1 standby mds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4a6d19dae2`)	2019-12-04 17:49:33 -05:00
Guillaume Abrioux	99cdcf9d29	tests: add coverage on purge playbook This commit adds a playbook to be played before we run purge playbook, it first creates an rbd image then map an rbd device on client0 so the purge playbook will try to unmap it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `db77fbda15`)	2019-11-14 10:49:38 -05:00
Dimitri Savineau	54365389f8	tests/requirements: bump testinfra and pytest The ansible ssh connections are now using the ssh backend instead of paramiko starting testinfra 3.1 and persistent connections too. pytest 4.6 is the latest release to be supported by python 2. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `02df2ab5ea`)	2019-11-04 12:58:37 -05:00
Dimitri Savineau	359abedbb1	move library/plugins tests files under tests dir To avoid unnecessary ansible warnings during playbook execution we can move the library and plugins test files under a different directory. [WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as it seems to be invalid: cannot import name 'ipaddrs_in_ranges' Closes: #4656 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6ce4fde820`)	2019-10-28 15:54:31 +01:00
Guillaume Abrioux	2fd51f8097	tests: use osd ids instead of device name in ooo_collocation on master, it doesn't make sense anymore to use device name, we should use osd id instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b5a61fe2e3`)	2019-10-23 17:17:24 +02:00
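
The arithmetic described in commit 53793b352e can be sketched in a few lines of Python. This is a simplified model, not the `lvol` module's actual code: the function name `plan_resize`, its return values, and the example extent counts are hypothetical. Only the `size_requested` formula and the `shrink` option come from the commit message.

```python
def plan_resize(current_size, size_percent, vg_free, shrink=True):
    """Hypothetical model of the decision described in the commit message.

    current_size: extents currently used by the logical volume
    size_percent: requested percentage of the VG's free extents (100 for 100%FREE)
    vg_free:      free extents left in the volume group
    shrink:       whether shrinking is allowed (the commit sets this to False)
    """
    # Formula quoted in the commit message.
    size_requested = size_percent * vg_free / 100

    if size_requested > current_size:
        return "extend"
    if size_requested < current_size:
        if not shrink:
            # With shrink disabled the existing LV is left alone.
            return "unchanged"
        if size_requested == 0:
            # Mirrors "Sorry, no shrinking of data-lv2 to 0 permitted."
            return "error"
        return "shrink"
    return "unchanged"


# First run: the VG is empty, so 100%FREE covers every extent (count is made up).
first_run = plan_resize(current_size=0, size_percent=100, vg_free=2559)

# Rerun on a deployed cluster: the LV already owns all extents, so free is 0.
rerun_default = plan_resize(current_size=2559, size_percent=100, vg_free=0)
rerun_fixed = plan_resize(current_size=2559, size_percent=100, vg_free=0, shrink=False)

print(first_run, rerun_default, rerun_fixed)
```

On a rerun the free extent count is zero, so the requested size collapses to zero no matter how large the existing volume is. Disabling shrinking avoids the failing branch without changing the first-run behaviour.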


538 Commits (9412c44906a75ae250069cf2017134d1cda8478c)
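
The failure reported in commit dcd02e6494 comes down to how a list of devices is added to an argument vector. The sketch below illustrates that difference under stated assumptions: `build_batch_cmd` and its `join_as_string` switch are hypothetical names, not the `ceph_volume` module's real code. The command-line flags are the ones shown in the commit's error log.

```python
def build_batch_cmd(devices, db_devices, join_as_string=False):
    """Hypothetical builder for a `ceph-volume lvm batch` argument list.

    join_as_string=True reproduces the reported bug: the dedicated devices
    end up in one space-separated argument instead of one argument each.
    """
    cmd = ["ceph-volume", "--cluster", "ceph", "lvm", "batch", "--bluestore", "--yes"]
    cmd.extend(devices)

    if db_devices:
        cmd.append("--db-devices")
        if join_as_string:
            # Buggy form: a single argument such as "/dev/nvme0n1 /dev/nvme1n1".
            cmd.append(" ".join(db_devices))
        else:
            # Fixed form: each device becomes its own argument.
            cmd.extend(db_devices)
    return cmd


data = ["/dev/nvme2n1", "/dev/nvme3n1"]
db = ["/dev/nvme0n1", "/dev/nvme1n1"]

buggy = build_batch_cmd(data, db, join_as_string=True)
fixed = build_batch_cmd(data, db)

print(buggy[-1])   # one argument containing both paths
print(fixed[-2:])  # two separate arguments
```

With a single dedicated device both forms produce the same argument list, which is consistent with the commit's note that the problem only appeared once more than one device was used.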