ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	5062d4094c	update: restart iscsigws daemons after upgrade In containerized context, containers aren't stopped early in the sequence. It means they aren't restarted after the upgrade because the task is just checking the daemon status is started (eg: `state: started`). This commit also removes the task which ensure services are started because it's already done in the role ceph-iscsigw. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c7708eb458`)	2019-12-11 08:48:34 -05:00
Guillaume Abrioux	fe8858af38	upgrade: add dashboard deployment when upgrading from RHCS 3, dashboard has obviously never been deployed and it forces us to deploy it later manually. This commit adds the dashboard deployment as part of the upgrade to RHCS 4. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779092 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `451c5ca934`)	2019-12-11 08:48:34 -05:00
Dimitri Savineau	3b26df8c75	purge-cluster: add podman support The podman support was added to the purge-container-cluster playbook but containers are always used for the dashboard even on non-containerized deployments. This commit adds podman support for purging the dashboard resources in the purge-cluster playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `89f6cc54a2`)	2019-12-04 18:00:07 -05:00
Guillaume Abrioux	1c03d2b526	purge: rename playbook (container) Since we now support podman, let's rename the playbook so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7bc7e3669d`)	2019-12-04 09:12:41 -05:00
Dimitri Savineau	98392be368	add-{mon,osd}: run raw install python tasks If the new mon/osd node doesn't have python installed then we need to execute the tasks from raw_install_python.yml. Closes: #4368 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34b03d1873`)	2019-12-04 10:59:39 +01:00
Dimitri Savineau	a325ff61e8	switch_to_containers: fix umount ceph partitions When a container is already running on a non containerized node then the umount ceph partition task is skipped. This is due to the container ps command which always returns 0 even if the filter matches nothing. We should run the umount task when: 1/ the container command is failing (not installed) : rc != 0 2/ the container command reports running ceph-osd containers : rc == 0 Also we should not fail on the ceph directory listing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `39cfe0aa65`)	2019-12-03 15:58:36 +01:00
Guillaume Abrioux	1e7fd9fe36	purge: do not try to stop docker when binary is podman If the container binary is podman, we shouldn't try to stop docker here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b18476a1a6`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	6592caab08	facts: isolate container_binary facts in order to be able to call container_binary without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fe5ffe589e`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	1f30327688	purge: remove docker_* task All containers are removed when systemd stops them. There is no need to call this module in purge container playbook. This commit also removes all docker_image task and remove all container images in the final cleanup play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d23383a820`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	88d060f6e1	docker2podman: import ceph-handler role This is needed to avoid following error: ``` ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a43a872105`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	3bd8129859	docker2podman: do not hardcode group name let's use `client_group_name` instead of hardcoding the name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7fe0d55eff`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	c5145ccf25	docker2podman: import ceph-defaults in first play We must import this role in the first play otherwise the first call to `client_group_name` fails. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6526a25ab5`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	15b78ae252	purge: use sysfs to unmap rbd devices in containerized context, using the binary provided in atomic os won't work because it's an old version provided by ceph-common based on 10.2.5. Using a container could be an idea but for large cluster with hundreds of client nodes, that would require to pull the image of each of them just to unmap the rbd devices. Let's use the sysfs method in order to avoid any issue related to ceph version that is shipped on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3cfcc7a105`)	2019-11-14 10:49:38 -05:00
Guillaume Abrioux	e4c657d711	update: add default values when setting fact This commit adds a default value in the `with_dict` because when using python 2.7, if a task using a `with_dict` has a condition, it is evaluated anyway whereas in python 3 it isn't. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766499 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e9823f319b`)	2019-10-29 16:00:21 -04:00
Dimitri Savineau	56f0cf79d9	rolling_update: remove default filter on mds group There's no need to use the default filter on active/standby groups because if the group doesn't exist then the play is just skipped. Currently this generates warnings like: [WARNING]: Could not match supplied host pattern, ignoring: | [WARNING]: Could not match supplied host pattern, ignoring: default([]) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2ca79fcc99`)	2019-10-28 13:08:33 -04:00
Dimitri Savineau	ba4059d15a	rolling_update: fix active mds host value The active mds host should be based on the inventory hostname and not on the ansible hostname. The value returned under the mdsmap structure is based on the OS hostname, so we need to find the right node in the inventory with this value when doing operations on inventory nodes. Otherwise we could see an error like: The task includes an option with an undefined variable. The error was: "hostvars[foobar]" is undefined Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f1f2352c79`)	2019-10-28 13:08:33 -04:00
Dimitri Savineau	b547ad9e71	rolling_update: fix reset mon_host variable mon_host should use the inventory hostname and not the node hostname. Otherwise an issue arises when the inventory and node hostnames are different. Closes: #4670 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `650bc0c3f0`)	2019-10-26 08:20:54 -04:00
Dimitri Savineau	ff3bea871d	add-mon: add missing become flag Without the become flag set to true, we can't execute the roles successfully. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `77b212833e`)	2019-10-26 08:18:27 -04:00
Guillaume Abrioux	3625ea6ef8	update: use right node when creating active mds group This must be consistent with what is used in `name` parameter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d06057ebd2`)	2019-10-25 09:42:52 +02:00
Guillaume Abrioux	73d97f525e	update: avoid skipping single mds deployment upgrade otherwise a single MDS would never be updated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d8ab11d2f8`)	2019-10-25 09:42:52 +02:00
Guillaume Abrioux	c599af6724	update: skip mds deactivation when no mds in inventory Let's skip this part of the code if there's no mds node in the inventory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5ec906c3af`)	2019-10-25 09:42:52 +02:00
Dimitri Savineau	f36306ebf4	add-{mon,osd}: add ceph-container-engine role The ceph-container-engine role is missing from both playbooks so the container engine (docker, podman) isn't installed, resulting in a failure on the added nodes. fatal: [xxxxx]: FAILED! => changed=false cmd: docker --version msg: '[Errno 2] No such file or directory' rc: 2 Closes: #4634 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bfb1d6be12`)	2019-10-24 20:01:04 -04:00
Guillaume Abrioux	4a5d3c3c2d	update: add missing quotes Add missing quote in order to keep consistency. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8d72ff8e5e`)	2019-10-21 13:26:37 -04:00
Dimitri Savineau	703c834dab	Move the dashboard playbook in the main directory The [group|host]_vars directories are ignored for the dashboard playbook when the inventory file directory doesn't contain those directories. Closes: #4601 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1761612 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8426856262`)	2019-10-18 19:32:42 -04:00
Guillaume Abrioux	9bc7f8a7d7	tests: add multimds coverage This commit makes the all_daemons scenario deploying 3 mds in order to cover the multimds case. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `25b98b2ce3`)	2019-10-18 22:09:04 +02:00
Guillaume Abrioux	bc3138eff4	upgrade: fix standby_mdss group creation This commit fixes the standby_mdss group creation by using `{{ item }}`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c4fc8cc878`)	2019-10-18 22:09:04 +02:00
Guillaume Abrioux	c962d87def	update: follow new recommendation to upgrade mds cluster Refact the mds cluster upgrade code in order to follow the documented recommendation. See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `71cebf80a6`)	2019-10-16 12:59:08 -04:00
Dimitri Savineau	0b49538621	Execute common roles once on all nodes The common roles don't need to be executed again on each group play (like mons, osds, etc.). We only need to execute them during the first play. That way, we will apply the changes on all nodes in parallel instead of doing it once per group. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `68a3dac7cd`)	2019-10-16 10:41:32 -04:00
Dimitri Savineau	fd759f97fa	dashboard: disable facts gathering This is already done in the main playbooks but absent in the dashboard playbook. The facts are already gathered during the first play of the main playbooks so we don't need to do it twice. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5ae7304ace`)	2019-10-14 09:45:11 +02:00
Guillaume Abrioux	ebfe7f31ed	dashboard: if no host is available, let's just skip these plays. If there is no host available, let's just skip these plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1759917 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0b245bd007`)	2019-10-09 14:47:36 -04:00
Dimitri Savineau	5f91be8740	switch_to_containers: umount osd lockbox partition When switching from a baremetal deployment to a containerized deployment we only umount the OSD data partition. If the OSD is encrypted (dmcrypt: true) then there's an additional partition (part number 5) used for the lockbox and mounted in the /var/lib/ceph/osd-lockbox/ directory. Because this partition isn't unmounted, the containerized OSDs aren't able to start. The partition is still mounted by the system and can't be remounted from the container. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `19edf707a5`)	2019-10-08 00:57:05 +00:00
Guillaume Abrioux	b325cc386e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fa9b42e98e`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	468aa5d63b	switch_to_containers: optimize ownership change As per https://github.com/ceph/ceph-ansible/pull/4323#issuecomment-538420164 using `find` command should be faster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757400 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-Authored-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `c5d0c90bb7`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	37fd0b179b	update: import ceph-defaults role in first play Typical error: ``` fatal: [mon0]: FAILED! => msg: |- The conditional check 'not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])' failed. The error was: error while evaluating conditional (not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])): 'client_group_name' is undefined ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8138d4193c`)	2019-10-07 11:21:23 +02:00
Guillaume Abrioux	9a4fcfabe1	main: exclude client nodes from facts gathering when delegate_facts_host This commit excludes client nodes from facts gathering, they are not needed and can speed up this task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `865d2eac9b`)	2019-10-07 11:21:23 +02:00
Dimitri Savineau	ec1c57f690	dashboard: remove useless block section The block section was used with the dashboard_enabled condition when the code was included in the main playbooks. Because this condition isn't present in the dashboard playbook anymore we can remove the block section. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf47594b47`)	2019-10-04 13:28:37 +02:00
Guillaume Abrioux	9a79ed1bf0	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e08194dd67`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	7f902994b3	rbdmirror: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c69816c6b7`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	d7a06c67db	iscsigw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4636f3f7e2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	b564c37696	upgrade: add an infra playbook to migrate systemd units to podman this commit adds a new playbook to force systemd units for containers to use podman instead of docker. This is needed in the rhel8 upgrade context so after the base OS is upgraded containers can be started using podman. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f2017dcda2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	4afe1b748c	update: reset mon_host after mons upgrade after all mon are upgraded, let's reset mon_host which is used in the rest of the playbook for setting `container_exec_cmd` so we are sure to use the right value. Typical error: ``` failed: [mds0 -> mon0] (item={u'path': u'/var/lib/ceph/bootstrap-mds/ceph.keyring', u'name': u'client.bootstrap-mds', u'copy_key': True}) => changed=true ansible_loop_var: item cmd: - docker - exec - ceph-mon-mon2 - ceph - --cluster - ceph - auth - get - client.bootstrap-mds delta: '0:00:00.016294' end: '2019-09-27 13:54:58.828835' item: copy_key: true name: client.bootstrap-mds path: /var/lib/ceph/bootstrap-mds/ceph.keyring msg: non-zero return code rc: 1 start: '2019-09-27 13:54:58.812541' stderr: 'Error response from daemon: No such container: ceph-mon-mon2' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d84160a170`)	2019-09-28 09:01:16 +02:00
Harald Jensås	5fea830414	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of CIDRs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `e695efcaf7`)	2019-09-27 17:49:46 +02:00
Sam Choraria	7594bc9181	rolling_update.yml: force ceph-volume scan on osds The rolling_update.yml playbook fails when scanning ceph-disk osds while deploying nautilus. The --force flag is required to scan existing osds and rewrite their json metadata. Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk> (cherry picked from commit `7cc9f93680`)	2019-09-26 14:51:59 -04:00
Guillaume Abrioux	96dafd676c	infrastructure-playbooks: add filestore-to-bluestore.yml This playbook helps to migrate all osds on a node from filestore to bluestore backend. Note that ALL osds on the specified osd nodes will be shrunk and redeployed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3f9ccdaa8a`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	26e0f4db97	lv-create: fix a typo This commit fixes a typo. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c785ad3637`)	2019-09-26 16:21:54 +02:00
Mehdy	8c37894109	shrink-rgw.yml: fix confirmation play's name the confirmation play's name should confirm removing rgw instead of monitor Signed-off-by: Mehdy Khoshnoody <mehdy.khoshnoody@gmail.com> (cherry picked from commit `9fa98d79fd`)	2019-09-25 16:37:44 +02:00
Dimitri Savineau	a5775be7c4	shrink-mon: search mon in the quorum_names list If we're looking at the mon hostname in the ceph status output then there are some scenarios where this could be true. If we collocate some services (mons, mgrs, etc.) then the hostname of the monitor to shrink will still be present in the ceph status (like in mgrs or other). Instead we should check the hostname only in the mon part of the output. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `734c0dc310`)	2019-09-18 14:47:40 +00:00
Kevin Jones	3a8de9cc36	Set proper ownership command performance improvement By changing the set ownership command from using the file module in combination with a with_items loop to a raw chown command, we can achieve a 98% performance increase here. On a ceph cluster with a significant amount of directories and files in /var/lib/ceph, the file module has to run checks on ownership of all those directories and files to determine whether a change is needed. In this case, we just want to explicitly set the ownership of all these directories and files to the ceph_uid Added context note to all set proper ownership tasks Signed-off-by: Kevin Jones <kevinjones@redhat.com> (cherry picked from commit `47bf47c9d8`)	2019-08-22 12:59:58 +02:00
Guillaume Abrioux	236020fb2b	shrink-mon: refact 'verify the monitor is out of the cluster' task use `from_json` filter instead of a `| python` so we can get rid of the `shell` module usage here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5573f17e76`)	2019-08-19 18:47:14 +00:00
Rishabh Dave	b28ed96378	use pre_tasks and post_tasks in shrink-mon.yml too This commit should've been part of commit `2fb12ae554`. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `2034387f57`)	2019-08-19 18:47:14 +00:00
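
The shrink-mon commits above (`a5775be7c4`, `236020fb2b`) describe checking for the monitor only in the mon part of the ceph status output, after parsing it as JSON. A minimal Python sketch of that idea follows. It is not the playbook's actual task: the function name `mon_in_quorum` and the sample status document are hypothetical, and the sample keeps only a `quorum_names` list plus an illustrative `mgrmap` entry.

```python
import json


def mon_in_quorum(status_json, mon_hostname):
    """Return True only if the hostname is listed under quorum_names.

    Searching the raw status text would also match collocated daemons
    (for example a mgr running on the same host), so only the parsed
    quorum_names list is consulted.
    """
    status = json.loads(status_json)
    return mon_hostname in status.get("quorum_names", [])


# Hypothetical, heavily trimmed status document for illustration only.
sample_status = json.dumps({
    "quorum_names": ["mon0", "mon1"],
    "mgrmap": {"active_name": "mon2"},
})

still_in_quorum = mon_in_quorum(sample_status, "mon0")
removed_but_mgr_present = mon_in_quorum(sample_status, "mon2")
```

Here `mon2` still appears in the raw text through the mgr entry, yet the check reports it as out of the quorum, which is the behaviour the commit message asks for.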

513 Commits (98679371171d0a9739f876d830b7def001799a2f)
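
Commit `5fea830414` above describes a filter that returns every IP address residing in any one of the given ranges, so that a comma separated list of CIDRs can be used. The following is a rough Python sketch of that behaviour using only the standard library. It is an assumption about how such a filter could look, not the plugin shipped in ceph-ansible: the signature, the argument names and the sample addresses are all hypothetical.

```python
import ipaddress


def ips_in_ranges(ip_addresses, ip_ranges):
    """Return the addresses that fall inside at least one CIDR range.

    Hypothetical signature: both arguments are lists of strings.
    """
    networks = [
        ipaddress.ip_network(cidr.strip(), strict=False) for cidr in ip_ranges
    ]
    matches = []
    for ip in ip_addresses:
        address = ipaddress.ip_address(ip)
        # An IPv4 address is never "in" an IPv6 network and vice versa,
        # so mixed lists are handled without extra checks.
        if any(address in network for network in networks):
            matches.append(ip)
    return matches


# Illustrative values only: a comma separated public_network setting.
public_network = "192.168.1.0/24, 10.0.0.0/8"
host_addresses = ["192.168.1.10", "172.16.0.5", "10.1.2.3"]

selected = ips_in_ranges(host_addresses, public_network.split(","))
```

Splitting the setting on commas before calling the function keeps the helper itself independent of how the ranges are written in the configuration.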