ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	86bb734397	filestore-to-bluestore: umount partitions before zapping them When an OSD is stopped, it leaves partitions mounted. We must umount them before zapping them, otherwise errors like "Device is busy" will show up. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8056514134`)	2020-01-08 11:41:48 -05:00
Guillaume Abrioux	27b1fc8981	shrink-mds: do not play ceph-facts entirely We only need to set `container_binary`. Let's use `tasks_from` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0ae0a9ce28`)	2020-01-08 11:18:45 -05:00
Guillaume Abrioux	edbb207680	shrink-mds: use fact from delegated node The command is delegated on the first monitor so we must use the fact `container_binary` from this node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `77b39d235b`)	2020-01-08 11:18:45 -05:00
Guillaume Abrioux	0eaa66f394	shrink-mds: fix filesystem removal task This commit deletes the filesystem when no more MDS is present after shrinking operation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787543 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `38278a6bb5`)	2020-01-08 11:18:45 -05:00
Guillaume Abrioux	bfd26e7f78	shrink-mds: ensure max_mds is always honored This commit prevents shrinking an mds node when max_mds wouldn't be honored after that operation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2cfe5a04bf`)	2020-01-08 11:18:45 -05:00
Guillaume Abrioux	19068659c7	filestore-to-bluestore: ensure all dm are closed This commit adds a task to ensure device mappers are properly closed when lvm batch scenario is used. Otherwise, OSDs can't be redeployed because the devices are rejected by ceph-volume as they are locked. Adding a condition `devices | default([]) | length > 0` to remove these dm only when using lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8e6ef818a2`)	2019-12-11 16:37:21 +01:00
Guillaume Abrioux	99ac694cc0	filestore-to-bluestore: force OSDs to be marked down Otherwise, sometimes it can take a while for an OSD to be seen as down and causes the `ceph osd purge` command to fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `51d601193e`)	2019-12-11 16:37:21 +01:00
Guillaume Abrioux	586f6f6262	filestore-to-bluestore: do not use --destroy Do not use `--destroy` when zapping a device. Otherwise, it destroys VGs while they are still needed to redeploy the OSDs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e3305e6bb6`)	2019-12-11 16:37:21 +01:00
Guillaume Abrioux	d2b1506712	filestore-to-bluestore: add non containerized support This commit adds the non containerized context support to the filestore-to-bluestore.yml infrastructure playbook. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4833b85e04`)	2019-12-11 16:37:21 +01:00
Guillaume Abrioux	5062d4094c	update: restart iscsigws daemons after upgrade In containerized context, containers aren't stopped early in the sequence. It means they aren't restarted after the upgrade because the task is just checking the daemon status is started (e.g. `state: started`). This commit also removes the task which ensures services are started because it's already done in the role ceph-iscsigw. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c7708eb458`)	2019-12-11 08:48:34 -05:00
Guillaume Abrioux	fe8858af38	upgrade: add dashboard deployment when upgrading from RHCS 3, dashboard has obviously never been deployed and it forces us to deploy it later manually. This commit adds the dashboard deployment as part of the upgrade to RHCS 4. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779092 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `451c5ca934`)	2019-12-11 08:48:34 -05:00
Dimitri Savineau	3b26df8c75	purge-cluster: add podman support The podman support was added to the purge-container-cluster playbook but containers are always used for the dashboard even on non containerized deployment. This commit adds podman support for purging the dashboard resources in the purge-cluster playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `89f6cc54a2`)	2019-12-04 18:00:07 -05:00
Guillaume Abrioux	1c03d2b526	purge: rename playbook (container) Since we now support podman, let's rename the playbook so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7bc7e3669d`)	2019-12-04 09:12:41 -05:00
Dimitri Savineau	98392be368	add-{mon,osd}: run raw install python tasks If the new mon/osd node doesn't have python installed then we need to execute the tasks from raw_install_python.yml. Closes: #4368 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34b03d1873`)	2019-12-04 10:59:39 +01:00
Dimitri Savineau	a325ff61e8	switch_to_containers: fix umount ceph partitions When a container is already running on a non containerized node then the umount ceph partition task is skipped. This is due to the container ps command which always returns 0 even if the filter matches nothing. We should run the umount task when: 1/ the container command is failing (not installed) : rc != 0 2/ the container command reports running ceph-osd containers : rc == 0 Also we should not fail on the ceph directory listing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `39cfe0aa65`)	2019-12-03 15:58:36 +01:00
Guillaume Abrioux	1e7fd9fe36	purge: do not try to stop docker when binary is podman If the container binary is podman, we shouldn't try to stop docker here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b18476a1a6`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	6592caab08	facts: isolate container_binary facts in order to be able to call container_binary without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fe5ffe589e`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	1f30327688	purge: remove docker_* task All containers are removed when systemd stops them. There is no need to call this module in purge container playbook. This commit also removes all docker_image tasks and removes all container images in the final cleanup play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d23383a820`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	88d060f6e1	docker2podman: import ceph-handler role This is needed to avoid following error: ``` ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a43a872105`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	3bd8129859	docker2podman: do not hardcode group name let's use `client_group_name` instead of hardcoding the name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7fe0d55eff`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	c5145ccf25	docker2podman: import ceph-defaults in first play We must import this role in the first play otherwise the first call to `client_group_name` fails. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6526a25ab5`)	2019-12-03 10:44:48 +01:00
Guillaume Abrioux	15b78ae252	purge: use sysfs to unmap rbd devices in containerized context, using the binary provided in atomic os won't work because it's an old version provided by ceph-common based on 10.2.5. Using a container could be an idea but for large cluster with hundreds of client nodes, that would require to pull the image of each of them just to unmap the rbd devices. Let's use the sysfs method in order to avoid any issue related to ceph version that is shipped on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3cfcc7a105`)	2019-11-14 10:49:38 -05:00
Guillaume Abrioux	e4c657d711	update: add default values when setting fact This commit adds a default value in the `with_dict` because when using python 2.7, if a task using a `with_dict` has a condition, it is evaluated anyway whereas in python 3 it isn't. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766499 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e9823f319b`)	2019-10-29 16:00:21 -04:00
Dimitri Savineau	56f0cf79d9	rolling_update: remove default filter on mds group There's no need to use the default filter on active/standby groups because if the group doesn't exist then the play is just skipped. Currently this generates warnings like: [WARNING]: Could not match supplied host pattern, ignoring: | [WARNING]: Could not match supplied host pattern, ignoring: default([]) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2ca79fcc99`)	2019-10-28 13:08:33 -04:00
Dimitri Savineau	ba4059d15a	rolling_update: fix active mds host value The active mds host should be based on the inventory hostname and not on the ansible hostname. The value returned under the mdsmap structure is based on the OS hostname so we need to find the right node in the inventory with this value when doing operations on inventory nodes. Otherwise we could see errors like: The task includes an option with an undefined variable. The error was: "hostvars[foobar]" is undefined Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f1f2352c79`)	2019-10-28 13:08:33 -04:00
Dimitri Savineau	b547ad9e71	rolling_update: fix reset mon_host variable mon_host should use the inventory hostname and not the node hostname. Fix creates an issue when the inventory and node hostname are different. Closes: #4670 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `650bc0c3f0`)	2019-10-26 08:20:54 -04:00
Dimitri Savineau	ff3bea871d	add-mon: add missing become flag Without the become flag set to true, we can't execute the roles successfully. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `77b212833e`)	2019-10-26 08:18:27 -04:00
Guillaume Abrioux	3625ea6ef8	update: use right node when creating active mds group This must be consistent with what is used in `name` parameter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d06057ebd2`)	2019-10-25 09:42:52 +02:00
Guillaume Abrioux	73d97f525e	update: avoid skipping single mds deployment upgrade otherwise a single MDS would never be updated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d8ab11d2f8`)	2019-10-25 09:42:52 +02:00
Guillaume Abrioux	c599af6724	update: skip mds deactivation when no mds in inventory Let's skip this part of the code if there's no mds node in the inventory. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5ec906c3af`)	2019-10-25 09:42:52 +02:00
Dimitri Savineau	f36306ebf4	add-{mon,osd}: add ceph-container-engine role The ceph-container-engine role is missing from both playbooks so the container engine (docker, podman) isn't installed, resulting in a failure on the added nodes. fatal: [xxxxx]: FAILED! => changed=false cmd: docker --version msg: '[Errno 2] No such file or directory' rc: 2 Closes: #4634 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bfb1d6be12`)	2019-10-24 20:01:04 -04:00
Guillaume Abrioux	4a5d3c3c2d	update: add missing quotes Add missing quote in order to keep consistency. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8d72ff8e5e`)	2019-10-21 13:26:37 -04:00
Dimitri Savineau	703c834dab	Move the dashboard playbook in the main directory The [group|host]_vars directories are ignored for the dashboard playbook when the inventory file directory doesn't contain those directories. Closes: #4601 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1761612 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8426856262`)	2019-10-18 19:32:42 -04:00
Guillaume Abrioux	9bc7f8a7d7	tests: add multimds coverage This commit makes the all_daemons scenario deploying 3 mds in order to cover the multimds case. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `25b98b2ce3`)	2019-10-18 22:09:04 +02:00
Guillaume Abrioux	bc3138eff4	upgrade: fix standby_mdss group creation This commit fixes the standby_mdss group creation by using `{{ item }}`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c4fc8cc878`)	2019-10-18 22:09:04 +02:00
Guillaume Abrioux	c962d87def	update: follow new recommendation to upgrade mds cluster Refactor the mds cluster upgrade code in order to follow the documented recommendation. See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `71cebf80a6`)	2019-10-16 12:59:08 -04:00
Dimitri Savineau	0b49538621	Execute common roles once on all nodes The common roles don't need to be executed again on each group play (like mons, osds, etc.). We only need to execute them during the first play. That way, we will apply the changes on all nodes in parallel instead of doing it once per group. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `68a3dac7cd`)	2019-10-16 10:41:32 -04:00
Dimitri Savineau	fd759f97fa	dashboard: disable facts gathering This is already done in the main playbooks but absent in the dashboard playbook. The facts are already gathered during the first play of the main playbooks so we don't need to do it twice. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5ae7304ace`)	2019-10-14 09:45:11 +02:00
Guillaume Abrioux	ebfe7f31ed	dashboard: if no host is available, let's just skip these plays. If there is no host available, let's just skip these plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1759917 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0b245bd007`)	2019-10-09 14:47:36 -04:00
Dimitri Savineau	5f91be8740	switch_to_containers: umount osd lockbox partition When switching from a baremetal deployment to a containerized deployment we only umount the OSD data partition. If the OSD is encrypted (dmcrypt: true) then there's an additional partition (part number 5) used for the lockbox and mounted in the /var/lib/ceph/osd-lockbox/ directory. Because this partition isn't unmounted, the containerized OSDs aren't able to start. The partition is still mounted by the system and can't be remounted from the container. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `19edf707a5`)	2019-10-08 00:57:05 +00:00
Guillaume Abrioux	b325cc386e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fa9b42e98e`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	468aa5d63b	switch_to_containers: optimize ownership change As per https://github.com/ceph/ceph-ansible/pull/4323#issuecomment-538420164 using `find` command should be faster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757400 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-Authored-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `c5d0c90bb7`)	2019-10-07 10:18:17 -04:00
Guillaume Abrioux	37fd0b179b	update: import ceph-defaults role in first play Typical error: ``` fatal: [mon0]: FAILED! => msg: |- The conditional check 'not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])' failed. The error was: error while evaluating conditional (not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])): 'client_group_name' is undefined ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8138d4193c`)	2019-10-07 11:21:23 +02:00
Guillaume Abrioux	9a4fcfabe1	main: exclude client nodes from facts gathering when delegate_facts_host This commit excludes client nodes from facts gathering, they are not needed and can speed up this task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `865d2eac9b`)	2019-10-07 11:21:23 +02:00
Dimitri Savineau	ec1c57f690	dashboard: remove useless block section The block section was used with the dashboard_enabled condition when the code was included in the main playbooks. Because this condition isn't present in the dashboard playbook anymore we can remove the block section. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf47594b47`)	2019-10-04 13:28:37 +02:00
Guillaume Abrioux	9a79ed1bf0	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e08194dd67`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	7f902994b3	rbdmirror: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c69816c6b7`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	d7a06c67db	iscsigw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4636f3f7e2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	b564c37696	upgrade: add an infra playbook to migrate systemd units to podman this commit adds a new playbook to force systemd units for containers to use podman instead of docker. This is needed in the rhel8 upgrade context so after the base OS is upgraded containers can be started using podman. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f2017dcda2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	4afe1b748c	update: reset mon_host after mons upgrade after all mon are upgraded, let's reset mon_host which is used in the rest of the playbook for setting `container_exec_cmd` so we are sure to use the right value. Typical error: ``` failed: [mds0 -> mon0] (item={u'path': u'/var/lib/ceph/bootstrap-mds/ceph.keyring', u'name': u'client.bootstrap-mds', u'copy_key': True}) => changed=true ansible_loop_var: item cmd: - docker - exec - ceph-mon-mon2 - ceph - --cluster - ceph - auth - get - client.bootstrap-mds delta: '0:00:00.016294' end: '2019-09-27 13:54:58.828835' item: copy_key: true name: client.bootstrap-mds path: /var/lib/ceph/bootstrap-mds/ceph.keyring msg: non-zero return code rc: 1 start: '2019-09-27 13:54:58.812541' stderr: 'Error response from daemon: No such container: ceph-mon-mon2' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d84160a170`)	2019-09-28 09:01:16 +02:00

522 Commits (86bb734397f7b1f77a005cdded39b229096dc8f4)
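
The guard described in the `shrink-mds: ensure max_mds is always honored` entry (bfd26e7f78) can be illustrated with a minimal sketch. This is not the playbook's actual task: the function name and the way daemons are counted are assumptions made for illustration, and the commit message only states that shrinking is refused when `max_mds` would no longer be honored afterwards.

```python
def shrink_keeps_max_mds(mds_daemon_count: int, max_mds: int) -> bool:
    """Return True if removing one MDS still leaves at least max_mds daemons.

    Hypothetical helper: mds_daemon_count is assumed to be the number of
    MDS daemons currently known to the cluster, and max_mds the value
    configured on the filesystem.
    """
    if mds_daemon_count < 1:
        raise ValueError("there is no MDS daemon to remove")
    remaining = mds_daemon_count - 1
    # Shrinking is acceptable only if max_mds can still be satisfied.
    return remaining >= max_mds


if __name__ == "__main__":
    # Three daemons with max_mds=2: removing one leaves two.
    print(shrink_keeps_max_mds(3, 2))
    # Two daemons with max_mds=2: removing one would leave a single daemon.
    print(shrink_keeps_max_mds(2, 2))
```

Under these assumptions, a playbook would abort before touching the node whenever the function returns `False`.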