ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	1044940304	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 14:03:35 -04:00
Benoît Knecht	9668137daf	ceph-handler: Fix osd handler in check mode Run the Ceph commands that only gather information (without making any changes to the cluster) when running Ansible in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `498acd7527`)	2021-08-02 15:54:34 +02:00
Dimitri Savineau	17b9ff03d2	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 09:54:46 -04:00
Guillaume Abrioux	f7882bbc02	common: disable/enable pg_autoscaler The PG autoscaler can disrupt the PG checks so the idea here is to disable it and re-enable it back after the restart is done. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13036115e2`)	2021-07-21 09:40:18 -04:00
Guillaume Abrioux	611494b88f	rolling_update: fix mon+rgw/multisite collocation When monitors and rgw are collocated with multisite enabled, the rolling_update playbook fails because during the workflow, we run some radosgw-admin commands very early on the first mon even though this is the monitor being upgraded, it means the container doesn't exist since it was stopped. This block is relevant only for scaling out rgw daemons or initial deployment. In rolling_update workflow, it is not needed so let's skip it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970232 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f7166cccbf`)	2021-06-14 13:59:16 +02:00
Guillaume Abrioux	3ef9690cd1	docker2podman: skip some role imports from handler when running docker-to-podman playbook, there's no need to call `ceph-config` and `ceph-rgw` from the role `ceph-handler`. It can even have side effects when coming from a baremetal cluster that was previously migrated using the switch-to-containers playbook. Indeed it might complain about missing .target systemd unit since they are removed during that migration. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `70f19be367`)	2021-04-12 13:30:31 +02:00
Alex Schultz	7ddbe74712	Use ansible_facts It has come to our attention that using ansible_* vars that are populated with INJECT_FACTS_AS_VARS=True is not very performant. In order to be able to support setting that to off, we need to update the references to use ansible_facts[<thing>] instead of ansible_<thing>. Related: ansible#73654 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406 Signed-off-by: Alex Schultz <aschultz@redhat.com> (cherry picked from commit `a7f2fa73e6`)	2021-03-26 00:16:58 +01:00
Guillaume Abrioux	aeee3471e3	rgw: avoid useless call to ceph-rgw since `ceph-rgw` may be called from `ceph-handler` in some contexts we should avoid rerunning it unnecessarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8617081664`)	2021-01-28 16:37:50 -05:00
Guillaume Abrioux	290d3ef369	rgw: support switching from single-site to multisite When collocating rgw with either a mon, mgr or osd, switching from single site to a multisite rgw setup failed because of the handlers triggered between the ansible play of the collocated daemon and the play of the rgw. Since the multisite changes are not yet applied the handlers fail. The idea here is to ensure we run the multisite configuration from the ceph-handler role before the restart happens, this way it won't complain because of non existing multisite configuration. (Note: this is also valid when simply changing a multisite configuration) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1888630 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `513c8cfe55`)	2021-01-06 10:38:50 -05:00
Dimitri Savineau	3f16132e44	library: add ceph_osd_flag module This adds ceph_osd_flag ansible module for replacing the command module usage with the ceph osd set/unset commands. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5da593604a`)	2020-12-15 17:36:28 +01:00
Guillaume Abrioux	709deb90cc	handler: refact check_socket_non_container the `stat --printf=%n` returns something like following: ``` ok: [osd0] => changed=false cmd: |- stat --printf=%n /var/run/ceph/ceph-osd*.asok delta: '0:00:00.009388' end: '2020-10-06 06:18:28.109500' failed_when_result: false rc: 0 start: '2020-10-06 06:18:28.100112' stderr: '' stderr_lines: <omitted> stdout: /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok stdout_lines: <omitted> ``` it makes the next task "check if the ceph osd socket is in-use" grep like this: ``` ok: [osd0] => changed=false cmd: - grep - -q - /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok - /proc/net/unix ``` which will obviously fail because this path never exists. It makes the OSD handler broken. Let's use `find` module instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `46d4d97da9`)	2020-10-14 10:31:05 +02:00
Guillaume Abrioux	52826caa51	rgw: fix multi instances scaleout in baremetal When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook in baremetal deployments. When ceph-osd notifies handlers, it means rgw handlers are triggered too. The issue with this is that they are triggered before the role ceph-rgw is run. In the case a scaleout operation is expected on `radosgw_num_instances` it causes an issue because keyrings haven't been created yet so the new instances won't start. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881313 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a802fa2810`)	2020-10-06 09:21:58 -04:00
Dimitri Savineau	fabaec6351	ceph-handler: set handler on xxx_stat result In non containerized deployment we check if the service is running via the socket file presence. This is done via the xxx_socket_stat variable that check the file socket in the /var/run/ceph/ directory. In some scenarios, we could have the socket file still present in that directory but not used by any process. That's why we have the xxx_stat variable which clean those leftovers. The problem here is that we're set the variable for the handlers status (like handler_mon_status) based on xxx_socket_stat instead of xxx_stat. That means we will trigger the handlers if there's an old socket file present on the system without any process associated. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `733596582d`)	2020-09-29 16:33:08 +02:00
Dimitri Savineau	182319d58c	ceph-handler: add missing condition on ceph-crash The ceph-crash tasks present in the ceph-handler role don't need to be executed on all nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `18e3c7a0a2`)	2020-09-10 20:35:04 -04:00
Guillaume Abrioux	66dde0034b	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9d2f2108e1`)	2020-09-10 20:35:04 -04:00
Dimitri Savineau	cce042c65b	ceph-handler: remove iscsigws restart scripts The iscsigws restart scripts for tcmu-runner and rbd-target-{api,gw} services only call the systemctl restart command. We don't really need to copy a shell script to do it when we can use the ansible service module instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cbe79428e6`)	2020-07-27 09:33:00 -04:00
Guillaume Abrioux	9fb69e13ed	handler: read container_exec_cmd value from first mon Given that we delegate to the first monitor, we must read the value of `container_exec_cmd` from this node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eb9112d8fb`)	2020-01-23 18:34:14 +01:00
Guillaume Abrioux	1462423059	handler: fix call to container_exec_cmd in handler_osds When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member) Also, adding a `run_once: true` because this task needs to be run only 1 time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22865cde9c`)	2020-01-20 12:45:51 -05:00
Guillaume Abrioux	2d85fab02d	osd: support scaling up using --limit This commit lets add-osd.yml in place but mark the deprecation of the playbook. Scaling up OSDs is now possible using --limit Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3496a0efa2`)	2020-01-14 09:12:34 -05:00
Guillaume Abrioux	e001ded6f6	handler: fix bug `411bd07d54` introduced a bug in handlers using `handler_*_status` instead of `hostvars[item]['handler_*_status']` causes handlers to be triggered in anycase even though `handler_*_status` was set to `False` on a specific node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `30200802d9`)	2020-01-08 19:46:11 -05:00
Dimitri Savineau	86b7137b27	ceph-iscsi: notify rbd target services When the iscsi gateway or the ceph configuration file change then we need to notify the rbd target api/gw services to be restarted. This patch also merges the rbd-target-api and rbd-target-gw handler into a single file and listen. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bc701860d5`)	2019-10-16 11:34:15 -04:00
Dimitri Savineau	3313bc5c1f	ceph-handler: group listen topics and condition We are using multiple listen topics with the handlers. That means that we are notifying 4 tasks for each handler. Instead we can group the listen on an include_tasks and based on the group condition. Before: NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0 After: NOTIFIED HANDLER ceph-handler : mons handler for mon0 NOTIFIED HANDLER ceph-handler : osds handler for mon0 NOTIFIED HANDLER ceph-handler : mdss handler for mon0 NOTIFIED HANDLER ceph-handler : rgws handler for mon0 NOTIFIED HANDLER ceph-handler : mgrs handler for mon0 NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fe9c5b8c68`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	13f6a0a22a	handler: followup on #4519 This commit adds some missing `| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	fd10fbc047	handlers: refact osd handler This commit merges the two restart tasks into a single one, this way it's one task less to notify. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `411bd07d54`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	857c68087d	handler: followup on #4519 This commit adds some missing `| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-07 09:09:36 +02:00
Giulio Fidente	cb66a62ae2	Look for additional names when checking ceph-nfs container status Ganesha cannot be operated active/active, in those deployments where it is managed by pacemaker the container name can be different than the default. This change uses "ceph_nfs_service_suffix" where previously missing to ensure tasks will work with customized names. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005 Signed-off-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `d2a2bd7c42`)	2019-09-09 16:48:50 -04:00
Dimitri Savineau	3815add534	ceph-handler: replace fuser by /proc/net/unix We're using fuser command to see if a process is using a ceph unix socket file. But the fuser command runs through every PID present in /proc/<PID> to see if one of them is using the file. On a system running thousands processes, the fuser command can take a long time to finish. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da9891da1e`)	2019-06-12 23:00:36 +02:00
L3D	1daca1ba83	ansible: use 'bool' filter on boolean conditionals By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de> (cherry picked from commit `ab54fe20ec`)	2019-06-07 16:05:51 +02:00
Rishabh Dave	06b3ab2a6b	improve coding style Keywords requiring only one item shouldn't express it by creating a list with single item. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `739a662c80`) Conflicts: roles/ceph-mon/tasks/ceph_keys.yml roles/ceph-validate/tasks/check_devices.yml	2019-05-06 15:09:06 +00:00
Sébastien Han	a96e910114	Add new container scenario Test with podman instead of docker and also support for python 3 only. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	790f52f934	ceph-handler: change osd container check Now that the container is named ceph-osd@<id> looking for something that contains a host is not necessary. This is also backward compatible as it will continue to match container names with hostname in them. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	4db6a213f7	add ceph-handler role The role contains all the handlers for Ceph services. We decided to leave ceph-defaults role with variables and a few facts only. This is useful when organizing the site.yml files and also adding the known variables to infrastructure-playbooks. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-09-28 15:15:49 +00:00

32 Commits (03ed9e111c43816954919ba08b6ec0b6c5ef3e9c)
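
Illustration for commits 3815add534 and 709deb90cc: both concern checking whether a Ceph admin socket file is still held by a process by looking it up in /proc/net/unix. The Python sketch below is not code from ceph-ansible. The function name `socket_in_use` and the sample table are hypothetical, and it compares the path column exactly, whereas the handler task uses `grep -q`. It shows why a concatenated path string, like the `stat --printf=%n` output quoted in 709deb90cc, can never match, while checking each path separately does.

```python
# Hypothetical sample of /proc/net/unix content (header plus three sockets).
# The last entry is an unbound socket, so it has no Path column.
SAMPLE_PROC_NET_UNIX = """\
Num       RefCount Protocol Flags    Type St Inode Path
0000000000000000: 00000002 00000000 00010000 0001 01 20001 /var/run/ceph/ceph-osd.2.asok
0000000000000000: 00000002 00000000 00010000 0001 01 20002 /var/run/ceph/ceph-osd.5.asok
0000000000000000: 00000003 00000000 00000000 0001 03 20003
"""


def socket_in_use(socket_path: str, proc_net_unix: str) -> bool:
    """Return True if socket_path is listed in the Path column of the table."""
    for line in proc_net_unix.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 7)  # the path, when present, is the 8th field
        if len(fields) == 8 and fields[7] == socket_path:
            return True
    return False


paths = ["/var/run/ceph/ceph-osd.2.asok", "/var/run/ceph/ceph-osd.5.asok"]

# Output of `stat --printf=%n` on a glob: names joined with no separator.
concatenated = "".join(paths)

print(socket_in_use(concatenated, SAMPLE_PROC_NET_UNIX))
print([socket_in_use(p, SAMPLE_PROC_NET_UNIX) for p in paths])
```

On a real host the second argument would be the text read from /proc/net/unix. A single read of that file replaces the per-PID scan that 3815add534 describes for `fuser`.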