ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	6c4bb2204a	purge: reset-failed ceph-crash This ensures we always reset-failed the ceph-crash service. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5ab46f836d`)	2022-05-30 15:16:07 +02:00
Guillaume Abrioux	fdf201686e	purge: ceph-crash purge fixes This fixes the service file removal and makes the playbook call `systemctl reset-failed` on the service because in Ceph Nautilus, ceph-crash doesn't handle `SIGTERM` signal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f11982590`)	2022-03-04 12:51:51 +01:00
Guillaume Abrioux	5618405b60	adopt: fix node labelling When using group of group, the playbook will apply undesired labels on nodes. This commit fixes it by applying only the expected labels. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `266b6e739c`)	2022-03-04 12:51:14 +01:00
Teoman ONAY	be241058d4	Add cluster custom name support When using cluster custom names, cephadm commands are executed using the default admin keyring name which fails. Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `f8c6bba657`)	2022-03-04 12:51:14 +01:00
Teoman ONAY	10a5e54f8f	Enable user to change the account used for ssh connection By default cephadm uses root account to connect remotely to other nodes in the cluster. This change allows to choose another account. This commit also allows to use a dedicated subnet for cephadm mgmt. Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `da42f3d139`)	2022-03-04 12:51:14 +01:00
Guillaume Abrioux	d787836350	switch2containers: fail if less than 3 monitors This playbook doesn't support less than 3 monitors present in the inventory. Just like the rolling_update playbook, let's fail if less than 3 monitors are present. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2049132 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f08129edf2`)	2022-02-22 09:23:54 +01:00
Francesco Pantano	0431746d32	Add with_pkg tag on package related tasks In the OpenStack context we let the integration tool (TripleO) deal with repositories and packages. This change just adds the with_pkg tag to allow TripleO skipping both the repositories and packages installation. Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `12dd8b5df1`)	2022-02-15 18:18:22 +01:00
jowsiewski	139289bdcd	Remove the remaining packages Signed-off-by: jowsiewski <owsiewski@gmail.com> (cherry picked from commit `1dfd195c7e`)	2022-02-07 13:53:47 +01:00
Guillaume Abrioux	555d0c8037	purge: remove ceph directories on client nodes Otherwise any ceph directories are left over on client nodes after the purge. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `20035852a4`)	2022-01-06 10:33:47 +01:00
Guillaume Abrioux	87b24a7e3b	update: speed up client play wip Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `817c03bc0e`)	2021-12-15 13:49:21 +01:00
Guillaume Abrioux	feffbba9d4	cephadm-adopt: ensure /etc/ceph is present on monitoring node When deploying the monitoring stack on a dedicated node, the directory `/etc/ceph` has never been created. Therefore, the play for adopting the monitoring stack fails because it can't write the minimal config file. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7ece59b41d`)	2021-12-07 22:56:51 +01:00
Guillaume Abrioux	8f26939da4	cephadm-adopt: bindmount /var/lib/ceph with 'ro' When collocating osds with iscsigw daemons, cephadm bindmounts the following: `-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config` this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z` since 'ro' is enough in this playbook, let's replace the ':z' option on this bindmount with ':ro' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c4fdf956bd`)	2021-12-02 08:52:05 +01:00
Guillaume Abrioux	9423ec3eb6	adopt: fix ceph_origin and ceph_repository defaults This is overriding those variables because the precedence at the 'block var' level is greater than the group_vars/host_vars. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5ea2ece99`)	2021-11-30 10:57:34 +01:00
Guillaume Abrioux	efc93f5669	cephadm: support adding hosts with ipv6 The current implementation doesn't support adding hosts when using ipv6 addresses. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4f2c2af9b4`)	2021-11-08 10:36:27 +01:00
Guillaume Abrioux	d06c856fca	cephadm: use public_network when adding hosts When adding host, using ansible_facts['default_ipv4']['address'] might not be the desired network, we shouldn't enforce the subnet with the default route. Let's use the public_network instead. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f34531304`)	2021-11-08 10:36:27 +01:00
Guillaume Abrioux	5f7ad182f9	update: move a set_fact ceph-facts roles makes decisions based on the fact `rolling_update` so it must be called before we run this role. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5edcc4214`)	2021-11-03 11:50:38 +01:00
Guillaume Abrioux	e63df909af	update: support --limit on monitor nodes Change needed in order to support --limit on mon nodes. Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']` throws an error: `"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `82eee4303b`)	2021-11-03 08:48:51 +01:00
Guillaume Abrioux	9526425111	rolling_update: modify default health_osd_check_* let's do more retries with a shorter delay. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `50a21d695e`)	2021-10-25 20:38:09 +02:00
Guillaume Abrioux	0d1c0c2813	rolling_update: fix pre and post osd upgrade play when using --limit osds, the play before and after osd upgrade are skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"` using `hosts: "{{ osds_group_name | default('osds') }}"` with `delegate_to` to the first monitor addresses this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc9f87c45f`)	2021-10-25 20:15:17 +02:00
Guillaume Abrioux	1019c7bf25	update: support upgrading a subset of nodes It can be useful in a large cluster deployment to split the upgrade and only upgrade a group of nodes at a time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5cf9db2b0`)	2021-10-25 20:15:17 +02:00
Guillaume Abrioux	d73dde0fc7	adopt: fix rbd mirror adoption The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore. It means the keyrings and ceph config file aren't available after the migration. The idea here is to remove the current rbd mirror peer and add it back to the mon config store so we aren't bound to the /etc/ceph directory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9c794aa9bc`)	2021-10-25 20:14:24 +02:00
Per Abildgaard Toft	4271670a83	shrink-osd: fix regression because of a wrong regex `968891f449` introduced a regression. The regex is wrong because it doesn't allow to shrink osds with id greater than 9 Fixes: #6950 Signed-off-by: Per Abildgaard Toft <per@minfejl.dk> (cherry picked from commit `84118a3063`)	2021-10-21 12:38:45 +02:00
Guillaume Abrioux	c9582945fa	adopt: import rgw ssl certificate into kv store Without this, when rgw is managed by cephadm, it fails to start because the ssl certificate isn't present in the kv store. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `930fc4c850`) (cherry picked from commit `6e9cf80747`)	2021-10-18 18:38:47 +02:00
Dimitri Savineau	864acaae10	cephadm-adopt: make the playbook idempotent If the cephadm-adopt.yml fails during the first execution and some daemons have already been adopted by cephadm then we can't rerun the playbook because the old container won't exist anymore. Error: no container with name or ID ceph-mon-xxx found: no such container If the daemons are adopted then the old systemd unit doesn't exist anymore so any call to that unit with systemd will fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918424 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6886700a00`)	2021-10-18 18:38:47 +02:00
Seena Fallah	360cfb156d	cephadm: install cephadm from repository Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `5822936252`)	2021-10-18 18:38:47 +02:00
Seena Fallah	5e5f45d633	cephadm-adopt: configure repository for cephadm installation Configure repository for cephadm installation and use package install in both containerized and non containerized deployment Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `339212a7c6`)	2021-10-18 18:38:47 +02:00
Seena Fallah	057f8e4315	cephadm: set ssh configs at bootstrap step Add support ssh_user and ssh_config to cephadm bootstrap plugin Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ae6be71b08`)	2021-10-15 15:13:18 +02:00
Guillaume Abrioux	21a4c16b06	shrink-osd: check osd id format This adds a check early in order to ensure the format of osd ids passed is correct. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `968891f449`)	2021-10-15 14:35:34 +02:00
Francesco Pantano	2e93c80f73	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason sometimes is useful skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is providing a tag to enable the OpenStack operators to skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:47 +02:00
Seena Fallah	d2da6f8974	cephadm: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `0b78faa723`)	2021-10-01 23:32:16 +02:00
Guillaume Abrioux	da10c22500	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:29 +02:00
Guillaume Abrioux	276b9fd49e	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:47:02 +02:00
Seena Fallah	25e078f685	purge: add remove_docker tag This can help to skip docker removal tasks Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ff39c8d70b`)	2021-09-14 20:49:55 +02:00
Seena Fallah	eef429a75b	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-14 20:49:33 +02:00
Daniel Pivonka	c8cadaa154	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:53 +02:00
Seena Fallah	c8841cdf41	purge: add container_binary needed for zap osds `container_binary` isn't set anymore in the purge osd play because of a regression introduced by `60aa70a`. The CI didn't catch it because the play purging node-exporter sets this variable for all nodes before we run the purge osd play. This commit fixes this regression. Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `a51ce767ca`)	2021-09-09 14:40:42 +02:00
Dimitri Savineau	befe57d017	purge-dashboard: remove cid files This adds the service cid file cleanup as supported in the classic purge playbook since `b9dd253` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cddc23f511`)	2021-09-08 12:05:25 -04:00
Guillaume Abrioux	afe442a18f	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:56 -04:00
Guillaume Abrioux	492c2b5389	update: gather facts only one time this play doesn't need to gather facts from localhost Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c14e9114ba`)	2021-08-17 15:31:41 -04:00
VasishtaShastry	3037d394ca	Fixes typo in rgw-add-users-buckets playbook Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `478d9fdcb6`)	2021-08-09 14:31:48 -04:00
Dimitri Savineau	bcf9a2c25e	infra: use dedicated variables for balancer status The balancer status is registered during the cephadm-adopt, rolling_update and swith2container playbooks. But it is also used in the ceph-handler role which is included in those playbooks too. Even if the ceph-handler tasks are skipped for rolling_update and switch2container, the balancer_status variable is erased with the skip task result. play1: register: balancer_status play2: register: balancer_status <-- skipped play3: when: (balancer_status.stdout | from_json)['active'] | bool This leads to issue like: The conditional check '(balancer_status.stdout | from_json)['active'] | bool' failed. The error was: Unexpected templating type error occurred on ({% if (balancer_status.stdout | from_json)['active'] | bool %} True {% else %} False {% endif %}): expected string or buffer. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `386661699b`)	2021-08-04 11:43:53 -04:00
Dimitri Savineau	561a7c02c0	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 13:57:20 -04:00
Dimitri Savineau	380e0bec83	rolling_update: get ceph version when mons exist `eec3878` introduced a regression for upgrade scenarios where there's no monitor nodes at all (like ganesha standalone, external clients, etc..) TASK [get the ceph release being deployed] ********************************** task path: infrastructure-playbooks/rolling_update.yml:121 Thursday 29 July 2021 15:55:29 +0000 (0:00:00.484) 0:00:15.802 ******* fatal: [client0]: FAILED! => msg: '''dict object'' has no attribute ''mons''' Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e87a47cf0c`)	2021-08-03 12:17:53 -04:00
Benoît Knecht	35ce2bb643	infrastructure-playbooks: Get Ceph info in check mode In the `set osd flags` block, run the Ceph commands that gather information from the cluster (and don't make any changes to it) even when running in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d7653dca95`)	2021-08-02 15:54:04 +02:00
Guillaume Abrioux	b9cc91f622	update: check the ceph release Check early which Ceph release is going to be deployed and fail if it doesn't correspond to the ceph-ansible version being used. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eec38784ec`)	2021-07-26 13:39:20 -04:00
Guillaume Abrioux	f085f681f0	purge: support osd_auto_discovery This adds a task that zaps by osd id so we can support the scenario where osds were deployed with `osd_auto_discovery` is true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4144074a50`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	0ef447704f	purge: merge playbooks This refactor merges the two playbooks so we only have to maintain 1 playbook. (Symlink the old purge-container-cluster.yml playbook for backward compatibility). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `17cd83bf3a`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	b2b2871ccd	purge: drop variables from 'hosts' sections Those variables are useless given this is not possible to override them. Let's replace them with the hardcoded name instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6b50401d0c`)	2021-07-22 17:10:01 -04:00
Dimitri Savineau	06158c2ac5	common: remove unnecessary run_once statements `1303611` introduced tasks for disabling the pg_autoscaler on pools and the balancer but thoses tasks are already executed on the first monitor node so we don't need to add the run_once statement. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `738fa9428a`)	2021-07-21 10:01:25 -04:00
Dimitri Savineau	f9d60644ad	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 08:57:53 -04:00

743 Commits (fd8aca866d6f9576e2c5b3ed5a676e0430b23e64)