ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Francesco Pantano	2e93c80f73	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason, it is sometimes useful to skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is to provide a tag that lets OpenStack operators skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:47 +02:00
Seena Fallah	d2da6f8974	cephadm: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `0b78faa723`)	2021-10-01 23:32:16 +02:00
Guillaume Abrioux	da10c22500	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:29 +02:00
Guillaume Abrioux	276b9fd49e	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:47:02 +02:00
Seena Fallah	25e078f685	purge: add remove_docker tag This can help to skip docker removal tasks Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ff39c8d70b`)	2021-09-14 20:49:55 +02:00
Seena Fallah	eef429a75b	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-14 20:49:33 +02:00
Daniel Pivonka	c8cadaa154	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:53 +02:00
Seena Fallah	c8841cdf41	purge: add container_binary needed for zap osds `container_binary` isn't set anymore in the purge osd play because of a regression introduced by `60aa70a`. The CI didn't catch it because the play purging node-exporter sets this variable for all nodes before we run the purge osd play. This commit fixes this regression. Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `a51ce767ca`)	2021-09-09 14:40:42 +02:00
Dimitri Savineau	befe57d017	purge-dashboard: remove cid files This adds the service cid file cleanup as supported in the classic purge playbook since `b9dd253` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cddc23f511`)	2021-09-08 12:05:25 -04:00
Guillaume Abrioux	afe442a18f	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:56 -04:00
Guillaume Abrioux	492c2b5389	update: gather facts only one time this play doesn't need to gather facts from localhost Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c14e9114ba`)	2021-08-17 15:31:41 -04:00
VasishtaShastry	3037d394ca	Fixes typo in rgw-add-users-buckets playbook Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `478d9fdcb6`)	2021-08-09 14:31:48 -04:00
Dimitri Savineau	bcf9a2c25e	infra: use dedicated variables for balancer status The balancer status is registered during the cephadm-adopt, rolling_update and switch2container playbooks. But it is also used in the ceph-handler role which is included in those playbooks too. Even if the ceph-handler tasks are skipped for rolling_update and switch2container, the balancer_status variable is erased with the skip task result. play1: register: balancer_status play2: register: balancer_status <-- skipped play3: when: (balancer_status.stdout | from_json)['active'] | bool This leads to an issue like: The conditional check '(balancer_status.stdout | from_json)['active'] | bool' failed. The error was: Unexpected templating type error occurred on ({% if (balancer_status.stdout | from_json)['active'] | bool %} True {% else %} False {% endif %}): expected string or buffer. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `386661699b`)	2021-08-04 11:43:53 -04:00
Dimitri Savineau	561a7c02c0	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 13:57:20 -04:00
Dimitri Savineau	380e0bec83	rolling_update: get ceph version when mons exist `eec3878` introduced a regression for upgrade scenarios where there's no monitor nodes at all (like ganesha standalone, external clients, etc..) TASK [get the ceph release being deployed] ********************************** task path: infrastructure-playbooks/rolling_update.yml:121 Thursday 29 July 2021 15:55:29 +0000 (0:00:00.484) 0:00:15.802 ******* fatal: [client0]: FAILED! => msg: '''dict object'' has no attribute ''mons''' Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e87a47cf0c`)	2021-08-03 12:17:53 -04:00
Benoît Knecht	35ce2bb643	infrastructure-playbooks: Get Ceph info in check mode In the `set osd flags` block, run the Ceph commands that gather information from the cluster (and don't make any changes to it) even when running in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d7653dca95`)	2021-08-02 15:54:04 +02:00
Guillaume Abrioux	b9cc91f622	update: check the ceph release Check early which Ceph release is going to be deployed and fail if it doesn't correspond to the ceph-ansible version being used. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eec38784ec`)	2021-07-26 13:39:20 -04:00
Guillaume Abrioux	f085f681f0	purge: support osd_auto_discovery This adds a task that zaps by osd id so we can support the scenario where osds were deployed with `osd_auto_discovery` set to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4144074a50`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	0ef447704f	purge: merge playbooks This refactor merges the two playbooks so we only have to maintain 1 playbook. (Symlink the old purge-container-cluster.yml playbook for backward compatibility). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `17cd83bf3a`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	b2b2871ccd	purge: drop variables from 'hosts' sections Those variables are useless given this is not possible to override them. Let's replace them with the hardcoded name instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6b50401d0c`)	2021-07-22 17:10:01 -04:00
Dimitri Savineau	06158c2ac5	common: remove unnecessary run_once statements `1303611` introduced tasks for disabling the pg_autoscaler on pools and the balancer but those tasks are already executed on the first monitor node so we don't need to add the run_once statement. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `738fa9428a`)	2021-07-21 10:01:25 -04:00
Dimitri Savineau	f9d60644ad	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 08:57:53 -04:00
Guillaume Abrioux	f3a9135241	common: disable/enable pg_autoscaler The PG autoscaler can disrupt the PG checks so the idea here is to disable it and re-enable it back after the restart is done. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13036115e2`)	2021-07-20 11:48:39 -04:00
Guillaume Abrioux	559b379f73	purge: reindent playbook This commit reindents the playbook. Also improve readability by adding an extra line between plays. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `60aa70a128`)	2021-07-13 17:02:45 -04:00
Dimitri Savineau	876fa07175	rolling_update: check quorum state before upgrade If one of the monitors is out of the quorum then nothing prevents the upgrade playbook from running. We only check if we have at least three monitor nodes but we should also check if those monitor nodes are correctly present in the quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1952571 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `97148dd58c`)	2021-07-12 12:59:48 -04:00
Guillaume Abrioux	a14a3e56c0	update: fail the playbook if straw2 conversion failed It's better to fail the playbook so the user is aware the straw2 migration has failed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c396122ad9`)	2021-07-09 16:32:56 -04:00
Guillaume Abrioux	361f373e18	update: followup on pr #6689 add missing 'osd' command. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4eb4268dee`)	2021-07-09 11:34:22 +02:00
Guillaume Abrioux	a0087c425b	update: convert straw bucket After an upgrade, the presence of straw buckets will produce the following warning (HEALTH_WARN): `crush map has legacy tunables (require firefly, min is hammer)` because straw bucket is a firefly feature it needs to be converted to straw2. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967964 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eee576477c`)	2021-07-09 09:15:42 +02:00
Dimitri Savineau	b3dde31a06	infra: add playbook to purge dashboard/monitoring The dashboard/monitoring stack can be deployed via the dashboard_enabled variable. But there's nothing similar if we want to remove that part only and keep the ceph cluster up and running. The current purge playbooks remove everything. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8e4ef7d6da`)	2021-07-06 11:40:38 -04:00
Guillaume Abrioux	4366cb3b30	cephadm_adopt: add any_errors_fatal on play Add any_errors_fatal: true in cephadm-adopt playbook. We should stop the playbook execution when a task throws an error. Otherwise it can lead to unexpected behavior. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1976179 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3b804a61dd`)	2021-07-03 11:59:27 +02:00
Guillaume Abrioux	21a6cc2cdf	purge: add monitoring group in final cleanup play This adds the monitoring group in the "final cleanup play" so any cid files generated are well removed when purging the cluster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1974536 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `037d8cd05e`)	2021-07-02 14:37:09 -04:00
Guillaume Abrioux	22fd0846a7	update: do not gather facts on each play There's no benefit to gather facts again on each play in rolling_update.yml Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2c77d0094c`)	2021-06-30 20:39:50 +02:00
Dimitri Savineau	0b273bbac6	switch2container: run ceph-validate role This adds the ceph-validate role before starting the switch to a containerized deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1968177 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fc160b3be1`)	2021-06-30 09:30:12 +02:00
Guillaume Abrioux	fe2e057b51	shrink-mgr: modify existing mgr check Do not rely on the inventory aliases in order to check if the selected manager to be removed is present. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967897 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `26a7256c4c`)	2021-06-29 17:52:33 +02:00
Guillaume Abrioux	768677610c	cephadm-adopt/rgw: add host target in svc_id If multi-realms were deployed with several instances belonging to the same realm and zone using the same port on different nodes, the service id expected by cephadm will be the same and therefore only one service will be deployed. We need to create a service called `<node>.<realm>.<zone>.<port>` to be sure the service name will be unique and well deployed on the expected node in order to preserve backward compatibility with the rgws instances that were deployed with ceph-ansible. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967455 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `31311b03ed`)	2021-06-29 15:19:02 +02:00
Guillaume Abrioux	5cdc9af044	cephadm-adopt: support rgw multisite adoption We need to support rgw multisite deployments. This commit makes the adoption playbook support this kind of deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967455 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc784fc44c`)	2021-06-24 11:25:52 +02:00
Guillaume Abrioux	1eb42b143a	cephadm-adopt: fix mgr placement hosts task When no `[mgrs]` group is defined in the inventory, mgr daemons are implicitly collocated with monitors. This task currently relies on the length of the mgr group in order to tell cephadm to deploy mgr daemons. If there's no `[mgrs]` group defined in the inventory, it will ask cephadm to deploy 0 mgr daemons which doesn't make sense and will throw an error. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970313 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f9a73149a4`)	2021-06-14 13:56:00 +02:00
Guillaume Abrioux	ac0a5c1e68	dashboard: fix typo introduced during backport during backport of `c8b92deba1` the pattern should have been s/monitoring_group_name/grafana_server_group_name/ Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1964907 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-05-26 15:23:01 +02:00
Guillaume Abrioux	dc0d028094	fs2bs: use match filter in selectattr() `0990ae4109` changed the filter in selectattr() from 'match' to 'equalto' but due to an incompatibility with the Jinja2 version for python 2.7 on el7 we must stick to using 'match' filter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d6745e9cd9`)	2021-05-26 09:38:33 +02:00
Guillaume Abrioux	5ece2368cd	fs2bs: fix wrong filter when setting osd_ids using 'match' filter in that task will lead to bad behavior if I have the following node names for instance: - node1 - node11 - node111 with `selectattr('name', 'match', inventory_hostname)` it will match 'node1' along with 'node11' and 'node111'. using 'equalto' filter will make sure we only match the target node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1963066 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0990ae4109`)	2021-05-26 09:38:33 +02:00
Guillaume Abrioux	610da6ff3c	cephadm_adopt: create a 'nfs-ganesha' pool When migrating from a cluster with no MDS nodes deployed, `{{ cephfs_data_pool.name }}` doesn't exist so we need to create a pool for storing nfs export objects. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1950403 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bb7d37fb6a`)	2021-05-06 10:15:31 +02:00
Guillaume Abrioux	13c02fe8e9	cephadm_adopt: support nfs-ganesha adoption This commit adds the nfs-ganesha adoption support in the `cephadm-adopt.yml` playbook. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944504 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a9220654f5`)	2021-05-06 10:15:31 +02:00
Guillaume Abrioux	b05d1bdaff	cephadm_adopt: fix a typo This play does nothing else than stopping/removing rgw daemons. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ee44d86072`)	2021-05-06 10:15:31 +02:00
Guillaume Abrioux	84a0ed440d	update: fix ceph-crash stop task This is a workaround for an issue in ansible. When trying to stop/mask/disable this service in one task, the stop didn't actually happen, the task doesn't fail but for some reason the container is still present and running. Then the task starting the service in the role ceph-crash fails because it can't start the container since it's already running with the same name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1955393 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3db1ea7ec4`)	2021-05-04 16:02:29 +02:00
Guillaume Abrioux	8b284d4356	cephadm_adopt: fix ceph-crash migration ceph-ansible leaves a ceph-crash container in containerized deployment. It means we end up with 2 ceph-crash containers running after the migration playbook is complete. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954614 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22c18e82f0`)	2021-04-29 07:14:44 +02:00
Guillaume Abrioux	4468bd913d	switch-to-containers: only chown corresponding files When collocating daemons, if we chown all files under `/var/lib/ceph` it can cause issues for the collocated daemons that wouldn't have been migrated yet. This commit makes the playbook chown only the files corresponding to the daemon being migrated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ddbc11c4a9`)	2021-04-15 05:24:30 +02:00
Guillaume Abrioux	82c7af195f	fs2bs: add a final play This removes the fact `skipped_nodes` which is useless when we run with `--limit` since it gets reset when a new iteration is made. Instead, let's print within a final play which node has been skipped reusing the `skip_this_node` fact. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3d4267051f`)	2021-04-14 16:46:43 +02:00
Guillaume Abrioux	568d1d6427	docker2podman: skip some role imports from handler when running docker-to-podman playbook, there's no need to call `ceph-config` and `ceph-rgw` from the role `ceph-handler`. It can even have side effects when coming from a baremetal cluster that was previously migrated using the switch-to-containers playbook. Indeed it might complain about missing .target systemd unit since they are removed during that migration. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `70f19be367`)	2021-04-12 13:30:17 +02:00
Guillaume Abrioux	ad0bd5f907	docker2podman: add documentation/header this adds a small documentation in the header of the playbook in order to explain what is the goal of this playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `36b4227dcd`)	2021-04-12 09:44:24 +02:00
Guillaume Abrioux	74ed52e003	switch_to_containers: support iscsigws migration This adds the iscsigws migration to containers. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=<bz-number> Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2c74c27321`)	2021-04-09 15:28:17 +02:00
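
The two fs2bs entries above (`5ece2368cd` and `dc0d028094`) hinge on the fact that a `match` test only anchors at the start of the string, so `node1` also matches `node11` and `node111`. The sketch below reproduces that behaviour in plain Python using `re.match`, on the assumption that the `match` test behaves like `re.match`. The `osds` list and the anchored-pattern variant are hypothetical illustrations; they are not taken from either commit.

```python
import re

# Hypothetical data shaped like the example in the commit message.
osds = [
    {"name": "node1", "id": 0},
    {"name": "node11", "id": 1},
    {"name": "node111", "id": 2},
]
inventory_hostname = "node1"

# Start-anchored match: every name beginning with "node1" is selected.
prefix_matched = [o["name"] for o in osds if re.match(inventory_hostname, o["name"])]

# Strict equality: only the target node is selected (the 'equalto' idea).
exact_matched = [o["name"] for o in osds if o["name"] == inventory_hostname]

# One possible way to keep a regex match while selecting a single node:
# escape the hostname and anchor the pattern at both ends (an assumption,
# not necessarily what the playbook does).
anchored_pattern = "^" + re.escape(inventory_hostname) + "$"
anchored_matched = [o["name"] for o in osds if re.match(anchored_pattern, o["name"])]

print(prefix_matched)
print(exact_matched)
print(anchored_matched)
```

With the unanchored pattern all three names are selected, while both the equality check and the fully anchored pattern select only `node1`.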

715 Commits (5e40cb89572dbc42a32919f81bad8a3b79f5154d)