ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	0097cb09f1	cephadm: use public_network when adding hosts When adding host, using ansible_facts['default_ipv4']['address'] might not be the desired network, we shouldn't enforce the subnet with the default route. Let's use the public_network instead. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f34531304`)	2021-11-08 10:36:14 +01:00
Dimitri Savineau	041e8b0eaa	cephadm-adopt: remove logrotate configuration cephadm uses its own logrotate configuration file so ceph-ansible needs to remove that custom file during the cephadm-adopt playbook. Closes: #6944 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c41241244e`)	2021-11-03 11:51:03 +01:00
Guillaume Abrioux	19dadc98da	update: move a set_fact ceph-facts roles makes decisions based on the fact `rolling_update` so it must be called before we run this role. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5edcc4214`)	2021-11-03 11:50:27 +01:00
Guillaume Abrioux	8f648269ec	update: support --limit on monitor nodes Change needed in order to support --limit on mon nodes. Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']` throws an error: ``` "The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'" ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `82eee4303b`)	2021-11-03 08:48:38 +01:00
Guillaume Abrioux	a752edbd29	Revert "update: block upgrade when nfs+rgw is deployed" This reverts commit `93f1765259`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2017508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-10-28 08:13:05 +02:00
Guillaume Abrioux	f7d67f7669	rolling_update: modify default health_osd_check_* let's do more retries with a shorter delay. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `50a21d695e`)	2021-10-25 21:08:44 +02:00
Guillaume Abrioux	e5ef104c57	adopt: fix rbd mirror adoption The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore. It means the keyrings and ceph config file aren't available after the migration. The idea here is to remove the current rbd mirror peer and add it back to the mon config store so we aren't bound to the /etc/ceph directory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9c794aa9bc`)	2021-10-25 20:14:07 +02:00
Guillaume Abrioux	b1bdb708d0	adopt: use mgr/nfs volume use the mgr 'nfs' module to recreate nfs exports. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4257410dcd`)	2021-10-25 17:16:15 +02:00
Guillaume Abrioux	efc6979db5	rolling_update: fix pre and post osd upgrade play when using --limit osds, the play before and after osd upgrade are skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"` using `hosts: "{{ osds_group_name | default('osds') }}" with `delegate_to` to the first monitor addresses this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc9f87c45f`)	2021-10-25 15:33:18 +02:00
Guillaume Abrioux	ca25ebb323	update: support upgrading a subset of nodes It can be useful in a large cluster deployment to split the upgrade and only upgrade a group of nodes at a time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5cf9db2b0`)	2021-10-25 15:33:18 +02:00
Per Abildgaard Toft	3edc6ac5f2	shrink-osd: fix regression because of a wrong regex `968891f449` introduced a regression. The regex is wrong because it doesn't allow to shrink osds with id greater than 9 Fixes: #6950 Signed-off-by: Per Abildgaard Toft <per@minfejl.dk> (cherry picked from commit `84118a3063`)	2021-10-21 12:38:25 +02:00
Seena Fallah	fde6354dcd	cephadm: set ssh configs at bootstrap step Add support ssh_user and ssh_config to cephadm bootstrap plugin Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ae6be71b08`)	2021-10-15 16:15:38 +02:00
Guillaume Abrioux	86ab9e44b6	shrink-osd: check osd id format This adds a check early in order to ensure the format of osd ids passed is correct. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `968891f449`)	2021-10-15 14:35:23 +02:00
Seena Fallah	191ec4f40f	cephadm: install cephadm from repository Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `5822936252`)	2021-10-13 08:10:05 +02:00
Seena Fallah	7b19748304	cephadm-adopt: configure repository for cephadm installation Configure repository for cephadm installation and use package install in both containerized and non containerized deployment Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `339212a7c6`)	2021-10-13 08:10:05 +02:00
Francesco Pantano	642a83dc6b	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason sometimes is useful skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is providing a tag to enable the OpenStack operators to skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:33 +02:00
Seena Fallah	c3fe1a6206	cephadm: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `0b78faa723`)	2021-10-01 23:31:39 +02:00
Guillaume Abrioux	4b5a0c0443	cephadm: add admin label on mon nodes This is needed if you want a copy of the admin keyring on the admin nodes. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b555f1d1cd`)	2021-10-01 23:23:06 +02:00
Guillaume Abrioux	d196881ebb	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:02 +02:00
Guillaume Abrioux	a053adbe84	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:46:49 +02:00
Seena Fallah	cb5a675e49	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-13 16:26:24 +02:00
Daniel Pivonka	969e41fa2e	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:40 +02:00
Seena Fallah	432ab37c6b	purge: add remove_docker tag This can help to skip docker removal tasks Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ff39c8d70b`)	2021-09-09 16:41:32 +02:00
Seena Fallah	0897c08518	purge: add container_binary needed for zap osds `container_binary` isn't set anymore in the purge osd play because of a regression introduced by `60aa70a`. The CI didn't catch it because the play purging node-exporter sets this variable for all nodes before we run the purge osd play. This commit fixes this regression. Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `a51ce767ca`)	2021-09-09 14:40:30 +02:00
Dimitri Savineau	ac6604ab61	purge-dashboard: remove cid files This adds the service cid file cleanup as supported in the classic purge playbook since `b9dd253` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cddc23f511`)	2021-09-08 12:05:22 -04:00
Dimitri Savineau	ac5353a2d8	cephadm-adopt: fix orch host add with FQDN When a node is configured with FQDN as the hostname value then the `ceph orch host add` command will fail because the `ansible_hostname` used by that command contains the short hostname which won't match the current hostname (FQDN) Instead we can use the ansible_nodename fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1997083 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2630f8d47a`)	2021-08-26 17:10:55 -04:00
Dimitri Savineau	e3e849378e	cephadm-adopt: remove ceph-nfs.target This systemd target doesn't exist at all. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8ba6101bbb`)	2021-08-18 15:29:03 -04:00
Guillaume Abrioux	d7311aeefc	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:50 -04:00
Guillaume Abrioux	056b18aa0e	update: gather facts only one time this play doesn't need to gather facts from localhost Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c14e9114ba`)	2021-08-17 15:31:34 -04:00
VasishtaShastry	6ed0919796	Fixes typo in rgw-add-users-buckets playbook Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `478d9fdcb6`)	2021-08-09 14:31:42 -04:00
Guillaume Abrioux	6e9cf80747	adopt: import rgw ssl certificate into kv store Without this, when rgw is managed by cephadm, it fails to start because the ssl certificate isn't present in the kv store. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `930fc4c850`)	2021-08-05 14:47:47 -04:00
Dimitri Savineau	2377da8f9b	infra: use dedicated variables for balancer status The balancer status is registered during the cephadm-adopt, rolling_update and swith2container playbooks. But it is also used in the ceph-handler role which is included in those playbooks too. Even if the ceph-handler tasks are skipped for rolling_update and switch2container, the balancer_status variable is erased with the skip task result. play1: register: balancer_status play2: register: balancer_status <-- skipped play3: when: (balancer_status.stdout | from_json)['active'] | bool This leads to issue like: The conditional check '(balancer_status.stdout | from_json)['active'] | bool' failed. The error was: Unexpected templating type error occurred on ({% if (balancer_status.stdout | from_json)['active'] | bool %} True {% else %} False {% endif %}): expected string or buffer. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `386661699b`)	2021-08-04 11:43:47 -04:00
Dimitri Savineau	31cc8bd2aa	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 13:57:14 -04:00
Dimitri Savineau	7f5b986e01	rolling_update: get ceph version when mons exist `eec3878` introduced a regression for upgrade scenarios where there's no monitor nodes at all (like ganesha standalone, external clients, etc..) TASK [get the ceph release being deployed] ********************************** task path: infrastructure-playbooks/rolling_update.yml:121 Thursday 29 July 2021 15:55:29 +0000 (0:00:00.484) 0:00:15.802 ******* fatal: [client0]: FAILED! => msg: '''dict object'' has no attribute ''mons''' Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e87a47cf0c`)	2021-08-03 12:17:47 -04:00
Benoît Knecht	c8348ab0d9	infrastructure-playbooks: Get Ceph info in check mode In the `set osd flags` block, run the Ceph commands that gather information from the cluster (and don't make any changes to it) even when running in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d7653dca95`)	2021-08-02 15:53:49 +02:00
Guillaume Abrioux	76f68843e5	update: check the ceph release Check early which Ceph release is going to be deployed and fail if it doesn't correspond to the ceph-ansible version being used. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eec38784ec`)	2021-07-26 13:35:30 -04:00
Guillaume Abrioux	036b03a7bb	purge: support osd_auto_discovery This adds a task that zaps by osd id so we can support the scenario where osds were deployed with `osd_auto_discovery` is true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4144074a50`)	2021-07-22 12:43:57 -04:00
Guillaume Abrioux	b36e8ec935	purge: merge playbooks This refactor merges the two playbooks so we only have to maintain 1 playbook. (Symlink the old purge-container-cluster.yml playbook for backward compatibility). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `17cd83bf3a`)	2021-07-22 12:43:57 -04:00
Guillaume Abrioux	fc5440b71c	purge: drop variables from 'hosts' sections Those variables are useless given this is not possible to override them. Let's replace them with the hardcoded name instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6b50401d0c`)	2021-07-22 12:43:57 -04:00
Dimitri Savineau	e54c8e93ee	cephadm-adopt: set application on ganesha pool Set the nfs application to the ganesha pool. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1956840 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `aeb9f562e5`)	2021-07-21 16:22:25 +02:00
Dimitri Savineau	3ec8e90b34	cephadm-adopt: enable osd memory autotune for HCI This enables the osd_memory_target_autotune option on HCI environment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1973149 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a305296384`)	2021-07-21 16:22:10 +02:00
Dimitri Savineau	f6cd8b9816	common: remove unnecessary run_once statements `1303611` introduced tasks for disabling the pg_autoscaler on pools and the balancer but thoses tasks are already executed on the first monitor node so we don't need to add the run_once statement. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `738fa9428a`)	2021-07-21 10:01:15 -04:00
Dimitri Savineau	cf734e19b7	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 14:00:30 +02:00
Guillaume Abrioux	3cc8c667d0	common: disable/enable pg_autoscaler The PG autoscaler can disrupt the PG checks so the idea here is to disable it and re-enable it back after the restart is done. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13036115e2`)	2021-07-20 11:04:25 -04:00
Guillaume Abrioux	fb825e0659	purge: reindent playbook This commit reindents the playbook. Also improve readability by adding an extra line between plays. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `60aa70a128`)	2021-07-13 14:47:44 -04:00
Dimitri Savineau	e08cb421d4	rolling_update: check quorum state before upgrade If one a the monitor is out of the quorum then nothing prevents the upgrade playbook to run. We only check if we have at least three monitor nodes but we should also check if those monitor nodes are correctly present in the quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1952571 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `97148dd58c`)	2021-07-12 12:58:02 -04:00
Guillaume Abrioux	bf5d0b7374	update: fail the playbook if straw2 conversion failed It's better to fail the playbook so the user is aware the straw2 migration has failed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c396122ad9`)	2021-07-09 16:32:47 -04:00
Guillaume Abrioux	0a348bd396	update: followup on pr #6689 add mising 'osd' command. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4eb4268dee`)	2021-07-09 11:34:12 +02:00
Guillaume Abrioux	ea8f0c7bcb	update: convert straw bucket After an upgrade, the presence of straw buckets will produce the following warning (HEALTH_WARN): ``` crush map has legacy tunables (require firefly, min is hammer) ``` because straw bucket is a firefly feature it needs to be converted to straw2. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967964 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eee576477c`)	2021-07-09 09:15:24 +02:00
Dimitri Savineau	ec648981e6	infra: add playbook to purge dashboard/monitoring The dashboard/monitoring stack can be deployed via the dashboard_enabled variable. But there's nothing similar if we can to remove that part only and keep the ceph cluster up and running. The current purge playbooks remove everything. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8e4ef7d6da`)	2021-07-06 11:40:31 -04:00


763 Commits (0097cb09f107a5925391f6e62f702025d5637fd9)