ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Seena Fallah	54aca30a24	ansible: use ansible.utils.ipwrap instead of ansible.netcommon.ipwrap ansible.netcommon.ipwrap is deprecated and is not being redirected with ansible 2.9.* Signed-off-by: Seena Fallah <seenafallah@gmail.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2022-06-14 09:36:39 +02:00
Guillaume Abrioux	c9dd9a09d2	switch to ansible.netcommon.ipwrap As of 2.10, Ansible moved ipwrap to netcommon collection. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2022-06-07 16:30:18 +02:00
Guillaume Abrioux	4d3e25c85e	cephadm_adopt: set autotune_memory_target_ratio This adds a task that sets `autotune_memory_target_ratio` depending on the value of `is_hci`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028693 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `41d62596fc`)	2022-05-30 16:42:10 +02:00
Guillaume Abrioux	081c170120	cephadm-adopt: remove legacy directory after adoption When this directory is left after the osd adoption, it leads to the following error: ``` [WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'. ``` this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph. Note: this doesn't fix the root cause, this is a workaround. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6e2ebe857d`)	2022-05-13 06:58:16 +02:00
Teoman ONAY	274a780237	Using another user than root for cephadm ssh connections fails Fixes commit `da42f3d139` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2048734 Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `f851d3232c`)	2022-03-21 09:35:28 +01:00
Guillaume Abrioux	bcab0d7a55	adopt: fix node labelling When using group of group, the playbook will apply undesired labels on nodes. This commit fixes it by applying only the expected labels. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `266b6e739c`)	2022-03-03 17:01:58 +01:00
Teoman ONAY	839ad5927d	Add cluster custom name support When using cluster custom names, cephadm commands are executed using the default admin keyring name which fails. Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `f8c6bba657`)	2022-03-03 17:01:58 +01:00
Teoman ONAY	c3ce6fc41a	Enable user to change the account used for ssh connection By default cephadm uses root account to connect remotely to other nodes in the cluster. This change allows to choose another account. This commit also allows to use a dedicated subnet for cephadm mgmt. Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `da42f3d139`)	2022-03-03 17:01:58 +01:00
Guillaume Abrioux	314ba6e3e9	adopt: fix rbd-mirror adoption We can't use `{{ cephadm_cmd }}` here because the monitors aren't yet adopted. We must use `{{ ceph_cmd }}` instead. This also fixes some filters `| default()` (they must be moved before `| from_json()`) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `94e51d5c14`)	2022-02-10 08:49:43 +01:00
Guillaume Abrioux	371c25f0ef	adopt: fix bug in mon_ip_list set_fact `default('{}')` must be before `| from_json` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f30767432b`)	2022-02-09 12:40:09 +01:00
Guillaume Abrioux	cb197575dd	adopt: check for POOL_APP_NOT_ENABLED warning This commit makes the cephadm-adopt playbook fail if the cluster has the `POOL_APP_NOT_ENABLED` warning raised. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2040243 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ddae06e1a2`)	2022-02-09 12:40:09 +01:00
Francesco Pantano	8f15179d57	Add with_pkg tag on package related tasks In the OpenStack context we let the integration tool (TripleO) deal with repositories and packages. This change just adds the with_pkg tag to allow TripleO skipping both the repositories and packages installation. Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `12dd8b5df1`)	2022-02-04 09:52:07 +01:00
Guillaume Abrioux	fa281c7538	adopt: create nfs exports at the user level The current implementation is wrong. ceph-ansible lists all existing buckets and try to create an export for each of them. Instead, it's easier to create the export at the user level. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7f517cdd22`)	2022-01-29 15:25:46 +01:00
Guillaume Abrioux	17d8351971	cephadm-adopt: use named args in rgw export creation In order to avoid breaking changes, let's use named argument instead of positional argument syntax in the command line used to create rgw export. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `aee1f06497`)	2022-01-06 16:52:05 +01:00
Guillaume Abrioux	8a32576d20	cephadm-adopt: ensure /etc/ceph is present on monitoring node When deploying the monitoring stack on a dedicated node, the directory `/etc/ceph` has never been created. Therefore, the play for adopting the monitoring stack fails because it can't write the minimal config file. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7ece59b41d`)	2021-12-07 23:09:42 +01:00
Guillaume Abrioux	b16d9fc289	cephadm-adopt: bindmount /var/lib/ceph with 'ro' When collocating osds with iscsigw daemons, cephadm bindmounts the following: ``` -v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config ``` this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z` since 'ro' is enough in this playbook, let's replace the ':z' option on this bindmount with ':ro' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c4fdf956bd`)	2021-11-30 21:04:31 +01:00
Guillaume Abrioux	1628347253	adopt: fix ceph_origin and ceph_repository defaults This is overriding those variables because the precedence at the 'block var' level is greater than the group_vars/host_vars. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5ea2ece99`)	2021-11-30 13:02:24 +01:00
Guillaume Abrioux	6bdaa9e3d5	cephadm: support adding hosts with ipv6 The current implementation doesn't support adding hosts when using ipv6 addresses. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4f2c2af9b4`)	2021-11-08 10:36:14 +01:00
Guillaume Abrioux	0097cb09f1	cephadm: use public_network when adding hosts When adding host, using ansible_facts['default_ipv4']['address'] might not be the desired network, we shouldn't enforce the subnet with the default route. Let's use the public_network instead. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f34531304`)	2021-11-08 10:36:14 +01:00
Dimitri Savineau	041e8b0eaa	cephadm-adopt: remove logrotate configuration cephadm uses its own logrotate configuration file so ceph-ansible needs to remove that custom file during the cephadm-adopt playbook. Closes: #6944 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c41241244e`)	2021-11-03 11:51:03 +01:00
Guillaume Abrioux	e5ef104c57	adopt: fix rbd mirror adoption The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore. It means the keyrings and ceph config file aren't available after the migration. The idea here is to remove the current rbd mirror peer and add it back to the mon config store so we aren't bound to the /etc/ceph directory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9c794aa9bc`)	2021-10-25 20:14:07 +02:00
Guillaume Abrioux	b1bdb708d0	adopt: use mgr/nfs volume use the mgr 'nfs' module to recreate nfs exports. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4257410dcd`)	2021-10-25 17:16:15 +02:00
Seena Fallah	7b19748304	cephadm-adopt: configure repository for cephadm installation Configure repository for cephadm installation and use package install in both containerized and non containerized deployment Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `339212a7c6`)	2021-10-13 08:10:05 +02:00
Francesco Pantano	642a83dc6b	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason it is sometimes useful to skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is to provide a tag that lets OpenStack operators skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:33 +02:00
Guillaume Abrioux	d196881ebb	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:02 +02:00
Guillaume Abrioux	a053adbe84	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:46:49 +02:00
Seena Fallah	cb5a675e49	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-13 16:26:24 +02:00
Daniel Pivonka	969e41fa2e	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:40 +02:00
Dimitri Savineau	ac5353a2d8	cephadm-adopt: fix orch host add with FQDN When a node is configured with FQDN as the hostname value then the `ceph orch host add` command will fail because the `ansible_hostname` used by that command contains the short hostname which won't match the current hostname (FQDN) Instead we can use the ansible_nodename fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1997083 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2630f8d47a`)	2021-08-26 17:10:55 -04:00
Dimitri Savineau	e3e849378e	cephadm-adopt: remove ceph-nfs.target This systemd target doesn't exist at all. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8ba6101bbb`)	2021-08-18 15:29:03 -04:00
Guillaume Abrioux	d7311aeefc	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:50 -04:00
Guillaume Abrioux	6e9cf80747	adopt: import rgw ssl certificate into kv store Without this, when rgw is managed by cephadm, it fails to start because the ssl certificate isn't present in the kv store. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `930fc4c850`)	2021-08-05 14:47:47 -04:00
Dimitri Savineau	2377da8f9b	infra: use dedicated variables for balancer status The balancer status is registered during the cephadm-adopt, rolling_update and switch2container playbooks. But it is also used in the ceph-handler role which is included in those playbooks too. Even if the ceph-handler tasks are skipped for rolling_update and switch2container, the balancer_status variable is erased with the skip task result. play1: register: balancer_status play2: register: balancer_status <-- skipped play3: when: (balancer_status.stdout | from_json)['active'] | bool This leads to issues like: The conditional check '(balancer_status.stdout | from_json)['active'] | bool' failed. The error was: Unexpected templating type error occurred on ({% if (balancer_status.stdout | from_json)['active'] | bool %} True {% else %} False {% endif %}): expected string or buffer. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `386661699b`)	2021-08-04 11:43:47 -04:00
Dimitri Savineau	31cc8bd2aa	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 13:57:14 -04:00
Benoît Knecht	c8348ab0d9	infrastructure-playbooks: Get Ceph info in check mode In the `set osd flags` block, run the Ceph commands that gather information from the cluster (and don't make any changes to it) even when running in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d7653dca95`)	2021-08-02 15:53:49 +02:00
Dimitri Savineau	e54c8e93ee	cephadm-adopt: set application on ganesha pool Set the nfs application to the ganesha pool. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1956840 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `aeb9f562e5`)	2021-07-21 16:22:25 +02:00
Dimitri Savineau	3ec8e90b34	cephadm-adopt: enable osd memory autotune for HCI This enables the osd_memory_target_autotune option on HCI environment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1973149 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a305296384`)	2021-07-21 16:22:10 +02:00
Dimitri Savineau	cf734e19b7	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 14:00:30 +02:00
Guillaume Abrioux	3cc8c667d0	common: disable/enable pg_autoscaler The PG autoscaler can disrupt the PG checks so the idea here is to disable it and re-enable it back after the restart is done. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13036115e2`)	2021-07-20 11:04:25 -04:00
Guillaume Abrioux	f80837c23e	cephadm_adopt: add any_errors_fatal on play Add any_errors_fatal: true in cephadm-adopt playbook. We should stop the playbook execution when a task throws an error. Otherwise it can lead to unexpected behavior. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1976179 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3b804a61dd`)	2021-07-03 11:58:57 +02:00
Guillaume Abrioux	0856d3e47f	cephadm-adopt/rgw: add host target in svc_id If multi-realms were deployed with several instances belonging to the same realm and zone using the same port on different nodes, the service id expected by cephadm will be the same and therefore only one service will be deployed. We need to create a service called `<node>.<realm>.<zone>.<port>` to be sure the service name will be unique and well deployed on the expected node in order to preserve backward compatibility with the rgws instances that were deployed with ceph-ansible. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967455 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `31311b03ed`)	2021-06-29 15:18:49 +02:00
Guillaume Abrioux	aa332ac64d	cephadm-adopt: support rgw multisite adoption We need to support rgw multisite deployments. This commit makes the adoption playbook support this kind of deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967455 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc784fc44c`)	2021-06-24 09:48:27 +02:00
Guillaume Abrioux	17f9780274	cephadm-adopt: fix mgr placement hosts task When no `[mgrs]` group is defined in the inventory, mgr daemon are implicitly collocated with monitors. This task currently relies on the length of the mgr group in order to tell cephadm to deploy mgr daemons. If there's no `[mgrs]` group defined in the inventory, it will ask cephadm to deploy 0 mgr daemon which doesn't make sense and will throw an error. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970313 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f9a73149a4`)	2021-06-14 13:55:45 +02:00
Guillaume Abrioux	747d259511	cephadm_adopt: fix ceph-crash migration ceph-ansible leaves a ceph-crash container in containerized deployment. It means we end up with 2 ceph-crash containers running after the migration playbook is complete. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954614 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22c18e82f0`)	2021-04-29 07:14:17 +02:00
Guillaume Abrioux	60c0fb8a7a	cephadm_adopt: fix rgw placement task Due to a recent breaking change in ceph, this command must be modified to add the <svc_id> parameter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1f40c12502`)	2021-04-27 15:17:28 +02:00
Guillaume Abrioux	a1f445cc73	cephadm_adopt: create a 'nfs-ganesha' pool When migrating from a cluster with no MDS nodes deployed, `{{ cephfs_data_pool.name }}` doesn't exist so we need to create a pool for storing nfs export objects. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1950403 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bb7d37fb6a`)	2021-04-27 15:17:28 +02:00
Guillaume Abrioux	6b87d8c95c	cephadm_adopt: support nfs-ganesha adoption This commit adds the nfs-ganesha adoption support in the `cephadm-adopt.yml` playbook. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944504 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a9220654f5`)	2021-04-12 15:32:22 +02:00
Guillaume Abrioux	5aa9d0dfb4	cephadm_adopt: modify placement policy for rgw the adoption playbook should use `radosgw_num_instances` in order to determine how many rgw instances it should recreate. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1943170 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1ffc4df6b6`)	2021-04-12 15:32:22 +02:00
Guillaume Abrioux	c2d40d4383	cephadm_adopt: fix a typo This play does nothing other than stopping/removing rgw daemons. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ee44d86072`)	2021-04-12 15:32:22 +02:00
Alex Schultz	56aac327dd	Use ansible_facts It has come to our attention that using ansible_* vars that are populated with INJECT_FACTS_AS_VARS=True is not very performant. In order to be able to support setting that to off, we need to update the references to use ansible_facts[<thing>] instead of ansible_<thing>. Related: ansible#73654 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406 Signed-off-by: Alex Schultz <aschultz@redhat.com> (cherry picked from commit `a7f2fa73e6`)	2021-03-26 00:04:49 +01:00

83 Commits (f5020f61309d92be3527c78c110cbb0b1591d24e)