ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	f2eab356d6	ceph_volume: support overriding bind-mounts This makes it possible to call `podman run` with custom bind-mounts. cephadm-adopt.yml playbook needs it for a very specific use case: Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b02d71c307`) # Conflicts: # library/ceph_volume.py	2021-12-02 08:52:05 +01:00
Guillaume Abrioux	9423ec3eb6	adopt: fix ceph_origin and ceph_repository defaults This is overriding those variables because the precedence at the 'block var' level is greater than the group_vars/host_vars. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5ea2ece99`)	2021-11-30 10:57:34 +01:00
Guillaume Abrioux	53dc75d29c	validate: fix bug when using vault Since a variable encrypted with vault is no longer a string but an encrypted object, we can't use the filter `| length`; we have to convert it to a string first. Fixes: #6991 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6ad7e52869`)	2021-11-29 13:42:24 +01:00
Guillaume Abrioux	efc93f5669	cephadm: support adding hosts with ipv6 The current implementation doesn't support adding hosts when using ipv6 addresses. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4f2c2af9b4`)	2021-11-08 10:36:27 +01:00
Guillaume Abrioux	d06c856fca	cephadm: use public_network when adding hosts When adding host, using ansible_facts['default_ipv4']['address'] might not be the desired network, we shouldn't enforce the subnet with the default route. Let's use the public_network instead. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f34531304`)	2021-11-08 10:36:27 +01:00
Guillaume Abrioux	5f7ad182f9	update: move a set_fact ceph-facts roles makes decisions based on the fact `rolling_update` so it must be called before we run this role. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5edcc4214`)	2021-11-03 11:50:38 +01:00
Guillaume Abrioux	e63df909af	update: support --limit on monitor nodes Change needed in order to support --limit on mon nodes. Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']` throws an error: ``` "The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'" ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `82eee4303b`)	2021-11-03 08:48:51 +01:00
Guillaume Abrioux	9526425111	rolling_update: modify default health_osd_check_* let's do more retries with a shorter delay. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `50a21d695e`)	2021-10-25 20:38:09 +02:00
Guillaume Abrioux	0d1c0c2813	rolling_update: fix pre and post osd upgrade play When using --limit osds, the plays before and after the osd upgrade are skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`. Using `hosts: "{{ osds_group_name | default('osds') }}"` with `delegate_to` to the first monitor addresses this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc9f87c45f`)	2021-10-25 20:15:17 +02:00
Guillaume Abrioux	120ed2b7f3	tests: add new scenario subset_update new scenario in order to test the subset upgrade approach using tags. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fb8a66149b`)	2021-10-25 20:15:17 +02:00
Guillaume Abrioux	1019c7bf25	update: support upgrading a subset of nodes It can be useful in a large cluster deployment to split the upgrade and only upgrade a group of nodes at a time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5cf9db2b0`)	2021-10-25 20:15:17 +02:00
Guillaume Abrioux	d73dde0fc7	adopt: fix rbd mirror adoption The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore. It means the keyrings and ceph config file aren't available after the migration. The idea here is to remove the current rbd mirror peer and add it back to the mon config store so we aren't bound to the /etc/ceph directory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9c794aa9bc`)	2021-10-25 20:14:24 +02:00
Per Abildgaard Toft	4271670a83	shrink-osd: fix regression because of a wrong regex `968891f449` introduced a regression. The regex is wrong because it doesn't allow to shrink osds with id greater than 9 Fixes: #6950 Signed-off-by: Per Abildgaard Toft <per@minfejl.dk> (cherry picked from commit `84118a3063`)	2021-10-21 12:38:45 +02:00
Guillaume Abrioux	c9582945fa	adopt: import rgw ssl certificate into kv store Without this, when rgw is managed by cephadm, it fails to start because the ssl certificate isn't present in the kv store. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `930fc4c850`) (cherry picked from commit `6e9cf80747`)	2021-10-18 18:38:47 +02:00
Dimitri Savineau	4ab40842df	library: make cephadm_adopt module idempotent Running the cephadm_adopt module on an already adopted daemon will fail because the cephadm adopt command isn't idempotent. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918424 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ff9d314305`)	2021-10-18 18:38:47 +02:00
Dimitri Savineau	864acaae10	cephadm-adopt: make the playbook idempotent If the cephadm-adopt.yml fails during the first execution and some daemons have already been adopted by cephadm then we can't rerun the playbook because the old container won't exist anymore. Error: no container with name or ID ceph-mon-xxx found: no such container If the daemons are adopted then the old systemd unit doesn't exist anymore so any call to that unit with systemd will fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918424 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6886700a00`)	2021-10-18 18:38:47 +02:00
Seena Fallah	360cfb156d	cephadm: install cephadm from repository Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `5822936252`)	2021-10-18 18:38:47 +02:00
Seena Fallah	5e5f45d633	cephadm-adopt: configure repository for cephadm installation Configure repository for cephadm installation and use package install in both containerized and non containerized deployment Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `339212a7c6`)	2021-10-18 18:38:47 +02:00
Seena Fallah	075b1a94d5	ceph-validate: export validate repository vars as a task Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `4f6da9d92f`)	2021-10-18 18:38:47 +02:00
Seena Fallah	110b08c290	ceph-common: export repository configuration to a single task Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `e79bda9a05`)	2021-10-18 18:38:47 +02:00
Seena Fallah	057f8e4315	cephadm: set ssh configs at bootstrap step Add support ssh_user and ssh_config to cephadm bootstrap plugin Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ae6be71b08`)	2021-10-15 15:13:18 +02:00
Guillaume Abrioux	21a4c16b06	shrink-osd: check osd id format This adds a check early in order to ensure the format of osd ids passed is correct. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `968891f449`)	2021-10-15 14:35:34 +02:00
Guillaume Abrioux	5e40cb8957	tests: remove all references to ceph_stable_release this is legacy and not needed anymore. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f277a39dfe`)	2021-10-02 15:50:24 +02:00
Seena Fallah	59c7238741	ceph-defaults: set ceph_stable_release default to the stable branch release ceph_stable_release is a legacy from the time where a single branch of ceph-ansible supported more than one release of ceph Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `fb99626987`)	2021-10-02 15:50:24 +02:00
Francesco Pantano	2e93c80f73	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason it is sometimes useful to skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is to provide a tag that lets OpenStack operators skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:47 +02:00
Seena Fallah	d2da6f8974	cephadm: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `0b78faa723`)	2021-10-01 23:32:16 +02:00
Guillaume Abrioux	16e41d3a81	tests: add osd node in collocation we update the pool size from 1 to 2 in idempotency test but only 1 node is available. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b6c470c7e2`)	2021-09-30 18:30:54 +02:00
Guillaume Abrioux	f6fc6dcf7e	tests: set rgw_instances in collect-logs.yml in order to gather rgw logs, we need rgw_instances to be set. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c2e46fe5a5`)	2021-09-30 17:53:01 +02:00
Guillaume Abrioux	77f8d7dfaa	tests: update collect-logs.yml playbook - change `ceph -s` output to json-pretty. - gather rgw logs - add `health detail` command Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b2ccc7234a`)	2021-09-30 17:48:35 +02:00
Guillaume Abrioux	4682334924	tests: move collect-logs.yml to ceph-ansible repo related ceph-build PR: ceph/ceph-build#1914 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `702564518b`)	2021-09-29 16:41:28 +02:00
Alex Lambert	de17b232e6	dashboard: allow disabling of unused features Unconfigured dashboard features can lead to empty tabs in the dashboard containing no meaningful content. Allow users to disable dashboard features they know will not be used. A list of features to be disabled allows the user to define a streamlined dashboard as standard across deployments. Defaults to disabling no features, ensuring that users are sure they do not need the dashboard feature before disabling it. Signed-off-by: Alex Lambert <lamberta@microsoft.com> (cherry picked from commit `a9680ab17f`)	2021-09-29 14:28:26 +02:00
Guillaume Abrioux	5904a25684	tests: fix container-cephadm job add missing variable `containerized_deployment` in group_vars Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `66f3eb377c`)	2021-09-29 09:58:14 +02:00
Guillaume Abrioux	da10c22500	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:29 +02:00
Guillaume Abrioux	276b9fd49e	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:47:02 +02:00
Seena Fallah	25e078f685	purge: add remove_docker tag This can help to skip docker removal tasks Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ff39c8d70b`)	2021-09-14 20:49:55 +02:00
Seena Fallah	eef429a75b	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-14 20:49:33 +02:00
Daniel Pivonka	c8cadaa154	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:53 +02:00
Seena Fallah	c8841cdf41	purge: add container_binary needed for zap osds `container_binary` isn't set anymore in the purge osd play because of a regression introduced by `60aa70a`. The CI didn't catch it because the play purging node-exporter sets this variable for all nodes before we run the purge osd play. This commit fixes this regression. Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `a51ce767ca`)	2021-09-09 14:40:42 +02:00
Dimitri Savineau	380d25a752	ceph-defaults: set quay.io as the default registry Because the ceph container images are now only pushed to the quay.io registry then this updates the default registry value. The docker.io registry can still be used but doesn't receive updated container images. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e7b43c1fc6`)	2021-09-09 13:43:02 +02:00
Dimitri Savineau	befe57d017	purge-dashboard: remove cid files This adds the service cid file cleanup as supported in the classic purge playbook since `b9dd253` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cddc23f511`)	2021-09-08 12:05:25 -04:00
Seena Fallah	688a673c48	ceph-container-engine: allow override container_package_name and container_service_name Only include specific variables when they are undefined Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `95bce32270`)	2021-09-08 15:35:19 +02:00
Dimitri Savineau	d054864366	tests/rgw: use json format output for user info If the radosgw user already exists then we need to have the output in json format because we are expecting to load the output with json.loads() Otherwise we have pytest failure like: ```console self = <json.decoder.JSONDecoder object at 0x7fa2f00a5fd0>, s = '', idx = 0 def raw_decode(self, s, idx=0): """Decode a JSON document from ``s`` (a ``str`` beginning with a JSON document) and return a 2-tuple of the Python representation and the index in ``s`` where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end. """ try: obj, end = self.scan_once(s, idx) except StopIteration as err: > raise JSONDecodeError("Expecting value", s, err.value) from None E json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) ``` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f2bd8ae70f`)	2021-08-27 14:40:32 -04:00
Dimitri Savineau	dc4b8445fa	tests/rgw: add timeout 5s to radosgw-admin command If the radosgw daemons aren't up and running correctly (like not registered in the servicemap or the OSD are down) then the radosgw-admin will hang forever. Jenkins will kill the jobs after 3h but we don't want to wait until this global timeout. Adding the timeout 5 command to the radosgw-admin commands (which is already present on other ceph calls) allows the job to fail earlier. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f01ae82eec`)	2021-08-27 14:40:32 -04:00
Dimitri Savineau	6baa6e6b84	container: explicitly pull monitoring images We don't pull the monitoring container images (alertmanager, prometheus, node-exporter and grafana) in a dedicated task like we're doing for the ceph container image. This means that the container image pull is done during the start of the systemd service. By doing this, pulling the image behind a proxy isn't working with podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1995574 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5bb7240f87`)	2021-08-23 16:08:16 -04:00
Guillaume Abrioux	6892e02a30	iscsi: don't set default value for trusted_ip_list It restricts access to the iSCSI API. It can be left empty if the API isn't going to be accessed from outside the gateway node. Even though this seems to be a limited use case, it's better to leave it empty by default than to have a meaningless default value. We could make this variable mandatory but that would be a breaking change. Let's just add logic in the template to set this variable in the configuration file only if it was specified by users. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1994930 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6802b8dddd`)	2021-08-19 12:06:50 -04:00
Guillaume Abrioux	afe442a18f	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:56 -04:00
Guillaume Abrioux	e7d9d0a7d4	roles: remove leftover from pr #4319 pr #4319 introduced some useless `become: true` on systemd tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1db8fa8989`)	2021-08-18 11:08:28 -04:00
Guillaume Abrioux	da54ea555e	Vagrantfile: fallback on 'vagrant_variables.yml.sample' When using a vagrant command from the root directory of the repo, it throws an error if no 'vagrant_variables.yml' file is present. ``` Message: Errno::ENOENT: No such file or directory @ rb_sysopen - /home/guits/workspaces/ceph-ansible/vagrant_variables.yml ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3d27f9e7dc`)	2021-08-18 11:08:06 -04:00
Guillaume Abrioux	492c2b5389	update: gather facts only one time this play doesn't need to gather facts from localhost Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c14e9114ba`)	2021-08-17 15:31:41 -04:00
Dimitri Savineau	a6b6706fdb	ceph-mon: do not log monitor keyring We don't want to display the keyring in the ansible log. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e44075abd6`)	2021-08-12 13:31:00 +02:00

5702 Commits (f2eab356d6d9c0506eec6412200f7e7fa9d7886c)
