ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Seena Fallah	59c7238741	ceph-defaults: set ceph_stable_release default to the stable branch release ceph_stable_release is a legacy from the time where a single branch of ceph-ansible supported more than one release of ceph Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `fb99626987`)	2021-10-02 15:50:24 +02:00
Francesco Pantano	2e93c80f73	Add ceph_nfs_adopt tag to the cephadm-adopt playbook There are existing OpenStack scenarios where nfs is still not managed by cephadm. For this reason it is sometimes useful to skip the nfs part of the adoption playbook and leave this daemon unmanaged. The purpose of this patch is to provide a tag that lets OpenStack operators skip this playbook section. Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `b7299f258b`)	2021-10-01 23:32:47 +02:00
Seena Fallah	d2da6f8974	cephadm: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `0b78faa723`)	2021-10-01 23:32:16 +02:00
Guillaume Abrioux	16e41d3a81	tests: add osd node in collocation we update the pool size from 1 to 2 in idempotency test but only 1 node is available. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b6c470c7e2`)	2021-09-30 18:30:54 +02:00
Guillaume Abrioux	f6fc6dcf7e	tests: set rgw_instances in collect-logs.yml in order to gather rgw logs, we need rgw_instances to be set. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c2e46fe5a5`)	2021-09-30 17:53:01 +02:00
Guillaume Abrioux	77f8d7dfaa	tests: update collect-logs.yml playbook - change `ceph -s` output to json-pretty. - gather rgw logs - add `health detail` command Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b2ccc7234a`)	2021-09-30 17:48:35 +02:00
Guillaume Abrioux	4682334924	tests: move collect-logs.yml to ceph-ansible repo related ceph-build PR: ceph/ceph-build#1914 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `702564518b`)	2021-09-29 16:41:28 +02:00
Alex Lambert	de17b232e6	dashboard: allow disabling of unused features Unconfigured dashboard features can lead to empty tabs in the dashboard containing no meaningful content. Allow users to disable dashboard features they know will not be used. A list of features to be disabled allows the user to define a streamlined dashboard as standard across deployments. Defaults to disabling no features, ensuring that users are sure they do not need the dashboard feature before disabling it. Signed-off-by: Alex Lambert <lamberta@microsoft.com> (cherry picked from commit `a9680ab17f`)	2021-09-29 14:28:26 +02:00
Guillaume Abrioux	5904a25684	tests: fix container-cephadm job add missing variable `containerized_deployment` in group_vars Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `66f3eb377c`)	2021-09-29 09:58:14 +02:00
Guillaume Abrioux	da10c22500	cephadm-adopt: add no_log: true Let's add a `no_log: true` on the `cephadm registry-login` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a3b916ee7`)	2021-09-28 21:15:29 +02:00
Guillaume Abrioux	276b9fd49e	adopt: stop iscsi services in the first place If old containers are still running, it can make tcmu-runner process unable to open devices and there's nothing else to do than restarting the container. Also, as per discussion with iscsi experts, iscsi should be migrated before OSDs. (the client should be closed before the server) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d12efa1ab4`)	2021-09-28 18:47:02 +02:00
Seena Fallah	25e078f685	purge: add remove_docker tag This can help to skip docker removal tasks Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `ff39c8d70b`)	2021-09-14 20:49:55 +02:00
Seena Fallah	eef429a75b	cephadm-adopt: use cephadm_ssh_user for ssh user Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `67389d08d4`)	2021-09-14 20:49:33 +02:00
Daniel Pivonka	c8cadaa154	cephadm-adopt: set cephadm registry login info registry login info needs to be stored in cluster for cephadm and future hosts Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103 Signed-off-by: Daniel Pivonka <dpivonka@redhat.com> (cherry picked from commit `1c50dc29cf`)	2021-09-13 16:18:53 +02:00
Seena Fallah	c8841cdf41	purge: add container_binary needed for zap osds `container_binary` isn't set anymore in the purge osd play because of a regression introduced by `60aa70a`. The CI didn't catch it because the play purging node-exporter sets this variable for all nodes before we run the purge osd play. This commit fixes this regression. Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `a51ce767ca`)	2021-09-09 14:40:42 +02:00
Dimitri Savineau	380d25a752	ceph-defaults: set quay.io as the default registry Because the ceph container images are now only pushed to the quay.io registry then this updates the default registry value. The docker.io registry can still be used but doesn't receive updated container images. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e7b43c1fc6`)	2021-09-09 13:43:02 +02:00
Dimitri Savineau	befe57d017	purge-dashboard: remove cid files This adds the service cid file cleanup as supported in the classic purge playbook since `b9dd253` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cddc23f511`)	2021-09-08 12:05:25 -04:00
Seena Fallah	688a673c48	ceph-container-engine: allow override container_package_name and container_service_name Only include specific variables when they are undefined Signed-off-by: Seena Fallah <seenafallah@gmail.com> (cherry picked from commit `95bce32270`)	2021-09-08 15:35:19 +02:00
Dimitri Savineau	d054864366	tests/rgw: use json format output for user info If the radosgw user already exists then we need to have the output in json format because we are expecting to load the output with json.loads() Otherwise we have pytest failure like: ```console self = <json.decoder.JSONDecoder object at 0x7fa2f00a5fd0>, s = '', idx = 0 def raw_decode(self, s, idx=0): """Decode a JSON document from ``s`` (a ``str`` beginning with a JSON document) and return a 2-tuple of the Python representation and the index in ``s`` where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end. """ try: obj, end = self.scan_once(s, idx) except StopIteration as err: > raise JSONDecodeError("Expecting value", s, err.value) from None E json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) ``` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f2bd8ae70f`)	2021-08-27 14:40:32 -04:00
Dimitri Savineau	dc4b8445fa	tests/rgw: add timeout 5s to radosgw-admin command If the radosgw daemons aren't up and running correctly (like not registered in the servicemap or the OSD are down) then the radosgw-admin will hang forever. Jenkins will kill the jobs after 3h but we don't want to wait until this global timeout. Adding the timeout 5 command to the radosgw-admin commands (which is already present on other ceph calls) allows the job to fail earlier. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f01ae82eec`)	2021-08-27 14:40:32 -04:00
Dimitri Savineau	6baa6e6b84	container: explicitly pull monitoring images We don't pull the monitoring container images (alertmanager, prometheus, node-exporter and grafana) in a dedicated task like we're doing for the ceph container image. This means that the container image pull is done during the start of the systemd service. By doing this, pulling the image behind a proxy isn't working with podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1995574 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5bb7240f87`)	2021-08-23 16:08:16 -04:00
Guillaume Abrioux	6892e02a30	iscsi: don't set default value for trusted_ip_list It restricts access to the iSCSI API. It can be left empty if the API isn't going to be accessed from outside the gateway node. Even though this seems to be a limited use case, it's better to leave it empty by default than having a meaningless default value. We could make this variable mandatory but that would be a breaking change. Let's just add logic in the template in order to set this variable in the configuration file only if it was specified by users. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1994930 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6802b8dddd`)	2021-08-19 12:06:50 -04:00
Guillaume Abrioux	afe442a18f	containers: introduce target systemd unit This adds ceph-*.target systemd unit files support for containerized deployments. This also fixes a regression introduced by PR #6719 (rgw and nfs systemd units not getting purged) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09ef465f62`)	2021-08-18 13:42:56 -04:00
Guillaume Abrioux	e7d9d0a7d4	roles: remove leftover from pr #4319 pr #4319 introduced some useless `become: true` on systemd tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1db8fa8989`)	2021-08-18 11:08:28 -04:00
Guillaume Abrioux	da54ea555e	Vagrantfile: fallback on 'vagrant_variables.yml.sample' When using a vagrant command from the root directory of the repo, it throws an error if no 'vagrant_variables.yml' file is present. ``` Message: Errno::ENOENT: No such file or directory @ rb_sysopen - /home/guits/workspaces/ceph-ansible/vagrant_variables.yml ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3d27f9e7dc`)	2021-08-18 11:08:06 -04:00
Guillaume Abrioux	492c2b5389	update: gather facts only one time this play doesn't need to gather facts from localhost Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c14e9114ba`)	2021-08-17 15:31:41 -04:00
Dimitri Savineau	a6b6706fdb	ceph-mon: do not log monitor keyring We don't want to display the keyring in the ansible log. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e44075abd6`)	2021-08-12 13:31:00 +02:00
Guillaume Abrioux	5b30a72869	common: do not log keyring secret let's not display any keyring secret by default in ansible log. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1980744 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7511195738`)	2021-08-11 17:01:09 -04:00
Dimitri Savineau	fa8b58fb33	ceph-dashboard: fix TLS cert openssl generation With OpenSSL versions prior to 1.1.1 (like CentOS 7 with 1.0.2k), the -addext option doesn't exist. As a solution, this uses the default openssl.cnf configuration file as a template and adds the subjectAltName in the v3_ca section. This temp openssl configuration file is removed after the TLS certificate creation. This patch also moves the run_once statement to the block level. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5e0ace7e54`)	2021-08-09 15:14:38 -04:00
Guillaume Abrioux	fa16f6d923	dashboard: subj_alt_names fact refactor the current way the variable is built results in: ``` 2021-08-03 04:18:23,020 - ceph.ceph - INFO - ok: [ceph-sangadi-4x-indpt6-node1-installer] => changed=false ansible_facts: subj_alt_names: |- subjectAltName=ceph-sangadi-4x-indpt6-node1-installer/subjectAltName=10.0.210.223/subjectAltName=ceph-sangadi-4x-indpt6-node1-installersubjectAltName=ceph-sangadi-4x-indpt6-node2/subjectAltName=10.0.210.252/subjectAltName=ceph-sangadi-4x-indpt6-node2/ ``` which is incorrect. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6f1a0634f7`)	2021-08-09 15:14:38 -04:00
VasishtaShastry	3037d394ca	Fixes typo in rgw-add-users-buckets playbook Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `478d9fdcb6`)	2021-08-09 14:31:48 -04:00
Teoman ONAY	47149a5483	podman pids.max default value is 2048, docker's one is 4096 which are sufficient for the default value (512) of rgw thread pool size. But if its value is increased near to the pids-limit value, it does not leave place for the other processes to spawn and run within the container and the container crashes. pids-limit set to unlimited regardless of the container engine. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987041 Signed-off-by: Teoman ONAY <tonay@redhat.com> (cherry picked from commit `9b5d97adb9`)	2021-08-05 11:04:24 -04:00
Dimitri Savineau	bcf9a2c25e	infra: use dedicated variables for balancer status The balancer status is registered during the cephadm-adopt, rolling_update and switch2container playbooks. But it is also used in the ceph-handler role which is included in those playbooks too. Even if the ceph-handler tasks are skipped for rolling_update and switch2container, the balancer_status variable is erased with the skip task result. play1: register: balancer_status play2: register: balancer_status <-- skipped play3: when: (balancer_status.stdout | from_json)['active'] | bool This leads to issues like: The conditional check '(balancer_status.stdout | from_json)['active'] | bool' failed. The error was: Unexpected templating type error occurred on ({% if (balancer_status.stdout | from_json)['active'] | bool %} True {% else %} False {% endif %}): expected string or buffer. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `386661699b`)	2021-08-04 11:43:53 -04:00
Dimitri Savineau	de05060049	library: exit on user creation failure When the ceph dashboard user creation fails then the issue is hidden as we don't check the return code and don't print the error message in the module output. This ends up with a failure on the ceph dashboard set roles command saying that the user doesn't exist. By failing on the user creation, we will have an explicit explanation of the issue (like weak password). Closes: #6197 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `17784624e0`)	2021-08-04 17:41:29 +02:00
Dimitri Savineau	561a7c02c0	osds: use osd pool ls instead of osd dump command The ceph osd pool ls detail command is a subset of the ceph osd dump command. $ ceph osd dump --format json|wc -c 10117 $ ceph osd pool ls detail --format json|wc -c 4740 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `06471a4b82`)	2021-08-03 13:57:20 -04:00
Dimitri Savineau	380e0bec83	rolling_update: get ceph version when mons exist `eec3878` introduced a regression for upgrade scenarios where there's no monitor nodes at all (like ganesha standalone, external clients, etc..) TASK [get the ceph release being deployed] ********************************** task path: infrastructure-playbooks/rolling_update.yml:121 Thursday 29 July 2021 15:55:29 +0000 (0:00:00.484) 0:00:15.802 ******* fatal: [client0]: FAILED! => msg: '''dict object'' has no attribute ''mons''' Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e87a47cf0c`)	2021-08-03 12:17:53 -04:00
Benoît Knecht	35ce2bb643	infrastructure-playbooks: Get Ceph info in check mode In the `set osd flags` block, run the Ceph commands that gather information from the cluster (and don't make any changes to it) even when running in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d7653dca95`)	2021-08-02 15:54:04 +02:00
Benoît Knecht	f9478472af	ceph-handler: Fix osd handler in check mode Run the Ceph commands that only gather information (without making any changes to the cluster) when running Ansible in check mode. This allows the tasks that depend on the variables set by those tasks to succeed in check mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `498acd7527`)	2021-08-02 15:54:04 +02:00
Dimitri Savineau	9ee44013c5	library: remove unused module import Move the import at the top of the file and remove unused module import. - E402 module level import not at top of file - F401 'xxxx' imported but unused This also removes the '# noqa E402' statement from the code. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2138a00a32`)	2021-08-02 15:51:39 +02:00
Wong Hoi Sing Edison	c475d84310	library: flake8 ceph-ansible modules This commit ensures all ceph-ansible modules pass flake8 properly. Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com> (cherry picked from commit `beda1fe773`)	2021-08-02 15:51:39 +02:00
Dimitri Savineau	d7edc71fd5	ceph-defaults: update grafana dashboards source We currently download the grafana dashboards from the ceph@master branch for all ceph releases. We should use the right ceph branch according to the ceph release. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-27 11:44:50 -04:00
Dimitri Savineau	3e8d9b4a1f	ceph-defaults: add missing grafana dashboards The radosgw-sync-overview and rbd-details grafana dashboards were missing from the list. Closes: #6758 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f0ccf3ebf0`)	2021-07-27 10:53:47 -04:00
Guillaume Abrioux	b9cc91f622	update: check the ceph release Check early which Ceph release is going to be deployed and fail if it doesn't correspond to the ceph-ansible version being used. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eec38784ec`)	2021-07-26 13:39:20 -04:00
Dimitri Savineau	f5ee8dfb26	alertmanager: allow disable dashboard tls verify When using self-signed/untrusted CA certificates, alertmanager displays an error in logs. With this commit this should make those messages disappear. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1936299 Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9f77b929d1`)	2021-07-25 22:02:16 -04:00
Guillaume Abrioux	f085f681f0	purge: support osd_auto_discovery This adds a task that zaps by osd id so we can support the scenario where osds were deployed with `osd_auto_discovery` set to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4144074a50`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	0ef447704f	purge: merge playbooks This refactor merges the two playbooks so we only have to maintain 1 playbook. (Symlink the old purge-container-cluster.yml playbook for backward compatibility). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `17cd83bf3a`)	2021-07-22 17:10:01 -04:00
Guillaume Abrioux	b2b2871ccd	purge: drop variables from 'hosts' sections Those variables are useless given this is not possible to override them. Let's replace them with the hardcoded name instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6b50401d0c`)	2021-07-22 17:10:01 -04:00
Dimitri Savineau	88e07f0bbc	multisite: use node fqdn for endpoints when https When the rgw_multisite_proto variable is set to https then we shouldn't use the IP address in the zone endpoints list but the node FQDN to match the TLS certificate CN. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1965504 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ad05a08160`)	2021-07-22 22:48:03 +02:00
Dimitri Savineau	06158c2ac5	common: remove unnecessary run_once statements `1303611` introduced tasks for disabling the pg_autoscaler on pools and the balancer but those tasks are already executed on the first monitor node so we don't need to add the run_once statement. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `738fa9428a`)	2021-07-21 10:01:25 -04:00
Dimitri Savineau	f9d60644ad	common: fix py2 pool_list from_json when skipped When using python 2 and the task with a loop is skipped then it generates an error. Unexpected templating type error occurred on ({{ (pool_list.stdout | from_json)['pools'] }}): expected string or buffer Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf6e33346e`)	2021-07-21 08:57:53 -04:00
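
Several commits in this list (d054864366, bcf9a2c25e, f9d60644ad) deal with JSON decoding that fails when a command produced no output or the task was skipped. The following is a minimal Python sketch of that failure mode and one possible guard. The helper `parse_json_output` is hypothetical and is not part of ceph-ansible; it only illustrates the idea of checking for empty output before decoding.

```python
import json


def parse_json_output(stdout, default=None):
    """Decode command output as JSON, returning `default` for empty output.

    Hypothetical helper for illustration only; not part of ceph-ansible.
    """
    if not stdout or not stdout.strip():
        return default
    return json.loads(stdout)


# An empty string is not a valid JSON document, so json.loads() raises
# the same JSONDecodeError quoted in the tests/rgw commit message.
try:
    json.loads("")
except json.JSONDecodeError as err:
    error_message = str(err)

# With the guard, empty output yields the default instead of raising.
empty_result = parse_json_output("", default={})
user_info = parse_json_output('{"user_id": "test"}')
```

The commits themselves take a different route (requesting JSON output explicitly, or using dedicated variable names so a skipped task does not overwrite a registered result), so this sketch only shows why the empty-input case needs handling at all.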

5679 Commits (59c7238741f0afda5571c3a107ad1d570ae9d01a) All Branches Search