ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	4969ea7710	facts: always set ceph_run_cmd and ceph_admin_command always set these facts on monitor nodes whatever we run with `--limit`. Otherwise, playbook will fail when using `--limit` on nodes where these facts are used on a delegated task to monitor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5e81843e9`)	2020-06-03 13:22:45 -04:00
Guillaume Abrioux	86d5979269	osd: add a default value for 'default' in crush_rules Let's default to `False` for the `default` attribute in `crush_rules` variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797774 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1b0b7af119`)	2020-06-03 13:20:40 -04:00
Dimitri Savineau	a97e24fee9	docker2podman: manage dashboard nodes The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus) were not managed during the docker to podman migration. This adds the systemd container template of those services to a dedicated file (systemd.yml) in order to include it in the docker2podman playbook. This also adds the dashboard container images pull from docker to podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `252e78b4e4`)	2020-06-03 13:20:24 -04:00
Dimitri Savineau	ffd28abb45	ceph-nfs: bind mount ganesha log directory The current ganesha log directory is only present in the container and not bind mounted on the host. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `222fe4abd8`)	2020-06-03 13:19:47 -04:00
Benoît Knecht	8ae4bbd8ca	ceph-validate: Expand templates in rgw_create_pools Same fix as `ceph-rgw` for `rgw_create_pools` pool names that contain Jinja templates. See #5348 for details. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `444b46ea24`)	2020-06-03 13:18:43 -04:00
Benoît Knecht	e454b34b92	ceph-rgw: Make sure pool name templates are expanded It is common to set templated pool names in `rgw_create_pools`, e.g. ```yaml rgw_create_pools: "{{ rgw_zone }}.rgw.buckets.index": pg_num: 16 size: 3 type: replicated ``` This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in the way `with_dict` works [1]. This commit replaces the use of `with_dict` with ```yaml loop: "{{ rgw_create_pools | dict2items }}" ``` which works as intended and expands the template in the pool name. [1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops Closes #5348 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `d2b7670c7d`)	2020-06-03 13:18:43 -04:00
Dimitri Savineau	8c4190e243	ceph-facts: fix IPv6 _radosgw_address interface When using radosgw_interface and IPv6 setup then the _radosgw_address fact doesn't use square brackets compared to the radosgw_address and radosgw_address_block configuration. Closes: #5325 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ed4f23d530`)	2020-06-03 13:18:33 -04:00
fmount	7b5dba4488	Refresh ceph dashboard user role This change allows the operator to refresh the ceph dashboard admin role on multiple ceph-ansible executions. In the current state the role is set only when the user is created, and there's no way to change it if the user exists. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826002 Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `5eb363e033`)	2020-06-03 13:18:18 -04:00
Guillaume Abrioux	c2335b597f	mds: don't enable application pool on cephfs pools This commit removes the task which enables the application on cephfs pools. See: https://tracker.ceph.com/issues/43761 Fixes: #5278 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `86dc6f8206`)	2020-06-03 13:17:48 -04:00
Dimitri Savineau	487dcdc3f0	ceph-rgw: use match instead of equalto from jinja2 The '==' jinja2 operator (or 'equalto') has been introduced in jinja2 2.8. On EL7, jinja2 version is 2.7 so the operator isn't present creating templating error like: The error was: TemplateRuntimeError: no test named '==' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34e6e8e06c`)	2020-05-06 15:45:44 -04:00
Dimitri Savineau	3e421ad6e6	ceph-nfs: fix internal ganesha deployment Since `ea2b654d9` we're not running the rados command from the monitor nodes but from the ganesha node. Unfortunately we don't have the required keyring on that node to run the rados command as we don't import the right keyring. This commit restores the workflow for internal ganesha deployment like before `ea2b654d9` but keeps the rados commands from the ganesha node for external deployment until we have a better design. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8a890306ad`)	2020-05-06 13:30:21 -04:00
Dimitri Savineau	da04dfdddf	ceph-nfs: fix keyring copy for external ganesha Fix the condition on the keyring copy task that prevents the ganesha keyring from being created in the /var/lib/ceph directory. Also ensure that the directory exists first. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `748ac4b928`)	2020-05-06 13:30:21 -04:00
Guillaume Abrioux	19d0db1c7b	nfs: fix 2 typo The condition is missing an index here which makes the playbook fail. Typical error: ``` The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'", ``` Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cf460274c7`)	2020-05-06 13:30:21 -04:00
Dimitri Savineau	b77e2b64ce	ceph-dashboard: fix mgr dashboard IPv6 fact `15ed9ee` introduced a regression for the mgr dashboard daemon using IPv6 since the mgr dashboard configuration doesn't support brackets. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1827299 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f1728929cd`)	2020-04-23 16:23:30 -04:00
ianwatsonrh	ba4144544f	typo: updating type check on rc Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826884 Signed-off-by: ianwatsonrh <ianwatson@redhat.com> (cherry picked from commit `ccf6a7f153`)	2020-04-23 17:04:17 +02:00
Dimitri Savineau	d117d08a6f	ceph-container-engine: add CentOS 8 support This adds CentOS 8 support for containerized deployment allowing podman installation as the default container engine for this distribution. Closes: #5130 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-23 13:26:29 +02:00
abaird-rh	6878aab0f9	Updated use of deprecated filter This was removed in Ansible 2.9. [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result|version_compare` use `result is version_compare`. This feature will be removed in version 2.9. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. Rename 'version_compare' to the function 'version'. version_compare was renamed to version since ansible 2.5 Signed-off-by: abaird-rh <abaird@redhat.com> (cherry picked from commit `eb71244bfd`)	2020-04-20 13:37:42 -04:00
Guillaume Abrioux	0ace5f5f2c	mds: fix --limit run against mds nodes This commit fixes --limit runs against mds nodes. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `378405e328`)	2020-04-14 13:42:45 -04:00
Guillaume Abrioux	945266776b	nfs: create empty rados index object for nfs standalone This commit creates an empty rados index object even when deploying standalone nfs-ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ea2b654d95`)	2020-04-14 12:03:38 -04:00
Dimitri Savineau	1033ad191b	ceph-validate: update RHEL requirement for RHCS We were not testing the right ansible_distribution fact value for RHEL distribution. This commit also updates the minimal RHEL version supported by RHCS. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5de74fe512`)	2020-04-14 11:27:01 -04:00
Guillaume Abrioux	1b79d73729	osd: fix monitor_name error when scaling out OSDs This commit fixes a bug when trying to scale out osd nodes with `crush_rule_config` is enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4bcc52cb2a`)	2020-04-10 13:44:15 +02:00
Dimitri Savineau	6a2272b9c0	ceph-mgr: add saml python lib for dashboard SSO The dashboard SSO mgr module requires the saml python library to be installed. This is only a valid scenario for RHCS deployment because the saml python library isn't available in other classic repositories. This package is present in RHCS Tools repository so we also need to enable it on the mgr nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6617d90733`)	2020-04-06 11:00:01 -04:00
Guillaume Abrioux	7acd9686ab	osd: use default crush rule name when needed When `rule_name` isn't set in `crush_rules` the osd pool creation will fail. This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with the default crush rule name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1bb9860dfd`)	2020-03-31 19:42:40 -04:00
Guillaume Abrioux	03355aec8c	tests: add more coverage in external_clients scenario Run create_users_keys.yml in external_clients scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c1c34b201`)	2020-03-31 19:42:40 -04:00
Guillaume Abrioux	87e1f0cc6c	osd: support changing default rule even when osd_crush_location isn't defined Creating crush rules even with no crush hierarchy configuration is a valid scenario so we shouldn't be bound to the first task result (which configure crush hierarchy) to be able to add new crush rules. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5b0476385c`)	2020-03-31 23:14:55 +02:00
Dimitri Savineau	1318575626	ceph-defaults: update container tag for nautilus The latest Ceph stable release is now Octopus so the "latest" container image tag is pointing to Octopus and not Nautilus anymore. This commit updates the ceph_docker_image_tag with "latest-nautilus". Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-30 09:55:05 +02:00
Dimitri Savineau	62042f370a	rhcs: drop debian support Support for debian with RHCS has been dropped starting RHCS 4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4ac99223b2`)	2020-03-27 10:22:43 -04:00
Dimitri Savineau	94490b6e16	ceph-defaults: update ganesha to 2.8 With Ceph Nautilus release we should use nfs-ganesha 2.8 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-26 13:32:29 -04:00
Guillaume Abrioux	2970e28d66	defaults: remove legacy comment This is no longer true, let's remove this comment given that this option is not ignored in containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e551b5ba1a`)	2020-03-26 12:08:23 -04:00
Dimitri Savineau	98f223c4d0	ceph-facts: fix rgw_instances_all fact The rgw_instances_all fact is supposed to be the list of all radosgw instances from all rgw nodes. But the fact is always using the local rgw_instances variable so this won't work on multiple nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0487d21938`)	2020-03-25 08:41:28 +01:00
Guillaume Abrioux	d22641d37b	nfs: fix nfs with external ceph cluster support This commit refactors and fixes the nfs deployment with external ceph cluster support. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cc28d9ec26`)	2020-03-19 21:39:56 -04:00
Dimitri Savineau	55c222d088	dashboard: allow to set read-only admin user This commit allows one to set the role for the admin user as read-only. This can be controlled via the dashboard_admin_user_ro variable but the default value is false for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fb69f6990c`)	2020-03-19 13:24:05 -04:00
Dimitri Savineau	bf9d628b65	ceph-defaults: update grafana container tag Since `8e8aa73` we're using grafana 5.4.3 in RHCS 4.1 via [1]. We should also update the grafana container tag from docker.io when using the community release. [1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b97a4d5201`)	2020-03-17 15:28:00 -04:00
petruha	f2a50c19dc	ceph-facts: Fix system_secret_key variable handling This commit fixes the system_secret_key variable not being substituted by the right value and always using the 'system_secret_key' string instead. $ egrep 'system_(access|secret)_key' group_vars/all.yml system_access_key: foofoofoofoofoofoofo system_secret_key: barbarbarbarbarbarbarbarbarbarbarbarbarb $ ansible-playbook -vv -i hosts site.yml.sample -e rgw_multisite=true (...) - hostname: storage0 endpoint: http://192.168.100.42:8080 instance_name: rgw0 radosgw_address: 192.168.50.3 radosgw_frontend_port: 8085 rgw_realm: canada rgw_zone: montreal rgw_zone_user: justin.trudeau rgw_zone_user_display_name: Justin Trudeau rgw_zonegroup: quebec system_access_key: foofoofoofoofoofoofo system_secret_key: system_secret_key Fixes https://github.com/ceph/ceph-ansible/issues/5150 Signed-off-by: petruha <5363545+p37ruh4@users.noreply.github.com> (cherry picked from commit `73b3fadb0e`)	2020-03-16 17:59:57 -04:00
Guillaume Abrioux	d4c0d3ee66	config: remove legacy option in ceph.conf.j2 This option has been deprecated (As of 0.51). By the way, ceph-ansible already sets the auth_{service,client,cluster}_required variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `152c2caa9f`)	2020-03-16 12:04:48 -04:00
Dimitri Savineau	dc4993cc92	handler: add rgw multi-instances support This commit adds the rgw multi-instances support in ceph-handler (restart_rgw_daemons.sh.j2) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3626c688cf`)	2020-03-12 19:04:26 -04:00
Guillaume Abrioux	c26e80fdbf	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60a2e28189`)	2020-03-12 19:04:26 -04:00
Dimitri Savineau	d248f6bf8d	ceph-infra: open radosgw ports for multi instances When using the radosgw multi instances configuration then the firewall rules aren't adapted to that setup. We only open the port according to the radosgw_frontend_port variable so only the first radosgw instance port will be opened in the firewall configuration. We should instead iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e8bf0a0cf2`)	2020-03-12 19:04:26 -04:00
Guillaume Abrioux	bf0a6835a2	rgw: fix a typo in create_realm_zonegroup_zone_lists This commit fixes a typo. `s/realms/secondary_realms` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b3bbd6bb77`)	2020-03-12 16:58:09 -04:00
Guillaume Abrioux	03416c0d4e	infra: add retries/until on firewalld start task This commit makes that task retry 5 times to start the firewalld service, to avoid failures like the following: ``` TASK [ceph-infra : start firewalld] ****************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-centos-container-purge/roles/ceph-infra/tasks/configure_firewall.yml:22 Monday 09 March 2020 08:58:48 +0000 (0:00:00.963) 0:02:16.457 ******** fatal: [osd4]: FAILED! => changed=false msg: |- Unable to enable service firewalld: Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service. Created symlink from /etc/systemd/system/multi-user.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service. Failed to execute operation: Connection reset by peer ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b3d943fe9f`)	2020-03-12 21:48:20 +01:00
Guillaume Abrioux	46a13664b2	rgw: add retry/until on pools tasks Sometimes, these tasks can time out for some reason. Adding these retries can help to avoid unexpected failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7a8a719e75`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	9a574562e2	client: skip create_users_keys.yml when rolling_update There's no need to run this part of the role when upgrading clients node. Let's skip it when rolling_update.yml is being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eac207091b`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	1a978ae545	osd: do not change pool size on erasure pool This commit adds condition in order to not try to customize pools size when its type is erasure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e17c79b871`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	98783a17b3	osd: add pg autoscaler support This commit adds the pg autoscaler support. The structure for pool definition has now two additional attributes `pg_autoscale_mode` and `target_size_ratio`, e.g.: ``` test: name: "test" pg_num: "{{ osd_pool_default_pg_num }}" pgp_num: "{{ osd_pool_default_pg_num }}" rule_name: "replicated_rule" application: "rbd" type: 1 erasure_profile: "" expected_num_objects: "" size: "{{ osd_pool_default_size }}" min_size: "{{ osd_pool_default_min_size }}" pg_autoscale_mode: False target_size_ratio: 0.1 ``` When `pg_autoscale_mode` is `True`, the user has to set a decent value in `target_size_ratio`. Given that it's a new feature, it's still disabled by default. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1782253 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `47adc2bb08`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	ae06d684b8	osd: refact osd pool creation Currently, the command executed is wrong, eg: ``` cmd: - podman - exec - ceph-mon-controller-0 - ceph - --cluster - ceph - osd - pool - create - volumes - '32' - '32' - replicated_rule - '1' delta: '0:00:01.625525' end: '2020-02-27 16:41:05.232705' item: ``` From documentation, the osd pool creation command is : ``` ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \ [crush-rule-name] [expected-num-objects] ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \ [erasure-code-profile] [crush-rule-name] [expected_num_objects] ``` it means we pass '1' (from item.type) as value for `expected_num_objects` by default which is very likely not what we want. Also, this commit modifies the default value when no `rule_name` is set to use the existing variable `osd_pool_default_crush_rule` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808495 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bf1f125d71`)	2020-03-06 16:10:03 +01:00
Ali Maredia	2c440d4427	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove the multisite destroy playbook and add --cluster before radosgw-admin commands. Remove the manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in that rgw's hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `71f55bd54d`)	2020-03-04 14:39:23 -05:00
Dimitri Savineau	e037e99bd2	purge: stop rgw instances by iteration It looks like that the service module doesn't support wildcard anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found '''' in service name: ceph-radosgw@' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9d3b49293d`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	eb2fba79fc	ceph-infra: install firewalld python bindings When using the firewalld ansible module we need to be sure that the python bindings are installed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `90b1fc8fe9`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	424a0ce4ab	ceph-infra: split firewalld tasks Since ansible 2.9 the firewalld task could not be used with service and source in the same time anymore. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `45fb9241c0`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	9d4f90c8b4	Add ansible 2.9 support This commit adds ansible 2.9 support in addition of 2.8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `aefba82a2e`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	8cc2f8f21e	ceph-validate: start with ansible version test It doesn't make sense to start validating configuration if the ansible version isn't the good one. This commit moves the check_system task as the first task in the ceph-validate role. The ansible version test tasks are moved at the top of this file. Also moving the iscsi kernel tests from check_system to check_iscsi file. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1a77dd7e91`)	2020-03-03 10:31:48 +01:00
Guillaume Abrioux	5a51bd12dc	common: support OSDs with more than 2 digits When running environment with OSDs having ID with more than 2 digits, some tasks don't match the system units and therefore, playbook can fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1805643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a084a2a347`)	2020-02-28 11:06:47 -05:00
Francesco Pantano	f5e2a69134	Configure ceph dashboard backend and dashboard_frontend_vip This change introduces a new set of tasks to configure the ceph dashboard backend and listen just on the mgr related subnet (and not on '*'). For the same reason the proper server address is added in both prometheus and alertmanager systemd units. This patch also adds the "dashboard_frontend_vip" parameter to make sure we're able to support the HA model when multiple grafana instances are deployed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792230 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `15ed9eebf1`)	2020-02-24 16:50:19 -05:00
Dimitri Savineau	ca2003fbcc	ceph-rgw: increase connection timeout to 10 5s as a connection timeout could be low in some setup. Let's increase it to 10s. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `44e750ee5d`)	2020-02-24 14:41:19 -05:00
Dimitri Savineau	3617543517	containers: add KillMode=none to systemd templates Because we are relying on docker|podman for managing containers then we don't need systemd to manage the process (like kill). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5a03e0ee1c`)	2020-02-18 12:10:35 -05:00
Ali Maredia	7d2a217270	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1834c1e48d`)	2020-02-17 17:44:53 -05:00
Florian Faltermeier	17b405eb10	ceph-rgw-loadbalancer: Fix SSL newline issue The `ad7a5da` commit introduced a regression when using TLS on haproxy via the haproxy_frontend_ssl_certificate variable. This causes the "stats socket" and the "tune.ssl.default-dh-param" parameters to be on the same line, resulting in haproxy failing to start. [ALERT] 351/140240 (21388) : parsing [xxxxx] : 'stats socket' : unknown keyword 'tune.ssl.default-dh-param'. Registered [ALERT] 351/140240 (21388) : Fatal errors found in configuration. Fixes: #4869 Signed-off-by: Florian Faltermeier <florian.faltermeier@uibk.ac.at> (cherry picked from commit `9d081e2453`)	2020-02-17 11:36:09 -05:00
Dimitri Savineau	df43f32248	ceph-defaults: remove bootstrap_dirs_xxx vars Both bootstrap_dirs_owner and bootstrap_dirs_group variables aren't used anymore in the code. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c644ea9041`)	2020-02-17 11:15:33 -05:00
Dimitri Savineau	0deb5b0706	rgw: don't create user on secondary zones The rgw user creation for the Ceph dashboard integration shouldn't be created on secondary rgw zones. Closes: #4707 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794351 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `16e12bf2bb`)	2020-02-17 16:43:56 +01:00
Dimitri Savineau	5f7778d59a	ceph-{mon,osd}: move default crush variables Since `ed36a11` we moved the crush rules creation code from the ceph-mon to the ceph-osd role. To keep the backward compatibility we kept the possibility to set the crush variables on the mons side but we didn't move the default values. As a result, when crush_rule_config is set to true and the default values for crush_rules are expected, the crush rule creation task will fail. "msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'crush_rules'" This patch moves the default crush variables from the ceph-mon to the ceph-osd role but also uses those default values when nothing is defined on the mons side. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798864 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1fc6b33714`)	2020-02-17 10:18:56 -05:00
Dimitri Savineau	f78981bbf7	ceph-grafana: fix grafana_{crt,key} condition The grafana_{crt,key} aren't boolean variables but strings. The default value is an empty string so we should do the conditional on the string length instead of the bool filter Closes: #5053 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `15bd4cd189`)	2020-02-17 10:18:39 -05:00
Dimitri Savineau	a9a533e398	ceph-prometheus: add alertmanager HA config When using multiple alertmanager nodes (via the grafana-server group) then we need to specify the other peers in the configuration. https://prometheus.io/docs/alerting/alertmanager/#high-availability https://github.com/prometheus/alertmanager#high-availability Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792225 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b9d975385c`)	2020-02-17 16:18:20 +01:00
John Fulton	9c97179fc1	The _filtered_clients list should intersect with ansible_play_batch Client configuration with --limit fails without this patch because certain tasks are only done to the first host in the _filtered_clients list and it's likely that first host will not be included in what's specified with --limit. To fix this the _filtered_clients list should be built from all clients in the inventory that are also in the running play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798781 Signed-off-by: John Fulton <fulton@redhat.com> (cherry picked from commit `e4bf4857f5`)	2020-02-17 10:03:57 -05:00
Dimitri Savineau	b01b255414	ceph-nfs: add nfs-ganesha-rados-urls package Since nfs-ganesha 2.8.3 the rados-urls library has been moved to a dedicated package. We don't have the same nfs-ganesha 2.8.x between the community and rhcs repositories. community: 2.8.1 rhcs: 2.8.3 As a workaround we will install that package only for rhcs setup. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0a3e85e8ca`)	2020-02-17 10:00:44 -05:00
Dimitri Savineau	6864d04fdf	ceph-nfs: fix ceph_nfs_ceph_user variable The ceph_nfs_ceph_user variable is a string for the ceph-nfs role but a list in ceph-client role. `6a6785b` introduced a confusion between both variable type in the ceph-nfs role for external ceph with ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1801319 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `10951eeea8`)	2020-02-17 15:27:30 +01:00
Dimitri Savineau	e4e1b386b0	dashboard: allow configuring multiple grafana host When using multiple grafana hosts then we set the grafana and prometheus URL and push the dashboard layout to a single node. grafana_server_addrs is the list of all grafana nodes and used during the ceph-dashboard role (on mgr/mon nodes). grafana_server_addr is the current grafana node used during the ceph-grafana and ceph-prometheus role (on grafana-server nodes). We don't have the grafana_server_addr fact duplication code between external vs collocated nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1784011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c6e96699f7`)	2020-02-12 19:56:31 -05:00
Guillaume Abrioux	1d2a395aaf	switch_to_containers: increase health check values This commit increases the default values for the following variable consumed in switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. This also moves these variables in `ceph-defaults` role so the user can set different values if needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783223 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3700aa5385`)	2020-02-10 12:57:17 -05:00
Stanley Lam	0336a1476f	Add option for HAproxy to act as an SSL frontend termination point for loadbalanced RGW instances. Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com> (cherry picked from commit `ad7a5dad3f`)	2020-02-03 09:32:43 -05:00
Dimitri Savineau	0dbca448d1	ceph-handler: Use /proc/net/unix for rgw socket If for some reason, there's an old rgw socket file present in the /var/run/ceph/ directory then the test command could fail with test: xxxxxxxxx.asok: binary operator expected $ ls -hl /var/run/ceph/ total 0 srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok We can check the radosgw socket in /proc/net/unix to avoid using wildcard in the socket name. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60cbfdc2a6`)	2020-02-03 09:31:52 -05:00
Dimitri Savineau	460d3557d7	ceph-container-engine: lvm2 on OSD nodes only Since `de8f2a9` the lvm2 package installation has been moved from ceph-osd role to ceph-container-engine role. But the scope wasn't limited to the OSD nodes only. This commit fixes this behaviour. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fa8aa8c864`)	2020-02-03 15:16:32 +01:00
Dimitri Savineau	80f1b0feb0	ceph-common: rhcs 4 repositories for rhel 7 RHCS 4 is available for both RHEL 7 and 8 so we should also enable the cdn repositories for that distribution. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796853 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9b40a959b9`)	2020-02-03 15:15:35 +01:00
Mike Christie	76753e64f9	iscsi: Fix crashes during rolling update During a rolling update we will run the ceph iscsigw tasks that start the daemons then run the configure_iscsi.yml tasks which can create iscsi objects like targets, disks, clients, etc. The problem is that once the daemons are started they will accept configuration requests, or may want to update the system themselves. Those operations can then conflict with the configure_iscsi.yml tasks that set up objects and we can end up in crashes due to the kernel being in an unsupported state. This could also happen during creation, but is less likely due to no objects being set up yet, so there are no watchers or users accessing the gws yet. The fix in this patch works for both update and initial setup. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1795806 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `77f3b5d51b`)	2020-02-03 15:15:15 +01:00
Guillaume Abrioux	1b33c5358e	config: fix external client scenario When no monitor group is present in the inventory, this task fails. This affects only non-containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e7bc079405`)	2020-01-31 13:37:10 +01:00
Dimitri Savineau	3daea719b6	ceph-defaults: remove rgw from ceph_conf_overrides The [rgw] section in the ceph.conf file or via the ceph_conf_overrides variable doesn't exist and has no effect. To apply overrides to all radosgw instances we should use either the [global] or [client] sections. Overrides per radosgw instance should still use the [client.rgw.{instance-name}] section. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794552 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2f07b85131`)	2020-01-29 14:34:34 +01:00
Guillaume Abrioux	bc6777c6df	dashboard: add quotes when passing password to the CLI Otherwise, if the variable contains a '$' it will be interpreted as a BASH variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c3759f8ce`)	2020-01-29 14:15:41 +01:00
Guillaume Abrioux	8a907cb1ca	validate: fail if dashboard\|grafana_admin_password aren't set This commit adds a task to make sure user set a custom password for `grafana_admin_password` and `dashboard_admin_password` variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795509 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `99328545de`)	2020-01-29 14:15:41 +01:00
Dimitri Savineau	9da917501b	ceph-facts: fix _container_exec_cmd fact value When using different name between the inventory_hostname and the ansible_hostname then the _container_exec_cmd fact will get a wrong value based on the inventory_hostname instead of the ansible_hostname. This happens when the ceph cluster is already running (update/upgrade). Later the container exec commands will fail because the container name is wrong. We should always set the _container_exec_cmd based on the ansible_hostname fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1fcafffdad`)	2020-01-29 11:48:44 +01:00
Guillaume Abrioux	0d2af6ebf3	fix calls to `container_exec_cmd` in ceph-osd role We must call `container_exec_cmd` from the right monitor node otherwise the value of the fact might mismatch between the delegated node and the node being played. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f919f8971`)	2020-01-27 17:54:39 -05:00
Guillaume Abrioux	9fb69e13ed	handler: read container_exec_cmd value from first mon Given that we delegate to the first monitor, we must read the value of `container_exec_cmd` from this node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eb9112d8fb`)	2020-01-23 18:34:14 +01:00
Vytenis Sabaliauskas	4152a1a862	ceph-facts: Fix for 'running_mon is undefined' error, so that fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0' Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com> (cherry picked from commit `ed1eaa1f38`)	2020-01-23 11:24:24 -05:00
Dimitri Savineau	6a51330892	ceph-osd: set container objectstore env variables Because we need to manage legacy ceph-disk based OSD with ceph-volume then we need a way to know the osd_objectstore in the container. This was done like this previously with ceph-disk so we should also do it with ceph-volume. Note that this won't have any impact for ceph-volume lvm based OSD. Rename docker_env_args fact to container_env_args and move the container condition on the include_tasks call. Remove OSD_DMCRYPT env variable from the ceph-osd template because it's now included in the container_env_args variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792122 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c9e1fe3d92`)	2020-01-20 15:36:11 -05:00
Benoît Knecht	ff2a2bb870	ceph-rgw: Fix customize pool size "when" condition In `3c31b19ab3`, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `3842aa1a30`)	2020-01-20 12:48:19 -05:00
Guillaume Abrioux	1462423059	handler: fix call to container_exec_cmd in handler_osds When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member) Also, adding a `run_once: true` because this task needs to be run only 1 time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22865cde9c`)	2020-01-20 12:45:51 -05:00
Dmitriy Rabotyagov	8d311a537d	Fix undefined running_mon Since commit [1] introduced running_mon, it can be undefined, which results in a fatal error [2]. This patch defines the default value that was used before patch [1]. Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com> [1] `8dcbcecd71` [2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011 (cherry picked from commit `2478a7b948`)	2020-01-16 18:28:12 -05:00
Guillaume Abrioux	cae24dd85a	remove container_exec_cmd_mgr fact Iterating over all monitors in order to delegate a ` {{ container_binary }}` fails when collocating mgrs with mons, because ceph-facts reset `container_exec_cmd` to point to the first member of the monitor group. The idea is to force `container_exec_cmd` to be reset in ceph-mgr. This commit also removes the `container_exec_cmd_mgr` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8dcbcecd71`)	2020-01-15 21:10:54 +01:00
Dimitri Savineau	09a71e4a8c	ceph-iscsi: don't use bracket with trusted_ip_list The trusted_ip_list parameter for the rbd-target-api service doesn't support ipv6 address with bracket. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bd87d69183`)	2020-01-14 12:48:04 -05:00
Dimitri Savineau	ff3a3ee5e9	container: move lvm2 package installation Before this patch, the lvm2 package installation was done during the ceph-osd role. However we were running ceph-volume command in the ceph-config role before ceph-osd. If lvm2 wasn't installed then the ceph-volume command fails: error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or directory This wasn't visible before because lvm2 was automatically installed as docker dependency but it's not the same for podman on CentOS 8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `de8f2a9f83`)	2020-01-14 12:47:55 -05:00
Guillaume Abrioux	a81830ddc0	osd: use _devices fact in lvm batch scenario since `fd1718f379`, we must use `_devices` when deploying with lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5558664f37`)	2020-01-14 15:33:15 +01:00
Guillaume Abrioux	ffdfa634ac	osd: do not run openstack_config during upgrade There is no need to run this part of the playbook when upgrading the cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `af6875706a`)	2020-01-14 09:12:34 -05:00
Guillaume Abrioux	2d85fab02d	osd: support scaling up using --limit This commit leaves add-osd.yml in place but marks the playbook as deprecated. Scaling up OSDs is now possible using --limit. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3496a0efa2`)	2020-01-14 09:12:34 -05:00
Dimitri Savineau	dc797971ce	ceph-facts: move grafana fact to dedicated file We don't need to execute the grafana fact every time but only during the dashboard deployment. Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f940e695ab`)	2020-01-13 16:28:23 -05:00
Guillaume Abrioux	266c4c7763	facts: fix osp/ceph external use case `d6da508a9b` broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2592a1e1e8`)	2020-01-13 21:07:01 +01:00
Guillaume Abrioux	532abbb9b2	defaults: change monitor\|radosgw_address default values To avoid confusion, let's change the default value from `0.0.0.0` to `x.x.x.x`. Users might think setting `0.0.0.0` will make the daemon binding on all interfaces. Fixes: #4827 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc02fc98eb`)	2020-01-13 14:55:23 -05:00
Guillaume Abrioux	9ed540da7e	osd: ensure osd ids collected are well restarted This commit refactors the condition in the loop of that task so all potential osd ids found are well started. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `58e6bfed2d`)	2020-01-13 14:31:03 -05:00
Dimitri Savineau	bc0f16f270	ceph-validate: add rbdmirror validation When ceph_rbd_mirror_configure is set to true we need to ensure that the required variables aren't empty. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4a065cebd7`)	2020-01-10 11:11:37 -05:00
Dimitri Savineau	f2e1941ef1	ceph-osd: wait for all osds once `cf8c6a3` moves the 'wait for all osds' task from openstack_config to the main tasks list. But the openstack_config code was executed only on the last OSD node. We don't need to do this check on all OSD nodes, so we need to set run_once to true on that task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5bd1cf40eb`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	9fa5b296ca	ceph-osd: wait for all osd before crush rules When creating crush rules with device class parameter we need to be sure that all OSDs are up and running because the device class list is populated with this information. This is now enabled for all scenarios, not only openstack_config. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf8c6a3849`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	bd016960cf	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ef2cb99f73`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	661b2c013a	move crush rule creation from mon to osd role If we want to create crush rules with the create-replicated sub command and device class then we need to have the OSD created before the crush rules otherwise the device classes won't exist. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ed36a11eab`)	2020-01-10 11:07:25 -05:00
Guillaume Abrioux	d6921f798d	config: exclude ceph-disk prepared osds in lvm batch report We must exclude the devices already used and prepared by ceph-disk when doing the lvm batch report. Otherwise it fails because ceph-volume complains about GPT header. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fd1718f379`)	2020-01-09 20:15:27 -05:00


2614 Commits (fbc375387a3087e76a31488a6d753dd1d16c21bc)