ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	d117d08a6f	ceph-container-engine: add CentOS 8 support This adds CentOS 8 support for containerized deployment allowing podman installation as the default container engine for this distribution. Closes: #5130 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-23 13:26:29 +02:00
abaird-rh	6878aab0f9	Updated use of deprecated filter This was removed in Ansible 2.9. [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result\|version_compare` use `result is version_compare`. This feature will be removed in version 2.9. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. Rename 'version_compare' to the function 'version'. version_compare was renamed to version in Ansible 2.5. Signed-off-by: abaird-rh <abaird@redhat.com> (cherry picked from commit `eb71244bfd`)	2020-04-20 13:37:42 -04:00
Guillaume Abrioux	0ace5f5f2c	mds: fix --limit run against mds nodes This commit fixes --limit runs against mds nodes. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `378405e328`)	2020-04-14 13:42:45 -04:00
Guillaume Abrioux	945266776b	nfs: create empty rados index object for nfs standalone This commit creates an empty rados index object even when deploying standalone nfs-ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ea2b654d95`)	2020-04-14 12:03:38 -04:00
Dimitri Savineau	1033ad191b	ceph-validate: update RHEL requirement for RHCS We were not testing the right ansible_distribution fact value for RHEL distribution. This commit also updates the minimal RHEL version supported by RHCS. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5de74fe512`)	2020-04-14 11:27:01 -04:00
Guillaume Abrioux	1b79d73729	osd: fix monitor_name error when scaling out OSDs This commit fixes a bug when trying to scale out osd nodes when `crush_rule_config` is enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4bcc52cb2a`)	2020-04-10 13:44:15 +02:00
Dimitri Savineau	6a2272b9c0	ceph-mgr: add saml python lib for dashboard SSO The dashboard SSO mgr module requires the saml python library to be installed. This is only a valid scenario for RHCS deployment because the saml python library isn't available in other classic repositories. This package is present in RHCS Tools repository so we also need to enable it on the mgr nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6617d90733`)	2020-04-06 11:00:01 -04:00
Guillaume Abrioux	7acd9686ab	osd: use default crush rule name when needed When `rule_name` isn't set in `crush_rules` the osd pool creation will fail. This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with the default crush rule name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1bb9860dfd`)	2020-03-31 19:42:40 -04:00
Guillaume Abrioux	03355aec8c	tests: add more coverage in external_clients scenario Run create_users_keys.yml in external_clients scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c1c34b201`)	2020-03-31 19:42:40 -04:00
Guillaume Abrioux	87e1f0cc6c	osd: support changing default rule even when osd_crush_location isn't defined Creating crush rules even with no crush hierarchy configuration is a valid scenario so we shouldn't be bound to the first task result (which configures the crush hierarchy) to be able to add new crush rules. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5b0476385c`)	2020-03-31 23:14:55 +02:00
Dimitri Savineau	1318575626	ceph-defaults: update container tag for nautilus The latest Ceph stable release is now Octopus so the "latest" container image tag is pointing to Octopus and not Nautilus anymore. This commit updates the ceph_docker_image_tag with "latest-nautilus". Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-30 09:55:05 +02:00
Dimitri Savineau	62042f370a	rhcs: drop debian support Support for debian with RHCS has been dropped starting RHCS 4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4ac99223b2`)	2020-03-27 10:22:43 -04:00
Dimitri Savineau	94490b6e16	ceph-defaults: update ganesha to 2.8 With Ceph Nautilus release we should use nfs-ganesha 2.8 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-26 13:32:29 -04:00
Guillaume Abrioux	2970e28d66	defaults: remove legacy comment This is no longer true, let's remove this comment given that this option is not ignored in containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e551b5ba1a`)	2020-03-26 12:08:23 -04:00
Dimitri Savineau	98f223c4d0	ceph-facts: fix rgw_instances_all fact The rgw_instances_all fact is supposed to be the list of all radosgw instances from all rgw nodes. But the fact is always using the local rgw_instances variable so this won't work on multiple nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0487d21938`)	2020-03-25 08:41:28 +01:00
Guillaume Abrioux	d22641d37b	nfs: fix nfs with external ceph cluster support This commit refactors and fixes the nfs deployment with external ceph cluster support. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cc28d9ec26`)	2020-03-19 21:39:56 -04:00
Dimitri Savineau	55c222d088	dashboard: allow to set read-only admin user This commit allows one to set the role for the admin user as read-only. This can be controlled via the dashboard_admin_user_ro variable but the default value is false for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fb69f6990c`)	2020-03-19 13:24:05 -04:00
Dimitri Savineau	bf9d628b65	ceph-defaults: update grafana container tag Since `8e8aa73` we're using grafana 5.4.3 in RHCS 4.1 via [1]. We should also update the grafana container tag from docker.io when using the community release. [1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b97a4d5201`)	2020-03-17 15:28:00 -04:00
petruha	f2a50c19dc	ceph-facts: Fix system_secret_key variable handling This commit fixes the system_secret_key variable not being substituted with the right value and always using the 'system_secret_key' string instead. $ egrep 'system_(access\|secret)_key' group_vars/all.yml system_access_key: foofoofoofoofoofoofo system_secret_key: barbarbarbarbarbarbarbarbarbarbarbarbarb $ ansible-playbook -vv -i hosts site.yml.sample -e rgw_multisite=true (...) - hostname: storage0 endpoint: http://192.168.100.42:8080 instance_name: rgw0 radosgw_address: 192.168.50.3 radosgw_frontend_port: 8085 rgw_realm: canada rgw_zone: montreal rgw_zone_user: justin.trudeau rgw_zone_user_display_name: Justin Trudeau rgw_zonegroup: quebec system_access_key: foofoofoofoofoofoofo system_secret_key: system_secret_key Fixes https://github.com/ceph/ceph-ansible/issues/5150 Signed-off-by: petruha <5363545+p37ruh4@users.noreply.github.com> (cherry picked from commit `73b3fadb0e`)	2020-03-16 17:59:57 -04:00
Guillaume Abrioux	d4c0d3ee66	config: remove legacy option in ceph.conf.j2 This option has been deprecated (As of 0.51). By the way, ceph-ansible already sets the auth_{service,client,cluster}_required variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `152c2caa9f`)	2020-03-16 12:04:48 -04:00
Dimitri Savineau	dc4993cc92	handler: add rgw multi-instances support This commit adds the rgw multi-instances support in ceph-handler (restart_rgw_daemons.sh.j2) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3626c688cf`)	2020-03-12 19:04:26 -04:00
Guillaume Abrioux	c26e80fdbf	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60a2e28189`)	2020-03-12 19:04:26 -04:00
Dimitri Savineau	d248f6bf8d	ceph-infra: open radosgw ports for multi instances When using the radosgw multi instances configuration then the firewall rules aren't adapted to that setup. We only open the port according to the radosgw_frontend_port variable so only the first radosgw instance port will be opened in the firewall configuration. We should instead iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e8bf0a0cf2`)	2020-03-12 19:04:26 -04:00
Guillaume Abrioux	bf0a6835a2	rgw: fix a typo in create_realm_zonegroup_zone_lists This commit fixes a typo. `s/realms/secondary_realms` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b3bbd6bb77`)	2020-03-12 16:58:09 -04:00
Guillaume Abrioux	03416c0d4e	infra: add retries/until on firewalld start task This commit makes that task retry 5 times to start the firewalld service to avoid failures like the following: ``` TASK [ceph-infra : start firewalld] ****************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-centos-container-purge/roles/ceph-infra/tasks/configure_firewall.yml:22 Monday 09 March 2020 08:58:48 +0000 (0:00:00.963) 0:02:16.457 ******** fatal: [osd4]: FAILED! => changed=false msg: \|- Unable to enable service firewalld: Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service. Created symlink from /etc/systemd/system/multi-user.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service. Failed to execute operation: Connection reset by peer ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b3d943fe9f`)	2020-03-12 21:48:20 +01:00
Guillaume Abrioux	46a13664b2	rgw: add retry/until on pools tasks Sometimes, these tasks can time out for some reason. Adding these retries can help to avoid unexpected failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7a8a719e75`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	9a574562e2	client: skip create_users_keys.yml when rolling_update There's no need to run this part of the role when upgrading client nodes. Let's skip it when rolling_update.yml is being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eac207091b`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	1a978ae545	osd: do not change pool size on erasure pool This commit adds a condition so that we don't try to customize the pool size when its type is erasure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e17c79b871`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	98783a17b3	osd: add pg autoscaler support This commit adds the pg autoscaler support. The structure for pool definition now has two additional attributes `pg_autoscale_mode` and `target_size_ratio`, e.g.: ``` test: name: "test" pg_num: "{{ osd_pool_default_pg_num }}" pgp_num: "{{ osd_pool_default_pg_num }}" rule_name: "replicated_rule" application: "rbd" type: 1 erasure_profile: "" expected_num_objects: "" size: "{{ osd_pool_default_size }}" min_size: "{{ osd_pool_default_min_size }}" pg_autoscale_mode: False target_size_ratio: 0.1 ``` When `pg_autoscale_mode` is `True`, the user has to set a decent value in `target_size_ratio`. Given that it's a new feature, it's still disabled by default. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1782253 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `47adc2bb08`)	2020-03-06 16:10:03 +01:00
Guillaume Abrioux	ae06d684b8	osd: refact osd pool creation Currently, the command executed is wrong, eg: ``` cmd: - podman - exec - ceph-mon-controller-0 - ceph - --cluster - ceph - osd - pool - create - volumes - '32' - '32' - replicated_rule - '1' delta: '0:00:01.625525' end: '2020-02-27 16:41:05.232705' item: ``` From documentation, the osd pool creation command is : ``` ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \ [crush-rule-name] [expected-num-objects] ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \ [erasure-code-profile] [crush-rule-name] [expected_num_objects] ``` it means we pass '1' (from item.type) as value for `expected_num_objects` by default which is very likely not what we want. Also, this commit modifies the default value when no `rule_name` is set to use the existing variable `osd_pool_default_crush_rule` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808495 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bf1f125d71`)	2020-03-06 16:10:03 +01:00
Ali Maredia	2c440d4427	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove the multisite destroy playbook and add --cluster before radosgw-admin commands. Remove the manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in those rgws' hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `71f55bd54d`)	2020-03-04 14:39:23 -05:00
Dimitri Savineau	e037e99bd2	purge: stop rgw instances by iteration It looks like the service module doesn't support wildcards anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found '''' in service name: ceph-radosgw@' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9d3b49293d`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	eb2fba79fc	ceph-infra: install firewalld python bindings When using the firewalld ansible module we need to be sure that the python bindings are installed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `90b1fc8fe9`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	424a0ce4ab	ceph-infra: split firewalld tasks Since ansible 2.9 the firewalld task can no longer be used with service and source at the same time. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `45fb9241c0`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	9d4f90c8b4	Add ansible 2.9 support This commit adds ansible 2.9 support in addition to 2.8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `aefba82a2e`)	2020-03-03 10:31:48 +01:00
Dimitri Savineau	8cc2f8f21e	ceph-validate: start with ansible version test It doesn't make sense to start validating configuration if the ansible version isn't the right one. This commit moves the check_system task to be the first task in the ceph-validate role. The ansible version test tasks are moved to the top of this file. Also moving the iscsi kernel tests from check_system to check_iscsi file. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1a77dd7e91`)	2020-03-03 10:31:48 +01:00
Guillaume Abrioux	5a51bd12dc	common: support OSDs with more than 2 digits When running environment with OSDs having ID with more than 2 digits, some tasks don't match the system units and therefore, playbook can fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1805643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a084a2a347`)	2020-02-28 11:06:47 -05:00
Francesco Pantano	f5e2a69134	Configure ceph dashboard backend and dashboard_frontend_vip This change introduces a new set of tasks to configure the ceph dashboard backend and listen just on the mgr related subnet (and not on '*'). For the same reason the proper server address is added in both prometheus and alertmanager systemd units. This patch also adds the "dashboard_frontend_vip" parameter to make sure we're able to support the HA model when multiple grafana instances are deployed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792230 Signed-off-by: Francesco Pantano <fpantano@redhat.com> (cherry picked from commit `15ed9eebf1`)	2020-02-24 16:50:19 -05:00
Dimitri Savineau	ca2003fbcc	ceph-rgw: increase connection timeout to 10 5s as a connection timeout could be low in some setup. Let's increase it to 10s. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `44e750ee5d`)	2020-02-24 14:41:19 -05:00
Dimitri Savineau	3617543517	containers: add KillMode=none to systemd templates Because we are relying on docker\|podman for managing containers then we don't need systemd to manage the process (like kill). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5a03e0ee1c`)	2020-02-18 12:10:35 -05:00
Ali Maredia	7d2a217270	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1834c1e48d`)	2020-02-17 17:44:53 -05:00
Florian Faltermeier	17b405eb10	ceph-rgw-loadbalancer: Fix SSL newline issue The `ad7a5da` commit introduced a regression when using TLS on haproxy via the haproxy_frontend_ssl_certificate variable. This causes the "stats socket" and the "tune.ssl.default-dh-param" parameters to be on the same line, resulting in haproxy failing to start. [ALERT] 351/140240 (21388) : parsing [xxxxx] : 'stats socket' : unknown keyword 'tune.ssl.default-dh-param'. Registered [ALERT] 351/140240 (21388) : Fatal errors found in configuration. Fixes: #4869 Signed-off-by: Florian Faltermeier <florian.faltermeier@uibk.ac.at> (cherry picked from commit `9d081e2453`)	2020-02-17 11:36:09 -05:00
Dimitri Savineau	df43f32248	ceph-defaults: remove bootstrap_dirs_xxx vars Both bootstrap_dirs_owner and bootstrap_dirs_group variables aren't used anymore in the code. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c644ea9041`)	2020-02-17 11:15:33 -05:00
Dimitri Savineau	0deb5b0706	rgw: don't create user on secondary zones The rgw user creation for the Ceph dashboard integration shouldn't be created on secondary rgw zones. Closes: #4707 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794351 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `16e12bf2bb`)	2020-02-17 16:43:56 +01:00
Dimitri Savineau	5f7778d59a	ceph-{mon,osd}: move default crush variables Since `ed36a11` we moved the crush rules creation code from the ceph-mon to the ceph-osd role. To keep the backward compatibility we kept the possibility to set the crush variables on the mons side but we didn't move the default values. As a result, when crush_rule_config is set to true and the default values for crush_rules are expected, the crush rule creation task will fail. "msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'crush_rules'" This patch moves the default crush variables from ceph-mon to ceph-osd role but also uses those default values when nothing is defined on the mons side. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798864 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1fc6b33714`)	2020-02-17 10:18:56 -05:00
Dimitri Savineau	f78981bbf7	ceph-grafana: fix grafana_{crt,key} condition The grafana_{crt,key} aren't boolean variables but strings. The default value is an empty string so we should do the conditional on the string length instead of the bool filter Closes: #5053 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `15bd4cd189`)	2020-02-17 10:18:39 -05:00
Dimitri Savineau	a9a533e398	ceph-prometheus: add alertmanager HA config When using multiple alertmanager nodes (via the grafana-server group) then we need to specify the other peers in the configuration. https://prometheus.io/docs/alerting/alertmanager/#high-availability https://github.com/prometheus/alertmanager#high-availability Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792225 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b9d975385c`)	2020-02-17 16:18:20 +01:00
John Fulton	9c97179fc1	The _filtered_clients list should intersect with ansible_play_batch Client configuration with --limit fails without this patch because certain tasks are only done to the first host in the _filtered_clients list and it's likely that first host will not be included in what's specified with --limit. To fix this the _filtered_clients list should be built from all clients in the inventory that are also in the running play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798781 Signed-off-by: John Fulton <fulton@redhat.com> (cherry picked from commit `e4bf4857f5`)	2020-02-17 10:03:57 -05:00
Dimitri Savineau	b01b255414	ceph-nfs: add nfs-ganesha-rados-urls package Since nfs-ganesha 2.8.3 the rados-urls library has been moved to a dedicated package. We don't have the same nfs-ganesha 2.8.x between the community and rhcs repositories. community: 2.8.1 rhcs: 2.8.3 As a workaround we will install that package only for rhcs setup. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0a3e85e8ca`)	2020-02-17 10:00:44 -05:00
Dimitri Savineau	6864d04fdf	ceph-nfs: fix ceph_nfs_ceph_user variable The ceph_nfs_ceph_user variable is a string for the ceph-nfs role but a list in ceph-client role. `6a6785b` introduced a confusion between both variable type in the ceph-nfs role for external ceph with ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1801319 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `10951eeea8`)	2020-02-17 15:27:30 +01:00
Dimitri Savineau	e4e1b386b0	dashboard: allow configuring multiple grafana host When using multiple grafana hosts then we set the grafana and prometheus URL and push the dashboard layout to a single node. grafana_server_addrs is the list of all grafana nodes and used during the ceph-dashboard role (on mgr/mon nodes). grafana_server_addr is the current grafana node used during the ceph-grafana and ceph-prometheus role (on grafana-server nodes). We don't have the grafana_server_addr fact duplication code between external vs collocated nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1784011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c6e96699f7`)	2020-02-12 19:56:31 -05:00
Guillaume Abrioux	1d2a395aaf	switch_to_containers: increase health check values This commit increases the default values for the following variable consumed in switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. This also moves these variables in `ceph-defaults` role so the user can set different values if needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783223 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3700aa5385`)	2020-02-10 12:57:17 -05:00
Stanley Lam	0336a1476f	Add option for HAproxy to act as an SSL frontend termination point for loadbalanced RGW instances. Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com> (cherry picked from commit `ad7a5dad3f`)	2020-02-03 09:32:43 -05:00
Dimitri Savineau	0dbca448d1	ceph-handler: Use /proc/net/unix for rgw socket If for some reason, there's an old rgw socket file present in the /var/run/ceph/ directory then the test command could fail with test: xxxxxxxxx.asok: binary operator expected $ ls -hl /var/run/ceph/ total 0 srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok We can check the radosgw socket in /proc/net/unix to avoid using wildcard in the socket name. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `60cbfdc2a6`)	2020-02-03 09:31:52 -05:00
Dimitri Savineau	460d3557d7	ceph-container-engine: lvm2 on OSD nodes only Since `de8f2a9` the lvm2 package installation has been moved from ceph-osd role to ceph-container-engine role. But the scope wasn't limited to the OSD nodes only. This commit fixes this behaviour. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fa8aa8c864`)	2020-02-03 15:16:32 +01:00
Dimitri Savineau	80f1b0feb0	ceph-common: rhcs 4 repositories for rhel 7 RHCS 4 is available for both RHEL 7 and 8 so we should also enable the cdn repositories for that distribution. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796853 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9b40a959b9`)	2020-02-03 15:15:35 +01:00
Mike Christie	76753e64f9	iscsi: Fix crashes during rolling update During a rolling update we will run the ceph iscsigw tasks that start the daemons then run the configure_iscsi.yml tasks which can create iscsi objects like targets, disks, clients, etc. The problem is that once the daemons are started they will accept configuration requests, or may want to update the system themselves. Those operations can then conflict with the configure_iscsi.yml tasks that set up objects and we can end up in crashes due to the kernel being in an unsupported state. This could also happen during creation, but is less likely due to no objects being set up yet, so there are no watchers or users accessing the gws yet. The fix in this patch works for both update and initial setup. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1795806 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `77f3b5d51b`)	2020-02-03 15:15:15 +01:00
Guillaume Abrioux	1b33c5358e	config: fix external client scenario When no monitor group is present in the inventory, this task fails. This affects only non-containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e7bc079405`)	2020-01-31 13:37:10 +01:00
Dimitri Savineau	3daea719b6	ceph-defaults: remove rgw from ceph_conf_overrides The [rgw] section in the ceph.conf file or via the ceph_conf_overrides variable doesn't exist and has no effect. To apply overrides to all radosgw instances we should use either the [global] or [client] sections. Overrides per radosgw instance should still use the [client.rgw.{instance-name}] section. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794552 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2f07b85131`)	2020-01-29 14:34:34 +01:00
Guillaume Abrioux	bc6777c6df	dashboard: add quotes when passing password to the CLI Otherwise, if the variable contains a '$' it will be interpreted as a BASH variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c3759f8ce`)	2020-01-29 14:15:41 +01:00
Guillaume Abrioux	8a907cb1ca	validate: fail if dashboard\|grafana_admin_password aren't set This commit adds a task to make sure the user sets a custom password for `grafana_admin_password` and `dashboard_admin_password` variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795509 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `99328545de`)	2020-01-29 14:15:41 +01:00
Dimitri Savineau	9da917501b	ceph-facts: fix _container_exec_cmd fact value When using different name between the inventory_hostname and the ansible_hostname then the _container_exec_cmd fact will get a wrong value based on the inventory_hostname instead of the ansible_hostname. This happens when the ceph cluster is already running (update/upgrade). Later the container exec commands will fail because the container name is wrong. We should always set the _container_exec_cmd based on the ansible_hostname fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1fcafffdad`)	2020-01-29 11:48:44 +01:00
Guillaume Abrioux	0d2af6ebf3	fix calls to `container_exec_cmd` in ceph-osd role We must call `container_exec_cmd` from the right monitor node otherwise the value of the fact might mismatch between the delegated node and the node being played. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794900 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f919f8971`)	2020-01-27 17:54:39 -05:00
Guillaume Abrioux	9fb69e13ed	handler: read container_exec_cmd value from first mon Given that we delegate to the first monitor, we must read the value of `container_exec_cmd` from this node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eb9112d8fb`)	2020-01-23 18:34:14 +01:00
Vytenis Sabaliauskas	4152a1a862	ceph-facts: Fix for 'running_mon is undefined' error, so that fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0' Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com> (cherry picked from commit `ed1eaa1f38`)	2020-01-23 11:24:24 -05:00
Dimitri Savineau	6a51330892	ceph-osd: set container objectstore env variables Because we need to manage legacy ceph-disk based OSD with ceph-volume then we need a way to know the osd_objectstore in the container. This was done like this previously with ceph-disk so we should also do it with ceph-volume. Note that this won't have any impact for ceph-volume lvm based OSD. Rename docker_env_args fact to container_env_args and move the container condition on the include_tasks call. Remove OSD_DMCRYPT env variable from the ceph-osd template because it's now included in the container_env_args variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792122 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c9e1fe3d92`)	2020-01-20 15:36:11 -05:00
Benoît Knecht	ff2a2bb870	ceph-rgw: Fix customize pool size "when" condition In `3c31b19ab3`, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `3842aa1a30`)	2020-01-20 12:48:19 -05:00
Guillaume Abrioux	1462423059	handler: fix call to container_exec_cmd in handler_osds When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member) Also, adding a `run_once: true` because this task needs to be run only 1 time. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `22865cde9c`)	2020-01-20 12:45:51 -05:00
Dmitriy Rabotyagov	8d311a537d	Fix undefined running_mon Since commit [1] introduced running_mon, it can be undefined, which results in a fatal error [2]. This patch defines the default value which was used before patch [1] Signed-off-by: Dmitriy Rabotyagov <drabotyagov@vexxhost.com> [1] `8dcbcecd71` [2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011 (cherry picked from commit `2478a7b948`)	2020-01-16 18:28:12 -05:00
Guillaume Abrioux	cae24dd85a	remove container_exec_cmd_mgr fact Iterating over all monitors in order to delegate a ` {{ container_binary }}` fails when collocating mgrs with mons, because ceph-facts reset `container_exec_cmd` to point to the first member of the monitor group. The idea is to force `container_exec_cmd` to be reset in ceph-mgr. This commit also removes the `container_exec_cmd_mgr` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8dcbcecd71`)	2020-01-15 21:10:54 +01:00
Dimitri Savineau	09a71e4a8c	ceph-iscsi: don't use bracket with trusted_ip_list The trusted_ip_list parameter for the rbd-target-api service doesn't support ipv6 address with bracket. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bd87d69183`)	2020-01-14 12:48:04 -05:00
Dimitri Savineau	ff3a3ee5e9	container: move lvm2 package installation Before this patch, the lvm2 package installation was done during the ceph-osd role. However we were running ceph-volume command in the ceph-config role before ceph-osd. If lvm2 wasn't installed then the ceph-volume command fails: error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or directory This wasn't visible before because lvm2 was automatically installed as docker dependency but it's not the same for podman on CentOS 8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `de8f2a9f83`)	2020-01-14 12:47:55 -05:00
Guillaume Abrioux	a81830ddc0	osd: use _devices fact in lvm batch scenario since `fd1718f379`, we must use `_devices` when deploying with lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5558664f37`)	2020-01-14 15:33:15 +01:00
Guillaume Abrioux	ffdfa634ac	osd: do not run openstack_config during upgrade There is no need to run this part of the playbook when upgrading the cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `af6875706a`)	2020-01-14 09:12:34 -05:00
Guillaume Abrioux	2d85fab02d	osd: support scaling up using --limit This commit leaves add-osd.yml in place but marks the playbook as deprecated. Scaling up OSDs is now possible using --limit. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3496a0efa2`)	2020-01-14 09:12:34 -05:00
Dimitri Savineau	dc797971ce	ceph-facts: move grafana fact to dedicated file We don't need to execute the grafana fact every time but only during the dashboard deployment, specifically for the ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f940e695ab`)	2020-01-13 16:28:23 -05:00
Guillaume Abrioux	266c4c7763	facts: fix osp/ceph external use case `d6da508a9b` broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2592a1e1e8`)	2020-01-13 21:07:01 +01:00
Guillaume Abrioux	532abbb9b2	defaults: change monitor\|radosgw_address default values To avoid confusion, let's change the default value from `0.0.0.0` to `x.x.x.x`. Users might think setting `0.0.0.0` will make the daemon binding on all interfaces. Fixes: #4827 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fc02fc98eb`)	2020-01-13 14:55:23 -05:00
Guillaume Abrioux	9ed540da7e	osd: ensure osd ids collected are well restarted This commit refact the condition in the loop of that task so all potential osd ids found are well started. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `58e6bfed2d`)	2020-01-13 14:31:03 -05:00
Dimitri Savineau	bc0f16f270	ceph-validate: add rbdmirror validation When ceph_rbd_mirror_configure is set to true we need to ensure that the required variables aren't empty. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4a065cebd7`)	2020-01-10 11:11:37 -05:00
Dimitri Savineau	f2e1941ef1	ceph-osd: wait for all osds once `cf8c6a3` moves the 'wait for all osds' task from openstack_config to the main tasks list. But the openstack_config code was executed only on the last OSD node. We don't need to do this check on all OSD nodes so we need to set run_once to true on that task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5bd1cf40eb`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	9fa5b296ca	ceph-osd: wait for all osd before crush rules When creating crush rules with device class parameter we need to be sure that all OSDs are up and running because the device class list is populated with this information. This is now enabled for all scenarios, not only openstack_config. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cf8c6a3849`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	bd016960cf	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ef2cb99f73`)	2020-01-10 11:07:25 -05:00
Dimitri Savineau	661b2c013a	move crush rule creation from mon to osd role If we want to create crush rules with the create-replicated sub command and device class then we need to have the OSD created before the crush rules otherwise the device classes won't exist. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ed36a11eab`)	2020-01-10 11:07:25 -05:00
Guillaume Abrioux	d6921f798d	config: exclude ceph-disk prepared osds in lvm batch report We must exclude the devices already used and prepared by ceph-disk when doing the lvm batch report. Otherwise it fails because ceph-volume complains about GPT header. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fd1718f379`)	2020-01-09 20:15:27 -05:00
Guillaume Abrioux	d6da508a9b	mon: support replacing a mon We must pick up a mon which actually exists in ceph-facts in order to detect if a cluster is running. Otherwise, it will state no cluster is already running which will end up deploying a new monitor isolated in a new quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `86f3eeb717`)	2020-01-09 15:02:03 -05:00
Dimitri Savineau	a3c2259bde	ceph-iscsi: manage ipv6 in trusted_ip_list Only the ipv4 addresses from the nodes running the dashboard mgr module were added to the trusted_ip_list configuration file on the iscsigws nodes. This also add the iscsi gateways with ipv6 configuration to the ceph dashboard. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `70eba66182`)	2020-01-08 21:29:02 -05:00
Benoît Knecht	0cb1235962	ceph-rgw: Fix custom pool size setting RadosGW pools can be created by setting ```yaml rgw_create_pools: .rgw.root: pg_num: 512 size: 2 ``` for instance. However, doing so would create pools of size `osd_pool_default_size` regardless of the `size` value. This was due to the fact that the Ansible task used ``` {{ item.size \| default(osd_pool_default_size) }} ``` as the pool size value, but `item.size` is always undefined; the correct variable is `item.value.size`. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit `3c31b19ab3`)	2020-01-08 21:28:03 -05:00
Guillaume Abrioux	e001ded6f6	handler: fix bug `411bd07d54` introduced a bug in handlers: using `handler_*_status` instead of `hostvars[item]['handler_*_status']` causes handlers to be triggered in any case even though `handler_*_status` was set to `False` on a specific node. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `30200802d9`)	2020-01-08 19:46:11 -05:00
Dimitri Savineau	56f1b232a5	ceph-nfs: add ganesha_t type to selinux Since RHEL 8.1 we need to add the ganesha_t type to the permissive SELinux list. Otherwise the nfs-ganesha service won't start. This was done on RHEL 7 previously and part of the nfs-ganesha-selinux package on RHEL 8. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786110 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d758125290`)	2020-01-08 16:23:41 -05:00
Dimitri Savineau	701ade88c3	ceph-defaults: exclude rbd devices from discovery The RBD devices aren't excluded from the devices list in the LVM auto discovery scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783908 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `6f0556f015`)	2020-01-08 16:15:26 -05:00
Dimitri Savineau	af3c1b4c1a	ceph-infra: replace hardcoded grafana group name The grafana-server group name was hardcoded for the grafana/prometheus firewalld tasks condition. We should use the associated variable: grafana_server_group_name Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2c06678cde`)	2020-01-08 16:15:09 -05:00
Dimitri Savineau	27530b1d3f	ceph-infra: move dashboard into a dedicated file Instead of using multiple dashboard_enabled condition in the configure_firewall file we could just have the condition once and include the dedicated tasks list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f4c261ef90`)	2020-01-08 16:15:09 -05:00
Dimitri Savineau	43ffcd7d28	ceph-infra: open dashboard port on monitor When there's no mgr group defined in the ansible inventory then the mgrs are deployed implicitly on the mon nodes. If the dashboard is enabled then we need to open the dashboard port on the node that is running the ceph mgr process (mgr or mon). The current code only allows opening that port on the mgr nodes when they are present explicitly in the inventory but not implicitly. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783520 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4535985188`)	2020-01-08 16:15:09 -05:00
Guillaume Abrioux	3caba2c31c	dashboard: use fqdn in external url Force fqdn to be used in external url for prometheus and alertmanager. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `498bc45859`)	2020-01-08 16:14:33 -05:00
Guillaume Abrioux	4cf5c08cd8	facts: use correct python interpreter that task is delegated on the first mon so we should always use the `discovered_interpreter_python` from that node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5adb735c78`)	2020-01-08 11:18:45 -05:00
Guillaume Abrioux	ba4787817e	defaults: change default value for dashboard_admin_password A recent change in ceph/ceph prevents having the username in the password: `Error EINVAL: Password cannot contain username.` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0756fa467d`)	2019-12-11 08:48:34 -05:00
Guillaume Abrioux	00bdb60663	defaults: add a comment This commit isolates and adds an explicit comment about variables not intended to be modified by the user. Fixes: #4828 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a234338eff`)	2019-12-10 13:45:18 -05:00
Guillaume Abrioux	6295a33912	dashboard: run node_export as privileged container Typical error: ``` type=AVC msg=audit(1575367499.582:3210): avc: denied { search } for pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0 ``` node_exporter needs to be run as privileged to avoid avc denied error since it gathers lot of information on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d245eb7e7d`)	2019-12-09 17:27:51 +01:00
Dimitri Savineau	0340929ed3	ceph-defaults: exclude md devices from discovery The md devices (RAID software) aren't excluded from the devices list in the auto discovery scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `014f51c2a4`)	2019-12-09 09:32:21 +01:00
Guillaume Abrioux	cfc10a8142	facts: avoid duplicated element in devices list When using `osd_auto_discovery`, `devices` is built multiple times due to multiple runs of `ceph-facts` role. It ends up with duplicate instances of the same device in the list. Using `unique` filter when building the list fixes this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `23b1f43897`)	2019-12-05 14:51:18 +01:00
Dimitri Savineau	f4e5f3ee9e	ceph-grafana: remove ipv6 brakets on wait_for The wait_for ansible module doesn't support the brackets on IPv6 addresses so we need to remove them. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `55adc10be3`)	2019-12-03 17:12:36 +01:00
Dimitri Savineau	a3bfed88c9	ceph-defaults: pin prometheus container tags In addition to the grafana container tag change, we need to do the same for the prometheus container stack based on the release present in the OSE 4.1 container image. $ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version node_exporter, version 0.17.0 build user: root@67fee13ed48f build date: 20191023-14:38:12 go version: go1.11.13 $ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version alertmanager, version 0.16.2 build user: root@70b79a3f29b6 build date: 20191023-14:57:30 go version: go1.11.13 $ docker run --rm openshift4/ose-prometheus:4.1 --version prometheus, version 2.7.2 build user: root@12da054778a3 build date: 20191023-14:39:36 go version: go1.11.13 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3e29b8d5ff`)	2019-12-03 16:19:16 +01:00
Guillaume Abrioux	6592caab08	facts: isolate container_binary facts in order to be able to call container_binary without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fe5ffe589e`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	1f30327688	purge: remove docker_* task All containers are removed when systemd stops them. There is no need to call this module in purge container playbook. This commit also removes all docker_image task and remove all container images in the final cleanup play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d23383a820`)	2019-12-03 09:57:11 -05:00
Guillaume Abrioux	ba2925df32	dashboard: use fqdn url for active alert When using the shortname, the URL for active alert launches with short hostname and fails to connect to the server. This commit changes the template in order to use the fqdn. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a8d76d72d7`)	2019-12-03 09:44:17 -05:00
Guillaume Abrioux	e8ed36fdae	dashboard: only print dashboard url of the grafana-server node This commit makes the ceph-dashboard role only printing ceph-dashboard URL of the nodes present in grafana-server group Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cc0c1ce301`)	2019-12-03 14:49:12 +01:00
Guillaume Abrioux	88d060f6e1	docker2podman: import ceph-handler role This is needed to avoid following error: ``` ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a43a872105`)	2019-12-03 10:44:48 +01:00
VasishtaShastry	4edef59feb	Fixes failure of cephfs configuration using --limit Configuration of cephfs with an existing cluster using --limit used to fail at different tasks while running with site-docker.yml This commit addresses both of those tasks Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `72c43cc5d9`)	2019-11-20 09:41:38 -05:00
VasishtaShastry	e54b6be74e	Evades validation of ceph_repository_type in containerized scenario This will prevent failure of site-docker.yml with configs in doc. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9a1f1626c3`)	2019-11-18 16:41:10 +01:00
Dimitri Savineau	3ebd71c8c2	ceph-osd: fix fs.aio-max-nr sysctl condition [1] introduced a regression on the fs.aio-max-nr sysctl value condition. The enable key isn't a boolean but a string because the expression isn't evaluated. This string output "(osd_objectstore == 'bluestore')" is always true because item.enable condition only matches non empty string. So the sysctl value was applied for both filestore and bluestore backend. [2] added the bool filter to the condition but the filter always returns false on string and the sysctl wasn't applied at all. This commit fixes the enable key value by evaluating the value instead of using the string. [1] https://github.com/ceph/ceph-ansible/commit/08a2b58 [2] https://github.com/ceph/ceph-ansible/commit/ab54fe2 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ece46d33be`)	2019-11-07 20:31:02 +01:00
Dimitri Savineau	0dcaec64ec	ceph-defaults: pin grafana container tag to 5.2.4 The latest grafana container tag is using grafana 6.x release which could cause issue with the ceph dashboard integration. Considering that the grafana container in RHCS 3 is based on 5.x then we should use the same version. $ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v Version 5.2.4 (commit: unknown-dev) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2037fb87b6`)	2019-10-31 19:10:04 -04:00
Dimitri Savineau	27eb40714c	ceph-osd: Remove ulimit nofile on container start Even if this improves ceph-disk/ceph-volume performance, it also impacts the ceph-osd process. The ceph-osd process shouldn't use 1024:4096 value for the max open files. Removing the ulimit option from the container engine and doing this kind of change on the container side [1]. [1] https://github.com/ceph/ceph-container/pull/1497 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9a996aef7f`)	2019-10-31 14:42:30 -04:00
fmount	20b4234ddc	Set grafana-server user and password in ceph-dashboard role This change adds two tasks to set grafana-api user and password that are required to inject dashboard layouts to the external grafana instance. Without these two parameters the ceph-ansible playbook fails showing an authorization error (HTTPError: 401 Client Error: Unauthorized"). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365 Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `41b8c17356`)	2019-10-31 11:43:54 -04:00
Mihai Plasoianu	6015a6ca40	ceph-mon: use --admin-daemon to set default crush rule Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de> (cherry picked from commit `d3f67d63ae`)	2019-10-29 22:26:53 -04:00
Dimitri Savineau	ffd05ca8df	defaults: add user/pass auth registry variables Add ceph_docker_registry_username and ceph_docker_registry_password variables in ceph-defaults role so they will be present in the group_vars samples but commented. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b33c476f16`)	2019-10-24 16:24:54 -04:00
Dimitri Savineau	b3ee07b242	dashboard: add ceph iscsi management When deploying with ceph-iscsi nodes and dashboard enabled, we need to add the ceph iscsi gateway endpoints to the dashboard configuration and add the mgr ip address in the trusted list in the iscsi gateway configuration file. Closes: #4638 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173 https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d050391cbb`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	567e90cd2e	ceph-iscsi: add ceph-iscsi stable repositories This commit adds the support of the ceph-iscsi stable repository when use ceph_repository community instead of always using the devel repositories. We're still using the devel repositories for rtslib and tcmu-runner in both cases (dev and community). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f2cb937193`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	e00bc17bd9	Revert "iscsigw: install python-requests" We don't need this since [1]. Also this was only working for python2 and not supporting python3. [1] https://github.com/ceph/ceph-iscsi/commit/00f198a This reverts commit `167737dd3d`. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fd8d47da98`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	4ff517e1ab	container/dashboard: run the registry auth task When deploying with packages then the ceph-container-common role isn't executed so the registry authentication task is ignored. Closes: #4636 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9ad000618f`)	2019-10-23 09:39:59 +02:00
Dimitri Savineau	c787cfcdff	travis: fail on ansible-lint errors If ansible-lint reports an error then it's skipped. We should fail in this case. This patch also fixes the pipefail lint in the rbd mirror role [306] Shells that use pipes should set the pipefail option Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3969470fca`)	2019-10-21 15:55:54 -04:00
Dimitri Savineau	6d5125f2a4	lint: fix error [303,602,701,702] [303] mktemp used in place of tempfile module [602] Don't compare to empty string [701] No 'galaxy_info' found [702] Use 'galaxy_tags' rather than 'categories' This patch also changes the ansible log_path value via the ANSIBLE_LOG_PATH environment variable in the travis configuration to avoid warnings. [WARNING]: log file at /home/travis/ansible/ansible.log is not writeable and we cannot create it, aborting Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f7fd0b6d4f`)	2019-10-21 15:55:54 -04:00
Guillaume Abrioux	4bf8cbe0c8	validate: fix credentials validation This task is failing when `ceph_docker_registry_auth` is enabled and `ceph_docker_registry_username` is undefined with an ansible error instead of the expected message. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `da4215e9c0`)	2019-10-21 15:55:35 -04:00
Guillaume Abrioux	541546a54a	common: do not override ceph_release when using custom repo Otherwise it fails like following: ``` TASK [ceph-mds : allow multimds] ************************************************************************************************************************************************ Monday 22 July 2019 16:37:38 +0800 (0:00:03.269) 0:13:25.651 ********* fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n ^ here\n"} ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4e9504c939`)	2019-10-17 20:10:47 -04:00
Guillaume Abrioux	18db9eb79e	nfs: remove unnecessary set_fact in main.yml this task is a leftover and no longer needed. It even causes bug when collocating nfs with mon. Closes: #4609 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b63bd13073`)	2019-10-16 14:01:46 -04:00
Mike Christie	7fbd76c93a	iscsi-gw: Fix rtslib installation When using python3 the name of the rtslib rpm is python3-rtslib. The packages that use rtslib already have code that detects the python version and distro deps, so drop it from the ceph iscsi gw task list and let the ceph-iscsi rpm dependency handle it. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `ba141298d7`)	2019-10-16 14:01:29 -04:00
Dimitri Savineau	35963194a7	rbd-mirror: fail if the peer is not added Due to the 'failed_when: false' statement present in the peer task, the playbook continues to run even if the peer task is failing (like incorrect remote peer format): "stderr": "rbd: invalid spec 'admin@cluster1'" This patch adds a task to list the peers present and adds the peer only if it's not already added. With this we don't need the failed_when statement anymore. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0b1e9c0737`)	2019-10-16 14:01:06 -04:00
Guillaume Abrioux	c962d87def	update: follow new recommandation to upgrade mds cluster Refact the mds cluster upgrade code in order to follow the documented recommendation. See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `71cebf80a6`)	2019-10-16 12:59:08 -04:00
Dimitri Savineau	86b7137b27	ceph-iscsi: notify rbd target services When the iscsi gateway or the ceph configuration file change then we need to notify the rbd target api/gw services to be restarted. This patch also merges the rbd-target-api and rbd-target-gw handler into a single file and listen. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bc701860d5`)	2019-10-16 11:34:15 -04:00
Guillaume Abrioux	50738ff5c0	mgr: do not copy all keyrings on all mgr There is no need to loop over all mgr nodes to set this fact, it's even breaking deployments because it tries to copy all mgr keyring on all mgr. Closes: #4602 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cb80231725`)	2019-10-16 06:45:33 +02:00
Dimitri Savineau	3313bc5c1f	ceph-handler: group listen topics and condition We are using multiple listen topics with the handlers. That means that we are notifying 4 tasks for each handler. Instead we can group the listen on an include_tasks and based on the group condition. Before: NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0 After: NOTIFIED HANDLER ceph-handler : mons handler for mon0 NOTIFIED HANDLER ceph-handler : osds handler for mon0 NOTIFIED HANDLER ceph-handler : mdss handler for mon0 NOTIFIED HANDLER ceph-handler : rgws handler for mon0 NOTIFIED HANDLER ceph-handler : mgrs handler for mon0 NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fe9c5b8c68`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	13f6a0a22a	handler: followup on #4519 This commit adds some missing `\| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	fd10fbc047	handlers: refact osd handler This commit merges the two restart tasks into a single one, this way it's one task less to notify. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `411bd07d54`)	2019-10-15 13:29:06 -04:00
Dimitri Savineau	8117ed34d4	Remove validate action and notario dependency The current ceph-validate role is using both validate action and fail module tasks to validate the ceph configuration. The validate action is based on the notario python library. When one of the notario validations fails then a python stack trace is reported to the ansible task. This output isn't understandable by users. This patch removes the validate action and the notario dependency. The validation is now done with only fail ansible module. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0f978d969b`)	2019-10-15 10:21:54 -04:00
Guillaume Abrioux	5568692340	mgr: improve mgr keyring creation Delegating on remote node isn't necessary here since we are already iterating over the right nodes. Closes: #4518 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `161170524d`)	2019-10-11 14:51:16 -04:00
Guillaume Abrioux	9c0547068e	validate: prevent from installing OSD on same disk as the OS This commit adds a validation task to prevent from installing an OSD on the same disk as the OS. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `80e2d00b16`)	2019-10-11 09:44:10 -04:00
Guillaume Abrioux	98467ddf01	common: do not reset `container_exec_cmd` This commit removes some legacy tasks. These tasks aren't needed, they cause the playbook to fail when collocating daemons. Closes: #4553 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `273413186a`)	2019-10-10 15:56:01 -04:00
Dimitri Savineau	eb51cc1bb1	dashboard: update layouts before the restart If the mgr dashboard doesn't restart fast enough then the inject dashboard task will fail with a HTTP error 400. Error EINVAL: Traceback (most recent call last): File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command return self.handle_command(inbuf, cmd) File "/usr/share/ceph/mgr/dashboard/module.py", line 450, in handle_command push_local_dashboards() File "/usr/share/ceph/mgr/dashboard/grafana.py", line 132, in push_local_dashboards retry() File "/usr/share/ceph/mgr/dashboard/grafana.py", line 89, in call result = self.func(*self.args, **self.kwargs) File "/usr/share/ceph/mgr/dashboard/grafana.py", line 127, in push grafana.push_dashboard(body) File "/usr/share/ceph/mgr/dashboard/grafana.py", line 54, in push_dashboard response.raise_for_status() File "/usr/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status raise HTTPError(http_error_msg, response=self) HTTPError: 400 Client Error: Bad Request Instead we can trigger this task before the module restart. Closes: #4565 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3f6ff240b7`)	2019-10-09 07:24:56 +00:00
Guillaume Abrioux	1d4d49695e	nfs: stop nfs server service in all context This commit moves this task in order to stop the nfs server service regardless the deployment type desired (containerized or non containerized). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6c6a512a72`)	2019-10-07 18:17:49 +02:00
Guillaume Abrioux	9a62d006bd	nfs: stop nfs server service The syntax here wasn't working; this refactor fixes the task. Also remove the `ignore_errors: true`, which was hiding the failure. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `47034effe0`)	2019-10-07 18:17:49 +02:00
Dimitri Savineau	d617626ef4	ceph-dashboard: remove rgw api host,port,scheme We don't need dedicated variables for the RGW integration into the Ceph Dashboard that have to be filled in manually. Instead we can use the current values from the RGW nodes by using the IP and port from the first RGW instance of the first RGW node via the radosgw_address and radosgw_frontend_port variables. We don't need to specify all RGW nodes, this will be done automatically with one node. The RGW api scheme is using the radosgw_frontend_ssl_certificate variable to determine if the value is http or https. This variable is also reused as a condition for the ssl verify task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b9e93ad7a6`)	2019-10-07 10:25:29 -04:00
Guillaume Abrioux	b325cc386e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fa9b42e98e`)	2019-10-07 10:18:17 -04:00
Dimitri Savineau	a210efe361	ceph-dashboard: Improve https configuration This patch moves the https dashboard configuration into a dedicated block to avoid multiple occurrences of the dashboard_protocol condition. It also fixes the dashboard certificate and key variables handling in the condition introduced by `ab54fe2`. Those variables aren't boolean but strings so we can test them via the length filter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `249764047b`)	2019-10-07 14:18:29 +02:00
Guillaume Abrioux	857c68087d	handler: followup on #4519 This commit adds some missing `\| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-07 09:09:36 +02:00
Dimitri Savineau	5bbd825ab2	ceph-dashboard: add cluster parameter to ceph cmd The ceph dashboard tasks didn't use the cluster option if the cluster name isn't the default value. Closes: #4529 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dd526cfe4e`)	2019-10-04 17:07:31 +00:00
Dimitri Savineau	8ec632c42c	ceph-handler: don't restart all OSDs with limit When using the ansible --limit option on one or few OSD nodes and if the handler is triggered then we will restart the OSD service on all OSDs nodes instead of the hosts limited by the limit value. Even if the play is limited by the --limit value we are using all OSD nodes from the OSD group. with_items: '{{ groups[osd_group_name] }}' Instead we should iterate only on the nodes present in both OSD group and limit list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0346871fb5`)	2019-10-04 07:42:58 +02:00
Dimitri Savineau	70267cb30b	ceph-facts: fix _radosgw_address with block `e695efc` introduced a regression in the _radosgw_address fact when using the radosgw_address_block variable. There's no item there because we don't use the items lookup. This is only used for _monitor_address with monitor_address_block. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `780cf36a59`)	2019-10-03 19:20:19 +00:00
Guillaume Abrioux	13ca0531d8	common: improve keyrings generation There is no need to get n * number of nodes the different keyrings. Adding a `run_once: true` here avoid running a ceph command too many times which could be impacting large cluster deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9bad239d77`)	2019-10-02 14:34:27 +02:00
Dimitri Savineau	5b24c66ff7	ceph-facts: use --admin-daemon to get fsid During the rolling_update scenario, the fsid value is retrieved from the current ceph cluster configuration via the ceph daemon config command. This command tries first to resolve the admin socket path via the ceph-conf command. Unfortunately this command won't work if you have a duplicate key in the ceph configuration even if it only produces a warning. As a result the task will fail. Can't get admin socket path: unable to get conf option admin_socket for mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined Instead of using ceph daemon we can use the --admin-daemon option because we already know the admin socket path value based on the ceph cluster and mon hostname values. Closes: #4492 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ec3b687dc4`)	2019-10-02 14:01:32 +02:00
Guillaume Abrioux	c958bc1ddf	validate: fix gpt header check Check for gpt header when osd scenario is lvm or lvm batch. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `272d16e101`)	2019-10-01 13:02:45 -04:00
Guillaume Abrioux	b998fb339e	rbdmirror: rename a file rename this file to be more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ed8616aa66`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	9a79ed1bf0	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e08194dd67`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	7f902994b3	rbdmirror: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c69816c6b7`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	d7a06c67db	iscsigw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4636f3f7e2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	df5337535d	container: isolate systemd tasks This commit isolates the systemd unit files generation for containers into separate yml files in order to be able importing each corresponding roles without playing all tasks. This is needed so we can run ceph-ansible to render systemd unit files so they call podman instead of docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bd64167469`)	2019-10-01 18:50:51 +02:00
Dimitri Savineau	7bb835240e	ceph-facts: update external grafana fact filter `e695efc` hasn't been updated with the changes introduced in `9bb11c7` so the ips_in_ranges filter isn't used for an external grafana instance. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `20b1a464ec`)	2019-10-01 12:28:34 -04:00
Boris Ranto	af9f93f07f	ceph-defaults: Change the default prometheus port The old default prometheus port 9090 clashes with cockpit in rhel 8. The 9090 port is reserved for web service administration of machines. We should change the default to something that does not clash with other ports used in rhel 8, at least by default. The port 9092 seems like a good choice in my testing. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `b96c6da832`)	2019-09-30 14:24:50 +02:00
Guillaume Abrioux	a3988887d2	Revert "ceph-common: install only necesarry ceph-* packages on debian" This reverts commit `58b27ef0b3`. This is breaking debian based OS deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e4444d29e0`)	2019-09-29 13:28:40 +00:00
Johannes Kastl	146f2e8de3	move python-xml to raw_install_python.yml The package python-xml is needed for ansible's zypper module to interact with the zypper package management tool. roles/ceph-defaults/defaults/main.yml: Remove python-xml from variable suse_package_dependencies to only install python-xml on SUSE/openSUSE if python is not found. raw_install_python.yml already contains all the logic needed to check if there is a valid python installation, so this is better suited there. openSUSE Leap 15.x / SLES 15.x do no longer have /usr/bin/python, only /usr/bin/python3, which already contains the xml module, so nothing needs to be installed in that case. Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `5cf22e9b31`)	2019-09-27 17:50:10 +02:00
Harald Jensås	5fea830414	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of CIDRs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `e695efcaf7`)	2019-09-27 17:49:46 +02:00
Dimitri Savineau	2d1372fe2a	ceph-nfs: Allow to configure SecType value Depending on the infrastructure (w/o kerberos auth), the SecType value could be different. Currently this value is hardcoded in the NFS Ganesha template. Instead we can use a variable. The default value is still the same to avoid breaking the backward compatibility. Closes: #4459 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ca77d7bd31`)	2019-09-27 15:38:52 +02:00
Dimitri Savineau	21e1650db6	ceph-dashboard: Add prometheus api host The set-prometheus-api-host ceph dashboard subcommand was missing in ceph-dashboard role. Only grafana and alertmanager were present. This commit also removes the trailing slash at the end of the host/url values. Closes: #4453 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `74ab59c4f3`)	2019-09-27 14:16:39 +02:00
Anthony Rusdi	3d2f9d2cde	ceph-common: install only necesarry ceph-* packages on debian Currently, ceph package only an meta-package that do not contain actual software, but simply depend on other packages. It's been few release since debian stretch (official), ubuntu bionic (official), ubuntu uca repository and upstream debian-jewel. As we only support nautilus and higher release for master branch, I propose to drop ceph package and use ceph-base instead for repository model other than rhcs so debian ceph install will be more minimalis. Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com> (cherry picked from commit `58b27ef0b3`)	2019-09-27 14:16:20 +02:00
liuxu	1acd062f22	dashboard: add grafana dashboard support on Debian based OS download grafana dashboard files from github when running on Debian based OS Signed-off-by: liuxu <liuxu623@gmail.com> (cherry picked from commit `195f70897c`)	2019-09-27 09:12:39 +02:00
fmount	43830515af	Inject ceph grafana dashboard layouts This change just adds the task to inject from the ceph dashboard mgr module the required layouts to show all the cluster metrics on the grafana instance. Since we're now able to push grafana layouts through the ceph mgr module command, the dashboards configuration template is no longer needed on containerized environments. This commit also fixes the Vagrantfile IP static assignment in the grafana section because it generates an issue (it's the same as the mgr instance). Finally, considering some deployments that use an external grafana server instance, we reworked the 'grafana_server_addr' assignment to address these requirements. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `9bb11c7b2a`)	2019-09-26 13:44:03 -04:00
Guillaume Abrioux	b16dfb1920	iscsigw: install python-requests Typical error at rbd-target-api startup: ``` Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: Traceback (most recent call last): Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/bin/rbd-target-api", line 39, in <module> Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: from gwcli.utils import (APIRequest, valid_gateway, valid_client, Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/lib/python2.7/site-packages/gwcli/utils.py", line 1, in <module> Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: import requests Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: ImportError: No module named requests ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `167737dd3d`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	b1e61be9c6	tests: set copy_admin_key at group_vars level setting it at extra vars level prevent from setting it per node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5bb6a4da42`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	e1d06f498c	global: remove fetch_directory dependency This commit drops the fetch_directory dependency. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ab370b6ad8`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	69ec26e045	osd: add wal_devices option support to ceph_volume module This commit adds the `wal_devices` option support to the ceph_volume module. passing a devices list in `bluestore_wal_devices` will make ceph-volume creating 1 vg using these devices to create block.wal partitions. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09e04a9197`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	a33791be25	osd: update doc text in defaults/main.yml This commit removes ceph-disk references. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `70f1b37097`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	d666e03b0c	osd: add block_db_devices option support to ceph_volume module This commit adds the `block_db_devices` option support to the ceph_volume module. passing a devices list in `dedicated_devices` will make ceph-volume creating 1 vg using these devices to create block.db partitions for data devices. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7b836eaa47`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	651cf13a74	validate: check ceph_docker_registry_* length This commit adds a condition to check whether these variables are empty. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2b97ac921b`)	2019-09-18 23:43:21 +02:00
Dimitri Savineau	9d3fbcf47e	container: Allow to use registry authentication The registry.redhat.io registry requires authentication so before pulling the RHCS 4 container images from the registry we need to do the login step. This is done via the new ceph_docker_registry_auth variable. The default value is false but true for RHCS setup. When set to true, you need to provide the username and password for the registry via the associated variables. This patch also updates the ceph_docker_registry value for RHCS setup. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1748911 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9f4a99fb24`)	2019-09-18 23:43:21 +02:00
Dimitri Savineau	b50fa23630	ceph-handler: Fix osd restart condition In containerized deployment, the restart OSD handler couldn't be triggered in most ansible execution. This is due to the usage of run_once + a condition on the inventory hostname and the last filter. The run_once is triggered first so ansible will pick a node in the osd group to execute the restart task. But if this node isn't the last one in the osd group then the task is ignored. There's more probability that the task will be ignored than executed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5b1c15653f`)	2019-09-11 13:20:30 -04:00
Dimitri Savineau	8d26299116	rbd-mirror: Allow to copy the admin keyring The ceph-rbd-mirror role allows to copy the admin keyring via the copy_admin_key variable but there's actually no task in that role doing the job. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1f505628dd`)	2019-09-11 11:48:48 -04:00
Dimitri Savineau	142ac88961	rbd-mirror: Use the rbd mirror client keyring The admin keyring isn't present by default on the rbd mirror nodes so the rbd commands related to the mirroring configuration will fail. Instead we can use the rbd mirror client keyring. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a3d36df025`)	2019-09-11 11:48:48 -04:00
Harald Jensås	e33e06d400	Support comma-delimited subnets in firewall ceph.conf supports a comma separated list of subnet CIDR's for the public_network and the cluster network. ceph-ansible should support setting up the firewall for this configuration. Closes: #4425 Related: #4333 https://docs.ceph.com/docs/nautilus/rados/configuration/network-config-ref/#network-config-settings Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `d94229204d`)	2019-09-10 09:34:48 -04:00
Giulio Fidente	cb66a62ae2	Look for additional names when checking ceph-nfs container status Ganesha cannot be operated active/active, in those deployments where it is managed by pacemaker the container name can be different than the default. This change uses "ceph_nfs_service_suffix" where previously missing to ensure tasks will work with customized names. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005 Signed-off-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `d2a2bd7c42`)	2019-09-09 16:48:50 -04:00
Dimitri Savineau	3fded4b8ec	rbd-mirror: configure pool and peer The rbd mirror configuration was only available for non containerized deployment and was also incomplete. We now enable the mirroring on the pool and add the remote peer in both scenarios. The default mirroring mode is set to 'pool' but can be configured via the ceph_rbd_mirror_mode variable. This commit also fixes an issue on the rbd mirror command if the ceph cluster name isn't using the default value (ceph) due to a missing --cluster parameter to the command. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7e5e21741e`)	2019-09-09 16:05:56 +00:00
fmount	65a01036c2	Fix discovered_interpreter_python variable This change fixes the discovered_interpreter_python variable name that was "discovered_python_interpreter" and caused a failure in OSP deployments. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `81eb091533`)	2019-09-04 14:16:57 -04:00
Johannes Kastl	781ab4ad62	openSUSE OBS repo using ceph_stable_release Instead of hardcoding `luminous`, use the `ceph_stable_release` variable to point to the correct repository. This is now uncommented in roles/ceph-defaults/defaults/main.yml to be available, as it is only used if ceph_repository is set to 'obs'. group_vars/*.sample files have been regenerated using the ./generate_group_vars_sample.sh script. Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `0cedc4d303`)	2019-08-30 09:04:24 -04:00
fmount	159db72269	Add http_addr option to grafana config We have no reason to make grafana container listen on *:<port>, so this change adds the http_addr option to the grafana config file and adds the related option on the wait_for tasks. Since grafana_server_addr should exists, we shouldn't rely on the _current_monitor_addr default on prometheus/grafana templates. This change also remove this default value that is not necessary anymore. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `8a666bfd15`)	2019-08-30 09:04:16 -04:00
Dimitri Savineau	ab67c6bd76	lint: fix error [201,206] [201] Trailing whitespace [206] Variables should have spaces before and after: {{ var_name }} Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `42082c0a27`)	2019-08-30 09:04:00 -04:00
Johannes Kastl	64b11ab2b9	fix openSUSE OBS repo creation roles/ceph-common/tasks/installs/suse_obs_repository.yml: ansible's zypper_repository module does not know a parameter 'uri', this is called 'repo' instead Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `4711a7d626`)	2019-08-29 16:31:40 +00:00
Nick Erdmann	e8e1f310dd	ceph-infra: open ceph iscsi/prometheus port Signed-off-by: Nick Erdmann <n@nirf.de> (cherry picked from commit `7953ee1b81`)	2019-08-29 10:22:28 -04:00
Guillaume Abrioux	a3cbb59c05	lint: fix error [301], add `changed_when: false` when needed This commit fixes the error [301]: `[301] Commands should not change things if nothing needs doing` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `327d564106`)	2019-08-28 11:22:47 -04:00
Guillaume Abrioux	8f781198d6	lint: fix error [306], add pipefail on shell command using pipe This commit fixes the error [306]: `[306] Shells that use pipes should set the pipefail option` using `/bin/bash` as executable because Debian/Ubuntu systems use `dash` by default which doesn't have the `-o pipefail`. (See: https://github.com/ansible/ansible-lint/issues/497#issue-424623501) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `102edaeb61`)	2019-08-28 11:22:47 -04:00
Dimitri Savineau	364951ce2f	ceph-mon: Bind mount the ca-trust directory On containerized deployment, the mon container sometimes needs to access to the radosgw endpoint (via the radosgw-admin command). When using TLS on the radosgw with self-signed certificates then we need to access to the CA certification from the mon container. The CA certificate needs to be added on the host and then the directory will be bind mount on the container. Resolves: #4358 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2b0616ecca`)	2019-08-28 09:44:34 -04:00
Dimitri Savineau	1fbfa1ce1a	ceph-client: Use profile rbd in keyring caps Like the OpenStack keyrings, we can use the profile rbd for the clients keyring (both mon and osd). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `49aa05b96c`)	2019-08-28 09:42:03 -04:00
Dimitri Savineau	4df8de8f7b	Revert "osd: add 'osd blacklist' cap for osp keyrings" This reverts commit `2d955757ee`. The "osd blacklist" isn't an osd caps but should be used with mon caps. Also the correct caps for this is: 'allow command "osd blacklist"'. The current change is breaking the openstack and clients keyrings. By using the profile rbd (which is already used) we already rely on the ability to blacklist dead client. Resolves: #4385 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `717af83475`)	2019-08-28 09:42:03 -04:00
Johannes Kastl	3bfa1c50de	set discovered_python_interpreter if ansible_python_interpreter is defined If the user has set the `ansible_python_interpreter`, ansible will not try to discover python, so `discovered_python_interpreter` will not be set. Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter` if `ansible_python_interpreter` is defined Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `bd507fa147`)	2019-08-27 21:06:43 +00:00
guihecheng	196e70a75a	rgw/multisite: assign 'rgw_zone' to the exact section in ceph.conf since the following commit: commit `1ac94c048f` rgw: add support for multiple rgw instances on a single host we have multi-instance rgw support on a single host and the config section name of the rgw changed from [client.rgw.$(hostname)] -> [client.rgw.$(hostname).rgwX] when X is the sequence number: 0,1,2,... So we should assign 'rgw_zone' item to the exact rgw instance config section in ceph.conf Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com> (cherry picked from commit `a0590cae9d`)	2019-08-23 15:56:15 +02:00
Artur Fijalkowski	27014df45e	global: make directories mode parameterizable This commit makes it possible to parametrize the ceph directories modes. It changes the hardcoded mode for ceph related directories from 0755 to a value customizable with the `ceph_directories_mode` variable. Closes: #2920 Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `011270ca69`)	2019-08-23 11:39:23 +00:00
Dimitri Savineau	500c59c648	ceph-osd: Add ulimit nofile on container start On containerized deployment, the OSD entrypoint runs some ceph-volume commands (lvm/simple scan and/or activate) which perform badly without the ulimit option. This option was added for all previous ceph-volume commands but not on the ceph-osd container startup. Also updating hard limit value to 4096 to reflect default baremetal value. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9a4ac46d19`)	2019-08-22 22:50:17 +00:00
Kevin Coakley	c7950d5539	ceph-config: Set changed_when to false on fact gathering statements The "run 'ceph-volume lvm batch --report' to see how many osds are to be created" and "run 'ceph-volume lvm list' to see how many osds have already been created" statements only register the lvm_batch_report and lvm_list variables. Running those ceph-volume commands should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu> (cherry picked from commit `e11cbbbcb1`)	2019-08-22 20:36:39 +02:00
Johannes Kastl	3e17c458d0	facts: fix a typo This commit fixes a typo in roles/ceph-facts/tasks/facts.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `e1b9312084`)	2019-08-22 18:11:18 +02:00
Johannes Kastl	82ede0afdb	ceph-nfs: fail on openSUSE Leap using distro packages roles/ceph-validate/tasks/check_nfs.yml: fail on openSUSE Leap using `ceph_origin = distro`, as the ganesha packages are not available from the distribution repositories Fixes: #4342 Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `11aa5dbb58`)	2019-08-21 15:40:22 +02:00
Guillaume Abrioux	fcf571430b	handler: do not validate the server certificate against the CA Otherwise rgw handler ends up with an error when using https. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9329bbb3af`)	2019-08-21 15:40:07 +02:00
Johannes Kastl	15646d1030	install ceph-mds packages on SUSE/openSUSE install packages on SUSE/openSUSE distributions, using the same logic as on RedHat-based distributions Fixes #4340 Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `c721cb99cb`)	2019-08-21 09:54:09 +00:00
Johannes Kastl	34783253a5	remove duplicate task installing suse dependencies roles/ceph-common/tasks/installs/install_on_suse.yml: remove the task that installs the dependencies, as this is done later in install_suse_packages.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `504017d562`)	2019-08-20 14:36:15 +02:00
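
Commit 5fea830414 ("Replace ipaddr() with ips_in_ranges()") describes a filter that returns every IP address residing in any one of the given IP ranges, so that a comma separated list of CIDRs can be used. The following is only a minimal sketch of that idea using Python's standard `ipaddress` module; it is not the plugin's actual code. The function signature, the sample addresses, and the sample `public_network` value are assumptions made for illustration.

```python
import ipaddress


def ips_in_ranges(ip_addresses, ip_ranges):
    """Return the addresses that fall inside at least one CIDR range.

    Hypothetical helper: the name mirrors the filter mentioned in the
    commit message, but the signature and behavior are assumptions.
    """
    # Parse each CIDR once; strict=False tolerates host bits being set.
    networks = [ipaddress.ip_network(r.strip(), strict=False) for r in ip_ranges]
    matches = []
    for ip in ip_addresses:
        addr = ipaddress.ip_address(ip)
        # Membership is False when the address and network versions differ.
        if any(addr in net for net in networks):
            matches.append(ip)
    return matches


# Illustrative values only: a comma separated list of subnets, as
# ceph.conf accepts for public_network.
public_network = "192.168.1.0/24,10.0.0.0/8"
host_ips = ["192.168.1.10", "172.16.0.5", "10.1.2.3"]
print(ips_in_ranges(host_ips, public_network.split(",")))
```

Splitting the comma separated value before matching is what allows a single option to cover several subnets, which is the regression the commit message says it addresses.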

2699 Commits (5fd299e358f0917c8b05b918272a576ff46b82dd)