ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	3f9081931f	rgw/rbdmirror: use service dump instead of ceph -s The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the rgw/rbdmirror services status, we're only using the servicemap structure in the ceph status output. To optimize this, we could use the ceph service dump command which contains the same needed information. This command returns less information and is slightly faster than the ceph status command. $ ceph status -f json \| wc -c 2001 $ ceph service dump -f json \| wc -c 1105 $ time ceph status -f json > /dev/null real 0m0.557s user 0m0.516s sys 0m0.040s $ time ceph service dump -f json > /dev/null real 0m0.454s user 0m0.434s sys 0m0.020s Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-03 09:05:33 +01:00
Dimitri Savineau	88f91d8c12	monitor: use quorum_status instead of ceph status The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the quorum status, we're only using the quorum_names structure in the ceph status output. To optimize this, we could use the ceph quorum_status command which contains the same needed information. This command returns less information. $ ceph status -f json \| wc -c 2001 $ ceph quorum_status -f json \| wc -c 957 $ time ceph status -f json > /dev/null real 0m0.577s user 0m0.538s sys 0m0.029s $ time ceph quorum_status -f json > /dev/null real 0m0.544s user 0m0.527s sys 0m0.016s Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-03 09:05:33 +01:00
wangxiaotong	b9cb0f12e9	osds: use ceph osd stat instead of ceph status Improve the way OSD creation is checked. This replaces the ceph status command by the ceph osd stat command. The osdmap structure isn't needed anymore. $ ceph status -f json \| wc -c 2001 $ ceph osd stat -f json \| wc -c 132 $ time ceph status -f json > /dev/null real 0m0.563s user 0m0.526s sys 0m0.036s $ time ceph osd stat -f json > /dev/null real 0m0.457s user 0m0.411s sys 0m0.045s Signed-off-by: wangxiaotong <wangxiaotong@fiberhome.com>	2020-11-03 09:05:33 +01:00
Guillaume Abrioux	371d854a5c	common: follow up on #5948 In addition to `f7e2b2c608` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-02 20:16:36 -05:00
Benoît Knecht	0d76826bbb	ceph-mon: Don't set monitor directory mode recursively After rolling updates performed with `infrastructure-playbooks/rolling_updates.yml`, files located in `/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` had mode 0755 (including the keyring), making them world-readable. This commit separates the task that configured permissions recursively on `/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` into two separate tasks: 1. Set the ownership and mode of the directory itself; 2. Recursively set ownership in the directory, but don't modify the mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-11-02 17:36:37 +01:00
Dimitri Savineau	b02589ad50	keyring: use ceph_key module for get-or-create cmd Instead of using ceph auth get-or-create command via the ansible command module then we can use the ceph_key module. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 17:17:29 +01:00
Dimitri Savineau	59ecddcdd0	keyring: use ceph_key module for auth get command Instead of using ceph auth get command via the ansible command module then we can use the ceph_key module and the info state. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 17:17:29 +01:00
Gaudenz Steinlin	79ff79c422	openstack: use ceph_keyring_permissions by default Otherwise this task fails if no permission is set on the item. Previously the code omitted the mode parameter if it was not set, but this was lost with commit `ab370b6ad8`. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-11-02 15:53:58 +01:00
Dimitri Savineau	16cd183b9c	podman: force log driver to journald Since we've changed to podman configuration using the detach mode and systemd type to forking then the container logs aren't present in the journald anymore. The default conmon log driver is using k8s-file. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 15:49:27 +01:00
Dimitri Savineau	cdb7b09cd7	ceph-handler: fix curl ipv6 command with rgw When using the curl command with ipv6 address and brackets then we need to use the -g option otherwise the command fails. $ curl http://[fdc2:328:750b:6983::6]:8080 curl: (3) [globbing] error: bad range specification after pos 9 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 15:45:51 +01:00
Guillaume Abrioux	a822f77300	iscsi: fix ownership on iscsi-gateway.cfg This file is currently deployed with '0644' ownership making this file readable by any user on the system. Since it contains sensitive information it should be readable by the owner only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890119 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 16:10:48 +02:00
Guillaume Abrioux	1cc9666c09	common: drop `fetch_directory` feature This commit drops the `fetch_directory` feature. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 13:22:16 +02:00
Guillaume Abrioux	900c0f4492	ceph-config: ceph.conf rendering refactor This commit cleans up the `main.yml` task file of `ceph-config`. It drops the local ceph.conf generation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 13:22:16 +02:00
Guillaume Abrioux	a8bd947c7d	crash: refact caps definition there is no need to use `{{ }}` syntax here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-19 18:53:54 -04:00
Benoît Knecht	8b0023cb77	ceph-osd: Fix check mode for start osds tasks Correctly set `osd_ids_non_container.stdout_lines` to an empty list if it's undefined (i.e. in check mode). Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-19 20:22:08 +02:00
Benoît Knecht	8f436ab5d8	ceph-mon: Fix check mode for deploy monitor tasks Skip the `get initial keyring when it already exists` task when both commands whose `stdout` output it requires have been skipped (e.g. when running in check mode). Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-19 20:22:08 +02:00
Gaudenz Steinlin	68cc93fb18	ceph-crash: Only deploy key to targeted hosts The current task installs the ceph-crash key to "most" hosts via "delegate_to". This key is only used by the ceph-crash daemon and should just be installed on all hosts targeted by this role. There is no need for using a delegated task. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-10-19 16:54:06 +02:00
Guillaume Abrioux	59d0f01992	ceph-osd: start osd after systemd overrides The service should be started after the ceph-osd systemd overrides has been added, otherwise, the latter isn't considered. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860739 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-15 09:19:56 +02:00
Dimitri Savineau	9252b75173	container: remove container_binding_name variable The container_binding_name package was only mandatory when we were using the docker modules (docker_image and docker_container) but since we manage both docker and podman containers without using the dedicated module then we can remove it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-14 10:32:31 +02:00
Dimitri Savineau	4eaa65c362	ceph-osd: don't start the OSD services twice Using the + operation on two lists doesn't filter out the duplicate keys. Currently each OSD is started (via systemd) twice. Instead we could use the union filter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-14 10:30:39 +02:00
Guillaume Abrioux	46d4d97da9	handler: refact check_socket_non_container the `stat --printf=%n` returns something like following: ``` ok: [osd0] => changed=false cmd: \|- stat --printf=%n /var/run/ceph/ceph-osd*.asok delta: '0:00:00.009388' end: '2020-10-06 06:18:28.109500' failed_when_result: false rc: 0 start: '2020-10-06 06:18:28.100112' stderr: '' stderr_lines: <omitted> stdout: /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok stdout_lines: <omitted> ``` it makes the next task "check if the ceph osd socket is in-use" grep like this: ``` ok: [osd0] => changed=false cmd: - grep - -q - /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok - /proc/net/unix ``` which will obviously fail because this path never exists. It makes the OSD handler broken. Let's use `find` module instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-08 17:37:50 -04:00
Benoît Knecht	54ba38e35e	Fix Ansible check mode for site.yml.sample playbook Make sure the `site.yml.sample` playbook can be run in check mode by skipping tasks that try to read the output of commands that have been skipped. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-07 00:29:44 +02:00
Dimitri Savineau	1281e8bcc8	library: add radosgw_zone module This adds radosgw_zone ansible module for replacing the command module usage with the radosgw-admin zone command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	65dbe0782e	library: add radosgw_zonegroup module This adds radosgw_zonegroup ansible module for replacing the command module usage with the radosgw-admin zonegroup command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	d171f4068d	library: add radosgw_realm module This adds radosgw_realm ansible module for replacing the command module usage with the radosgw-admin realm command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	235c7e27cc	library: add radosgw_user module This adds radosgw_user ansible module for replacing the command module usage with the radosgw-admin user command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	bd611a785b	library: add ceph_fs module This adds the ceph_fs ansible module for replacing the command module usage with the ceph fs command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 08:02:58 +02:00
Dimitri Savineau	c960362639	ceph_key: remove backward compatibility It's time to remove this backward compatibility. Users had enough time to convert their openstack_keys and key values. We now fail in ceph-validate if the caps key isn't set. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 07:59:38 +02:00
Guillaume Abrioux	a802fa2810	rgw: fix multi instances scaleout in baremetal When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook in baremetal deployments. When ceph-osd notifies handlers, it means rgw handlers are triggered too. The issue with this is that they are triggered before the role ceph-rgw is run. In the case a scaleout operation is expected on `radosgw_num_instances` it causes an issue because keyrings haven't been created yet so the new instances won't start. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881313 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-06 07:38:44 +02:00
Guillaume Abrioux	ff95fa9c32	ceph-osd: refact `docker_exec_start_osd` This commit drops nested jinja construction in this set_fact task. It also renames it to `container_exec_start_osd` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-04 21:18:10 +02:00
Guillaume Abrioux	c101cb3931	defaults: change defaults value this commit changes defaults value in default pool definitions. there's no need to define `pg_num`, `pgp_num`, `size` and `min_size`, `ceph_pool` module will use the current default if needed. This also drops the 3 following `set_fact` in `ceph-facts`: - osd_pool_default_pg_num, - osd_pool_default_pgp_num, - osd_pool_default_size_num Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-02 07:42:40 +02:00
Guillaume Abrioux	29fc115f4a	ceph_pool: refact module remove complexity about current defaults in running cluster Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-02 07:42:40 +02:00
Seena Fallah	ff9f4d138f	ceph-facts: add get default crush rule from running monitor In case of deploying new monitor node to an existing cluster, osd_pool_default_crush_rule should be taken from running monitor because ceph-osd role won't be run and the new monitor will have different osd_pool_default_crush_rule from other monitors. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 09:27:58 -04:00
Guillaume Abrioux	eefe11d90c	defaults: change default grafana-server name This changes the default value of the grafana-server group name. It also adds some tasks in ceph-defaults in order to keep backward compatibility. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-29 07:42:26 +02:00
Ali Maredia	902575369c	rgw multisite: check connection for realm endpoint This commit adds connection checks before realm pulls. Curls are performed on the endpoint being pulled from the mons and the rgws. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731158 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-09-29 07:37:21 +02:00
Dimitri Savineau	e11453c6f5	Remove unused centos docker tasks The `enable extras on centos` task just doesn't work when using the variable ceph_docker_enable_centos_extra_repo to true. fatal: [xxx]; FAILED! => {"changed": false, "msg": "Parameter 'baseurl', 'metalink' or 'mirrorlist' is required."} The CentOS extras repository is enabled by default so it's pretty safe to remove this task and the associated variable. This also removes the ceph_docker_on_openstack variable as it's a leftover and it is unused. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:35:10 +02:00
Dimitri Savineau	733596582d	ceph-handler: set handler on xxx_stat result In non containerized deployment we check if the service is running via the socket file presence. This is done via the xxx_socket_stat variable that checks the file socket in the /var/run/ceph/ directory. In some scenarios, we could have the socket file still present in that directory but not used by any process. That's why we have the xxx_stat variable which cleans those leftovers. The problem here is that we're setting the variable for the handlers status (like handler_mon_status) based on xxx_socket_stat instead of xxx_stat. That means we will trigger the handlers if there's an old socket file present on the system without any process associated. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:32:10 +02:00
Dimitri Savineau	501b8e0fd3	ceph-iscsi: create pool once from monitor `af9f6684` introduced a regression on the ceph iscsi pool creation because it was delegated to the first monitor node before that change. This patch restores the initial workflow. When the iscsi node doesn't have the admin keyring then the pool creation fails. This commit also ensures that the pool creation is only executed once when having multiple iscsi nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:31:24 +02:00
Seena Fallah	69f7e35382	ceph-facts: check for mon socket in its own host delegate to its own host after checking mon socket to find out if mon socket is in-use or not. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 00:21:12 +02:00
Dimitri Savineau	50104650e7	add missing boolean filter Otherwise this will generate an ansible warning about the missing filter. [DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour will go away and you might need to add \|bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-28 20:45:01 +02:00
Guillaume Abrioux	bf7b044c9a	Revert "ceph-rgw: remove ceph_pool state and default value" This reverts commit `ba3512a8fc`.	2020-09-28 16:56:33 +02:00
Dimitri Savineau	1db4dc807c	ceph-mds: remove unused block condition Since `af9f6684` the cephfs pool(s) creation don't use the fs_pools_created variable anymore because the ceph_pool module is idempotent. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-28 10:22:35 +02:00
Tyler Bishop	ee4b8804ae	facts: support device aliases for (dedicated\|bluestore_wal)_devices Just like `devices`, this commit adds the support for linux device aliases for `dedicated_devices` and `bluestore_wal_devices`. Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>	2020-09-25 19:59:45 +02:00
Dimitri Savineau	ba3512a8fc	ceph-rgw: remove ceph_pool state and default value Since the state is now optional and default values are handled in the ceph_pool module itself. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:18:07 +02:00
Dimitri Savineau	4808523403	rolling_update: remove msgr2 migration In Pacific we are sure that users already achieved the msgr2 migration because that was introduced in Nautilus. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:14:42 +02:00
Dimitri Savineau	62bd41f0d4	ceph-config: remove ceph_release from ceph.conf.j2 We don't use ceph_release variable in the ceph.conf jinja template. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:13:57 +02:00
Dmitriy Rabotyagov	297532ca41	Remove libjemalloc1 installation task libjemalloc1 package is required neither as a ganesha dependency nor for the package build process. So this task can be simply dropped. Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>	2020-09-24 13:56:16 +02:00
Dimitri Savineau	6dcfdf17d4	container: quote registry password When using a quote in the registry password then we have the following error: The error was: ValueError: No closing quotation To fix this we need to use the quote filter. Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-18 11:14:00 -04:00
Guillaume Abrioux	ff19c1d851	facts: fix 'set_fact rgw_instances with rgw multisite' the current condition doesn't work, as soon as the first iteration is done the condition makes next iterations skip since `rgw_instances` got set with the first iteration. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-18 10:14:34 -04:00
Dimitri Savineau	85643edfe3	ceph-infra: include iscsi nodes for logrotate The iscsi nodes aren't included in the logrotate condition. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-17 20:34:56 +02:00
Guillaume Abrioux	f576c02ff7	infra: support log rotation for tcmu-runner This commit adds the log rotation support for tcmu-runner. ceph-container related PR: ceph/ceph-container#1726 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1873915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-16 20:23:22 -04:00
Dimitri Savineau	e54b924eaf	ceph-prometheus: update pool stat counter Since [1] The bytes_used pool counter in prometheus has been renamed to stored. Closes: #5781 [1] https://github.com/ceph/ceph/commit/71fe9149 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-16 09:50:42 -04:00
Dimitri Savineau	bda3581294	container: add optional http(s) proxy option When using a http(s) proxy with either docker or podman we can rely on the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables. But with ansible, even if those variables are defined in a source file then they aren't loaded during the container pull/login tasks. This implements the http(s) proxy support with docker/podman. Both implementations are different: 1/ docker doesn't rely on the environment variables with the CLI. Those are needed by the docker daemon via systemd. 2/ podman uses the environment variables so we need to add them to the login/pull tasks. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-16 06:52:26 +02:00
Dimitri Savineau	abb4023d76	ceph_key: set state as optional Most ansible module using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 14:12:21 -04:00
Guillaume Abrioux	f0fc59258a	Revert "ceph_pool: use default size/min_size and rule_name" This reverts commit `142934057f`. This is already handled in the ceph_pool module itself Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-14 14:12:21 -04:00
Dimitri Savineau	2c4af70abd	dashboard: use run_once at block level Instead of using run_once: true on each tasks in a block section, we can use the run_once statement at the block level. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 13:47:36 +02:00
Dimitri Savineau	b105549ed8	node-exporter: exclude client nodes We don't need to install node-exporter on client node because there's no ceph services running on them. This also makes sure we use the group name variables in the prometheus service template instead of hardcoding the values. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 13:46:51 +02:00
Dimitri Savineau	3a05aeb6cb	ceph_pool: set state as optional Most ansible module using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:26:15 +02:00
Dimitri Savineau	ee6f0547ba	library: add ceph_dashboard_user module This adds the ceph_dashboard_user ansible module for replacing the command module usage with the ceph dashboard ac-user-xxx command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:16:08 +02:00
Dimitri Savineau	142934057f	ceph_pool: use default size/min_size and rule_name Before [1] we were using default value for - size - min_size - rule_name when the key wasn't present in the pool dict. The commit [1] changed this by defaulting to omit. This patch restores the original workflow by using facts: - osd_pool_default_size - osd_pool_default_min_size - ceph_osd_pool_default_crush_rule_name [1] `af9f6684f2` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:15:28 +02:00
Dimitri Savineau	f63022dfec	ceph-facts: only get fsid when monitor are present When running the rolling_update playbook with an inventory without monitor nodes defined (like external scenario) then we can't retrieve the cluster fsid from the running monitor. In this scenario we have to pass this information manually (group_vars or host_vars). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 13:19:44 -04:00
Dimitri Savineau	8dacbce68f	ceph-rgw: use ceph_pool module Since [1] we can use the ceph_pool module instead of using the command module combined with ceph osd pool commands. [1] `bddcb439ce` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 15:16:58 +02:00
Guillaume Abrioux	657e6c8c3b	tests: clean legacy clean some legacies since quay.ceph.io migration Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-09 14:42:41 +02:00
Niko Smeds	a951c1a3f0	Enable HAProxy backend checks for Ceph RGW Add the `check` option to server definitions to enable basic HAProxy health checks for Ceph RADOS gateway backends. Currently traffic will be forwarded to unhealthy `radosgw.service` servers. These changes resolve the issue. Signed-off-by: Niko Smeds nikosmeds@gmail.com	2020-08-27 10:57:46 -04:00
Guillaume Abrioux	54d3e9650f	dashboard: refact admin user creation task this commit splits this task in order to avoid using a `shell` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:22:11 +02:00
Guillaume Abrioux	f0fe193d8e	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 11:16:26 -04:00
George Shuklin	73d4bb6bd6	Make 'disable ssl for dashboard task' idempotent. This should reduce number of 'changed' tasks during convergence test. Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2020-08-20 16:48:32 +02:00
Rafał Wądołowski	55cd6e83e4	Comment out ceph_custom_key Since there is a check if ceph_custom_key is defined, there is no reason to define it by default. Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>	2020-08-20 13:36:24 +02:00
Guillaume Abrioux	899d317196	iscsigw: add retry/until In order to avoid failures that could be fixed by simply retrying. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 13:25:05 +02:00
John Fulton	95dee6f1ca	Set default permission for prometheus config files Regardless of the outcome of Ansible 2.9.12 issue 71200 we can set a default permission for these files. Closes: https://github.com/ceph/ceph-ansible/issues/5677 Signed-off-by: John Fulton <fulton@redhat.com>	2020-08-18 15:49:31 -04:00
Guillaume Abrioux	8ed11ea3ee	infra: only install logrotate on right nodes For instance, there is no need to install logrotate on clients nodes. This also ensures logrotate is installed only for containerized deployments since the packaging has an explicit dependency to logrotate Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 10:56:09 -04:00
Dimitri Savineau	cb8f0237e1	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pool not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-17 22:59:06 +02:00
Ali Maredia	5c1f4b1a1e	rgw: allow rgws to be concurrently with or without multisite Allows rgws in a ceph cluster to be run with multisite and without multisite at the same time. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-08-17 11:11:11 +02:00
Guillaume Abrioux	e1cb385740	infra: add missing tag This commit adds the missing `with_pkg` tag on the logrotate installation task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-13 10:08:18 -04:00
Guillaume Abrioux	f1aa6cea21	infra: add log rotation support (containers) This commit adds the log rotation support via logrotate in containerized deployments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Guillaume Abrioux	448cc280b7	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
raul	110eaf5f9f	rgw: support 1+ rgw instance in `radosgw_frontend_port` Change the radosgw_frontend_port to take into account more than 1 RGW instance. In its original form, `radosgw_frontend_port: radosgw_frontend_port \| int`, it configured the 8080 port on all instances. With the following modification, `radosgw_frontend_port: radosgw_frontend_port \| int + item\|int`, we increase the port by 1 for each instance. Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com>	2020-08-11 14:05:43 +02:00
Guillaume Abrioux	dd4b5b0328	nfs: do not copy rgw keyring when `nfs_obj_gw` is true This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the cluster doesn't contain a rgw node, which can be the case given we are using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the deployment will fail trying to copy a key that doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-07 13:21:17 +02:00
Guillaume Abrioux	0a581a6e60	config: only add related rgw section there's no need to add each rgw section on all rgw nodes. With this commit, only related rgw section are rendered. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:47:27 +02:00
Dimitri Savineau	0d0f1e71df	dashboard: allow remote TLS cert/key copy When using TLS on the ceph dashboard or grafana services, we can provide the TLS certificate and key. Those files should be present on the ansible controller and they will be copied to the right node(s). In some situations, the TLS certificate and key could be already present on the target node and not on the ansible controller. For this scenario, we just need to copy the files locally (on each remote host). This patch adds the dashboard_tls_external variable (with default to false) to allow users to achieve this scenario when configuring this variable to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-03 13:39:47 +02:00
Dimitri Savineau	4e84b4beed	ceph-facts: remove mds_name fact The mds_name fact always gets the ansible_hostname value so we don't need to have a dedicated fact for this and use the ansible_hostname fact instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:02:43 +02:00
Dimitri Savineau	cbe79428e6	ceph-handler: remove iscsigws restart scripts The iscsigws restart scripts for tcmu-runner and rbd-target-{api,gw} services only call the systemctl restart command. We don't really need to copy a shell script to do it when we can use the ansible service module instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:02:12 +02:00
Dimitri Savineau	47b7c00287	podman: always remove container on start In case of failure, the systemd ExecStop isn't executed so the container isn't removed. After a reboot of a failed node, the container doesn't start because the old container is still present in created state. We should always try to remove the container in ExecStartPre for this situation. A normal reboot doesn't trigger this issue and this also doesn't affect nodes running containers via docker. This behaviour was introduced by `d43769d`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:00:38 +02:00
Dimitri Savineau	18e3c7a0a2	ceph-handler: add missing condition on ceph-crash The ceph-crash tasks present in the ceph-handler role don't need to be executed on all nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-21 23:26:11 +02:00
Guillaume Abrioux	39bb279a53	crash: rm container in ExecStartPre even with docker We should ensure the container is removed in `ExecStartPre` even when `{{ container_binary }}` is docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 23:23:18 +02:00
Guillaume Abrioux	9d2f2108e1	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 20:22:12 +02:00
Guillaume Abrioux	d490968fc8	defaults: remove legacy These variables aren't consumed anywhere else than in ceph-nfs role so there is no need to have them in `ceph-defaults`'s defaults Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 09:39:15 +02:00
Guillaume Abrioux	f8a951f50c	facts: fix broken facts when using --limit This commit fixes these tasks when --limit is used. It makes sure the fact is set on right nodes even when the playbook is run with `--limit` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-20 10:56:10 -04:00
Dimitri Savineau	2b8ebf1457	ceph-dashboard: copy TLS cert/key on monitor The ceph-dashboard role is executed on the mgr nodes so the TLS cert/key files are copied to those nodes. But we are importing the cert/key files into the ceph configuration on the monitor. Closes: #5557 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-20 16:16:35 +02:00
Guillaume Abrioux	86edae724f	rgw: set container memory limit to 4g This commit changes the container memory limit for rgw daemons. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-09 15:31:10 +02:00
Guillaume Abrioux	bcc673f66c	facts: refact `ceph_uid` fact There's no need to set this fact with a `set_fact` We can achieve this in `ceph-defaults` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-09 13:37:29 +02:00
Dimitri Savineau	1438ca0120	ceph-nfs: change ganesha devel source The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't available anymore. Let's switch back to shaman since we have builds available now. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-06 16:59:25 +02:00
Dimitri Savineau	93754bd70c	ceph-defaults: update nfs-ganesha to 3.3 nfs-ganesha 3.3 is the latest 3.x release available for octopus so we should update to this version. https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus This will also match the version used in RHCS 5. Ceph container already uses that version too. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-03 06:36:49 +02:00
Dimitri Savineau	1361e84a4e	radosgw: remove INST_PORT environment variable This variable isn't consumed by the container so we can remove it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-02 16:52:29 +02:00
Guillaume Abrioux	7dd68b9ac1	rgw: fix multi instances scaleout When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook. The environment file used in the rgw systemd template is rendered when executing the `ceph-rgw` role but during a new run of the playbook (in order to scale out rgw instances), handlers are triggered from `ceph-osd` role which is run before `ceph-rgw`, therefore it tries to start the new rgw daemon whereas its corresponding environment file hasn't been rendered yet and fails like following: ``` ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory ``` This commit moves the tasks generating this file in `ceph-config` role so it is generated early. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-02 10:39:50 -04:00
Dimitri Savineau	3592ba1d61	ceph-common: remove copr and sepia repositories All EL8 dependencies are now present on EPEL 8 so we don't need the additional repositories that were only a temporary solution. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-30 08:35:19 +02:00
George Shuklin	3e87f53875	Add container settings for Ubuntu 20 (the same as Ubuntu 18) Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2020-06-29 12:18:58 -04:00
Dimitri Savineau	03cd75845f	dashboard: configure mgr backend before restart We need to set the mgr dashboard server ip address before restarting the dashboard module otherwise we can try to bind the dashboard module on an already used address. We already do this configuration for the dashboard port value and ssl setup so we should do the same for server address too. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-29 14:59:01 +02:00
Jonathan Rosser	42884e8175	Ansible tests are not filters The use of "\| success" and "\| changed" are not valid syntax for modern ansible releases. Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>	2020-06-26 12:26:25 -04:00
Jonathan Rosser	92288c11c5	Install python routes package as a dependency rather than directly This is now a dependency of ceph-mgr so will be installed automatically and does not need a specific task. This change means that ceph-mgr installs correctly on Ubuntu Focal where the python3-routes package is necessary. Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>	2020-06-26 12:26:25 -04:00
Guillaume Abrioux	b7539eb275	dashboard: copy self-signed generated crt to mons This commit makes the playbook copying self-signed generated certificate to monitors. When mons and mgrs are deployed on dedicated nodes the playbook will fail when trying to import certificate and key files since they are generated on mgrs whereas we try to import them from a monitor. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-06-23 09:37:21 -04:00
Dimitri Savineau	d43769dc2a	podman: Add Type and PIDFile value to unit files This changes the way we are running the podman containers via systemd. They are now in detached mode with Type/PIDFile set. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-23 09:37:50 +02:00
Dimitri Savineau	bd22f1d1ec	docker: Add Requires on docker service When using docker container engine then the systemd unit scripts only use a dependency on the docker daemon via the After parameter. But if docker is restarted on a live system then the ceph systemd units should wait for the docker daemon to be fully restarted. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-22 23:08:50 +02:00
Dimitri Savineau	829990e60d	ceph-osd: remove ceph-osd-run.sh script Since we only have one scenario since nautilus then we can just move the container start command from ceph-osd-run.sh to the systemd unit service. As a result, the ceph-osd-run.sh.j2 template and the ceph_osd_docker_run_script_path variable are removed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-18 17:51:13 +02:00
Dimitri Savineau	0f8a61a3ae	debian/uca: remove the handler notification The "update apt cache" in the ceph-handler role was never called and the handler trigger after adding the uca repository doesn't exist at all. Instead of using a handler for that we can just set the update_cache parameter to true like the other apt_repository tasks. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-17 10:14:03 +02:00
Guillaume Abrioux	b91d60d384	switch_to_containers: don't set noup flag We shouldn't set this flag when running switch_to_containers playbook. Otherwise the playbook fails waiting for pgs to be clean. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843569 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-06-17 01:32:18 +02:00
Dimitri Savineau	cdb30bd125	container: inspect Id field instead of RepoDigests When a container image managed by podman isn't tagged anymore then the RepoDigests field when inspecting the image doesn't return any value. This is different from the docker workflow and it breaks the ceph-ansible container upgrade when collocating multiple services and using a non-fixed container tag (like latest or 4). $ podman images REPOSITORY TAG IMAGE ID CREATED SIZE docker.io/ceph/daemon latest 680c9c0d38c3 8 days ago 957 MB <none> <none> 011ee108bfc9 2 months ago 1.01 GB $ podman inspect 680c9c0d38c3 \| jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:20cf789235e23ddaf38e109b391d1496bb88011239d16862c4c106d0e05fea9e" $ podman inspect 011ee108bfc9 \| jq .[0].RepoDigests[0] null Because this field returns "null" then the ansible task trying to determine this value is failing ----------------------------- fatal: [foo]: FAILED! => msg: \|- The task includes an option with an undefined variable. The error was: None has no element 0 The error appears to be in 'roles/ceph-container-common/tasks/fetch_image.yml': line 137, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact ceph_osd_image_repodigest_before_pulling ^ here ----------------------------- We don't have this behaviour with docker. $ docker images REPOSITORY TAG IMAGE ID CREATED SIZE docker.io/ceph/daemon latest 680c9c0d38c3 8 days ago 928 MB docker.io/ceph/daemon <none> 011ee108bfc9 2 months ago 986 MB $ docker inspect 680c9c0d38c3 \| jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:45e6f28bb67c81b826acb64fad5c0da1cac3dffb41a88992fe4ca2be79575fa6" $ docker inspect 011ee108bfc9 \| jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:b393a73309d72e43ca7d65cd3519036007947671e373eb59aa75a46185c52231" Instead we should just get the Id field. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1844496 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-16 17:06:25 +02:00
Ali Maredia	0175c205fa	rgw multisite: add master zone endpoints to zonegroup We were only adding the endpoints to the master zone but not to the zonegroup. This patch fixes the issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1839228 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-06-09 09:50:18 -04:00
Ansible Deployment User	3f906e0c26	rgwloadbalancer undefined index variable The vrrp_instances variable is using a loop with index but the index_var wasn't defined. As a result, the fact task was failing on this undefined index variable. The task includes an option with an undefined variable. The error was: 'index' is undefined Closes: #5395 Signed-off-by: Florian Faltermeier <florian.faltermeier@uibk.ac.at>	2020-05-26 10:03:25 -04:00
Dimitri Savineau	44e1ebaaff	ceph-nfs: add stable noarch repository When using the stable nfs ganesha repository, we need to have both arch and noarch repositories enabled. Currently the noarch repository is missing, which causes the non containerized deployment to fail. Closes: #5375 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-16 07:34:08 +02:00
Guillaume Abrioux	af9f6684f2	common: introduce ceph_pool module calls This commit calls the `ceph_pool` module for creating ceph pools everywhere it's needed in the playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-16 07:31:57 +02:00
Guillaume Abrioux	8c7a48832c	common: fix target_size_ratio task enablement The condition on this task is wrong, we have to check whether `target_size_ratio` is set in the pool definition instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-15 20:57:32 +02:00
Guillaume Abrioux	e5e81843e9	facts: always set ceph_run_cmd and ceph_admin_command always set these facts on monitor nodes whatever we run with `--limit`. Otherwise, playbook will fail when using `--limit` on nodes where these facts are used on a delegated task to monitor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-15 10:53:15 +02:00
Dimitri Savineau	252e78b4e4	docker2podman: manage dashboard nodes The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus) were not managed during the docker to podman migration. This adds the systemd container template of those services to a dedicated file (systemd.yml) in order to include it in the docker2podman playbook. This also adds the dashboard container images pull from docker to podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-13 12:02:00 +02:00
Dimitri Savineau	b20519efd0	dashboard: allow disabling grafana api ssl verify When using an untrusted TLS certificate (like self-signed) on grafana then the grafana dashboards update subcommand will fail. One solution could be to trust the TLS certificate. The other one is to disable the TLS verification on the grafana API. Closes: #5324 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-13 11:56:57 +02:00
Dimitri Savineau	222fe4abd8	ceph-nfs: bind mount ganesha log directory The current ganesha log directory is only present in the container and not bind mount on the host. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-13 11:55:38 +02:00
Benoît Knecht	444b46ea24	ceph-validate: Expand templates in rgw_create_pools Same fix as `ceph-rgw` for `rgw_create_pools` pool names that contain Jinja templates. See #5348 for details. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-05-11 11:51:27 -04:00
Benoît Knecht	d2b7670c7d	ceph-rgw: Make sure pool name templates are expanded It is common to set templated pool names in `rgw_create_pools`, e.g. ```yaml rgw_create_pools: "{{ rgw_zone }}.rgw.buckets.index": pg_num: 16 size: 3 type: replicated ``` This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in the way `with_dict` works [1]. This commit replaces the use of `with_dict` with ```yaml loop: "{{ rgw_create_pools \| dict2items }}" ``` which works as intended and expands the template in the pool name. [1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops Closes #5348 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-05-11 11:51:27 -04:00
Benoît Knecht	b7efca1785	ceph-validate: Fix "fail on unsupported CentOS release" The `dashboard_enabled` condition used a `true` filter (which doesn't exist) instead of the `bool` filter. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-05-08 10:21:11 -04:00
Dimitri Savineau	34e6e8e06c	ceph-rgw: use match instead of equalto from jinja2 The '==' jinja2 operator (or 'equalto') has been introduced in jinja2 2.8. On EL7, jinja2 version is 2.7 so the operator isn't present creating templating error like: The error was: TemplateRuntimeError: no test named '==' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-06 14:23:10 -04:00
Dimitri Savineau	8a890306ad	ceph-nfs: fix internal ganesha deployment Since `ea2b654d9` we're not running the rados command from the monitor nodes but from the ganesha node. Unfortunately we don't have the required keyring on that node to run the rados command as we don't import the right keyring. This commit restores the workflow for internal ganesha deployment like before `ea2b654d9` but keeps the rados commands from the ganesha node for external deployment until we have a better design. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-06 11:10:08 -04:00
Dimitri Savineau	748ac4b928	ceph-nfs: fix keyring copy for external ganesha Fix the condition on the keyring copy task that prevent the ganesha keyring to be created in the /var/lib/ceph directory. Also ensure that the directory exists first. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-06 11:10:08 -04:00
Guillaume Abrioux	cf460274c7	nfs: fix 2 typo The condition is missing an index here, which makes the playbook fail. Typical error: ``` The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'", ``` Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-06 11:10:08 -04:00
Dimitri Savineau	ed4f23d530	ceph-facts: fix IPv6 _radosgw_address interface When using radosgw_interface and IPv6 setup then the _radosgw_address fact doesn't use square brackets compared to the radosgw_address and radosgw_address_block configuration. Closes: #5325 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-28 14:35:16 -04:00
fmount	5eb363e033	Refresh ceph dashboard user role This change allows the operator to refresh the ceph dashboard admin role on multiple ceph-ansible executions. In the current state the role is set only when the user is created, and there's no way to change it if the user exists. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826002 Signed-off-by: fmount <fpantano@redhat.com>	2020-04-23 16:28:49 -04:00
Dimitri Savineau	f1728929cd	ceph-dashboard: fix mgr dashboard IPv6 fact `15ed9ee` introduced a regression for the mgr dashboard daemon using IPv6 since the mgr dashboard configuration doesn't support brackets. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1827299 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-23 14:44:46 -04:00
Dimitri Savineau	2547ab601a	Readd CentOS 7 with conditions The CentOS 7 distribution could still be used for deploying ceph if - it's a containerized deployment - it's a non containerized deployment without the dashboard (due to missing python3 libraries). The ceph_stable_redhat_distro variable has been removed because we can rely on the ansible_distribution_major_version fact instead. The copr el8 repository configuration is only applied for CentOS 8. The ceph-mgr-dashboard package is only installed when the dashboard_enabled variable is set to true. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-23 13:31:11 +02:00
Guillaume Abrioux	86dc6f8206	mds: don't enable application pool on cephfs pools this commit removes the task which enable application on cephfs pools. See: https://tracker.ceph.com/issues/43761 Fixes: #5278 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-23 13:23:10 +02:00
ianwatsonrh	ccf6a7f153	typo: updating type check on rc Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826884 Signed-off-by: ianwatsonrh <ianwatson@redhat.com>	2020-04-23 13:20:35 +02:00
abaird-rh	eb71244bfd	Updated use of deprecated filter This was removed in Ansible 2.9. [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result\|version_compare` use `result is version_compare`. This feature will be removed in version 2.9. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. Rename 'version_compare' to the function 'version'. version_compare was renamed to version in Ansible 2.5. Signed-off-by: abaird-rh <abaird@redhat.com>	2020-04-20 15:29:29 +02:00
Guillaume Abrioux	378405e328	mds: fix --limit run against mds nodes This commit fixes --limit runs against mds nodes. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-14 10:42:43 -04:00
Guillaume Abrioux	ea2b654d95	nfs: create empty rados index object for nfs standalone This commit creates an empty rados index object even when deploying standalone nfs-ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-14 10:40:37 -04:00
Dimitri Savineau	5de74fe512	ceph-validate: update RHEL requirement for RHCS We were not testing the right ansible_distribution fact value for RHEL distribution. This commit also updates the minimal RHEL version supported by RHCS. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-09 20:43:22 +02:00
Guillaume Abrioux	4bcc52cb2a	osd: fix monitor_name error when scaling out OSDs This commit fixes a bug when trying to scale out osd nodes with `crush_rule_config` is enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-09 13:46:40 -04:00
Paulo Matias	38ce02c2ea	Allow user to specify grafana_server_fqdn This is needed to get a TLS certificate to validate correctly. If unspecified, auto-detected grafana_server_addr is used. Signed-off-by: Paulo Matias <matias@ufscar.br>	2020-04-07 20:51:23 +02:00
Paulo Matias	dac8e1d0a9	Prometheus APIs are only available through plain http Trying to access these APIs through TLS produces "Could not reach external API" errors in Ceph dashboard. Signed-off-by: Paulo Matias <matias@ufscar.br>	2020-04-07 20:51:23 +02:00
Matthew Vernon	7963a76c7a	Use a tempfile directory to store restart scripts Make a tempfile directory and copy the restart scripts there (and then execute them from there), rather than using insecure known filenames in /tmp/ This is a partial fix for ceph/ceph-ansible#2937 Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>	2020-04-06 22:55:51 +02:00
Dimitri Savineau	6617d90733	ceph-mgr: add saml python lib for dashboard SSO The dashboard SSO mgr module requires the saml python library to be installed. This is only a valid scenario for RHCS deployment because the saml python library isn't available in other classic repositories. This package is present in RHCS Tools repository so we also need to enable it on the mgr nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-04-06 10:11:00 -04:00
Guillaume Abrioux	1bb9860dfd	osd: use default crush rule name when needed When `rule_name` isn't set in `crush_rules` the osd pool creation will fail. This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with the default crush rule name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-31 14:49:38 -04:00
Guillaume Abrioux	8c1c34b201	tests: add more coverage in external_clients scenario Run create_users_keys.yml in external_clients scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-31 14:49:38 -04:00
Guillaume Abrioux	5b0476385c	osd: support changing default rule even when osd_crush_location isn't defined Creating crush rules even with no crush hierarchy configuration is a valid scenario so we shouldn't be bound to the first task result (which configure crush hierarchy) to be able to add new crush rules. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-31 14:26:48 -04:00
Dimitri Savineau	64701437de	container: remove ulimit nofile parameter Since Ceph Octopus is python3 only we don't need to specify the max open files anymore with the container engine. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-30 09:54:23 +02:00
Dimitri Savineau	4ac99223b2	rhcs: drop debian support Support for debian with RHCS has been dropped starting RHCS 4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-27 04:36:36 +01:00
Dimitri Savineau	90ad110861	rhcs: update release to 5 for octopus RHCS 5 will be based on Ceph Octopus release and only supported on RHEL 8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-26 22:00:08 +01:00
Guillaume Abrioux	e551b5ba1a	defaults: remove legacy comment This is no longer true, let's remove this comment given that this option is not ignored in containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-26 09:19:14 -04:00
Guillaume Abrioux	b7ada14cf5	defaults: change nfs_ganesha_stable_branch In master, even though we are using the dev repo, the value here should be closer to the last stable release. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-25 22:30:15 +01:00
Dimitri Savineau	706de944cf	ceph-defaults: update ceph_stable_redhat_distro Since octopus the ceph_stable_redhat_distro variable should be set to el8 instead of el7. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-25 21:00:24 +01:00
Dimitri Savineau	0487d21938	ceph-facts: fix rgw_instances_all fact The rgw_instances_all fact is supposed to be the list of all radosgw instances from all rgw nodes. But the fact is always using the local rgw_instances variable so this won't work on multiple nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-25 08:02:13 +01:00
Guillaume Abrioux	83fdf24caf	doc/tests: bump to ansible 2.9 on master Add testing against ansible 2.9 on master branch. This commit also updates the documentation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-25 08:01:27 +01:00
Guillaume Abrioux	1b0b7af119	osd: add a default value for 'default' in crush_rules Let's default to `False` for the `default` attribute in `crush_rules` variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797774 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-24 08:41:43 -04:00
Dimitri Savineau	df8f853c85	Add pacific release Add the 16th ceph release: pacific. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-24 09:47:12 +01:00
Guillaume Abrioux	1a7f3caecb	facts: fix typo This commit fixes a typo in some task titles Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-23 14:03:52 -04:00
Guillaume Abrioux	cc28d9ec26	nfs: fix nfs with external ceph cluster support This commit refactors and fixes the nfs deployment with external ceph cluster support. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-19 18:21:16 -04:00
Dimitri Savineau	fb69f6990c	dashboard: allow to set read-only admin user This commit allows one to set the role for the admin user as read-only. This can be controlled via the dashboard_admin_user_ro variable but the default value is false for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-19 15:34:41 +01:00
Dimitri Savineau	5051e67f8f	ceph-defaults: add registry name on dashboard vars We don't use the registry name when using the community dashboard container images (grafana, prometheus, alertmanager & node exporter). This commit adds the docker.io registry explicitly in the default dashboard container image name values. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-19 14:27:50 +01:00
Dimitri Savineau	b97a4d5201	ceph-defaults: update grafana container tag Since `8e8aa73` we're using grafana 5.4.3 in RHCS 4.1 via [1]. We should also update the grafana container tag from docker.io when using the community release. [1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-17 14:35:06 +01:00
petruha	73b3fadb0e	ceph-facts: Fix system_secret_key variable handling This commit fixes the system_secret_key variable not being substituted by the right value and always using the 'system_secret_key' string instead. $ egrep 'system_(access\|secret)_key' group_vars/all.yml system_access_key: foofoofoofoofoofoofo system_secret_key: barbarbarbarbarbarbarbarbarbarbarbarbarb $ ansible-playbook -vv -i hosts site.yml.sample -e rgw_multisite=true (...) - hostname: storage0 endpoint: http://192.168.100.42:8080 instance_name: rgw0 radosgw_address: 192.168.50.3 radosgw_frontend_port: 8085 rgw_realm: canada rgw_zone: montreal rgw_zone_user: justin.trudeau rgw_zone_user_display_name: Justin Trudeau rgw_zonegroup: quebec system_access_key: foofoofoofoofoofoofo system_secret_key: system_secret_key Fixes https://github.com/ceph/ceph-ansible/issues/5150 Signed-off-by: petruha <5363545+p37ruh4@users.noreply.github.com>	2020-03-16 17:38:52 -04:00
Guillaume Abrioux	152c2caa9f	config: remove legacy option in ceph.conf.j2 This option has been deprecated (As of 0.51). By the way, ceph-ansible already sets the auth_{service,client,cluster}_required variables. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-16 09:49:36 -04:00
Dimitri Savineau	3626c688cf	handler: add rgw multi-instances support This commit adds the rgw multi-instances support in ceph-handler (restart_rgw_daemons.sh.j2) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-12 16:44:48 -04:00
Guillaume Abrioux	60a2e28189	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-12 16:44:48 -04:00
Dimitri Savineau	e8bf0a0cf2	ceph-infra: open radosgw ports for multi instances When using the radosgw multi instances configuration then the firewall rules aren't adapted to that setup. We only open the port according to the radosgw_frontend_port variable so only the first radosgw instance port will be opened in the firewall configuration. We should instead iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-12 16:44:48 -04:00
Dimitri Savineau	e62532de46	update osd pool set size command Since [1] we can't use osd pool without replicas (size: 1) by default. We now need to set the mon_allow_pool_size_one flag to true in the ceph configuration and add the --yes-i-really-mean-it flag to the osd pool set size cli. [1] https://github.com/ceph/ceph/commit/21508bd Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-11 11:25:42 +01:00
Guillaume Abrioux	b3bbd6bb77	rgw: fix a typo in create_realm_zonegroup_zone_lists This commit fixes a typo. `s/realms/secondary_realms` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-10 14:13:30 +01:00
Guillaume Abrioux	b3d943fe9f	infra: add retries/until on firewalld start task This commit makes that task retry 5 times to start the firewalld service to avoid failures like the following: ``` TASK [ceph-infra : start firewalld] ****************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-centos-container-purge/roles/ceph-infra/tasks/configure_firewall.yml:22 Monday 09 March 2020 08:58:48 +0000 (0:00:00.963) 0:02:16.457 ******** fatal: [osd4]: FAILED! => changed=false msg: \|- Unable to enable service firewalld: Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service. Created symlink from /etc/systemd/system/multi-user.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service. Failed to execute operation: Connection reset by peer ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-09 15:01:34 -04:00
Christian Berendt	608d7188a1	openstack_keys: use openstack_cinder_pool.name Instead of volumes as a static string the openstack_cinder_pool.name variable should be used as with the other keys. Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>	2020-03-09 08:17:22 -04:00
Guillaume Abrioux	7a8a719e75	rgw: add retry/until on pools tasks Sometimes, these tasks can time out for some reason. Adding these retries can help to avoid unexpected failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-06 08:55:13 -05:00
Guillaume Abrioux	eac207091b	client: skip create_users_keys.yml when rolling_update There's no need to run this part of the role when upgrading clients node. Let's skip it when rolling_update.yml is being run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 13:06:32 -05:00
Ali Maredia	71f55bd54d	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove the multisite destroy playbook and add --cluster before radosgw-admin commands. Remove the manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in each rgw's hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-03-04 12:58:13 -05:00
Guillaume Abrioux	e17c79b871	osd: do not change pool size on erasure pool This commit adds condition in order to not try to customize pools size when its type is erasure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	47adc2bb08	osd: add pg autoscaler support This commit adds the pg autoscaler support. The structure for pool definition has now two additional attributes `pg_autoscale_mode` and `target_size_ratio`, eg: ``` test: name: "test" pg_num: "{{ osd_pool_default_pg_num }}" pgp_num: "{{ osd_pool_default_pg_num }}" rule_name: "replicated_rule" application: "rbd" type: 1 erasure_profile: "" expected_num_objects: "" size: "{{ osd_pool_default_size }}" min_size: "{{ osd_pool_default_min_size }}" pg_autoscale_mode: False target_size_ratio: 0.1 ``` when `pg_autoscale_mode` is `True` user has to set a decent value in `target_size_ratio`. Given that it's a new feature, it's still disabled by default. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1782253 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Guillaume Abrioux	bf1f125d71	osd: refact osd pool creation Currently, the command executed is wrong, eg: ``` cmd: - podman - exec - ceph-mon-controller-0 - ceph - --cluster - ceph - osd - pool - create - volumes - '32' - '32' - replicated_rule - '1' delta: '0:00:01.625525' end: '2020-02-27 16:41:05.232705' item: ``` From documentation, the osd pool creation command is : ``` ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \ [crush-rule-name] [expected-num-objects] ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \ [erasure-code-profile] [crush-rule-name] [expected_num_objects] ``` it means we pass '1' (from item.type) as value for `expected_num_objects` by default which is very likely not what we want. Also, this commit modifies the default value when no `rule_name` is set to use the existing variable `osd_pool_default_crush_rule` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808495 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-04 09:29:01 -05:00
Dimitri Savineau	be8b315102	ceph-validate: add key format validation If the user manually provides the key value for a specific keyring then there's no validation on the content, which could lead to unexpected failures in the ceph_key module. Closes: #5104 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-03 10:01:58 +01:00
Dimitri Savineau	9d3b49293d	purge: stop rgw instances by iteration It looks like the service module doesn't support wildcards anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found '''' in service name: ceph-radosgw@' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	90b1fc8fe9	ceph-infra: install firewalld python bindings When using the firewalld ansible module we need to be sure that the python bindings are installed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	45fb9241c0	ceph-infra: split firewalld tasks Since ansible 2.9 the firewalld task can no longer be used with service and source at the same time. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	aefba82a2e	Add ansible 2.9 support This commit adds ansible 2.9 support in addition to 2.8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Guillaume Abrioux	0326d992c2	osd: add journal option in ceph_volume call (batch) This commit adds the journal option to the ceph_volume call when scenario is lvm batch Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-28 17:29:59 -05:00
Guillaume Abrioux	a084a2a347	common: support OSDs with more than 2 digits When running an environment with OSDs whose IDs have more than 2 digits, some tasks don't match the system units and therefore the playbook can fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1805643 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-27 09:48:36 +01:00
Dimitri Savineau	44e750ee5d	ceph-rgw: increase connection timeout to 10 5s as a connection timeout could be too low in some setups. Let's increase it to 10s. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-24 16:01:36 +01:00
Francesco Pantano	15ed9eebf1	Configure ceph dashboard backend and dashboard_frontend_vip This change introduces a new set of tasks to configure the ceph dashboard backend and listen just on the mgr related subnet (and not on '*'). For the same reason the proper server address is added in both prometheus and alertmanager systemd units. This patch also adds the "dashboard_frontend_vip" parameter to make sure we're able to support the HA model when multiple grafana instances are deployed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792230 Signed-off-by: Francesco Pantano <fpantano@redhat.com>	2020-02-19 17:52:53 -05:00
Dimitri Savineau	ac0f68ccf0	ceph-dashboard: update create/get rgw user tasks Since [1] if a rgw user already exists then the radosgw-admin user create command will return an error instead of modifying the current user. We were already using separate tasks for the create and get operations, but only for multisite configuration, and that's not enough. Instead we should do the get task first and, depending on the result, execute the create. This commit also adds the missing run_once and delegate_to statements. [1] https://github.com/ceph/ceph/commit/269e9b9 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-18 10:22:21 +01:00
Sam Choraria	2a2656a985	ceph-rgw: allow SSL certificate content to be supplied Allow SSL certificate & key contents to be written to the path specified by radosgw_frontend_ssl_certificate. This permits a certificate to be deployed & renewal of expired certificates through ceph-ansible. Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk>	2020-02-17 16:22:11 +01:00
Dimitri Savineau	c644ea9041	ceph-defaults: remove bootstrap_dirs_xxx vars Both bootstrap_dirs_owner and bootstrap_dirs_group variables aren't used anymore in the code. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 16:17:40 +01:00
Ali Maredia	1834c1e48d	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 16:07:43 +01:00
Florian Faltermeier	9d081e2453	ceph-rgw-loadbalancer: Fix SSL newline issue The `ad7a5da` commit introduced a regression when using TLS on haproxy via the haproxy_frontend_ssl_certificate variable. This causes the "stats socket" and the "tune.ssl.default-dh-param" parameters to be on the same line, resulting in haproxy failing to start. [ALERT] 351/140240 (21388) : parsing [xxxxx] : 'stats socket' : unknown keyword 'tune.ssl.default-dh-param'. Registered [ALERT] 351/140240 (21388) : Fatal errors found in configuration. Fixes: #4869 Signed-off-by: Florian Faltermeier <florian.faltermeier@uibk.ac.at>	2020-02-17 16:05:42 +01:00
Dimitri Savineau	16e12bf2bb	rgw: don't create user on secondary zones The rgw user for the Ceph dashboard integration shouldn't be created on secondary rgw zones. Closes: #4707 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1794351 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 15:08:11 +01:00
John Fulton	e4bf4857f5	The _filtered_clients list should intersect with ansible_play_batch Client configuration with --limit fails without this patch because certain tasks are only done to the first host in the _filtered_clients list and it's likely that first host will not be included in what's specified with --limit. To fix this the _filtered_clients list should be built from all clients in the inventory that are also in the running play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798781 Signed-off-by: John Fulton <fulton@redhat.com>	2020-02-17 11:29:18 +01:00
Dimitri Savineau	6dd9b25565	ceph-iscsi: don't use ceph_dev_xxx variables Using ceph_dev_branch and ceph_dev_sha1 for configuring ceph-iscsi repositories from shaman doesn't make sense because the ceph devel branches and sha1 aren't compatible with ceph-iscsi devel. Instead we could rely on the master branch and the latest sha1. Currently it's not possible to use a custom ceph branch/sha1 value with iscsi setup, otherwise the repository setup will fail. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:56:52 +01:00
Dimitri Savineau	10951eeea8	ceph-nfs: fix ceph_nfs_ceph_user variable The ceph_nfs_ceph_user variable is a string for the ceph-nfs role but a list in ceph-client role. `6a6785b` introduced a confusion between both variable types in the ceph-nfs role for external ceph with ganesha. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1801319 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:56:05 +01:00
Dimitri Savineau	0a3e85e8ca	ceph-nfs: add nfs-ganesha-rados-urls package Since nfs-ganesha 2.8.3 the rados-urls library has been moved to a dedicated package. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:52:30 +01:00
Dimitri Savineau	1fc6b33714	ceph-{mon,osd}: move default crush variables Since `ed36a11` we moved the crush rules creation code from the ceph-mon to the ceph-osd role. To keep backward compatibility we kept the possibility to set the crush variables on the mons side but we didn't move the default values. As a result, when setting crush_rule_config to true and wanting to use the default values for crush_rules, the crush rule creation task will fail. "msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'crush_rules'" This patch moves the default crush variables from ceph-mon to ceph-osd role and also uses those default values when nothing is defined on the mons side. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798864 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:50:53 +01:00
Dimitri Savineau	15bd4cd189	ceph-grafana: fix grafana_{crt,key} condition The grafana_{crt,key} aren't boolean variables but strings. The default value is an empty string so we should do the conditional on the string length instead of the bool filter Closes: #5053 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:49:08 +01:00
Dimitri Savineau	b9d975385c	ceph-prometheus: add alertmanager HA config When using multiple alertmanager nodes (via the grafana-server group) then we need to specify the other peers in the configuration. https://prometheus.io/docs/alerting/alertmanager/#high-availability https://github.com/prometheus/alertmanager#high-availability Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792225 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 10:46:21 +01:00
Dimitri Savineau	5a03e0ee1c	containers: add KillMode=none to systemd templates Because we are relying on docker|podman for managing containers then we don't need systemd to manage the process (like kill). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-13 16:11:33 +01:00
Dimitri Savineau	c6e96699f7	dashboard: allow configuring multiple grafana hosts When using multiple grafana hosts then we set the grafana and prometheus URL and push the dashboard layout to a single node. grafana_server_addrs is the list of all grafana nodes and used during the ceph-dashboard role (on mgr/mon nodes). grafana_server_addr is the current grafana node used during the ceph-grafana and ceph-prometheus role (on grafana-server nodes). We don't have the grafana_server_addr fact duplication code between external vs collocated nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1784011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-10 11:18:45 -05:00
Guillaume Abrioux	3700aa5385	switch_to_containers: increase health check values This commit increases the default values for the following variables consumed in the switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. This also moves these variables into the `ceph-defaults` role so the user can set different values if needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783223 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-02-07 14:59:14 -05:00
Dimitri Savineau	298ba0bf03	ceph-facts: set devices osd_auto_discovery on OSDs We only need to set the devices fact with osd_auto_discovery on OSD nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-03 16:23:38 +01:00
Dimitri Savineau	ed461544a7	ceph-facts: remove is_podman fact This was used before the CentOS 8 requirement when using CentOS 7 atomic which has both docker and podman installed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-03 10:11:03 -05:00
Mike Christie	77f3b5d51b	iscsi: Fix crashes during rolling update During a rolling update we will run the ceph iscsigw tasks that start the daemons then run the configure_iscsi.yml tasks which can create iscsi objects like targets, disks, clients, etc. The problem is that once the daemons are started they will accept configuration requests, or may want to update the system themselves. Those operations can then conflict with the configure_iscsi.yml tasks that set up objects and we can end up in crashes due to the kernel being in an unsupported state. This could also happen during creation, but is less likely due to no objects being set up yet, so there are no watchers or users accessing the gws yet. The fix in this patch works for both update and initial setup. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1795806 Signed-off-by: Mike Christie <mchristi@redhat.com>	2020-01-31 11:15:36 -05:00
Dimitri Savineau	9b40a959b9	ceph-common: rhcs 4 repositories for rhel 7 RHCS 4 is available for both RHEL 7 and 8 so we should also enable the cdn repositories for that distribution. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796853 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-31 09:33:51 -05:00


2877 Commits (f7b7ba30d9680af3058afbd81290243f81bd3998)