ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	9817d29543	ceph-nfs: allow overriding NFS_CORE_PARAM We already have config override variables for existing blocks (like ganesha_ceph_export_overrides, ganesha_log_overrides, etc...) or a global one (ganesha_conf_overrides) but redefining the NFS_CORE_PARAM block in that variable will erase all previous values (currently only Bind_Addr). ganesha_core_param_overrides: | Enable_UDP = false; NFS_Port = 2050; Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1941775 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-19 18:22:14 +02:00
Guillaume Abrioux	72a0336c71	dashboard: remove "certificate is valid for" error When deploying dashboard with ssl certificates generated by ceph-ansible, we enforce the CN to 'ceph-dashboard' which can make applications such as alertmanager complain as follows: `err="Post https://mgr0:8443/api/prometheus_receiver: x509: certificate is valid for ceph-dashboard, not mgr0" context_err="context deadline exceeded"` The idea here is to add alternative names matching all mgr/mon instances in the certificate so this error won't appear in logs. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-07-07 09:38:34 -04:00
Guillaume Abrioux	f4f73b6197	dashboard: support dedicated network for the dashboard This introduces a new variable `dashboard_network` in order to support deploying the dashboard on a different subnet. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1927574 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-07-05 21:34:43 +02:00
Dimitri Savineau	1d56818658	prometheus: fix prometheus target url The prometheus service isn't binding on localhost. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1933560 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 17:20:02 +02:00
Dimitri Savineau	d704b05e52	ceph-facts: move device facts to its own file Instead of reusing the condition 'inventory_hostname in groups[osds]' on each device facts tasks then we can move all the tasks into a dedicated file and set the condition on the import_tasks statement. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	55bca07cb6	ceph-validate: check logical volumes We currently don't check if the logical volume used in lvm_volumes list for either bluestore data/db/wal or filestore data/journal exist. We're only doing this on raw devices for batch scenario. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	808e7106de	ceph-validate: check db/journal/wal devices too When using dedicated devices for db/journal/wal objectstore with ceph-volume lvm batch then we should also validate that those devices exist and don't use a gpt partition table in addition to the devices and lvm_volume.data variables. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	7e50380f7f	ceph-validate: use root device from ansible_mounts Instead of using findmnt command to find the device associated to the root mount point then we can use the ansible_mounts fact. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	0df99dda8d	ceph-validate: do not resolve devices This is already done in the ceph-facts role. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	14d458b3b4	ceph-validate: check block presence first Instead of doing two parted calls we can check first if the device exist and then test the partition table. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	ac0342b72e	ceph-validate: check devices from lvm_volumes `2888c08` introduced a regression as the check_devices tasks file was only included based on the devices variable. But that file also validates some devices from the lvm_volumes variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1906022 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-07-02 14:02:30 +02:00
Dimitri Savineau	9758e3c513	container: set tcmalloc value by default All ceph daemons need to have the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable set to 128MB by default in container setup. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970913 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-06-30 20:30:55 +02:00
Dimitri Savineau	a05730b38a	rhcs: remove ISO install method Starting RHCS 5, there's no ISO available anymore. This removes all ISO variables and the ceph_repository_type variable. Closes: #6626 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-06-30 18:03:03 +02:00
Boris Ranto	2491d4e004	dashboard: Add new prometheus alert It was requested for us to update our alerting definitions to include a slow OSD Ops health check. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1951664 Signed-off-by: Boris Ranto <branto@redhat.com>	2021-06-24 09:02:21 +02:00
Guillaume Abrioux	8279d14d32	multisite: fix bug during switch2containers When running the switch-to-containers playbook with multisite enabled, the fact "rgw_instances" is only set for the node being processed (serial: 1), the consequence of that is that the set_fact of 'rgw_instances_all' can't iterate over all rgw node in order to look up each 'rgw_instances_host'. Adding a condition checking whether hostvars[item]["rgw_instances_host"] is defined fixes this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967926 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-17 01:49:29 +02:00
Guillaume Abrioux	8dbee99882	nfs: do not copy client.bootstrap-rgw when using mds There's no need to copy this keyring when using nfs with mds Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-16 06:32:43 +02:00
Guillaume Abrioux	38bfad46e8	container: conditionally disable lvmetad Enabling lvmetad in containerized deployments on el7 based OS might cause issues. This commit makes it possible to disable this service if needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1955040 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-15 20:16:38 +02:00
Guillaume Abrioux	d58500ade0	ceph_key: handle error in a better way When calling the `ceph_key` module with `state: info`, if the ceph command called fails, the actual error is hidden by the module which makes it pretty difficult to troubleshoot. The current code always states that if rc is not equal to 0 the keyring doesn't exist. `state: info` should always return the actual rc, stdout and stderr. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1964889 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-14 23:46:20 +02:00
Guillaume Abrioux	f7166cccbf	rolling_update: fix mon+rgw/multisite collocation When monitors and rgw are collocated with multisite enabled, the rolling_update playbook fails because during the workflow, we run some radosgw-admin commands very early on the first mon even though this is the monitor being upgraded, it means the container doesn't exist since it was stopped. This block is relevant only for scaling out rgw daemons or initial deployment. In rolling_update workflow, it is not needed so let's skip it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1970232 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-11 10:50:50 +02:00
Neelaksh Singh	d18a9860cd	Sensitive key data now hidden in output log Fixes: #6529 Signed-off-by: Neelaksh Singh <neelaksh48@gmail.com>	2021-06-08 20:46:37 +02:00
Guillaume Abrioux	4daed1f137	dashboard: set cookie_secure in grafana When using grafana behind https `cookie_secure` should be set to `true`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1966880 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-06-04 14:01:28 +02:00
Guillaume Abrioux	664dae0564	prometheus: enforce osd nodes in templates When osd nodes are collocated in the clients group (HCI context for instance), the current logic will exclude osd nodes since they are present in the client group. The best fix would be to exclude clients node only when they are not member of another group but for now, as a workaround, we can enforce the addition of osd nodes to fix this specific case. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1947695 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-05-25 16:53:49 +02:00
Guillaume Abrioux	e6d8b058ba	nfs: get org.ganesha.nfsd.conf from container Since we need to revert `33bfb10`, this is an alternative to initial approach. We can avoid maintaining this file since it is present in container image. The idea is to simply get it from the image container and write it to the host. Fixes: #6501 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-05-07 13:35:37 +02:00
Dimitri Savineau	a670982a38	ceph-rgw: fix pg_autoscale_mode for pool The pg_autoscale_mode for rgw pools introduced in `9f03a52` was wrong and was missing a `value` keyword because `rgw_create_pools` is a dict. Fixes: #6516 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-05-06 10:15:13 +02:00
Guillaume Abrioux	8f87754b76	ceph-nfs: fix dev repo task We need to filter with the OS architecture in order to fetch the right dev repository in shaman Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-29 19:44:17 +02:00
Seena Fallah	41295f0ef6	ceph-osd: allow to use ceph_tcmalloc_max_total_thread_cache for bluestore TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is for both bluestore and filestore Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2021-04-28 20:03:46 +02:00
Dimitri Savineau	4e6b2a54d2	ceph-defaults: update multisite readme reference The multisite README file has been merged into a single file. Closes: #6411 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-04-15 19:44:38 +02:00
Francesco Pantano	441651638d	Config the monitoring stack components api urls using a VIP When dashboard_frontend_vip is provided, all the services should be configured using the related VIP. A new VIP variable is added for both prometheus and alertmanager: we're already able to properly config the grafana vip using dashboard_frontend_vip variable. This change adds the same variable for both prometheus and alertmanager. Signed-off-by: Francesco Pantano <fpantano@redhat.com>	2021-04-15 14:25:53 +02:00
Guillaume Abrioux	839fac8f94	core: bump ansible version We should consider bumping ansible version for future releases, so let's start testing against ansible 2.10 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-15 13:49:24 +02:00
Benoît Knecht	c078513475	ceph-rgw-loadbalancer: Fix rgw_ports fact The `set_fact rgw_ports` task was failing due to a templating error, because `hostvars[item].rgw_instances` is a list, but it was treated as if it was a dictionary. Another issue was the fact that the `unique` filter only applied to the list being appended to `rgw_ports` instead of the entire list, which means it was possible to have duplicate items. Lastly, `rgw_ports` would have been a list of integers, but the `seport` module expects a list of strings. This commit fixes all of the issues above, allowing the `ceph-rgw-loadbalancer` role to work on systems with SELinux enabled. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2021-04-15 10:39:08 +02:00
Guillaume Abrioux	bab403b603	container/systemd: ensure /var/log/ceph exists This adds an `ExecStartPre=-/usr/bin/mkdir -p /var/log/ceph` in all systemd service templates for all ceph daemons. This is specific to RHCS after a Leapp upgrade is done. Indeed, the `/var/log/ceph` seems to be removed after the upgrade. In order to work around this issue let's ensure the directory is present before trying to start the containers with podman. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1949489 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-14 16:37:33 +02:00
Guillaume Abrioux	b1e7e1ad0f	rbdmirror: add retries/until when configuring mirroring `configure_mirroring.yml` is called right after the daemon is started. Sometimes, it can happen the first task in `configure_mirroring.yml` is run while the daemon isn't yet ready, adding a retries/until on that task should help to avoid causing the playbook to fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944996 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-14 11:37:26 +02:00
Guillaume Abrioux	0772b3d28d	nfs: remove legacy task This fact is never used, let's remove the task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-12 14:43:19 +02:00
Guillaume Abrioux	d3d3d01528	nfs: rename two tasks set the name of those tasks accordingly with the fact name being set. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-12 14:43:19 +02:00
Guillaume Abrioux	70f19be367	docker2podman: skip some role imports from handler when running docker-to-podman playbook, there's no need to call `ceph-config` and `ceph-rgw` from the role `ceph-handler`. It can even have side effects when coming from a baremetal cluster that was previously migrated using the switch-to-containers playbook. Indeed it might complain about missing .target systemd unit since they are removed during that migration. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1944999 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-09 15:28:50 +02:00
Guillaume Abrioux	d0442d81b9	common: selinux tasks related refactor This moves some task from the `ceph-nfs` role in `ceph-common` since some of them are needed in `ceph-rgwloadbalancer` role. This avoids duplicated tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-02 15:23:05 +02:00
Guillaume Abrioux	6bbb90198b	rgw-loadbalancers: add all rgw_ports to http_port_t type This adds all rgw ports to the http_port_t selinux type so it allows haproxy to connect to those ports in order to avoid AVC. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-04-02 15:23:05 +02:00
kalebskeithley	9e7f22a071	rgw-loadbalancer: Update haproxy.cfg.j2 haproxy gets an AVC when configured to connect to port 8081 This commit adds a snippet regarding haproxy in a selinux environment Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1923890 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>	2021-04-02 15:23:05 +02:00
Dimitri Savineau	a0e1a450d3	container/registry: use password from stdin Pass the password variable via stdin for the registry login authentication. This allows to remove the no_log statement and see the task output without displaying the password value. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-04-01 21:07:37 +02:00
Guillaume Abrioux	2db2208e40	nfs: set idmap config for Ceph-NFS Currently NFS Ganesha (ceph-nfs) consumes /etc/idmapd.conf, which controls mapping of user/owner identities under NFSv4+. With containerized service deployment, this file is an immutable part of the container image and cannot be modified. Here we provide group variables, and a task and templates for the ceph-nfs role, to set the path of the idmap configuration file and to make the most common adjustment to the contents of that file -- namely to set the 'Domain'. We default the path to /etc/ganesha/idmap.conf so that we will not conflict with /etc/idmapd.conf on the controller nodes where ganesha runs. NFSv4 clients, as used for example by the Cinder NFS driver, consume /etc/idmapd.conf and may require different settings than what is wanted for NFS Ganesha. Additionally, because we already bind /etc/ganesha from the host into the ceph-nfs container, the file NFS Ganesha consumes will no longer be an immutable part of the container. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925646 Signed-off-by: Tom Barron tpb@dyncloud.net Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-31 21:52:07 +02:00
Guillaume Abrioux	b60c61ce45	dashboard: support prometheus storage.tsdb.retention.time parameter This commit adds the parameter `--storage.tsdb.retention.time` to the prometheus systemd unit template. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1928000 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-31 21:51:35 +02:00
Guillaume Abrioux	9f03a527ba	rgw: supports pg_autoscale_mode option for pool creation Support enabling/disabling the pg autoscaler for rgw pools. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-31 13:10:28 +02:00
Guillaume Abrioux	c5728bdc63	defaults: add a comment about `igw_network` This adds brief documentation in ceph-defaults about `igw_network` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-29 11:12:04 +02:00
Dimitri Savineau	2e1b6299b8	common,iscsi: don't use the shaman search endpoint In commits `39649f0` and `bf8cdad` we switched from using the shaman /repos endpoint to the /search endpoint for using the architecture filter. In fact that filter is also available with the /repos endpoint, which requires fewer ansible tasks. This also adds back a condition removed in `5801171` on the ceph-iscsi repository and that repository doesn't need to filter on the architecture because the ceph-iscsi project is noarch. Both ceph-iscsi and tcmu-runner shaman URLs were using the ceph_dev_branch and ceph_dev_sha1 variables which doesn't make sense. Those variables are only useful for the ceph core repository. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-03-27 01:55:04 +01:00
Guillaume Abrioux	c33de174f1	dashboard: support igw nodes with dedicated subnet This adds the possibility to deploy the dashboard with igw nodes using a dedicated subnet. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1926170 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-26 16:22:58 +01:00
VasishtaShastry	006998e804	Peer addition won't be skipped if remote is not in peer rbd-mirroring is not configured as adding peer is getting skipped. Peer addition should not get skipped if it's not added already Closes - https://bugzilla.redhat.com/show_bug.cgi?id=1942444 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>	2021-03-26 15:18:21 +01:00
Guillaume Abrioux	0163ecc924	convert some missed `ansible_*` calls to `ansible_facts['']` This converts some calls to `ansible_*` that were missed in initial PR #6312 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-25 15:19:13 +01:00
Alex Schultz	db031a4993	Disable facts by default in ansible.cfg As a continuation of `a7f2fa73e6`, this change switches fact injection to off by default in the provided ansible.cfg. Signed-off-by: Alex Schultz <aschultz@redhat.com>	2021-03-24 13:44:33 +01:00
Guillaume Abrioux	5801171b37	iscsi: fetch right repo from shaman due to recent changes in shaman, we must fetch the right repo by filtering on the desired architecture. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-23 19:58:27 +01:00
Brad Hubbard	bf8cdad937	Make sure the repo url contains the correct arch We can end up with an arm only repo unless we are specific about the architecture we require. Brings the deb code in line with the rpm equivalent. Signed-off-by: Brad Hubbard <bhubbard@redhat.com>	2021-03-22 09:39:48 +01:00
Guillaume Abrioux	ccd1cbb732	facts: fix nfs/external cluster scenario These tasks shouldn't be run when at least 1 monitor isn't present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1937997 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-17 16:05:48 +01:00
Guillaume Abrioux	b27398163a	validate: followup on `98e32b9` update the message accordingly to the check updated in commit `98e32b92f3` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-17 09:46:05 +01:00
Guillaume Abrioux	a112572734	clients: build filtered clients group early when the group `_filtered_clients` is built, the order can change from the original `clients` group which can cause issues since we run `ceph-container-engine` on the first client only. It means later in the playbook we can make call to the container CLI on a node where the container engine wasn't installed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-16 19:38:04 +01:00
Guillaume Abrioux	98e32b92f3	validate: update `ceph_repository_community` check this updates the `ceph_repository_community` check in `ceph-validate` with the right ceph release expected. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-16 19:38:04 +01:00
Guillaume Abrioux	6c6939104d	nfs: bump nfs-ganesha version This commit updates the default version of nfs-ganesha to V3.5 which is the latest version available upstream. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-16 19:38:04 +01:00
Guillaume Abrioux	31a0f2653d	config: reset num_osds When collocating OSDs with other daemon, `num_osds` is incorrectly calculated because `ceph-config` is called multiple times. Indeed, the following code: ``` num_osds: "{{ lvm_list.stdout | default('{}') | from_json | length | int + num_osds | default(0) | int }}" ``` makes `num_osds` be incremented each time `ceph-config` is called. We have to reset it in order to get the correct number of expected OSDs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-15 10:41:28 +01:00
Dimitri Savineau	5b86ac8801	library: add realm pull to radosgw_realm module This adds the realm pull operation to the current radosgw_realm module. The pull operation requires the url, access/secret key variables. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-03-12 18:21:37 +01:00
Benoît Knecht	2437f14581	ceph-mon: Fix check mode for deploy monitor tasks Skip the `get initial keyring when it already exists` task when both commands whose `stdout` output it requires have been skipped (e.g. when running in check mode). Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2021-03-12 18:19:46 +01:00
Matthew Vernon	88d119e95a	ceph-osd: add prepare_osd tag to lvm-batch scenario Sometimes it's useful to be able to skip the OSD creation step when running ceph-ansible (cf #1777). The lvm scenario has a prepare_osd tag on the relevant play. This commit adds the same tag to the lvm-batch scenario. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>	2021-03-11 22:02:52 +01:00
Alex Schultz	a7f2fa73e6	Use ansible_facts It has come to our attention that using ansible_* vars that are populated with INJECT_FACTS_AS_VARS=True is not very performant. In order to be able to support setting that to off, we need to update the references to use ansible_facts[<thing>] instead of ansible_<thing>. Related: ansible#73654 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406 Signed-off-by: Alex Schultz <aschultz@redhat.com>	2021-03-08 20:54:02 +01:00
Matthew Vernon	847611048e	Fix typo and broken link for documenting RGW frontends http://docs.ceph.com/docs/nautilus/radosgw/frontends/ 404s so replace it with a working "latest" docs link, and correct the spelling of "additional" while I'm at it. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>	2021-03-02 18:31:38 +01:00
Guillaume Abrioux	f143b1a647	dashboard: add missing parameter in `ceph_cmd` the `ceph_cmd` fact is missing the `--net=host` parameter. Some tasks consuming this fact can fail like following: ``` Error: error configuring network namespace for container b8ec913db1fb694ae683faf202680de7a59c714a004e533aba87e8503d29261f: Missing CNI default network ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1931365 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-03-02 07:34:25 +01:00
Dimitri Savineau	4047d02ee6	Add quincy release Add the 17th ceph release: quincy. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-12 10:02:08 +01:00
Dimitri Savineau	e4dd0067c6	ceph-common: enable rhcs tools repo for monitoring The monitoring node running grafana needs the rhcs tools repository enabled in non containerized deployment to be able to install the ceph-grafana-dashboards rpm package. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1918650 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-10 08:17:11 +01:00
Guillaume Abrioux	931b87e830	rgw: fix a typo in multisite if `rgw_zonegroupmaster` is not defined at the rgw instance level in `rgw_instances` it will fallback to a wrong variable (`rgw_zonemaster`). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1925247 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-10 07:01:21 +01:00
Guillaume Abrioux	39649f0ce8	common: ensure shaman returns right repo Due to recent changes in shaman, there's a chance it returns the wrong repository from architecture point of view. We can query shaman and ask for the correct architecture to get around this. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Guillaume Abrioux	c1f627c465	validate: fix a typo fixes a typo Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-02-06 00:34:14 +01:00
Dimitri Savineau	b1f37c4b3d	ceph-defaults: use https for download.ceph.com There's no reason to still use http on download.ceph.com instead of https. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-03 07:14:37 +01:00
Dimitri Savineau	7208a39e57	ceph-facts: set rgw_instances_all fact once There's no need to set the rgw_instances_all fact for each node. We can rely on run_once for that one. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-02-01 13:49:12 +01:00
Dimitri Savineau	3749d297c7	ceph-mon: add ExecStartPre docker stop to systemd We already do that in the other systemd templates (mgr, mds, etc..) and it would prevent having to add a workaround in other orchestration tools. This change is for containerized deployment only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1882724 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-29 09:03:34 +01:00
Guillaume Abrioux	8617081664	rgw: avoid useless call to ceph-rgw since `ceph-rgw` may be called from `ceph-handler` in some contexts we should avoid rerunning it unnecessarily. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-28 14:37:14 -05:00
Guillaume Abrioux	71a5e666e3	rgw: multisite refact Add the possibility to deploy rgw multisite configuration with a mix of secondary and primary zones on the same rgw node. Before that, on the same node, all instances were either primary zones OR secondary. Now you can define a rgw instance as follows: ``` rgw_instances: - instance_name: 'rgw0' rgw_zonemaster: false rgw_zonesecondary: true rgw_zonegroupmaster: false rgw_realm: 'france' rgw_zonegroup: 'zonegroup-france' rgw_zone: paris-00 radosgw_address: "{{ _radosgw_address }}" radosgw_frontend_port: 8080 rgw_zone_user: jacques.chirac rgw_zone_user_display_name: "Jacques Chirac" system_access_key: P9Eb6S8XNyo4dtZZUUMy system_secret_key: qqHCUtfdNnpHq3PZRHW5un9l0bEBM812Uhow0XfB endpoint: http://192.168.101.12:8080 ``` Basically it's now possible to define `rgw_zonemaster`, `rgw_zonesecondary` and `rgw_zonegroupmaster` at the instance level instead of the whole node level. Also, this commit adds an option `deploy_secondary_zones` (default True) which can be set to `False` in order to explicitly ask the playbook to not deploy secondary zones in case where the corresponding endpoint are not deployed yet. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915478 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-27 15:46:43 +01:00
Dimitri Savineau	bbcad9609c	grafana: update container tag to 6.7.4 This updates the grafana container tag to 6.7.4. The RHCS version is now based on the RHCS 5 container image which is also based on 6.7.4. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-27 15:08:31 +01:00
Dimitri Savineau	7d56771975	ceph-defaults: change default ceph container tag The "latest" ceph container tag references the latest stable release (octopus at the moment). "latest" is an alias on "latest-octopus". On the devel branch we should use "latest-master" tag instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-22 21:12:34 +01:00
Guillaume Abrioux	4af0845702	mon: fix cephx disabled deployment Due to missing condition on `cephx` variable, cephx disabled deployments are broken. This commit fixes this. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1910151 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-18 11:30:02 -05:00
Guillaume Abrioux	ae196bf946	validate: check virtual_ips variable This commit checks the length of `virtual_ips` doesn't exceed the length of `groups[rgwloadbalancer_group_name]`. It also ensures this variable is defined when `groups[rgwloadbalancer_group_name]` contains at least one node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-12 11:03:12 +01:00
Benoît Knecht	3116f46422	ceph-rgw-loadbalancer: Fix keepalived master selection While `2ca33641` fixed a bug in the way the `keepalived.conf.j2` template matched hostnames to set the VRRP `MASTER`/`BACKUP` states, it also introduced a regression in the case where `virtual_ips` is a list of more than one IP address. The previous behavior would result in each host in the `rgwloadbalancers` group to be `MASTER` for one of the `virtual_ips`, but the new behavior caused the first host to be `MASTER` for all the IP addresses in `virtual_ips`. This commit restores the original behavior. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2021-01-12 11:03:12 +01:00
Dimitri Savineau	3f64ced36b	ceph-osd: replace sysctl command task by slurp Instead of using the command module for retrieving a sysctl value then we can use the slurp module and read the value directly from /proc. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2021-01-11 13:24:23 +01:00
Guillaume Abrioux	ef975ef5ea	dashboard: configure passwords via stdin Due to recent changes in ceph, the few dashboard passwords must be passed via `-i` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-07 17:09:46 -05:00
Mike Currin	4cbc9a48c9	Path for ceph config missing in crash template The path where ceph.conf is located (/etc/ceph) is missing from the Docker container bind mounts, which throws errors Signed-off-by: Mike Currin <currin@gmail.com>	2021-01-06 16:50:18 +01:00
Guillaume Abrioux	513c8cfe55	rgw: support switching from single-site to multisite When collocating rgw with either a mon, mgr or osd, switching from single site to a multisite rgw setup failed because of the handlers triggered between the ansible play of the collocated daemon and the play of the rgw. Since the multisite changes are not yet applied the handlers fail. The idea here is to ensure we run the multisite configuration from the ceph-handler role before the restart happens, this way it won't complain because of non existing multisite configuration. (Note: this is also valid when simply changing a multisite configuration) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1888630 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2021-01-06 09:58:45 -05:00
Fabien Brachere	4026ba9da1	library: add missing `target_size_ratio` parameter support in ceph_pool module When creating a new pool, target_size_ratio was ignored by ansible module ceph_pool.py. target_size_ratio is now used when pg_autoscale_mode is on. Tests added to library tests. This also adds its use to the ceph-rgw role. Signed-off-by: Fabien Brachere <fabien.brachere@celeste.fr>	2020-12-16 15:10:27 +01:00
Dimitri Savineau	827b23353f	ceph-config: fix ceph-volume lvm batch report Since the major ceph-volume lvm batch refactoring, the report value is different. Before the refact, the report was a dict with the OSDs list to be created under the "osds" key. After the refact, the report is a list of dict. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-15 21:19:04 +01:00
Karl-Heinz Preuß	6ce34ef59f	fix broken ceph-fetch-keys role set fetch_directory variable in default/main.yml instead of using the defaults jinja filter in tasks/main.yml. Fixes: #6072 Signed-off-by: Karl-Heinz Preuß <karl-heinz.preuss@cms.hu-berlin.de>	2020-12-14 17:36:17 +01:00
Seena Fallah	5e9444fa5c	ceph-osd: use global crush_device_class in lvm_volumes Use global crush_device_class variable if it's not set per OSD Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-12-12 06:56:53 +01:00
Dimitri Savineau	aa6e1f20ea	Revert "config: Always use osd_memory_target if set" This reverts commit `4d1fdd2b05`. This breaks the backward compatibility with previous osd_memory_target calculation and we could have a value lower than the minimum value allowed (896M) which causes some ceph commands to fail (like ceph assimilate-conf). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-12 06:56:32 +01:00
Dimitri Savineau	5a41026347	monitoring: use config_template module for config The alertmanager, grafana and prometheus configuration files are generated with the template module which doesn't allow for using config overrides. Instead we could use the config_template plugin action and add a new variable for overrides (one for each component). With this patch, one should be able to add configuration to prometheus with the following: --- alertmanager_conf_overrides: global: smtp_smarthost: 'localhost:25' ... Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902999 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-12 06:55:27 +01:00
Dimitri Savineau	d82249a8c0	ceph-rgw: add cluster parameter on ceph_ec_profile `81233dd` introduced a regression with the ceph_ec_profile module call in the ceph-rgw role due the missing cluster module parameter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-12 06:54:46 +01:00
Dimitri Savineau	2aeab882f3	ceph-facts: fix grafana group conversion The conversion fact task was only executed when the grafana_server_group_name variable was explicitly set in the user configuration. If a user was using the default value then the conversion wasn't executed. This also adds back the default grafana_server_group_name value in case the user was using the default value and to avoid an undefined variable error. Instead of hardcoding the "monitoring" group name, we can reuse the monitoring_group_name variable. There's no need to override the monitoring_group_name variable, it's either using the default value or the one defined by the user. Finally removing the delegate_to statement on the add_host task since it's always executed on the ansible controller. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1903732 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-10 16:51:16 +01:00
Jukka Nousiainen	eb7473491b	ceph-mon: No become during gen mon initial keyring Since the backing generate_secret() just hands out urandom output, running as privileged doesn't seem to be required. It's not desirable to provide sudo in some Ansible runner environments. Signed-off-by: Jukka Nousiainen <jukka.nousiainen@csc.fi>	2020-12-03 10:04:21 +01:00
Guillaume Abrioux	86a8889ee3	common: do not use pipefail when not needed Let's discard the ansible lint error 306 and add a "# noqa 306" on tasks where we don't need `set -o pipefail` Fixes: #6090 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-12-01 15:07:09 -05:00
Dimitri Savineau	cf7345f143	consume ceph_volume module when possible We should always use the ceph_volume ansible module when possible. This patch replaces the ceph-volume inventory and lvm {list,zap} commands called via the command/shell modules by the corresponding call with the ceph_volume module. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-01 17:54:10 +01:00
Dimitri Savineau	2e417ab901	library: add ceph_crush_rule module This adds ceph_crush_rule ansible module for replacing the command module usage with the ceph osd crush rule commands. This module can manage both erasure and replicated crush rules. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-01 17:52:41 +01:00
Guillaume Abrioux	5c4ae5356d	osd: add tag on 'wait for all osd to be up' task This allows skipping this task if really desired. Use it carefully. Use it at your own risk. Fixes: #6073 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-12-01 11:00:25 +01:00
Dimitri Savineau	1831b4955f	ceph-client: use group_by instead of add_host Instead of iterating over all client nodes sequentially with a loop, we can use the group_by ansible builtin. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-01 10:58:48 +01:00
Dimitri Savineau	5da593604a	library: add ceph_osd_flag module This adds ceph_osd_flag ansible module for replacing the command module usage with the ceph osd set/unset commands. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-12-01 10:29:11 +01:00
Guillaume Abrioux	d40dd764e0	iscsigw: remove `--cap-add=all` from `podman run` cmd As of podman `2.0.5`, `--cap-add` and `--privileged` are exclusive options. ``` Nov 30 13:56:30 magna089 podman[171677]: Error: invalid config provided: CapAdd and privileged are mutually exclusive options ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902149 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-30 12:24:11 -05:00
Guillaume Abrioux	c68b124ba8	container: remove `--ignore` from `podman rm` command As of podman 2.0.5, `--ignore` param conflicts with `--storage`. ``` Nov 30 13:53:10 magna089 podman[164443]: Error: --storage conflicts with --volumes, --all, --latest, --ignore and --cidfile ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-30 12:24:11 -05:00
Dimitri Savineau	eaf0ebfc85	library: add ceph_mgr_module module This adds ceph_mgr_module ansible module for replacing the command module usage with the ceph mgr module enable/disable commands. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-30 16:52:02 +01:00
Dimitri Savineau	eb452d35bc	alertmanager/prometheus: fix owner/group Set the owner/group on alertmanager and prometheus directories and files to nobody and nogroup (uid and gid 65534) to avoid permission issues. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1901543 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-27 14:02:42 +01:00
Guillaume Abrioux	970c6a4ee6	mon: refact initial keyring generation adding monitor is no longer possible because we generate a new mon keyring each time the playbook is run. Fixes: #5864 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-25 09:34:44 +01:00
Guillaume Abrioux	5ff2ca270f	mon: replace `command` task by `copy` We can achieve this task using `copy` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-25 09:34:44 +01:00
Dimitri Savineau	40a87c4b92	ceph-iscsi: set the pool name in the config file When using a custom pool for iSCSI gateway then we need to set the pool name in the configuration otherwise the default rbd pool name will be used. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-24 20:41:54 +01:00
Guillaume Abrioux	81233dd963	rgw: call `ceph_ec_profile` when needed Let's replace `command` tasks with `ceph_ec_profile` calls Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-24 10:38:28 +01:00
Guillaume Abrioux	873fc8ec0f	osd: ensure /var/lib/ceph/osd/{cluster}-{id} is present This commit ensures that the `/var/lib/ceph/osd/{{ cluster }}-{{ osd_id }}` is present before starting OSDs. This is needed specifically when redeploying an OSD in case of OS upgrade failure. Since ceph data are still present on its devices then the node can be redeployed, however those directories aren't present since they are initially created by ceph-volume. We could recreate them manually but for better user experience we can ask ceph-ansible to recreate them. NOTE: this only works for OSDs that were deployed with ceph-volume. ceph-disk deployed OSDs would have to get those directories recreated manually. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1898486 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-19 09:20:28 +01:00
Dimitri Savineau	e150df789e	ceph-facts: fix read osd pool default crush fact We don't need to use run_once on that task when having running monitors otherwise the read task could be skipped and the set task will fail. The conditional check 'crush_rule_variable.rc == 0' failed. The error was: error while evaluating conditional (crush_rule_variable.rc == 0): 'dict object' has no attribute 'rc' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1898856 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-18 12:55:43 -05:00
Guillaume Abrioux	f5ba6d9b01	containers: modify bindmount option This commit changes the bind mount option for the mount point `/var/lib/ceph` in the systemd template for mon and mgr containers. This is needed in case of collocating mon/mgr with osds using dmcrypt scenario. Once mon/mgr got converted to containers, the dmcrypt layer sub mount is still seen in `/var/lib/ceph`. For some reason it makes the corresponding devices busy so any other container can't open/close it. As a result, it prevents osds from starting properly. Since it only happens on the nodes converted before the OSD play, the idea is to bind mount `/var/lib/ceph` on mon and mgr with the `rshared` option so once the sub mount is unmounted, it is propagated inside the container so it doesn't see that mount point. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896392 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-17 09:19:23 -05:00
Guillaume Abrioux	5ba7824c55	container: force rm --storage on ExecStartPre This is a workaround to avoid error like following: ``` Error: error creating container storage: the container name "ceph-mgr-magna022" is already in use by "4a5f674e113f837a0cc561dea5d2cd55d16ca159a647b7794ab06c4c276ef701" ``` that doesn't seem to be 100% reproducible but it shows up after a reboot. The only workaround we came up with at the moment is to run `podman rm --storage <container>` before starting it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1887716 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-16 10:38:40 -05:00
Benoît Knecht	c5f7343a2f	ceph-facts: Fix osd_pool_default_crush_rule fact The `osd_pool_default_crush_rule` is set based on `crush_rule_variable`, which is the output of a `grep` command. However, two consecutive tasks can set that variable, and if the second task is skipped, it still overwrites the `crush_rule_variable`, leading the `osd_pool_default_crush_rule` to be set to `ceph_osd_pool_default_crush_rule` instead of the output of the first task. This commit ensures that the fact is set right after the `crush_rule_variable` is assigned, before it can be overwritten. Closes #5912 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-11-13 09:36:49 +01:00
Gaudenz Steinlin	4d1fdd2b05	config: Always use osd_memory_target if set The osd_memory_target variable was only used if it was higher than the calculated value based on the number of OSDs. This is changed to always use the value if it is set in the configuration. This allows this value to be intentionally set lower so that it does not have to be changed when more OSDs are added later. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-11-13 09:13:58 +01:00
Guillaume Abrioux	5cadfea42e	dashboard: change dashboard_grafana_api_no_ssl_verify default value This sets the `dashboard_grafana_api_no_ssl_verify` default value according to the length of `dashboard_crt` and `dashboard_key`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-04 10:00:48 +01:00
Guillaume Abrioux	767d3c898e	dashboard: enable https by default see linked bz for details Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1889426 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-04 10:00:48 +01:00
Gaudenz Steinlin	15044da030	osd: Fix number of OSD calculation If some OSDs are to be created and others already exist the calculation only counted the to be created OSDs. This changes the calculation to take all OSDs into account. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-11-03 14:33:35 +01:00
Dimitri Savineau	3f9081931f	rgw/rbdmirror: use service dump instead of ceph -s The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the rgw/rbdmirror services status, we're only using the servicemap structure in the ceph status output. To optimize this, we could use the ceph service dump command which contains the same needed information. This command returns less information and is slightly faster than the ceph status command. $ ceph status -f json \| wc -c 2001 $ ceph service dump -f json \| wc -c 1105 $ time ceph status -f json > /dev/null real 0m0.557s user 0m0.516s sys 0m0.040s $ time ceph service dump -f json > /dev/null real 0m0.454s user 0m0.434s sys 0m0.020s Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-03 09:05:33 +01:00
Dimitri Savineau	88f91d8c12	monitor: use quorum_status instead of ceph status The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the quorum status, we're only using the quorum_names structure in the ceph status output. To optimize this, we could use the ceph quorum_status command which contains the same needed information. This command returns less information. $ ceph status -f json \| wc -c 2001 $ ceph quorum_status -f json \| wc -c 957 $ time ceph status -f json > /dev/null real 0m0.577s user 0m0.538s sys 0m0.029s $ time ceph quorum_status -f json > /dev/null real 0m0.544s user 0m0.527s sys 0m0.016s Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-03 09:05:33 +01:00
wangxiaotong	b9cb0f12e9	osds: use ceph osd stat instead of ceph status Improve the way OSD creation is checked. This replaces the ceph status command by the ceph osd stat command. The osdmap structure isn't needed anymore. $ ceph status -f json \| wc -c 2001 $ ceph osd stat -f json \| wc -c 132 $ time ceph status -f json > /dev/null real 0m0.563s user 0m0.526s sys 0m0.036s $ time ceph osd stat -f json > /dev/null real 0m0.457s user 0m0.411s sys 0m0.045s Signed-off-by: wangxiaotong <wangxiaotong@fiberhome.com>	2020-11-03 09:05:33 +01:00
Guillaume Abrioux	371d854a5c	common: follow up on #5948 In addition to `f7e2b2c608` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-11-02 20:16:36 -05:00
Benoît Knecht	0d76826bbb	ceph-mon: Don't set monitor directory mode recursively After rolling updates performed with `infrastructure-playbooks/rolling_updates.yml`, files located in `/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` had mode 0755 (including the keyring), making them world-readable. This commit separates the task that configured permissions recursively on `/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` into two separate tasks: 1. Set the ownership and mode of the directory itself; 2. Recursively set ownership in the directory, but don't modify the mode. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-11-02 17:36:37 +01:00
Dimitri Savineau	b02589ad50	keyring: use ceph_key module for get-or-create cmd Instead of using ceph auth get-or-create command via the ansible command module then we can use the ceph_key module. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 17:17:29 +01:00
Dimitri Savineau	59ecddcdd0	keyring: use ceph_key module for auth get command Instead of using ceph auth get command via the ansible command module then we can use the ceph_key module and the info state. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 17:17:29 +01:00
Gaudenz Steinlin	79ff79c422	openstack: use ceph_keyring_permissions by default Otherwise this task fails if no permission is set on the item. Previously the code omitted the mode parameter if it was not set, but this was lost with commit `ab370b6ad8`. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-11-02 15:53:58 +01:00
Dimitri Savineau	16cd183b9c	podman: force log driver to journald Since we've changed to podman configuration using the detach mode and systemd type to forking then the container logs aren't present in the journald anymore. The default conmon log driver is using k8s-file. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 15:49:27 +01:00
Dimitri Savineau	cdb7b09cd7	ceph-handler: fix curl ipv6 command with rgw When using the curl command with ipv6 address and brackets then we need to use the -g option otherwise the command fails. $ curl http://[fdc2:328:750b:6983::6]:8080 curl: (3) [globbing] error: bad range specification after pos 9 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-02 15:45:51 +01:00
Guillaume Abrioux	a822f77300	iscsi: fix ownership on iscsi-gateway.cfg This file is currently deployed with '0644' ownership making this file readable by any user on the system. Since it contains sensitive information it should be readable by the owner only. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890119 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 16:10:48 +02:00
Guillaume Abrioux	1cc9666c09	common: drop `fetch_directory` feature This commit drops the `fetch_directory` feature. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 13:22:16 +02:00
Guillaume Abrioux	900c0f4492	ceph-config: ceph.conf rendering refactor This commit cleans up the `main.yml` task file of `ceph-config`. It drops the local ceph.conf generation. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 13:22:16 +02:00
Guillaume Abrioux	a8bd947c7d	crash: refact caps definition there is no need to use `{{ }}` syntax here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-19 18:53:54 -04:00
Benoît Knecht	8b0023cb77	ceph-osd: Fix check mode for start osds tasks Correctly set `osd_ids_non_container.stdout_lines` to an empty list if it's undefined (i.e. in check mode). Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-19 20:22:08 +02:00
Benoît Knecht	8f436ab5d8	ceph-mon: Fix check mode for deploy monitor tasks Skip the `get initial keyring when it already exists` task when both commands whose `stdout` output it requires have been skipped (e.g. when running in check mode). Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-19 20:22:08 +02:00
Gaudenz Steinlin	68cc93fb18	ceph-crash: Only deploy key to targeted hosts The current task installs the ceph-crash key to "most" hosts via "delegate_to". This key is only used by the ceph-crash daemon and should just be installed on all hosts targeted by this role. There is no need for using a delegated task. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>	2020-10-19 16:54:06 +02:00
Guillaume Abrioux	59d0f01992	ceph-osd: start osd after systemd overrides The service should be started after the ceph-osd systemd overrides have been added, otherwise the latter aren't taken into account. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860739 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-15 09:19:56 +02:00
Dimitri Savineau	9252b75173	container: remove container_binding_name variable The container_binding_name package was only mandatory when we were using the docker modules (docker_image and docker_container) but since we manage both docker and podman containers without using the dedicated module then we can remove it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-14 10:32:31 +02:00
Dimitri Savineau	4eaa65c362	ceph-osd: don't start the OSD services twice Using the + operation on two lists doesn't filter out the duplicate keys. Currently each OSD is started (via systemd) twice. Instead we could use the union filter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-14 10:30:39 +02:00
Guillaume Abrioux	46d4d97da9	handler: refact check_socket_non_container the `stat --printf=%n` returns something like following: ``` ok: [osd0] => changed=false cmd: \|- stat --printf=%n /var/run/ceph/ceph-osd*.asok delta: '0:00:00.009388' end: '2020-10-06 06:18:28.109500' failed_when_result: false rc: 0 start: '2020-10-06 06:18:28.100112' stderr: '' stderr_lines: <omitted> stdout: /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok stdout_lines: <omitted> ``` it makes the next task "check if the ceph osd socket is in-use" grep like this: ``` ok: [osd0] => changed=false cmd: - grep - -q - /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok - /proc/net/unix ``` which will obviously fail because this path never exists. It makes the OSD handler broken. Let's use `find` module instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-08 17:37:50 -04:00
Benoît Knecht	54ba38e35e	Fix Ansible check mode for site.yml.sample playbook Make sure the `site.yml.sample` playbook can be run in check mode by skipping tasks that try to read the output of commands that have been skipped. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-10-07 00:29:44 +02:00
Dimitri Savineau	1281e8bcc8	library: add radosgw_zone module This adds radosgw_zone ansible module for replacing the command module usage with the radosgw-admin zone command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	65dbe0782e	library: add radosgw_zonegroup module This adds radosgw_zonegroup ansible module for replacing the command module usage with the radosgw-admin zonegroup command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	d171f4068d	library: add radosgw_realm module This adds radosgw_realm ansible module for replacing the command module usage with the radosgw-admin realm command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	235c7e27cc	library: add radosgw_user module This adds radosgw_user ansible module for replacing the command module usage with the radosgw-admin user command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 10:07:58 +02:00
Dimitri Savineau	bd611a785b	library: add ceph_fs module This adds the ceph_fs ansible module for replacing the command module usage with the ceph fs command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 08:02:58 +02:00
Dimitri Savineau	c960362639	ceph_key: remove backward compatibility It's time to remove this backward compatibility. Users had enough time to convert their openstack_keys and key values. We now fail in ceph-validate if the caps key isn't set. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-10-06 07:59:38 +02:00
Guillaume Abrioux	a802fa2810	rgw: fix multi instances scaleout in baremetal When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook in baremetal deployments. When ceph-osd notifies handlers, it means rgw handlers are triggered too. The issue with this is that they are triggered before the role ceph-rgw is run. In the case a scaleout operation is expected on `radosgw_num_instances` it causes an issue because keyrings haven't been created yet so the new instances won't start. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881313 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-06 07:38:44 +02:00
Guillaume Abrioux	ff95fa9c32	ceph-osd: refact `docker_exec_start_osd` This commit drops nested jinja construction in this set_fact task. It also renames it to `container_exec_start_osd` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-04 21:18:10 +02:00
Guillaume Abrioux	c101cb3931	defaults: change defaults value this commit changes defaults value in default pool definitions. there's no need to define `pg_num`, `pgp_num`, `size` and `min_size`, `ceph_pool` module will use the current default if needed. This also drops the 3 following `set_fact` in `ceph-facts`: - osd_pool_default_pg_num, - osd_pool_default_pgp_num, - osd_pool_default_size_num Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-02 07:42:40 +02:00
Guillaume Abrioux	29fc115f4a	ceph_pool: refact module remove complexity about current defaults in running cluster Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-02 07:42:40 +02:00
Seena Fallah	ff9f4d138f	ceph-facts: add get default crush rule from running monitor In case of deploying a new monitor node to an existing cluster, osd_pool_default_crush_rule should be taken from a running monitor because the ceph-osd role won't be run and the new monitor will have a different osd_pool_default_crush_rule from other monitors. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 09:27:58 -04:00
Guillaume Abrioux	eefe11d90c	defaults: change default grafana-server name This changes the default value of the grafana-server group name. It also adds some tasks in ceph-defaults in order to keep backward compatibility. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-29 07:42:26 +02:00
Ali Maredia	902575369c	rgw multisite: check connection for realm endpoint This commit adds connection checks before realm pulls Curls are performed on the endpoint being pulled from the mons and the rgws Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731158 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-09-29 07:37:21 +02:00
Dimitri Savineau	e11453c6f5	Remove unused centos docker tasks The `enable extras on centos` task just doesn't work when using the variable ceph_docker_enable_centos_extra_repo to true. fatal: [xxx]; FAILED! => {"changed": false, "msg": "Parameter 'baseurl', 'metalink' or 'mirrorlist' is required."} The CentOS extras repository is enabled by default so it's pretty safe to remove this task and the associated variable. This also removes the ceph_docker_on_openstack variable as it's a leftover and it is unused. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:35:10 +02:00
Dimitri Savineau	733596582d	ceph-handler: set handler on xxx_stat result In non containerized deployment we check if the service is running via the socket file presence. This is done via the xxx_socket_stat variable that checks the socket file in the /var/run/ceph/ directory. In some scenarios, we could have the socket file still present in that directory but not used by any process. That's why we have the xxx_stat variable which cleans up those leftovers. The problem here is that we're setting the variable for the handlers status (like handler_mon_status) based on xxx_socket_stat instead of xxx_stat. That means we will trigger the handlers if there's an old socket file present on the system without any process associated. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:32:10 +02:00
Dimitri Savineau	501b8e0fd3	ceph-iscsi: create pool once from monitor `af9f6684` introduced a regression on the ceph iscsi pool creation because it was delegated to the first monitor node before that change. This patch restores the initial workflow. When the iscsi node doesn't have the admin keyring then the pool creation fails. This commit also ensures that the pool creation is only executed once when having multiple iscsi nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-29 07:31:24 +02:00
Seena Fallah	69f7e35382	ceph-facts: check for mon socket in its own host Delegate to its own host after checking the mon socket to find out whether the mon socket is in use or not. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 00:21:12 +02:00
Dimitri Savineau	50104650e7	add missing boolean filter Otherwise this will generate an ansible warning about the missing filter. [DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour will go away and you might need to add \|bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-28 20:45:01 +02:00
Guillaume Abrioux	bf7b044c9a	Revert "ceph-rgw: remove ceph_pool state and default value" This reverts commit `ba3512a8fc`.	2020-09-28 16:56:33 +02:00
Dimitri Savineau	1db4dc807c	ceph-mds: remove unused block condition Since `af9f6684` the cephfs pool(s) creation don't use the fs_pools_created variable anymore because the ceph_pool module is idempotent. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-28 10:22:35 +02:00
Tyler Bishop	ee4b8804ae	facts: support device aliases for (dedicated\|bluestore_wal)_devices Just like `devices`, this commit adds the support for linux device aliases for `dedicated_devices` and `bluestore_wal_devices`. Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>	2020-09-25 19:59:45 +02:00
Dimitri Savineau	ba3512a8fc	ceph-rgw: remove ceph_pool state and default value Since the state is now optional and default values are handled in the ceph_pool module itself. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:18:07 +02:00
Dimitri Savineau	4808523403	rolling_update: remove msgr2 migration In Pacific we're sure that users have already completed the msgr2 migration because it was introduced in Nautilus. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:14:42 +02:00
Dimitri Savineau	62bd41f0d4	ceph-config: remove ceph_release from ceph.conf.j2 We don't use ceph_release variable in the ceph.conf jinja template. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-25 19:13:57 +02:00
Dmitriy Rabotyagov	297532ca41	Remove libjemalloc1 installation task The libjemalloc1 package is required neither as a ganesha dependency nor for the package build process. So this task can simply be dropped. Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>	2020-09-24 13:56:16 +02:00
Dimitri Savineau	6dcfdf17d4	container: quote registry password When using a quote in the registry password then we have the following error: The error was: ValueError: No closing quotation To fix this we need to use the quote filter. Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-18 11:14:00 -04:00
Guillaume Abrioux	ff19c1d851	facts: fix 'set_fact rgw_instances with rgw multisite' the current condition doesn't work, as soon as the first iteration is done the condition makes next iterations skip since `rgw_instances` got set with the first iteration. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-18 10:14:34 -04:00
Dimitri Savineau	85643edfe3	ceph-infra: include iscsi nodes for logrotate The iscsi nodes aren't included in the logrotate condition. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-17 20:34:56 +02:00
Guillaume Abrioux	f576c02ff7	infra: support log rotation for tcmu-runner This commit adds the log rotation support for tcmu-runner. ceph-container related PR: ceph/ceph-container#1726 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1873915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-16 20:23:22 -04:00
Dimitri Savineau	e54b924eaf	ceph-prometheus: update pool stat counter Since [1] The bytes_used pool counter in prometheus has been renamed to stored. Closes: #5781 [1] https://github.com/ceph/ceph/commit/71fe9149 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-16 09:50:42 -04:00
Dimitri Savineau	bda3581294	container: add optional http(s) proxy option When using a http(s) proxy with either docker or podman we can rely on the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables. But with ansible, even if those variables are defined in a source file then they aren't loaded during the container pull/login tasks. This implements the http(s) proxy support with docker/podman. Both implementations are different: 1/ docker doesn't rely on the environment variables with the CLI. Those are needed by the docker daemon via systemd. 2/ podman uses the environment variables so we need to add them to the login/pull tasks. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-16 06:52:26 +02:00
Dimitri Savineau	abb4023d76	ceph_key: set state as optional Most ansible modules using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 14:12:21 -04:00
Guillaume Abrioux	f0fc59258a	Revert "ceph_pool: use default size/min_size and rule_name" This reverts commit `142934057f`. This is already handled in the ceph_pool module itself Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-14 14:12:21 -04:00
Dimitri Savineau	2c4af70abd	dashboard: use run_once at block level Instead of using run_once: true on each tasks in a block section, we can use the run_once statement at the block level. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 13:47:36 +02:00
Dimitri Savineau	b105549ed8	node-exporter: exclude client nodes We don't need to install node-exporter on client node because there's no ceph services running on them. This also makes sure we use the group name variables in the prometheus service template instead of hardcoding the values. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-14 13:46:51 +02:00
Dimitri Savineau	3a05aeb6cb	ceph_pool: set state as optional Most ansible modules using a state parameter default to the present value (when available) instead of using it as a mandatory option. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:26:15 +02:00
Dimitri Savineau	ee6f0547ba	library: add ceph_dashboard_user module This adds the ceph_dashboard_user ansible module for replacing the command module usage with the ceph dashboard ac-user-xxx command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:16:08 +02:00
Dimitri Savineau	142934057f	ceph_pool: use default size/min_size and rule_name Before [1] we were using default value for - size - min_size - rule_name when the key wasn't present in the pool dict. The commit [1] changed this by defaulting to omit. This patch restores the original workflow by using facts: - osd_pool_default_size - osd_pool_default_min_size - ceph_osd_pool_default_crush_rule_name [1] `af9f6684f2` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-11 10:15:28 +02:00
Dimitri Savineau	f63022dfec	ceph-facts: only get fsid when monitor are present When running the rolling_update playbook with an inventory without monitor nodes defined (like external scenario) then we can't retrieve the cluster fsid from the running monitor. In this scenario we have to pass this information manually (group_vars or host_vars). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 13:19:44 -04:00
Dimitri Savineau	8dacbce68f	ceph-rgw: use ceph_pool module Since [1] we can use the ceph_pool module instead of using the command module combined with ceph osd pool commands. [1] `bddcb439ce` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 15:16:58 +02:00
Guillaume Abrioux	657e6c8c3b	tests: clean legacy clean some legacies since quay.ceph.io migration Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-09-09 14:42:41 +02:00
Niko Smeds	a951c1a3f0	Enable HAProxy backend checks for Ceph RGW Add the `check` option to server definitions to enable basic HAProxy health checks for Ceph RADOS gateway backends. Currently traffic will be forwarded to unhealthy `radosgw.service` servers. These changes resolve the issue. Signed-off-by: Niko Smeds nikosmeds@gmail.com	2020-08-27 10:57:46 -04:00
Guillaume Abrioux	54d3e9650f	dashboard: refact admin user creation task this commit splits this task in order to avoid using a `shell` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-21 09:22:11 +02:00
Guillaume Abrioux	f0fe193d8e	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 11:16:26 -04:00
George Shuklin	73d4bb6bd6	Make 'disable ssl for dashboard task' idempotent. This should reduce number of 'changed' tasks during convergence test. Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2020-08-20 16:48:32 +02:00
Rafał Wądołowski	55cd6e83e4	Comment out ceph_custom_key Since there is a check if ceph_custom_key is defined, there is no reason to define it by default. Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>	2020-08-20 13:36:24 +02:00
Guillaume Abrioux	899d317196	iscsigw: add retry/until In order to avoid failures that could be fixed by simply retrying. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 13:25:05 +02:00
John Fulton	95dee6f1ca	Set default permission for prometheus config files Regardless of the outcome of Ansible 2.9.12 issue 71200 we can set a default permission for these files. Closes: https://github.com/ceph/ceph-ansible/issues/5677 Signed-off-by: John Fulton <fulton@redhat.com>	2020-08-18 15:49:31 -04:00
Guillaume Abrioux	8ed11ea3ee	infra: only install logrotate on right nodes For instance, there is no need to install logrotate on client nodes. This also ensures logrotate is installed only for containerized deployments since the packaging has an explicit dependency on logrotate Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-18 10:56:09 -04:00
Dimitri Savineau	cb8f0237e1	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pool not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-17 22:59:06 +02:00
Ali Maredia	5c1f4b1a1e	rgw: allow rgws to be concurrently with or without multisite Allows rgws in a ceph cluster to be run with multisite and without multisite at the same time. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-08-17 11:11:11 +02:00
Guillaume Abrioux	e1cb385740	infra: add missing tag This commit adds the missing `with_pkg` tag on the logrotate installation task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-13 10:08:18 -04:00
Guillaume Abrioux	f1aa6cea21	infra: add log rotation support (containers) This commit adds the log rotation support via logrotate in containerized deployments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
Guillaume Abrioux	448cc280b7	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-11 15:03:20 +02:00
raul	110eaf5f9f	rgw: support 1+ rgw instance in `radosgw_frontend_port` Change the radosgw_frontend_port to take into account more than 1 RGW instance. In its original form, `radosgw_frontend_port: radosgw_frontend_port \| int`, it configured the 8080 port on all instances; with the following modification, `radosgw_frontend_port: radosgw_frontend_port \| int + item\|int`, we increase the port number by 1 for each instance. Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com>	2020-08-11 14:05:43 +02:00
Guillaume Abrioux	dd4b5b0328	nfs: do not copy rgw keyring when `nfs_obj_gw` is true This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the cluster doesn't contain a rgw node, which can be the case given we are using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the deployment will fail trying to copy a key that doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-07 13:21:17 +02:00
Guillaume Abrioux	0a581a6e60	config: only add related rgw section there's no need to add each rgw section on all rgw nodes. With this commit, only related rgw section are rendered. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-03 14:47:27 +02:00
Dimitri Savineau	0d0f1e71df	dashboard: allow remote TLS cert/key copy When using TLS on the ceph dashboard or grafana services, we can provide the TLS certificate and key. Those files should be present on the ansible controller and they will be copied to the right node(s). In some situations, the TLS certificate and key could be already present on the target node and not on the ansible controller. For this scenario, we just need to copy the files locally (on each remote host). This patch adds the dashboard_tls_external variable (with default to false) to allow users to achieve this scenario when configuring this variable to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-03 13:39:47 +02:00
Dimitri Savineau	4e84b4beed	ceph-facts: remove mds_name fact The mds_name fact always gets the ansible_hostname value so we don't need to have a dedicated fact for this and use the ansible_hostname fact instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:02:43 +02:00
Dimitri Savineau	cbe79428e6	ceph-handler: remove iscsigws restart scripts The iscsigws restart scripts for tcmu-runner and rbd-target-{api,gw} services only call the systemctl restart command. We don't really need to copy a shell script to do it when we can use the ansible service module instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:02:12 +02:00
Dimitri Savineau	47b7c00287	podman: always remove container on start In case of failure, the systemd ExecStop isn't executed so the container isn't removed. After a reboot of a failed node, the container doesn't start because the old container is still present in created state. We should always try to remove the container in ExecStartPre for this situation. A normal reboot doesn't trigger this issue and this also doesn't affect nodes running containers via docker. This behaviour was introduced by `d43769d`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:00:38 +02:00
Dimitri Savineau	18e3c7a0a2	ceph-handler: add missing condition on ceph-crash The ceph-crash tasks present in the ceph-handler role don't need to be executed on all nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-21 23:26:11 +02:00
Guillaume Abrioux	39bb279a53	crash: rm container in ExecPreStart even with docker We should ensure the container is removed in `ExecPreStart` even when `{{ container_binary }}` is docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 23:23:18 +02:00
Guillaume Abrioux	9d2f2108e1	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 20:22:12 +02:00
Guillaume Abrioux	d490968fc8	defaults: remove legacy These variables aren't consumed anywhere other than in the ceph-nfs role so there is no need to have them in `ceph-defaults`'s defaults Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-21 09:39:15 +02:00

2990 Commits (fb13ee35bf2cf096db3f66b516eaa7ffa0d24770)