ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	6fd6b31305	library: add ceph_dashboard_user module This adds the ceph_dashboard_user ansible module for replacing the command module usage with the ceph dashboard ac-user-xxx command. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ee6f0547ba`)	2020-09-11 09:08:56 -04:00
Guillaume Abrioux	828817489c	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f0fe193d8e`)	2020-09-10 22:52:38 -04:00
Dimitri Savineau	593264e5f7	ceph-rgw: use ceph_pool module Since [1] we can use the ceph_pool module instead of using the command module combined with ceph osd pool commands. [1] `bddcb439ce` Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8dacbce68f`)	2020-09-10 21:44:19 -04:00
Dimitri Savineau	0c0a930374	ceph-facts: only get fsid when monitor are present When running the rolling_update playbook with an inventory without monitor nodes defined (like external scenario) then we can't retrieve the cluster fsid from the running monitor. In this scenario we have to pass this information manually (group_vars or host_vars). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f63022dfec`)	2020-09-10 20:57:16 +02:00
Niko Smeds	a41c572785	Enable HAProxy backend checks for Ceph RGW Add the `check` option to server definitions to enable basic HAProxy health checks for Ceph RADOS gateway backends. Currently traffic will be forwarded to unhealthy `radosgw.service` servers. These changes resolve the issue. Signed-off-by: Niko Smeds nikosmeds@gmail.com (cherry picked from commit `a951c1a3f0`)	2020-09-02 09:54:52 -04:00
Guillaume Abrioux	8fde5f7396	dashboard: refact admin user creation task this commit splits this task in order to avoid using a `shell` module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `54d3e9650f`)	2020-08-21 13:55:54 +02:00
George Shuklin	c0d98878ff	Make 'disable ssl for dashboard task' idempotent. This should reduce number of 'changed' tasks during convergence test. Signed-off-by: George Shuklin <george.shuklin@gmail.com> (cherry picked from commit `73d4bb6bd6`)	2020-08-20 17:16:45 +02:00
Rafał Wądołowski	21a37e23b3	Comment out ceph_custom_key Since there is a check if ceph_custom_key is defined, there is no reason to define it by default. Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com> (cherry picked from commit `55cd6e83e4`)	2020-08-20 13:43:44 +02:00
John Fulton	489efd5689	Set default permission for prometheus config files Regardless of the outcome of Ansible 2.9.12 issue 71200 we can set a default permission for these files. Closes: https://github.com/ceph/ceph-ansible/issues/5677 Signed-off-by: John Fulton <fulton@redhat.com> (cherry picked from commit `95dee6f1ca`)	2020-08-18 18:04:17 -04:00
Guillaume Abrioux	3fad1677d6	infra: only install logrotate on right nodes For instance, there is no need to install logrotate on client nodes. This also ensures logrotate is installed only for containerized deployments since the packaging has an explicit dependency on logrotate. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8ed11ea3ee`)	2020-08-18 11:10:57 -04:00
Dimitri Savineau	e9c6028eb9	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pools, not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cb8f0237e1`)	2020-08-17 23:00:13 +02:00
Ali Maredia	63d991dc3d	rgw: allow rgws to be concurrently with or without multisite Allows rgws in a ceph cluster to be run with multisite and without multisite at the same time. Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `5c1f4b1a1e`)	2020-08-17 13:56:45 +02:00
Guillaume Abrioux	2609da6ce7	infra: add missing tag This commit adds the missing `with_pkg` tag on the logrotate installation task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e1cb385740`)	2020-08-13 10:09:31 -04:00
Guillaume Abrioux	29d4c42f80	infra: add log rotation support (containers) This commit adds the log rotation support via logrotate in containerized deployments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f1aa6cea21`)	2020-08-12 22:57:10 +02:00
Guillaume Abrioux	8a7e4193db	common: don't enable debug log on ceph-volume calls by default ceph-volume can generate large logs at some point. debug logs by definition should be enabled only when debugging. Let's make it customizable with a variable which is set to `False` by default. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `448cc280b7`)	2020-08-12 22:57:10 +02:00
Guillaume Abrioux	223254e8bf	nfs: do not copy rgw keyring when `nfs_obj_gw` is true This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the cluster doesn't contain a rgw node, which can be the case given we are using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the deployment will fail trying to copy a key that doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `dd4b5b0328`)	2020-08-12 14:57:56 -04:00
raul	5fc3af5f4d	rgw: support 1+ rgw instance in `radosgw_frontend_port` Change the radosgw_frontend_port to take into account more than 1 RGW instance. In its original form, `radosgw_frontend_port: radosgw_frontend_port | int`, it configured the 8080 port on all instances; with the following modification, `radosgw_frontend_port: radosgw_frontend_port | int + item|int`, we increase the port by 1 for each instance. Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com> (cherry picked from commit `110eaf5f9f`)	2020-08-12 14:57:35 -04:00
Guillaume Abrioux	e0dc56b73c	config: only add related rgw section there's no need to add each rgw section on all rgw nodes. With this commit, only related rgw section are rendered. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0a581a6e60`)	2020-08-05 09:50:09 -04:00
Dimitri Savineau	f9a24e2541	dashboard: allow remote TLS cert/key copy When using TLS on the ceph dashboard or grafana services, we can provide the TLS certificate and key. Those files should be present on the ansible controller and they will be copied to the right node(s). In some situations, the TLS certificate and key could be already present on the target node and not on the ansible controller. For this scenario, we just need to copy the files locally (on each remote host). This patch adds the dashboard_tls_external variable (with default to false) to allow users to achieve this scenario when configuring this variable to true. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860815 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0d0f1e71df`)	2020-08-04 14:01:59 +02:00
Dimitri Savineau	56cf7168fa	ceph-handler: remove iscsigws restart scripts The iscsigws restart scripts for tcmu-runner and rbd-target-{api,gw} services only call the systemctl restart command. We don't really need to copy a shell script to do it when we can use the ansible service module instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cbe79428e6`)	2020-07-25 09:34:25 +02:00
Dimitri Savineau	2faed4c204	podman: always remove container on start In case of failure, the systemd ExecStop isn't executed so the container isn't removed. After a reboot of a failed node, the container doesn't start because the old container is still present in created state. We should always try to remove the container in ExecStartPre for this situation. A normal reboot doesn't trigger this issue and this also doesn't affect nodes running containers via docker. This behaviour was introduced by `d43769d`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `47b7c00287`)	2020-07-24 12:47:01 -04:00
Dimitri Savineau	d5974086dd	ceph-facts: remove mds_name fact The mds_name fact always gets the ansible_hostname value so we don't need to have a dedicated fact for this and use the ansible_hostname fact instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4e84b4beed`)	2020-07-24 10:50:44 -04:00
Dimitri Savineau	c694454f82	ceph-handler: add missing condition on ceph-crash The ceph-crash tasks present in the ceph-handler role don't need to be executed on all nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `18e3c7a0a2`)	2020-07-22 18:47:01 -04:00
Guillaume Abrioux	c0b32e4a79	crash: rm container in ExecPreStart even with docker We should ensure the container is removed in `ExecPreStart` even when `{{ container_binary }}` is docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `39bb279a53`)	2020-07-22 18:47:01 -04:00
Guillaume Abrioux	e6059fdcd3	ceph-crash: introduce new role ceph-crash This commit introduces a new role `ceph-crash` in order to deploy everything needed for the ceph-crash daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9d2f2108e1`)	2020-07-22 18:47:01 -04:00
Guillaume Abrioux	bd12158a1c	facts: fix broken facts when using --limit This commit fixes these tasks when --limit is used. It makes sure the fact is set on right nodes even when the playbook is run with `--limit` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f8a951f50c`)	2020-07-20 22:49:43 -04:00
Dimitri Savineau	b11eeed833	ceph-dashboard: copy TLS cert/key on monitor The ceph-dashboard role is executed on the mgr nodes so the TLS cert/key files are copied to those nodes. But we are importing the cert/key files into the ceph configuration on the monitor. Closes: #5557 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2b8ebf1457`)	2020-07-20 22:49:20 -04:00
Guillaume Abrioux	c0b8edfea2	rgw: set container memory limit to 4g This commit changes the container memory limit for rgw daemons. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `86edae724f`)	2020-07-10 10:10:32 -04:00
Dimitri Savineau	16c5c93411	ceph-nfs: change ganesha devel source The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't available anymore. Let's switch back to shaman since we have builds available now. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1438ca0120`)	2020-07-06 11:01:13 -04:00
Dimitri Savineau	df3acbd974	ceph-defaults: update nfs-ganesha to 3.3 nfs-ganesha 3.3 is the latest 3.x release available for octopus so we should update to this version. https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus This will also match the version used in RHCS 5. Ceph container already uses that version too. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `93754bd70c`)	2020-07-03 09:05:12 +02:00
Dimitri Savineau	596f3fa161	radosgw: remove INST_PORT environment variable This variable isn't consumed by the container so we can remove it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1361e84a4e`)	2020-07-03 06:37:34 +02:00
Guillaume Abrioux	cdf61540d8	rgw: fix multi instances scaleout When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook. The environment file used in the rgw systemd template is rendered when executing the `ceph-rgw` role but during a new run of the playbook (in order to scale out rgw instances), handlers are triggered from `ceph-osd` role which is run before `ceph-rgw`, therefore it tries to start the new rgw daemon whereas its corresponding environment file hasn't been rendered yet and fails like following: ``` ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory ``` This commit moves the tasks generating this file in `ceph-config` role so it is generated early. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7dd68b9ac1`)	2020-07-03 06:37:34 +02:00
Dimitri Savineau	48fd6e6b16	ceph-common: remove copr and sepia repositories All EL8 dependencies are now present on EPEL 8 so we don't need the additional repositories that were only a temporary solution. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3592ba1d61`)	2020-06-30 17:08:53 +02:00
Dimitri Savineau	e5eba9555b	dashboard: configure mgr backend before restart We need to set the mgr dashboard server ip address before restarting the dashboard module otherwise we can try to bind the dashboard module on an already used address. We already do this configuration for the dashboard port value and ssl setup so we should do the same for server address too. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `03cd75845f`)	2020-06-29 12:34:26 -04:00
George Shuklin	cfc808804f	Add container settings for Ubuntu 20 (the same as Ubuntu 18) Signed-off-by: George Shuklin <george.shuklin@gmail.com> (cherry picked from commit `3e87f53875`)	2020-06-29 12:33:49 -04:00
Jonathan Rosser	28ee26ae58	Ansible tests are not filters The use of "| success" and "| changed" are not valid syntax for modern ansible releases. Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk> (cherry picked from commit `42884e8175`)	2020-06-26 13:36:42 -04:00
Jonathan Rosser	77002c12c8	Install python routes package as a dependancy rather than directly This is now a dependancy of ceph-mgr so will be installed automatically and does not need a specific task. This change means that ceph-mgr installs correctly on Ubuntu Focal where the python3-routes package is necessary. Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk> (cherry picked from commit `92288c11c5`)	2020-06-26 13:36:42 -04:00
Dimitri Savineau	7eddd89afa	podman: Add Type and PIDFile value to unit files This changes the way we are running the podman containers via systemd. They are now in detached mode with Type/PIDFile set. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d43769dc2a`)	2020-06-23 17:35:24 +02:00
Dimitri Savineau	51cfb89501	ceph-osd: remove ceph-osd-run.sh script Since we only have one scenario since nautilus then we can just move the container start command from ceph-osd-run.sh to the systemd unit service. As a result, the ceph-osd-run.sh.j2 template and the ceph_osd_docker_run_script_path variable are removed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `829990e60d`)	2020-06-23 17:35:24 +02:00
Guillaume Abrioux	e1c8a0daf6	dashboard: copy self-signed generated crt to mons This commit makes the playbook copy the generated self-signed certificate to monitors. When mons and mgrs are deployed on dedicated nodes the playbook will fail when trying to import certificate and key files since they are generated on mgrs whereas we try to import them from a monitor. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b7539eb275`)	2020-06-23 15:43:26 +02:00
Dimitri Savineau	5428a41fcf	docker: Add Requires on docker service When using docker container engine then the systemd unit scripts only use a dependency on the docker daemon via the After parameter. But if docker is restarted on a live system then the ceph systemd units should wait for the docker daemon to be fully restarted. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bd22f1d1ec`)	2020-06-22 17:30:28 -04:00
Guillaume Abrioux	4fe8e12484	switch_to_containers: don't set noup flag We shouldn't set this flag when running switch_to_containers playbook. Otherwise the playbook fails waiting for pgs to be clean. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843569 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b91d60d384`)	2020-06-17 09:24:02 -04:00
Dimitri Savineau	c6e60db2fb	container: inspect Id field instead of RepoDigests When a container image managed by podman isn't tagged anymore then the RepoDigests field when inspecting the image doesn't return any value. This is different from docker workflow and it breaks the ceph-ansible container upgrade when multiple services are collocated and a non-fixed container tag (like latest or 4) is used. $ podman images REPOSITORY TAG IMAGE ID CREATED SIZE docker.io/ceph/daemon latest 680c9c0d38c3 8 days ago 957 MB <none> <none> 011ee108bfc9 2 months ago 1.01 GB $ podman inspect 680c9c0d38c3 | jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:20cf789235e23ddaf38e109b391d1496bb88011239d16862c4c106d0e05fea9e" $ podman inspect 011ee108bfc9 | jq .[0].RepoDigests[0] null Because this field returns "null" then the ansible task trying to determine this value is failing ----------------------------- fatal: [foo]: FAILED! => msg: |- The task includes an option with an undefined variable. The error was: None has no element 0 The error appears to be in 'roles/ceph-container-common/tasks/fetch_image.yml': line 137, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: set_fact ceph_osd_image_repodigest_before_pulling ^ here ----------------------------- We don't have this behaviour with docker. $ docker images REPOSITORY TAG IMAGE ID CREATED SIZE docker.io/ceph/daemon latest 680c9c0d38c3 8 days ago 928 MB docker.io/ceph/daemon <none> 011ee108bfc9 2 months ago 986 MB $ docker inspect 680c9c0d38c3 | jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:45e6f28bb67c81b826acb64fad5c0da1cac3dffb41a88992fe4ca2be79575fa6" $ docker inspect 011ee108bfc9 | jq .[0].RepoDigests[0] "docker.io/ceph/daemon@sha256:b393a73309d72e43ca7d65cd3519036007947671e373eb59aa75a46185c52231" Instead we should just get the Id field. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1844496 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `cdb30bd125`)	2020-06-16 13:12:26 -04:00
Ali Maredia	5b76ba12f7	rgw multisite: add master zone endpoints to zonegroup We were only adding the endpoints to the master zone but not to the zonegroup. This patch fixes the issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1839228 Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `0175c205fa`)	2020-06-09 12:29:56 -04:00
Ansible Deployment User	85df54a698	rgwloadbalancer undefined index variable The vrrp_instances variable is using a loop with index but the index_var wasn't defined. As a result, the fact task was failing on this undefined index variable. The task includes an option with an undefined variable. The error was: 'index' is undefined Closes: #5395 Signed-off-by: Florian Faltermeier <florian.faltermeier@uibk.ac.at> (cherry picked from commit `3f906e0c26`)	2020-05-26 12:09:41 -04:00
Guillaume Abrioux	4453028862	common: introduce ceph_pool module calls This commit calls the `ceph_pool` module for creating ceph pools everywhere it's needed in the playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `af9f6684f2`)	2020-05-22 17:05:22 +02:00
Dimitri Savineau	3247b1eea9	ceph-nfs: add stable noarch repository When using the stable nfs ganesha repository, we need to have both arch and noarch repositories enabled. Currently the noarch repository is missing, which causes the non containerized deployment to fail. Closes: #5375 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `44e1ebaaff`)	2020-05-19 11:18:45 -04:00
Guillaume Abrioux	521c356f33	common: fix target_size_ratio task enablement The condition on this task is wrong, we have to check whether `target_size_ratio` is set in the pool definition instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8c7a48832c`)	2020-05-19 15:15:03 +02:00
Guillaume Abrioux	ec21d57d23	facts: always set ceph_run_cmd and ceph_admin_command always set these facts on monitor nodes whatever we run with `--limit`. Otherwise, playbook will fail when using `--limit` on nodes where these facts are used on a delegated task to monitor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e5e81843e9`)	2020-05-15 09:56:10 -04:00
Dimitri Savineau	02e5167f2a	ceph-nfs: bind mount ganesha log directory The current ganesha log directory is only present in the container and not bind mounted on the host. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `222fe4abd8`)	2020-05-13 16:41:49 -04:00
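
As a rough illustration of the `radosgw_frontend_port` arithmetic described in commit 5fc3af5f4d, the sketch below mirrors the expression `radosgw_frontend_port | int + item|int` in Python. It assumes `item` is a zero-based instance index, and the helper name `frontend_ports` is hypothetical; neither is confirmed by the commit message, and this is not the ceph-ansible template itself.

```python
def frontend_ports(base_port, num_instances):
    """Return one port per RGW instance.

    Assumption: `item` in the Jinja expression is a zero-based instance
    index, so instance 0 keeps the base port and each later instance
    gets the base port plus its index.
    """
    base = int(base_port)  # mirrors the `| int` cast on radosgw_frontend_port
    return [base + int(item) for item in range(num_instances)]


if __name__ == "__main__":
    # Three instances starting from the 8080 port mentioned in the commit.
    print(frontend_ports("8080", 3))
```

Under that assumption, three instances with a base of 8080 would listen on 8080, 8081 and 8082 instead of all sharing 8080.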


2664 Commits (35b488c18993f9c460c3503da3cec94baff4ee50)