ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	3ebd71c8c2	ceph-osd: fix fs.aio-max-nr sysctl condition [1] introduced a regression on the fs.aio-max-nr sysctl value condition. The enable key isn't a boolean but a string because the expression isn't evaluated. This string output "(osd_objectstore == 'bluestore')" is always true because the item.enable condition matches any non-empty string. So the sysctl value was applied for both filestore and bluestore backends. [2] added the bool filter to the condition but the filter always returns false on such a string and the sysctl wasn't applied at all. This commit fixes the enable key value by evaluating the value instead of using the string. [1] https://github.com/ceph/ceph-ansible/commit/08a2b58 [2] https://github.com/ceph/ceph-ansible/commit/ab54fe2 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ece46d33be`)	2019-11-07 20:31:02 +01:00
Dimitri Savineau	0dcaec64ec	ceph-defaults: pin grafana container tag to 5.2.4 The latest grafana container tag is using grafana 6.x release which could cause issue with the ceph dashboard integration. Considering that the grafana container in RHCS 3 is based on 5.x then we should use the same version. $ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v Version 5.2.4 (commit: unknown-dev) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2037fb87b6`)	2019-10-31 19:10:04 -04:00
Dimitri Savineau	27eb40714c	ceph-osd: Remove ulimit nofile on container start Even if this improves ceph-disk/ceph-volume performances then it also impact the ceph-osd process. The ceph-osd process shouldn't use 1024:4096 value for the max open files. Removing the ulimit option from the container engine and doing this kind of change on the container side [1]. [1] https://github.com/ceph/ceph-container/pull/1497 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9a996aef7f`)	2019-10-31 14:42:30 -04:00
fmount	20b4234ddc	Set grafana-server user and password in ceph-dashboard role This change adds two tasks to set grafana-api user and password that are required to inject dashboard layouts to the external grafana instance. Without these two parameters the ceph-ansible playbook fails showing an authorization error (HTTPError: 401 Client Error: Unauthorized"). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365 Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `41b8c17356`)	2019-10-31 11:43:54 -04:00
Mihai Plasoianu	6015a6ca40	ceph-mon: use --admin-daemon to set default crush rule Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de> (cherry picked from commit `d3f67d63ae`)	2019-10-29 22:26:53 -04:00
Dimitri Savineau	ffd05ca8df	defaults: add user/pass auth registry variables Add ceph_docker_registry_username and ceph_docker_registry_password variables in ceph-defaults role so they will be present in the group_vars samples but commented. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b33c476f16`)	2019-10-24 16:24:54 -04:00
Dimitri Savineau	b3ee07b242	dashboard: add ceph iscsi management When deploying with ceph-iscsi nodes and dashboard enabled, we need to add the ceph iscsi gateway endpoints to the dashboard configuration and add the mgr ip address in the trusted list in the iscsi gateway configuration file. Closes: #4638 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173 https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d050391cbb`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	567e90cd2e	ceph-iscsi: add ceph-iscsi stable repositories This commit adds the support of the ceph-iscsi stable repository when use ceph_repository community instead of always using the devel repositories. We're still using the devel repositories for rtslib and tcmu-runner in both cases (dev and community). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f2cb937193`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	e00bc17bd9	Revert "iscsigw: install python-requests" We don't need this since [1]. Also this was only working for python2 and not supporting python3. [1] https://github.com/ceph/ceph-iscsi/commit/00f198a This reverts commit `167737dd3d`. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fd8d47da98`)	2019-10-23 09:47:46 +02:00
Dimitri Savineau	4ff517e1ab	container/dashboard: run the registry auth task When deploying with packages then the ceph-container-common role isn't executed so the registry authentication task is ignored. Closes: #4636 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9ad000618f`)	2019-10-23 09:39:59 +02:00
Dimitri Savineau	c787cfcdff	travis: fail on ansible-lint errors If ansible-lint reports an error then it's skipped. We should fail in this case. This patch also fixes the pipefail lint in the rbd mirror role [306] Shells that use pipes should set the pipefail option Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3969470fca`)	2019-10-21 15:55:54 -04:00
Dimitri Savineau	6d5125f2a4	lint: fix error [303,602,701,702] [303] mktemp used in place of tempfile module [602] Don't compare to empty string [701] No 'galaxy_info' found [702] Use 'galaxy_tags' rather than 'categories' This patch also changes the ansible log_path value via the ANSIBLE_LOG_PATH environment variable in the travis configuration to avoid warnings. [WARNING]: log file at /home/travis/ansible/ansible.log is not writeable and we cannot create it, aborting Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f7fd0b6d4f`)	2019-10-21 15:55:54 -04:00
Guillaume Abrioux	4bf8cbe0c8	validate: fix credentials validation This task is failing when `ceph_docker_registry_auth` is enabled and `ceph_docker_registry_username` is undefined with an ansible error instead of the expected message. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `da4215e9c0`)	2019-10-21 15:55:35 -04:00
Guillaume Abrioux	541546a54a	common: do not override ceph_release when using custom repo Otherwise it fails like following: ``` TASK [ceph-mds : allow multimds] ************************************************************************************************************************************************ Monday 22 July 2019 16:37:38 +0800 (0:00:03.269) 0:13:25.651 ********* fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n ^ here\n"} ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4e9504c939`)	2019-10-17 20:10:47 -04:00
Guillaume Abrioux	18db9eb79e	nfs: remove unnecessary set_fact in main.yml this task is a leftover and no longer needed. It even causes bug when collocating nfs with mon. Closes: #4609 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b63bd13073`)	2019-10-16 14:01:46 -04:00
Mike Christie	7fbd76c93a	iscsi-gw: Fix rtslib installation When using python3 the name of the rtslib rpm is python3-rtslib. The packages that use rtslib already have code that detects the python version and distro deps, so drop it from the ceph iscsi gw task list and let the ceph-iscsi rpm dependency handle it. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `ba141298d7`)	2019-10-16 14:01:29 -04:00
Dimitri Savineau	35963194a7	rbd-mirror: fail if the peer is not added Due to the 'failed_when: false' statement present in the peer task, the playbook continued to run even if the peer task was failing (for example, with an incorrect remote peer format). "stderr": "rbd: invalid spec 'admin@cluster1'" This patch adds a task to list the peers present and adds the peer only if it's not already added. With this we don't need the failed_when statement anymore. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0b1e9c0737`)	2019-10-16 14:01:06 -04:00
Guillaume Abrioux	c962d87def	update: follow new recommendation to upgrade mds cluster Refactor the mds cluster upgrade code in order to follow the documented recommendation. See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `71cebf80a6`)	2019-10-16 12:59:08 -04:00
Dimitri Savineau	86b7137b27	ceph-iscsi: notify rbd target services When the iscsi gateway or the ceph configuration file change then we need to notify the rbd target api/gw services to be restarted. This patch also merges the rbd-target-api and rbd-target-gw handler into a single file and listen. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `bc701860d5`)	2019-10-16 11:34:15 -04:00
Guillaume Abrioux	50738ff5c0	mgr: do not copy all keyrings on all mgr There is no need to loop over all mgr nodes to set this fact, it's even breaking deployments because it tries to copy all mgr keyring on all mgr. Closes: #4602 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cb80231725`)	2019-10-16 06:45:33 +02:00
Dimitri Savineau	3313bc5c1f	ceph-handler: group listen topics and condition We are using multiple listen topics with the handlers. That means that we are notifying 4 tasks for each handler. Instead we can group the listen on an include_tasks and based on the group condition. Before: NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0 NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0 NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0 NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0 After: NOTIFIED HANDLER ceph-handler : mons handler for mon0 NOTIFIED HANDLER ceph-handler : osds handler for mon0 NOTIFIED HANDLER ceph-handler : mdss handler for mon0 NOTIFIED HANDLER ceph-handler : rgws handler for mon0 NOTIFIED HANDLER ceph-handler : mgrs handler for mon0 NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fe9c5b8c68`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	13f6a0a22a	handler: followup on #4519 This commit adds some missing `| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-15 13:29:06 -04:00
Guillaume Abrioux	fd10fbc047	handlers: refact osd handler This commit merges the two restart tasks into a single one, this way it's one task less to notify. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `411bd07d54`)	2019-10-15 13:29:06 -04:00
Dimitri Savineau	8117ed34d4	Remove validate action and notario dependency The current ceph-validate role is using both validate action and fail module tasks to validate the ceph configuration. The validate action is based on the notario python library. When one of the notario validations fails then a python stack trace is reported to the ansible task. This output isn't understandable by users. This patch removes the validate action and the notario dependency. The validation is now done with only the fail ansible module. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0f978d969b`)	2019-10-15 10:21:54 -04:00
Guillaume Abrioux	5568692340	mgr: improve mgr keyring creation Delegating on remote node isn't necessary here since we are already iterating over the right nodes. Closes: #4518 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `161170524d`)	2019-10-11 14:51:16 -04:00
Guillaume Abrioux	9c0547068e	validate: prevent from installing OSD on same disk as the OS This commit adds a validation task to prevent from installing an OSD on the same disk as the OS. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `80e2d00b16`)	2019-10-11 09:44:10 -04:00
Guillaume Abrioux	98467ddf01	common: do not reset `container_exec_cmd` This commit removes some legacy tasks. These tasks aren't needed, they cause the playbook to fail when collocating daemons. Closes: #4553 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `273413186a`)	2019-10-10 15:56:01 -04:00
Dimitri Savineau	eb51cc1bb1	dashboard: update layouts before the restart If the mgr dashboard doesn't restart fast enough then the inject dashboard task will fail with a HTTP error 400. Error EINVAL: Traceback (most recent call last): File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command return self.handle_command(inbuf, cmd) File "/usr/share/ceph/mgr/dashboard/module.py", line 450, in handle_command push_local_dashboards() File "/usr/share/ceph/mgr/dashboard/grafana.py", line 132, in push_local_dashboards retry() File "/usr/share/ceph/mgr/dashboard/grafana.py", line 89, in call result = self.func(self.args, *self.kwargs) File "/usr/share/ceph/mgr/dashboard/grafana.py", line 127, in push grafana.push_dashboard(body) File "/usr/share/ceph/mgr/dashboard/grafana.py", line 54, in push_dashboard response.raise_for_status() File "/usr/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status raise HTTPError(http_error_msg, response=self) HTTPError: 400 Client Error: Bad Request Instead we can trigger this task before the module restart. Closes: #4565 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `3f6ff240b7`)	2019-10-09 07:24:56 +00:00
Guillaume Abrioux	1d4d49695e	nfs: stop nfs server service in all context This commit moves this task in order to stop the nfs server service regardless the deployment type desired (containerized or non containerized). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6c6a512a72`)	2019-10-07 18:17:49 +02:00
Guillaume Abrioux	9a62d006bd	nfs: stop nfs server service The syntax here wasn't working, this refactor fixes this task. Also, removing the `ignore_errors: true` which was hiding the failure. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `47034effe0`)	2019-10-07 18:17:49 +02:00
Dimitri Savineau	d617626ef4	ceph-dashboard: remove rgw api host,port,scheme We don't need dedicated variables for the RGW integration into the Ceph Dashboard that need to be manually filled. Instead we can use the current values from the RGW nodes by using the IP and port from the first RGW instance of the first RGW node via the radosgw_address and radosgw_frontend_port variables. We don't need to specify all RGW nodes, this will be done automatically with one node. The RGW api scheme is using the radosgw_frontend_ssl_certificate variable to determine if the value is http or https. This variable is also reused as a condition for the ssl verify task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b9e93ad7a6`)	2019-10-07 10:25:29 -04:00
Guillaume Abrioux	b325cc386e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fa9b42e98e`)	2019-10-07 10:18:17 -04:00
Dimitri Savineau	a210efe361	ceph-dashboard: Improve https configuration This patch moves the https dashboard configuration into a dedicated block to avoid multiple occurrences of the dashboard_protocol condition. It also fixes the dashboard certificate and key variables handling in the condition introduced by `ab54fe2`. Those variables aren't boolean but strings so we can test them via the length filter. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `249764047b`)	2019-10-07 14:18:29 +02:00
Guillaume Abrioux	857c68087d	handler: followup on #4519 This commit adds some missing `| bool` filters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ccc11cfc93`)	2019-10-07 09:09:36 +02:00
Dimitri Savineau	5bbd825ab2	ceph-dashboard: add cluster parameter to ceph cmd The ceph dashboard tasks didn't use the cluster option if the cluster name isn't the default value. Closes: #4529 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dd526cfe4e`)	2019-10-04 17:07:31 +00:00
Dimitri Savineau	8ec632c42c	ceph-handler: don't restart all OSDs with limit When using the ansible --limit option on one or few OSD nodes and if the handler is triggered then we will restart the OSD service on all OSDs nodes instead of the hosts limited by the limit value. Even if the play is limited by the --limit value we are using all OSD nodes from the OSD group. with_items: '{{ groups[osd_group_name] }}' Instead we should iterate only on the nodes present in both OSD group and limit list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0346871fb5`)	2019-10-04 07:42:58 +02:00
Dimitri Savineau	70267cb30b	ceph-facts: fix _radosgw_address with block `e695efc` introduced a regression in the _radosgw_address fact when using the radosgw_address_block variable. There's no item there because we don't use the items lookup. This is only used for _monitor_address with monitor_address_block. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `780cf36a59`)	2019-10-03 19:20:19 +00:00
Guillaume Abrioux	13ca0531d8	common: improve keyrings generation There is no need to get n * number of nodes the different keyrings. Adding a `run_once: true` here avoid running a ceph command too many times which could be impacting large cluster deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9bad239d77`)	2019-10-02 14:34:27 +02:00
Dimitri Savineau	5b24c66ff7	ceph-facts: use --admin-daemon to get fsid During the rolling_update scenario, the fsid value is retrieved from the current ceph cluster configuration via the ceph daemon config command. This command tries first to resolve the admin socket path via the ceph-conf command. Unfortunately this command won't work if you have a duplicate key in the ceph configuration even if it only produces a warning. As a result the task will fail. Can't get admin socket path: unable to get conf option admin_socket for mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined Instead of using ceph daemon we can use the --admin-daemon option because we already know the admin socket path value based on the ceph cluster and mon hostname values. Closes: #4492 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ec3b687dc4`)	2019-10-02 14:01:32 +02:00
Guillaume Abrioux	c958bc1ddf	validate: fix gpt header check Check for gpt header when osd scenario is lvm or lvm batch. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `272d16e101`)	2019-10-01 13:02:45 -04:00
Guillaume Abrioux	b998fb339e	rbdmirror: rename a file rename this file to be more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ed8616aa66`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	9a79ed1bf0	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e08194dd67`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	7f902994b3	rbdmirror: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c69816c6b7`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	d7a06c67db	iscsigw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4636f3f7e2`)	2019-10-01 18:50:51 +02:00
Guillaume Abrioux	df5337535d	container: isolate systemd tasks This commit isolates the systemd unit files generation for containers into separate yml files in order to be able importing each corresponding roles without playing all tasks. This is needed so we can run ceph-ansible to render systemd unit files so they call podman instead of docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `bd64167469`)	2019-10-01 18:50:51 +02:00
Dimitri Savineau	7bb835240e	ceph-facts: update external grafana fact filter `e695efc` hasn't been updated with the changes introduced in `9bb11c7` so the ips_in_ranges filter isn't used for an external grafana instance. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `20b1a464ec`)	2019-10-01 12:28:34 -04:00
Boris Ranto	af9f93f07f	ceph-defaults: Change the default prometheus port The old default prometheus port 9090 clashes with cockpit in rhel 8. The 9090 port is reserved for web service administration of machines. We should change the default to something that does not clash with other ports used in rhel 8, at least by default. The port 9092 seems like a good choice in my testing. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `b96c6da832`)	2019-09-30 14:24:50 +02:00
Guillaume Abrioux	a3988887d2	Revert "ceph-common: install only necesarry ceph-* packages on debian" This reverts commit `58b27ef0b3`. This is breaking debian based OS deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e4444d29e0`)	2019-09-29 13:28:40 +00:00
Johannes Kastl	146f2e8de3	move python-xml to raw_install_python.yml The package python-xml is needed for ansible's zypper module to interact with the zypper package management tool. roles/ceph-defaults/defaults/main.yml: Remove python-xml from variable suse_package_dependencies to only install python-xml on SUSE/openSUSE if python is not found. raw_install_python.yml already contains all the logic needed to check if there is a valid python installation, so this is better suited there. openSUSE Leap 15.x / SLES 15.x do no longer have /usr/bin/python, only /usr/bin/python3, which already contains the xml module, so nothing needs to be installed in that case. Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `5cf22e9b31`)	2019-09-27 17:50:10 +02:00
Harald Jensås	5fea830414	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of CIDRs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `e695efcaf7`)	2019-09-27 17:49:46 +02:00
Dimitri Savineau	2d1372fe2a	ceph-nfs: Allow to configure SecType value Depending on the infrastructure (w/o kerberos auth), the SecType value could be different. Currently this value is hardcoded in the NFS Ganesha template. Instead we can use a variable. The default value is still the same to avoid breaking the backward compatibility. Closes: #4459 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ca77d7bd31`)	2019-09-27 15:38:52 +02:00
Dimitri Savineau	21e1650db6	ceph-dashboard: Add prometheus api host The set-prometheus-api-host ceph dashboard subcommand was missing in ceph-dashboard role. Only grafana and alertmanager were present. This commit also removes the trailing slash at the end of the host/url values. Closes: #4453 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `74ab59c4f3`)	2019-09-27 14:16:39 +02:00
Anthony Rusdi	3d2f9d2cde	ceph-common: install only necesarry ceph-* packages on debian Currently, the ceph package is only a meta-package that does not contain actual software, but simply depends on other packages. This has been the case for a few releases, since debian stretch (official), ubuntu bionic (official), ubuntu uca repository and upstream debian-jewel. As we only support nautilus and higher releases for the master branch, I propose to drop the ceph package and use ceph-base instead for repository models other than rhcs so the debian ceph install will be more minimal. Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com> (cherry picked from commit `58b27ef0b3`)	2019-09-27 14:16:20 +02:00
liuxu	1acd062f22	dashboard: add grafana dashboard support on Debian based OS download grafana dashboard files from github when running on Debian based OS Signed-off-by: liuxu <liuxu623@gmail.com> (cherry picked from commit `195f70897c`)	2019-09-27 09:12:39 +02:00
fmount	43830515af	Inject ceph grafana dashboard layouts This change just adds the task to inject from the ceph dashboard mgr module the required layouts to show all the cluster metrics on the grafana instance. Since we're now able to push grafana layouts through the ceph mgr module command, the dashboards configuration template is no longer needed on containerized environments. This commit also fixes the Vagrantfile static IP assignment in the grafana section because it generates an issue (it's the same as the mgr instance). Finally, considering some deployments that use an external grafana server instance, we reworked the 'grafana_server_addr' assignment to address these requirements. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `9bb11c7b2a`)	2019-09-26 13:44:03 -04:00
Guillaume Abrioux	b16dfb1920	iscsigw: install python-requests Typical error at rbd-target-api startup: ``` Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: Traceback (most recent call last): Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/bin/rbd-target-api", line 39, in <module> Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: from gwcli.utils import (APIRequest, valid_gateway, valid_client, Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/lib/python2.7/site-packages/gwcli/utils.py", line 1, in <module> Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: import requests Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: ImportError: No module named requests ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `167737dd3d`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	b1e61be9c6	tests: set copy_admin_key at group_vars level setting it at extra vars level prevent from setting it per node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5bb6a4da42`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	e1d06f498c	global: remove fetch_directory dependency This commit drops the fetch_directory dependency. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ab370b6ad8`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	69ec26e045	osd: add wal_devices option support to ceph_volume module This commit adds the `wal_devices` option support to the ceph_volume module. passing a devices list in `bluestore_wal_devices` will make ceph-volume creating 1 vg using these devices to create block.wal partitions. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `09e04a9197`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	a33791be25	osd: update doc text in defaults/main.yml This commit removes ceph-disk references. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `70f1b37097`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	d666e03b0c	osd: add block_db_devices option support to ceph_volume module This commit adds the `block_db_devices` option support to the ceph_volume module. passing a devices list in `dedicated_devices` will make ceph-volume creating 1 vg using these devices to create block.db partitions for data devices. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7b836eaa47`)	2019-09-26 16:21:54 +02:00
Guillaume Abrioux	651cf13a74	validate: check ceph_docker_registry_* length This commit adds a condition to check whether these variables are empty. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2b97ac921b`)	2019-09-18 23:43:21 +02:00
Dimitri Savineau	9d3fbcf47e	container: Allow to use registry authentication The registry.redhat.io registry requires authentication so before pulling the RHCS 4 container images from the registry we need to do the login step. This is done via the new ceph_docker_registry_auth variable. The default value is false but true for RHCS setup. When set to true, you need to provide the username and password for the registry via the associated variables. This patch also updates the ceph_docker_registry value for RHCS setup. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1748911 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9f4a99fb24`)	2019-09-18 23:43:21 +02:00
Dimitri Savineau	b50fa23630	ceph-handler: Fix osd restart condition In containerized deployment, the restart OSD handler couldn't be triggered in most ansible execution. This is due to the usage of run_once + a condition on the inventory hostname and the last filter. The run_once is triggered first so ansible will pick a node in the osd group to execute the restart task. But if this node isn't the last one in the osd group then the task is ignored. There's more probability that the task will be ignored than executed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5b1c15653f`)	2019-09-11 13:20:30 -04:00
Dimitri Savineau	8d26299116	rbd-mirror: Allow to copy the admin keyring The ceph-rbd-mirror role allows to copy the admin keyring via the copy_admin_key variable but there's actually no task in that role doing the job. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1f505628dd`)	2019-09-11 11:48:48 -04:00
Dimitri Savineau	142ac88961	rbd-mirror: Use the rbd mirror client keyring The admin keyring isn't present by default on the rbd mirror nodes so the rbd commands related to the mirroring configuration will fail. Instead we can use the rbd mirror client keyring. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a3d36df025`)	2019-09-11 11:48:48 -04:00
Harald Jensås	e33e06d400	Support comma-delimited subnets in firewall ceph.conf supports a comma separated list of subnet CIDR's for the public_network and the cluster network. ceph-ansible should support setting up the firewall for this configuration. Closes: #4425 Related: #4333 https://docs.ceph.com/docs/nautilus/rados/configuration/network-config-ref/#network-config-settings Signed-off-by: Harald Jensås <hjensas@redhat.com> (cherry picked from commit `d94229204d`)	2019-09-10 09:34:48 -04:00
Giulio Fidente	cb66a62ae2	Look for additional names when checking ceph-nfs container status Ganesha cannot be operated active/active, in those deployments where it is managed by pacemaker the container name can be different than the default. This change uses "ceph_nfs_service_suffix" where previously missing to ensure tasks will work with customized names. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005 Signed-off-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `d2a2bd7c42`)	2019-09-09 16:48:50 -04:00
Dimitri Savineau	3fded4b8ec	rbd-mirror: configure pool and peer The rbd mirror configuration was only available for non containerized deployment and was also incomplete. We now enable the mirroring on the pool and add the remote peer in both scenarios. The default mirroring mode is set to 'pool' but can be configured via the ceph_rbd_mirror_mode variable. This commit also fixes an issue on the rbd mirror command if the ceph cluster name isn't using the default value (ceph) due to a missing --cluster parameter to the command. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7e5e21741e`)	2019-09-09 16:05:56 +00:00
fmount	65a01036c2	Fix discovered_interpreter_python variable This change fixes the discovered_interpreter_python variable name that was "discovered_python_interpreter" and caused a failure in OSP deployments. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `81eb091533`)	2019-09-04 14:16:57 -04:00
Johannes Kastl	781ab4ad62	openSUSE OBS repo using ceph_stable_release Instead of hardcoding `luminous`, use the `ceph_stable_release` variable to point to the correct repository. This is now uncommented in roles/ceph-defaults/defaults/main.yml to be available, as it is only used if ceph_repository is set to 'obs'. group_vars/*.sample files have been regenerated using the ./generate_group_vars_sample.sh script. Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `0cedc4d303`)	2019-08-30 09:04:24 -04:00
fmount	159db72269	Add http_addr option to grafana config We have no reason to make grafana container listen on *:<port>, so this change adds the http_addr option to the grafana config file and adds the related option on the wait_for tasks. Since grafana_server_addr should exists, we shouldn't rely on the _current_monitor_addr default on prometheus/grafana templates. This change also remove this default value that is not necessary anymore. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `8a666bfd15`)	2019-08-30 09:04:16 -04:00
Dimitri Savineau	ab67c6bd76	lint: fix error [201,206] [201] Trailing whitespace [206] Variables should have spaces before and after: {{ var_name }} Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `42082c0a27`)	2019-08-30 09:04:00 -04:00
Johannes Kastl	64b11ab2b9	fix openSUSE OBS repo creation roles/ceph-common/tasks/installs/suse_obs_repository.yml: ansible's zypper_repository module does not know a parameter 'uri', this is called 'repo' instead Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `4711a7d626`)	2019-08-29 16:31:40 +00:00
Nick Erdmann	e8e1f310dd	ceph-infra: open ceph iscsi/prometheus port Signed-off-by: Nick Erdmann <n@nirf.de> (cherry picked from commit `7953ee1b81`)	2019-08-29 10:22:28 -04:00
Guillaume Abrioux	a3cbb59c05	lint: fix error [301], add `changed_when: false` when needed This commit fixes the error [301]: `[301] Commands should not change things if nothing needs doing` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `327d564106`)	2019-08-28 11:22:47 -04:00
Guillaume Abrioux	8f781198d6	lint: fix error [306], add pipefail on shell command using pipe This commit fixes the error [306]: `[306] Shells that use pipes should set the pipefail option` using `/bin/bash` as executable because Debian/Ubuntu systems use `dash` by default which doesn't have the `-o pipefail`. (See: https://github.com/ansible/ansible-lint/issues/497#issue-424623501) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `102edaeb61`)	2019-08-28 11:22:47 -04:00
Dimitri Savineau	364951ce2f	ceph-mon: Bind mount the ca-trust directory On containerized deployment, the mon container sometimes needs access to the radosgw endpoint (via the radosgw-admin command). When using TLS on the radosgw with self-signed certificates then we need access to the CA certificate from the mon container. The CA certificate needs to be added on the host and then the directory will be bind mounted in the container. Resolves: #4358 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `2b0616ecca`)	2019-08-28 09:44:34 -04:00
Dimitri Savineau	1fbfa1ce1a	ceph-client: Use profile rbd in keyring caps Like the OpenStack keyrings, we can use the profile rbd for the clients keyring (both mon and osd). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `49aa05b96c`)	2019-08-28 09:42:03 -04:00
Dimitri Savineau	4df8de8f7b	Revert "osd: add 'osd blacklist' cap for osp keyrings" This reverts commit `2d955757ee`. The "osd blacklist" isn't an osd caps but should be used with mon caps. Also the correct caps for this is: 'allow command "osd blacklist"'. The current change is breaking the openstack and clients keyrings. By using the profile rbd (which is already used) we already rely on the ability to blacklist dead client. Resolves: #4385 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `717af83475`)	2019-08-28 09:42:03 -04:00
Johannes Kastl	3bfa1c50de	set discovered_python_interpreter if ansible_python_interpreter is defined If the user has set the `ansible_python_interpreter`, ansible will not try to discover python, so `discovered_python_interpreter` will not be set. Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter` if `ansible_python_interpreter` is defined Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `bd507fa147`)	2019-08-27 21:06:43 +00:00
guihecheng	196e70a75a	rgw/multisite: assign 'rgw_zone' to the exact section in ceph.conf since the following commit: commit `1ac94c048f` rgw: add support for multiple rgw instances on a single host we have multi-instance rgw support on a single host and the config section name of the rgw changed from [client.rgw.$(hostname)] -> [client.rgw.$(hostname).rgwX] when X is the sequence number: 0,1,2,... So we should assign 'rgw_zone' item to the exact rgw instance config section in ceph.conf Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com> (cherry picked from commit `a0590cae9d`)	2019-08-23 15:56:15 +02:00
Artur Fijalkowski	27014df45e	global: make directories mode parameterizable This commit makes it possible to parametrize the ceph directories modes. So it changes the hardcoded mode for ceph related directories from 0755 to customizable with the `ceph_directories_mode` variable. Closes: #2920 Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `011270ca69`)	2019-08-23 11:39:23 +00:00
Dimitri Savineau	500c59c648	ceph-osd: Add ulimit nofile on container start On containerized deployment, the OSD entrypoint runs some ceph-volume commands (lvm/simple scan and/or activate) which perform badly without the ulimit option. This option was added for all previous ceph-volume commands but not on the ceph-osd container startup. Also updating hard limit value to 4096 to reflect default baremetal value. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `9a4ac46d19`)	2019-08-22 22:50:17 +00:00
Kevin Coakley	c7950d5539	ceph-config: Set changed_when to false on fact gathering statements The "run 'ceph-volume lvm batch --report' to see how many osds are to be created" and "run 'ceph-volume lvm list' to see how many osds have already been created" statements only register the lvm_batch_report and lvm_list variables. Running those ceph-volume commands should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu> (cherry picked from commit `e11cbbbcb1`)	2019-08-22 20:36:39 +02:00
Johannes Kastl	3e17c458d0	facts: fix a typo This commit fixes a typo in roles/ceph-facts/tasks/facts.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `e1b9312084`)	2019-08-22 18:11:18 +02:00
Johannes Kastl	82ede0afdb	ceph-nfs: fail on openSUSE Leap using distro packages roles/ceph-validate/tasks/check_nfs.yml: fail on openSUSE Leap using `ceph_origin = distro`, as the ganesha packages are not available from the distribution repositories Fixes: #4342 Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `11aa5dbb58`)	2019-08-21 15:40:22 +02:00
Guillaume Abrioux	fcf571430b	handler: do not validate the server certificate against the CA Otherwise rgw handler ends up with an error when using https. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9329bbb3af`)	2019-08-21 15:40:07 +02:00
Johannes Kastl	15646d1030	install ceph-mds packages on SUSE/openSUSE install packages on SUSE/openSUSE distributions, using the same logic as on RedHat-based distributions Fixes #4340 Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `c721cb99cb`)	2019-08-21 09:54:09 +00:00
Johannes Kastl	34783253a5	remove duplicate task installing suse dependencies roles/ceph-common/tasks/installs/install_on_suse.yml: remove the task that installs the dependencies, as this is done later in install_suse_packages.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `504017d562`)	2019-08-20 14:36:15 +02:00
Guillaume Abrioux	642851fa5d	osd: add 'osd blacklist' cap for osp keyrings This commit adds the `osd blacklist` cap on all OSP clients keyrings. Fixes: #2296 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2d955757ee`)	2019-08-20 13:09:05 +02:00
Johannes Kastl	6fa0eb90a2	only support openSUSE Leap 15.x, fail on 42.x openSUSE switched from 'openSUSE 13.x' to 'openSUSE Leap 42.x' and then to 'openSUSE Leap 15.x' to align with SLES15 development. The previous logic did not correctly allow the current release, as 15.x matched the 'less than 42.3' condition. For now only support openSUSE Leap 15.x, and extend support once 16.x is released (or whatever the exact version will be) Signed-off-by: Johannes Kastl <kastl@b1-systems.de> (cherry picked from commit `5ee3d96fb4`)	2019-08-20 09:37:29 +02:00
Guillaume Abrioux	19c7b650db	osd: remove useless condition just like `ceph_osd_pool_default_size`, a pool size might change after an initial deployment. Having this condition prevents from customizing the pool in that case. This is not needed so let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `70cf2a5846`)	2019-08-20 09:13:15 +02:00
Guillaume Abrioux	6d90dbc3c0	common: replace shell module there is no need to use `shell` in these tasks. Let's use `command`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4df92152c0`)	2019-08-19 18:47:14 +00:00
Guillaume Abrioux	f08408bf5c	osd: refact 'wait for all osd to be up' task let's use `until` instead of doing test in bash using python oneliner also, use `command` instead of `shell`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `687087fd43`)	2019-08-19 18:47:14 +00:00
Guillaume Abrioux	2f77704591	common: use discovered_interpreter_python fact in order to use the right binary name when using python cli in command or shell module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `13815ad3ca`)	2019-08-19 18:47:14 +00:00
Guillaume Abrioux	0f90ffe9df	mgr: refact 'wait for all mgr to be up' task There's no need to use `shell` module here. Instead of using `| python -c`, let's use `from_json` filter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5b9b841108`)	2019-08-08 15:57:54 +02:00
Dimitri Savineau	d4348da7a1	mgr/dashboard: Fix grafana/prometheus url config When configuring grafana/prometheus embed in the mgr/dashboard, we need to use the address of the grafana-server node and not the current hostname because mgr/dashboard and grafana/prometheus could be present on different hosts. We should instead rely on the grafana_server_addr variable and remove the dashboard_url. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4c6ec1dccb`)	2019-08-08 13:47:09 +02:00
Dimitri Savineau	cf82ac5590	ceph-dashboard: Add run_once on delegate tasks Because we need to execute commands from a monitor node (the first one in the mons list) we are using delegate_to option. If there's multiple nodes running the ceph-dashboard role then the delegated task will be executed multiple times. Also remove a mgr config-key option not present for nautilus+ releases. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f545b5be0d`)	2019-08-08 13:47:09 +02:00
Dimitri Savineau	8bb1be30fa	ceph-infra: Apply firewall rules with container We don't have a reason to not apply firewall rules on the host when using a containerized deployment. The TripleO environments already manage the ceph firewall rules outside ceph-ansible and set the configure_firewall variable to false. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733251 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `771f25b1f8`)	2019-08-07 10:41:47 +02:00
Dimitri Savineau	308e5fe9f4	ceph-grafana: Set grafana uid/gid on files We don't need to create a grafana system user (in fact we even don't set the right uid to this user) because we're using a container setup. Instead we just need to be sure to set the owner/group to 472 (grafana user/group from the container) like we do for ceph/167. We don't need to set the user/group recursively on /etc/grafana directory in a dedicated task. Also on Ubuntu system, the ceph-grafana-dashboards isn't present so on non containerized deployment we won't have the /etc/grafana/dashboards/ceph-dashboard directory present (coming with the package) so we need to be sure it exists. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34036c667c`)	2019-08-07 10:41:03 +02:00
Dimitri Savineau	36e18e20d1	ceph-osd: check container engine rc for pools When creating OpenStack pools, we only check if the return code from the pool list command isn't 0 (ie: if it doesn't exist). In that case, the return code will be 2. That's why the next condition is rc != 0 for the pool creation. But in containerized deployment, the return code could be different if there's a failure on the container engine command (like container not running). In that case, the return code could be either 1 (docker) or 125 (podman) so we should fail at this point and not in the next tasks. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d549fffdd2`)	2019-07-31 14:07:41 -04:00
Guillaume Abrioux	51af74face	dashboard: fix timeout usage on rgw user creation command For some reason, this is making the playbook failing like following: ``` TASK [ceph-dashboard : create radosgw system user] ********************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************** task path: /home/guits/ceph-ansible/roles/ceph-dashboard/tasks/configure_dashboard.yml:106 Tuesday 30 July 2019 10:04:54 +0200 (0:00:01.910) 0:11:22.319 ******** FAILED - RETRYING: create radosgw system user (3 retries left). FAILED - RETRYING: create radosgw system user (2 retries left). FAILED - RETRYING: create radosgw system user (1 retries left). fatal: [mgr0 -> mon0]: FAILED! => changed=true attempts: 3 cmd: timeout 20 podman exec ceph-mon-mon0 radosgw-admin user create --uid=ceph-dashboard --display-name='Ceph dashboard' --system delta: '0:00:20.021973' end: '2019-07-30 08:06:32.656066' msg: non-zero return code rc: 124 start: '2019-07-30 08:06:12.634093' stderr: 'exec failed: container_linux.go:336: starting container process caused "process_linux.go:82: copying bootstrap data to pipe caused \"write init-p: broken pipe\""' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` using `timeout -f -s KILL` fixes this issue. Also, there is no need to use `shell` module here, let's switch to `command`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c9d80af4e0`)	2019-07-30 15:08:46 +02:00
Guillaume Abrioux	ea44783f3d	validate: add checks for grafana-server group definition this commit adds two checks: - check that the `[grafana-server]` group is defined - check that the `[grafana-server]` contains at least one node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `02beb00916`)	2019-07-29 15:46:58 +02:00
Guillaume Abrioux	e2b41a17c0	mgr: fix a typo this task isn't using the right container_exec_cmd, that's delegating to the wrong node. Let's use the right fact to fix this command. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ec33ee7574`)	2019-07-29 15:46:58 +02:00
Guillaume Abrioux	1a9043128c	dashboard: remove cfg80211 module installation According to this comment [1], this seems to be needed to detect wifi devices. In node exporter we can see this: ``` --collector.wifi Enable the wifi collector (default: disabled). ``` since it's disabled by default and we don't even change this in our systemd templates for node-exporter, we can easily assume in the end it's not needed. Therefore, let's remove this. [1] `dbf81b6b5b (diff-961545214e21efed3b84a9e178927a08L21-L23)` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b9cdf341be`)	2019-07-29 15:46:58 +02:00
Guillaume Abrioux	d0ad1cf0f1	dashboard: use dedicated group only There's no need to add complexity and trying to fallback on other group. Let's deploy dashboard on all nodes present in grafana-server group. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d67230b2a2`)	2019-07-29 15:46:58 +02:00
Guillaume Abrioux	93826e061d	dashboard: enable dashboard by default This commit enables dashboard deployment by default. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fb1b5b3251`) # Conflicts: # tox-dashboard.ini	2019-07-29 15:46:58 +02:00
Dimitri Savineau	43d625b59a	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `07c6695d16`)	2019-07-26 16:23:41 -04:00
Guillaume Abrioux	6ef73b59d2	container: rename docker directories Those 2 directories should be renamed to be more generic (docker vs. podman). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `19950b5170`)	2019-07-25 13:40:40 +02:00
fmount	15c745d998	Avoid to setup provisioners in a fully containerized environment This commit adds a when clause to avoid the setup of grafana provisioners in a fully containerized scenario. This is needed when the ceph-grafana-dashboards package is not installed and this task could result in a wrong grafana configuration that let the container crash. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `fac1b030cb`)	2019-07-24 14:16:55 +02:00
Dimitri Savineau	367dce2894	ceph-dashboard: enable rgw options conditionally The dashboard rgw frontend options only need to be applied when there's some nodes present in the rgw ansible group. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5383c2f7f3`)	2019-07-19 20:33:42 +00:00
Dimitri Savineau	87db5aa55c	dashboard: use variables for port value The current port value for alertmanager, grafana, node-exporter and prometheus is hardcoded in the roles so it's not possible to change the port binding of those services. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8ab9b719fa`)	2019-07-19 20:33:42 +00:00
Giulio Fidente	985165dbf7	Fix backward compat with old cephfs_pools format Previously cephfs_pools items used to have a pgs: key but not pgp_num: nor pg_num: Signed-off-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `edd1420217`)	2019-07-19 17:50:57 +00:00
Guillaume Abrioux	bbfd6965e0	handler: fix bug in osd handlers `fbf4ed42ae` introduced a bug when container binary is podman. podman doesn't support ps -f using regular expression, the container id is never set in the restart script causing the handler to fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721536 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `618dbf271d`)	2019-07-18 16:49:14 +00:00
Guillaume Abrioux	4aa4496fc1	validate: fail if gpt header found on unprepared devices ceph-volume will complain if gpt headers are found on devices. This commit checks whether a gpt header is present on devices passed in `devices` variable and fail early. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730541 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `487d701685`)	2019-07-18 10:32:53 +02:00
Dimitri Savineau	2d8ed4cc52	ceph-infra: update handler with daemon variable Both ntp and chrony daemon use variable for the service name because it could be different depending on the GNU/Linux distribution. This has been updated in `9d88d3199` for chrony but only for the start part not for the handler. The commit fixes this for both ntp and chrony. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0ae0193144`)	2019-07-15 16:36:49 +00:00
Dimitri Savineau	b87c189299	ceph-infra: Open prometheus port The Prometheus port 9090 isn't open in the firewall configuration. Also the dashboard task on the grafana node was not required because it's already present on the mgr node. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `41b44dde85`)	2019-07-11 13:41:58 +00:00
Guillaume Abrioux	2742063aee	handler: remove legacy condition since everything is already in a block with the same condition, it's not needed to leave all of them on these tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ee29f7370a`)	2019-07-10 15:53:26 +00:00
Guillaume Abrioux	bca8ac39c2	validate: improve message printed in check_devices.yml The message prints the whole content of the registered variable in the playbook, this is not needed and makes the message pretty unclear and unreadable. ``` "msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!" ``` Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e6dc3ebd8c`)	2019-07-10 09:37:01 -04:00
Boris Ranto	5d5e7d59fd	dashboard: Use upstream default port We are currently using incorrect dashboard default port. The upstream uses 8443 instead of 8234 by default. This should get us closer to the upstream project. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `21758fcee8`)	2019-07-10 11:49:35 +02:00
Dimitri Savineau	3bdcbb005f	ceph-dashboard: remove bool filter for rgw vars Some dashboard_rgw_api_* variables are using the bool filter but those variables are strings with an empty string as default value. So we should test the variable against an empty string instead of a bool. dashboard_rgw_api_host: '' dashboard_rgw_api_port: '' dashboard_rgw_api_scheme: '' dashboard_rgw_api_admin_resource: '' Resolves: #4179 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5413274412`)	2019-07-10 11:48:58 +02:00
Dimitri Savineau	c040c34d97	ceph-iscsi: Update gateway config/template - Remove gateway_keyring from the configuration file because it's not used in ceph-iscsi 3.x release. - Use config_template instead of template module for iscsi-gateway configuration file. Because the file is an ini file and we might want to override more parameters than those present in ceph-ansible. - Because we can now set the pool name in the configuration, we should use a variable for that. This is refact with the iscsi_pool_* variables also used to configure the pool size. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1f2a4f1910`)	2019-07-10 09:35:21 +00:00
Dimitri Savineau	f13e6642a4	ceph-handler: fix cluster name in socket path `c90f605b5` introduces the default ceph cluster name value in the rgw socket path for the rgw restart script. But this should use the `cluster` variable instead. This commit also fixes this in the osd restart script. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `de7f948b75`)	2019-07-08 19:57:08 +00:00
ilyashestopalov	5c6a9e1a96	ceph-mon: Fix cluster name parameter The ability to add nodes with the monitor role to an existing cluster whose name differs from the default name is fixed. Signed-off-by: ilyashestopalov <usr.tester@yandex.ru> (cherry picked from commit `904532c5e2`)	2019-07-08 09:12:37 -04:00
fmount	ca378f1da0	Add package-install tag on ceph-grafana-dashboard pkg install. According to the OSP pattern, we need the package-install tag to control what is installed on the host. This commit just add the missing tag to meet the TripleO requirements. See: /issues/4197 for details Fixes: #4197 Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `95bd002b35`)	2019-07-08 10:42:41 +00:00
Dimitri Savineau	cd7156efee	ceph-iscsi-gw: Update log directories bind mount On containerized deployment we need to bind mount the ceph-iscsi directory to avoid writing the logs in the container. The /var/log/ceph directory isn't used by rbd-target-api/gw services because they have their own log directories. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `91bef94b6c`)	2019-07-07 07:09:42 +00:00
Guillaume Abrioux	689605b084	iscsi: refact deprecated variables This commit moves some old variables into ceph-defaults so we can move the `use_new_ceph_iscsi` fact in ceph-facts role in order. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a781ce881c`)	2019-07-04 00:04:04 +00:00
Mike Christie	ce62ac7beb	igw: Add check for missing iqn If the user is still using the older packages and does not setup the target iqn you will just get a vague error message later on. This adds a check during the validate task, so it is clear to the user. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `08a6d10c32`)	2019-07-04 00:04:04 +00:00
Mike Christie	cb8bab06d8	igw: Update iscsigws.yml.sample for ceph-iscsi support Update iscsigws.yml.sample to document that we cannot use ansible to setup iSCSI objects and use the new ceph-iscsi package. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `75fee55d19`)	2019-07-04 00:04:04 +00:00
Mike Christie	6872f7ee95	igw: Support ceph-iscsi package for install This adds support for the ceph-iscsi package during install. ceph-iscsi does not support setting up targets/gws, luns and clients with the current library/igw_* code. Going forward those tasks should be done with gwcli or dashboard. ceph-iscsi will only be used if the user has no iscsi objects setup so we do not break existing setups. The next patch will update the iscsigws.yml.sample to document that users must not setup any iscsi object if they want to use the new package and tools. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `cbe66cec52`)	2019-07-04 00:04:04 +00:00
Mike Christie	f180eccb84	igw: drop gateway_ip_list for container setups The gateway_ip_list is not used in container setups, so drop it for that case. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `b7b2213be1`)	2019-07-04 00:04:04 +00:00
Mike Christie	f984db5544	igw: move gateway_ip_list check to validate role Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `d89d3e7cd6`)	2019-07-04 00:04:04 +00:00
Dimitri Savineau	d4a3e26534	ceph-handler: Fix rgw socket in restart script Since Mimic the radosgw socket has two extra fields in the socket name (before the .asok suffix): <pid>.<ctid> Before: /var/run/ceph/ceph-client.rgw.cephaio-1.asok After: /var/run/ceph/ceph-client.rgw.cephaio-1.16913.23928832.asok The radosgw restart script doesn't handle this and could fail during an upgrade. If the SOCKETS variable isn't defined in the script then the test command won't fail because the return code is 0 $ test -S $ echo $? 0 There are multiple issues in that script: - The default SOCKETS value isn't defined due to a typo SOCKET vs SOCKETS. - Because the socket name uses the pid then we need to check the socket name after the service restart. - After restarting the radosgw service we need to wait a few seconds otherwise the socket won't be created. - Update the wget parameters because the command is doing a loop. We now use the same options as curl. - The check_rest function doesn't test the radosgw at all due to a wrong test command (test against a string) and always returns 0. This needs to use the DOCKER_EXECS variable in order to execute the command. $ test 'wget http://192.168.100.11:8080' $ echo $? 0 Also remove the test based on the ansible_fqdn because we only use the ansible_hostname + rgw instance name. Finally group all for loops into a single one. Resolves: #3926 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c90f605b51`)	2019-07-03 15:08:35 +00:00
Giulio Fidente	72e0ac1f44	Add radosgw_frontend_ssl_certificate parameter This is necessary when configuring RGW with SSL because in addition to passing specific frontend options, civetweb appends the 's' character to the binding port and beast uses ssl_endpoint instead of endpoint. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1722071 Signed-off-by: Giulio Fidente <gfidente@redhat.com> (cherry picked from commit `d526803c6c`)	2019-07-02 20:13:09 +00:00
Guillaume Abrioux	2295a4cf0a	containers: improve logging bindmount /var/log/ceph on all containers so it's possible to retrieve logs from the host. related ceph-container PR: ceph/ceph-container#1408 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710548 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `33eed78d17`)	2019-07-02 11:27:34 -04:00
Guillaume Abrioux	381358f439	nfs: clean template remove legacy options ``` ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:13): Unknown parameter (Dir_Max) ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:14): Unknown parameter (Cache_FDs) ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b725b3077e`)	2019-07-02 11:01:07 +02:00
Dimitri Savineau	109883e7a5	ceph-osd: Add CONTAINER_IMAGE env variable This environment variable was added in `cb381b4` but was removed in `4d35e9e`. This commit reintroduces the change. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `02fbe76e62`)	2019-06-27 17:34:24 -04:00
Guillaume Abrioux	bcfed47009	dashboard: move ceph-grafana-dashboards package installation This commit moves the package installation into ceph-dashboard role. This is needed to install ceph dashboard json file in `/etc/grafana/dashboards/ceph-dashboard/`. Closes: #4026 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6e2e30db54`)	2019-06-26 12:03:21 -04:00
Guillaume Abrioux	df0d146166	infra: refact dashboard firewall rules - There is no need to open ports 3000, 8234, 9283 on all nodes. - Add missing rule for alertmanager (port 9093) Closes: #4023 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `14f5fc3c86`)	2019-06-26 12:03:21 -04:00
Guillaume Abrioux	28e1ce0d8c	dashboard: append mgr modules to ceph_mgr_modules when `dashboard_enabled` is `True`, let's append `dashboard` and `prometheus` modules to `ceph_mgr_modules` so they are automatically loaded. Closes: #4026 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a2b6f44665`)	2019-06-26 12:03:21 -04:00
fmount	5c009d01b6	Set grafana_server_addr fact for ipv6 scenarios. As the bz1721914 describes, the grafana_server_addr fact is not defined if ip_version used is ipv6. This commit adds the ip_version condition to set correctly this fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721914 Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `e655038743`)	2019-06-26 12:02:29 -04:00
Guillaume Abrioux	b9c49227bb	facts: fix bug in grafana_server_addr fact setting If no grafana-server group is defined while an mgr group is, that task will fail because `hostvars[groups[grafana_server_group_name][0]` can't return anything since `groups['grafana-server']` will be a non existing key. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `366b309c12`)	2019-06-26 15:08:44 +02:00
Guillaume Abrioux	115b457731	nfs: add missing | bool filters To address this warning: ``` [DEPRECATION WARNING]: evaluating nfs_ganesha_dev as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2b9fb377a8`)	2019-06-26 13:13:11 +02:00
Guillaume Abrioux	bf61b5e823	nfs: remove duplicate task This task is already present in pre_requisite_non_container.yml Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `edb8d42596`)	2019-06-26 13:13:11 +02:00
Dimitri Savineau	fbf4ed42ae	ceph-handler: Fix OSD restart script There are two big issues with the current OSD restart script. 1/ We try to test if the ceph osd daemon socket exists but we use a wildcard for the socket name : /var/run/ceph/*.asok. This fails because we usually have multiple ceph osd sockets (or other ceph daemon collocated) present in /var/run/ceph directory. Currently the test fails with: bash: line xxx: [: too many arguments But it doesn't stop the script execution. Instead we can specify the full ceph osd socket name because we already know the OSD id. 2/ The container filter pattern is wrong and could match multiple containers, causing the script to fail. We use the filter with two different patterns. One is with the device name (sda, sdb, ..) and the other one is with the OSD id (ceph-osd-0, ceph-osd-15, ..). In both cases we could match more than needed. $ docker container ls CONTAINER ID IMAGE NAMES 958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda 589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb 46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa 877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab $ docker container ls -q -f "name=sda" 958121a7cc7d 46c7240d71f3 877985ec3aca $ docker container ls CONTAINER ID IMAGE NAMES 2db399b3ee85 ceph-daemon:latest ceph-osd-5 099dc13f08f1 ceph-daemon:latest ceph-osd-13 5d0c2fe8f121 ceph-daemon:latest ceph-osd-17 d6c7b89db1d1 ceph-daemon:latest ceph-osd-1 $ docker container ls -q -f "name=ceph-osd-1" 099dc13f08f1 5d0c2fe8f121 d6c7b89db1d1 Adding an extra '$' character at the end of the pattern solves the problem. Finally removing the get_container_osd_id function because it's not used in the script at all. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `45d46541cb`)	2019-06-21 14:51:29 -04:00
Dimitri Savineau	6fd4902b55	Change ansible_lsb by ansible_distribution_release The ansible_lsb fact is based on the lsb package (lsb-base, lsb-release or redhat-lsb-core). If the package isn't installed on the remote host then the fact isn't populated. -------- "ansible_lsb": {}, -------- Switching to the ansible_distribution_release fact instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dc187ea6fa`)	2019-06-21 13:36:15 -04:00
fpantano	c03a1e49dd	Add higher retry/delay defaults to check the quorum status. As per bz1718981, this commit adds higher values to check the quorum status. This is helpful for several OSP deployments that fail during the scale up. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1718981 Signed-off-by: fpantano <fpantano@redhat.com> (cherry picked from commit `ba73dc7b21`)	2019-06-20 20:03:19 -04:00
Dimitri Savineau	62d98971f2	ceph-volume: Set max open files limit on container The ceph-volume lvm list command takes ages to complete when having a lot of LV devices on containerized deployment. For instance, with 25 OSDs on a node it takes 3 mins 44s to list the OSD. Adding the max open files limit to the container engine cli when executing the ceph-volume command seems to improve the execution time a lot (~30s). This was impacting the OSDs creation with ceph-volume (both filestore and bluestore) when using multiple LV devices. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b987534881`)	2019-06-20 20:00:53 -04:00
Dimitri Savineau	590f6026bb	roles: Remove useless become (true) flag We already set the become flag to true at a play level in the site* playbooks so we don't need to set it at a task level. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7c3640177b`)	2019-06-20 22:00:27 +00:00
Guillaume Abrioux	52ff9ce5d1	facts: add a retry on get current fsid task sometimes it can happen the following task fails: ``` TASK [ceph-facts : get current fsid] ***************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-centos-container-update/roles/ceph-facts/tasks/facts.yml:78 Wednesday 19 June 2019 18:12:49 +0000 (0:00:00.203) 0:02:39.995 **** fatal: [mon2 -> mon1]: FAILED! => changed=true cmd: - timeout - --foreground - -s - KILL - 600s - docker - exec - ceph-mon-mon1 - ceph - --cluster - ceph - daemon - mon.mon1 - config - get - fsid delta: '0:00:00.239339' end: '2019-06-19 18:12:49.812099' msg: non-zero return code rc: 22 start: '2019-06-19 18:12:49.572760' stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` not sure exactly why since just before this task, mon1 seems to be well UP otherwise it wouldn't have passed the task `waiting for the containerized monitor to join the quorum`. As a quick fix/workaround, let's add a retry which allows us to get around this situation: ``` TASK [ceph-facts : get current fsid] *************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-scenario/roles/ceph-facts/tasks/facts.yml:78 Thursday 20 June 2019 15:35:07 +0000 (0:00:00.201) 0:03:47.288 ******* FAILED - RETRYING: get current fsid (3 retries left). 
changed: [mon2 -> mon1] => changed=true attempts: 2 cmd: - timeout - --foreground - -s - KILL - 600s - docker - exec - ceph-mon-mon1 - ceph - --cluster - ceph - daemon - mon.mon1 - config - get - fsid delta: '0:00:00.290252' end: '2019-06-20 15:35:13.960188' rc: 0 start: '2019-06-20 15:35:13.669936' stderr: '' stderr_lines: <omitted> stdout: |- { "fsid": "153e159d-7ade-42a7-842c-4d04348b901e" } stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `46a2683944`)	2019-06-20 14:01:33 -04:00
Guillaume Abrioux	c245c4e8eb	osd: remove legacy task `parted_results` isn't used anymore in the playbook. By the way, `parted` seems to cause issues because it changes the ownership on devices: ``` [root@osd0 ~]# ls -l /dev/sdc* brw-rw----. 1 root disk 8, 32 Jun 11 08:53 /dev/sdc brw-rw----. 1 ceph ceph 8, 33 Jun 11 08:53 /dev/sdc1 brw-rw----. 1 ceph ceph 8, 34 Jun 11 08:53 /dev/sdc2 [root@osd0 ~]# parted -s /dev/sdc print Model: ATA QEMU HARDDISK (scsi) Disk /dev/sdc: 53.7GB Sector size (logical/physical): 512B/512B Partition Table: gpt Disk Flags: Number Start End Size File system Name Flags 1 1049kB 1075MB 1074MB ceph block.db 2 1075MB 2149MB 1074MB ceph block.db [root@osd0 ~]# #We can see ownerships have changed from ceph:ceph to root:disk: [root@osd0 ~]# ls -l /dev/sdc* brw-rw----. 1 root disk 8, 32 Jun 11 08:57 /dev/sdc brw-rw----. 1 root disk 8, 33 Jun 11 08:57 /dev/sdc1 brw-rw----. 1 root disk 8, 34 Jun 11 08:57 /dev/sdc2 [root@osd0 ~]# ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `eece362b38`)	2019-06-19 08:41:25 +00:00
Rishabh Dave	c51e0b51d2	align cephfs pool creation The definitions of cephfs pools should match openstack pools. Signed-off-by: Rishabh Dave <ridave@redhat.com> Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net> (cherry picked from commit `67071c3169`)	2019-06-18 09:17:13 +02:00
Dimitri Savineau	6e565b251d	remove ceph-agent role and references The ceph-agent role was used only for RHCS 2 (jewel) so it's not useful anymore. The current code will fail on CentOS distribution because the rhscon package is only available on Red Hat with the RHCS 2 repository and this ceph release is supported on stable-3.0 branch. Resolves: #4020 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7503098ca0`)	2019-06-17 15:56:00 -04:00
Dimitri Savineau	b1f8518ef9	tests: Update ansible ssh_args variable Because we're using vagrant, an ssh config file will be created for each node with options like user, host, port, identity, etc. But via tox we're overriding ANSIBLE_SSH_ARGS to use this file. This removes the default value set in ansible.cfg. Also adding PreferredAuthentications=publickey because CentOS/RHEL servers are configured with GSSAPIAuthentication enabled for the ssh server, forcing the client to make a PTR DNS query. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34f9d51178`)	2019-06-17 16:45:38 +02:00
Rishabh Dave	dc66a5e65a	ceph-infra: make chronyd default NTP daemon Since timesyncd is not available on RHEL-based OSs, change the default to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so set the Ansible fact accordingly. Fixes: https://github.com/ceph/ceph-ansible/issues/3628 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `9d88d3199f`)	2019-06-14 12:21:02 +00:00
Guillaume Abrioux	6805eb3184	iscsi: assign application (rbd) to pool 'rbd' if we don't assign the rbd application tag on this pool, the cluster will get `HEALTH_WARN` state like following: ``` HEALTH_WARN application not enabled on 1 pool(s) POOL_APP_NOT_ENABLED application not enabled on 1 pool(s) application not enabled on pool 'rbd' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4cf17a6fdd`)	2019-06-13 14:51:19 -04:00
Rishabh Dave	34e3b3f0e4	ceph-infra: update cache for Ubuntu Ubuntu-based CI jobs often fail with error code 404 while installing NTP daemons. Updating cache beforehand should fix the issue. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `d1c266e6c7`)	2019-06-13 14:50:19 -04:00
Guillaume Abrioux	b1a3b6e2f1	mon: enforce mon0 delegation for initial_mon_key register since this task is designed to be always run on the first monitor, let's enforce the container name accordingly otherwise it could fail like following: ``` fatal: [mon1 -> mon0]: FAILED! => changed=true cmd: - docker - exec - ceph-mon-mon1 - ceph - --cluster - ceph - --name - mon. - -k - /var/lib/ceph/mon/ceph-mon0/keyring - auth - get-key - mon. delta: '0:00:00.085025' end: '2019-06-12 06:12:27.677936' msg: non-zero return code rc: 1 start: '2019-06-12 06:12:27.592911' stderr: 'Error response from daemon: No such container: ceph-mon-mon1' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `905c2256bd`)	2019-06-13 07:39:07 +02:00
Dimitri Savineau	f71e8f249f	ceph-node-exporter: Fix systemd template `069076b` introduced a bug in the systemd unit script template. This commit fixes the options used by the node-exporter container. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d0840217f3`)	2019-06-13 07:37:26 +02:00
Guillaume Abrioux	5e392d1a60	dashboard: add allow_embedding support Add a variable to support the allow_embedding support. See ceph/ceph-ansible/issues/4084 for details. Fixes: #4084 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `27856cc499`)	2019-06-12 17:05:26 -04:00
Guillaume Abrioux	dfdaef4158	dashboard: fix dashboard_url setting This setting must be set to something resolvable. See: ceph/ceph-ansible/issues/4085 for details Fixes: #4085 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2c9cd9d9e7`)	2019-06-12 17:04:57 -04:00
Dimitri Savineau	3815add534	ceph-handler: replace fuser by /proc/net/unix We're using fuser command to see if a process is using a ceph unix socket file. But the fuser command runs through every PID present in /proc/<PID> to see if one of them is using the file. On a system running thousands processes, the fuser command can take a long time to finish. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da9891da1e`)	2019-06-12 23:00:36 +02:00
Dimitri Savineau	7c6a09152d	ceph-node-exporter: use modprobe ansible module Instead of using the modprobe command from the path in the systemd unit script, we can use the modprobe ansible module. That way we don't have to manage the binary path based on the linux distribution. Resolves: #4072 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dbf81b6b5b`)	2019-06-12 10:02:54 -04:00
fmount	138fa19ccf	Fix units and add ability to have a dedicated instance Few fixes on systemd unit templates for node_exporter and alertmanager container parameters. Added the ability to use a dedicated instance to deploy the dashboard components (prometheus and grafana). This commit also introduces the grafana_group_name variable to refer grafana group and keep consistency with the other groups. During the integration with TripleO some grafana/prometheus template variables resulted undefined. This commit adds the ability to check if the group exist and create, accordingly, different job groups in prometheus template. Signed-off-by: fmount <fpantano@redhat.com> (cherry picked from commit `069076bbfd`)	2019-06-12 11:48:12 +02:00
Guillaume Abrioux	d36bab5557	validate: fail in check_devices at the right task see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `771648304d`)	2019-06-10 08:11:39 +02:00
Dimitri Savineau	376cb86db2	container-common: support podman on Ubuntu Currently we're only able to use podman on ubuntu if podman's installation is done manually before the ceph-ansible execution because the deb package is present in an external repository. We already manage the docker-ce installation via an external repository so we should be able to allow the podman installation with the same mechanism too. https://github.com/containers/libpod/blob/master/install.md#ubuntu Resolves: #3947 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `518ab794fb`)	2019-06-07 10:12:36 -04:00
Dimitri Savineau	e9edb5a92a	podman: Add systemd dependency on network.target When using podman, the systemd unit scripts don't have a dependency on the network. So we're not sure that the network is up and running when the containers are starting. With docker this behaviour is already handled because the systemd unit scripts depend on docker service which is started after the network. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f49090df7e`)	2019-06-07 16:06:26 +02:00
L3D	1daca1ba83	ansible: use 'bool' filter on boolean conditionals When running ceph-ansible there are a lot of ``[DEPRECATION WARNING]`` messages like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de> (cherry picked from commit `ab54fe20ec`)	2019-06-07 16:05:51 +02:00
guihecheng	c52020a4db	Add role definitions of ceph-rgw-loadbalancer This adds support for an rgw loadbalancer based on HAProxy and Keepalived. We define a single role ceph-rgw-loadbalancer and include HAProxy and Keepalived configurations all in this. A single haproxy backend is used to balance all RGW instances and a single frontend is exported via a single port, default 80. Keepalived is used to maintain the high availability of all haproxy instances. You are free to use any number of VIPs. A single VIP is shared across all keepalived instances and there will be one master for one VIP, selected sequentially, and others serve as backups. This assumes that each keepalived instance is on the same node as one haproxy instance and we use a simple check script to detect the state of each haproxy instance and trigger the VIP failover upon its failure. Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com> (cherry picked from commit `35d40c65f8`)	2019-06-06 19:44:30 +00:00
Guillaume Abrioux	6449d8fd56	validate: add a check for nfs standalone if `nfs_obj_gw` is True when deploying an internal ganesha with an external ceph cluster, `ceph_nfs_rgw_access_key` and `ceph_nfs_rgw_secret_key` must be provided so the ganesha configuration file can be generated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `003aeea45a`)	2019-06-06 12:44:37 +00:00
Guillaume Abrioux	cb125fa4c8	nfs: support internal Ganesha with external ceph cluster This commit allows deploying an internal ganesha with an external ceph cluster. This requires defining `external_cluster_mon_ips` as a comma-separated list of external monitors. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710358 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6a6785b719`)	2019-06-06 12:44:37 +00:00
Guillaume Abrioux	61a52a97e3	ceph-osd: do not relabel /run/udev in containerized context Otherwise content in /run/udev is mislabeled and prevents some services like NetworkManager from starting. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `80875adba7`)	2019-06-04 22:09:27 +00:00
Guillaume Abrioux	3b40380870	tests: test podman against atomic os instead rhel8 the rhel8 image used is an outdated beta version, it is not worth it to maintain this image upstream, since it's possible to test podman with a newer version of centos/atomic-host image. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a78fb209b1`)	2019-06-04 22:09:27 +00:00
Dimitri Savineau	b8bcbacdbb	ceph-nfs: use template module for configuration `789cef7` introduces a regression in the ganesha configuration file generation. The new config_template module version broke it. But the ganesha.conf file isn't an ini file and doesn't really need to use the config_template module. Instead we can use the classic template module. Resolves: #4045 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `616c484698`)	2019-06-04 14:18:51 +02:00
Dimitri Savineau	acef6665ca	ceph-facts: generate fsid on mon node The fsid generation is done via a python command. When the ansible controller node only has python3 available (like RHEL 8) then the python command isn't necessarily present causing the fsid generation to fail. We already do some resource creation (like ceph keyring secret) with the python command too but from the mon node so we should do the same for fsid. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1714631 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `daf92a9e1f`)	2019-06-03 09:01:33 -04:00
Guillaume Abrioux	16c6d530c6	roles: introduce `ceph-container-engine` role This commit splits the current `ceph-container-common` role. This introduces a new role `ceph-container-engine` which handles the tasks specific to the installation of containers tools (docker/podman). This is needed for the ceph-dashboard implementation for 2 main reasons: 1/ Since the ceph-dashboard stack is only containerized, we must install everything needed to run containers even in non containerized deployments. Splitting this role allows us to not have to call the full `ceph-container-common` role which would run a bunch of unneeded tasks that would have been skipped anyway. 2/ The current implementation would have required to run `ceph-container-common` on all ceph-clients nodes which would have been conflicting with `9d3517c670` (we don't want to run ceph-container-common on all client nodes, see mentioned commit for more details) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `55420d6253`)	2019-05-22 15:24:11 -04:00
Dimitri Savineau	27bd7df5cf	ceph-mgr: install python-routes for dashboard The ceph mgr dashboard requires routes python library to be installed on the system. Resolves: #3995 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f37edfa113`)	2019-05-22 13:07:17 +02:00
Dimitri Savineau	6d521f1516	ceph-prometheus: fix error in templates - remove trailing double quotes in jinja templates - add jinja filename without .j2 suffix Resolves: #4011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `29b0d47c8c`)	2019-05-22 08:45:31 +02:00
Dimitri Savineau	1fd81e8d42	common: use gnupg instead of gpg gpg package isn't available for all Debian/Ubuntu distribution but gnupg is. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `622d9feae9`)	2019-05-21 16:28:51 -04:00
Guillaume Abrioux	5982e17315	config: fix ipv6 As of nautilus, if you set `ms bind ipv6 = True` you must explicitly set `ms bind ipv4 = False` too, otherwise OSDs will still try to pick up an IPv4 address. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710319 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6ca7372a2d`)	2019-05-21 16:26:54 -04:00
Dimitri Savineau	6e917da52a	ceph-nfs: apply selinux fix anyway Because ansible_distribution_version doesn't return minor version on CentOS with ansible 2.8 we can apply the selinux anyway but only for CentOS/RHEL 7. Starting RHEL 8, there's a dedicated package for selinux called nfs-ganesha-selinux [1]. Also replace the command module + semanage by the selinux_permissive module. [1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0ee833432e`)	2019-05-21 09:17:46 +02:00
Dimitri Savineau	78ce0aa0b5	ceph-validate: use kernel validation for iscsi Ceph iSCSI gateway requires Red Hat Enterprise Linux or CentOS 7.5 or later. Because we can not check the ansible_distribution_version fact for CentOS with ansible 2.8 (returns only the major version) we can fallback by checking the kernel option. - CONFIG_TARGET_CORE=m - CONFIG_TCM_USER2=m - CONFIG_ISCSI_TARGET=m http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0c7fd79865`)	2019-05-21 09:17:46 +02:00
Guillaume Abrioux	d83db2c8ed	switch to ansible 2.8 - remove private attribute with import_role. - update documentation. - update rpm spec requirement. - fix MagicMock python import in unit tests. Closes: #3765 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `72d8315299`)	2019-05-21 09:17:46 +02:00
Dimitri Savineau	bcafb182c4	common: install dependencies for apt modules When using a minimal Debian/Ubuntu distribution there's no ca-certificates and gpg packages installed so the apt modules will fail: Failed to find required executable gpg in paths: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin apt.cache.FetchFailedException: W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease: No system certificates available. Try installing ca-certificates. Resolves: #3994 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `494746b7a6`)	2019-05-20 10:45:46 +02:00
Guillaume Abrioux	1e2f8cd909	dashboard: move defaults variables to ceph-defaults There is no need to have default values for these variables in each roles since there is no corresponding host groups Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9f0d4d6847`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	e29fd842a6	rename docker_exec_cmd variable This commit renames the `docker_exec_cmd` variable to `container_exec_cmd` so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e74d80e72f`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	aa80895d19	dashboard: align the way containers are managed This commit aligns the way the different containers are managed with how it's currently done with the other ceph daemon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cc285c417a`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	567c6ceb43	dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to be used as such. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cd5f3fca64`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	c38c72d914	dashboard: remove legacy file this file seems to be no longer used, let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8bbcc46ae4`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	79ad697af7	dashboard: set less permissive permissions on dashboard certificate/key use `0440` instead of `0644` is enough Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `14f381200d`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	c45906e0ac	dashboard: simplify config-key command since stable-4.0 isn't meant to deploy ceph releases prior to nautilus, there's no need to add this complexity here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4405f50c85`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	fe5bcc2f9f	dashboard: do not call ceph-container-common from other role use site.yml to deploy ceph-container-common in order to install docker even in non-containerized deployments since there's no RPM available to deploy the different applications needed for ceph-dashboard. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `cdff0da7d4`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	c48c3776be	dashboard: use existing variable to detect containerized deployment there is no need to add more complexity for this, let's use `containerized_deployment` in order to detect if we are running a containerized deployment. The idea is to use `container_exec_cmd` the same way we do in the rest of the playbook to run the different ceph commands needed to deploy the ceph-dashboard role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `742bb6214c`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	4702194d6e	facts: set container_binary fact in non-containerized deployment This is needed for the ceph-dashboard implementation since it requires to run containerized application which aren't packaged as RPMs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6d9dbb1d39`)	2019-05-17 16:05:58 +02:00
Guillaume Abrioux	997d179b7c	dashboard: rename template files add .j2 to all templates file related to dashboard roles. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3578d576a4`)	2019-05-17 16:05:58 +02:00
Boris Ranto	db3f0088fc	dashboard: Support podman This adds support for podman in dashboard-related roles. It also drops the creation of custom network for the dashboard-related roles as this functionality works in a different way with podman. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `b4d1c3693b`)	2019-05-17 16:05:58 +02:00
Boris Ranto	5a85be9502	dashboard: Set ssl_server_port if it is supported We cannot use the old fashioned config-key way, here. It was not supported when the option was introduced (post 14.2.0). Since the option is not always supported we can simply ignore the potential failure on ceph clusters that do not support it. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `e737a1f83e`)	2019-05-17 16:05:58 +02:00
Boris Ranto	fda901fff9	dashboard: Add and copy alerting rules This commit adds a list of alerting rules for ceph-dashboard from the old cephmetrics project. It also installs the configuration file so that the rules get recognized by the prometheus server. Signed-off-by: Boris Ranto <branto@redhat.com> (cherry picked from commit `8f77caa932`)	2019-05-17 16:05:58 +02:00
Boris Ranto	5ac7559736	Merge cephmetrics/dashboard-ansible repo This commit will merge dashboard-ansible installation scripts with ceph-ansible. This includes several new roles to setup ceph-dashboard and the underlying technologies like prometheus and grafana server. Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com> Co-authored-by: Zack Cerza <zcerza@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2f141a6e80`)	2019-05-17 16:05:58 +02:00
Dimitri Savineau	bd33bcef2b	container-common: allow podman for other distros Currently podman installation is very tied to RHEL 8 even if we're able to install it on Debian/Ubuntu distribution. This patch changes the way we are starting or not the (fat) container daemon. Before the condition was based on the distribution release and now on the container_service_name variable. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d2ad191eca`)	2019-05-13 10:36:22 -04:00
Bruceforce	f34c1dcd9d	ceph-nfs: fixed with_items If we do this in one line we get the error described in #3968 fixes #3968 Signed-off-by: Bruceforce <markus.greis@gmx.de> (cherry picked from commit `c3b0ee30a1`)	2019-05-13 10:36:12 -04:00
Dimitri Savineau	6a48ff8a37	Update RHCS version with Nautilus RHCS 4 will be based on Nautilus and only usable on RHEL 8. Updated the default ceph_rhcs_version to 4 and update the rhcs repositories to rhcs 4 with RHEL 8. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ba49225eab`)	2019-05-13 16:23:24 +02:00
Bruceforce	a007be17b7	ceph-nfs: fixed condition for "stable repos specific tasks" The old condition would resolve to "when": "nfs_ganesha_stable - ceph_repository == 'community'" now it is "when": [ "nfs_ganesha_stable", "ceph_repository == 'community'" ] Please backport to stable-4.0 Signed-off-by: Bruceforce <markus.greis@gmx.de> (cherry picked from commit `29f2c953b4`)	2019-05-13 11:05:40 +02:00
Kevin Coakley	e1b5b20111	Set the rgw_create_pools pools application to rgw Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the health is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu> (cherry picked from commit `381c58ca3e`)	2019-05-13 11:05:14 +02:00
Rishabh Dave	8959ed50a5	ceph-mds: group similar tasks in create_mds_filesystem.yml Group similar tasks together using block keyword. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `1a4dccdbb9`)	2019-05-10 15:54:40 +02:00
Rishabh Dave	238a2696a6	ceph-rbd-mirror: refactor tasks/main.yml Use blocks for similar tasks in main.yml. And move when keywords before block keywords. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `121b5e4184`)	2019-05-10 15:54:16 +02:00
Guillaume Abrioux	cc6127d669	facts: fix external cluster bug running an external ceph cluster deployment with (obviously) no monitors defined in inventory breaks with an undefined error because `_monitor_addresses` never get defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `936c6fca78`)	2019-05-09 08:30:33 +02:00
Rishabh Dave	9e6b2e3bc5	don't access other node's docker_exec_cmd variable Except for some corner case, it's not correct to access some other node's copy of variable docker_exec_cmd. Therefore replace "hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by "docker_exec_cmd". Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `89748d579a`)	2019-05-07 17:56:30 +02:00
Rishabh Dave	df95900913	ceph-mgr: create keys for MGRs Add code in ceph-mgr for creating a keyring for manager in so that managers can be deployed on a separate node too. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `56bfec7c58`)	2019-05-07 15:12:29 +02:00
Gaudenz Steinlin	29650e71d8	Fix check mode support Adds "check_mode: no" to commands which register cluster state in a variable and don't modify anything. These commands have to run in order to support running the playbook in check mode. Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch> (cherry picked from commit `3c8987c7a5`)	2019-05-07 13:07:45 +02:00
Rishabh Dave	06b3ab2a6b	improve coding style Keywords requiring only one item shouldn't express it by creating a list with single item. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `739a662c80`) Conflicts: roles/ceph-mon/tasks/ceph_keys.yml roles/ceph-validate/tasks/check_devices.yml	2019-05-06 15:09:06 +00:00
Dimitri Savineau	4752327340	ansible: remove private and static attribute This will be removed in ansible 2.8 and breaks the playbook execution with this release. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `ae266c6f2b`)	2019-05-02 20:21:26 -04:00
Dimitri Savineau	2eb7642ad3	ceph-mds: Increase cpu limit to 4 In containerized deployment the default mds cpu quota is too low for production environment. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1999cf3d19`)	2019-04-30 12:12:01 -04:00
Dimitri Savineau	d8688e0eb9	ceph-osd: Increase cpu limit to 4 In containerized deployment the default osd cpu quota is too low for production environment using NVMe devices. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c17106874c`)	2019-04-30 12:11:42 -04:00
Dimitri Savineau	e29a8a1f31	ceph-iscsi: start tcmu-runner for non-container Only rbd-target-api and rbd-target-gw were started/enabled for non containerized deployment. The issue doesn't happen with containerized setup. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `4ae5ce399b`)	2019-04-29 23:03:59 +00:00
Rishabh Dave	ebd2ae520d	ceph-config: remove redundant condition on a block Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-25 13:51:58 +02:00
Rishabh Dave	cad35d5c52	"when" keyword should precede "block" keyword Otherwise the reader is forced to search for "when" when blocks are too long. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `e0beaf123a`) Conflicts: roles/ceph-config/tasks/main.yml roles/ceph-container-common/tasks/pre_requisites/prerequisites.yml roles/ceph-validate/tasks/check_devices.yml	2019-04-24 16:25:43 +02:00
Kyle Bader	cd0eddc460	rgw: add cpuset support 1/ The OSD already supports cpuset to be used for containerized deployments through the use of the ceph_osd_docker_cpuset_cpus variable. This adds similar support to the RGW service for containerized deployments by setting a new variable named ceph_rgw_docker_cpuset_cpus. Like the OSD, there are times where using distinct cores has advantages over using the CFS in kernel scheduler. ceph_rgw_docker_cpuset_cpus accepts a comma delimited set of CPU ids 2/ Add support for specifying --cpuset-mem variable to restrict the cgroup's memory allocations to a particular numa node, which should typically correspond with the cpu ids of that numa node that were provided with --cpuset-cpus. To ensure the correct cpu ids are used one can run `numactl --hardware` to list the nodes and which cpu ids correspond to each. Signed-off-by: Kyle Bader <kbader@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0bee90b201`)	2019-04-23 09:09:32 +02:00
Radu Toader	6e02e5faae	Allow CephFS pool to be created with specific rule_name, erasure_profile just like rbd pools Signed-off-by: Radu Toader <radu.m.toader@gmail.com> (cherry picked from commit `b2f242660e`)	2019-04-20 06:40:08 +00:00
Dimitri Savineau	f770917517	ceph-container-common: modify requirement flow Until now it was not possible to install a specific container package because it was somehow hardcoded. This patch allows to override the container package name (docker.io vs docker-ce) and refacts the package installation. This could be achieve via the container_package_name variable. Instead of using one task per distribution we can set the package and service name in vars. This allows to have a unified package task. Also refactorize the debian_prerequisites tasks because the content was outdated. https://docs.docker.com/install/linux/docker-ce/debian/ https://docs.docker.com/install/linux/docker-ce/ubuntu/ Resolves: #3609 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `8105a1cefb`)	2019-04-19 04:07:22 +00:00
Andrew Schoen	545d93aae8	rolling_update: set num_osds to the number of running osds We do this so that the ceph-config role can most accurately report the number of osds for the generation of the ceph.conf file. We don't want to use ceph-volume to determine the number of osds because in an upgrade to nautilus ceph-volume won't be able to accurately count osds created by ceph-disk. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `67453853ff`)	2019-04-18 19:12:13 +02:00
Andrew Schoen	1e0e50fc90	ceph-osd: do not run lvm batch tasks during update When performing a rolling update do not try to create any new osds with `ceph-volume lvm batch`. This is troublesome because when upgrading to nautilus the devices list might contain devices that are currently being used by ceph-disk and have GPT headers on them, which will cause ceph-volume to fail when trying to use such a device. Any devices originally created by ceph-disk will need to be removed from the devices list before any new osds can be created. Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `5e3dfe5021`)	2019-04-18 19:12:13 +02:00
Dimitri Savineau	2d3c636fa8	ceph-mgr: Add extra module packages Since Nautilus there's mgr extra modules not present in ceph-mgr package but in dedicated packages. Resolves: #3860 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `86315272c7`)	2019-04-18 19:10:31 +02:00
Guillaume Abrioux	b4377f6163	update: refact msgr2 migration this commit refact the msgr2 protocol introduction. If it's a fresh install, let's go with v2 only. If we upgrade to nautilus, we should go with v2+v1 syntax to ensure nothing breaks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a4bc7bda51`)	2019-04-18 19:10:10 +02:00
Dimitri Savineau	84d6bb226b	ceph-iscsi-gw: Remove library directory The library directory that contains the custom ceph modules is present in the ceph-ansible root directory. All igw_* modules are already present there so we don't need the one present in roles/ceph-iscsi-gw/library. Also remove the associated spec file. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c8814d1331`)	2019-04-18 16:32:58 +02:00
Guillaume Abrioux	6b5487d1e5	mds: remove legacy task this task has nothing to do in stable-4.0 and after. Let's remove it since stable-4.0 and after aren't intended to deploy luminous. Closes: #3873 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `58f3851573`)	2019-04-18 10:15:43 -04:00
Dimitri Savineau	8edb064606	allow using ansible 2.8 Currently we only support ansible 2.7 We plan to use 2.8 when it will be release so we have to support both 2.7 and 2.8. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1700548 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `e471bce76b`)	2019-04-17 18:14:58 +02:00
Guillaume Abrioux	3787c9b7ad	defaults: refact package dependencies installation. Because `5c98e361df` could be seen as a non backward compatible change this commit reverts it and bring back package dependencies installation support. Let's just modify the default value instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `edfa4310d3`)	2019-04-16 12:06:25 -04:00
Guillaume Abrioux	5aca0996ed	defaults: remove some package dependencies These packages aren't needed anymore. They were needed for ceph-init-detect, but ceph-init-detect doesn't exist anymore. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683885 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `5c98e361df`)	2019-04-16 12:06:25 -04:00
Rishabh Dave	a3e4bf3796	check if mon daemon is installed before restarting it Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `96c180cc0e`)	2019-04-16 11:14:21 +02:00
Guillaume Abrioux	f8b69694cc	mon: check if an initial monitor keyring already exists When adding a new monitor, we must reuse the existing initial monitor keyring. Otherwise, the new monitor will issue its 'mkfs' with a new monitor keyring and it will result with a mismatch between them. The new monitor will be unable to join the quorum in the end. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `edf1ee2073`)	2019-04-16 11:14:21 +02:00
Guillaume Abrioux	22d39591a4	osd: remove legacy file this file is not used anymore, let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f899da3172`)	2019-04-12 00:45:21 +00:00
Guillaume Abrioux	692b1a8b9f	osd: remove ceph-disk scenarios files these files aren't needed anymore since we only use lvm scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4f68462009`)	2019-04-12 00:45:21 +00:00
Guillaume Abrioux	41e55a840f	osd: remove dedicated_devices variable This variable was related to ceph-disk scenarios. Since we are entirely dropping ceph-disk support as of stable-4.0, let's remove this variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f0416c8892`)	2019-04-12 00:45:21 +00:00
Guillaume Abrioux	4a663e1fc0	osd: remove variable osd_scenario As of stable-4.0, the only valid scenario is `lvm`. Thus, this makes this variable useless. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4d35e9eeed`)	2019-04-12 00:45:21 +00:00
Guillaume Abrioux	948a5e802e	osd: remove legacy file ceph_disk_cli_options_facts.yml is not used anymore, let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4d5637fd8a`)	2019-04-12 00:45:21 +00:00
Sébastien Han	89463939f2	validate: only check device when they are devices We only validate the devices that are passed if there is a list of devices to validate. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `2888c0825f`)	2019-04-12 00:45:21 +00:00
Sébastien Han	343a99c8b7	osd: default osd_scenario to lvm osd_scenario has become obsolete and defaults to lvm. With lvm there is no such thing as collocated and non-collocated. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `52df15895b`)	2019-04-12 00:45:21 +00:00
Sébastien Han	279044155f	validate: print a message for old scenarios ceph-disk is not supported anymore, so all the newly created OSDs will be configured using ceph-volume. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `9ea1e49407`)	2019-04-12 00:45:21 +00:00
Sébastien Han	11c6655f57	osd: remove ceph-disk support We don't support the preparation of OSD with ceph-disk. ceph-volume is only supported. However, the start operation of OSD is still supported. So let's say you change a config option, the handlers will be able to restart all the OSDs via their respective systemd unit files. Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e2a5aa062e`)	2019-04-12 00:45:21 +00:00
Dimitri Savineau	c9a3def3a6	ceph-mds: Set application pool to cephfs We don't need to use the cephfs variable for the application pool name because it's always cephfs. If the cephfs variable is set to something other than the default value it will break the application pool task. Resolves: #3790 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d2efb7f02b`)	2019-04-11 17:47:21 +02:00
Guillaume Abrioux	f5f8d264e2	osds: allow passing devices by path ceph-volume didn't work when the devices were passed by path. Since it now supports it, let's allow this feature in ceph-ansible. Closes: #3812 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7e0adca7a4`)	2019-04-11 02:25:15 +00:00
Dimitri Savineau	1e944b6022	rgw: change default frontend on nautilus As discussed in ceph/ceph#26599, beast is now the default frontend for rados gateway with nautilus release. Add rgw_thread_pool_size variable with 512 as default value and keep backward compatibility with num_threads option when using civetweb. Update radosgw_civetweb_num_threads to reflect rgw_thread_pool_size change. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d17b1b48b6`)	2019-04-10 14:42:33 -04:00
Guillaume Abrioux	a718ddec50	mon: remove useless delegate_to Let's use a condition to run this task only on the first mon. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `631e5d3144`)	2019-04-10 09:52:29 +00:00
Matthew Vernon	a4d75c6ea6	UCA: Uncomment UCA variables in defaults, fix consequent breakage The Ubuntu Cloud Archive-related (UCA) defaults in roles/ceph-defaults/defaults/main.yml were commented out, which means if you set `ceph_repository` to "uca", you get undefined variable errors, e.g. ``` The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: add ubuntu cloud archive repository ^ here ``` Unfortunately, uncommenting these results in some other breakage, because further roles were written that use the fact of `ceph_stable_release_uca` being defined as a proxy for "we're using UCA", so try and install packages from the bionic-updates/queens release, for example, which doesn't work. So there are a few `apt` tasks that need modifying to not use `ceph_stable_release_uca` unless `ceph_origin` is `repository` and `ceph_repository` is `uca`. Closes: #3475 Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `9dd913cf8a`)	2019-04-10 03:50:27 +00:00
Dimitri Savineau	4cc318d13c	container-common: Enable docker on boot for ubuntu docker daemon is automatically started during package installation but the service isn't enabled on boot. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `37816570c6`)	2019-04-10 00:02:35 +00:00
Rishabh Dave	c60915733a	allow adding a MDS to already deployed cluster Add a tox scenario that adds an new MDS node as a part of already deployed Ceph cluster and deploys MDS there. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `c0dfa9b61a`)	2019-04-09 16:48:59 +02:00
Dimitri Savineau	8715490223	ceph-facts: use last ipv6 address for mon/rgw When using monitor_address_block or radosgw_address_block variables to configure the mon/rgw address we're getting the first ip address from the ansible facts present in that cidr. When there's VIP on that network the first filter could return the wrong value. This seems to affect only IPv6 setup because the VIP addresses are added to the ansible facts at the beginning of the list. This is the opposite (at the end) when using IPv4. This causes the mon/rgw processes to bind on the VIP address. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680155 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `fd4b0ec7eb`)	2019-04-09 10:48:14 -04:00
François Lafont	af78673328	ceph-rgw: Fix bad paths which depend on the clustername The path of the RGW environment file (in the /var/lib/ceph/radosgw/ directory) depends on the Ceph clustername. It was not taken into account in the Ansible role `ceph-rgw`. Signed-off-by: flaf <francois.lafont.1978@gmail.com> (cherry picked from commit `4c3e77d869`)	2019-04-09 10:44:45 -04:00

2639 Commits (f344fe6f92a8f2b1efb699638bbc534c44375afb)