ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Johannes Kastl	c721cb99cb	install ceph-mds packages on SUSE/openSUSE install packages on SUSE/openSUSE distributions, using the same logic as on RedHat-based distributions Fixes #4340 Signed-off-by: Johannes Kastl <kastl@b1-systems.de>	2019-08-21 09:57:56 +02:00
Guillaume Abrioux	9329bbb3af	handler: do not validate the server certificate against the CA Otherwise rgw handler ends up with an error when using https. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-20 13:52:15 +02:00
Johannes Kastl	504017d562	remove duplicate task installing suse dependencies roles/ceph-common/tasks/installs/install_on_suse.yml: remove the task that installs the dependencies, as this is done later in install_suse_packages.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de>	2019-08-20 12:59:25 +02:00
Guillaume Abrioux	70cf2a5846	osd: remove useless condition just like `ceph_osd_pool_default_size`, a pool size might change after an initial deployment. Having this condition prevents from customizing the pool in that case. This is not needed so let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-19 16:17:22 +02:00
Guillaume Abrioux	4df92152c0	common: replace shell module there is no need to use `shell` in these tasks. Let's use `command`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	687087fd43	osd: refact 'wait for all osd to be up' task let's use `until` instead of doing test in bash using python oneliner also, use `command` instead of `shell`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	13815ad3ca	common: use discovered_interpreter_python fact in order to use the right binary name when using python cli in command or shell module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	a5e359ee80	osd: update the check for 'all osd to be up' the data structure has changed in octopus. eg: the path to `num_osds` is now `["osdmap"]["num_osds"]`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	5b9b841108	mgr: refact 'wait for all mgr to be up' task There's no need to use `shell` module here. Instead of using `| python -c`, let's use `from_json` filter. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-07 10:33:54 +02:00
Dimitri Savineau	4c6ec1dccb	mgr/dashboard: Fix grafana/prometheus url config When configuring grafana/prometheus embedded in the mgr/dashboard, we need to use the address of the grafana-server node and not the current hostname because mgr/dashboard and grafana/prometheus could be present on different hosts. We should instead rely on the grafana_server_addr variable and remove the dashboard_url. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-06 09:34:20 +02:00
Dimitri Savineau	f545b5be0d	ceph-dashboard: Add run_once on delegate tasks Because we need to execute commands from a monitor node (the first one in the mons list) we are using the delegate_to option. If there are multiple nodes running the ceph-dashboard role then the delegated task will be executed multiple times. Also remove a mgr config-key option not present for nautilus+ releases. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-06 09:34:20 +02:00
Johannes Kastl	5ee3d96fb4	only support openSUSE Leap 15.x, fail on 42.x openSUSE switched from 'openSUSE 13.x' to 'openSUSE Leap 42.x' and then to 'openSUSE Leap 15.x' to align with SLES15 development. The previous logic did not correctly allow the current release, as 15.x matched the 'less than 42.3' condition. For now only support openSUSE Leap 15.x, and extend support once 16.x is released (or whatever the exact version will be) Signed-off-by: Johannes Kastl <kastl@b1-systems.de>	2019-08-05 09:46:31 -04:00
Dimitri Savineau	771f25b1f8	ceph-infra: Apply firewall rules with container We don't have a reason to not apply firewall rules on the host when using a containerized deployment. The TripleO environments already manage the ceph firewall rules outside ceph-ansible and set the configure_firewall variable to false. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733251 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-01 15:16:49 +02:00
Dimitri Savineau	34036c667c	ceph-grafana: Set grafana uid/gid on files We don't need to create a grafana system user (in fact we don't even set the right uid to this user) because we're using a container setup. Instead we just need to be sure to set the owner/group to 472 (grafana user/group from the container) like we do for ceph/167. We don't need to set the user/group recursively on /etc/grafana directory in a dedicated task. Also on Ubuntu system, the ceph-grafana-dashboards isn't present so on non containerized deployment we won't have the /etc/grafana/dashboards/ceph-dashboard directory present (coming with the package) so we need to be sure it exists. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-01 10:10:56 +02:00
Guillaume Abrioux	c9d80af4e0	dashboard: fix timeout usage on rgw user creation command For some reason, this is making the playbook failing like following: ``` TASK [ceph-dashboard : create radosgw system user] ********************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************** task path: /home/guits/ceph-ansible/roles/ceph-dashboard/tasks/configure_dashboard.yml:106 Tuesday 30 July 2019 10:04:54 +0200 (0:00:01.910) 0:11:22.319 ******** FAILED - RETRYING: create radosgw system user (3 retries left). FAILED - RETRYING: create radosgw system user (2 retries left). FAILED - RETRYING: create radosgw system user (1 retries left). fatal: [mgr0 -> mon0]: FAILED! => changed=true attempts: 3 cmd: timeout 20 podman exec ceph-mon-mon0 radosgw-admin user create --uid=ceph-dashboard --display-name='Ceph dashboard' --system delta: '0:00:20.021973' end: '2019-07-30 08:06:32.656066' msg: non-zero return code rc: 124 start: '2019-07-30 08:06:12.634093' stderr: 'exec failed: container_linux.go:336: starting container process caused "process_linux.go:82: copying bootstrap data to pipe caused \"write init-p: broken pipe\""' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` using `timeout -f -s KILL` fixes this issue. Also, there is no need to use `shell` module here, let's switch to `command`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-30 13:52:44 +02:00
Guillaume Abrioux	2d955757ee	osd: add 'osd blacklist' cap for osp keyrings This commit adds the `osd blacklist` cap on all OSP clients keyrings. Fixes: #2296 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 09:57:25 -04:00
Dimitri Savineau	d549fffdd2	ceph-osd: check container engine rc for pools When creating OpenStack pools, we only check if the return code from the pool list command isn't 0 (ie: if it doesn't exist). In that case, the return code will be 2. That's why the next condition is rc != 0 for the pool creation. But in containerized deployment, the return code could be different if there's a failure on the container engine command (like container not running). In that case, the return code could be either 1 (docker) or 125 (podman) so we should fail at this point and not in the next tasks. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-29 15:55:04 +02:00
Guillaume Abrioux	02beb00916	validate: add checks for grafana-server group definition this commit adds two checks: - check that the `[grafana-server]` group is defined - check that the `[grafana-server]` contains at least one node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	ec33ee7574	mgr: fix a typo this task isn't using the right container_exec_cmd, that's delegating to the wrong node. Let's use the right fact to fix this command. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	b9cdf341be	dashboard: remove cfg80211 module installation According to this comment [1], this seems to be needed to detect wifi devices. In node exporter we can see this: ``` --collector.wifi Enable the wifi collector (default: disabled). ``` since it's disabled by default and we don't even change this in our systemd templates for node-exporter, we can easily assume in the end it's not needed. Therefore, let's remove this. [1] `dbf81b6b5b (diff-961545214e21efed3b84a9e178927a08L21-L23)` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	d67230b2a2	dashboard: use dedicated group only There's no need to add complexity and trying to fallback on other group. Let's deploy dashboard on all nodes present in grafana-server group. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	fb1b5b3251	dashboard: enable dashboard by default This commit enables dashboard deployment by default. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Dimitri Savineau	07c6695d16	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-26 16:09:23 -04:00
Guillaume Abrioux	19950b5170	container: rename docker directories Those 2 directories should be renamed to be more generic (docker vs. podman). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-24 16:31:46 +02:00
fmount	fac1b030cb	Avoid to setup provisioners in a fully containerized environment This commit adds a when clause to avoid the setup of grafana provisioners in a fully containerized scenario. This is needed when the ceph-grafana-dashboards package is not installed and this task could result in a wrong grafana configuration that makes the container crash. Signed-off-by: fmount <fpantano@redhat.com>	2019-07-23 09:06:50 +02:00
Giulio Fidente	edd1420217	Fix backward compat with old cephfs_pools format Previously cephfs_pools items used to have a pgs: key but not pgp_num: nor pg_num: Signed-off-by: Giulio Fidente <gfidente@redhat.com>	2019-07-19 11:56:58 -04:00
Guillaume Abrioux	618dbf271d	handler: fix bug in osd handlers `fbf4ed42ae` introduced a bug when container binary is podman. podman doesn't support ps -f using regular expression, the container id is never set in the restart script causing the handler to fail. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721536 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-18 16:22:51 +02:00
Guillaume Abrioux	487d701685	validate: fail if gpt header found on unprepared devices ceph-volume will complain if gpt headers are found on devices. This commit checks whether a gpt header is present on devices passed in `devices` variable and fails early. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730541 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-18 07:43:55 +02:00
Dimitri Savineau	5383c2f7f3	ceph-dashboard: enable rgw options conditionally The dashboard rgw frontend options only need to be applied when there are some nodes present in the rgw ansible group. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-18 07:22:13 +02:00
Dimitri Savineau	8ab9b719fa	dashboard: use variables for port value The current port value for alertmanager, grafana, node-exporter and prometheus is hardcoded in the roles so it's not possible to change the port binding of those services. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-18 07:22:13 +02:00
Dimitri Savineau	0ae0193144	ceph-infra: update handler with daemon variable Both the ntp and chrony daemons use a variable for the service name because it could be different depending on the GNU/Linux distribution. This was updated in `9d88d3199` for chrony but only for the start part, not for the handler. This commit fixes this for both ntp and chrony. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-12 09:14:33 -04:00
Dimitri Savineau	41b44dde85	ceph-infra: Open prometheus port The Prometheus port 9090 isn't open in the firewall configuration. Also the dashboard task on the grafana node was not required because it's already present on the mgr node. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-11 13:40:22 +02:00
Guillaume Abrioux	ee29f7370a	handler: remove legacy condition since everything is already in a block with the same condition, it's not needed to leave all of them on these tasks. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-10 09:42:00 -04:00
Guillaume Abrioux	e6dc3ebd8c	validate: improve message printed in check_devices.yml The message prints the whole content of the registered variable in the playbook, this is not needed and makes the message pretty unclear and unreadable. ``` "msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!" ``` Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-10 09:32:11 -04:00
Dimitri Savineau	1f2a4f1910	ceph-iscsi: Update gateway config/template - Remove gateway_keyring from the configuration file because it's not used in ceph-iscsi 3.x release. - Use config_template instead of template module for iscsi-gateway configuration file, because the file is an ini file and we might want to override more parameters than those present in ceph-ansible. - Because we can now set the pool name in the configuration, we should use a variable for that. This is refactored with the iscsi_pool_* variables also used to configure the pool size. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-10 09:44:40 +02:00
Dimitri Savineau	5413274412	ceph-dashboard: remove bool filter for rgw vars Some dashboard_rgw_api_* variables are using the bool filter but those variables are strings with an empty string as default value. So we should test the variable against an empty string instead of a bool. dashboard_rgw_api_host: '' dashboard_rgw_api_port: '' dashboard_rgw_api_scheme: '' dashboard_rgw_api_admin_resource: '' Resolves: #4179 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-10 09:42:37 +02:00
Boris Ranto	21758fcee8	dashboard: Use upstream default port We are currently using incorrect dashboard default port. The upstream uses 8443 instead of 8234 by default. This should get us closer to the upstream project. Signed-off-by: Boris Ranto <branto@redhat.com>	2019-07-10 09:17:36 +02:00
Dimitri Savineau	de7f948b75	ceph-handler: fix cluster name in socket path `c90f605b5` introduces the default ceph cluster name value in the rgw socket path for the rgw restart script. But this should use the `cluster` variable instead. This commit also fixes this in the osd restart script. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-08 13:55:35 -04:00
fmount	95bd002b35	Add package-install tag on ceph-grafana-dashboard pkg install. According to the OSP pattern, we need the package-install tag to control what is installed on the host. This commit just adds the missing tag to meet the TripleO requirements. See: /issues/4197 for details Fixes: #4197 Signed-off-by: fmount <fpantano@redhat.com>	2019-07-08 10:54:30 +02:00
Dimitri Savineau	91bef94b6c	ceph-iscsi-gw: Update log directories bind mount On containerized deployment we need to bind mount the ceph-iscsi directory to avoid writing the logs in the container. The /var/log/ceph directory isn't used by the rbd-target-api/gw services because they have their own log directories. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-07 07:25:33 +02:00
ilyashestopalov	904532c5e2	ceph-mon: Fix cluster name parameter The ability to add nodes with the monitor role to an existing cluster whose name differs from the default name is fixed. Signed-off-by: ilyashestopalov <usr.tester@yandex.ru>	2019-07-07 07:21:29 +02:00
Guillaume Abrioux	a781ce881c	iscsi: refact deprecated variables This commit moves some old variables into ceph-defaults so we can move the `use_new_ceph_iscsi` fact in ceph-facts role in order. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	08a6d10c32	igw: Add check for missing iqn If the user is still using the older packages and does not setup the target iqn you will just get a vague error message later on. This adds a check during the validate task, so it is clear to the user. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	75fee55d19	igw: Update iscsigws.yml.sample for ceph-iscsi support Update iscsigws.yml.sample to document that we cannot use ansible to setup iSCSI objects and use the new ceph-iscsi package. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	cbe66cec52	igw: Support ceph-iscsi package for install This adds support for the ceph-iscsi package during install. ceph-iscsi does not support setting up targets/gws, luns and clients with the current library/igw_* code. Going forward those tasks should be done with gwcli or dashboard. ceph-iscsi will only be used if the user has no iscsi objects setup so we do not break existing setups. The next patch will update the iscsigws.yml.sample to document that users must not setup any iscsi object if they want to use the new package and tools. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	b7b2213be1	igw: drop gateway_ip_list for container setups The gateway_ip_list is not used in container setups, so drop it for that case. Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Mike Christie	d89d3e7cd6	igw: move gateway_ip_list check to validate role Signed-off-by: Mike Christie <mchristi@redhat.com>	2019-07-03 22:13:19 +02:00
Dimitri Savineau	c90f605b51	ceph-handler: Fix rgw socket in restart script Since Mimic the radosgw socket has two extra fields in the socket name (before the .asok suffix): <pid>.<ctid> Before: /var/run/ceph/ceph-client.rgw.cephaio-1.asok After: /var/run/ceph/ceph-client.rgw.cephaio-1.16913.23928832.asok The radosgw restart script doesn't handle this and could fail during an upgrade. If the SOCKETS variable isn't defined in the script then the test command won't fail because the return code is 0 $ test -S $ echo $? 0 There are multiple issues in that script: - The default SOCKETS value isn't defined due to a typo SOCKET vs SOCKETS. - Because the socket name uses the pid then we need to check the socket name after the service restart. - After restarting the radosgw service we need to wait a few seconds otherwise the socket won't be created. - Update the wget parameters because the command is doing a loop. We now use the same option as curl. - The check_rest function doesn't test the radosgw at all due to a wrong test command (test against a string) and always returns 0. This needs to use the DOCKER_EXECS variable in order to execute the command. $ test 'wget http://192.168.100.11:8080' $ echo $? 0 Also remove the test based on the ansible_fqdn because we only use the ansible_hostname + rgw instance name. Finally group all for loops into a single one. Resolves: #3926 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-03 09:30:33 +02:00
Giulio Fidente	d526803c6c	Add radosgw_frontend_ssl_certificate parameter This is necessary when configuring RGW with SSL because in addition to passing specific frontend options, civetweb appends the 's' character to the binding port and beast uses ssl_endpoint instead of endpoint. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1722071 Signed-off-by: Giulio Fidente <gfidente@redhat.com>	2019-07-02 14:14:37 -04:00
Guillaume Abrioux	b725b3077e	nfs: clean template remove legacy options ``` ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:13): Unknown parameter (Dir_Max) ganesha.nfsd-115[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:14): Unknown parameter (Cache_FDs) ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-28 15:09:19 -04:00
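
The return-code handling described in the d549fffdd2 entry above can be sketched roughly as follows. This is only an illustration, not code from ceph-ansible: the function name `classify_pool_list_rc` and the returned labels are hypothetical, and treating every code other than 0 and 2 as a failure is an assumption that generalizes the two container engine codes the message mentions (1 for docker, 125 for podman).

```python
def classify_pool_list_rc(rc: int) -> str:
    """Map the return code of the pool list command to an action.

    Hypothetical helper: the labels are illustrative only.
    """
    if rc == 0:
        # The pool already exists, nothing to create.
        return "exists"
    if rc == 2:
        # Per the commit message, 2 means the pool does not exist yet.
        return "create"
    # Assumption: any other code (e.g. 1 for docker, 125 for podman)
    # signals a container engine failure, so stop here rather than later.
    return "fail"


if __name__ == "__main__":
    for code in (0, 2, 1, 125):
        print(code, classify_pool_list_rc(code))
```

Separating "pool missing" from "engine failed" is what lets the failure surface at the check itself instead of in a later pool creation task.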


2353 Commits (c721cb99cbcecdc7e234568096e49874358eff49)