Commit Graph

2149 Commits (guits-3.2.53-cephdisk-osds)

Author SHA1 Message Date
Harald Jensås e8ed6655f3 Support comma-delimited subnets in firewall
ceph.conf supports a comma-separated list of
subnet CIDRs for the public_network and the
cluster network. ceph-ansible should support
setting up the firewall for this configuration.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767392
Closes: #4425
Related: #4333
https://docs.ceph.com/docs/nautilus/rados/configuration/network-config-ref/#network-config-settings

Signed-off-by: Harald Jensås <hjensas@redhat.com>
(cherry picked from commit d94229204d)
2019-11-01 11:00:18 -04:00
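A minimal sketch of the idea, splitting `public_network` on commas and opening the monitor service per subnet; the task shown is illustrative (the zone variable and firewalld service name are assumptions), not the exact change:

```
- name: open ceph-mon access for each public subnet (illustrative)
  firewalld:
    source: "{{ item }}"
    service: ceph-mon
    zone: "{{ ceph_mon_firewall_zone | default('public') }}"
    permanent: true
    immediate: true
    state: enabled
  with_items: "{{ public_network.split(',') }}"
```

With `immediate: true` the rule is applied to the running firewalld as well as to the permanent configuration, which is also why the next commit can drop the restart handler.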
Dimitri Savineau dd4a4cbb66 ceph-infra: Remove restart firewalld handler
There's no need to restart the firewalld service when a new rule is
added, because we use the immediate flag.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b7338d438a)
2019-11-01 11:00:18 -04:00
Dimitri Savineau 4cd53bfbe5 ceph-osd: Remove ulimit nofile on container start
Even if this improves ceph-disk/ceph-volume performance, it also
impacts the ceph-osd process.
The ceph-osd process shouldn't use the 1024:4096 value for the max open
files.
Remove the ulimit option from the container engine and do this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9a996aef7f)
2019-10-31 14:42:41 -04:00
Dimitri Savineau f3fc97caa0 openstack_config: fix docker exec command
container_exec_cmd should be replaced with docker_exec_cmd.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765110

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-24 14:13:52 -04:00
Guillaume Abrioux 1884506189 update: follow new recommendation to upgrade mds cluster
Refactor the mds cluster upgrade code in order to follow the documented
recommendation.
See: https://github.com/ceph/ceph/blob/luminous/doc/cephfs/upgrading.rst

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 71cebf80a6)
2019-10-21 15:44:38 -04:00
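The core of the linked recommendation is to shrink the cluster to a single active MDS before upgrading; a hedged sketch of that step, where `cephfs` and `docker_exec_cmd` stand in for the role's variables and the JSON key layout of `ceph fs get` is an assumption:

```
- name: set max_mds to 1 before upgrading the mds cluster (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} fs set {{ cephfs }} max_mds 1"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true

- name: wait until a single active mds remains (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} fs get {{ cephfs }} --format json"
  register: fs_info
  until: (fs_info.stdout | from_json).mdsmap['in'] | length == 1
  retries: 30
  delay: 5
  changed_when: false
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
```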
Guillaume Abrioux 8dc40711bb common: do not override ceph_release when using custom repo
Otherwise it fails like the following:

```
TASK [ceph-mds : allow multimds] **************************************************************************************************************************************************
Monday 22 July 2019  16:37:38 +0800 (0:00:03.269)       0:13:25.651 ***********
fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n  ^ here\n"}
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4e9504c939)
2019-10-17 20:10:57 -04:00
Dimitri Savineau c8d0c4722c rbd-mirror: fail if the peer is not added
Due to the 'failed_when: false' statement present in the peer task,
the playbook continues to run even if the peer task fails (like an
incorrect remote peer format).

"stderr": "rbd: invalid spec 'admin@cluster1'"

This patch adds a task to list the existing peers and adds the peer only
if it's not already present. With this we don't need the failed_when
statement anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0b1e9c0737)
2019-10-16 14:01:18 -04:00
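A sketch of the idempotent pattern described above, assuming `rbd mirror pool info --format json` reports the configured peers; the variable names are modeled on the role's defaults and are assumptions:

```
- name: list the current mirror peers for the pool (illustrative)
  command: "rbd --cluster {{ cluster }} mirror pool info {{ ceph_rbd_mirror_pool }} --format json"
  register: mirror_pool_info
  changed_when: false

- name: add the remote peer only when it's not configured yet (illustrative)
  command: >
    rbd --cluster {{ cluster }} mirror pool peer add
    {{ ceph_rbd_mirror_pool }}
    {{ ceph_rbd_mirror_remote_user }}@{{ ceph_rbd_mirror_remote_cluster }}
  when: ceph_rbd_mirror_remote_cluster not in
        (mirror_pool_info.stdout | from_json).peers | map(attribute='cluster_name') | list
```

Since the add task only runs when the peer is missing, a malformed peer spec now surfaces as a real failure instead of being swallowed by `failed_when: false`.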
Dimitri Savineau 1eea339f87 Remove validate action and notario dependency
The current ceph-validate role is using both validate action and fail
module tasks to validate the ceph configuration.
The validate action is based on the notario python library. When one of
the notario validations fails, a python stack trace is reported to the
ansible task. This output isn't understandable by users.

This patch removes the validate action and the notario dependency. The
validation is now done using only the fail ansible module.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-15 18:05:16 +02:00
Guillaume Abrioux 70ac841153 validate: prevent installing OSD on same disk as the OS
This commit adds a validation task to prevent installing an OSD on
the same disk as the OS.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 80e2d00b16)
2019-10-11 09:44:20 -04:00
Dimitri Savineau 2e44b6af74 ceph-config: remove container_binary variable
9e7972a introduced a regression via the container_binary variable
which is undefined.
The CEPH_CONTAINER_BINARY environment variable isn't used at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-08 00:44:13 +02:00
Dimitri Savineau 077b61a008 ceph-mgr: fix ceph_key module with container
556052b changed the way the mgr keyrings are created, but the ceph_key
module needs the containerized parameter when the deployment is using
containers.
This module doesn't support the CEPH_CONTAINER_[BINARY|IMAGE] environment
variables.

Closes: #4547

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-07 16:05:43 -04:00
Guillaume Abrioux b1fa3c881c nfs: stop nfs server service in all context
This commit moves this task in order to stop the nfs server service
regardless of the desired deployment type (containerized or
non-containerized).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6c6a512a72)
2019-10-07 18:18:21 +02:00
Guillaume Abrioux 003017d568 nfs: stop nfs server service
The syntax here wasn't working; this refactor fixes the task.
Also, remove the `ignore_errors: true` which was hiding the failure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47034effe0)
2019-10-07 18:18:21 +02:00
Rishabh Dave 556052b235 ceph-mgr: create keys for MGRs
Add code in ceph-mgr for creating a keyring for the manager so that
managers can be deployed on a separate node too.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1552210

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 56bfec7c58)
2019-10-04 13:15:26 +02:00
Dimitri Savineau 070db68ffd ceph-handler: don't restart all OSDs with limit
When using the ansible --limit option on one or a few OSD nodes, if the
handler is triggered then we will restart the OSD service on all OSD
nodes instead of only the hosts selected by the limit value.
Even if the play is limited by the --limit value, we iterate over all
OSD nodes from the OSD group.

  with_items: '{{ groups[osd_group_name] }}'

Instead we should iterate only over the nodes present in both the OSD
group and the limit list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0346871fb5)
2019-10-04 07:43:17 +02:00
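A one-line sketch of the corrected loop; `ansible_play_batch` holds the hosts actually targeted by the play once --limit is applied:

```
  with_items: "{{ groups[osd_group_name] | intersect(ansible_play_batch) }}"
```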
Guillaume Abrioux 8a1bda6d91 osd: refact 'wait for all osd to be up' task
Let's use `until` instead of doing the test in bash with a python one-liner.
Also, use `command` instead of `shell`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c76cd5ad84)
2019-10-04 04:25:20 +02:00
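A hedged sketch of the resulting task shape; the JSON keys follow `ceph osd stat --format json` on luminous and later, and the retry counts are illustrative:

```
- name: wait for all osd to be up (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd stat --format json"
  register: osd_stat
  until: >
    (osd_stat.stdout | from_json).num_osds | int > 0 and
    (osd_stat.stdout | from_json).num_osds == (osd_stat.stdout | from_json).num_up_osds
  retries: 60
  delay: 10
  changed_when: false
```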
Guillaume Abrioux 86c224e71d validate: fix gpt header check
Check for gpt header when osd scenario is lvm or lvm batch.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731310

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 09:59:31 -04:00
Andrew Schoen 1821efb3a2 ceph-config: do not always assume containers when calculating num_osds
CEPH_CONTAINER_IMAGE should be None if containerized_deployment
is False.

Resolves: #4498

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 70a4368bc5)
2019-09-30 13:38:51 -04:00
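The fix boils down to only exporting the image when the deployment is containerized; a sketch of the environment block on the task that computes num_osds (the exact task layout is an assumption):

```
  environment:
    CEPH_VOLUME_DEBUG: 1
    CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag
                              if containerized_deployment | bool else None }}"
```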
Guillaume Abrioux 749d404e87 mon: use ceph_key module for containerized mgr keyring creation
This commit replaces a `command` task with `ceph_key` in order to create
mgr keyrings.

This allows us to use `mode` parameter to set the right mode on
generated keys.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1734513

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-25 11:30:41 -04:00
Dimitri Savineau 211dd2fcf6 ceph-osd: handle loop devices with containers
Since we changed the way we run the OSD containers, using the ID instead
of the device name, we lost the ability to use loop devices.
Loop devices are like nvme or cciss devices in that the partitions are
referenced with an extra 'p' before the partition number.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1749097

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-25 16:11:29 +02:00
Guillaume Abrioux 9e7972a116 config: support num_osds fact setting in containerized deployment
This part of the code must also be supported in containerized deployments.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664112

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe1528adb4)
2019-09-25 13:37:57 +02:00
Dimitri Savineau 28009496f6 ceph-handler: Fix osd restart condition
In containerized deployments, the restart OSD handler couldn't be
triggered in most ansible executions.
This is due to the usage of run_once combined with a condition on the
inventory hostname and the last filter.
The run_once is evaluated first, so ansible will pick a node in the
osd group to execute the restart task. But if this node isn't the
last one in the osd group, the task is ignored. The task is more
likely to be ignored than executed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5b1c15653f)
2019-09-10 16:53:38 -04:00
Dimitri Savineau 7347f32231 rbd-mirror: Allow to copy the admin keyring
The ceph-rbd-mirror role allows copying the admin keyring via the
copy_admin_key variable, but there's actually no task in that role
doing the job.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1f505628dd)
2019-09-10 16:38:48 -04:00
Dimitri Savineau 54926a825e rbd-mirror: Use the rbd mirror client keyring
The admin keyring isn't present by default on the rbd mirror nodes, so
the rbd commands related to the mirroring configuration will fail.
Instead we can use the rbd mirror client keyring.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a3d36df025)
2019-09-10 16:38:48 -04:00
Giulio Fidente e0e9fa47df Look for additional names when checking ceph-nfs container status
Ganesha cannot be operated active/active; in those deployments
where it is managed by pacemaker, the container name can be
different from the default.

This change uses "ceph_nfs_service_suffix" where previously
missing to ensure tasks will work with customized names.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit d2a2bd7c42)
2019-09-09 16:48:59 -04:00
Dimitri Savineau 27217af369 rbd-mirror: configure pool and peer
The rbd mirror configuration was only available for non-containerized
deployments and was also incomplete.
We now enable the mirroring on the pool and add the remote peer in both
scenarios.

The default mirroring mode is set to 'pool' but can be configured via
the ceph_rbd_mirror_mode variable.

This commit also fixes an issue on the rbd mirror command if the ceph
cluster name isn't using the default value (ceph) due to a missing
--cluster parameter to the command.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7e5e21741e)
2019-09-09 12:13:24 -04:00
Dimitri Savineau 1f06875531 ceph-infra: Apply firewall rules with container
We don't have a reason to not apply firewall rules on the host when
using a containerized deployment.
The TripleO environments already manage the ceph firewall rules outside
ceph-ansible and set the configure_firewall variable to false.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733251

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 771f25b1f8)
2019-08-30 09:01:16 -04:00
Dimitri Savineau 1084d1c1b5 ceph-client: Use profile rbd in keyring caps
Like the OpenStack keyrings, we can use the profile rbd for the clients
keyring (both mon and osd).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 49aa05b96c)
2019-08-28 09:42:33 -04:00
Dimitri Savineau 0be4c5116d Revert "osd: add 'osd blacklist' cap for osp keyrings"
This reverts commit 2d955757ee.

The "osd blacklist" isn't an osd caps but should be used with mon caps.
Also the correct caps for this is: 'allow command "osd blacklist"'.
The current change is breaking the openstack and clients keyrings.
By using the profile rbd (which is already used) we already rely on the
ability to blacklist dead client.

Resolves: #4385

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 717af83475)
2019-08-28 09:42:33 -04:00
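For reference, the shape of an OpenStack client keyring that relies on the rbd profiles instead of the reverted cap; illustrative, modeled on the role's `openstack_keys` defaults:

```
openstack_keys:
  - name: client.glance
    caps:
      mon: "profile rbd"
      osd: "profile rbd pool=images"
    mode: "0600"
```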
Dimitri Savineau 7d2b29d0eb ceph-osd: Add ulimit nofile on container start
On containerized deployments, the OSD entrypoint runs some ceph-volume
commands (lvm/simple scan and/or activate) which perform badly without
the ulimit option.
This option was added for all previous ceph-volume commands but not on
the ceph-osd container startup.
Also update the hard limit value to 4096 to reflect the default
baremetal value.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1744390

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9a4ac46d19)
2019-08-27 20:52:58 +02:00
Guillaume Abrioux c32d690a4c mgr: add a check task for all mgr to be up
This can't be backported from master since there have been too many
modifications in the meantime.

When the mgrs aren't all ready, sometimes the following error can show up:

```
stderr: 'Error ENOENT: all mgr daemons do not support module ''status'', pass --force to force enablement'
```

This commit adds a check so all mgrs are available before we try to
enable modules.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-22 17:11:19 +02:00
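A sketch of such a check, polling `ceph mgr dump` until every expected mgr is either active or standby; the retry counts are illustrative:

```
- name: wait for all mgr to be up (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} mgr dump --format json"
  register: mgr_dump
  until: >
    (mgr_dump.stdout | from_json).available | bool and
    (mgr_dump.stdout | from_json).standbys | length >= groups[mgr_group_name] | length - 1
  retries: 30
  delay: 5
  changed_when: false
```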
Guillaume Abrioux 12e61d190e validate: fail if gpt header found on unprepared devices
ceph-volume will complain if gpt headers are found on devices.
This commit checks whether a gpt header is present on devices passed in
the `devices` variable and fails early.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730541

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 487d701685)
2019-08-22 16:59:34 +02:00
Guillaume Abrioux 81906344ee osd: copy systemd-device-to-id.sh on all osd nodes before running it
Otherwise it will fail when running the rolling_update.yml playbook
because of the `serial: 1` usage.
The task which copies the script is run only against the current node
being played, whereas the task which runs the script is run against all
nodes in a loop; it ends up with the typical error:

```
2019-08-08 17:47:05,115 p=14905 u=ubuntu |  failed: [magna023 -> magna030] (item=magna030) => {
    "changed": true,
    "cmd": [
        "/usr/bin/env",
        "bash",
        "/tmp/systemd-device-to-id.sh"
    ],
    "delta": "0:00:00.004339",
    "end": "2019-08-08 17:46:59.059670",
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/bin/env bash /tmp/systemd-device-to-id.sh",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "item": "magna030",
    "msg": "non-zero return code",
    "rc": 127,
    "start": "2019-08-08 17:46:59.055331",
    "stderr": "bash: /tmp/systemd-device-to-id.sh: No such file or directory",
    "stderr_lines": [
        "bash: /tmp/systemd-device-to-id.sh: No such file or directory"
    ],
    "stdout": "",
    "stdout_lines": []
}
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1739209

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-12 21:57:29 +02:00
Guillaume Abrioux a4f4dd7535 osd: add 'osd blacklist' cap for osp keyrings
This commit adds the `osd blacklist` cap to all OSP client keyrings.

Fixes: #2296

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2d955757ee)
2019-08-07 10:43:04 +02:00
Dimitri Savineau d12e6e626d rgw: add beast frontend
Allow configuring the rgw beast frontend in addition to civetweb
(the default value).
Add the rgw_thread_pool_size variable with 512 as the default value and
keep backward compatibility with the num_threads option when using
civetweb.
Update radosgw_civetweb_num_threads to reflect the rgw_thread_pool_size
change.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733406

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d17b1b48b6)
2019-08-01 10:10:09 +02:00
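Illustrative group_vars for the new frontend; `rgw_thread_pool_size` comes from the commit text, while the frontend selector name is an assumption:

```
radosgw_frontend_type: beast   # civetweb remains the default
rgw_thread_pool_size: 512
```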
Dimitri Savineau 4dffcfb429 ceph-osd: check container engine rc for pools
When creating OpenStack pools, we only check if the return code from
the pool list command isn't 0 (i.e. if the pool doesn't exist). In that
case, the return code will be 2. That's why the next condition is
rc != 0 for the pool creation.
But in containerized deployments, the return code could be different if
there's a failure on the container engine command (like a container not
running). In that case, the return code could be either 1 (docker) or
125 (podman), so we should fail at this point and not in the next tasks.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d549fffdd2)
2019-07-31 14:08:22 -04:00
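A hedged sketch of the check; rc 2 (ENOENT) just means the pool is missing, anything else non-zero means the engine itself failed:

```
- name: check if the openstack pool already exists (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool get {{ item.name }} size"
  register: openstack_pool_exists
  with_items: "{{ openstack_pools | unique }}"
  changed_when: false
  # 1 (docker) or 125 (podman) would mean the container engine failed
  failed_when: openstack_pool_exists.rc not in [0, 2]
```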
Dimitri Savineau 5463d730ee Remove NBSP characters
Some NBSP characters are still present in the yaml files.
Add a test in Travis CI.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 07c6695d16)
2019-07-26 16:23:38 -04:00
Dimitri Savineau bedc0ab69d ceph-osd: use OSD id with systemd ceph-disk
When using containerized deployment we have to create the systemd
service unit based on a template.
The current implementation with ceph-disk uses the device name
as the parameter to the systemd service and for the container name too.

$ systemctl start ceph-osd@sdb
$ docker ps --filter 'name=ceph-osd-*'
CONTAINER ID IMAGE                        NAMES
065530d0a27f ceph/daemon:latest-luminous  ceph-osd-strg0-sdb

This is the only scenario (compared to non-containerized or
ceph-volume based deployments) that isn't using the OSD id.

$ systemctl start ceph-osd@0
$ docker ps --filter 'name=ceph-osd-*'
CONTAINER ID IMAGE                        NAMES
d34552ec157e ceph/daemon:latest-luminous  ceph-osd-0

Also, if the device mapping doesn't persist across a system reboot (i.e.
sdb might be remapped to sde), the OSD service won't come back after
the reboot.

This patch allows using the OSD id with the ceph-osd systemd service,
but requires activating the OSD manually with ceph-disk first in
order to assign the ID to that OSD.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1670734

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-26 16:07:22 -04:00
Dimitri Savineau df46d10c27 ceph-infra: update handler with daemon variable
Both the ntp and chrony daemons use a variable for the service name
because it can differ depending on the GNU/Linux distribution.
This was updated in 9d88d3199 for chrony, but only for the start part,
not for the handler.
This commit fixes this for both ntp and chrony.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0ae0193144)
2019-07-12 10:50:04 -04:00
Ramana Raja 9097f9847c Install nfs-ganesha stable v2.7
nfs-ganesha v2.5 and v2.6 have hit EOL. Install the stable nfs-ganesha
v2.7, which is currently being maintained.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit dfff89ce67)
2019-07-10 22:09:14 +02:00
Guillaume Abrioux 1716eea5e3 validate: improve message printed in check_devices.yml
The message prints the whole content of the registered variable in the
playbook; this is not needed and makes the message pretty unclear and
unreadable.

```
"msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!"
```

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023

(cherry picked from commit e6dc3ebd8c)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-10 09:37:12 -04:00
Dimitri Savineau 94cdef2757 ceph-handler: Fix rgw socket in restart script
If the SOCKET variable isn't defined in the script then the test
command won't fail because the return code is 0

$ test -S
$ echo $?
0

There are multiple issues in that script:
  - The default SOCKET value isn't defined.
  - Update the wget parameters because the command is doing a loop.
We now use the same options as curl.
  - The check_rest function doesn't test the radosgw at all due to
a wrong test command (test against a string) and always returns 0.
This needs to use the DOCKER_EXEC variable in order to execute the
command.

$ test 'wget http://192.168.100.11:8080'
$ echo $?
0

Resolves: #3926

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c90f605b51)
2019-07-08 10:38:35 -04:00
Dimitri Savineau 9cc5d1e903 ceph-handler: Fix radosgw_address default value
The rgw restart script sets the RGW_IP variable depending on the
following ansible variables:
  - radosgw_address
  - radosgw_address_block
  - radosgw_interface

Those variables have default values defined in ceph-defaults role:

radosgw_interface: interface
radosgw_address: 0.0.0.0
radosgw_address_block: subnet

But in the rgw restart script we always use the radosgw_address value
instead of the radosgw_interface when defined, because we aren't testing
the right default value.
As a consequence, the RGW_IP variable will be set to 0.0.0.0 even if
the ip address associated with the radosgw_interface variable is set
correctly. This causes the check_rest function to fail.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-07 07:24:38 +02:00
Dimitri Savineau 14f2d616ee ceph-nfs: use template module for configuration
789cef7 introduced a regression in the ganesha configuration file
generation. The new config_template module version broke it.
But the ganesha.conf file isn't an ini file and doesn't really
need to use the config_template module. Instead we can use the
classic template module.

Resolves: #4045

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 616c484698)
2019-06-24 20:47:25 +02:00
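A sketch of the replacement task, assuming the usual template and handler names of the role:

```
- name: generate ganesha configuration file (illustrative)
  template:
    src: ganesha.conf.j2
    dest: /etc/ganesha/ganesha.conf
    owner: root
    group: root
    mode: "0644"
  notify: restart ceph nfss
```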
Dimitri Savineau d08af0a654 ceph-disk: Set max open files limit on container
Same behaviour as ceph-volume (b987534). The ceph-disk command runs
faster when using ulimit nofile with the container cli.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-24 10:06:11 +02:00
Dimitri Savineau 2b492e3de1 ceph-handler: Fix OSD restart script
There are two big issues with the current OSD restart script.

1/ We try to test if the ceph osd daemon socket exists, but we use a
wildcard for the socket name: /var/run/ceph/*.asok.
This fails because we usually have multiple ceph osd sockets (or
other collocated ceph daemons) present in the /var/run/ceph directory.
Currently the test fails with:

bash: line xxx: [: too many arguments

But it doesn't stop the script execution.
Instead we can specify the full ceph osd socket name because we
already know the OSD id.

2/ The container filter pattern is wrong and could match multiple
containers, resulting in the script failing.
We use the filter with two different patterns. One is with the device
name (sda, sdb, ..) and the other one is with the OSD id (ceph-osd-0,
ceph-osd-15, ..).
In both cases we could match more than needed.

$ docker container ls
CONTAINER ID IMAGE              NAMES
958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda
589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb
46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa
877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab
$ docker container ls -q -f "name=sda"
958121a7cc7d
46c7240d71f3
877985ec3aca

$ docker container ls
CONTAINER ID IMAGE              NAMES
2db399b3ee85 ceph-daemon:latest ceph-osd-5
099dc13f08f1 ceph-daemon:latest ceph-osd-13
5d0c2fe8f121 ceph-daemon:latest ceph-osd-17
d6c7b89db1d1 ceph-daemon:latest ceph-osd-1
$ docker container ls -q -f "name=ceph-osd-1"
099dc13f08f1
5d0c2fe8f121
d6c7b89db1d1

Adding an extra '$' character at the end of the pattern solves the
problem.

Finally removing the get_container_osd_id function because it's not
used in the script at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 45d46541cb)
2019-06-21 14:49:55 -04:00
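Expressed as a task, with the trailing `$` anchoring the filter so ceph-osd-1 no longer matches ceph-osd-13; `osd_id` is a hypothetical variable used only for illustration:

```
- name: get the container id for this osd only (illustrative)
  command: docker container ls -q -f "name=ceph-osd-{{ osd_id }}$"
  register: osd_container_id
  changed_when: false
```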
Dimitri Savineau f4212b20e5 ceph-volume: Set max open files limit on container
The ceph-volume lvm list command takes ages to complete when there are
a lot of LV devices on a containerized deployment.
For instance, with 25 OSDs on a node it takes 3 mins 44s to list the
OSDs.
Adding the max open files limit to the container engine cli when
executing the ceph-volume command improves the execution time a lot,
down to ~30s.

This was impacting the OSDs creation with ceph-volume (both filestore
and bluestore) when using multiple LV devices.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b987534881)
2019-06-20 20:01:13 -04:00
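A sketch of the wrapped call; the `--ulimit nofile=1024:4096` flag is the one discussed here, while the mounted volumes are assumptions:

```
- name: list ceph-volume lvm osds with a raised open-files limit (illustrative)
  command: >
    docker run --rm --privileged=true
    --ulimit nofile=1024:4096
    -v /run/lvm/:/run/lvm/
    -v /var/lib/ceph:/var/lib/ceph:z
    --entrypoint=ceph-volume
    {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}
    lvm list --format json
  register: ceph_volume_lvm_list
  changed_when: false
```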
Guillaume Abrioux f29366b848 ceph-osd: do not relabel /run/udev in containerized context
Otherwise content in /run/udev is mislabeled and prevents some services
like NetworkManager from starting.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 80875adba7)
2019-06-19 23:46:46 +02:00
Rishabh Dave 114078bfa1 ceph-infra: make chronyd default NTP daemon
Since timesyncd is not available on RHEL-based OSs, change the default
to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so
set the Ansible fact accordingly.

Fixes: https://github.com/ceph/ceph-ansible/issues/3628
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9d88d3199f)
2019-06-18 10:46:34 +02:00
Rishabh Dave 93c7d8d79d don't install NTPd on Atomic
Since Atomic doesn't allow any installations and NTPd is not present
on the Atomic image we are using, abort when ntp_daemon_type is set to ntpd.

https://github.com/ceph/ceph-ansible/issues/3572
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit bdff3e48fd)
2019-06-18 10:46:34 +02:00
Dimitri Savineau 81de8a8106 remove ceph-agent role and references
The ceph-agent role was used only for RHCS 2 (jewel) so it's not
useful anymore.
The current code will fail on the CentOS distribution because the rhscon
package is only available on Red Hat with the RHCS 2 repository, and
this ceph release is supported on the stable-3.0 branch.

Resolves: #4020

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7503098ca0)
2019-06-17 14:42:08 -04:00
Dimitri Savineau ed9b594b80 tests: Update ansible ssh_args variable
Because we're using vagrant, a ssh config file will be created for
each node with options like user, host, port, identity, etc...
But via tox we override ANSIBLE_SSH_ARGS to use this file. This
removes the default value set in ansible.cfg.

Also add PreferredAuthentications=publickey because CentOS/RHEL
servers are configured with GSSAPIAuthentication enabled for the ssh
server, forcing the client to make a PTR DNS query.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 34f9d51178)
2019-06-17 12:02:36 -04:00
Guillaume Abrioux 64659d2c82 iscsi: assign application (rbd) to pool 'rbd'
If we don't assign the rbd application tag on this pool,
the cluster will get a `HEALTH_WARN` state like the following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4cf17a6fdd)
2019-06-13 14:43:25 +02:00
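A sketch of the corresponding task, using the luminous+ `osd pool application enable` command:

```
- name: assign application 'rbd' to pool 'rbd' (illustrative)
  command: "{{ docker_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool application enable rbd rbd"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
  changed_when: false
```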
Dimitri Savineau 95f3908e44 ceph-handler: replace fuser by /proc/net/unix
We're using the fuser command to see if a process is using a ceph unix
socket file. But the fuser command runs through every PID present in
/proc to see if one of them is using the file.
On a system running thousands of processes, the fuser command can take
a long time to finish.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit da9891da1e)
2019-06-12 23:00:21 +02:00
Guillaume Abrioux db90debcc7 validate: fail in check_devices at the right task
see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 771648304d)
2019-06-10 08:09:58 +02:00
Dimitri Savineau 0b653ee5b4 update default rhcs values and docs
The RHCS documentation mentioned in the default values and
group_vars directory is referring to RHCS 2.x while it should be
3.x.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1702732

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-04 14:18:23 +02:00
Guillaume Abrioux 5053f32c15 osds: allow passing devices by path
ceph-volume didn't work when the devices were passed by path.
Since it now supports it, let's allow this feature in ceph-ansible.

Closes: #3812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f2c45dfd3)
2019-05-09 14:21:43 +02:00
Dimitri Savineau 2fa8099fa7 osd: set default bluestore_wal_devices empty
We only need to set the dedicated wal device when there are three tiers
of storage used.
Currently the block.wal partition will also be created on the same
device as block.db.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1685253

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-25 07:13:38 +00:00
Dimitri Savineau 7418999638 ceph-mds: Increase cpu limit to 4
In containerized deployments the default mds cpu quota is too low
for a production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1999cf3d19)
2019-04-24 21:44:23 +00:00
Dimitri Savineau 54128db5cd ceph-osd: Fix merge conflict from mergify
The PR #3916 was merged automatically by mergify even though there was a
conflict in the ceph-osd-run.sh.j2 template.
This commit resolves the conflict.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-24 12:41:23 -04:00
Dimitri Savineau 3ae2a687ed ceph-osd: Increase cpu limit to 4
In containerized deployments the default osd cpu quota is too low
for a production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c17106874c)

# Conflicts:
#	roles/ceph-osd/templates/ceph-osd-run.sh.j2
2019-04-24 16:02:28 +00:00
Matthew Vernon 1556d802ff ceph-mon: increase timeout waiting for admin and bootstrap keys
With a large and/or busy cluster, it can take significantly more than
30s for a restarted monitor to get to the point where
`ceph-create-keys` returns successfully. A recent upgrade of our
production cluster failed here because it took a couple of minutes for
the newly-upgraded `mon` to be ready. So increase the timeout
significantly.

This patch is applied to stable-3.2, because the affected code is
refactored in stable-4.0 and ceph-create-keys is no longer called.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
2019-04-12 17:03:39 +00:00
Dimitri Savineau 56215d7688 ceph-mds: Set application pool to cephfs
We don't need to use the cephfs variable for the application pool
name because it's always cephfs.
If the cephfs variable is set to something other than the default
value it will break the application pool task.

Resolves: #3790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d2efb7f02b)
2019-04-11 15:38:14 +00:00
Guillaume Abrioux c5c354a61a remove all NBSPs char in stable-3.2 branch
This can cause issues; let's replace all of these chars with real
spaces.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-10 13:27:48 +02:00
Matthew Vernon a8c9b65d13 UCA: Uncomment UCA variables in defaults, fix consequent breakage
The Ubuntu Cloud Archive-related (UCA) defaults in
roles/ceph-defaults/defaults/main.yml were commented out, which means
if you set `ceph_repository` to "uca", you get undefined variable
errors, e.g.

```
The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined

The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: add ubuntu cloud archive repository
  ^ here

```

Unfortunately, uncommenting these results in some other breakage,
because further roles were written that use the fact of
`ceph_stable_release_uca` being defined as a proxy for "we're using
UCA", and so try to install packages from the bionic-updates/queens
release, for example, which doesn't work. So there are a few `apt` tasks
that need modifying to not use `ceph_stable_release_uca` unless
`ceph_origin` is `repository` and `ceph_repository` is `uca`.

Closes: #3475
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 9dd913cf8a)
2019-04-09 16:54:37 +00:00
Dimitri Savineau efa0083f3c ceph-osd: Drop memory flag with bluestore
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dc1c0dcee2)
2019-04-09 13:26:20 +00:00
Dimitri Savineau bbb8ca6643 mon/rgw: use last ipv6 address
When using the monitor_address_block or radosgw_address_block variables
to configure the mon/rgw address, we're getting the first ip address
present in that cidr from the ansible facts.
When there's a VIP on that network, the first filter could return the
wrong value.
This seems to affect only IPv6 setups because the VIP addresses are
added to the ansible facts at the beginning of the list. This is the
opposite (at the end) when using IPv4.
This causes the mon/rgw processes to bind on the VIP address.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680155

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-09 06:17:27 +02:00
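A sketch of the selection logic, assuming the `ipaddr` filter (netaddr) and the role's `ip_version` toggle; only the IPv6 branch switches to `last`:

```
- name: set the monitor address from monitor_address_block (illustrative)
  set_fact:
    _monitor_address: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | last
                          if ip_version == 'ipv6'
                          else ansible_all_ipv4_addresses | ipaddr(monitor_address_block) | first }}"
```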
Ali Maredia e943288cae rgw multisite: add more than 1 rgw to the master or secondary zone
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 37f46a8c5d)
2019-04-06 08:50:30 +00:00
Dimitri Savineau d1b3d18af1 radosgw: Raise cpu limit to 8
In containerized deployments the default radosgw cpu quota is too low
for a production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680171

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d3ae9fd05f)
2019-04-04 19:14:28 +02:00
Guillaume Abrioux b92c826661 defaults: change default value for ceph_docker_image_tag
Since nautilus has been released, it's now the latest stable release; it
means the tag `latest` now refers to nautilus.

`stable-3.2` isn't intended to deploy nautilus; therefore, we should
change the default value for this variable to the latest release
stable-3.2 is able to deploy (mimic).

Closes: #3734

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-21 18:37:21 +00:00
Dimitri Savineau e4a71eabd9 ceph-osd: Ensure lvm2 is installed
When using osd_scenario lvm, we never check if the lvm2 package is
present on the host.
When using a containerized deployment with docker on CentOS/RedHat this
package will be automatically installed as a dependency, but not on the
Ubuntu distribution.
OSDs deployed via ceph-volume require the lvmetad.socket to be active
and running.

Resolves: #3728

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 179fdfbc19)
2019-03-20 22:59:28 +00:00
Guillaume Abrioux d3f6556041 osd: backward compatibility with old disk_list.sh location
Since all files in the container image have moved to `/opt/ceph-container`,
this check must look for the new AND the old path so it's backward
compatible. Otherwise it could end up templating an inconsistent
`ceph-osd-run.sh`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 987bdac963)
2019-03-18 21:56:53 +00:00
Dimitri Savineau 46e8898093 ceph-validate: fail if there's no ipaddr available in monitor_address_block subnet
When using monitor_address_block to determine the ip address of the
monitor node, we need an ip address available in that cidr to be
present in the ansible facts (ansible_all_ipv[46]_addresses).
Currently we don't check if there's an ip address available during
the ceph-validate role.
As a result, the ceph-config role fails due to an empty list during
ceph.conf template creation but the error isn't explicit.

TASK [ceph-config : generate ceph.conf configuration file] *****
fatal: [0]: FAILED! => {"msg": "No first item, sequence was empty."}

With this patch we will fail before the ceph deployment with an
explicit failure message.

Resolves: rhbz#1673687

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5c39735be5)
2019-03-18 18:31:18 +00:00
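A hedged sketch of the early check (IPv4 branch only, for brevity):

```
- name: fail when no ip address is available in monitor_address_block (illustrative)
  fail:
    msg: "No IP address from subnet {{ monitor_address_block }} found in ansible facts"
  when:
    - monitor_address_block != 'subnet'
    - ansible_all_ipv4_addresses | ipaddr(monitor_address_block) | length == 0
```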
Gregory Orange 86e39a29c8 Change docker_container parameter network to network_mode
Addressing "populate kv_store with custom ceph.conf":
Unsupported parameters for (docker_container) module. Looking at
https://docs.ansible.com/ansible/latest/modules/docker_container_module.html
shows that the correct parameter is network_mode, not network.

Signed-off-by: Gregory Orange <gregoryo2014@users.noreply.github.com>
2019-03-18 13:23:10 +00:00
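The corrected call, sketched; everything except the `network_mode` parameter is illustrative:

```
- name: populate kv_store with custom ceph.conf (illustrative)
  docker_container:
    name: populate-kv-store
    image: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
    network_mode: host   # was the unsupported 'network: host'
    state: started
```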
Dimitri Savineau bfa99cdd53 Set the default crush rule in ceph.conf
Currently the default crush rule value is added to the ceph config
on the mon nodes as an extra configuration applied after the template
generation via the ansible ini module.

This implies two behaviors:

1/ On each ceph-ansible run, the ceph.conf will be regenerated via
ceph-config+template and then ceph-mon+ini_file. This leads to an
unnecessary daemon restart.

2/ When other ceph daemons are collocated on the monitor nodes
(like mgr or rgw), the default crush rule value will be erased by
the ceph.conf template (mon -> mgr -> rgw).

This patch adds the osd_pool_default_crush_rule config to the ceph
template and only for the monitor nodes (like crush_rules.yml).
The default crush rule id is read (if it exists) from the current ceph
configuration.
The default configuration is -1 (ceph default).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1638092

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d8538ad4e1)
2019-03-14 14:48:03 +00:00
Dimitri Savineau 2f3206abeb ceph-osd: Install numactl package when needed
With 3e32dce we can run OSD containers with numactl support.
When using the numactl command in a containerized deployment we need to
be sure that the corresponding package is installed on the host.
The package installation is only executed when the
ceph_osd_numactl_opts variable isn't empty.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b7f4e3e7c7)
2019-03-12 08:14:47 +00:00
Guillaume Abrioux 34086ec233 osd: support numactl options on OSD activate
This commit adds numactl support to OSD container activation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1684146

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b3eb9206fa)
2019-03-11 09:50:29 +00:00
VasishtaShastry 2393d82306 Extends check_devices tasks to non-collocated and lvm-batch scenarios
Tuned the name of a task and the error message to make them more understandable to users.

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 34c25ef49b)
2019-03-01 04:06:57 +00:00
ToprHarley d1051c8e55 Convert interface names to underscores
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881

Signed-off-by: Tomas Petr <tpetr@redhat.com>
(cherry picked from commit 573adce7dd)
2019-02-28 19:02:32 +00:00
Guillaume Abrioux de3465b6a3 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d5be83e504)
2019-02-28 13:48:39 +00:00
fpantano 1033411512 Removed unneeded mountpoint and removed ubuntu section
Referring to BZ#1683290, as dsavineau suggests: this bug being
tripleO-specific, removed the ubuntu section and removed the
useless mountpoints.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 21fad7ced3)
2019-02-28 12:31:23 +00:00
fpantano 9b843c24f9 Added to the ceph-radosgw service template the ca-trust
volume avoiding to expose useless information.
This bug is referred to the following bugzilla:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 0c1944236b)
2019-02-28 12:31:23 +00:00
Kevin Coakley 2005d857df Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Set directories to 755 and files to 644 in
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively, instead of
setting files and directories to 755 recursively. The ceph mon
process writes files to this path with permissions 644. This update stops
ansible from updating the permissions in
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes
a file, and increases idempotency.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683997

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
(cherry picked from commit d327681b99)
2019-02-28 10:52:04 +00:00
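A sketch of the resulting task; the symbolic mode keeps execute bits on directories (X) while leaving files at 644:

```
- name: set ownership and permissions on the monitor directory (illustrative)
  file:
    path: "/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}"
    state: directory
    owner: "{{ ceph_uid }}"
    group: "{{ ceph_uid }}"
    mode: "u=rwX,g=rX,o=rX"
    recurse: true
```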
Dimitri Savineau 77596c791d mon: Move client admin variable to defaults
There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 58a9d310d5)
2019-02-27 20:03:13 +00:00
Dimitri Savineau 05c6ac4d78 mon: Add mds permissions to client.admin
The administrator keyring needs full capabilities on mds like mon,
osd and mgr.
Without this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dd7b7604de)
2019-02-27 20:03:13 +00:00
Guillaume Abrioux 8cc75e516c common: do not override ceph_release when ceph_repository is 'rhcs'
We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2b60a35634)
2019-02-21 13:03:16 +00:00
Guillaume Abrioux d15b055854 osd: make the 'wait for all osd to be up' task configurable
Introduce two new variables to make the 'wait for all osd to
be up' check configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 21e5db8982)
2019-02-20 16:53:06 +00:00
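Illustrative defaults; the variable names are modeled on what this change introduces and are an assumption:

```
nb_retry_wait_osd_up: 60
delay_wait_osd_up: 10
```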
David Waiting eba80adb1a ensure at least one osd is up
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.

The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).

This is related to Bugzilla 1578086.

In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.

Signed-off-by: David Waiting <david_waiting@comcast.com>
(cherry picked from commit 3930791cb7)
2019-02-19 19:02:16 +00:00
Patrick C. F. Ernzer a43c68df7d setup_ntp: call handler to disable ntpd if chronyd used
The task that sets up chronyd called the handler that disables chronyd,
which of course defeats the purpose.

Changing the task to disable ntpd instead fixes the issue of chronyd
being disabled after it got enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1673664
Fixes: #3582

Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com
(cherry picked from commit c605ff6a68)
2019-02-15 09:09:36 +00:00
Guillaume Abrioux 6200f90ab2 iscsi: fix permission denied error
Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`,

since it's already set in the main playbook (site.yml/site-container.yml).

The thing is that the files get generated in the 'fetch_directory' as the
root user, because there is a 'delegate_to' and we run the playbook with
`become: True` (from the main playbook).

The idea here is to create the files as the ansible user so we can open
them later to copy them to the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d590f4339)
2019-02-11 16:17:44 +00:00
Leah Neukirchen d855cb2595 Fix uses of default(omit) with string concatenation
When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.

Regression introduced in d53f55e807.

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
2019-02-08 11:01:11 +00:00
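A before/after sketch; `some_prefix` is a hypothetical variable used only to show the pattern:

```
# broken: when concatenated, omit leaks its placeholder into the string
#   path: "{{ some_prefix | default(omit) }}/ceph.conf"
# fixed: concatenation needs an empty string instead
path: "{{ some_prefix | default('') }}/ceph.conf"
```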
Sébastien Han 7db797d8df osd: expose udev into the container
In order to be able to retrieve udev information, we must expose its
socket. As per https://github.com/ceph/ceph/pull/25201, ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 997667a873)
2019-02-06 00:37:11 +00:00
Guillaume Abrioux 303cc85754 osd: bind mount /var/run/udev/
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ade032807)
2019-02-06 00:37:11 +00:00
Guillaume Abrioux af17e0dfbb override ceph_release with ceph_stable_release
When `ceph_origin` is set to `'repository'` and `ceph_repository` to
`'community'`, we need to ensure `ceph_release` reflects
`ceph_stable_release`.

4a3f180f9d simply removed the override
while it should instead have been run only when the condition mentioned
above is satisfied.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bfefdd5bc)
2019-01-24 14:18:34 +00:00
Guillaume Abrioux e29cdd0a61 config: remove code related to ceph release prior to luminous
This part of the code is not needed since ceph-ansible@master is
intended to deploy ceph@master only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bbdde272f)
2019-01-24 14:18:34 +00:00
Guillaume Abrioux eaa92f7e55 ceph-default: rm useless condition
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

(cherry picked from commit e9188cd202)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 14:18:34 +00:00
Noah Watkins e57e2d98a1 start_osds: use list instead of keys (re-introduce)
The python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was undone a few days later by:

  82a6b5adec

and this patch fixes it again :)

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 3cf5fd2c3e)
2019-01-16 15:48:35 +00:00
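The shape of the fix, sketched; under python3 `dict.keys()` returns a view object, so it must be cast before Ansible iterates over it (`osd_ids_output` is hypothetical):

```
  with_items: "{{ (osd_ids_output.stdout | from_json).keys() | list }}"
```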
Sébastien Han 04d8002614 switch: do not fail on missing key
Some people use the switch playbook to perform upgrades, so they end up in
the same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1650572
This is applying the same fix as
729744c6a8.

We don't want to fail on keys that are not present, since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-14 18:54:46 +00:00
Rishabh Dave 4e94d11aa7 ceph-infra: remove ntp_rmp.yml and ntp_debian.yml
This commit fixes the merge conflict that occurred during the
auto-backport and auto-merge of the commit
488281187e.

Also please note that the commit
488281187e was merged (on PR 3477)
"as it is" (despite the merge conflicts), which was not supposed to be
the case ideally. This had the side effect that the feature of supporting
multiple NTP daemons (the new ones being chronyd and timesyncd) was
also backported, which is itself against the convention. For
consistency's sake the feature was backported to stable-3.1 as well.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-09 22:15:18 +01:00
Guillaume Abrioux 416b503476 introduce new role ceph-facts
Sometimes we play the whole `ceph-defaults` role just to access the
default values of some variables. It means we play the `facts.yml` part
of this role while it's not desired. Splitting this role will speed up
the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0eb56e36f8)
2019-01-07 09:14:10 +01:00