ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	787a6e879e	update: use ids to restart osds instead of device name we must use the ids instead of device names in the tasks executed in `post_tasks` for the osd rolling update otherwise it ends up with old systemd units enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1739209 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-13 13:42:58 +02:00
Guillaume Abrioux	81906344ee	osd: copy systemd-device-to-id.sh on all osd nodes before running it Otherwise it will fail when running rolling_update.yml playbook because of `serial: 1` usage. The task which copies the script is run against the current node being played only whereas the task which runs the script is run against all nodes in a loop, it ends up with the typical error: ``` 2019-08-08 17:47:05,115 p=14905 u=ubuntu \| failed: [magna023 -> magna030] (item=magna030) => { "changed": true, "cmd": [ "/usr/bin/env", "bash", "/tmp/systemd-device-to-id.sh" ], "delta": "0:00:00.004339", "end": "2019-08-08 17:46:59.059670", "invocation": { "module_args": { "_raw_params": "/usr/bin/env bash /tmp/systemd-device-to-id.sh", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true } }, "item": "magna030", "msg": "non-zero return code", "rc": 127, "start": "2019-08-08 17:46:59.055331", "stderr": "bash: /tmp/systemd-device-to-id.sh: No such file or directory", "stderr_lines": [ "bash: /tmp/systemd-device-to-id.sh: No such file or directory" ], "stdout": "", "stdout_lines": [] } ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1739209 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-12 21:57:29 +02:00
Guillaume Abrioux	5b29144bbd	tests: deploy mgr on a dedicated node (all_daemons scenario) let's deploy mgr on a dedicated node. This makes update job failing on stable-4.0 branch since there's a mismatch between the two inventories. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-08 13:43:29 +02:00
Guillaume Abrioux	a4f4dd7535	osd: add 'osd blacklist' cap for osp keyrings This commits adds the `osd blacklist` cap on all OSP clients keyrings. Fixes: #2296 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2d955757ee`)	2019-08-07 10:43:04 +02:00
Dimitri Savineau	343eec7a53	shrink-osd: Stop ceph-disk container based on ID Since `bedc0ab` we now manage ceph-osd systemd unit scripts based on ID instead of device name but it was not present in the shrink-osd playbook (ceph-disk version). To keep backward compatibility on deployment that didn't do yet the transition on OSD id then we should stop unit scripts for both device and ID. This commit adds the ulimit nofile container option to get better performance on ceph-disk commands. It also fixes an issue when the OSD id matches multiple OSD ids with the same first digit. $ ceph-disk list \| grep osd.1 /dev/sdb1 ceph data, prepared, cluster ceph, osd.1, block /dev/sdb2 /dev/sdg1 ceph data, prepared, cluster ceph, osd.12, block /dev/sdg2 Finally removing the shrinked OSD directory. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-06 09:38:52 +02:00
Dimitri Savineau	d12e6e626d	rgw: add beast frontend Allow to configure the rgw beast frontend in addition to civetweb (default value). Add rgw_thread_pool_size variable with 512 as default value and keep backward compatibility with num_threads option when using civetweb. Update radosgw_civetweb_num_threads to reflect rgw_thread_pool_size change. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1733406 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d17b1b48b6`)	2019-08-01 10:10:09 +02:00
Dimitri Savineau	4dffcfb429	ceph-osd: check container engine rc for pools When creating OpenStack pools, we only check if the return code from the pool list command isn't 0 (ie: if it doesn't exist). In that case, the return code will be 2. That's why the next condition is rc != 0 for the pool creation. But in containerized deployment, the return code could be different if there's a failure on the container engine command (like container not running). In that case, the return code could be either 1 (docker) or 125 (podman) so we should fail at this point and not in the next tasks. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1732157 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d549fffdd2`)	2019-07-31 14:08:22 -04:00
Dimitri Savineau	bf8bd4c0f1	tests: Update ooo-collocation scenario The ooo-collocation scenario was still using an old container image and doesn't match the requirement on latest stable-3.2 code. We need to use at least the container image v3.2.5. Also updating the OSD tests to reflect the changes introduced by the commit `bedc0ab` because we don't have the OSD systemd unit script using device name anymore. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-30 08:27:13 +02:00
Dimitri Savineau	5463d730ee	Remove NBSP characters Some NBSP are still present in the yaml files. Adding a test in travis CI. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `07c6695d16`)	2019-07-26 16:23:38 -04:00
Dimitri Savineau	bedc0ab69d	ceph-osd: use OSD id with systemd ceph-disk When using containerized deployment we have to create the systemd service unit based on a template. The current implementation with ceph-disk is using the device name as parameter to the systemd service and for the container name too. $ systemctl start ceph-osd@sdb $ docker ps --filter 'name=ceph-osd-' CONTAINER ID IMAGE NAMES 065530d0a27f ceph/daemon:latest-luminous ceph-osd-strg0-sdb This is the only scenario (compared to non containerized or ceph-volume based deployment) that isn't using the OSD id. $ systemctl start ceph-osd@0 $ docker ps --filter 'name=ceph-osd-' CONTAINER ID IMAGE NAMES d34552ec157e ceph/daemon:latest-luminous ceph-osd-0 Also if the device mapping doesn't persist to system reboot (ie sdb might be remapped to sde) then the OSD service won't come back after the reboot. This patch allows to use the OSD id with the ceph-osd systemd service but requires to activate the OSD manually with ceph-disk first in order to assign the ID to that OSD. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1670734 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-26 16:07:22 -04:00
Dimitri Savineau	df46d10c27	ceph-infra: update handler with daemon variable Both ntp and chrony daemon use variable for the service name because it could be different depending on the GNU/Linux distribution. This has been updated in `9d88d3199` for chrony but only for the start part not for the handler. The commit fixes this for both ntp and chrony. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `0ae0193144`)	2019-07-12 10:50:04 -04:00
Ramana Raja	9097f9847c	Install nfs-ganesha stable v2.7 nfs-ganesha v2.5 and 2.6 have hit EOL. Install nfs-ganesha v2.7 stable that is currently being maintained. Signed-off-by: Ramana Raja <rraja@redhat.com> (cherry picked from commit `dfff89ce67`)	2019-07-10 22:09:14 +02:00
Guillaume Abrioux	1716eea5e3	validate: improve message printed in check_devices.yml The message prints the whole content of the registered variable in the playbook, this is not needed and makes the message pretty unclear and unreadable. ``` "msg": "{'_ansible_parsed': True, 'changed': False, '_ansible_no_log': False, u'err': u'Error: Could not stat device /dev/sdf - No such file or directory.\\n', 'item': u'/dev/sdf', '_ansible_item_result': True, u'failed': False, '_ansible_item_label': u'/dev/sdf', u'msg': u\"Error while getting device information with parted script: '/sbin/parted -s -m /dev/sdf -- unit 'MiB' print'\", u'rc': 1, u'invocation': {u'module_args': {u'part_start': u'0%', u'part_end': u'100%', u'name': None, u'align': u'optimal', u'number': None, u'label': u'msdos', u'state': u'info', u'part_type': u'primary', u'flags': None, u'device': u'/dev/sdf', u'unit': u'MiB'}}, 'failed_when_result': False, '_ansible_ignore_errors': None, u'out': u''} is not a block special file!" ``` Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1719023 (cherry picked from commit `e6dc3ebd8c`) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-10 09:37:12 -04:00
Guillaume Abrioux	d739f41549	shrink-osd: (ceph-disk only) remove prepare container When shrinking an OSD, its corresponding 'prepare container' should be removed otherwise it prevent from redeploying a new osd because of this leftover. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-09 09:04:19 -04:00
Guillaume Abrioux	4b49013369	shrink-osd: (ceph-disk only) remove gpt header Removing the gpt header on devices will ease ceph-disk to ceph-volume migration when using shrink-osd + add-osd playbooks. ceph-disk requires GPT header where ceph-volume will complain if GPT header is present. That won't break ceph-disk (re)deployment since we check and add the GPT header if needed when deploying ceph-disk OSDs. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613735 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-09 09:04:19 -04:00
Dimitri Savineau	94cdef2757	ceph-handler: Fix rgw socket in restart script If the SOCKET variable isn't defined in the script then the test command won't fail because the return code is 0 $ test -S $ echo $? 0 There multiple issues in that script: - The default SOCKET value isn't defined. - Update the wget parameters because the command is doing a loop. We now use the same option than curl. - The check_rest function doesn't test the radosgw at all due to a wrong test command (test against a string) and always returns 0. This needs to use the DOCKER_EXEC variable in order to execute the command. $ test 'wget http://192.168.100.11:8080' $ echo $? 0 Resolves: #3926 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c90f605b51`)	2019-07-08 10:38:35 -04:00
Dimitri Savineau	9cc5d1e903	ceph-handler: Fix radosgw_address default value The rgw restart script set the RGW_IP variable depending on ansible variables: - radosgw_address - radosgw_address_block - radosgw_interface Those variables have default values defined in ceph-defaults role: radosgw_interface: interface radosgw_address: 0.0.0.0 radosgw_address_block: subnet But in the rgw restart script we always use the radosgw_address value instead of the radosgw_interface when defined because we aren't testing the right default value. As a consequence, the RGW_IP variable will be set to 0.0.0.0 even if the ip address associated to the radosgw_interface variable is set correctly. This causes the check_rest function to fail. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-07-07 07:24:38 +02:00
Gabriel Ramirez	9c31811c33	validate.py: Fix alphabetical order on uca Alphabetized ceph_repository_uca keys due to errors validating when using UCA/queens repository on Ubuntu 16.04 An exception occurred during task execution. To see the full traceback, use -vvv. The error was: SchemaError: -> ceph_stable_repo_uca schema item is not alphabetically ordered Closes: #4154 Signed-off-by: Gabriel Ramirez <gabrielramirez1109@gmail.com> (cherry picked from commit `82262c6e8c`)	2019-06-25 11:36:04 -04:00
Dimitri Savineau	14f2d616ee	ceph-nfs: use template module for configuration `789cef7` introduces a regression in the ganesha configuration file generation. The new config_template module version broke it. But the ganesha.conf file isn't an ini file and doesn't really need to use the config_template module. Instead we can use the classic template module. Resolves: #4045 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `616c484698`)	2019-06-24 20:47:25 +02:00
Guillaume Abrioux	8b91905dff	purge: ensure no ceph kernel thread is present This tries to first unmount any cephfs/nfs-ganesha mount point on client nodes, then unmap any mapped rbd devices and finally it tries to remove ceph kernel modules. If it fails it means some resources are still busy and should be cleaned manually before continuing to purge the cluster. This is done early in the playbook so the cluster stays untouched until everything is ready for that operation, otherwise if you try to redeploy a cluster it could end up by getting confused by leftover from previous deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1337915 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `20e4852888`)	2019-06-24 15:36:21 +02:00
Guillaume Abrioux	520f4e9914	add-osd: fix error in validate execution role ceph-facts should be run before we play ceph-validate since it has reference to facts that are set in ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-24 14:36:18 +02:00
Guillaume Abrioux	27aad73471	tests: add nfs-ganesha testing This was removed because of broken repositories which made the CI failing. That doesn't make sense anymore so adding back it Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-24 10:35:42 +02:00
Dimitri Savineau	d08af0a654	ceph-disk: Set max open files limit on container Same behaviour than ceph-volume (`b987534`). The ceph-disk command runs faster when using ulimit nofile with container cli. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-24 10:06:11 +02:00
Dimitri Savineau	2b492e3de1	ceph-handler: Fix OSD restart script There's two big issues with the current OSD restart script. 1/ We try to test if the ceph osd daemon socket exists but we use a wildcard for the socket name : /var/run/ceph/*.asok. This fails because we usually have multiple ceph osd sockets (or other ceph daemon collocated) present in /var/run/ceph directory. Currently the test fails with: bash: line xxx: [: too many arguments But it doesn't stop the script execution. Instead we can specify the full ceph osd socket name because we already know the OSD id. 2/ The container filter pattern is wrong and could matches multiple containers resulting the script to fail. We use the filter with two different patterns. One is with the device name (sda, sdb, ..) and the other one is with the OSD id (ceph-osd-0, ceph-osd-15, ..). In both case we could match more than needed. $ docker container ls CONTAINER ID IMAGE NAMES 958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda 589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb 46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa 877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab $ docker container ls -q -f "name=sda" 958121a7cc7d 46c7240d71f3 877985ec3aca $ docker container ls CONTAINER ID IMAGE NAMES 2db399b3ee85 ceph-daemon:latest ceph-osd-5 099dc13f08f1 ceph-daemon:latest ceph-osd-13 5d0c2fe8f121 ceph-daemon:latest ceph-osd-17 d6c7b89db1d1 ceph-daemon:latest ceph-osd-1 $ docker container ls -q -f "name=ceph-osd-1" 099dc13f08f1 5d0c2fe8f121 d6c7b89db1d1 Adding an extra '$' character at the end of the pattern solves the problem. Finally removing the get_container_osd_id function because it's not used in the script at all. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `45d46541cb`)	2019-06-21 14:49:55 -04:00
Dimitri Savineau	f4212b20e5	ceph-volume: Set max open files limit on container The ceph-volume lvm list command takes ages to complete when having a lot of LV devices on containerized deployment. For instance, with 25 OSDs on a node it takes 3 mins 44s to list the OSD. Adding the max open files limit to the container engine cli when executing the ceph-volume command seems to improve a lot the execution time ~30s. This was impacting the OSDs creation with ceph-volume (both filestore and bluestore) when using multiple LV devices. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b987534881`)	2019-06-20 20:01:13 -04:00
Guillaume Abrioux	f29366b848	ceph-osd: do not relabel /run/udev in containerized context Otherwise content in /run/udev is mislabeled and prevent some services like NetworkManager from starting. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `80875adba7`)	2019-06-19 23:46:46 +02:00
Rishabh Dave	114078bfa1	ceph-infra: make chronyd default NTP daemon Since timesyncd is not available on RHEL-based OSs, change the default to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so set the Ansible fact accordingly. Fixes: https://github.com/ceph/ceph-ansible/issues/3628 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `9d88d3199f`)	2019-06-18 10:46:34 +02:00
Rishabh Dave	93c7d8d79d	don't install NTPd on Atomic Since Atomic doesn't allow any installations and NTPd is not present on Atomic image we are using, abort when ntp_daemon_type is set to ntpd. https://github.com/ceph/ceph-ansible/issues/3572 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `bdff3e48fd`)	2019-06-18 10:46:34 +02:00
Dimitri Savineau	81de8a8106	remove ceph-agent role and references The ceph-agent role was used only for RHCS 2 (jewel) so it's not useful anymore. The current code will fail on CentOS distribution because the rhscon package is only available on Red Hat with the RHCS 2 repository and this ceph release is supported on stable-3.0 branch. Resolves: #4020 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7503098ca0`)	2019-06-17 14:42:08 -04:00
Dimitri Savineau	ed9b594b80	tests: Update ansible ssh_args variable Because we're using vagrant, a ssh config file will be created for each node with options like user, host, port, identity, etc... But via tox we override ANSIBLE_SSH_ARGS to use this file. This removes the default value set in ansible.cfg. Also adding PreferredAuthentications=publickey because CentOS/RHEL servers are configured with GSSAPIAuthentication enabled for ssh server forcing the client to make a PTR DNS query. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34f9d51178`)	2019-06-17 12:02:36 -04:00
Guillaume Abrioux	64659d2c82	iscsi: assign application (rbd) to pool 'rbd' if we don't assign the rbd application tag on this pool, the cluster will get `HEALTH_WARN` state like following: ``` HEALTH_WARN application not enabled on 1 pool(s) POOL_APP_NOT_ENABLED application not enabled on 1 pool(s) application not enabled on pool 'rbd' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4cf17a6fdd`)	2019-06-13 14:43:25 +02:00
Dimitri Savineau	95f3908e44	ceph-handler: replace fuser by /proc/net/unix We're using fuser command to see if a process is using a ceph unix socket file. But the fuser command runs through every PID present in /proc/<PID> to see if one of them is using the file. On a system running thousands processes, the fuser command can take a long time to finish. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da9891da1e`)	2019-06-12 23:00:21 +02:00
Guillaume Abrioux	db90debcc7	validate: fail in check_devices at the right task see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `771648304d`)	2019-06-10 08:09:58 +02:00
Guillaume Abrioux	62647e1935	spec: bring back possibility to install ceph with custom repo This can be seen as a regression for customers who were used to deploy in offline environment with custom repositories. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1673254 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `c933645bf7`)	2019-06-07 17:29:57 +02:00
Dimitri Savineau	0b653ee5b4	update default rhcs values and docs The RHCS documentation mentioned in the default values and group_vars directory are referring to RHCS 2.x while it should be 3.x. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1702732 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-04 14:18:23 +02:00
Dimitri Savineau	b5fdf5fdcb	vagrant: Default box to centos/7 We don't use ceph/ubuntu-xenial anymore but only centos/7 and centos/atomic-host. Changing the default to centos/7. Resolves: #4036 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `24d0fd7003`)	2019-05-31 13:57:55 -04:00
Dimitri Savineau	8a74928a19	tox: Refact lvm_osds scenario The current lvm_osds only tests filestore on one OSD node. We also have bs_lvm_osds to test bluestore and encryption. Let's use only one scenario to test filestore/bluestore and with or without dmcrypt on four OSD nodes. Also use validate_dmcrypt_bool_value instead of types.boolean on dmcrypt validation via notario. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `52b9f3fb28`)	2019-05-10 11:24:32 +02:00
Mike Christie	0a24078bbb	igw: Fix rolling update service ordering We must stop tcmu-runner after the other rbd-target-* services because they may need to interact with tcmu-runner during shutdown. There is also a bug in some kernels where IO can get stuck in the kernel and by stopping rbd-target-* first we can make sure all IO is flushed. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1659611 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `d7ef12910e`)	2019-05-10 11:12:50 +02:00
Guillaume Abrioux	900244e065	Revert "Revert "cv: support zap by osd fsid"" This reverts commit `addcc1e61a`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-10 09:13:10 +02:00
Guillaume Abrioux	f1b4874176	Revert "Revert "shrink_osd: use cv zap by fsid to remove parts/lvs"" This reverts commit `043ee8c158`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-10 09:13:10 +02:00
Guillaume Abrioux	5053f32c15	osds: allow passing devices by path ceph-volume didn't work when the devices were passed by path. Since it now supports it, let's allow this feature in ceph-ansible Closes: #3812 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8f2c45dfd3`)	2019-05-09 14:21:43 +02:00
Guillaume Abrioux	addcc1e61a	Revert "cv: support zap by osd fsid" This reverts commit `8454f0144a`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-25 21:27:37 +02:00
Guillaume Abrioux	043ee8c158	Revert "shrink_osd: use cv zap by fsid to remove parts/lvs" This reverts commit `be59e0b451`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-25 21:27:37 +02:00
Dimitri Savineau	2fa8099fa7	osd: set default bluestore_wal_devices empty We only need to set the wal dedicated device when there's three tiers of storage used. Currently the block.wal partition will also be created on the same device than block.db. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1685253 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-25 07:13:38 +00:00
Dimitri Savineau	9ff19cc604	rolling_update: restart all ceph-iscsi services Currently only rbd-target-gw service is restarted during an update. We also need to restart tcmu-runner and rbd-target-api services during the ceph iscsi upgrade. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1659611 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `f1048627ea`)	2019-04-24 23:17:41 +00:00
Dimitri Savineau	7418999638	ceph-mds: Increase cpu limit to 4 In containerized deployment the default mds cpu quota is too low for production environment. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1999cf3d19`)	2019-04-24 21:44:23 +00:00
Dimitri Savineau	54128db5cd	ceph-osd: Fix merge conflict from mergify The PR #3916 was merged automatically by mergify even if there was a conflict in the ceph-osd-run.sh.j2 template. This commit resolves the conflict. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-24 12:41:23 -04:00
Dimitri Savineau	3ae2a687ed	ceph-osd: Increase cpu limit to 4 In containerized deployment the default osd cpu quota is too low for production environment using NVMe devices. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c17106874c`) # Conflicts: # roles/ceph-osd/templates/ceph-osd-run.sh.j2	2019-04-24 16:02:28 +00:00
Dimitri Savineau	c056ae7b8c	ansible.cfg: Add library path to configuration Ceph module path needs to be configured if we want to avoid issues like: no action detected in task. This often indicates a misspelled module name, or incorrect module path Currently the ansible-lint command in Travis CI complains about that. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1668478 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `a1a871cade`)	2019-04-24 07:49:48 +00:00
Matthew Vernon	1556d802ff	ceph-mon: increase timeout waiting for admin and bootstrap keys With a large and/or busy cluster, it can take significantly more than 30s for a restarted monitor to get to the point where `ceph-create-keys` returns successfully. A recent upgrade of our production cluster failed here because it took a couple of minutes for the newly-upgraded `mon` to be ready. So increase the timeout significantly. This patch is applied to stable-3.2, because the affected code is refactored in stable-4.0 and ceph-create-keys is no longer called. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>	2019-04-12 17:03:39 +00:00
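
The `test` pitfall described in commit 94cdef2757 can be reproduced outside the restart script. The following is a minimal sketch, not the script's actual code: the variable names `unquoted_rc`, `quoted_rc`, and `string_rc` are made up for illustration, and an empty `SOCKET` stands in for the undefined one mentioned in the commit message (an unquoted empty or unset variable expands to nothing in both cases).

```shell
#!/usr/bin/env bash
# Stand-in for the undefined SOCKET variable (assumption: empty behaves like unset here).
SOCKET=""

# Unquoted: the empty expansion disappears, so test only sees "-S".
# With a single argument, test succeeds whenever that argument is non-empty.
unquoted_rc=0
test -S $SOCKET || unquoted_rc=$?

# Quoted: test sees "-S" and "", and the empty path is not a socket.
quoted_rc=0
test -S "$SOCKET" || quoted_rc=$?

# A command wrapped in quotes is just a non-empty string; nothing is executed.
string_rc=0
test 'wget http://192.168.100.11:8080' || string_rc=$?

echo "unquoted=$unquoted_rc quoted=$quoted_rc string=$string_rc"
```

Quoting the variable (or giving it a default value, as the commit does) is what makes the socket check meaningful; the string form never contacts radosgw at all.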

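The over-matching container filter described in commit 2b492e3de1 can be illustrated without a container engine. This sketch assumes that the `name` filter behaves like an unanchored regular-expression match, and uses `grep -E` as a stand-in for `docker container ls -f "name=..."`; the container names are the ones quoted in the commit message, and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Container names taken from the commit message example.
names='ceph-osd-5
ceph-osd-13
ceph-osd-17
ceph-osd-1'

# Unanchored pattern: also matches ceph-osd-13 and ceph-osd-17.
loose=$(printf '%s\n' "$names" | grep -c -E 'ceph-osd-1')

# Pattern with a trailing '$': only the exact OSD id matches.
anchored=$(printf '%s\n' "$names" | grep -c -E 'ceph-osd-1$')

echo "loose=$loose anchored=$anchored"
```

The same reasoning applies to the device-name case, where `sda` would also match `sdaa` and `sdab`.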
4265 Commits (787a6e879e59fc0e93593ff6e4f3a398455ab7f7)