ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	45a7082712	lint: Fix spaces before and after variables ansible-lint reports: [206] Variables should have spaces after {{ and before }} Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-03-01 17:22:24 +00:00
Kevin Coakley	038401fef2	Add changed_when: false to the "get osd ids" statement The "get osd ids" statement only registers the osd_ids_non_container variable. Running "ls /var/lib/ceph/osd/ \| sed 's/.*-//'" should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>	2019-02-28 22:46:19 +00:00
Guillaume Abrioux	21e5db8982	osd: make the 'wait for all osd to be up' task configurable introduce two new variables to make the check that 'wait for all osd to be up' configurable. It's possible that for some deployments, OSDs can take longer to be seen as UP and IN. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-20 16:06:04 +00:00
David Waiting	3930791cb7	ensure at least one osd is up The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing. The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0). This is related to Bugzilla 1578086. In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing. Signed-off-by: David Waiting <david_waiting@comcast.com>	2019-02-19 18:31:05 +00:00
John Fulton	cc0bf197e1	Fix CNI error when net=host is not used on OSD calls Follow up fix that `410abd7` missed. Related: ceph#3561 Signed-off-by: John Fulton <fulton@redhat.com>	2019-02-05 22:49:01 +00:00
Guillaume Abrioux	16efdbc59b	podman: support podman installation on rhel8 Add required changes to support podman on rhel8 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1667101 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Andrew Schoen	88eda479a9	ceph-facts: generate devices when osd_auto_discovery is true This task used to live in ceph-osd, but we need it defined here to that ceph-config can use it when trying to determine the number of osds. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Kai Wembacher	a273ed7f60	add support for rocksdb and wal on the same partition in non-collocated Signed-off-by: Kai Wembacher <kai@ktwe.de>	2018-12-20 14:19:46 +01:00
Guillaume Abrioux	d7e77012ef	retry on packages and repositories failures add register/until on all packaging related tasks to avoid non valid CI failure. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-19 14:48:27 +00:00
Noah Watkins	3cf5fd2c3e	start_osds: use list instead of keys (re-introduce) the python3 fix merged by: https://github.com/ceph/ceph-ansible/pull/3346 was reintroduced a few days later by: `82a6b5adec` and this patch fixes it again :) Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-12-05 23:25:35 +00:00
Sébastien Han	82a6b5adec	osd: manage legacy ceph-disk non-container startup The code is now able (again) to start osds that where configured with ceph-disk on a non-container scenario. Closes: https://github.com/ceph/ceph-ansible/issues/3388 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `452069cb3a`)	2018-12-03 16:01:57 +01:00
Sébastien Han	ec2d1f502d	osd: re-introduce disk_list check This commit `4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)` accidentally removed this check. This is a must have for ceph-disk based containerized OSDs. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `9b5a93e3a5`)	2018-12-03 16:01:57 +01:00
Sébastien Han	14fc5bad12	mon: do not serialized container bootstrap This commit unifies the container and non-container code, which in the meantime gives use the ability to deploy N mon container at the same time without having to serialized the deployment. This will drastically reduces the time needed to bootstrap the cluster. Note, this is only possible since Nautilus because the monitors are bootstrap the initial keys on their own once they reach quorum. In the Nautilus version of the ceph-container mon, we stopped generating the keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Guillaume Abrioux	ccc0c9c24c	osd: remove a leftover this file is never included in ceph-osd, looks like a leftover let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-03 09:12:02 +01:00
Guillaume Abrioux	fead0813b4	remove kv store support the next stable release will drop this feature. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-30 13:45:12 +00:00
Sébastien Han	bc2daaeb71	ceph-osd fix batch with container binary Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	f192bc92a2	ceph_key: use the right container runtime binary Rework all the ceph_key invocation to use either docker or podman binary. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	a96e910114	Add new container scenario Test with podman instead of docker and also support for python 3 only. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	997667a873	osd: expose udev into the container In order to be able to retrieve udev information, we must expose its socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-26 18:57:12 +00:00
Guillaume Abrioux	7774069d45	refact osd pool size customization Add real default value for osd pool size customization. Ceph itself has an `osd_pool_default_size` default value to `3`. If users don't specify a pool size in various pools definition within ceph-ansible, we should default to `3`. By the way, this kind of condition isn't really clear: ``` when: - rbd_pool_size \| default ("") ``` we should try to get the customized value then default to what is in `osd_pool_default_size` (which has its default value pointing to `ceph_osd_pool_default_size` (`3`) as well) and compare it to `ceph_osd_pool_default_size`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	d4c0960f04	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specifc group of nodes, it will fail when trying to access this variables since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Boris Ranto	c2b0cbd699	start_osds: Use list instead of keys If you use python3 based ansible then keys() returns a dict_keys object, not a list of keys. This breaks the installation on such a system. Using the list filter provides a more robust solution that should work on both python2 and python3 based ansible. You can find some more information about the issue, here: https://github.com/ansible/ansible/issues/19514 Signed-off-by: Boris Ranto <branto@redhat.com>	2018-11-20 18:48:22 +01:00
Guillaume Abrioux	f7fcc012e9	osd: commonize start_osd code since `ceph-volume` introduction, there is no need to split those tasks. Let's refact this part of the code so it's clearer. By the way, this was breaking rolling_update.yml when `openstack_config: true` playbook because nothing ensured OSDs were started in ceph-osd role (In `openstack_config.yml` there is a check ensuring all OSD are UP which was obviously failing) and resulted with OSDs on the last OSD node not started anyway. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Sébastien Han	72cae542da	lint: Don't compare to empty string description = 'Use `when: var` rather than `when: var != ""` (or ' \ 'conversely `when: not var` rather than `when: var == ""`)' Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-08 10:22:02 +00:00
Sébastien Han	094ae8baf1	lint: do not use local_action Use delegate_to: localhost instead. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-08 10:22:02 +00:00
Sébastien Han	037bab2922	lint: line length should not exceed 160 chars Line was too long Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-08 10:22:02 +00:00
Sébastien Han	a882ad7ade	lint: use command instead of shell Use command when the tasks does not have any pipes or wilcards. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-31 14:18:36 +01:00
Sébastien Han	53972ee672	lint: add changed_when to command Calling command should have changed_when false otherwise each time it runs it will show as 'changed' and this is irrelevant. Commands should not change things if nothing needs doing Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-31 14:18:36 +01:00
Rishabh Dave	8edbda96df	use blocks directives to group tasks Using block directives simplifies the playbooks and makes them more readable. Fixes: https://github.com/ceph/ceph-ansible/issues/2835 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-31 09:37:43 +01:00
Sébastien Han	d209fc9d02	lint yaml Fix [error] too many blank lines (1 > 0) (empty-lines) Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-30 14:41:36 +01:00
Rishabh Dave	ee2d52d33d	allow custom pool size Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-22 16:00:21 +02:00
Sébastien Han	fbd878c8d5	infra: rename osd-configure to add-osd and improve it The playbook has various improvements: * run ceph-validate role before doing anything * run ceph-fetch-keys only on the first monitor of the inventory list * set noup flag so PGs get distributed once all the new OSDs have been added to the cluster and unset it when they are up and running Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1624962 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-17 11:26:11 +00:00
Guillaume Abrioux	40b7747af7	remove jewel support As of now, we should no longer support Jewel in ceph-ansible. The latest ceph-ansible release supporting Jewel is `stable-3.1`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-12 23:38:17 +00:00
Sébastien Han	31a0438cb2	ceph_volume: refactor This commit does a couple of things: * Avoid code duplication * Clarify the code * add more unit tests * add myself to the author of the module Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	bfe689094e	osd: do not run when lvm scenario This task was created for ceph-disk based deployments so it's not needed when osd are prepared with ceph-volume. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	e39fc4f6ce	ceph_volume: add container support for batch command The batch option got recently added, while rebasing this patch it was necessary to implement it. So now, the batch option can work on containerized environments. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630977 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	3ddcc9af16	ceph_volume: try to get ride of the dummy container If we run on a containerized deployment we pass an env variable which contains the container image. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Sébastien Han	aa2c1b27e3	ceph-osd: ceph-volume container support Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-10 16:08:41 -04:00
Andrew Schoen	c453ea25c0	ceph-osd: use journal_size and block_db_size for lvm batch Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-10-09 10:09:50 -04:00
Rishabh Dave	b5d2ea269f	don't use "static" field while including tasks Instead used "import_tasks" and "include_tasks" to tell whether tasks must be included statically or dynamically. Fixes: https://github.com/ceph/ceph-ansible/issues/2998 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-10-04 07:44:28 +00:00
Rishabh Dave	380168dadc	don't use "include" to include tasks Use "import_tasks" or "include_tasks" instead. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-09-27 17:53:40 +02:00
Andrew Schoen	b36f3e06b5	ceph_volume: adds the osds_per_device parameter If this is set to anything other than the default value of 1 then the --osds-per-device flag will be used by the batch command to define how many osds will be created per device. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-09-12 20:27:14 +00:00
Sébastien Han	8c70a5b197	osd: fix ceph_release We need ceph_release in the condition, not ceph_stable_release Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1619255 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-20 20:14:56 +02:00
Sébastien Han	3149b2564f	Revert "osd: generate device list for osd_auto_discovery on rolling_update" This reverts commit `e84f11e99e`. This commit was giving a new failure later during the rolling_update process. Basically, this was modifying the list of devices and started impacting the ceph-osd itself. The modification to accomodate the osd_auto_discovery parameter should happen outside of the ceph-osd. Also we are trying to not play ceph-osd role during the rolling_update process so we can speed up the upgrade. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-16 11:13:12 +02:00
Andrew Schoen	6423ab4ad3	lvm: fix condition when selecting which scenario to run devices and lvm_volumes will always be defined, so we need to instead check it's length before deciding to run the scenario. This fixes the failure here: https://2.jenkins.ceph.com/job/ceph-ansible-prs-luminous-bluestore_lvm_osds/86/consoleFull#1667273050b5dd38fa-a56e-4233-a5ca-584604e56e3a Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-10 11:46:12 +02:00
Sébastien Han	e84f11e99e	osd: generate device list for osd_auto_discovery on rolling_update rolling_update relies on the list of devices when performing the restart of the OSDs. The task that is builind the devices list out of the ansible_devices dict only runs when there are no partitions on the drives. However during an upgrade the OSD are already configured, they have been prepared and have partitions so this task won't run and thus the devices list will be empty, skipping the restart during rolling_update. We now run the same task under different requirements when rolling_update is true and build a list when: * osd_auto_discovery is true * rolling_update is true * ansible_devices exists * no dm/lv are part of the discovery * the device is not removable * the device has more than 1 sector Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613626 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-08-10 09:19:40 +02:00
Andrew Schoen	3592c68cca	ceph-osd: adds crush_device_class config option This is used with the lvm osd scenario. When using devices you need the option to set the crush device class for all of the OSDs that are created from those devices. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Andrew Schoen	6d431ec22d	ceph-volume: implement the 'lvm batch' subcommand This adds the action 'batch' to the ceph-volume module so that we can run the new 'ceph-volume lvm batch' subcommand. A functional test is also included. If devices is defind and osd_scenario is lvm then the 'ceph-volume lvm batch' command will be used to create the OSDs. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-08-09 09:41:58 -04:00
Guillaume Abrioux	053709da97	ceph-osds: backward compatibility with jewel for osp pools creation If we want to be backward compatible with release prior to luminous, we have to set the rule name accordingly to default values used in jewel. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-07-26 18:47:10 +00:00
Sébastien Han	abdb53e16a	ceph-osd: trigger osd container restart on script change The script ceph-osd-run.sh holds the config options to start the container, if one of these options are modified we must restart the container. This was not the case before becauase the 'notify' flag wasn't present. Closing: https://bugzilla.redhat.com/show_bug.cgi?id=1596061 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-28 17:54:13 +02:00
George Shuklin	653b483fc3	Add ceph_keyring_permissions variable to control permissions for keyring files in /etc/ceph. Default value is the same as it was (0600), but this variable allows user to override it (f.e. set it to 0640). Signed-off-by: George Shuklin <george.shuklin@gmail.com>	2018-06-28 15:48:39 +00:00
Konstantin Shalygin	3a07568496	ceph-osd: set 'openstack_keys_tmp' only when 'openstack_config' is defined. If 'openstack_config' is false this task shouldn't be executed. Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>	2018-06-11 13:03:55 +02:00
Guillaume Abrioux	433ecc7cbc	osd: copy openstack keys over to all mon When configuring openstack, the created keyrings aren't copied over to all monitors nodes. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1588093 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-07 13:58:57 +08:00
Guillaume Abrioux	aae37b44f5	mons: move set_fact of openstack_keys in ceph-osd Since the openstack_config.yml has been moved to `ceph-osd` we must move this `set_fact` in ceph-osd otherwise the tasks in `openstack_config.yml` using `openstack_keys` will actually use the defaults value from `ceph-defaults`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1585139 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-01 17:12:01 +02:00
Guillaume Abrioux	9d5265fe11	osds: wait for osds to be up before creating pools This is a follow up on #2628. Even with the openstack pools creation moved later in the playbook, there is still an issue because OSDs are not all UP when trying to create pools. Adding a task which checks for all OSDs to be UP with a `retries/until` condition should definitively fix this issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-01 15:46:52 +02:00
Guillaume Abrioux	34e646e767	osds: do not set docker_exec_cmd fact in `ceph-osd` there is no need to set `docker_exec_cmd` since the only place where this fact is used is in `openstack_config.yml` which delegate all docker command to a monitor node. It means we need the `docker_exec_cmd` fact that has been set referring to `ceph-mon-*` containers, this fact is already set earlier in `ceph-defaults`. By the way, when collocating an OSD with a MON it fails because the container `ceph-osd-{{ ansible_hostname }}` doesn't exist. Removing this task will allow to collocate an OSD with a MON. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1584179 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-30 16:17:29 +02:00
Guillaume Abrioux	3a0e168a76	mdss: move cephfs pools creation in ceph-mds When deploying a large number of OSD nodes it can be an issue because the protection check [1] won't pass since it tries to create pools before all OSDs are active. The idea here is to move cephfs pools creation in `ceph-mds` role. [1] `e59258943b/src/mon/OSDMonitor.cc (L5673)` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Guillaume Abrioux	564a662baf	osds: move openstack pools creation in ceph-osd When deploying a large number of OSD nodes it can be an issue because the protection check [1] won't pass since it tries to create pools before all OSDs are active. The idea here is to move openstack pools creation at the end of `ceph-osd` role. [1] `e59258943b/src/mon/OSDMonitor.cc (L5673)` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-24 09:39:38 -07:00
Vishal Kanaujia	ef5f52b1f3	Skip GPT header creation for lvm osd scenario The LVM lvcreate fails if the disk already has a GPT header. We create GPT header regardless of OSD scenario. The fix is to skip header creation for lvm scenario. fixes: https://github.com/ceph/ceph-ansible/issues/2592 Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>	2018-05-23 11:44:09 -07:00
Andrew Schoen	32bac6b491	ceph-validate: move var checks from ceph-osd into this role Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-05-18 17:58:24 +02:00
Andy McCrae	08a2b58d39	Allow os_tuning_params to overwrite fs.aio-max-nr The order of fs.aio-max-nr (which is hard-coded to 1048576) means that if you set fs.aio-max-nr in os_tuning_params it will effectively be ignored for bluestore scenarios. To resolve this we should move the setting of fs.aio-max-nr above the setting of os_tuning_params, in this way the operator can define the value of fs.aio-max-nr to be something other than 1048576 if they want to. Additionally, we can make the sysctl settings happen in 1 task rather than multiple.	2018-05-11 10:49:37 +01:00
Sébastien Han	65ba85aff6	Expose /var/run/ceph Useful for softwares that do data collection/monitoring like collectd. They can connect to the socket and then retrieve information. Even though the sockets are exposed now, I'm keeping the docker exec to check the socket, this will allow newer version of ceph-ansible to work with older versions. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-20 15:48:32 +02:00
Sébastien Han	641f141c0f	selinux: remove chcon calls We know bindmount with the :z option at the end of the -v command so this will basically run the exact same command as we used to run. So to speak: chcon -Rt svirt_sandbox_file_t /var/lib/ceph Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-19 14:59:37 +02:00
Sébastien Han	d2a2793cb0	refactor the way we copy keys This commit does a couple of things: * use a common.yml file that contains things that can be played on both container and non-container * refactor the ability to copy the admin key to the nodes Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-18 16:46:33 +02:00
Sébastien Han	5bbbce527e	osd: do not do anything if the dev has a partition Regardless if the partition is 'ceph' or something else, we don't want to be as strick as checking for a particular partition. If the drive has a partition, we just don't do anything. This solves the case where the server reboots, disks get a different /dev/sda (node) allocation. In this case, prior to restarting the server /dev/sda was an OSD, but now it's /dev/sdb and the other way around. In such scenario, we will try to prepare the OSD and create a new partition, so let's not mess around with devices that have partitions. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-04-13 19:11:15 +02:00
vasishta p shastry	e1a1f81b6f	osd: to support copy_admin_key	2018-04-11 14:21:15 +02:00
Sébastien Han	e3275c1ca1	osd: add fs.aio-max-nr tuning The number of osds per nodes is limited by aio-max-nr, default is low, so we need to increase it. Full story: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-August/020408.html Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1553407 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-15 14:06:26 +01:00
Sébastien Han	f432819c1e	osd: apply systcl right away Without sysctl_set: yes the sysctm tuning will only get applied on the systctl.conf but not on the fly. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-15 14:06:26 +01:00
Sébastien Han	0f8a4251ba	move system tuning to osd role The changes from these tasks only apply to osd nodes so there is no reason to have them in ceph-common. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-15 14:06:26 +01:00
Sébastien Han	3261ab23b8	osd: remove old crush_location implementation This was causing a lot of pain with the handlers. Also the implementation was not ideal since we were assembling files. Everything can now be done with the ceph_crush module so let's remove that. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-03-06 15:24:31 +00:00
Caleb Boylan	0be60456ce	osd: Add support for multipath disks Multipath disks have partitions with a different format than what ceph-ansible currently supports, this update makes ceph-ansible aware of that format so multipath disks can be used as OSDs Signed-off-by: Caleb Boylan <caleb.boylan@ormuco.com>	2018-02-09 18:06:25 +01:00
Guillaume Abrioux	deaf273b25	syntax: change local_action syntax Use a nicer syntax for `local_action` tasks. We used to have oneliner like this: ``` local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }} ``` The usual syntax: ``` local_action: module: wait_for port: 22 host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}" state: started delay: 10 timeout: 500 ``` is nicer and kind of way to keep consistency regarding the whole playbook. This also fix a potential issue about missing quotation : ``` Traceback (most recent call last): File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module> main() File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin) File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command File "/usr/lib64/python2.7/shlex.py", line 279, in split return list(lex) File "/usr/lib64/python2.7/shlex.py", line 269, in next token = self.get_token() File "/usr/lib64/python2.7/shlex.py", line 96, in get_token raw = self.read_token() File "/usr/lib64/python2.7/shlex.py", line 172, in read_token raise ValueError, "No closing quotation" ValueError: No closing quotation ``` writing `local_action: shell echo {{ fsid }} \| tee {{ fetch_directory }}/ceph_cluster_uuid.conf` can cause trouble because it's complaining with missing quotes, this fix solves this issue. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-31 10:45:34 +01:00
Sébastien Han	5132cc3de4	Do not search osd ids if ceph-volume Description of problem: The 'get osd id' task goes through all the 10 times (and its respective timeouts) to make sure that the number of OSDs in the osd directory match the number of devices. This happens always, regardless if the setup and deployment is correct. Version-Release number of selected component (if applicable): Surely the latest. But any ceph-ansible version that contains ceph-volume support is affected. How reproducible: 100% Steps to Reproduce: 1. Use ceph-volume (LVM) to deploy OSDs 2. Avoid using anything in the 'devices' section 3. Deploy the cluster Actual results: TASK [ceph-osd : get osd id _uses_shell=True, _raw_params=ls /var/lib/ceph/osd/ \| sed 's/.-//'] ********************************************************************************************************************************************* task path: /Users/alfredo/python/upstream/ceph/src/ceph-volume/ceph_volume/tests/functional/lvm/.tox/xenial-filestore-dmcrypt/tmp/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:6 FAILED - RETRYING: get osd id (10 retries left). FAILED - RETRYING: get osd id (9 retries left). FAILED - RETRYING: get osd id (8 retries left). FAILED - RETRYING: get osd id (7 retries left). FAILED - RETRYING: get osd id (6 retries left). FAILED - RETRYING: get osd id (5 retries left). FAILED - RETRYING: get osd id (4 retries left). FAILED - RETRYING: get osd id (3 retries left). FAILED - RETRYING: get osd id (2 retries left). FAILED - RETRYING: get osd id (1 retries left). ok: [osd0] => { "attempts": 10, "changed": false, "cmd": "ls /var/lib/ceph/osd/ \| sed 's/.*-//'", "delta": "0:00:00.002717", "end": "2018-01-21 18:10:31.237933", "failed": true, "failed_when_result": false, "rc": 0, "start": "2018-01-21 18:10:31.235216" } STDOUT: 0 1 2 Expected results: There aren't any (or just a few) timeouts while the OSDs are found Additional info: This is happening because the check is mapping the number of "devices" defined for ceph-disk (in this case it would be 0) to match the number of OSDs found. Basically this line: until: osd_id.stdout_lines\|length == devices\|unique\|length Means in this 2 OSD case it is trying to ensure the following incorrect condition: until: 2 == 0 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1537103	2018-01-30 14:44:38 +01:00
Andrew Schoen	79473badfe	ceph-osd: adds dmcrypt to the lvm scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-24 14:10:08 +01:00
Andrew Schoen	6cbb56a3b6	ceph-osd: adds the crush_device_class param to the lvm scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-17 13:49:29 +01:00
Sébastien Han	6db4aea453	osd: skip devices marked as '/dev/dead' On a non-collocated scenario, if a drive is faulty we can't really remove it from the list of 'devices' without messing up or having to re-arrange the order of the 'dedicated_devices'. We want to keep this device list ordered. This will prevent the activation failing on a device that we know is failing but we can't remove it yet to not mess up the dedicated_devices mapping with devices. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-01-11 17:34:32 +01:00
Guillaume Abrioux	70401f955b	container: trigger handlers on systemd file change When a systemd unit file is changed we should trigger handlers to restart the services. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-10 16:46:42 +01:00
Guillaume Abrioux	895949d6c4	osd: fix check gpt the gpt label creation doesn't work even with parted module. This commit fixes the gpt label creation by using parted command instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-12-20 17:42:45 +01:00
Konstantin Shalygin	d7dadc3e7b	ceph-osd: respect nvme partitions when device is a disk.	2017-12-12 09:03:18 +01:00
Andrew Schoen	788c3f351a	ceph-osd: adds osd_objectstore to the name when using the ceph_volume module This allows for easier debugging if verbosity is not set high enough. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-12-11 09:58:06 -06:00
Andrew Schoen	5e3d8dbf63	ceph-osd: use the cluster param with the ceph_volume module Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-12-11 09:58:06 -06:00
Andrew Schoen	423166f671	ceph-osd: use the new ceph_volume module for the lvm scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-12-11 09:58:06 -06:00
Andy McCrae	4f1e854c79	Use parted module instead of command	2017-12-11 17:33:40 +10:00
Guillaume Abrioux	b449b16edd	Merge pull request #2215 from squidboylan/support_loopback_devices Add support for using loopback devices as OSDs	2017-11-28 14:04:47 +01:00
Caleb Boylan	8f02bb007f	Add support for using loopback devices as OSDs This is particularly useful in CI environments where you dont have the option of adding extra devices or volumes to the host. It is also a simple change to support loopback devices	2017-11-27 16:02:36 -08:00
Guillaume Abrioux	1cba626484	osd: remove leftover and fix a typo This task was originally needed to fix a docker installation issue (see: #1030). This has been fixed, therefore it can be removed. Fixes: #2199 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-21 11:11:34 +01:00
Guillaume Abrioux	efe06be10f	osd: ensure a gpt label is set on device ceph-disk prepare will fail on jewel if a GPT label is not present on device. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-17 17:32:23 +01:00
Sébastien Han	932345ab2a	osd: remove leftover from osd partition We used to support osds that are a partition. This is long gone so removing this task. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-11-16 14:58:40 +01:00
Sébastien Han	b1c1322357	osd: remove failed_when on activation There is no need to continue if the activation fails. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-11-16 14:57:49 +01:00
Sébastien Han	80d3a242d0	osd: fix bad activation for dmcrypt We were activating dmcrypt devices with the wrong command. Basically the first task execute the wrong activate command. The task fails but continues because of the 'failed_when: false'. Then the right activation sequence is being done by the next task. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-11-16 14:55:08 +01:00
Andrew Schoen	3c604f1115	lvm: support --data as a raw device or partition in ceph-volume Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-11-15 09:36:17 -06:00
Andrew Schoen	04f02910a9	lvm: ensure the data_vg exists before using it Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-11-15 09:36:17 -06:00
Guillaume Abrioux	aa0b1ed118	tests: remove OSD_FORCE_ZAP variable from tests according to ceph/ceph-container#840, this variable is no longer needed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-14 17:55:01 +01:00
Guillaume Abrioux	0369bd59e2	Merge pull request #2146 from mslovy/wip-fix-crush-location osd: fix crush location for non-containerized deployment	2017-11-13 12:23:44 +01:00
Guillaume Abrioux	c06faf2deb	Merge pull request #2154 from ceph/fix_auto_discover osd: avoid using non desired loop device in autodiscovery	2017-11-10 01:19:20 +01:00
Guillaume Abrioux	591d77220e	osd: always run disk_list test there is no need to have a condition on this task, this test should be always run since the result will be interpreted later. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-09 11:51:16 +01:00
Guillaume Abrioux	43975a7332	osd: avoid using non desired loop device in autodiscovery This will prevent ceph-ansible from using a loop device while it shouldn't in auto_discovery mode. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-09 10:26:24 +01:00
Guillaume Abrioux	d5dfc63c89	osd: fix automatic prepare when auto_discover Use `devices` variable instead of `ansible_devices`, otherwise it means we are not using the devices which have been 'auto discovered' Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-08 10:20:44 +01:00
yaoning	d82a09dddd	fix crush location for non-containerized deployment crush location only set for containerized deployment Signed-off-by: yaoning <yaoning@unitedstack.com>	2017-11-08 12:05:10 +11:00
Sébastien Han	0930f14915	osd: do not use dm when osd_auto_discovery The current code will also return lvm devices such as /dev/dm-2, this kind of device type is not supported by ceph-disk at the moment. Now we just ignore them. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-11-08 11:33:10 +11:00

1 2 3 4 5 ...

440 Commits (675b6788f4d0473fc407f889534d6b0a046e4ba8)