ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	f203031f88	iscsi: expose /dev/log in the container During its initialisation both rbd-target-api and rbd-target-gw try to open /dev/log for their syslog handler. If the device is not present the service fails to start. Thus expose /dev/log from the host in the container solves that problem. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	4e5d862bb7	testinfra: linting Make flake8 happy on the testinfra files. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	dcc765d7c7	testinfra: add support for podman Since we are now testing on docker and podman our functional tests must reflect that. So now, if we detect the podman binary we will use it, otherwise we default to docker. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	f5c2ca3710	ceph_key: fix rstrip for python 3 Removing bytes literals since rstrip only supports type String or None. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	7100cc5e77	test_lookup_ceph_initial_entities: fix The previous dict was missing 2 entities: * client.bootstrap-mgr * client.bootstrap-rbd-mirror So the test was failing since it expects 7 entities to match. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	d9ac9d466c	test_build_key_path_bootstrap_osd: fix The entity name is client.bootstrap-osd (as returned by Ceph), and not bootstrap-osd. The build_key_path function splits 'client.bootstrap-osd' on the '.' so using bootstrap-osd fails with index out of range. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	1afa4c5c95	ceph_key: remove set-uid support The support of set-uid was removed from Ceph during the Nautilus cycle by the following commit: d6def8ba1126209f8dcb40e296977dc2b09a376e so this will not work anymore when deploying Nautilus clusters and above. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	f192bc92a2	ceph_key: use the right container runtime binary Rework all the ceph_key invocation to use either docker or podman binary. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	6cca37b683	client: do not use a dummy container anymore Since 84fcf4639140c390a7f1fcd790ba190503713f86 we now use the container binary cli to create ceph keys instead of creating a container and 'docker execing' into it. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	a96e910114	Add new container scenario Test with podman instead of docker and also support for python 3 only. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	bc6e652a1c	ceph_key: rework container support Previously, we were doing a 'docker exec' inside a mon container, this worked but this wasn't ideal since it required a mon to be up to generate keys. We must be able to generate a key without a running mon, e.g., when we create the initial key or simply when you want to generate a key from any node that is not a mon. Now, just like the ceph_volume module we use a 'docker run' command with the right binary as an entrypoint to perform the chosen action, this is more elegant and also only requires an env variable to be set in the playbook: CEPH_CONTAINER_IMAGE. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	a9b337ba66	handler: show unit logs on error This will tremendously help debugging daemons that fail on restart by showing the systemd unit logs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 11:00:37 +00:00
Guillaume Abrioux	83a67648d8	validate: add nautilus release validate must accept ceph nautilus release. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-27 08:26:43 +00:00
Andrew Schoen	e13f32c1c5	ceph-volume: be idempotent when the batch strategy changes If you deploy with 2 HDDs and 1 SSD then on each subsequent deploy both HDD drives will be filtered out, because they're already used by ceph. ceph-volume will report this as a 'strategy change' because the device list went from a mixed type of HDD and SSD to a single type of only SSD. This situation results in a non-zero exit code from ceph-volume. We want to handle this situation gracefully and report that nothing will be changed. A similar json structure to what would have been given by ceph-volume is returned in the 'stdout' key. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650306 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-11-26 23:23:50 +00:00
Sébastien Han	997667a873	osd: expose udev into the container In order to be able to retrieve udev information, we must expose its socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-26 18:57:12 +00:00
Guillaume Abrioux	7c99b6df6d	update: fix a typo `hostvars[groups[mon_host]]['ansible_hostname']` seems to be a typo. That should be `hostvars[mon_host]['ansible_hostname']` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-26 18:22:20 +01:00
Guillaume Abrioux	f290e49df8	tests: do not fully override previous ceph_conf_overrides We run an initial deployment with `osd_pool_default_size: 1` in `ceph_conf_overrides`. When re-running the playbook to test idempotency and handlers, we reset `ceph_conf_overrides`, we must append a new value instead of just overwriting it, otherwise, this can lead to error in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-26 18:22:20 +01:00
Guillaume Abrioux	af78173584	rolling_update: refact set_fact `mon_host` each monitor node should select another monitor which isn't itself. Otherwise, one node in the monitor group won't set this fact and causes failure. Typical error: ``` TASK [create potentially missing keys (rbd and rbd-mirror) when mon is containerized] * task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-update_docker_cluster/rolling_update.yml:200 Thursday 22 November 2018 14:02:30 +0000 (0:00:07.493) 0:02:50.005 *** fatal: [mon1]: FAILED! => {} MSG: The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'mon2' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-26 18:22:20 +01:00
Sébastien Han	4e267bee4f	rolling_update: create rbd and rbd-mirror keyrings During an upgrade ceph won't create keys that were not existing on the previous version. So after the upgrade of, let's say, Jewel to Luminous, once all the monitors have the new version they should get or create the keys. It's ok to have the task fail, especially for the rbd-mirror key, which only appears in Nautilus. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-26 18:22:20 +01:00
Sébastien Han	691f373543	ceph_key: add a get_key function When checking if a key exists we also have to ensure that the key exists on the filesystem, the key can change on Ceph but still have an outdated version on the filesystem. This solves this issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-26 18:22:20 +01:00
Sébastien Han	c14f9b78ff	switch: do not look for devices anymore It's easier to look up a directory instead of the block devices, especially because ceph-volume and ceph-disk have a different way to handle devices. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-23 07:56:23 +00:00
Sébastien Han	cd56dad9fa	switch: disable all ceph units Prior to this commit we were only disabling ceph-osd units, but forgot the ceph.target which is controlling everything and will restart the ceph-osd units at each reboot. Now that everything gets disabled there won't be any conflicts between the old non-container and the new container units. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-23 07:56:23 +00:00
Sébastien Han	fe1d09925a	switch: do not mask systemd unit If we mask it we won't be able to start the OSD container since now the osd container uses the osd ID as a name such as: ceph-osd@0 Fixes the error: Failed to execute operation: Cannot send after transport endpoint shutdown Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-23 07:56:23 +00:00
Guillaume Abrioux	5601af8de2	tests: change default pools size default pool size in our test should be explicitly set to 1 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	ed42262b37	client: change default pool size default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	6d1fe32998	defaults: change default size for openstack pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	fdc438dd0d	defaults: change for default pool size for cephfs_pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 18:23:07 +00:00
Guillaume Abrioux	f1735e9bb0	defaults: add ceph related vars file This is to add a granularity level. We can have ceph specific variables that user shouldn't have to change here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	7774069d45	refact osd pool size customization Add real default value for osd pool size customization. Ceph itself has an `osd_pool_default_size` default value to `3`. If users don't specify a pool size in various pools definition within ceph-ansible, we should default to `3`. By the way, this kind of condition isn't really clear: ``` when: - rbd_pool_size | default ("") ``` we should try to get the customized value then default to what is in `osd_pool_default_size` (which has its default value pointing to `ceph_osd_pool_default_size` (`3`) as well) and compare it to `ceph_osd_pool_default_size`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	d4c0960f04	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 15:42:50 +00:00
Guillaume Abrioux	68dde424f6	config: convert _osd_memory_target to int ceph.conf doesn't accept float value. Typical error seen: ``` $ sudo ceph daemon osd.2 config get osd_memory_target Can't get admin socket path: unable to get conf option admin_socket for osd.2: parse error setting 'osd_memory_target' to '7823740108,8' (strict_si_cast: unit prefix not recognized) ``` This commit ensures the value inserted in ceph.conf will be an integer. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-21 14:33:27 +00:00
Sébastien Han	e7b3d3e014	site: resync container playbook This PR https://github.com/ceph/ceph-ansible/pull/3251 forgot to create a symlink from site-docker.yml.sample to site-container.yml.sample. This commit resyncs and put the symlink in place. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-20 18:57:55 +01:00
Boris Ranto	dfab42a21f	defaults/facts: Use list instead of keys It is safer to use the list filter than the keys() method since the keys method does have some interoperability issues between python2 and python3 based ansible/jinja. Signed-off-by: Boris Ranto <branto@redhat.com>	2018-11-20 18:48:22 +01:00
Boris Ranto	c2b0cbd699	start_osds: Use list instead of keys If you use python3 based ansible then keys() returns a dict_keys object, not a list of keys. This breaks the installation on such a system. Using the list filter provides a more robust solution that should work on both python2 and python3 based ansible. You can find some more information about the issue, here: https://github.com/ansible/ansible/issues/19514 Signed-off-by: Boris Ranto <branto@redhat.com>	2018-11-20 18:48:22 +01:00
Valentin Lorentz	30ce7e84f4	Discover rbd facts. Signed-off-by: Valentin Lorentz <progval+git@progval.net>	2018-11-20 15:06:01 +01:00
Dan Mick	a2349f05ac	validate plugin: handle missing exception fields without traceback "missing variable" errors introduced by PR3058 would attempt to be reported, but since the exception contained no "path" definition, would cause a second exception in the Invalid exception handler. Make the exception handler verify that any field it tries to use exists, clean up its message formatting, and reduce the verbose level to see the literal error from notario in case more goes wrong in future. Signed-off-by: Dan Mick <dan.mick@redhat.com>	2018-11-19 22:01:07 +00:00
Neha Ojha	10538e9a23	osd_memory_target: standardize unit and fix calculation * The default value of osd_memory_target used by ceph is 4294967296 bytes, so use the same as ceph-ansible default. * Convert ansible_memtotal_mb to bytes to calculate osd_memory_target Signed-off-by: Neha Ojha <nojha@redhat.com>	2018-11-19 09:54:33 +00:00
Guillaume Abrioux	d20e1960ff	doc: update doc to add stable-3.2 information Since the branch has been created, we must reflect it in the doc. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-19 09:00:17 +00:00
Sébastien Han	976b66842f	ceph.ceph-container-common remove symlink This error was introduced in the recent refactor of ceph-docker-common in https://github.com/ceph/ceph-ansible/pull/3251. However, the Ansible galaxy linter is not happy about it and fails importing the role. Removing this since it's not used anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-18 21:54:46 +01:00
Guillaume Abrioux	393ab94728	client: fix a typo in create_users_keys.yml `cd1e4ee024` introduced a typo. This commit fixes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-17 17:31:29 +00:00
Guillaume Abrioux	63b9835cbb	infra: don't restart firewalld if unit is masked if firewalld.service systemd unit is masked, the handler will fail when trying to restart it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650281 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-16 09:30:17 +00:00
Noah Watkins	64dee9be0c	Remove outdated documentation Fixes BZ https://bugzilla.redhat.com/show_bug.cgi?id=1640525 Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-11-15 22:26:19 +00:00
Sébastien Han	661213a6a8	tox: add lvm setup to shrink mon Fix shrink mon scenario by setting lvm so we can configure ceph-volume lvm osds. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-13 18:31:10 +00:00
Guillaume Abrioux	f7fcc012e9	osd: commonize start_osd code since `ceph-volume` introduction, there is no need to split those tasks. Let's refact this part of the code so it's clearer. By the way, this was breaking rolling_update.yml when `openstack_config: true` playbook because nothing ensured OSDs were started in ceph-osd role (In `openstack_config.yml` there is a check ensuring all OSD are UP which was obviously failing) and resulted with OSDs on the last OSD node not started anyway. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	3ac6619fb9	tests: set pool size to 1 in ceph-override.json setting this setting to 1 makes the CI covering the related code in the playbook without breaking the upgrade scenarios. Those scenarios were broken because there is a check `TASK [waiting for clean pgs...]` in rolling_update.yml, since the pool size for `cephfs_metadata` and `cephfs_data` are updated to `2` in `ceph-override.json` and there is not enough osd to honor this size, some PGs are degraded and make the mentioned check failing. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	bbade5ee0a	site-docker: rename to 'site-container.yml.sample' Add a symlink for backward compatibility Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	c783bc70da	docker-common: rename role rename `ceph-docker-common` role to `ceph-container-common` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	8069c25ba5	docker-common: remove system_checks.yml This check is now part of `ceph-validate`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	1144df2852	docker-common: remove check_mandatory_vars.yml this is part of `ceph-validate` role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
Guillaume Abrioux	6947f9c3ea	docker-common: remove dirs_permissions.yml this is already done in `ceph-config` role. Let's remove this duplicated task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-12 10:51:48 +01:00
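
Commits dfab42a21f and c2b0cbd699 above replace `keys()` with the `list` filter because, under Python 3, `keys()` returns a `dict_keys` view rather than a list. The following is a minimal plain-Python sketch of that behaviour, outside of Ansible/Jinja; the dictionary contents and the function names are hypothetical and are not taken from the ceph-ansible code.

```python
# Hypothetical mapping, standing in for a dict whose keys a template indexes.
osd_units = {"ceph-osd@0": "sda", "ceph-osd@1": "sdb"}


def keys_is_subscriptable(mapping):
    """Return True if mapping.keys() can be indexed directly (Python 2 style)."""
    try:
        mapping.keys()[0]
        return True
    except TypeError:
        # Python 3: dict_keys is a view object and does not support indexing.
        return False


def first_key(mapping):
    """Materialize the keys into a list first, which works on Python 2 and 3."""
    return list(mapping)[0]


if __name__ == "__main__":
    print(keys_is_subscriptable(osd_units))
    print(first_key(osd_units))
```

Converting to a list before indexing is the plain-Python counterpart of piping a dict through the `list` filter in a template.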


4318 Commits (c0ad91957c55fb13d87f494778c338f5080f52d6)
