ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	14cd286e3a	test: disable nfs for containers Based on https://github.com/ceph/ceph-container/pull/1269 and given there are no stable packages and reliable repository, we disable nfs ganesha temporarily. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Sébastien Han	896676ee80	fix json data type Json is a type structure which is always typed as a string, where before this we were declaring a dict, which is not a json valid structure. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Sébastien Han	1b6b275229	test: remove leftover [mgrs] Since we now collocated mgrs and mons on the same machine we have to remove the mgrs section, they are not needed anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-04 12:34:54 +01:00
Guillaume Abrioux	b04fe72f35	tests: add purge_lvm_osds_container scenario This commit adds the purge_lvm_osds_container scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-03 17:35:21 +01:00
Guillaume Abrioux	78116fa6db	purge: add iscsi support add iscsi support for both non containerized and containerized deployment in purge playbooks. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-03 17:35:21 +01:00
Sébastien Han	82a6b5adec	osd: manage legacy ceph-disk non-container startup The code is now able (again) to start osds that were configured with ceph-disk on a non-container scenario. Closes: https://github.com/ceph/ceph-ansible/issues/3388 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `452069cb3a`)	2018-12-03 16:01:57 +01:00
Sébastien Han	ec2d1f502d	osd: re-introduce disk_list check This commit `4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)` accidentally removed this check. This is a must have for ceph-disk based containerized OSDs. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `9b5a93e3a5`)	2018-12-03 16:01:57 +01:00
Sébastien Han	4c51130198	osd: discover osd_objectstore on the fly Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for existing clusters as their config will be changed. Typically, if an OSD was prepared with ceph-disk on filestore and we change the default objectstore to bluestore, the activation will fail. The flag osd_objectstore should only be used for the preparation, not activation. The activate in this case detects the osd objectstore which prevents failures like the one described above. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:11:47 +00:00
Sébastien Han	bef522627e	ceph-osd: change jinja condition If an existing cluster runs this config, and has ceph-disk OSD, the `expose_partitions` won't be expected by jinja since it's inside the 'old' if. We need it as part of the osd_scenario != 'lvm' condition. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:11:47 +00:00
Sébastien Han	bf375327a0	ceph-mgr: refact role for containers Now we simplify the invocation of start and remove some code and the directory 'docker'. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	23f685b352	ceph_key: allow setting 'dest' to a file This is useful in situations where you fetch the key from the mon store and want to write the file with a different name to a dedicated directory. This is important when fetching the mgr key, they are created as mgr.ceph-mon2 but we want them in /var/lib/ceph/mgr/ceph-ceph-mon0/keyring Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	14fc5bad12	mon: do not serialized container bootstrap This commit unifies the container and non-container code, which in the meantime gives us the ability to deploy N mon containers at the same time without having to serialize the deployment. This will drastically reduce the time needed to bootstrap the cluster. Note, this is only possible since Nautilus because the monitors bootstrap the initial keys on their own once they reach quorum. In the Nautilus version of the ceph-container mon, we stopped generating the keys 'manually' from inside the container, for more detail see: https://github.com/ceph/ceph-container/pull/1238 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	61082b3b32	mgr: only copy keys with dedicated mgr When collocating mon and mgr, the mgr container will attempt to create its own key since it has the admin key at its disposal. Also at this point there is nothing to fetch since the key is not created by the mons, as mentioned above the mgr creates the key on its own. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	1c760904b0	site: collocated mon and mgr by default This will speed up the deployment and also deploy mon and mgr collocated just as recommended. This won't prevent you from adding more dedicated machines for mgr if needed. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	a502327e52	disable nfs scenario The packages are broken, so let's remove it until this is solved. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	ee1905ad31	mon: add missing include_tasks instead of import_tasks This was probably a leftover/mistake so let's fix this and make the file consistent. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	7cb1040440	config: add missing bootstrap mgr directory This directory is needed so we can fetch the bootstrap mgr key in it. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	8d4de44f5d	mon: default ceph_health_raw to json During the first iteration, the command won't return anything, or can simply fail and might not return a valid json structure. Ansible will fail parsing it in the filter `from_json` so let's default that variable to empty dictionary. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	cfac79bec4	container-common: remove old check This removes a bit of unnecessary code, the check was always wrong because of the condition 'not ceph_current_status.get('rc', 1) == 0' It will never match since `Not` is used for bool and we are checking for an rc. Also, even though the check would work, this will be a major blocker for a complete meltdown. If the whole platform is shutdown then nothing will be up but files will be present, so this check is definitely wrong. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	bb7bfca113	rolling-update: remove old condition This failure condition was only valid at the time where clusters didn't have ceph-mgr activated. Now since we collocate the ceph-mgr with the mon by default, if the daemon wasn't present it will be created during the upgrade. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	a42ba03d71	ceph_volume: fix unit tests Fix the container_binary to use by mocking the CEPH_CONTAINER_BINARY env variable. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	3d0670b41c	ceph_key: apply permissions using ansible code module Instead of applying file permissions from our code, let's rely on the ansible code 'file' module for this. This is now handled at the task declaration level instead of inside the module. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	7ac73202f7	fw: update rules for mon/mgr collocation Since we now deploy mgr on mon we need to open fw rules so the mgr can reach out to the osds. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	5b9d8f9737	mon: remove old ubuntu login status We don't support Ubuntu Precise, so this feature does not exist anymore. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	acc92626f6	sites: fail the playbook on any failure We need to apply any_errors_fatal: true to every play so it can take effect, not only on the initial pass. With this flag, any error in the playbook will cause the playbook to stop. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	5816cd4101	site-container: retry image pull Sometimes pulling an image fails because of a network hiccup, so let's retry 5 times at a 5 sec interval. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	bd98193ce4	travis: run modules unit tests Travis now runs our modules unit tests to make sure they always pass. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Sébastien Han	a0e5ef8516	mon: secure cluster on container Add the ability to protect pools on containerized clusters. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-12-03 14:39:43 +01:00
Guillaume Abrioux	ccc0c9c24c	osd: remove a leftover this file is never included in ceph-osd, looks like a leftover let's remove it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-03 09:12:02 +01:00
Guillaume Abrioux	0187166926	osd: remove an incorrect information This is false, `./defaults/main.yml` is not supposed to be modified directly. `group_vars` and/or `host_vars` should always be preferred. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-12-03 08:11:35 +00:00
Guillaume Abrioux	fead0813b4	remove kv store support the next stable release will drop this feature. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-30 13:45:12 +00:00
Guillaume Abrioux	a952122c38	rolling_update: create missing keyring only on running mon try to create the potentially missing keys only on monitors that are actually running. The current node being played is stopped before this task. By the way, delegating the command on all nodes but the current node being played ensures that the generated keys will be present on all monitors. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-29 16:40:46 +00:00
Christian Berendt	1f73a9900f	Add missing space before }} This will fix the following yamllint warning: Variables should have spaces after {{ and before }} Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>	2018-11-29 16:04:05 +01:00
Guillaume Abrioux	a86c2b8526	config: write jinja comment with appropriate syntax jinja comment should be written using the jinja syntax `{# ... #}` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-29 15:48:23 +01:00
Sébastien Han	61fb6972ec	rolling_update: default ceph json output to empty dict So we can avoid the following failure: The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"] ' failed. The error was: No JSON object could be decoded We just need to set a default, the next iteration will have a more complete json since the command won't fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 10:46:15 +00:00
Guillaume Abrioux	e4869ac8bd	validate: change default value for `radosgw_address` change default value of `radosgw_address` to keep consistency with `monitor_address`. Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine if it has to run `check_eth_rgw.yml`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 23:13:38 +01:00
Guillaume Abrioux	96ce8761ba	tests: rgw_multisite allow clusters to talk to each other Adding this rule on the hypervisor will allow clusters to talk to each other. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 21:53:59 +00:00
Guillaume Abrioux	5d05a09b03	tests: update default pg num and pool size for podman scenario bring the recent refact about `osd_pool_default_pg_num` and `osd_pool_default_size` into podman scenario as well. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 11:22:04 +00:00
Guillaume Abrioux	5af5ad6d61	tests: fix image tag for secondary rgw cluster (rgw_multisite) the first cluster is using `latest-master` while the second is using `latest` which is not the right version to be used here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 11:22:04 +00:00
Guillaume Abrioux	3d8f4e6304	tests: apply dev_setup on the secondary cluster for rgw_multisite we must apply this playbook before deploying the secondary cluster. Otherwise, there will be a mismatch between the two deployed clusters. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-28 11:22:04 +00:00
Guillaume Abrioux	73287f91bc	mgr: fix mgr keyring error on rolling_update when upgrading from RHCS 2.5 to 3.2, it fails because the task `create ceph mgr keyring(s) when mon is containerized` has a when condition `inventory_hostname == groups[mon_group_name]|last`. First, this is incorrect because `inventory_hostname` is referring to a mgr node, it means this condition would have never been satisfied. Then, this condition + `serial: 1` causes the mgr keyring creation to be skipped on the first node. Further, the `ceph-mgr` role tries to copy the mgr keyring (it's not aware we are running `serial: 1`) this leads to a failure like the following: ``` TASK [ceph-mgr : copy ceph keyring(s) if needed] ************************************************************************************************************************************************************************************************************************************************************************* task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10 Tuesday 27 November 2018 12:03:34 +0000 (0:00:00.296) 0:11:01.290 **** An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring' failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"} ``` The ceph_key module is idempotent, so there is no need to have such a condition. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1649957 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-27 18:19:56 +01:00
Sébastien Han	bc2daaeb71	ceph-osd fix batch with container binary Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	aa086f1a47	ceph_key: fix after rebase Fix the tests Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	80ba45793d	fix template generation Position the right condition on ceph_docker_version, activate it when the container_binary is 'docker'. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	00ebdeff78	container-common: remove leftover ntp installation is managed by the ceph-infra role. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	e5d5dffeb5	shrink-osd: add missing CEPH_BINARY We need to add the right binary to do the docker exec. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Guillaume Abrioux	3684d421e4	defaults: play set_radosgw_address.yml only on rgw nodes There is no need to play these tasks on nodes that are not in the rgw group. Always playing this code makes `shrink_mon.yml` fail. Typical error: ``` TASK [ceph-defaults : set_fact _radosgw_address to radosgw_interface - ipv4] * task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-shrink_mon/roles/ceph-defaults/tasks/set_radosgw_address.yml:21 Thursday 22 November 2018 12:34:51 +0000 (0:00:00.154) 0:00:12.371 *** fatal: [localhost]: FAILED! => {} MSG: The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1' ``` Indeed, `radosgw_interface` is the network interface on rgw only. It is expected that this same interface doesn't exist on `localhost`, so, when running `shrink_mon.yml`, the role `ceph-defaults` is called in `hosts: localhost` and causes the playbook to fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	4f57e44f9c	defaults: declare container_binary Always declare container_binary and assign it a correct value. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	ac3e18e4c1	ceph-defaults: use podman on Fedora only It seems Atomic 7.5 has podman already, however this is an old version (0.4). The podman integration is targeting RHEL 8, so Fedora is currently the closest to that. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
Sébastien Han	69d97f6480	site: symlink site-docker to site-container We deprecated site-docker to site-container so let's have a symlink for backward compatibility. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-27 16:47:40 +00:00
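
Commits 8d4de44f5d and 61fb6972ec both describe the same idea: when a health command returns nothing (or something that is not valid JSON) on the first iteration, substitute an empty JSON document so the `quorum_names` lookup does not fail. The following is a minimal Python sketch of that idea, not the playbook's actual Jinja code. The function names `parse_health` and `in_quorum` are hypothetical, and the default is kept as the string `"{}"` to follow the note in 896676ee80 that JSON is carried as a string.

```python
import json


def parse_health(raw_stdout, default="{}"):
    """Parse command output as JSON, falling back to a default document.

    Hypothetical helper: `default` is a JSON *string*, not a dict.
    """
    text = raw_stdout.strip() if raw_stdout else ""
    try:
        return json.loads(text or default)
    except ValueError:
        # Output was present but not valid JSON; use the default instead.
        return json.loads(default)


def in_quorum(raw_stdout, hostname, fqdn):
    """Return True if either name is listed under "quorum_names"."""
    names = parse_health(raw_stdout).get("quorum_names", [])
    return hostname in names or fqdn in names
```

With an empty or unparsable output, `in_quorum` simply reports `False`, so a retry loop can run again until the command returns a complete document.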


4371 Commits (a1a871cadee5e86d181e1306c985e620b81fccac)
