ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	c3a2320e01	revert infra: don't restart firewalld if unit is masked If firewalld unit is masked, setting `configure_firewall: false` is enough Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655059 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1cff1f9806`)	2018-12-04 17:31:31 +01:00
Ramana Raja	0ec2ac34e3	rolling_update: fail if less than 3 MONs ... for non-containerized deployments as well. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655470 Signed-off-by: Ramana Raja <rraja@redhat.com> (cherry picked from commit `cb784c601d`)	2018-12-04 16:34:57 +01:00
Sébastien Han	50fe56044e	disable nfs scenario The packages are broken, so let's remove it until this is solved. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `a502327e52`)	2018-12-04 14:39:05 +00:00
Sébastien Han	fa8bd10cac	test: disable nfs for containers Based on https://github.com/ceph/ceph-container/pull/1269 and given there are no stable packages or reliable repository, we disable nfs ganesha temporarily. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `6c3ef90ebe`)	2018-12-04 14:39:05 +00:00
Sébastien Han	8d1c67beb2	osd: discover osd_objectstore on the fly Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for existing clusters as their config will be changed. Typically, if an OSD was prepared with ceph-disk on filestore and we change the default objectstore to bluestore, the activation will fail. The flag osd_objectstore should only be used for the preparation, not activation. The activation in this case detects the OSD objectstore, which prevents failures like the one described above. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `4c51130198`)	2018-12-04 09:01:50 +00:00
Sébastien Han	1151521784	ceph-osd: change jinja condition If an existing cluster runs this config, and has ceph-disk OSD, the `expose_partitions` won't be expected by jinja since it's inside the 'old' if. We need it as part of the osd_scenario != 'lvm' condition. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `bef522627e`)	2018-12-04 09:01:50 +00:00
Sébastien Han	729744c6a8	rolling_update: do not fail on missing keys We don't want to fail on keys that are not present since they will get created after the mons are updated. They will be created by the task "create potentially missing keys (rbd and rbd-mirror)". Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `ebc901c6af`)	2018-12-03 13:03:33 +01:00
Noah Watkins	e8b10f47dc	rgw: use correct default rgw frontend address since 0.0.0.0 is the default radosgw address (not 'address'), not configuring an address explicitly, and instead configuring the radosgw interface, would result in 0.0.0.0 being used, instead of falling through to section that inspects the interface config option. backport note: this cannot be cherry-picked from master since this code doesn't exist in master. fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1655131 Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-12-01 20:09:46 +00:00
Ramana Raja	788723cc22	tox.ini: setup LVs in OSD hosts for '*-cluster' scenarios ... as the scenarios set up ceph clusters with LVM OSDs. Closes: https://github.com/ceph/ceph-ansible/issues/3399 Signed-off-by: Ramana Raja <rraja@redhat.com>	2018-11-30 16:24:49 +00:00
Sébastien Han	452069cb3a	osd: manage legacy ceph-disk non-container startup The code is now able (again) to start OSDs that were configured with ceph-disk on a non-container scenario. Closes: https://github.com/ceph/ceph-ansible/issues/3388 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 23:30:21 +01:00
Guillaume Abrioux	8d93007e56	config: write jinja comment with appropriate syntax jinja comment should be written using the jinja syntax `{# ... #}` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a86c2b8526`)	2018-11-29 21:19:41 +01:00
Sébastien Han	2cea33f7fc	rolling_update: default ceph json output to empty dict So we can avoid the following failure: The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"] ' failed. The error was: No JSON object could be decoded We just need to set a default, the next iteration will have a more complete json since the command won't fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	316e49c6d7	client: change default pool size default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ed42262b37`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	1077ae0060	defaults: change default size for openstack pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6d1fe32998`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	a4db9bd6e8	defaults: change for default pool size for cephfs_pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fdc438dd0d`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	65699e4558	defaults: add ceph related vars file This is to add a granularity level. We can have ceph specific variables that user shouldn't have to change here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f1735e9bb0`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	f0195e97ed	refact osd pool size customization Add real default value for osd pool size customization. Ceph itself has an `osd_pool_default_size` default value of `3`. If users don't specify a pool size in the various pool definitions within ceph-ansible, we should default to `3`. By the way, this kind of condition isn't really clear: ``` when: - rbd_pool_size | default ("") ``` we should try to get the customized value then default to what is in `osd_pool_default_size` (which has its default value pointing to `ceph_osd_pool_default_size` (`3`) as well) and compare it to `ceph_osd_pool_default_size`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7774069d45`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	68b2ad11ee	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d4c0960f04`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	e8dd6b8993	tests: change default pools size default pool size in our test should be explicitly set to 1 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	292d967d2f	update: fix a typo `hostvars[groups[mon_host]]['ansible_hostname']` seems to be a typo. That should be `hostvars[mon_host]['ansible_hostname']` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7c99b6df6d`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	30cec03ae7	tests: do not fully override previous ceph_conf_overrides We run an initial deployment with `osd_pool_default_size: 1` in `ceph_conf_overrides`. When re-running the playbook to test idempotency and handlers, we reset `ceph_conf_overrides`; we must append a new value instead of just overwriting it, otherwise this can lead to errors in the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f290e49df8`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	1f4cf61058	rolling_update: refact set_fact `mon_host` each monitor node should select another monitor which isn't itself. Otherwise, one node in the monitor group won't set this fact and causes failure. Typical error: ``` TASK [create potentially missing keys (rbd and rbd-mirror) when mon is containerized] * task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-update_docker_cluster/rolling_update.yml:200 Thursday 22 November 2018 14:02:30 +0000 (0:00:07.493) 0:02:50.005 *** fatal: [mon1]: FAILED! => {} MSG: The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'mon2' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `af78173584`)	2018-11-29 01:49:05 +00:00
Sébastien Han	d4f1f12bd0	rolling_update: create rbd and rbd-mirror keyrings During an upgrade ceph won't create keys that did not exist on the previous version. So after the upgrade of, let's say, Jewel to Luminous, once all the monitors have the new version they should get or create the keys. It's ok to have the task fail, especially for the rbd-mirror key, which only appears in Nautilus. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `4e267bee4f`)	2018-11-29 01:49:05 +00:00
Sébastien Han	ee96454980	ceph_key: add a get_key function When checking if a key exists we also have to ensure that the key exists on the filesystem, the key can change on Ceph but still have an outdated version on the filesystem. This solves this issue. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `691f373543`)	2018-11-29 01:49:05 +00:00
Sébastien Han	26ea96424c	switch: do not look for devices anymore It's easier to look up a directory instead of the block devices, especially because ceph-volume and ceph-disk have a different way to handle devices. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `c14f9b78ff`)	2018-11-29 00:31:47 +01:00
Sébastien Han	57ac7b94c0	switch: disable all ceph units Prior to this commit we were only disabling ceph-osd units, but forgot the ceph.target which is controlling everything and will restart the ceph-osd units at each reboot. Now that everything gets disabled there won't be any conflicts between the old non-container and the new container units. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `cd56dad9fa`)	2018-11-29 00:31:47 +01:00
Sébastien Han	8d0379b4d9	switch: do not mask systemd unit If we mask it we won't be able to start the OSD container since now the osd container use the osd ID as a name such as: ceph-osd@0 Fixes the error: Failed to execute operation: Cannot send after transport endpoint shutdown Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `fe1d09925a`)	2018-11-29 00:31:47 +01:00
Sébastien Han	9b5a93e3a5	osd: re-introduce disk_list check This commit `4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)` accidentally removed this check. This is a must have for ceph-disk based containerized OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 00:31:13 +01:00
Guillaume Abrioux	659f2c60b5	validate: change default value for `radosgw_address` change default value of `radosgw_address` to keep consistency with `monitor_address`. Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine if it has to run `check_eth_rgw.yml`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e4869ac8bd`)	2018-11-28 23:54:06 +01:00
Guillaume Abrioux	968e6f5854	tests: rgw_multisite allow clusters to talk to each other Adding this rule on the hypervisor will allow cluster to talk to each other. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `96ce8761ba`)	2018-11-28 23:53:58 +01:00
Guillaume Abrioux	133615471a	tests: set pool size to 1 in ceph-override.json setting this to 1 lets the CI cover the related code in the playbook without breaking the upgrade scenarios. Those scenarios were broken because there is a check `TASK [waiting for clean pgs...]` in rolling_update.yml: since the pool size for `cephfs_metadata` and `cephfs_data` is updated to `2` in `ceph-override.json` and there are not enough OSDs to honor this size, some PGs are degraded and make the mentioned check fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3ac6619fb9`)	2018-11-28 23:11:46 +01:00
Guillaume Abrioux	4cc1506303	osd: commonize start_osd code since `ceph-volume` introduction, there is no need to split those tasks. Let's refact this part of the code so it's clearer. By the way, this was breaking rolling_update.yml when `openstack_config: true` playbook because nothing ensured OSDs were started in ceph-osd role (In `openstack_config.yml` there is a check ensuring all OSD are UP which was obviously failing) and resulted with OSDs on the last OSD node not started anyway. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f7fcc012e9`)	2018-11-28 23:11:46 +01:00
Guillaume Abrioux	b72d806f4c	mgr: fix mgr keyring error on rolling_update when upgrading from RHCS 2.5 to 3.2, it fails because the task `create ceph mgr keyring(s) when mon is containerized` has a when condition `inventory_hostname == groups[mon_group_name]|last`. First, this is incorrect because `inventory_hostname` is referring to a mgr node, it means this condition would have never been satisfied. Then, this condition + `serial: 1` makes the mgr keyring creation skipped on the first node. Further, the `ceph-mgr` role tries to copy the mgr keyring (it's not aware we are running `serial: 1`) this leads to a failure like the following: ``` TASK [ceph-mgr : copy ceph keyring(s) if needed] ************************************************************************************************************************************************************************************************************************************************************************* task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10 Tuesday 27 November 2018 12:03:34 +0000 (0:00:00.296) 0:11:01.290 **** An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring' failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"} ``` The ceph_key module is idempotent, so there is no need to have such a condition. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1649957 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `73287f91bc`)	2018-11-28 23:11:46 +01:00
Guillaume Abrioux	3ead8a2586	tests: apply dev_setup on the secondary cluster for rgw_multisite we must apply this playbook before deploying the secondary cluster. Otherwise, there will be a mismatch between the two deployed cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `3d8f4e6304`)	2018-11-28 12:56:57 +00:00
Sébastien Han	2fca8555cc	handler: show unit logs on error This will tremendously help debugging daemons that fail on restart by showing the systemd unit logs. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `a9b337ba66`)	2018-11-27 12:44:15 +00:00
Andrew Schoen	59524c7246	ceph-volume: be idempotent when the batch strategy changes If you deploy with 2 HDDs and 1 SSD then on each subsequent deploy both HDD drives will be filtered out, because they're already used by ceph. ceph-volume will report this as a 'strategy change' because the device list went from a mixed type of HDD and SSD to a single type of only SSD. This situation results in a non-zero exit code from ceph-volume. We want to handle this situation gracefully and report that nothing will be changed. A similar json structure to what would have been given by ceph-volume is returned in the 'stdout' key. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650306 Signed-off-by: Andrew Schoen <aschoen@redhat.com> (cherry picked from commit `e13f32c1c5`)	2018-11-27 00:23:21 +00:00
Guillaume Abrioux	1a1886a442	config: convert _osd_memory_target to int ceph.conf doesn't accept float value. Typical error seen: ``` $ sudo ceph daemon osd.2 config get osd_memory_target Can't get admin socket path: unable to get conf option admin_socket for osd.2: parse error setting 'osd_memory_target' to '7823740108,8' (strict_si_cast: unit prefix not recognized) ``` This commit ensures the value inserted in ceph.conf will be an integer. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `68dde424f6`)	2018-11-21 15:35:55 +00:00
Guillaume Abrioux	abdc245ceb	infra: don't restart firewalld if unit is masked if firewalld.service systemd unit is masked, the handler will fail when trying to restart it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650281 (cherry picked from commit `63b9835cbb`) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-19 17:32:44 +01:00
Neha Ojha	c96af4bac9	osd_memory_target: standardize unit and fix calculation * The default value of osd_memory_target used by ceph is 4294967296 bytes, so use the same as ceph-ansible default. * Convert ansible_memtotal_mb to bytes to calculate osd_memory_target Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `10538e9a23`)	2018-11-19 10:51:05 +00:00
Guillaume Abrioux	f5d8701ed8	client: fix a typo in create_users_keys.yml `cd1e4ee024` introduced a typo. This commit fixes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `393ab94728`)	2018-11-17 20:59:11 +00:00
Guillaume Abrioux	62d2ddafd4	validate: allow stable-3.2 to run with ansible 2.4 Although this is not officially supported, this commit allows `stable-3.2` to run against ansible 2.4. This should ease the transition in RHOSP. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-16 08:57:00 +00:00
Jason Dillaman	3b40e2bc87	igw: add support for IPv6 Signed-off-by: Jason Dillaman <dillaman@redhat.com> (cherry picked from commit `0aff0e9ede`) Conflicts: library/igw_purge.py: trivial resolution roles/ceph-iscsi-gw/library/igw_purge.py: trivial resolution	2018-11-13 17:35:58 +00:00
Mike Christie	702f2baccc	igw: open iscsi target port Open the port the iscsi target uses for iscsi traffic. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `5ba7d1671e`)	2018-11-12 10:46:41 +00:00
Mike Christie	44ee5c7495	igw: use api_port variable for firewall port setting Don't hard code api port because it might be overridden by the user. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `e2f1f81de4`)	2018-11-12 10:46:41 +00:00
Mike Christie	db576f6f0e	igw: fix firewall iscsi_group_name check The firewall setup for igw is not getting set up because iscsi_group_name does not exist. It should be iscsi_gw_group_name. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `a4ff52842c`)	2018-11-12 10:46:41 +00:00
Mike Christie	c843ea1d92	igw: Fix default api port The default igw api port is 5000 in the manual setup docs and ceph-iscsi-config package so this syncs up ansible. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `a10853c5f8`)	2018-11-12 10:46:41 +00:00
VasishtaShastry	f17140c03d	ceph-validate : Added functions to accept true and false ceph-validate used to throw an error when flags were set to 'true' or 'false' instead of True and False. Now the user can set the flags 'dmcrypt' and 'osd_auto_discovery' as 'true' or 'false'. Will fix - Bug 1638325 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `098f42f233`)	2018-11-09 16:47:57 +00:00
Rishabh Dave	a74f4204cd	remove configuration files for ceph packages on ubuntu clusters For apt-get, purge command needs to be used, instead of remove command, to remove related configuration files. Otherwise, packages might be shown as installed while running dpkg command even after removing them. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1640061 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `640cad3fd8`)	2018-11-09 16:50:25 +01:00
Mike Christie	77de54025b	igw: stop tcmu-runner on iscsi purge When the iscsi purge playbook is run we stop the gw and api daemons but not tcmu-runner which I forgot on the previous PR. Fixes Red Hat BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1621255 Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `b523a44a1a`)	2018-11-09 16:50:04 +01:00
Guillaume Abrioux	93cdbddd78	tests: test ooo_collocation against v3.0.3 ceph-container image Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `811f043947`)	2018-11-09 16:48:35 +01:00
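
Commits `c96af4bac9` and `1a1886a442` in the table above describe two constraints on `osd_memory_target`: the value is expressed in bytes (so `ansible_memtotal_mb` has to be converted), and ceph.conf does not accept a float. The Python sketch below only illustrates those two constraints. The function name, the `num_osds` and `safety_factor` parameters, and the choice to use Ceph's default as a floor are assumptions made for illustration; they are not taken from the ceph-ansible templates.

```python
# Illustrative sketch only. The function name, num_osds, safety_factor and the
# "default as a floor" rule are assumptions, not ceph-ansible's actual logic.

# Default osd_memory_target used by Ceph, in bytes (stated in commit c96af4bac9).
CEPH_DEFAULT_OSD_MEMORY_TARGET = 4294967296


def osd_memory_target_bytes(memtotal_mb, num_osds, safety_factor=0.7):
    """Return a per-OSD memory target as a whole number of bytes."""
    if num_osds < 1:
        raise ValueError("num_osds must be at least 1")
    # Convert megabytes to bytes so the units match Ceph's default.
    total_bytes = memtotal_mb * 1024 * 1024
    share = total_bytes * safety_factor / num_osds
    # ceph.conf rejects float values, so truncate to an integer.
    return max(int(share), CEPH_DEFAULT_OSD_MEMORY_TARGET)


if __name__ == "__main__":
    # Hypothetical host with 16 GiB of RAM and 2 OSDs.
    print(osd_memory_target_bytes(16384, 2))
```

Truncating with `int()` keeps the rendered value free of a decimal separator, which is the failure mode quoted in commit `1a1886a442`.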

4200 Commits (3421cb08d9c96cf45a173e62711c17ec628ef321)