ceph-ansible

Author	SHA1	Message	Date
Vishal Kanaujia	44d514850a	Rolling upgrades: Migrate to ceph-key module This change moves ceph-mgr upgrades to using ceph-key library. Fixes: #2758 Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>	2018-07-03 18:22:14 +02:00
Sébastien Han	20c8065e48	ceph-iscsi: rename group iscsi_gws Let's try to avoid using dashes as testinfra needs to be able to read the groups. Typically, with iscsi-gws we can't add a marker for these iscsi nodes, using an underscore fixes the issue. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-06-08 10:21:54 +02:00
Guillaume Abrioux	232a16d77f	rolling_update: fix facts gathering delegation this is kind of follow up on what has been made in #2560. See #2560 and #2553 for details. Closes: #2708 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-06-06 16:36:30 +08:00
Vishal Kanaujia	08d9432454	Rolling upgrades should use norebalance flag for OSDs The rolling upgrades playbook should have norebalance flag set for OSDs upgrades to wait only for recovery. Fixes: #2657 Signed-off-by: Vishal Kanaujia <vishal.kanaujia@flipkart.com>	2018-06-04 10:59:01 +02:00
Sébastien Han	e91648a7af	rolling_update: add role ceph-iscsi-gw Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1575829 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-05-26 02:38:47 -07:00
Sébastien Han	da5b104098	rolling_update: fix get fsid for containers When running ansible2.4-update_docker_cluster there is an issue on the "get current fsid" task. The current task only works for non-containerized deployment but will run all the time (even for containerized). This currently results in the following error: TASK [get current fsid] ****************************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-luminous-ansible2.4-update_docker_cluster/rolling_update.yml:214 Tuesday 22 May 2018 22:48:32 +0000 (0:00:02.615) 0:11:01.035 ********* fatal: [mgr0 -> mon0]: FAILED! => { "changed": true, "cmd": [ "ceph", "--cluster", "test", "fsid" ], "delta": "0:05:00.260674", "end": "2018-05-22 22:53:34.555743", "rc": 1, "start": "2018-05-22 22:48:34.295069" } STDERR: 2018-05-22 22:48:34.495651 7f89482c6700 0 -- 192.168.17.10:0/1022712 >> 192.168.17.12:6789/0 pipe(0x7f8944067010 sd=4 :42654 s=1 pgs=0 cs=0 l=1 c=0x7f894405d510).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000 2018-05-22 22:48:34.495684 7f89482c6700 0 -- 192.168.17.10:0/1022712 >> 192.168.17.12:6789/0 pipe(0x7f8944067010 sd=4 :42654 s=1 pgs=0 cs=0 l=1 c=0x7f894405d510).fault This is not really representative on the real error since the 'ceph' cli is available on that machine. On other environments we will have something like "command not found: ceph". Signed-off-by: Sébastien Han <seb@redhat.com>	2018-05-23 04:44:12 +02:00
Sébastien Han	d80a871a07	rolling_update: move osd flag section During a minor update from a jewel to a higher jewel version (10.2.9 to 10.2.10 for example) osd flags don't get applied because they were done in the mgr section which is skipped in jewel since this daemon does not exist. Moving the set flag section after all the mons have been updated solves that problem. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1548071 Co-authored-by: Tomas Petr <tpetr@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com>	2018-05-17 08:17:16 +02:00
Guillaume Abrioux	1b4c3f292d	rolling_update: fix dest path for mgr keys fetching the role `ceph-mgr` that is played later in the playbook fails because the destination path for the fetched keys is wrong. This patch fixes the destination path used in the task `fetch ceph mgr key(s)` so there is no mismatch. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-15 19:30:34 +02:00
Guillaume Abrioux	3b89f1bfb1	rolling_update: get fsid in mgr pre_task {{ fsid }} points to {{ cluster_uuid.stdout }} which is not defined in this part of the rolling_update playbook. Since we need to call {{ fsid }} we must get the fsid and register it to `cluster_uuid`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-05-15 09:01:42 +02:00
Sébastien Han	52fc8a0385	rolling_update: move mgr key creation Until all the mons have been updated to Luminous, there is no way to create a key. So we should do the key creation in the mon role only if we are not part of an update. If we are then the key creation is done after the mons upgrade to Luminous. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-05-15 09:01:42 +02:00
Guillaume Abrioux	c04e67347c	update: look for short and fqdn in ceph_health_raw According to hostname configuration, the task waiting for mons to be in quorum might fail. The idea here is to look for both shortname and fqdn in `ceph_health_raw` instead of just `ansible_hostname` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1546127 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-02-19 10:27:47 +01:00
Andrew Schoen	699c777e68	rolling update: fix undefined jewel_minor_update failure Variables set at the play level with ``vars`` do not carry over into the next play in the playbook. The var jewel_minor_update was set in a previous play but used in this one and was failing because it was not defined. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1544029 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-02-13 17:03:05 +01:00
Guillaume Abrioux	c7ec12d49c	upgrade: skip luminous tasks for jewel minor update These tasks are needed only when upgrading to luminous. They are not needed in Jewel minor upgrade and by the way, they fail because `ceph versions` command doesn't exist. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-01-25 18:30:34 +01:00
Sébastien Han	8af7459476	rolling update: add mgr exception for jewel minor updates When updating from one minor Jewel version to another, the playbook will fail on the task "fail if no mgr host is present in the inventory". This can now be worked around by running Ansible with -e jewel_minor_update=true Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1535382 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-01-18 14:06:05 +01:00
Andrew Schoen	997edea271	rolling_update: do not fail the playbook if nfs-ganesha is not present The rolling update playbook was attempting to stop the nfs-ganesha service on nodes where jewel is still installed. The nfs-ganesha service did not exist in jewel so the task fails. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2018-01-06 14:07:55 +01:00
Sébastien Han	200785832f	rolling_update: do not require root to answer question There is no need to ask for root on the local action. This will prompt for a password if the current user is not part of sudoers. That's unnecessary anyway. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1516947 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-12-19 14:04:55 +01:00
Sébastien Han	4413511b66	all: backward compatibility between stable-2.2 and 3.0 stable-3.0 brought numerous changes in ceph-ansible variables, this PR aims to maintain backward compatibility for someone running stable-2.2 upgrading to stable-3.0 but keeps its groups_vars untouched. We will then determine the right options to make sure the upgrade works but we are expecting that new variables should be used. We will drop this in a near future, maybe 3.1 or 3.2. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-20 11:54:10 +02:00
Guillaume Abrioux	982326373b	upgrade: fix upgrade jewel to luminous for nfs nodes nfs nodes can't be upgraded from jewel to luminous because ceph-nfs role is skipped because of the condition `when: "ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. Indeed, package is upgraded in `ceph-nfs` role, therefore, `ceph_release` is still set to the old version. It means the when can't be satisfied. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-19 20:54:23 +02:00
Guillaume Abrioux	70034451e9	upgrade: fix upgrade jewel to luminous for mgr nodes mgr nodes can't be upgraded from jewel to luminous because ceph-mgr role is skipped because of the condition `when: "ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. Indeed, ceph-mgr package is upgraded in `ceph-mgr` role, therefore, `ceph_release` is still set to the old version. It means the when can't be satisfied. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 302e563601cd6820b1ae44fabdfb1506688c7c9b)	2017-10-19 20:54:23 +02:00
Sébastien Han	d920d4839d	upgrade: support for rbd mirror and nfs - Add upgrade support for rbd mirror and nfs daemons. - Only works with systemd (remove sysvinit and upstart occurrences) - A bit of cleanup Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-17 10:54:47 +02:00
Major Hayden	c01851325e	Remove jinja2 delimiters from `when` keys This patch changes the `when:` keys so that they have no jinja2 delimiters. This avoids Ansible warnings which could turn into errors in a future Ansible release.	2017-10-12 11:27:42 -05:00
Sébastien Han	774697ebd8	infra: use the pg check in the right place Use the pg check before doing the pg check, not on the quorum check. Also never quote int when doing comparison. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-09 17:25:41 +02:00
Sébastien Han	05f26031ea	rolling_update: perform pg check when pgs_num > 0 If num_pgs = 0 the check will never return 0. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:09 +02:00
Sébastien Han	99466e79a1	upgrade: a support for mgrs Also we now play ceph-config to have everything being generated for new daemons bootstrap during upgrade. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1497959 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 16:57:31 +02:00
Sébastien Han	b9050d6229	update: fix var register Even if the task is skipped, ansible registers the var as 'skipped' so this task the task using this variable for its next usage. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 14:27:55 +02:00
Sébastien Han	a0a5b174ba	rolling_update: clarify mon quorum command Cleaner. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 01:19:46 +02:00
Sébastien Han	bd5471b940	update: complete luminous upgrade Once we complete the upgrade to Luminous, we must issue a specific command. For more info read: http://ceph.com/community/new-luminous-upgrade-complete/ Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-28 21:05:00 +02:00
Sébastien Han	68f1f99ee9	update: nicer way to wait for clean pgs More comprehensive and friendly to read. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-28 14:46:26 +02:00
Guillaume Abrioux	7195b08718	update: update rgw systemd unit name The old name is used in `rolling_update.yml` and `purge-docker-cluster.yml`, it breaks the `test_rgw_service_is_running()` test. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 14:58:55 +02:00
Sébastien Han	92f9be963b	rolling_update: clarify update doc Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490188 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-13 15:46:29 -06:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Sébastien Han	0205f6d645	rolling_update: nicer way to set osd flags Prior to this patch, we were applying the osd flags like this: " General pre tasks Set flags Upgrade OSDs on a host Unset flags <-- this triggers pending scrub to start Set flags Upgrade OSDs on a hosts Unset flags <-- this triggers pending scrub to start . . . General post tasks " Now instead, we apply the flag once before starting the OSD update and unset them once the last OSD is finished. " General pre tasks Set flags and wait for any scrubs to finish Upgrade OSDs on a host Upgrade OSDs on a host . . . Unset flags General post tasks " Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1450754 Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-25 18:21:28 +02:00
Sébastien Han	4a4a20f07d	rolling update: skip pg check if num_pgs = 0 In our test case we don't have any pgs, thus the check fails. The check always returns an empty array, which makes the comparison fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-24 08:50:49 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there are no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	5adbf0fdaa	Move role dependencies in site.yml/site-docker.yml This will give us more flexibility and avoid a lot of useless when skipping all tasks from a non-desired role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:14 +02:00
Guillaume Abrioux	206c7a16d0	rolling_update: refact code Refact rolling_update playbook. Add ceph-client upgrade. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 11:10:51 +02:00
Guillaume Abrioux	828f88403e	Update: Avoid screen scraping in rolling update since luminous has revamped the `ceph -s` output, we need to avoid screen scraping. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:39 +02:00
Guillaume Abrioux	73141118d0	Make the new check PGs working with /bin/sh The new test in the checks PGs are no longer working on distributions where /bin/sh isn't linked to /bin/bash. Fix: #1619 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-22 17:59:38 +02:00
Andrew Schoen	e2104acb62	rolling_update: set health_mon_check_delay to 15 The old value of 10 did not give enough time for a containerized mon to pass the health check. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-13 08:56:44 -05:00
Guillaume Abrioux	5af9bb432c	rewrite check pgs clean tasks Avoid screen scraping by rewriting `waiting for clean pgs` tasks like it is done in `304de48`. Use the json output returned by `ceph -s` instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-13 09:48:56 +02:00
Sébastien Han	fdc7866072	Merge pull request #1469 from ceph/refact_code Docker: Refact code	2017-06-02 12:40:25 +02:00
Guillaume Abrioux	ddfe019342	Refact code `ceph-docker-common`: At the moment there is a lot of duplicated tasks in each `./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in `./roles/ceph-docker-common/tasks/main.yml`. `_containerized_deployment` variables: All `_containerized_deployment` have been refactored to a single variable `containerized_deployment` duplicate `cephx` variables in `group_vars/* have been removed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-05-24 15:55:41 +02:00
Sébastien Han	90389864d8	rolling-update: set/unset flags on the right container Problem: we are delegating the set/unset flag to a monitor node but we try to call an osd container Solution: use the right container name. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-05-22 09:38:08 +02:00
Sébastien Han	dfd8f4d96e	test: add mgr section to the host inventory file Without this, we don't test the mgr role so we need to add it. Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com>	2017-04-15 00:16:10 +02:00
Sébastien Han	c37aaa41f4	playbook: homogenize the way list osd ids Problem: too many different commands to do the same thing. The 'cut' command on infrastructure-playbooks/purge-cluster.yml was also wrong. This sed command from osixia in ceph-docker https://github.com/ceph/ceph-docker/pull/580/ addresses all the scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-03-30 11:51:38 +02:00
Konstantin Shalygin	1662976fc0	Resolve issues when groups names not in default value.	2017-03-27 21:44:30 +07:00
Anthony D'Atri	6c4911276e	Enhance clean PG check to catch active+clean+scrubbing and active+clean+scrubbing+deep Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>	2017-03-19 00:23:26 -07:00
Sébastien Han	8320c14191	Merge pull request #1317 from ibotty/harmonize-docker-names harmonize docker names	2017-03-14 18:20:20 +01:00
Andrew Schoen	46f26bec13	rolling-update: do not set group name vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
yangyimincn	8b36cbac64	Update rolling_update.yml In the task waiting for the monitor to join the quorum, the result of ceph -s | grep monmap only contains the monmap, not the quorum: # ceph -s --cluster ceph | grep monmap monmap e1: 3 mons at {sh-office-ceph-1=10.12.10.34:6789/0,sh-office-ceph-2=10.12.10.35:6789/0,sh-office-ceph-3=10.12.10.36:6789/0} To get the monitors in quorum, use this instead: # ceph -s --cluster ceph | grep election election epoch 80, quorum 0,1 sh-office-ceph-1,sh-office-ceph-2 ceph version: 10.2.5	2017-02-28 16:56:02 +08:00
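
Several of the commits above (5af9bb432c, 4a4a20f07d, 6c4911276e) concern the "waiting for clean pgs" check and its move from screen scraping to the JSON output of `ceph -s`. The following is a minimal Python sketch of that idea, not the playbook's actual task. The function name `pgs_are_clean` is hypothetical, and the JSON field names (`pgmap`, `num_pgs`, `pgs_by_state`, `state_name`, `count`) are assumed from typical `ceph -s --format json` output and may differ between releases.

```python
import json

# States treated as "clean" in this sketch; scrubbing variants are included
# because commit 6c4911276e extends the check to catch them (assumption:
# the state names are spelled exactly like this in the JSON output).
CLEAN_STATES = {
    "active+clean",
    "active+clean+scrubbing",
    "active+clean+scrubbing+deep",
}


def pgs_are_clean(status_json: str) -> bool:
    """Return True when every PG in the given status JSON is in a clean state."""
    pgmap = json.loads(status_json)["pgmap"]
    num_pgs = pgmap.get("num_pgs", 0)
    if num_pgs == 0:
        # A cluster without PGs has nothing to wait for (see 4a4a20f07d).
        return True
    clean = sum(
        entry["count"]
        for entry in pgmap.get("pgs_by_state", [])
        if entry["state_name"] in CLEAN_STATES
    )
    # Compare integers directly, never their string forms.
    return clean == num_pgs
```

A caller would feed it the captured output of the status command and retry with a delay until it returns True.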

77 Commits (0c863a37839249845d10dc1c39e6853331a5f209)