ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	fa675f2ead	purge-docker-cluster: ensure old logs are removed purge-docker-cluster must remove all osd_disk_prepare logs in `{{ ceph_osd_docker_run_script_path }}`, otherwise if you purge your cluster and try to redeploy it, osds will fail to start because it will try to find a partition uuid which doesn't exist. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1510470 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-11-09 17:49:20 +01:00
Caleb Boylan	41d10a2f64	infra: fix take-over-existing-cluster.yml playbook The ansible inventory could have more than just ceph-ansible hosts, so we shouldn't use "hosts: all", also only grab one file when getting the ceph cluster name instead of failing when there is more than one file in /etc/ceph. Also fix location of the ceph.conf template	2017-11-06 15:00:30 -08:00
Sébastien Han	473673ab41	shrink-mon: fix typo in the code doc Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-27 11:59:22 +02:00
Sébastien Han	2837d0a22e	purge: do not reboot by default Rebooting servers is really intrusive and perhaps this is not what the operator wants. So we disable the reboot by default now. Note that the reboot might not happen all the time. It can be enabled by default by running the purge playbook with -e reboot_osd_node=True Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1505011 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-26 14:18:38 +02:00
Guillaume Abrioux	f90f2f3a04	purge: containers are not stopped During purge osd, the containers are not stopped because of a typo, as a result, all the devices can't be unmounted later. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-25 07:58:00 +02:00
Sébastien Han	4413511b66	all: backward compatibility between stable-2.2 and 3.0 stable-3.0 brought numerous changes in ceph-ansible variables, this PR aims to maintain backward compatibility for someone running stable-2.2 upgrading to stable-3.0 but keeps its groups_vars untouched. We will then determine the right options to make sure the upgrade works but we are expecting that new variables should be used. We will drop this in a near future, maybe 3.1 or 3.2. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-20 11:54:10 +02:00
Guillaume Abrioux	982326373b	upgrade: fix upgrade jewel to luminous for nfs nodes nfs nodes can't be upgraded from jewel to luminous because ceph-nfs role is skipped because of the condition `when: "ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. Indeed, package is upgraded in `ceph-nfs` role, therefore, `ceph_release` is still set to the old version. It means the when can't be satisfied. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-10-19 20:54:23 +02:00
Guillaume Abrioux	70034451e9	upgrade: fix upgrade jewel to luminous for mgr nodes mgr nodes can't be upgraded from jewel to luminous because ceph-mgr role is skipped because of the condition `when: "ceph_release_num[ceph_release] >= ceph_release_num.luminous"`. Indeed, ceph-mgr package is upgraded in `ceph-mgr` role, therefore, `ceph_release` is still set to the old version. It means the when can't be satisfied. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 302e563601cd6820b1ae44fabdfb1506688c7c9b)	2017-10-19 20:54:23 +02:00
Sébastien Han	d920d4839d	upgrade: support for rbd mirror and nfs - Add upgrade support for rbd mirror and nfs daemons. - Only works with systemd (remove sysvinit and upstart occurrence) - A bit of cleanup Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-17 10:54:47 +02:00
Sébastien Han	39bf102b64	switch: nicer way to check mon quorum re-use the same syntax as rolling_update.yml Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-17 10:54:36 +02:00
Sébastien Han	b685aceede	Merge pull request #2044 from major/avoid-jinja-in-when Remove jinja2 delimiters from `when` keys	2017-10-12 22:23:06 +02:00
Major Hayden	c01851325e	Remove jinja2 delimiters from `when` keys This patch changes the `when:` keys so that they have no jinja2 delimiters. This avoids Ansible warnings which could turn into errors in a future Ansible release.	2017-10-12 11:27:42 -05:00
Major Hayden	33b200d43a	Suppress yum/dnf/rpm command warnings Ansible throws warnings when using yum/dnf/rpm with the command module: [WARNING]: Consider using yum module rather than running yum This patch adds the `warn: no` argument to suppress the warnings in the Ansible output.	2017-10-12 08:38:05 -05:00
Sébastien Han	13bce287ad	infra: replace osd playbook This playbook can replace failed OSD in containerized and non-containerized env. The current limitation is that it won't allow you to choose between filestore/bluestore and will do collocation as well. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-12 11:53:30 +02:00
Sébastien Han	85e13a864c	purge-iscsi: fix group name Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1500281 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-11 12:52:12 +02:00
Sébastien Han	24b82c2679	purge: fix journal purge Using a condition when osd_scenario == 'non-collocated' was wrong since these partitions can be collocated on a single device also. Removing the check makes the purge of these partitions. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1499871 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-10 09:57:39 +02:00
Guillaume Abrioux	f147b119ed	Merge pull request #2014 from ceph/fixes-2 infra: use the pg check in the right place	2017-10-09 20:14:06 +02:00
Sébastien Han	450108fab9	infra: add independent purge-iscsi-gateways.yml The current inclusion of purge-iscsi-gateways.yml in purge-cluster.yml is not working well and blocking the CI too. So removing it from purge-cluster.yml and re-add the original purge-iscsi-gateways.yml. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-09 17:25:44 +02:00
Sébastien Han	774697ebd8	infra: use the pg check in the right place Use the pg check before doing the pg check, not on the quorum check. Also never quote int when doing comparison. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-09 17:25:41 +02:00
Sébastien Han	a3e7bcb13f	Merge pull request #2013 from ceph/wip-purge-cluster A couple of purge cluster fixes	2017-10-09 17:18:30 +02:00
Sébastien Han	33a3aa0dda	switch: check pgs only when num_pgs > 0 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:42:09 +02:00
Sébastien Han	05f26031ea	rolling_update: perform pg check when pgs_num > 0 If num_pgs = 0 the check will never return 0. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:09 +02:00
Sébastien Han	c3c63ae539	switch: rework and fix clean pg wait Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:09 +02:00
Sébastien Han	c693e95cbf	purge-docker: rework device detection we don't need "devices" and other device variable anymore, the playbook detects that for us. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 03:39:04 +02:00
Sébastien Han	2fb4981ca9	shrink-osd: admin key not needed for container shrink Also do some clean Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-07 00:20:43 +02:00
Boris Ranto	64e272d818	purge-cluster: Do not use shell for rm The shell wildcard expansion of non-existing paths fails on zsh making the whole script fail. We can use file module with with_fileglob to alleviate the problem instead. Signed-off-by: Boris Ranto <branto@redhat.com>	2017-10-06 22:54:37 +02:00
Boris Ranto	f696cb7637	purge-cluster: Do not fail on systemd commands The systemd can't stop services if the unit files were removed before the cluster was purged. We should just ignore these. Signed-off-by: Boris Ranto <branto@redhat.com>	2017-10-06 22:52:56 +02:00
Sébastien Han	b6b24a5ca9	iscsi: fix wrong group name for iscsi Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-05 17:25:32 +02:00
Sébastien Han	f37e014a65	Merge pull request #1974 from ceph/mgr-upgrade-luminous upgrade: a support for mgrs	2017-10-03 19:57:31 +02:00
Sébastien Han	99466e79a1	upgrade: a support for mgrs Also we now play ceph-config to have everything being generated for new daemons bootstrap during upgrade. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1497959 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 16:57:31 +02:00
Sébastien Han	3bd341f6c0	osd: container use id instead of dev name Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1494127 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-10-03 14:44:00 +02:00
Sébastien Han	3c2c31a591	Merge pull request #1964 from vatelzh/master purge-cluster: delete block partitions if using bluestore	2017-10-02 12:10:26 +02:00
Sébastien Han	b9050d6229	update: fix var register Even if the task is skipped, ansible registers the var as 'skipped' so this task the task using this variable for its next usage. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 14:27:55 +02:00
zhangwentao	86a6db0d58	purge-cluster: delete block partitions if using bluestore	2017-09-29 14:04:17 +08:00
Sébastien Han	a0a5b174ba	rolling_update: clarify mon quorum command Cleaner. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-29 01:19:46 +02:00
Sébastien Han	bd5471b940	update: complete luminous upgrade Once we complete the upgrade to Luminous, we must issue a specific command. For more info read: http://ceph.com/community/new-luminous-upgrade-complete/ Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-28 21:05:00 +02:00
Sébastien Han	68f1f99ee9	update: nicer way to wait for clean pgs More comprehensive and friendly to read. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-28 14:46:26 +02:00
Andrew Schoen	fccc604f4a	purge-cluster: default lvm_volumes if not defined Most osd scenarios do not use lvm_volumes, so default it in purge-cluster.yml if it's not defined. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-09-26 15:14:29 -05:00
Guillaume Abrioux	fcb6454e04	rbd-mirror: fix systemd unit in purge-docker rbd-mirror containers are not stopped in purge-docker-cluster playbook because of the wrong name used. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 21:18:50 +02:00
Guillaume Abrioux	c80ba7a307	purge: implement mgr purge until now, mgr nodes are not managed by purge-cluster.yml, therefore it breaks scenario like purge_cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 21:18:50 +02:00
Guillaume Abrioux	7195b08718	update: update rgw systemd unit name The old name is used in `rolling_update.yml` and `purge-docker-cluster.yml`, it breaks the `test_rgw_service_is_running()` test. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-24 14:58:55 +02:00
Sébastien Han	6bac613611	shrink: support for container We can now shrink mon and osds on containerized deployment. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492115 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-20 16:25:07 +02:00
Sébastien Han	7fedc8ebf4	Merge pull request #1891 from ceph/clarify-update rolling_update: clarify update doc	2017-09-15 07:08:49 -06:00
Sébastien Han	fe1d84d395	Merge pull request #1892 from ceph/purge-dmcrypt-col purge: only purge specific directories for mon	2017-09-13 17:57:06 -06:00
Sébastien Han	ba3e3b6cc7	purge: only purge specific directories for mon Handles the case when a mon is collocated with an OSD. Closes: https://github.com/ceph/ceph-ansible/issues/1877 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-13 17:07:04 -06:00
Sébastien Han	82c4848ec4	Merge pull request #1885 from ceph/shrink-osd shrink-osd: fix when multiple osds	2017-09-13 16:12:49 -06:00
Sébastien Han	92f9be963b	rolling_update: clarify update doc Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490188 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-13 15:46:29 -06:00
Sébastien Han	3031e51778	shrink-osd: fix when multiple osds The loop was not being built properly so we were always getting the last item as osd host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490355 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-13 15:20:11 -06:00
Sébastien Han	aa364264cd	resync ceph-iscsi-gw with old upstream Taken from https://github.com/pcuzner/ceph-iscsi-ansible/tree/tcmu-fixes Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1454945 and https://bugzilla.redhat.com/show_bug.cgi?id=1484083 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-12 18:06:10 -06:00
Sébastien Han	477f86e305	switch to container: fix ceph nfs The service is nfs-ganesha where ceph-nfs@{{ ansible_hostname }} will be the name of the container. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-08 22:43:50 +02:00
Sébastien Han	fdacac9fa0	switch: make osd collection idempotent This commits allows us to run switch-from-non-containerized-to-containerized-ceph-daemons.yml multiple times. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489353 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-08 11:31:47 +02:00
Sébastien Han	e46440e19c	switch-from-non-containerized-to-containerized: fix devices If devices is passed through an extra var this register won't work so let's only register the var if devices is not defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489099 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-07 23:18:14 +02:00
Sébastien Han	b9ced956d7	purge: get lockbox mountpoint and unmount it Prior command was avoiding the lockbox mountpoint and the playbook was failing with: rmtree failed: [Errno 30] Read-only file system: '/var/lib/ceph/osd-lockbox/4e9d8052-87c2-4fde-a56c-b8c108a3eefc/key-management-mode' Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-07 16:31:31 +02:00
Guillaume Abrioux	d987d26719	tests: force docker variable for switch-to-containers scenario we need to force the value of `docker` variable which is initially set to `false` since it's a migration from non-containerized to containerized cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-06 18:03:52 +02:00
Sébastien Han	b7db600caa	switch-from-non-containerized-to-containerized: mask unit files We must mask the image so we are sure that even if the system reboots then the OSDs won't start. Also remove Ceph udev rules if found on the system prior to deploy containers. If we don't do this we are exposed to conflicts between udev rules and systemd unit files. Also, the CI will now test the migration from a non-containerized cluster to a containerized cluster. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-05 15:20:31 +02:00
Sébastien Han	579b95fd8a	shrink-mon: wait a little bit for the mon to be out Monitor removal from the monmap is not immediate, so let's wait a little bit and then fail if the monitor is still in the monmap. We try twice in total with 10 sec intervals. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-04 23:08:57 +02:00
Sébastien Han	54d7a81241	infra playbook: move untested scenario to a new dir Move untested/with few confidence playbooks in a untested-by-ci directory. Also removing this directory from the package build. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1461551 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-01 19:58:24 +02:00
Sébastien Han	298a63c437	shrink mon and osd Rework shrinking a monitor and an OSD playbook. Also adding test scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-01 19:12:00 +02:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Ben England	617d9ee75d	dont use devices var anymore, works for osd_auto_discover	2017-08-28 17:27:01 -04:00
Sébastien Han	0205f6d645	rolling_update: nicer way to set osd flags Prior to this patch, we were applying the osd flags like this: " General pre tasks Set flags Upgrade OSDs on a host Unset flags <-- this triggers pending scrub to start Set flags Upgrade OSDs on a hosts Unset flags <-- this triggers pending scrub to start . . . General post tasks " Now instead, we apply the flag once before starting the OSD update and unset them once the last OSD is finished. " General pre tasks Set flags and wait for any scrubs to finish Upgrade OSDs on a host Upgrade OSDs on a host . . . Unset flags General post tasks " Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1450754 Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-25 18:21:28 +02:00
Sébastien Han	4a4a20f07d	rolling update: skip pg check if num_pgs = 0 In our test case we don't have any pgs, thus the check fails. The check always returns an empty array, which makes the comparison fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-24 08:50:49 +02:00
Alfredo Deza	e651469a2a	Merge pull request #1797 from ceph/purge-lvm adds purge support for the lvm_osds osd scenario	2017-08-23 14:28:29 -04:00
Sébastien Han	f2499ff5ac	Merge pull request #1788 from ceph/improve-switch switch-from-non-containerized-to-containerized: simplify	2017-08-23 19:47:26 +02:00
Sébastien Han	4f0ecb7f30	switch-from-non-containerized-to-containerized: simplify This commit eases the use of the infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. We basically run it with a couple of pre-tasks and then we let the playbook run the docker roles. It obviously expect to have proper variables configured in order to work. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-23 18:39:45 +02:00
Andrew Schoen	bed57572cc	purge-cluster: adds support for purging lvm osds This also adds a new testing scenario for purging lvm osds Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-23 10:33:35 -05:00
Sébastien Han	1ac0969c28	Merge pull request #1778 from ceph/fix-1770 purge: add ability to purge bluestore osd	2017-08-22 23:56:36 +02:00
Giulio Fidente	2c01de4350	Default cluster to ceph in switch to containers	2017-08-22 13:13:36 +02:00
Giulio Fidente	f0423b1804	Parse ceph_docker_registry in switch to containers Defaults it to docker.io as it was for backward compatibility.	2017-08-22 13:11:27 +02:00
Giulio Fidente	a59b84d5c9	Assume mon_docker_privileged false in switch to containers	2017-08-22 13:01:25 +02:00
Giulio Fidente	0106fa6835	Consume public_network vs ceph_mon_docker_subnet In the switch to containers migration there were broken references to ceph_mon_docker_subnet variable, replaced with public_network. Also fixes references to ceph_mon_docker_extra_env setting for it a default as it could be undefined.	2017-08-21 18:34:24 +02:00
Giulio Fidente	386303d42e	Extend set_uid fact to support RH Ceph images	2017-08-21 18:32:08 +02:00
Sébastien Han	9c824b9818	purge: add ability to purge bluestore osd We now purge block db and/or wal partitions if we find any. Closes: https://github.com/ceph/ceph-ansible/issues/1770 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-21 18:08:18 +02:00
Andrew Schoen	d2f4d3666f	Merge pull request #1725 from ceph/simplify-osd-scenario osd: simply osd scenario declaration	2017-08-03 09:31:57 -05:00
Sébastien Han	671f2cd4bc	Merge pull request #1738 from yanyixing/nvmepart fix for nvme part path	2017-08-03 13:37:10 +02:00
yanyx	d506fad056	fix for nvme part path	2017-08-03 17:37:52 +08:00
Sébastien Han	30991b1c0a	osd: simplify scenarios There is only two main scenarios now: * collocated: everything remains on the same device: - data, db, wal for bluestore - data and journal for filestore * non-collocated: dedicated device for some of the component Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-03 10:20:39 +02:00
Sébastien Han	fdc6aebd62	infrastructure-playbooks: update with ceph-defaults roles Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there is no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	5adbf0fdaa	Move role dependencies in site.yml/site-docker.yml This will give us more flexibility and avoid a lot of useless when skipping all tasks from a non-desired role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:14 +02:00
Guillaume Abrioux	206c7a16d0	rolling_update: refact code Refact rolling_update playbook. Add ceph-client upgrade. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 11:10:51 +02:00
yanyx	d0a17b11b2	change the partition's ownership	2017-07-27 11:55:30 +08:00
Sébastien Han	fad9d0caec	Merge pull request #1690 from yanyixing/master fix: when osd device is a disk partition	2017-07-26 15:55:29 +02:00
yanyx	2e6233271e	fix: when osd device is a disk partition	2017-07-25 21:39:43 +08:00
Sébastien Han	0c18cf199e	purge: remove leftover unit files Closes https://github.com/ceph/ceph-ansible/issues/1672 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-25 13:26:28 +02:00
Guillaume Abrioux	828f88403e	Update: Avoid screen scraping in rolling update since luminous has revamped the `ceph -s` output, we need to avoid screen scraping. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:39 +02:00
Guillaume Abrioux	896d62d78b	Refact: remove ceph_mon_docker_interface variable remove `ceph_mon_docker_interface` and use `monitor_interface` instead for both containerized and non-containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-04 18:08:59 +02:00
Guillaume Abrioux	73141118d0	Make the new check PGs working with /bin/sh The new test in the checks PGs are no longer working on distributions where /bin/sh isn't linked to /bin/bash. Fix: #1619 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-22 17:59:38 +02:00
David Galloway	127b5ad9b4	infra: Create a backup of ceph.conf when taking over existing cluster Signed-off-by: David Galloway <dgallowa@redhat.com>	2017-06-21 09:53:09 -04:00
David Galloway	40ed2d7be6	infra: Fix ceph.conf creation when taking over existing cluster Fixes bug introduced in https://github.com/ceph/ceph-ansible/pull/1330 The "stat ceph.conf" task was basically using the stat module on a string instead of the ceph.conf filename. This caused the "generate ceph configuration file" task to fail. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1463382 Signed-off-by: David Galloway <dgallowa@redhat.com>	2017-06-21 09:52:01 -04:00
Andrew Schoen	e2104acb62	rolling_update: set health_mon_check_delay to 15 The old value of 10 did not give enough time for a containerized mon to pass the health check. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-13 08:56:44 -05:00
Guillaume Abrioux	5af9bb432c	rewrite check pgs clean tasks Avoid screen scraping by rewriting `waiting for clean pgs` tasks like it is done in `304de48`. Use the json output returned by `ceph -s` instead Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-13 09:48:56 +02:00
Andrew Schoen	59992c54cc	purge-docker-cluster: include ceph_docker_registry We need to include ceph_docker_registry when removing containers/images because if we don't it will assume docker.io which is not always where the image originated from, causing the playbook to fail. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-02 09:49:17 -05:00
Sébastien Han	fdc7866072	Merge pull request #1469 from ceph/refact_code Docker: Refact code	2017-06-02 12:40:25 +02:00
Andrew Schoen	f7677e4393	purge-docker-cluster: pip is only used on Debian We only need to purge packages installed by pip on Debian systems. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-31 09:03:44 -05:00
Andrew Schoen	8e322d4825	purge-docker-cluster: default raw_journal_devices to [] If we're purging a containerized cluster that did not use the raw_multi_journal OSD scenario then raw_journal_devices will not be defined which causes the playbook to fail. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1455187 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-25 07:30:25 -05:00
Guillaume Abrioux	ddfe019342	Refact code `ceph-docker-common`: At the moment there is a lot of duplicated tasks in each `./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in `./roles/ceph-docker-common/tasks/main.yml`. `_containerized_deployment` variables: All `_containerized_deployment` have been refactored to a single variable `containerized_deployment` duplicate `cephx` variables in `group_vars/* have been removed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-05-24 15:55:41 +02:00
Sébastien Han	90389864d8	rolling-update: set/unset flags on the right container Problem: we are delegating the set/unset flag to a monitor node but we try to call an osd container Solution: use the right container name. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-05-22 09:38:08 +02:00
Sébastien Han	b93ffe637b	Merge pull request #1476 from WingkaiHo/improve-shrink-osd.yml improve shrink-osd.yml can shrink osd when disk damage	2017-04-27 11:01:27 +02:00
WingkaiHo	0b9f322ca0	improve shrink-osd.yml can shrink osd when disk damage	2017-04-27 10:26:26 +08:00
Andrew Schoen	5a3f95dfc1	purge-cluster: check for any running ceph process after purge Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-25 09:30:22 -05:00
Andrew Schoen	26bdd59f5d	purge-cluster: we don't support sysv or upstart anymore Now that ceph-ansible only supports > jewel we don't need to bother with sysv or upstart Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:14:38 -07:00
Andrew Schoen	7ca2bddcce	purge-cluster: do not need to check for running ceph processes Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:12:46 -07:00
Andrew Schoen	aac79df3b3	purge-cluster: no need to remove ceph.target The package uninstalls will stop ceph.target Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:11:03 -07:00
Sébastien Han	dfd8f4d96e	test: add mgr section to the host inventory file Without this, we don't test the mgr role so we need to add it. Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com>	2017-04-15 00:16:10 +02:00
Sébastien Han	17ac1fd464	Merge pull request #1443 from WingkaiHo/osds-journal-migrate Migrate osd(s) journal to ssd	2017-04-13 16:45:57 +02:00
WingkaiHo	9fba41b4ce	Migrate osd(s) journal to ssd	2017-04-13 11:05:58 +08:00
Daniel Lupescu	d5e56c481a	purge-cluster: fix grep match for NVMe and HP Smart Array devices raw_device would return invalid block device names for NVMe and HPSA devices which would cause sgdisk partition deletion to fail $ echo /dev/nvme1n1p3 | egrep -o '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p){1,2}' /dev/nvme1n1p $ echo /dev/cciss/c0d0p2 | egrep -o '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p){1,2}' /dev/cciss/c0d0p	2017-04-11 16:13:28 +03:00
Sébastien Han	c37aaa41f4	playbook: homogenize the way list osd ids Problem: too many different commands to do the same thing. The 'cut' command on infrastructure-playbooks/purge-cluster.yml was also wrong. This sed command from osixia in ceph-docker https://github.com/ceph/ceph-docker/pull/580/ addresses all the scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-03-30 11:51:38 +02:00
Sébastien Han	35a90ae283	Merge pull request #1386 from WingkaiHo/master Create recover-osds-after-ssd-journal-failure.yml	2017-03-28 09:50:39 +02:00
Konstantin Shalygin	1662976fc0	Resolve issues when groups names not in default value.	2017-03-27 21:44:30 +07:00
WingkaiHo	ac1498b0d7	Merge https://github.com/ceph/ceph-ansible	2017-03-27 10:50:38 +08:00
WingkaiHo	ebb56ccebf	command module instead shell	2017-03-23 17:38:41 +08:00
WingkaiHo	2d44c1cee6	remove service enable	2017-03-23 15:28:14 +08:00
WingkaiHo	14c189fee5	break it into lines since you already use the string block syntax and fix disable it here and enable again in later task	2017-03-23 14:49:10 +08:00
WingkaiHo	62c37042fe	remove this detection and simply rely on {{ cluster }}	2017-03-23 09:22:06 +08:00
WingkaiHo	3d10c5981e	fix some spelling mistakes and writing format, use full device path for device name	2017-03-22 17:48:34 +08:00
WingkaiHo	1e670bdeb0	This assumes ceph as a cluster name. We need detect the name of the cluster	2017-03-22 10:09:06 +08:00
WingkaiHo	83a1ac0c67	This assumes ceph as a cluster name. We need detect the name of the cluster	2017-03-22 10:06:11 +08:00
WingkaiHo	19f9e200d7	Add auto detect the ceph cluster name	2017-03-22 10:00:44 +08:00
WingkaiHo	8602166f6e	Ansible will include host_vars/ansible_hostname.yml itself, no need this task IMO.	2017-03-21 13:50:27 +08:00
WingkaiHo	55725fd01d	fix some syntax error	2017-03-21 11:19:25 +08:00
WingKai Ho	7445113dc4	Create recover-osds-after-ssd-journal-failure.yml This playbook use to recover Ceph OSDs after ssd journal failure.	2017-03-21 11:08:25 +08:00
Anthony D'Atri	6c4911276e	Enhance clean PG check to catch active+clean+scrubbing and active+clean+scrubbing+deep Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>	2017-03-19 00:23:26 -07:00
Daniel Marks	77edd3d40a	Fixing tabs that are breaking the syntax check With the merge of PR #1336 the syntax check fails. This commit replaces the tabs with proper indentation.	2017-03-15 14:15:15 +01:00
Sébastien Han	38ab6de602	Merge pull request #1336 from WingkaiHo/master Load a variable file for devices partition	2017-03-15 11:55:26 +01:00
Sébastien Han	8320c14191	Merge pull request #1317 from ibotty/harmonize-docker-names harmonize docker names	2017-03-14 18:20:20 +01:00
Andrew Schoen	e81d690aa0	switch-to-containers: do not include group vars or role defaults Doing so will override any values set for these in the group_vars directory relative to the users inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	cf702b05cf	purge-docker-cluster: do not include role defaults or group vars Doing so at playbook level overrides whatever values might be set for these in the user's group_vars directory that's relative to their inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	aef54d89d9	switch-to-containers: do not set group name vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	7289acb6b3	purge-docker-cluster: do not set group names vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
Andrew Schoen	46f26bec13	rolling-update: do not set group name vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
Andrew Schoen	4fe6607004	purge-cluster: do not set group name vars at playbook level This has the behavior of overriding custom values set in group_vars. I've added defaults to the rest of the group names so that if they are not overridden in group_vars then defaults will be used. See: https://bugzilla.redhat.com/show_bug.cgi?id=1354700 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
WingKai Ho	0d134b4ad9	Update make-osd-partitions.yml change	2017-03-08 17:46:37 +08:00
WingKai Ho	e2d06068f4	Update make-osd-partitions.yml When ansible do not load the file host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml it will show syntactic, so keyword "skip" to fix it. Exit the playbook if the user not define devices in both host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml	2017-03-06 15:43:09 +08:00
WingKai Ho	2861a483d7	Update make-osd-partitions.yml When ansible do not load the file host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml it will show syntactic err, so add keyword "skip" to fix it. Exit the playbook if the user not define devices in both host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml host_vars/default.yml	2017-03-06 10:33:22 +08:00
WingKai Ho	4cc489f2ba	Update make-osd-partitions.yml fix syntactic error	2017-03-03 17:26:53 +08:00
WingKai Ho	102befa927	Update make-osd-partitions.yml Remove capital `L`	2017-03-02 14:06:41 +08:00
WingKai Ho	c3f170e758	Update make-osd-partitions.yml there is an extra space between 'custom' and 'layout'	2017-03-02 12:24:44 +08:00
WingKai Ho	2967772f6a	Load a variable file for devices partition load device partition file in directory host_vars 1) if the user define host_vars/hostname.yml load the devices partition on this file. 2) otherwise load host_vars/default.yml for default	2017-03-01 17:27:57 +08:00
yangyimincn	8b36cbac64	Update rolling_update.yml The task waiting for the monitor to join the quorum... , the result for ceph -s | grep monmap only contain monmap, not included quorum: # ceph -s --cluster ceph | grep monmap monmap e1: 3 mons at {sh-office-ceph-1=10.12.10.34:6789/0,sh-office-ceph-2=10.12.10.35:6789/0,sh-office-ceph-3=10.12.10.36:6789/0} If want to get monitor, should use this: # ceph -s --cluster ceph | grep election election epoch 80, quorum 0,1 sh-office-ceph-1,sh-office-ceph-2 ceph version: 10.2.5	2017-02-28 16:56:02 +08:00
Sébastien Han	4639d89231	infra: fix cluster name detection The previous command was returning /etc/ceph/ceph.conf, we only need 'ceph' to be returned. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-23 15:40:34 -05:00
Tobias Florek	931027e6f7	harmonize docker names Created containers now are named more or less in the form of <ansible role>-<ansible_hostname>	2017-02-23 09:15:05 +01:00
Sébastien Han	3b633d5ddc	purge-docker: re-implement zap devices We now run the container and waits until it dies. Prior to this we were stopping it before completion so not all the devices where zapped. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:56:09 -05:00
Sébastien Han	a002508a91	purge-docker: also purge journal devices Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:54:36 -05:00
Andrew Schoen	5622c94e8b	rolling-update: do not use upstart to stop mons when using systemd Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-21 12:31:26 -06:00
Shengjing Zhu	32923fd217	fix grep match pattern for osd ids Some playbooks use [0-9]*, others use \d+$ The latter is more correct since cluster name may contain numbers. Signed-off-by: Shengjing Zhu <zsj950618@gmail.com>	2017-02-20 16:35:56 +08:00
Andrew Schoen	22f52a9dc6	purge-cluster: also purge dmcrypt dedicated journals See: https://bugzilla.redhat.com/show_bug.cgi?id=1414647 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-15 10:27:17 -06:00
Andrew Schoen	3964929a56	rgw-standalone: also fetch keys from mons This is to allow for ceph-installer usage of this playbook and to ensure that you have the correct keys locally when bootstrapping. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-14 16:12:59 -06:00
Andrew Schoen	c5f561a4e9	purge-cluster: remove calamari-server package See: https://bugzilla.redhat.com/show_bug.cgi?id=1422134 Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves rhbz#1422134	2017-02-14 09:24:02 -06:00
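
The osd id pattern fix in commit 32923fd217 above can be illustrated with a small Python sketch. This is not code from the playbooks: the cluster name `ceph2` and the directory name `ceph2-12` are made-up values, and `[0-9]+` is used as the closest Python stand-in for `grep -o '[0-9]*'` (which does not print empty matches).

```python
import re

# Hypothetical OSD directory name: cluster "ceph2", osd id 12.
# Neither value comes from the commit; both are assumptions for illustration.
osd_dir = "ceph2-12"


def loose_ids(name):
    # Collect every run of digits, so digits inside the
    # cluster name are returned along with the osd id.
    return re.findall(r"[0-9]+", name)


def anchored_id(name):
    # Keep only the digit run at the very end of the name,
    # which is what anchoring the pattern with `$` achieves.
    match = re.search(r"\d+$", name)
    return match.group() if match else None


print(loose_ids(osd_dir))
print(anchored_id(osd_dir))
```

With a cluster name that contains digits, the unanchored pattern yields more than one candidate id, while the anchored one yields only the trailing number, which is the reasoning the commit message gives for preferring it.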

326 Commits (c47aa2e83b81b4678d58f427c800faa77e9dd719)