ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Sébastien Han	e46440e19c	switch-from-non-containerized-to-containerized: fix devices If devices is passed through an extra var this register won't work so let's only register the var if devices is not defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489099 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-07 23:18:14 +02:00
Sébastien Han	b9ced956d7	purge: get lockbox mountpoint and unmount it Prior command was avoiding the lockbox mountpoint and the playbook was failing with: rmtree failed: [Errno 30] Read-only file system: '/var/lib/ceph/osd-lockbox/4e9d8052-87c2-4fde-a56c-b8c108a3eefc/key-management-mode' Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-07 16:31:31 +02:00
Guillaume Abrioux	d987d26719	tests: force docker variable for switch-to-containers scenario we need to force the value of `docker` variable which is initially set to `false` since it's a migration from non-containerized to containerized cluster. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-09-06 18:03:52 +02:00
Sébastien Han	b7db600caa	switch-from-non-containerized-to-containerized: mask unit files We must mask the image so we are sure that even if the system reboots then the OSDs won't start. Also remove Ceph udev rules if found on the system prior to deploying containers. If we don't do this we are exposed to conflicts between udev rules and systemd unit files. Also, the CI will now test the migration from a non-containerized cluster to a containerized cluster. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-05 15:20:31 +02:00
Sébastien Han	579b95fd8a	shrink-mon: wait a little bit for the mon to be out Monitor removal from the monmap is not immediate, so let's wait a little bit and then fail if the monitor is still in the monmap. We try twice in total with 10 sec intervals. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-04 23:08:57 +02:00
Sébastien Han	54d7a81241	infra playbook: move untested scenario to a new dir Move untested or low-confidence playbooks into an untested-by-ci directory. Also remove this directory from the package build. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1461551 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-01 19:58:24 +02:00
Sébastien Han	298a63c437	shrink mon and osd Rework shrinking a monitor and an OSD playbook. Also adding test scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-09-01 19:12:00 +02:00
Sébastien Han	e0a264c7e9	osd: allow multi dedicated journals for containers Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-30 12:34:06 +02:00
Ben England	617d9ee75d	dont use devices var anymore, works for osd_auto_discover	2017-08-28 17:27:01 -04:00
Sébastien Han	0205f6d645	rolling_update: nicer way to set osd flags Prior to this patch, we were applying the osd flags like this: " General pre tasks Set flags Upgrade OSDs on a host Unset flags <-- this triggers pending scrub to start Set flags Upgrade OSDs on a hosts Unset flags <-- this triggers pending scrub to start . . . General post tasks " Now instead, we apply the flag once before starting the OSD update and unset them once the last OSD is finished. " General pre tasks Set flags and wait for any scrubs to finish Upgrade OSDs on a host Upgrade OSDs on a host . . . Unset flags General post tasks " Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1450754 Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-25 18:21:28 +02:00
Sébastien Han	4a4a20f07d	rolling update: skip pg check if num_pgs = 0 In our test case we don't have any pgs, thus the check fails. The check always returns an empty array, which makes the comparison fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-24 08:50:49 +02:00
Alfredo Deza	e651469a2a	Merge pull request #1797 from ceph/purge-lvm adds purge support for the lvm_osds osd scenario	2017-08-23 14:28:29 -04:00
Sébastien Han	f2499ff5ac	Merge pull request #1788 from ceph/improve-switch switch-from-non-containerized-to-containerized: simplify	2017-08-23 19:47:26 +02:00
Sébastien Han	4f0ecb7f30	switch-from-non-containerized-to-containerized: simplify This commit eases the use of the infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook. We basically run it with a couple of pre-tasks and then we let the playbook run the docker roles. It obviously expects proper variables to be configured in order to work. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-23 18:39:45 +02:00
Andrew Schoen	bed57572cc	purge-cluster: adds support for purging lvm osds This also adds a new testing scenario for purging lvm osds Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-08-23 10:33:35 -05:00
Sébastien Han	1ac0969c28	Merge pull request #1778 from ceph/fix-1770 purge: add ability to purge bluestore osd	2017-08-22 23:56:36 +02:00
Giulio Fidente	2c01de4350	Default cluster to ceph in switch to containers	2017-08-22 13:13:36 +02:00
Giulio Fidente	f0423b1804	Parse ceph_docker_registry in switch to containers Defaults it to docker.io as it was for backward compatibility.	2017-08-22 13:11:27 +02:00
Giulio Fidente	a59b84d5c9	Assume mon_docker_privileged false in switch to containers	2017-08-22 13:01:25 +02:00
Giulio Fidente	0106fa6835	Consume public_network vs ceph_mon_docker_subnet In the switch to containers migration there were broken references to ceph_mon_docker_subnet variable, replaced with public_network. Also fixes references to ceph_mon_docker_extra_env setting for it a default as it could be undefined.	2017-08-21 18:34:24 +02:00
Giulio Fidente	386303d42e	Extend set_uid fact to support RH Ceph images	2017-08-21 18:32:08 +02:00
Sébastien Han	9c824b9818	purge: add ability to purge bluestore osd We now purge block db and/or wal partitions if we find any. Closes: https://github.com/ceph/ceph-ansible/issues/1770 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-21 18:08:18 +02:00
Andrew Schoen	d2f4d3666f	Merge pull request #1725 from ceph/simplify-osd-scenario osd: simply osd scenario declaration	2017-08-03 09:31:57 -05:00
Sébastien Han	671f2cd4bc	Merge pull request #1738 from yanyixing/nvmepart fix for nvme part path	2017-08-03 13:37:10 +02:00
yanyx	d506fad056	fix for nvme part path	2017-08-03 17:37:52 +08:00
Sébastien Han	30991b1c0a	osd: simplify scenarios There are only two main scenarios now: * collocated: everything remains on the same device: - data, db, wal for bluestore - data and journal for filestore * non-collocated: dedicated device for some of the components Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-03 10:20:39 +02:00
Sébastien Han	fdc6aebd62	infrastructure-playbooks: update with ceph-defaults roles Signed-off-by: Sébastien Han <seb@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	7a333d05ce	Add handlers for containerized deployment Until now, there were no handlers for containerized deployments. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:20 +02:00
Guillaume Abrioux	5adbf0fdaa	Move role dependencies in site.yml/site-docker.yml This will give us more flexibility and avoid a lot of useless when skipping all tasks from a non-desired role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 17:12:14 +02:00
Guillaume Abrioux	206c7a16d0	rolling_update: refact code Refact rolling_update playbook. Add ceph-client upgrade. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-08-02 11:10:51 +02:00
yanyx	d0a17b11b2	change the partition's ownership	2017-07-27 11:55:30 +08:00
Sébastien Han	fad9d0caec	Merge pull request #1690 from yanyixing/master fix: when osd device is a disk partition	2017-07-26 15:55:29 +02:00
yanyx	2e6233271e	fix: when osd device is a disk partition	2017-07-25 21:39:43 +08:00
Sébastien Han	0c18cf199e	purge: remove leftover unit files Closes https://github.com/ceph/ceph-ansible/issues/1672 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-07-25 13:26:28 +02:00
Guillaume Abrioux	828f88403e	Update: Avoid screen scraping in rolling update since luminous has revamped the `ceph -s` output, we need to avoid screen scraping. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-12 15:02:39 +02:00
Guillaume Abrioux	896d62d78b	Refact: remove ceph_mon_docker_interface variable remove `ceph_mon_docker_interface` and use `monitor_interface` instead for both containerized and non-containerized deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-07-04 18:08:59 +02:00
Guillaume Abrioux	73141118d0	Make the new check PGs working with /bin/sh The new test in the checks PGs are no longer working on distributions where /bin/sh isn't linked to /bin/bash. Fix: #1619 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-22 17:59:38 +02:00
David Galloway	127b5ad9b4	infra: Create a backup of ceph.conf when taking over existing cluster Signed-off-by: David Galloway <dgallowa@redhat.com>	2017-06-21 09:53:09 -04:00
David Galloway	40ed2d7be6	infra: Fix ceph.conf creation when taking over existing cluster Fixes bug introduced in https://github.com/ceph/ceph-ansible/pull/1330 The "stat ceph.conf" task was basically using the stat module on a string instead of the ceph.conf filename. This caused the "generate ceph configuration file" task to fail. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1463382 Signed-off-by: David Galloway <dgallowa@redhat.com>	2017-06-21 09:52:01 -04:00
Andrew Schoen	e2104acb62	rolling_update: set health_mon_check_delay to 15 The old value of 10 did not give enough time for a containerized mon to pass the health check. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-13 08:56:44 -05:00
Guillaume Abrioux	5af9bb432c	rewrite check pgs clean tasks Avoid screen scraping by rewriting `waiting for clean pgs` tasks like it is done in `304de48`. Use the json output returned by `ceph -s` instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-06-13 09:48:56 +02:00
Andrew Schoen	59992c54cc	purge-docker-cluster: include ceph_docker_registry We need to include ceph_docker_registry when removing containers/images because if we don't it will assume docker.io which is not always where the image originated from, causing the playbook to fail. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-06-02 09:49:17 -05:00
Sébastien Han	fdc7866072	Merge pull request #1469 from ceph/refact_code Docker: Refact code	2017-06-02 12:40:25 +02:00
Andrew Schoen	f7677e4393	purge-docker-cluster: pip is only used on Debian We only need to purge packages installed by pip on Debian systems. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-31 09:03:44 -05:00
Andrew Schoen	8e322d4825	purge-docker-cluster: default raw_journal_devices to [] If we're purging a containerized cluster that did not use the raw_multi_journal OSD scenario then raw_journal_devices will not be defined which causes the playbook to fail. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1455187 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-05-25 07:30:25 -05:00
Guillaume Abrioux	ddfe019342	Refact code `ceph-docker-common`: At the moment there is a lot of duplicated tasks in each `./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in `./roles/ceph-docker-common/tasks/main.yml`. `_containerized_deployment` variables: All `_containerized_deployment` have been refactored to a single variable `containerized_deployment` duplicate `cephx` variables in `group_vars/* have been removed. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-05-24 15:55:41 +02:00
Sébastien Han	90389864d8	rolling-update: set/unset flags on the right container Problem: we are delegating the set/unset flag to a monitor node but we try to call an osd container Solution: use the right container name. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-05-22 09:38:08 +02:00
Sébastien Han	b93ffe637b	Merge pull request #1476 from WingkaiHo/improve-shrink-osd.yml improve shrink-osd.yml can shrink osd when disk damage	2017-04-27 11:01:27 +02:00
WingkaiHo	0b9f322ca0	improve shrink-osd.yml can shrink osd when disk damage	2017-04-27 10:26:26 +08:00
Andrew Schoen	5a3f95dfc1	purge-cluster: check for any running ceph process after purge Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-25 09:30:22 -05:00
Andrew Schoen	26bdd59f5d	purge-cluster: we don't support sysv or upstart anymore Now that ceph-ansible only supports > jewel we don't need to bother with sysv or upstart Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:14:38 -07:00
Andrew Schoen	7ca2bddcce	purge-cluster: do not need to check for running ceph processes Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:12:46 -07:00
Andrew Schoen	aac79df3b3	purge-cluster: no need to remove ceph.target The package uninstalls will stop ceph.target Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-04-21 15:11:03 -07:00
Sébastien Han	dfd8f4d96e	test: add mgr section to the host inventory file Without this, we don't test the mgr role so we need to add it. Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com>	2017-04-15 00:16:10 +02:00
Sébastien Han	17ac1fd464	Merge pull request #1443 from WingkaiHo/osds-journal-migrate Migrate osd(s) journal to ssd	2017-04-13 16:45:57 +02:00
WingkaiHo	9fba41b4ce	Migrate osd(s) journal to ssd	2017-04-13 11:05:58 +08:00
Daniel Lupescu	d5e56c481a	purge-cluster: fix grep match for NVMe and HP Smart Array devices raw_device would return invalid block device names for NVMe and HPSA devices which would cause sgdisk partition deletion to fail $ echo /dev/nvme1n1p3 | egrep -o '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p){1,2}' /dev/nvme1n1p $ echo /dev/cciss/c0d0p2 | egrep -o '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p){1,2}' /dev/cciss/c0d0p	2017-04-11 16:13:28 +03:00
Sébastien Han	c37aaa41f4	playbook: homogenize the way list osd ids Problem: too many different commands to do the same thing. The 'cut' command on infrastructure-playbooks/purge-cluster.yml was also wrong. This sed command from osixia in ceph-docker https://github.com/ceph/ceph-docker/pull/580/ addresses all the scenarios. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-03-30 11:51:38 +02:00
Sébastien Han	35a90ae283	Merge pull request #1386 from WingkaiHo/master Create recover-osds-after-ssd-journal-failure.yml	2017-03-28 09:50:39 +02:00
Konstantin Shalygin	1662976fc0	Resolve issues when groups names not in default value.	2017-03-27 21:44:30 +07:00
WingkaiHo	ac1498b0d7	Merge https://github.com/ceph/ceph-ansible	2017-03-27 10:50:38 +08:00
WingkaiHo	ebb56ccebf	command module instead shell	2017-03-23 17:38:41 +08:00
WingkaiHo	2d44c1cee6	remove service enable	2017-03-23 15:28:14 +08:00
WingkaiHo	14c189fee5	break it into lines since you already use the string block syntax and fix disable it here and enable again in later task	2017-03-23 14:49:10 +08:00
WingkaiHo	62c37042fe	remove this detection and simply rely on {{ cluster }}	2017-03-23 09:22:06 +08:00
WingkaiHo	3d10c5981e	fix some spelling mistakes and writing format, use full device path for device name	2017-03-22 17:48:34 +08:00
WingkaiHo	1e670bdeb0	This assumes ceph as a cluster name. We need detect the name of the cluster	2017-03-22 10:09:06 +08:00
WingkaiHo	83a1ac0c67	This assumes ceph as a cluster name. We need detect the name of the cluster	2017-03-22 10:06:11 +08:00
WingkaiHo	19f9e200d7	Add auto detect the ceph cluster name	2017-03-22 10:00:44 +08:00
WingkaiHo	8602166f6e	Ansible will include host_vars/ansible_hostname.yml itself, no need this task IMO.	2017-03-21 13:50:27 +08:00
WingkaiHo	55725fd01d	fix some syntax error	2017-03-21 11:19:25 +08:00
WingKai Ho	7445113dc4	Create recover-osds-after-ssd-journal-failure.yml This playbook use to recover Ceph OSDs after ssd journal failure.	2017-03-21 11:08:25 +08:00
Anthony D'Atri	6c4911276e	Enhance clean PG check to catch active+clean+scrubbing and active+clean+scrubbing+deep Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>	2017-03-19 00:23:26 -07:00
Daniel Marks	77edd3d40a	Fixing tabs that are breaking the syntax check With the merge of PR #1336 the syntax check fails. This commit replaces the tabs with proper indentation.	2017-03-15 14:15:15 +01:00
Sébastien Han	38ab6de602	Merge pull request #1336 from WingkaiHo/master Load a variable file for devices partition	2017-03-15 11:55:26 +01:00
Sébastien Han	8320c14191	Merge pull request #1317 from ibotty/harmonize-docker-names harmonize docker names	2017-03-14 18:20:20 +01:00
Andrew Schoen	e81d690aa0	switch-to-containers: do not include group vars or role defaults Doing so will override any values set for these in the group_vars directory relative to the users inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	cf702b05cf	purge-docker-cluster: do not include role defaults or group vars Doing so at playbook level overrides whatever values might be set for these in the user's group_vars directory that's relative to their inventory. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	aef54d89d9	switch-to-containers: do not set group name vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:09 -06:00
Andrew Schoen	7289acb6b3	purge-docker-cluster: do not set group names vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
Andrew Schoen	46f26bec13	rolling-update: do not set group name vars at playbook level Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
Andrew Schoen	4fe6607004	purge-cluster: do not set group name vars at playbook level This has the behavior of overriding custom values set in group_vars. I've added defaults to the rest of the group names so that if they are not overridden in group_vars then defaults will be used. See: https://bugzilla.redhat.com/show_bug.cgi?id=1354700 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-03-08 08:57:08 -06:00
WingKai Ho	0d134b4ad9	Update make-osd-partitions.yml change	2017-03-08 17:46:37 +08:00
WingKai Ho	e2d06068f4	Update make-osd-partitions.yml When Ansible does not load the file host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml it shows a syntax error, so add the keyword "skip" to fix it. Exit the playbook if the user does not define devices in host_vars/{{ ansible_hostname }}.yml or host_vars/default.yml	2017-03-06 15:43:09 +08:00
WingKai Ho	2861a483d7	Update make-osd-partitions.yml When Ansible does not load the file host_vars/{{ ansible_hostname }}.yml and host_vars/default.yml it shows a syntax error, so add the keyword "skip" to fix it. Exit the playbook if the user does not define devices in host_vars/{{ ansible_hostname }}.yml or host_vars/default.yml	2017-03-06 10:33:22 +08:00
WingKai Ho	4cc489f2ba	Update make-osd-partitions.yml fix syntactic error	2017-03-03 17:26:53 +08:00
WingKai Ho	102befa927	Update make-osd-partitions.yml Remove capital `L`	2017-03-02 14:06:41 +08:00
WingKai Ho	c3f170e758	Update make-osd-partitions.yml there is an extra space between 'custom' and 'layout'	2017-03-02 12:24:44 +08:00
WingKai Ho	2967772f6a	Load a variable file for devices partition load device partition file in directory host_vars 1) if the user defines host_vars/hostname.yml load the devices partition from this file. 2) otherwise load host_vars/default.yml for default	2017-03-01 17:27:57 +08:00
yangyimincn	8b36cbac64	Update rolling_update.yml In the task waiting for the monitor to join the quorum, the result of ceph -s | grep monmap only contains the monmap, not the quorum: # ceph -s --cluster ceph | grep monmap monmap e1: 3 mons at {sh-office-ceph-1=10.12.10.34:6789/0,sh-office-ceph-2=10.12.10.35:6789/0,sh-office-ceph-3=10.12.10.36:6789/0} To get the quorum, use this: # ceph -s --cluster ceph | grep election election epoch 80, quorum 0,1 sh-office-ceph-1,sh-office-ceph-2 ceph version: 10.2.5	2017-02-28 16:56:02 +08:00
Sébastien Han	4639d89231	infra: fix cluster name detection The previous command was returning /etc/ceph/ceph.conf, we only need 'ceph' to be returned. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-23 15:40:34 -05:00
Tobias Florek	931027e6f7	harmonize docker names Created containers now are named more or less in the form of <ansible role>-<ansible_hostname>	2017-02-23 09:15:05 +01:00
Sébastien Han	3b633d5ddc	purge-docker: re-implement zap devices We now run the container and wait until it dies. Prior to this we were stopping it before completion so not all the devices were zapped. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:56:09 -05:00
Sébastien Han	a002508a91	purge-docker: also purge journal devices Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-21 15:54:36 -05:00
Andrew Schoen	5622c94e8b	rolling-update: do not use upstart to stop mons when using systemd Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-21 12:31:26 -06:00
Shengjing Zhu	32923fd217	fix grep match pattern for osd ids Some playbooks use [0-9]*, others use \d+$ The latter is more correct since cluster name may contain numbers. Signed-off-by: Shengjing Zhu <zsj950618@gmail.com>	2017-02-20 16:35:56 +08:00
Andrew Schoen	22f52a9dc6	purge-cluster: also purge dmcrypt dedicated journals See: https://bugzilla.redhat.com/show_bug.cgi?id=1414647 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-15 10:27:17 -06:00
Andrew Schoen	3964929a56	rgw-standalone: also fetch keys from mons This is to allow for ceph-installer usage of this playbook and to ensure that you have the correct keys locally when bootstrapping. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-14 16:12:59 -06:00
Andrew Schoen	c5f561a4e9	purge-cluster: remove calamari-server package See: https://bugzilla.redhat.com/show_bug.cgi?id=1422134 Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves rhbz#1422134	2017-02-14 09:24:02 -06:00
Sébastien Han	c2f1dca823	docker: use a better method to pull images We changed the way we declare image. Prior to this patch we must have a "user/image:tag" format, which is incompatible with non docker-hub registry where you usually don't have a "user". On the docker hub a "user" is also identified as a namespace, so for Ceph the user was "ceph". Variables have been simplified with only: * ceph_docker_image * ceph_docker_image_tag 1. For docker hub images: ceph_docker_name: "ceph/daemon" will give you the 'daemon' image of the 'ceph' user. 2. For non docker hub images: ceph_docker_name: "daemon" will simply give you the "daemon" image. Infrastructure playbooks have been modified as well. The file group_vars/all.docker.yml.sample has been removed as well. It is hard to maintain since we have to generate it manually. If you want to configure specific variables for a specific daemon simply edit group_vars/$DAEMON.yml Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1420207 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-09 17:57:18 +01:00
Andrew Schoen	5ddfc4f85c	Merge pull request #1284 from ceph/BZ-1418980 purge-cluster: do not use ceph-detect-init	2017-02-08 08:46:03 -06:00
Andrew Schoen	4ff5908758	Merge pull request #1289 from ceph/fix-1286 rolling-update: detect init system properly	2017-02-08 06:31:30 -06:00
Andrew Schoen	865b4500dc	purge-cluster: set a default value for fetch_directory if not defined Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-08 06:25:43 -06:00
Andrew Schoen	adf6aee643	purge-cluster: remove all include tasks Including variables from role defaults or files in a group_vars directory relative to the playbook is a bad practice. We don't want to do this because including these defaults at the task level overrides values that would be set in a group_vars directory relative to the inventory file, which is the correct usage if you wish to override those default values. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-08 06:25:43 -06:00
Andrew Schoen	0476b24af1	purge-cluster: do not use ceph-detect-init We can not always ensure that ceph-detect-init will be present on the system. See: https://bugzilla.redhat.com/show_bug.cgi?id=1418980 Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-02-08 06:24:44 -06:00
Sébastien Han	8f94bfb498	rolling-update: detect init system properly Simply use the ansible_service_mgr fact. Closes: #1286 Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-08 08:52:05 +01:00
Sébastien Han	c34d0a9d28	purge-docker: force image deletion even if non-running containers are using this image as a reference. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-07 22:14:21 +01:00
Sébastien Han	72cd9199ac	purge: ability to purge client role Signed-off-by: Sébastien Han <seb@redhat.com>	2017-02-07 22:14:18 +01:00
Guillaume Abrioux	76ddcbc271	Remove support of releases prior to Jewel. According to #1216, we need to simplify the code by removing the support of anything before Jewel. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2017-01-31 11:00:54 +01:00
Sébastien Han	d5dd658cfa	purge: do not stop ceph.target on each daemon Doing this causes all the daemons to go down at the same time. In a scenario where we colocate a monitor and an osd, the osd will take some time to go down which will make the 'umount' task fail. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-30 14:31:56 +01:00
Sébastien Han	cb57a359ba	purge: do not fail on purge ceph files On systems running docker there is an issue with lxcfs that results in the find command returning 1 even though it actually did the job. e.g: on a system with docker running find /var will give us the following error: find: '/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/systemd-update-utmp.service/devices.deny': Permission denied find: '/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/dev-random.mount/devices.allow': Permission denied ... ... However ceph files got deleted so we ignore the error. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-30 14:31:56 +01:00
Sébastien Han	e371bd591c	purge: fix ubuntu purge when not using systemd We now rely on the cli tool ceph-detect-init which will tell us the init system in use on the distribution. We do this instead of the previous lookup for systemd unit files to call the right task depending on the init system. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-30 14:31:56 +01:00
Sébastien Han	0e2e270ab2	purge: allow purge to run multiple times with_items is evaluated before the when so in a second run where the variable is empty it will fail with "'dict object' has no attribute 'stdout_lines'". To fix this we add a default array so with_items does not fail and the task is skipped with the when. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-30 14:31:56 +01:00
Sébastien Han	0d2e580768	Merge pull request #1250 from ceph/new-tests CI testing updates	2017-01-27 14:30:45 +01:00
Andrew Schoen	d3cb8dba4e	purge-cluster: fix failure when raw_multi_journal is not defined Because the purge-cluster.yml playbook does not have access to the roles default vars we cannot be sure that raw_multi_journal is defined. For example, if this was purging a dmcrypt journal then raw_multi_journal might not be defined at all in group_vars/all.yml or group_vars/osds.yml. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-01-27 05:23:17 -06:00
Ivan Font	0298354137	Update to use consistent docker extra env vars This playbook was still referencing the old version of the ceph__docker_extra_env but only for Ceph MONs and Ceph NFS. This playbook was not kept up-to-date when updating the ceph__docker_extra_env variables to add the '-e' option to docker. That's because the addition of '-e' breaks this playbook as it requires a comma separated list of variables for the 'env:' docker module parameter. Therefore this change just makes the playbook consistently broken by referencing the same variable throughout.	2017-01-26 15:57:34 -08:00
Andrew Schoen	b2a6f095f1	purge-cluster: fix syntax when deleting dmcrypt devices Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-01-26 11:28:30 -06:00
Sébastien Han	73ca1a7a00	purge: remove dm-crypt devices When running encrypted OSDs, an encrypted device mapper is used (created by the cryptsetup tool). So before attempting to remove all the partitions on a device we must delete all the encrypted device mappers, then we can delete all the partitions. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-25 22:32:46 +01:00
Sébastien Han	adeb3decf3	purge: remove zap_block_devs variable The name of this variable was a bit confusing since its activation will zap all the block devices no matter which osd scenario we are using. Removing this variable and applying a condition on the OSD scenario is now feasible and easier since we import group_vars variable files for OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-18 10:55:01 +01:00
Sébastien Han	b7fcbe5ca2	purge: cosmetic cleanup Just applying our writing syntax convention in the playbook. Signed-off-by: Sébastien Han <seb@redhat.com>	2017-01-18 10:53:21 +01:00
Andrew Schoen	dd8389cdf7	purge-cluster: do not include ceph-osd and ceph-common defaults for osds When purging OSDs we do not need to include these defaults as nothing in the following tasks uses them. Also, it has the side effect of overwriting any variables defined in group_vars files that are relative to the inventory you are using with the default values. That behavior was causing the CI tests to fail. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-01-10 16:57:58 -06:00
Andrew Schoen	321cea8ba9	purge-cluster: get journal partitions after zapping osd disks In my testing zapping the osd disks deleted the journal partitions, making the 'zap ceph journal partitions' task fail because the partitions it found previously do not exist anymore. This moves the task that finds the journal partitions after 'zap osd disks' to catch any partitions ceph-disk might have missed. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-01-03 15:57:17 -06:00
Andrew Schoen	c9e5914377	purge-cluster: use ignore_errors: true when including group_vars files Using failed_when will still throw an exception and stop the playbook if the file you're trying to include doesn't exist. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2017-01-03 15:57:17 -06:00
Sébastien Han	cb1c06901e	Merge pull request #1171 from cbodley/wip-libcephfs2 bump package version to libcephfs2	2017-01-03 10:48:56 +01:00
Shengjing Zhu	2dc2e1d48c	infrastructure playbook: add make osd partition Signed-off-by: Shengjing Zhu <zsj950618@gmail.com>	2016-12-15 22:03:38 +08:00
Casey Bodley	acaf01ac17	purge-cluster: add new version of libcephfs2 the libcephfs version was bumped to 2, so we need to check for that as well when we're removing all ceph packages Signed-off-by: Casey Bodley <cbodley@redhat.com>	2016-12-09 16:54:06 -05:00
Sébastien Han	9dac195200	take-over: use more precise ceph.conf detection Prior to this patch we were just looking for any *.conf file which sometimes could result in multiple matches. The new command looks for a .conf file that must contain [global] and 'fsid' patterns. This will definitely get us the ceph.conf file. We can not directly use ceph.conf because of a different cluster name. Signed-off-by: Sébastien Han <seb@redhat.com>	2016-12-06 16:02:48 +01:00
Sébastien Han	4444d7d78e	git: update gitignore * ignore yml files in general * refactor based on commit f8e043b6ea5ac4e886532d4f2f675c507b44b955 that changed directory layouts Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit ec5c6f5da566611c4e0b88f925cbd26dc90368d6)	2016-12-06 10:18:19 +01:00
chenyanshan	7eab2529ed	this patch fix the regex pattern in infrastructure-playbooks/shrink-osd.yml when the osd's pid num is bigger than 9999 Signed-off-by: chenyanshan <yanshanchen@139.com>	2016-12-05 13:40:38 +08:00
Guillaume Abrioux	b2b7222b3a	[shrink-mon]: force playbook to fail if there is only one mon The playbook will fail if only 1 mon is in the cluster and advise to use the `purge-cluster` playbook instead. Fix #1083	2016-11-25 11:20:11 +01:00
Guillaume Abrioux	a680707f6f	All `include_vars` need to have `.yml`, `.yaml` or `*.json` extension. As introduced in the following PR: - https://github.com/ansible/ansible/pull/17207 we need to refactor our code.	2016-11-24 14:03:49 +01:00
Sébastien Han	829e2b6598	Merge pull request #1077 from font/rolling_update Support containerized rolling update	2016-11-22 16:56:46 +01:00
Sébastien Han	38e846e542	rolling_update: clarify "serial" usage Prior to this commit the serial variable was poorly documented. Now we are making clear that this value should be left untouched as the rolling update mechanism should happen serially. Solves: bz-1396742 Signed-off-by: Sébastien Han <seb@redhat.com>	2016-11-21 14:42:46 +01:00
Ken Dreyer	adfdf6871e	remove apache support for RGW libfcgi is dead upstream (http://tracker.ceph.com/issues/16784) The RGW developers intend to remove libfcgi support entirely before the Luminous release. Since libfcgi gets little-to-no developer attention or testing, remove it entirely from ceph-ansible.	2016-11-18 13:13:12 -07:00
Ivan Font	255e816e28	Rolling update changes for containerized deployments Separate out systemd restart tasks for containerized and non-containerized deployments Signed-off-by: Ivan Font <ifont@redhat.com>	2016-11-17 11:25:25 -08:00
Ivan Font	e72f08080d	Warn user when upgrading cluster with only one mon Signed-off-by: Ivan Font <ifont@redhat.com>	2016-11-17 11:25:25 -08:00
Ivan Font	3ff17f1c8f	Support containerized rolling update - Update rolling update playbook to support containerized deployments for mons, osds, mdss, and rgws - Skip checking if existing cluster is running when performing a rolling update - Fixed bug where we were failing to start the mds container because it was missing the admin keyring. The admin keyring was missing because it was not being pushed from the mon host to the ansible host due to the keyring not being available before running the copy_configs.yml task include file. Now we forcefully wait for the admin keyring to be generated before continuing with the copy_configs.yml task include file - Skip pre_requisite.yml when running on atomic host. This technically no longer requires specifying to skip tasks containing the with_pkg tag - Add missing variables to all.docker.sample - Misc. cleanup Signed-off-by: Ivan Font <ifont@redhat.com>	2016-11-17 11:25:25 -08:00
Alfredo Deza	60ce2311b8	rolling_update: bump retries for osd_check/retries to 20 minutes Signed-off-by: Alfredo Deza <adeza@redhat.com> Resolves: rhbz#1395073	2016-11-17 10:43:58 -05:00
Sébastien Han	81a72cb85d	Merge pull request #1068 from ceph/v2.2 moving to ansible v2.2 compatibility	2016-11-16 16:33:40 +01:00
Andrew Schoen	5f44b118b8	rolling update: stop RGWs before upgrade and start afterwards Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves: rhbz#1394929	2016-11-14 14:47:12 -06:00
Andrew Schoen	ded9d9dfd3	rolling update: stop MDSs before upgrading and start afterwards Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves: rhbz#1394929	2016-11-14 14:47:12 -06:00
Andrew Schoen	5429c5f8c5	rolling update: stop MONs before upgrading and start afterwards Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves: rhbz#1394929	2016-11-14 14:47:12 -06:00
Andrew Schoen	66f09bdac4	rolling update: stop OSDs before upgrading This avoids a bug where OSDs are sometimes restarted twice on upgrades which leaves the OSD process running but not marked up. See: https://bugzilla.redhat.com/show_bug.cgi?id=1394928 https://bugzilla.redhat.com/show_bug.cgi?id=1391675 https://bugzilla.redhat.com/show_bug.cgi?id=1394929 Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves: rhbz#1394929	2016-11-14 14:46:58 -06:00
Sébastien Han	991341f525	rolling_update: add variable to upgrade ceph My stupid self removed this crucial variable here: `217ce3ca` thinking it was another hard coded variable import where this is actually the trigger for the upgrade. Closes: #1071 Signed-off-by: Sébastien Han <seb@redhat.com>	2016-11-04 17:31:02 +01:00
Sébastien Han	a2fcd222d2	moving to ansible v2.2 compatibility Signed-off-by: Sébastien Han <seb@redhat.com> Co-Authored-By: Julien Francoz julien@francoz.net	2016-11-04 10:09:38 +01:00
Andrew Schoen	8262ce5e40	rolling update: fix restarts of radosgw Signed-off-by: Andrew Schoen <aschoen@redhat.com> Resolves: rhbz#1391675	2016-11-03 14:36:42 -05:00
Eduard Egorov	ab5c9f2a67	Adjust 'devices' list check for being not defined in purge-cluster playbook (see PR #1024 ) Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>	2016-11-03 06:36:42 +00:00
Leseb	899c8b309f	Merge pull request #1024 from eduardegorov/egorove_make_devices_optional Make {{ devices }} list optional	2016-11-02 15:12:02 +01:00
Eduard Egorov	e5473ee565	Fix typos Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>	2016-11-01 12:29:21 +00:00
Eduard Egorov	3652bb708b	Fix rbd-mirrors group name Signed-off-by: Eduard Egorov <eduard.egorov@icl-services.com>	2016-11-01 12:21:47 +00:00

275 Commits (91bf53ee932a6748c464bea762f8fb6f07f11347)