185 Commits (f3851df0c77e2635db10ef505950e5938b539666)

Author SHA1 Message Date
Sébastien Han 6bac613611 shrink: support for container
We can now shrink mon and osds on containerized deployment.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492115
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-20 16:25:07 +02:00
Sébastien Han 7fedc8ebf4 Merge pull request #1891 from ceph/clarify-update
rolling_update: clarify update doc
2017-09-15 07:08:49 -06:00
Sébastien Han fe1d84d395 Merge pull request #1892 from ceph/purge-dmcrypt-col
purge: only purge specific directories for mon
2017-09-13 17:57:06 -06:00
Sébastien Han ba3e3b6cc7 purge: only purge specific directories for mon
Handles the case when a mon is collocated with an OSD.

Closes: https://github.com/ceph/ceph-ansible/issues/1877
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-13 17:07:04 -06:00
Sébastien Han 82c4848ec4 Merge pull request #1885 from ceph/shrink-osd
shrink-osd: fix when multiple osds
2017-09-13 16:12:49 -06:00
Sébastien Han 92f9be963b rolling_update: clarify update doc
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490188
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-13 15:46:29 -06:00
Sébastien Han 3031e51778 shrink-osd: fix when multiple osds
The loop was not being built properly, so we were always getting the last
item as the OSD host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490355
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-13 15:20:11 -06:00
Sébastien Han aa364264cd resync ceph-iscsi-gw with old upstream
Taken from https://github.com/pcuzner/ceph-iscsi-ansible/tree/tcmu-fixes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1454945 and
https://bugzilla.redhat.com/show_bug.cgi?id=1484083
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-12 18:06:10 -06:00
Sébastien Han 477f86e305 switch to container: fix ceph nfs
The service is nfs-ganesha where ceph-nfs@{{ ansible_hostname }} will be
the name of the container.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 22:43:50 +02:00
Sébastien Han fdacac9fa0 switch: make osd collection idempotent
This commit allows us to run
switch-from-non-containerized-to-containerized-ceph-daemons.yml multiple
times.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489353
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 11:31:47 +02:00
Sébastien Han e46440e19c switch-from-non-containerized-to-containerized: fix devices
If devices is passed through an extra var, this register won't work, so
let's only register the var if devices is not defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1489099
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-07 23:18:14 +02:00
Sébastien Han b9ced956d7 purge: get lockbox mountpoint and unmount it
Prior command was avoiding the lockbox mountpoint and the playbook was
failing with:

rmtree failed: [Errno 30] Read-only file system:
'/var/lib/ceph/osd-lockbox/4e9d8052-87c2-4fde-a56c-b8c108a3eefc/key-management-mode'

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-07 16:31:31 +02:00
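The idea behind commit b9ced956d7 (find the lockbox mountpoint, then unmount it before removing directories) can be sketched as follows. This is only an illustration, not the playbook's actual task: the function name `lockbox_mountpoints` and the `/proc/mounts`-style input are assumptions, and the path prefix is taken from the error message quoted above.

```python
from typing import List


def lockbox_mountpoints(
    mounts_text: str,
    prefix: str = "/var/lib/ceph/osd-lockbox/",  # prefix seen in the quoted error
) -> List[str]:
    """Return mountpoints under `prefix` from /proc/mounts-style text.

    Each input line is expected to look like:
        <device> <mountpoint> <fstype> <options> <dump> <pass>
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field is the mountpoint; ignore short or blank lines.
        if len(fields) >= 2 and fields[1].startswith(prefix):
            points.append(fields[1])
    return points
```

Each returned path would then need to be unmounted before the recursive delete, which is what avoids the read-only file system error.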
Guillaume Abrioux d987d26719 tests: force docker variable for switch-to-containers scenario
We need to force the value of the `docker` variable, which is initially set
to `false`, since it's a migration from a non-containerized to a
containerized cluster.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-06 18:03:52 +02:00
Sébastien Han b7db600caa switch-from-non-containerized-to-containerized: mask unit files
We must mask the image so we are sure that even if the system reboots
then the OSDs won't start.

Also remove Ceph udev rules if found on the system prior to deploying
containers. If we don't do this we are exposed to conflicts between udev
rules and systemd unit files.

Also, the CI will now test the migration from a non-containerized cluster to a
containerized cluster.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-05 15:20:31 +02:00
Sébastien Han 579b95fd8a shrink-mon: wait a little bit for the mon to be out
Monitor removal from the monmap is not immediate, so let's wait a little
bit and then fail if the monitor is still in the monmap.
We try twice in total with 10 sec intervals.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-04 23:08:57 +02:00
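The wait-then-fail behaviour described in commit 579b95fd8a (two attempts in total, 10 seconds apart) can be sketched roughly as below. This is a minimal illustration, not the playbook's code: `wait_for_mon_removal` and the `get_monmap_names` callback are hypothetical names, and how the monmap is actually queried is left to the caller.

```python
import time
from typing import Callable, Iterable


def wait_for_mon_removal(
    mon_name: str,
    get_monmap_names: Callable[[], Iterable[str]],  # hypothetical monmap lookup
    retries: int = 2,       # "we try twice in total"
    delay: float = 10.0,    # "10 sec intervals"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until mon_name leaves the monmap; return the attempt that succeeded.

    Raises RuntimeError if the monitor is still listed after all attempts.
    """
    for attempt in range(1, retries + 1):
        if mon_name not in set(get_monmap_names()):
            return attempt
        if attempt < retries:
            sleep(delay)  # give the cluster time before checking again
    raise RuntimeError(
        f"{mon_name} is still in the monmap after {retries} attempts"
    )
```

Passing `sleep` as a parameter is a design choice for testability, so the retry logic can be exercised without real delays.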
Sébastien Han 54d7a81241 infra playbook: move untested scenario to a new dir
Move untested or low-confidence playbooks into an untested-by-ci
directory.
Also remove this directory from the package build.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1461551
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-01 19:58:24 +02:00
Sébastien Han 298a63c437 shrink mon and osd
Rework the playbooks for shrinking a monitor and an OSD. Also add a test
scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-01 19:12:00 +02:00
Sébastien Han e0a264c7e9 osd: allow multi dedicated journals for containers
Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-30 12:34:06 +02:00
Ben England 617d9ee75d dont use devices var anymore, works for osd_auto_discover 2017-08-28 17:27:01 -04:00
Sébastien Han 0205f6d645 rolling_update: nicer way to set osd flags
Prior to this patch, we were applying the osd flags like this:

"
General pre tasks
Set flags
Upgrade OSDs on a host
Unset flags <-- this triggers pending scrub to start
Set flags
Upgrade OSDs on a host
Unset flags <-- this triggers pending scrub to start
.
.
.
General post tasks
"

Now instead, we apply the flags once before starting the OSD update and
unset them once the last OSD is finished.

"
General pre tasks
Set flags and wait for any scrubs to finish
Upgrade OSDs on a host
Upgrade OSDs on a host
.
.
.
Unset flags
General post tasks
"

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1450754
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-25 18:21:28 +02:00
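The two orderings contrasted in commit 0205f6d645 can be compared with a small sketch that only builds the list of steps. The function names are hypothetical, and the flag names in `OSD_FLAGS` are an assumption, since the commit message does not list which flags are set.

```python
from typing import List, Sequence

# Assumption: the commit message does not name the flags; these are examples.
OSD_FLAGS = ("noout", "noscrub", "nodeep-scrub")


def plan_per_host(hosts: Sequence[str], flags: Sequence[str] = OSD_FLAGS) -> List[str]:
    """Old ordering: set and unset the flags around every host."""
    steps: List[str] = []
    for host in hosts:
        steps += [f"set {flag}" for flag in flags]
        steps.append(f"upgrade OSDs on {host}")
        # Unsetting here lets pending scrubs start between hosts.
        steps += [f"unset {flag}" for flag in flags]
    return steps


def plan_once(hosts: Sequence[str], flags: Sequence[str] = OSD_FLAGS) -> List[str]:
    """New ordering: set the flags once, upgrade every host, unset once."""
    steps = [f"set {flag}" for flag in flags]
    steps.append("wait for running scrubs to finish")
    steps += [f"upgrade OSDs on {host}" for host in hosts]
    steps += [f"unset {flag}" for flag in flags]
    return steps
```

With three hosts, the old plan unsets the flags three times (once per host), while the new plan unsets them only after the last host, so no scrub is triggered in the middle of the upgrade.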
Sébastien Han 4a4a20f07d rolling update: skip pg check if num_pgs = 0
In our test case we don't have any pgs, thus the check fails. The check
always returns an empty array, which makes the comparison fail.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-24 08:50:49 +02:00
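The guard described in commit 4a4a20f07d can be illustrated with a short sketch. It is not the playbook's actual check: `pgs_are_clean` is a hypothetical name, and the field names (`pgmap`, `num_pgs`, `pgs_by_state`, `state_name`, `count`) are assumed to follow the JSON form of the cluster status output.

```python
import json


def pgs_are_clean(status_json: str) -> bool:
    """Return True when every PG is active+clean, or when there are no PGs."""
    pgmap = json.loads(status_json).get("pgmap", {})
    num_pgs = pgmap.get("num_pgs", 0)
    if num_pgs == 0:
        # No PGs exist (e.g. a test cluster without pools): skip the check
        # instead of comparing against an empty list of states.
        return True
    clean = sum(
        state["count"]
        for state in pgmap.get("pgs_by_state", [])
        if state["state_name"] == "active+clean"
    )
    return clean == num_pgs
```

Without the early return, a cluster with zero PGs would never report a clean state, because the list of PG states is empty and there is nothing to compare.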
Alfredo Deza e651469a2a Merge pull request #1797 from ceph/purge-lvm
adds purge support for the lvm_osds osd scenario
2017-08-23 14:28:29 -04:00
Sébastien Han f2499ff5ac Merge pull request #1788 from ceph/improve-switch
switch-from-non-containerized-to-containerized: simplify
2017-08-23 19:47:26 +02:00
Sébastien Han 4f0ecb7f30 switch-from-non-containerized-to-containerized: simplify
This commit eases the use of the
infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml
playbook. We basically run it with a couple of pre-tasks and then we let
the playbook run the docker roles.

It obviously expects the proper variables to be configured in order to
work.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-23 18:39:45 +02:00
Andrew Schoen bed57572cc purge-cluster: adds support for purging lvm osds
This also adds a new testing scenario for purging lvm osds

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-23 10:33:35 -05:00
Sébastien Han 1ac0969c28 Merge pull request #1778 from ceph/fix-1770
purge: add ability to purge bluestore osd
2017-08-22 23:56:36 +02:00
Giulio Fidente 2c01de4350 Default cluster to ceph in switch to containers 2017-08-22 13:13:36 +02:00
Giulio Fidente f0423b1804 Parse ceph_docker_registry in switch to containers
Defaults it to docker.io as it was for backward compatibility.
2017-08-22 13:11:27 +02:00
Giulio Fidente a59b84d5c9 Assume mon_docker_privileged false in switch to containers 2017-08-22 13:01:25 +02:00
Giulio Fidente 0106fa6835 Consume public_network vs ceph_mon_docker_subnet
In the switch to containers migration there were broken references
to ceph_mon_docker_subnet variable, replaced with public_network.

Also fixes references to ceph_mon_docker_extra_env by setting a default
for it, as it could be undefined.
2017-08-21 18:34:24 +02:00
Giulio Fidente 386303d42e Extend set_uid fact to support RH Ceph images 2017-08-21 18:32:08 +02:00
Sébastien Han 9c824b9818 purge: add ability to purge bluestore osd
We now purge block db and/or wal partitions if we find any.

Closes: https://github.com/ceph/ceph-ansible/issues/1770
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-21 18:08:18 +02:00
Andrew Schoen d2f4d3666f Merge pull request #1725 from ceph/simplify-osd-scenario
osd: simplify osd scenario declaration
2017-08-03 09:31:57 -05:00
Sébastien Han 671f2cd4bc Merge pull request #1738 from yanyixing/nvmepart
fix for nvme part path
2017-08-03 13:37:10 +02:00
yanyx d506fad056 fix for nvme part path 2017-08-03 17:37:52 +08:00
Sébastien Han 30991b1c0a osd: simplify scenarios
There are only two main scenarios now:

* collocated: everything remains on the same device:
  - data, db, wal for bluestore
  - data and journal for filestore
* non-collocated: dedicated device for some of the components

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-03 10:20:39 +02:00
Sébastien Han fdc6aebd62 infrastructure-playbooks: update with ceph-defaults roles
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-02 17:12:20 +02:00
Guillaume Abrioux 7a333d05ce Add handlers for containerized deployment
Until now, there were no handlers for containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-02 17:12:20 +02:00
Guillaume Abrioux 5adbf0fdaa Move role dependencies in site.yml/site-docker.yml
This will give us more flexibility and avoid a lot of useless when
skipping all tasks from a non-desired role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-02 17:12:14 +02:00
Guillaume Abrioux 206c7a16d0 rolling_update: refact code
Refactor the rolling_update playbook.
Add ceph-client upgrade.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-02 11:10:51 +02:00
yanyx d0a17b11b2 change the partition's ownership 2017-07-27 11:55:30 +08:00
Sébastien Han fad9d0caec Merge pull request #1690 from yanyixing/master
fix: when osd device is a disk partition
2017-07-26 15:55:29 +02:00
yanyx 2e6233271e fix: when osd device is a disk partition 2017-07-25 21:39:43 +08:00
Sébastien Han 0c18cf199e purge: remove leftover unit files
Closes https://github.com/ceph/ceph-ansible/issues/1672

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-25 13:26:28 +02:00
Guillaume Abrioux 828f88403e Update: Avoid screen scraping in rolling update
Since Luminous revamped the `ceph -s` output, we need to avoid screen
scraping.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-12 15:02:39 +02:00
Guillaume Abrioux 896d62d78b Refact: remove ceph_mon_docker_interface variable
Remove `ceph_mon_docker_interface` and use `monitor_interface` instead
for both containerized and non-containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-04 18:08:59 +02:00
Guillaume Abrioux 73141118d0 Make the new check PGs working with /bin/sh
The new test in the PG check no longer works on distributions
where /bin/sh isn't linked to /bin/bash.

Fix: #1619
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-06-22 17:59:38 +02:00
David Galloway 127b5ad9b4 infra: Create a backup of ceph.conf when taking over existing cluster
Signed-off-by: David Galloway <dgallowa@redhat.com>
2017-06-21 09:53:09 -04:00
David Galloway 40ed2d7be6 infra: Fix ceph.conf creation when taking over existing cluster
Fixes bug introduced in https://github.com/ceph/ceph-ansible/pull/1330

The "stat ceph.conf" task was basically using the stat module on a
string instead of the ceph.conf filename.  This caused the "generate
ceph configuration file" task to fail.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1463382

Signed-off-by: David Galloway <dgallowa@redhat.com>
2017-06-21 09:52:01 -04:00
Andrew Schoen e2104acb62 rolling_update: set health_mon_check_delay to 15
The old value of 10 did not give enough time for a containerized mon to
pass the health check.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-06-13 08:56:44 -05:00