2062 Commits (7277a553fa46f788e4a935c68fe7677e92059d5e)
 

Author SHA1 Message Date
Sébastien Han 7277a553fa common: create ceph initial directories
Some users purge their environments and leave them in a non-optimal state,
e.g. packages are still installed but /etc/ceph and /var/lib/ceph don't
exist anymore. This results in multiple failures across the play,
sometimes hard to detect. Populating these directories "just in case"
should help us solve these problems.

Closes: #1253
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 1149825f8f)
2017-01-31 07:36:03 -06:00
Sébastien Han b925147554 purge: do not stop ceph.target on each daemon
Doing this causes all the daemons to go down at the same time. In a
scenario where we colocate a monitor and an OSD, the OSD will take
some time to go down, which will make the 'umount' task fail.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit d5dd658cfa)
2017-01-31 07:35:47 -06:00
Sébastien Han d79446898c purge: do not fail on purge ceph files
On systems running docker there is an issue with lxcfs that results in
the find command returning 1 even though it actually did the job.
e.g. on a system with docker running, find /var will give us the
following error:

find:
'/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/systemd-update-utmp.service/devices.deny':
Permission denied
find:
'/var/lib/lxcfs/cgroup/devices/lxc/x1/system.slice/dev-random.mount/devices.allow':
Permission denied
...
...

However ceph files got deleted so we ignore the error.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit cb57a359ba)
2017-01-31 07:35:26 -06:00
Sébastien Han 38d25ba3ee purge: fix ubuntu purge when not using systemd
We now rely on the cli tool ceph-detect-init, which tells us the init
system in use on the distribution. We do this instead of the previous
lookup for systemd unit files to call the right task depending on the
init system.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit e371bd591c)
2017-01-31 07:35:15 -06:00
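
The dispatch described in the commit above can be sketched roughly as follows. This is an illustrative Python sketch, not the playbook's actual tasks: the function name `purge_task_for` and the step labels are hypothetical, and it only assumes that ceph-detect-init prints a single init system name such as systemd, upstart, or sysvinit.

```python
# Hypothetical mapping from init system name to a purge step label.
# The labels are made up for illustration; they are not ceph-ansible task names.
PURGE_STEPS = {
    "systemd": "stop ceph units with systemctl",
    "upstart": "stop ceph jobs with initctl",
    "sysvinit": "stop ceph with the init script",
}


def purge_task_for(detect_init_output: str) -> str:
    """Pick a purge step from the (assumed) output of ceph-detect-init."""
    # Command output usually ends with a newline, so strip it first.
    init_system = detect_init_output.strip()
    if init_system not in PURGE_STEPS:
        raise ValueError(f"unsupported init system: {init_system!r}")
    return PURGE_STEPS[init_system]
```

Keying on the reported init system rather than on the presence of unit files is what lets an Ubuntu host without systemd take the right branch.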
Sébastien Han 15b89984aa purge: allow purge to run multiple times
with_items is evaluated before the when, so in a second run where the
variable is empty it will fail with "'dict object' has no attribute
'stdout_lines'". To fix this we add a default array so with_items does
not fail and the task is skipped by the when.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 0e2e270ab2)
2017-01-31 07:35:00 -06:00
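
The idea behind the default array in the commit above can be shown with a small Python analogy. It is only a sketch: the function names are hypothetical, and a plain dict stands in for a registered Ansible result that may lack 'stdout_lines' on a second run.

```python
def lines_to_purge(register_result: dict) -> list:
    """Return the items to loop over, or an empty list if none were recorded."""
    # Similar in spirit to supplying a default: a missing key yields [].
    return register_result.get("stdout_lines", [])


def purge(register_result: dict) -> list:
    """Loop over the recorded lines and report which ones were handled."""
    removed = []
    # The loop source is resolved first, so it must never raise on a missing key.
    for item in lines_to_purge(register_result):
        removed.append(item)
    return removed
```

With an empty result the loop simply has nothing to iterate over, which mirrors the task being skipped instead of failing.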
Sébastien Han fa0fdee2c0 osd: make sure osd directory exists
Sometimes users, for testing, delete the whole /var/lib/ceph and
then run ansible again. The OSDs will never come up if we do not create
their directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6f53774ee9)
2017-01-31 07:34:50 -06:00
Andrew Schoen 1b97602776 Merge pull request #1255 from ceph/backport-1250
Backport: 'CI testing updates'
2017-01-27 11:22:45 -06:00
Andrew Schoen b1f7cf6215 tests/purge_cluster: setup a xenial cluster instead of centos7
The purge_dmcrypt scenario also tests centos7, so change this one to
xenial so we can have more test coverage.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 2a87c13f17)
2017-01-27 07:56:41 -06:00
Andrew Schoen bbf426c5d8 purge-cluster: fix failure when raw_multi_journal is not defined
Because the purge-cluster.yml playbook does not have access to the roles'
default vars, we cannot be sure that raw_multi_journal is defined. For
example, if this was purging a dmcrypt journal then raw_multi_journal
might not be defined at all in group_vars/all.yml or
group_vars/osds.yml.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit d3cb8dba4e)
2017-01-27 07:56:31 -06:00
Andrew Schoen 03d229dcff purge-cluster: fix syntax when deleting dmcrypt devices
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit b2a6f095f1)
2017-01-27 07:56:18 -06:00
Andrew Schoen 94fae06e63 tests: adds purge_cluster and purge_dmcrypt scenarios
This also removes the purge_cluster_collocated scenario as it's not
needed now because of purge_cluster.

Moving all the purge commands into their own section allows for ease of
reuse when creating new purge scenarios.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit e05df64fd0)
2017-01-27 07:56:08 -06:00
Andrew Schoen 15cb1e21ba tests/journal_collocation: adds testing values to ceph_conf_overrides
This gives test coverage to changes introduced in:

https://github.com/ceph/ceph-ansible/pull/1214

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 20705aa35a)
2017-01-27 07:55:57 -06:00
Andrew Schoen 75391b4a2a Merge pull request #1251 from ceph/backport-1214
Backport: 'mon: make sure osd_pool_default_size is honoured'
2017-01-26 19:16:53 -06:00
Sébastien Han a3328a9e19 mon: make sure osd_pool_default_size is honoured
This patch makes sure we set the proper pool size on the rbd pool.
Usually during bootstrap the rbd pool size is not honoured so we need to
add this workaround.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit e35070f6ce)
2017-01-26 15:43:23 -06:00
Sébastien Han 676749e94b Merge pull request #1249 from ceph/backport-1235
Backport of purge cluster updates
2017-01-25 23:26:59 +01:00
Sébastien Han 8f17b6d2a6 purge: remove dm-crypt devices
When running encrypted OSDs, an encrypted device mapper is used (it is
created by the cryptsetup tool). So before attempting to remove all the
partitions on a device we must delete all the encrypted device mappers,
then we can delete all the partitions.

Signed-off-by: Sébastien Han <seb@redhat.com>

(cherry picked from commit 73ca1a7a00)

Resolves: backport#1235
2017-01-25 16:24:20 -06:00
Sébastien Han 3b665390bd purge: remove zap_block_devs variable
The name of this variable was a bit confusing since its activation will
zap all the block devices no matter which osd scenario we are using.
Removing this variable and applying a condition on the OSD scenario is
now feasible and easier since we import group_vars variable files for
OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit adeb3decf3)

Resolves: backport#1235
2017-01-25 16:24:08 -06:00
Sébastien Han 95b52ad6af purge: cosmetic cleanup
Just applying our writing syntax convention in the playbook.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit b7fcbe5ca2)

Resolves: backport#1235
2017-01-25 16:23:50 -06:00
Sébastien Han 325d3ec1dc Merge pull request #1248 from ceph/backport-1247
Backport: Adds ip_version configuration option
2017-01-24 18:30:48 +01:00
Andrew Schoen 6d01936a2a Adds ip_version configuration option
This allows the user to set ip_version to either ipv4 or ipv6. This
resolves a bug where monitor_address is set to an ipv6 address, but the
template fails to render because it's hardcoded to look for an 'ipv4'
key in the ansible facts.

See: https://bugzilla.redhat.com/show_bug.cgi?id=1416010

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

Resolves: bz#1416010
(cherry picked from commit 03cb803bd1)
2017-01-24 11:29:17 -06:00
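
A rough Python sketch of the lookup the commit above describes: choosing the address family from a setting instead of hardcoding 'ipv4'. The function name and the shape of the facts dict are assumptions made for illustration (including the possibility that one family is stored as a list of entries); the real template and fact layout may differ.

```python
def monitor_address(interface_facts: dict, ip_version: str = "ipv4") -> str:
    """Return the address for the requested family from a facts-like dict."""
    if ip_version not in ("ipv4", "ipv6"):
        raise ValueError("ip_version must be 'ipv4' or 'ipv6'")
    # Look up the key named by ip_version rather than a hardcoded 'ipv4'.
    entry = interface_facts[ip_version]
    # Assumption: a family may hold one mapping or a list of mappings.
    if isinstance(entry, list):
        entry = entry[0]
    return entry["address"]
```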
Andrew Schoen 79e7a23890 Merge pull request #1245 from ceph/backport-1146
backport of PR #1146
2017-01-23 10:59:05 -06:00
Andrew Schoen be3b5b4251 tests/xenial_cluster: adds a client node
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit a774ea025d)
2017-01-23 10:19:12 -06:00
Sébastien Han 42d7092c3a test: add tests for the client role
Here we test the client role.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 48a8cd1a43)
2017-01-23 10:18:56 -06:00
Sébastien Han bc5a73df51 mon: fix mds pool creation
It is not enough to check that the mds variable exists; it always does
because we declare it. So we need to make sure that there is an
mds host.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 90648e7518)
2017-01-23 10:18:42 -06:00
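
The distinction made in the commit above, a declared variable versus an actual host, can be sketched in Python. The function name, the inventory-like dict, and the group name "mdss" are illustrative assumptions, not taken from the playbook.

```python
def should_create_mds_pools(groups: dict, mds_group_name: str = "mdss") -> bool:
    """True only when the mds group contains at least one host."""
    # A declared but empty group is not enough; at least one host must be listed.
    return len(groups.get(mds_group_name, [])) > 0
```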
Sébastien Han dfd3c7662f mon: pool creation and pgs
Since we introduced config_overrides we removed a lot of options from
the default template. In some cases, like the mds pool, openstack pools, etc.,
we need to know the number of PGs required. The idea here is to skip the
task if ceph_conf_overrides.global.osd_pool_default_pg_num is not defined
in your `group_vars/all.yml`.

Closes: #1145

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddac3a1fb5)
2017-01-23 10:18:29 -06:00
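
The skip condition in the commit above amounts to a nested lookup that tolerates missing keys. A minimal Python sketch, assuming the overrides are available as a plain nested dict; the function names are hypothetical.

```python
from typing import Optional


def default_pg_num(ceph_conf_overrides: dict) -> Optional[int]:
    """Return osd_pool_default_pg_num from the global section, or None if unset."""
    value = ceph_conf_overrides.get("global", {}).get("osd_pool_default_pg_num")
    return None if value is None else int(value)


def should_create_pools(ceph_conf_overrides: dict) -> bool:
    """Pool creation is skipped when no default PG count has been provided."""
    return default_pg_num(ceph_conf_overrides) is not None
```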
Alfredo Deza 65be76e3d7 Merge pull request #1237 from ceph/ceph-docker-common-backport
ceph-docker-common backport
2017-01-18 10:51:29 -05:00
Andrew Schoen f8d18253b5 ceph-osd: use ceph_docker_registry when preparing OSDs
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 0c55a35963)
2017-01-18 09:15:32 -06:00
Andrew Schoen e42721edf0 add ceph_docker_registry to all.docker.yml.sample
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit e0d73b5245)
2017-01-18 09:15:21 -06:00
Andrew Schoen 283af6ca60 use ceph_docker_registry when starting containers
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 655b8449ae)
2017-01-18 09:15:03 -06:00
Andrew Schoen d61d8ba3a5 ceph-docker-common: add symlink to ceph.ceph-docker-common
This allows for the role to be used with ansible-galaxy and to fix the
include in all the meta/main.yml files in the roles.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 3713824b79)
2017-01-18 09:14:51 -06:00
Andrew Schoen 12d9d2dca5 use ceph_docker_registry in all the roles instead of docker.io
This allows for ceph-ansible to use other docker registries.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 9449dbf083)
2017-01-18 09:14:24 -06:00
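
As a hedged illustration of the commit above, here is how an image reference might be assembled from a configurable registry instead of a fixed docker.io prefix. The function name, the image name used in the usage note, and the "latest" tag default are assumptions for the example only.

```python
def image_reference(registry: str, image: str, tag: str = "latest") -> str:
    """Build 'registry/image:tag' from a configurable registry host."""
    # Drop a trailing slash so the result never contains a double slash.
    return f"{registry.rstrip('/')}/{image}:{tag}"
```

For instance, `image_reference("registry.example.com", "ceph/daemon")` would point at a private registry, while passing "docker.io" keeps the previous behaviour.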
Andrew Schoen 77a8a1f71b ceph-common: include ceph_docker_registry when fetching the image
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 25277587fa)
2017-01-18 09:14:09 -06:00
Andrew Schoen d8a692b61c use ceph-docker-common in roles that support docker deployments
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit c07b7ddbaa)
2017-01-18 09:13:55 -06:00
Andrew Schoen 752a6c1e76 ceph-docker-common: a new role to share things common to docker
We can use this to share common variables and tasks needed for every
containerized deployment.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit f770780dda)
2017-01-18 09:13:38 -06:00
Andrew Schoen 3cc3a8b865 Merge pull request #1233 from ceph/bump-ansible
Bump ansible to 2.2.1
2017-01-17 12:17:31 -06:00
Alfredo Deza 3f55a2d185 tests: bump ansible testing version to 2.2.1 for the 2.2 environment
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 1a4886a561)
2017-01-17 12:46:34 -05:00
Andrew Schoen 30e3450465 Merge pull request #1231 from ceph/purge-cluster-fixes
Purge cluster fixes
2017-01-17 11:33:37 -06:00
Andrew Schoen 296a19b2b3 tests: copy purge-cluster.yml to root of ceph-ansible
There is an Ansible bug which makes the playbook fail when we are
running a playbook from a directory other than the git root. The real
problem is that the ansible.cfg is not honoured and we are including
variables from roles/<role>/defaults/main.yml

The fix is to copy the purge cluster playbook to the git root directory
and execute it.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 48ac9579b6)
2017-01-17 11:45:33 -05:00
Andrew Schoen 70bb86a884 purge-cluster: do not include ceph-osd and ceph-common defaults for osds
When purging OSDs we do not need to include these defaults as nothing in
the following tasks uses them. Also, it has the side effect of
overwriting any variables defined in group_vars files that are relative
to the inventory you are using with the default values. That behavior
was causing the CI tests to fail.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit dd8389cdf7)
2017-01-17 11:43:58 -05:00
Andrew Schoen 7049152ac7 tests: adds a purge_cluster_collocated scenario
This scenario brings up a 1 mon 1 osd cluster using journal collocation,
purges the cluster and then verifies it can redeploy the cluster.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 0ce18daa49)
2017-01-17 09:48:30 -05:00
Andrew Schoen 67a9381a46 purge-cluster: get journal partitions after zapping osd disks
In my testing zapping the osd disks deleted the journal
partitions, making the 'zap ceph journal partitions' task fail because
the partitions it found previously do not exist anymore.

This moves the task that finds the journal partitions after 'zap osd disks'
to catch any partitions ceph-disk might have missed.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 321cea8ba9)
2017-01-17 09:48:22 -05:00
Andrew Schoen 67c24cfed0 purge-cluster: use ignore_errors: true when including group_vars files
Using failed_when will still throw an exception and stop the playbook if
the file you're trying to include doesn't exist.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit c9e5914377)
2017-01-17 09:48:10 -05:00
Andrew Schoen 3162ff1753 Merge pull request #1219 from ceph/rhcs-mds-repo-2.1
common: enable tool repo for mds install of rhcs
2017-01-05 18:46:34 -06:00
Sébastien Han 3f8b1fecf7 common: rename rh_storage to rhcs to match product name
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit d44927de03)
2017-01-05 16:16:27 -07:00
Sébastien Han 1a4e3ab5f3 common: enable tool repo for mds install of rhcs
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1405985

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 775d61ed09)
2017-01-05 16:16:21 -07:00
Andrew Schoen 67ab28a037 Merge pull request #1218 from ceph/ceph-common-tag-2.1
ceph-common: always include release.yml
2017-01-05 16:58:32 -06:00
Ken Dreyer b29f49ca71 ceph-common: always include release.yml
Prior to this change, a playbook run with '--tags' or '--skip-tags'
would fail, because the ceph-common role would not include the
release.yml task, and this file defines critical things like
ceph_release.

Thanks Andrew Schoen <aschoen@redhat.com> for help with the fix.

(cherry picked from commit 63e5b5c406)
2017-01-05 15:29:09 -07:00
Andrew Schoen 99d66e09d9 Merge pull request #1153 from ceph/cluster-name-test
test: add cluster name support test scenario
2016-12-16 13:10:52 -06:00
Sébastien Han 2d8ac4a586 docker: only use systemd to manage containers
Prior to this patch we had several ways to run containers: we could use
ansible's docker module on some distros, and on container distros we were
using systemd. We strongly believe treating containers as services with
systemd is the right approach, so this patch generalizes it to all the
distros. These days most of the distros are running systemd so it's a fair
assumption.

Signed-off-by: Sébastien Han <seb@redhat.com>
2016-12-16 19:37:05 +01:00
Sébastien Han ce7431a227 docker: add support for cluster name
We need to honour the cluster name that was chosen by ceph-ansible and
pass it to ceph-docker.

Signed-off-by: Sébastien Han <seb@redhat.com>
2016-12-16 14:31:21 +01:00