4293 Commits (b1fa3c881cf9f1c797083a95d9eb99493b887d8b)
 

Author SHA1 Message Date
Guillaume Abrioux 1877e1b330 tests: run lvm_setup.yml only when osd_scenario is lvm
especially for ooo_collocation scenario which is still using ceph-disk
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 00:33:10 +01:00
Guillaume Abrioux 2abde600cd tests: add nodes for container-all_daemons scenario
add back iscsigw and rbdmirror vm in all_daemons testing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Noah Watkins b8c39d7613 Add a ceph-volume aware shrink-osd playbook
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit f5dacbf7de)
2019-01-30 14:58:59 +01:00
Noah Watkins 8f57a95048 Rename ceph-disk version of shrink-osd playbook
This will be replaced by a ceph-volume aware version.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 0782cfc546)
2019-01-30 14:58:59 +01:00
Guillaume Abrioux 802e692b7b tests: specify docker params for shrink-osd
Otherwise, it will go with the default values, eg:

"latest" for `ceph_docker_image_tag`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Noah Watkins fc6bae26ac Fixup shrink_osd[_container] scenario config
** configuration seems to be for filestore:

[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes

** Removing `radosgw_interface: eth1` to resolve:

The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'

The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.

The offending line appears to be:

  - name: set_fact _radosgw_address to radosgw_interface - ipv4
    ^ here

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 50255b9640)
2019-01-30 14:58:59 +01:00
Guillaume Abrioux 299baed635 tests: refact testing in stable-3.2
Apply the same refact recently introduced in master to stable-3.2

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Guillaume Abrioux af17e0dfbb override ceph_release with ceph_stable_release
when `ceph_origin` is set to `'repository'` and `ceph_repository` to
`'community'`, we need to ensure `ceph_release` reflects
`ceph_stable_release`.

4a3f180f9d simply removed the override,
whereas it should only be applied when the condition mentioned
above is satisfied.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bfefdd5bc)
2019-01-24 14:18:34 +00:00
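The condition described in this commit can be sketched roughly as follows. This is only a Python illustration of the decision logic, not the Ansible task itself; the function name `effective_ceph_release` and the example release names are assumptions made for the sketch.

```python
def effective_ceph_release(ceph_origin, ceph_repository,
                           ceph_release, ceph_stable_release):
    """Hypothetical helper: pick the release name to use.

    The override is applied only when both conditions named in the
    commit message hold; otherwise ceph_release is left untouched.
    """
    if ceph_origin == "repository" and ceph_repository == "community":
        return ceph_stable_release
    return ceph_release


# Override applies: community repository install.
print(effective_ceph_release("repository", "community", "dummy", "luminous"))
# Override does not apply: any other origin keeps ceph_release as-is.
print(effective_ceph_release("distro", "community", "mimic", "luminous"))
```

The point of the sketch is that the override is conditional rather than removed outright.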
Guillaume Abrioux e29cdd0a61 config: remove code related to ceph release prior to luminous
This part of the code is not needed since ceph-ansible@master is
intended to deploy ceph@master only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bbdde272f)
2019-01-24 14:18:34 +00:00
Guillaume Abrioux eaa92f7e55 ceph-default: rm useless condition
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

(cherry picked from commit e9188cd202)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 14:18:34 +00:00
Giulio Fidente 75855b2d58 Preserve rolling_update backward compatibility with ansible < 2.5
Let's enforce the default value for `client_update_batch` to 20 since
`ansible_forks` isn't always available.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184

Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit ff8dbe114c)
2019-01-21 14:28:07 +00:00
Guillaume Abrioux 44afe16568 Vagrantfile: remove useless default values
Those default values are useless and might cause issues.

- `osd_scenario` should be mandatory anyway.
- `pool_default_size` is not used anymore (this has been refactored
recently).

Closes: #3468

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c7a929b2dc)
2019-01-21 10:22:18 +01:00
Noah Watkins e57e2d98a1 start_osds: use list instead of keys (re-introduce)
the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was undone a few days later by:

  82a6b5adec

and this patch fixes it again :)

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 3cf5fd2c3e)
2019-01-16 15:48:35 +00:00
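As general background on why a list is preferred over `keys()` under Python 3, here is a minimal sketch. It is not the playbook's code; the `osd_ids` dictionary is hypothetical sample data.

```python
# Hypothetical sample data, standing in for a parsed JSON mapping.
osd_ids = {"0": {"type": "block"}, "1": {"type": "block"}}

# In Python 3, keys() returns a view object, not a list.
keys_view = osd_ids.keys()
as_list = list(osd_ids)

# A keys view cannot be indexed, while a list can.
try:
    keys_view[0]
    indexable = True
except TypeError:
    indexable = False

print(type(keys_view).__name__)  # the view type, not "list"
print(indexable)
print(as_list[0])
```

Converting explicitly to a list behaves the same on Python 2 and Python 3, which is why it is the safer choice.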
Brad Hubbard 53650b2fc4 site: Make sure is_atomic is defined
configure_firewall tests the is_atomic variable if the firewalld package
is not present. is_atomic is defined in ceph_facts so include that.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 55fab6f547)
2019-01-15 15:16:40 +00:00
Sébastien Han a08d4c072f ceph-facts: resync group_vars file
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-14 18:54:46 +00:00
Sébastien Han 04d8002614 switch: do not fail on missing key
Some people use the switch playbook to perform upgrades, so they end up in
the same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1650572.
This applies the same fix as
729744c6a8.

We don't want to fail on keys that are not present since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-14 18:54:46 +00:00
Rishabh Dave 4e94d11aa7 ceph-infra: remove ntp_rpm.yml and ntp_debian.yml
This commit fixes the merge conflict that occurred during the
auto-backport and auto-merge of the commit
488281187e.

Also please note that the commit
488281187e was merged (on PR 3477)
"as is" (despite merge conflicts), which ideally should not have
happened. As a side effect, the feature of supporting
multiple NTP daemons (the new ones being chronyd and timesyncd) was
also backported, which is itself against the convention. For
consistency's sake the feature was backported to stable-3.1 as well.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-09 22:15:18 +01:00
Guillaume Abrioux 416b503476 introduce new role ceph-facts
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speedup
the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0eb56e36f8)
2019-01-07 09:14:10 +01:00
Guillaume Abrioux c3bb76b8e9 purge-container: move facts gathering after ceph-defaults role import
This task has to be called after the role `ceph-defaults` has been
played, otherwise, `mon_group_name` will never be known.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a12de3e048)
2019-01-07 09:14:10 +01:00
Guillaume Abrioux b9bf7c6703 purge-container: fix wrong syntax
we want a default value for `mon_group_name`, not for
`groups[mon_group_name]`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0b3cb7f85)
2019-01-07 09:14:10 +01:00
Guillaume Abrioux 0ff1260fc1 purge-docker: do not call ceph-osd role
calling ceph-osd role in purge playbook is not needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ae7f3d66a6)
2019-01-07 09:14:10 +01:00
Guillaume Abrioux c405fd1140 purge: gather monitors facts in OSD purge
the OSD part of the purge delegates commands to a monitor node; we need to
gather monitors facts to know the `ansible_hostname` fact that is used
in the `docker_exec_cmd` fact.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1a4a6ec855)
2019-01-07 09:14:10 +01:00
Sébastien Han 37ba313d76 purge-container: gather fact before calling ceph-defaults
ceph-defaults relies on facts so we must gather facts before running it.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 62111ff53c)
2019-01-07 09:14:10 +01:00
Sébastien Han 8e83ecfce1 purge-cluster: add support for mon/mgr collocation
Recently we introduced the default collocation of mon/mgr without the
need of a dedicated mgrs section. This means we have to stop the mgr
process on that machine too.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit fc6ebd8ebb)
2019-01-07 09:14:10 +01:00
Sébastien Han 12d6466582 purge-cluster: remove support for other init system
We only support systemd and use the service module anyway.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 3a154fa0ad)
2019-01-07 09:14:10 +01:00
Sébastien Han 782959f094 purge-docker-cluster: add support for mgr/mon collocation
Recently we introduced the collocation of mon and mgr by default, so we
don't need to have an explicit mgrs section for this. This means we have
to remove the mgr container on the mon machines too.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 325a159415)

# Conflicts:
#	infrastructure-playbooks/purge-docker-cluster.yml
2019-01-07 09:14:10 +01:00
Sébastien Han 8ce8d580a4 purge-docker-cluster: add a task to check hosts
It's useful when running on CI to see what might remain on the machines.
So we list all the containers and images. We expect the list to be
empty.

We fail if we see containers running.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 2bcc00896f)
2019-01-07 09:14:10 +01:00
Sébastien Han f37c21a9d0 purge-docker-cluster: add ceph-volume support
This commit adds support for purging clusters that were deployed
with ceph-volume. It also cleanly separates, with a block instruction, the
work to do depending on whether lvm is used or not.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 1751885bc9)
2019-01-07 09:14:10 +01:00
Bruceforce 5c618d7084 The nfs_ganesha_dev_apt_repo variable was set incorrectly in task
"fetch nfs-ganesha development repository"
This has to be pushed directly to stable-3.2 since master has diverged

Signed-off-by: Bruceforce <Bruceforce@users.noreply.github.com>
2019-01-07 08:03:19 +00:00
Rishabh Dave b2024899b9 ceph-infra: disable unrequired NTP services
When one of the currently supported NTP services has been set up,
disable the rest of the NTP services on Ceph nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 6fa757d343)
2019-01-04 13:52:19 +00:00
Rishabh Dave 488281187e ceph-infra: merge ntp_debian.yml and ntp_rpm.yml
Merge ntp_debian.yml and ntp_rpm.yml into one (the new file is called
setup_ntp.yml) since they are almost identical. Also avoid repetition
of the common setup step for ntpd and chronyd services.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit b03ab60742)

# Conflicts:
#	roles/ceph-infra/tasks/ntp_debian.yml
#	roles/ceph-infra/tasks/ntp_rpm.yml
2019-01-04 13:52:19 +00:00
Sébastien Han 668c7a4db7 fix json data type
JSON is a data format that is always passed as a string, whereas before
this we were declaring a dict, which is not a valid JSON structure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1663026

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 896676ee80)
2019-01-04 12:02:34 +01:00
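The distinction this commit relies on, between a dict and a JSON string, can be illustrated with a short Python sketch. The `payload` contents are hypothetical and unrelated to the actual variable that was fixed.

```python
import json

# Hypothetical data; a dict is an in-memory structure, not JSON text.
payload = {"size": 3, "enabled": True}

# json.dumps() serializes the dict into a JSON-formatted string.
as_json = json.dumps(payload)

# The plain string form of a dict uses Python syntax (single quotes,
# True/False), which a JSON parser rejects.
try:
    json.loads(str(payload))
    dict_repr_is_json = True
except json.JSONDecodeError:
    dict_repr_is_json = False

print(as_json)
print(dict_repr_is_json)
```

A consumer that expects JSON needs the serialized string, not the dict or its default string representation.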
Guillaume Abrioux dc02156736 update: do not enforce `serial: 1` on client nodes
There is no need to enforce `serial: 1` on client nodes.
Let's make it parameterizable by introducing a new *extra* variable
`client_update_batch`, if not filled this will default to `{{
ansible_forks }}`.

NOTE: this is only usable as an extra variable passed with
`-e client_update_batch=<num>`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 268f2cef82)
2019-01-04 11:59:02 +01:00
Rishabh Dave 6e2cd0930f set any_errors_fatal to true for all host sections
Add `any_errors_fatal: true` to all host sections in `site.yml.sample`
and `site-container.yml.sample` so that the playbook execution
stops immediately when an error occurs.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5f43dae593)
2018-12-20 15:04:41 +01:00
Kai Wembacher e2852eb40e add support for rocksdb and wal on the same partition in non-collocated
Signed-off-by: Kai Wembacher <kai@ktwe.de>
(cherry picked from commit a273ed7f60)
2018-12-20 14:21:14 +01:00
Sébastien Han 3ed5de5cd1 purge: tox add lvm-setup
Since we deploy > purge > deploy the LVs are gone so we must recreate
them.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 656fbd2901)
2018-12-20 14:03:30 +01:00
Andrew Schoen e55ec6c0f5 purge-cluster: skip tasks that use ceph-volume if it's not installed
This will allow the playbook to be idempotent.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1656935

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ffd56177e7)
2018-12-20 14:03:30 +01:00
Noah Watkins e3f2f5e926 ceph_keys: pass in module for error messages
fixes: #3421

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 114fac15dc)
2018-12-17 19:58:33 +00:00
Sébastien Han e7a85d301d RELEASE-NOTE: fix PR links
Fix wrong position of link and names. The format is [name](link).

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-10 10:02:36 +01:00
Sébastien Han 7a13649cd5 Add release note for stable-3.2
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-12-06 14:14:32 +00:00
Guillaume Abrioux 3764e52ca9 tests: reintroduce purge_cluster scenario
- reintroduce `purge_cluster_container` and `purge_cluster_non_container`
on `stable-3.2`,
- remove all purge scenario based on ceph-disk,
- remove purge_lvm_osds_* scenarios.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-04 18:04:13 +01:00
Guillaume Abrioux 408597f231 tests: add purge_lvm_osds_container scenario
This commit adds the purge_lvm_osds_container scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b04fe72f35)
2018-12-04 18:04:13 +01:00
Guillaume Abrioux e37a90b5ec purge: add iscsi support
add iscsi support for both non containerized and containerized
deployment in purge playbooks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 78116fa6db)
2018-12-04 18:04:13 +01:00
Guillaume Abrioux c3a2320e01 revert infra: don't restart firewalld if unit is masked
If firewalld unit is masked, setting `configure_firewall: false` is
enough

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655059

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1cff1f9806)
2018-12-04 17:31:31 +01:00
Ramana Raja 0ec2ac34e3 rolling_update: fail if less than 3 MONs
... for non-containerized deployments as well.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655470

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit cb784c601d)
2018-12-04 16:34:57 +01:00
Sébastien Han 50fe56044e disable nfs scenario
The packages are broken, so let's remove it until this is solved.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit a502327e52)
2018-12-04 14:39:05 +00:00
Sébastien Han fa8bd10cac test: disable nfs for containers
Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages or reliable repository, we disable nfs
ganesha temporarily.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6c3ef90ebe)
2018-12-04 14:39:05 +00:00
Sébastien Han 8d1c67beb2 osd: discover osd_objectstore on the fly
Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.

Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activation in this case detects the osd objectstore, which
prevents failures like the one described above.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 4c51130198)
2018-12-04 09:01:50 +00:00
Sébastien Han 1151521784 ceph-osd: change jinja condition
If an existing cluster runs this config, and has ceph-disk OSD, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit bef522627e)
2018-12-04 09:01:50 +00:00
Sébastien Han 729744c6a8 rolling_update: do not fail on missing keys
We don't want to fail on keys that are not present since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit ebc901c6af)
2018-12-03 13:03:33 +01:00