344 Commits (81de8a81066febb6b7ea93415a44e8473eacc13f)

Author SHA1 Message Date
Guillaume Abrioux 64659d2c82 iscsi: assign application (rbd) to pool 'rbd'
If we don't assign the rbd application tag to this pool,
the cluster goes into a `HEALTH_WARN` state like the following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4cf17a6fdd)
2019-06-13 14:43:25 +02:00
Dimitri Savineau 8a74928a19 tox: Refact lvm_osds scenario
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use a single scenario to test filestore and bluestore, with and
without dmcrypt, on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 52b9f3fb28)
2019-05-10 11:24:32 +02:00
Guillaume Abrioux 5053f32c15 osds: allow passing devices by path
ceph-volume didn't work when the devices were passed by path.
Since it now supports this, let's allow the feature in ceph-ansible.

Closes: #3812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f2c45dfd3)
2019-05-09 14:21:43 +02:00
Dimitri Savineau f3785ef7dd tests: Add debug to ceph-override.json
It's useful to enable debug-level logging so that developers have
more information.
Also reindent the JSON file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d25af1b872)
2019-04-11 15:38:14 +00:00
Dimitri Savineau e3e6285aa9 tests/functional: use ceph-override.json symlink
We don't need multiple copies of ceph-override.json. We already
have a symlink to all_daemons/ceph-override.json, so we can do the
same for all scenarios.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a19054be18)
2019-04-11 15:38:14 +00:00
Ali Maredia e943288cae rgw multisite: add more than 1 rgw to the master or secondary zone
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 37f46a8c5d)
2019-04-06 08:50:30 +00:00
Guillaume Abrioux f200f1ca87 tests: refact update scenario (stable-3.2)
Refactor the update scenario as was done in master.
(see f0e616962)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-01 16:35:24 +02:00
Guillaume Abrioux 005cb09ba9 tests: add mgr and nfs nodes in all_daemons
Even if they are not used, we need to fire up those VMs to be able to
perform the upgrade in the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-28 15:40:43 +01:00
Guillaume Abrioux 224bab0d70 tests: add mgrs section in non_container-collocation
No mgrs are deployed in this scenario, causing the testinfra jobs to
fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-05 10:49:45 +01:00
Guillaume Abrioux 36fafadc67 tests: fix collocation scenario
ceph_origin and ceph_repository are mandatory variables.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-05 10:49:45 +01:00
Guillaume Abrioux b7f5233d07 tests: add lvm bluestore dmcrypt support
Add coverage for container / non container lvm bluestore dmcrypt OSDs

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 207fae38d4)
2019-02-28 13:48:39 +00:00
Guillaume Abrioux 15b1f22ca3 tests: do not deploy iscsigw on ubuntu
iscsigw is not supported on non-RHEL-based distributions.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 14:48:21 +01:00
Guillaume Abrioux 2738a945a3 tests: add inventory file
add missing inventory file for ubuntu-container-all_daemons job

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 14:48:21 +01:00
Guillaume Abrioux 1877e1b330 tests: run lvm_setup.yml only when osd_scenario is lvm
especially for ooo_collocation scenario which is still using ceph-disk
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 00:33:10 +01:00
Guillaume Abrioux 2abde600cd tests: add nodes for container-all_daemons scenario
add back iscsigw and rbdmirror vm in all_daemons testing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Noah Watkins fc6bae26ac Fixup shrink_osd[_container] scenario config
** configuration seems to be for filestore:

[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes

** Removing `radosgw_interface: eth1` to resolve:

The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'

The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.

The offending line appears to be:

  - name: set_fact _radosgw_address to radosgw_interface - ipv4
    ^ here

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 50255b9640)
2019-01-30 14:58:59 +01:00
Guillaume Abrioux 299baed635 tests: refact testing in stable-3.2
Apply the same refact recently introduced in master to stable-3.2

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Sébastien Han 50fe56044e disable nfs scenario
The packages are broken, so let's disable the scenario until this is solved.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit a502327e52)
2018-12-04 14:39:05 +00:00
Sébastien Han fa8bd10cac test: disable nfs for containers
Based on https://github.com/ceph/ceph-container/pull/1269 and given
that there are no stable packages or reliable repository, we
temporarily disable nfs ganesha.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6c3ef90ebe)
2018-12-04 14:39:05 +00:00
Guillaume Abrioux 68b2ad11ee mon: move `osd_pool_default_pg_num` in `ceph-defaults`
The `osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When ceph-ansible is run with `--limit` on a specific group of nodes,
it fails when trying to access this variable, since the variable is
not defined for those nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d4c0960f04)
2018-11-29 01:49:05 +00:00
Guillaume Abrioux e8dd6b8993 tests: change default pools size
default pool size in our test should be explicitly set to 1

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-11-29 01:49:05 +00:00
Guillaume Abrioux 30cec03ae7 tests: do not fully override previous ceph_conf_overrides
We run an initial deployment with `osd_pool_default_size: 1` in
`ceph_conf_overrides`.
When re-running the playbook to test idempotency and handlers, we reset
`ceph_conf_overrides`. We must append the new value instead of just
overwriting the variable; otherwise, this can lead to errors in the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f290e49df8)
2018-11-29 01:49:05 +00:00
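
To illustrate the append-versus-overwrite point made in 30cec03ae7, here is a minimal Python sketch; it is not the playbook's actual code. The `merge_overrides` helper and the `mon_max_pg_per_osd` value used for the second run are assumptions chosen for illustration; only `osd_pool_default_size: 1` comes from the commit message.

```python
import copy


def merge_overrides(base, extra):
    """Return a copy of `base` with `extra` merged in, section by section."""
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides define this section: merge it instead of replacing it.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# First run: the override described in the commit message.
initial = {"global": {"osd_pool_default_size": 1}}
# Second run: a hypothetical extra setting used to trigger the handlers.
second_run = {"global": {"mon_max_pg_per_osd": 300}}

overwritten = second_run                       # plain reassignment drops the pool size
merged = merge_overrides(initial, second_run)  # keeps both settings

print("overwritten:", overwritten)
print("merged:     ", merged)
```

With plain reassignment the pool size set during the first run disappears from the second run's configuration, which is the situation the commit avoids.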
Guillaume Abrioux 133615471a tests: set pool size to 1 in ceph-override.json
Setting this value to 1 lets the CI cover the related code in the
playbook without breaking the upgrade scenarios.

Those scenarios were broken because of the check `TASK [waiting for
clean pgs...]` in rolling_update.yml: since the pool size for
`cephfs_metadata` and `cephfs_data` is updated to `2` in
`ceph-override.json` and there are not enough OSDs to honor this size,
some PGs are degraded and make that check fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3ac6619fb9)
2018-11-28 23:11:46 +01:00
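
The reasoning in 133615471a can be sketched in a few lines of Python. This is a simplified model, assuming every replica needs its own OSD (real placement depends on the CRUSH rule); the helper name and the single-OSD cluster are hypothetical, while the pool names come from the commit message.

```python
def pools_exceeding_osd_count(pool_sizes, osd_count):
    """Return, sorted, the pools whose replica count exceeds the number of OSDs."""
    return sorted(name for name, size in pool_sizes.items() if size > osd_count)


# Hypothetical single-OSD test cluster with the sizes mentioned in the commit.
pool_sizes = {"cephfs_data": 2, "cephfs_metadata": 2}
print(pools_exceeding_osd_count(pool_sizes, osd_count=1))

# With every pool size set to 1, no pool is left waiting for a second replica.
print(pools_exceeding_osd_count({name: 1 for name in pool_sizes}, osd_count=1))
```

Any pool returned by the helper would keep some PGs degraded in this model, so a "wait for clean PGs" check could never pass.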
Guillaume Abrioux f52344300a tests: add more memory for rgw_multisite scenarios
Adding more memory to the VMs for rgw_multisite scenarios could avoid
this error, which I have recently hit in the CI:

(It is worth setting 1024 MB since there are only 2 nodes in those
scenarios.)

```
fatal: [osd0]: FAILED! => {
    "changed": false,
    "cmd": [
        "docker",
        "run",
        "--rm",
        "--entrypoint",
        "/usr/bin/ceph",
        "docker.io/ceph/daemon:latest-luminous",
        "--version"
    ],
    "delta": "0:00:04.799084",
    "end": "2018-10-29 17:10:39.136602",
    "rc": 1,
    "start": "2018-10-29 17:10:34.337518"
}

STDERR:

Traceback (most recent call last):
  File "/usr/bin/ceph", line 125, in <module>
    import rados
ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-30 14:00:28 +01:00
Guillaume Abrioux 37970a5b3c tests: add rgw_multisite functional test
Add a playbook that uploads a file on the master and then tries to get
info from the secondary node. This way we can check whether the
replication is ok.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-30 14:00:28 +01:00
Guillaume Abrioux 4d464c1003 rgw: add testing scenario for rgw multisite
This will set up 2 clusters with rgw multisite enabled.
The first cluster will act as the 'master'; the second will be the
secondary one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-30 14:00:28 +01:00
Sébastien Han 1cdec4069a test_osd: dynamically get the osd container
Do not enforce the container name since this will fail when we have
multiple VMs running OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-29 15:33:12 +01:00
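
A rough Python sketch of the idea behind 1cdec4069a: instead of hard-coding one container name, select whichever OSD containers are present on the host under test. The `find_osd_containers` helper, the `ceph-osd` prefix, and the container names are assumptions for illustration, not the actual test code.

```python
def find_osd_containers(container_names, prefix="ceph-osd"):
    """Return the names that look like OSD containers, in the order given."""
    return [name for name in container_names if name.startswith(prefix)]


# Hypothetical container listings from two different VMs.
vm_a = ["ceph-mon-osd0", "ceph-osd-0"]
vm_b = ["ceph-osd-3", "ceph-mgr-osd1"]

for listing in (vm_a, vm_b):
    print(find_osd_containers(listing))
```

A test written this way does not need to know in advance which OSD lives on which VM.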
Sébastien Han 876f6ced74 test: convert all the tests to use lvm
ceph-disk is now deprecated in ceph-ansible so let's convert all the ci
tests to use lvm instead of ceph-disk.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-29 15:33:12 +01:00
Sébastien Han 2fd7da12bb test: remove ceph-disk CI tests
Since we are removing the ceph-disk tests from the CI in master,
there is no need to keep the functional tests in master anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-29 15:33:12 +01:00
Rishabh Dave ee2d52d33d allow custom pool size
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-10-22 16:00:21 +02:00
Guillaume Abrioux c47aa2e83b tests: remove unnecessary variables definition
since we set `configure_firewall: true` in
`ceph-defaults/defaults/main.yml` there is no need to explicitly set it
in `centos7_cluster` and `docker_cluster` testing scenarios.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-19 15:12:45 +02:00
Guillaume Abrioux 1f9090884e Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes"
This approach doesn't work with all scenarios because it compares the
expected number of local OSDs to the global number of OSDs found in the
whole cluster.

This reverts commit b8ad35ceb9.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-19 00:12:43 +00:00
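
The mismatch described in the revert (1f9090884e) can be shown with a small Python sketch. The node names and OSD counts are hypothetical, and `local_matches_global` only mirrors the comparison described in the message; it is not the testinfra code itself.

```python
def local_matches_global(expected_local, cluster_total):
    """Compare a node's expected OSD count with the cluster-wide count."""
    return expected_local == cluster_total


# Hypothetical layout: number of OSDs expected on each OSD node.
expected_per_node = {"osd0": 2, "osd1": 2}
cluster_total = sum(expected_per_node.values())

for node, expected in expected_per_node.items():
    print(node, local_matches_global(expected, cluster_total))

# With a single OSD node the two numbers coincide, so the comparison holds.
print(local_matches_global(2, 2))
```

As soon as OSDs are spread over more than one node, the per-node expectation can no longer equal the cluster-wide total.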
Guillaume Abrioux cb35cac926 tests: set configure_firewall: true in centos7|docker_cluster
This way the CI will cover this part of the code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-19 00:12:43 +00:00
Guillaume Abrioux b8ad35ceb9 tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes
Let's get the osd tree from the mons instead of from the osds.
This way we don't have to predict an OSD container name.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-17 17:07:25 +02:00
Guillaume Abrioux b8418ebd17 add-osds: followup on 3632b26
Three fixes:

- fix a typo in vagrant_variables that causes a networking issue for
the containerized scenario.
- add containerized_deployment: true
- remove a useless block of code: the fact docker_exec_cmd is set in
ceph-defaults which is played right after.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-17 17:07:25 +02:00
Guillaume Abrioux 3632b26005 tests: add tests for day-2-operation playbook
Adding testing scenarios for day-2-operation playbook.

Steps:
- deploy a cluster,
- run testinfra,
- test idempotency,
- add a new osd node,
- run testinfra

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-17 11:26:11 +00:00
Guillaume Abrioux 40b7747af7 remove jewel support
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-12 23:38:17 +00:00
Sébastien Han fa38b86cf8 test: fix docker test for lvm
The CI is still running ceph-disk tests upstream. So until
https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will
pass anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-12 20:33:01 +00:00
Sébastien Han 31a0438cb2 ceph_volume: refactor
This commit does a couple of things:

* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself as an author of the module

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-10 16:08:41 -04:00
Guillaume Abrioux d2ca24eca8 tests: do not install lvm2 on atomic host
We need to detect whether we are running on an Atomic host so that we
don't try to install the lvm2 package.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-10 16:08:41 -04:00
Sébastien Han 90c66a5848 ci: test lvm in containerized
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-10 16:08:41 -04:00
Sébastien Han 0735d39518 tests: osd adjust osd name
Now we use id of the OSD instead of the device name.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-10-10 16:08:41 -04:00
Guillaume Abrioux cc6f41f76a tests: fix lvm2 setup issue
Not gathering facts causes the `package` module to fail because it needs
to detect which OS we are running on to select the right package manager.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-09 16:12:54 -04:00
Alfredo Deza 3e488e8298 tests: install lvm2 before setting up ceph-volume/LVM tests
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-10-09 13:48:50 -04:00
Andrew Schoen a68c680225 tests: remove journal_size from lvm-batch testing scenario
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-10-09 10:09:50 -04:00
Sébastien Han 9fe86c2268 test: use osd_objectstore default value
Do not force filestore in our tests; use whatever the default of
osd_objectstore is.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-09-27 21:23:49 +00:00
Guillaume Abrioux 3285b47703 tests: add an RGW node on osd0 for ooo-collocation
get more coverage by adding an RGW daemon collocated on osd0.
We've missed a bug in the past which could have been caught earlier in
the CI.
Let's add this additional daemon in order to have a better coverage.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-09-24 14:35:25 +02:00
Guillaume Abrioux 3382c5226c tests: fix monitor_address for shrink_osd scenario
b89cc1746 introduced a typo. This commit fixes it

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-09-13 18:14:01 +02:00
Alfredo Deza 58b2308036 tests: use new 'num_osds' variable in tests
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-08-31 21:23:20 +00:00
Sébastien Han 7012835d2b ci: stop using different images on the same run
There is no point in mixing Atomic and CentOS hosts in the same run. So
let's run containerized scenarios on Atomic only.

This solves this error here:

```
fatal: [client2]: FAILED! => {
    "failed": true
}

MSG:

The conditional check 'ceph_current_status.rc == 0' failed. The error was: error while evaluating conditional (ceph_current_status.rc == 0): 'dict object' has no attribute 'rc'

The error appears to have been in '/home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/roles/ceph-defaults/tasks/facts.yml': line 74, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: set_fact ceph_current_status (convert to json)
  ^ here
```

From https://2.jenkins.ceph.com/view/ceph-ansible-stable3.1/job/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/37/consoleFull#1765217701b5dd38fa-a56e-4233-a5ca-584604e56e3a

What's happening here is that all the hosts except the clients are running Atomic, so here: https://github.com/ceph/ceph-ansible/blob/master/site-docker.yml.sample#L62
the condition skips all the nodes except the clients. Thus, when running ceph-defaults, the task "is ceph running already?" is skipped, but the task above needs the rc of the skipped task.
This is not an error in the playbook, it's a CI setup issue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-08-23 16:13:54 +02:00
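
As a Python analogy for the failure described in 7012835d2b: a result registered from a skipped task carries no `rc` key, so reading it directly fails. The dictionaries below are an assumed, simplified shape of such results, and `cluster_already_running` is a hypothetical defensive lookup; the commit itself fixes the CI setup rather than the playbook.

```python
# Assumed, simplified shape of a registered result when the task was skipped:
# the key point is that there is no "rc" key.
skipped_result = {"changed": False, "skipped": True}
# Assumed shape when the task actually ran and succeeded.
ran_result = {"changed": False, "rc": 0}


def cluster_already_running(result):
    """Treat a missing return code as 'not running' instead of raising."""
    return result.get("rc") == 0


try:
    skipped_result["rc"] == 0
except KeyError as exc:
    print("direct access fails:", exc)

print(cluster_already_running(skipped_result))
print(cluster_already_running(ran_result))
```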