Commit Graph

440 Commits (98783a17b32d5f746c3d1ac04986242eb01ae318)

Author SHA1 Message Date
Sébastien Han 50be3fd9e8 test: remove osd_crush_location from shrink scenarios
This is not needed since this is already covered by docker_cluster and
centos_cluster scenarios.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-08-07 16:20:13 +00:00
Guillaume Abrioux 578aa5c2d5 tests: leave an OSD node in default crush root
jewel used to create a default `rbd` pool in the default crush root
`default`, we need to have at least 1 osd to satisfy the PGs for this
created pool, otherwise the cluster will be in HEALTH_ERR state because
of `pgs stuck unclean`/`pgs stuck inactive`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-26 18:47:10 +00:00
Guillaume Abrioux 0a88bccf87 tests: followup on b89cc1746f
Update network subnets in group_vars/all

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-24 16:55:15 +02:00
Guillaume Abrioux b89cc1746f tests: do not deploy all daemons for shrink osds scenarios
Let's create a dedicated environment for these scenarios, there is no
need to deploy everything.
By the way, doing so will save some times.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-23 18:30:06 +02:00
Guillaume Abrioux 0c863a3783 tests: add support of 'ooo-collocation' scenario when testing against ceph dev
The group_vars/all file is not available on 'ooo-collocation' scenario,
it's making the `dev_setup.yml` failing because this path is hardcoded.

The idea here is to check if the pattern 'ooo-collocation' is present in
`change_dir` variable so we can set this path properly according to the
scenario being run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-20 07:47:33 +02:00
Guillaume Abrioux d8281e50f1 tests: support update scenarios in test_rbd_mirror_is_up()
`test_rbd_mirror_is_up()` is failing on update scenarios because it
assumes the `ceph_stable_release` is still set to the value of the
original ceph release, it means it won't enter in the right part of the
condition and fails.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-20 07:46:41 +02:00
Guillaume Abrioux cc71bb96cc tests: followup on #2656
34f70428 has introduced a fix using `command` module while this could
have been achieved by using `lvol` module.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-13 07:55:14 +00:00
Guillaume Abrioux 9a65ec231d tests: fix `_get_osd_id_from_host()` in TestOSDs()
We must initialize `children` variable in `_get_osd_id_from_host()`,
otherwise, if for any reason the deployment has failed and result with
an osd host with no OSD registered, we won't enter in the condition,
therefore, `children` is never set and the function tries to return
something undefined.

Typical error:
```
E       UnboundLocalError: local variable 'children' referenced before assignment
```

Fixes: #2860

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-10 13:06:23 +00:00
Guillaume Abrioux 09d795b5b7 tests: add mimic support for test_rbd_mirror_is_up()
prior mimic, the data structure returned by `ceph -s -f json` used to
gather information about rbd-mirror daemons looked like below:

```
  "servicemap": {
    "epoch": 8,
    "modified": "2018-07-05 13:21:06.207483",
    "services": {
      "rbd-mirror": {
        "daemons": {
          "summary": "",
          "ceph-nano-luminous-faa32aebf00b": {
            "start_epoch": 8,
            "start_stamp": "2018-07-05 13:21:04.668450",
            "gid": 14107,
            "addr": "172.17.0.2:0/2229952892",
            "metadata": {
              "arch": "x86_64",
              "ceph_version": "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)",
              "cpu": "Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz",
              "distro": "centos",
              "distro_description": "CentOS Linux 7 (Core)",
              "distro_version": "7",
              "hostname": "ceph-nano-luminous-faa32aebf00b",
              "instance_id": "14107",
              "kernel_description": "#1 SMP Wed Mar 14 15:12:16 UTC 2018",
              "kernel_version": "4.9.87-linuxkit-aufs",
              "mem_swap_kb": "1048572",
              "mem_total_kb": "2046652",
              "os": "Linux"
            }
          }
        }
      }
    }
  }
```

This part has changed from mimic and became:
```
  "servicemap": {
    "epoch": 2,
    "modified": "2018-07-04 09:54:36.164786",
    "services": {
      "rbd-mirror": {
        "daemons": {
          "summary": "",
          "14151": {
            "start_epoch": 2,
            "start_stamp": "2018-07-04 09:54:35.541272",
            "gid": 14151,
            "addr": "192.168.1.80:0/240942528",
            "metadata": {
              "arch": "x86_64",
              "ceph_release": "mimic",
              "ceph_version": "ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)",
              "ceph_version_short": "13.2.0",
              "cpu": "Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz",
              "distro": "centos",
              "distro_description": "CentOS Linux 7 (Core)",
              "distro_version": "7",
              "hostname": "ceph-rbd-mirror0",
              "id": "ceph-rbd-mirror0",
              "instance_id": "14151",
              "kernel_description": "#1 SMP Wed May 9 18:05:47 UTC 2018",
              "kernel_version": "3.10.0-862.2.3.el7.x86_64",
              "mem_swap_kb": "1572860",
              "mem_total_kb": "1015548",
              "os": "Linux"
            }
          }
        }
      }
    }
  }
```

This patch modifies the function `test_rbd_mirror_is_up()` in
`test_rbd_mirror.py` so it works with `mimic` and keeps backward compatibility
with `luminous`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-07-06 14:39:13 +02:00
Guillaume Abrioux f2e57a56db tests: factorize docker tests using docker_exec_cmd logic
avoid duplicating test unnecessarily just because of docker exec syntax.
Using the same logic than in the playbook with `docker_exec_cmd` allow us
to execute the same test on both containerized and non containerized environment.

The idea is to set a variable `docker_exec_cmd` with the
'docker exec <container-name>' string when containerized and
set it to '' when non containerized.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-27 07:00:14 +00:00
Guillaume Abrioux fe79a5d240 tests: refact test_all_*_osds_are_up_and_in
these tests are skipped on bluestore osds scenarios.
they were going to fail anyway since they are run on mon nodes and
`devices` is defined in inventory for each osd node. It means
`num_devices * num_osd_hosts` returns `0`.
The result is that the test expects to have 0 OSDs up.

The idea here is to move these tests so they are run on OSD nodes.
Each OSD node checks their respective OSD to be UP, if an OSD has 2
devices defined in `devices` variable, it means we are checking for 2
OSD to be up on that node, if each node has all its OSD up, we can say
all OSD are up.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-26 15:23:39 +00:00
Guillaume Abrioux 1c3dae4a90 tests: skip rgw_tuning_pools_are_set when rgw_create_pools is not defined
since ooo_collocation scenario is supposed to be the same scenario than the
one tested by OSP and they are not passing `rgw_create_pools` the test
`test_docker_rgw_tuning_pools_are_set` will fail:
```
>       pools = node["vars"]["rgw_create_pools"]
E       KeyError: 'rgw_create_pools'
```

skipping this test if `node["vars"]["rgw_create_pools"]` is not defined
fixes this failure.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-26 15:23:39 +00:00
Guillaume Abrioux f68936ca7e tests: fix *_has_correct_value tests
It might happen that the list of ips/hosts in following line (ceph.conf)
- `mon initial memebers = <hosts>`
- `mon host = <ips>`

are not ordered the same way depending on deployment.

This patch makes the tests looking for each ip or hostname in respective
lines.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-20 08:01:57 +02:00
Guillaume Abrioux 481c14455a tests: add more nodes in ooo testing scenario
adding more node in this scenario could help to have a better coverage
so we can catch more potential bugs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-18 16:44:23 +02:00
Guillaume Abrioux 21894655a7 tests: keep same ceph release during handlers/idempotency test
since `latest` points to `mimic`, we need to force the test to keep the
same ceph release when testing anything else than `mimic`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-15 11:45:51 -04:00
Guillaume Abrioux bbb8691335 tests: increase memory to 1024Mb for centos7_cluster scenario
we see more and more failure like `fatal: [mon0]: UNREACHABLE! => {}` in
`centos7_cluster` scenario, Since we have 30Gb RAM on hypervisors, we
can give monitors a bit more RAM. By the way, nodes on containerized cluster
testing scenario have already 1024Mb memory allocated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-11 23:52:15 +08:00
Sébastien Han 6035978ed9 test: only on containerized iscsi
We don't have the same service running on non-container for now, this
will change soon but for let's only run the test on container.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-11 08:34:48 +02:00
Sébastien Han 20c8065e48 ceph-iscsi: rename group iscsi_gws
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes,
using an underscore fixes the issue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Sébastien Han c00fb12497 ci: add functionnal tests for iscsi
We test if:

* packages are installed
* services are runnning
* service units are enabled

Also fix linting issues

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Sébastien Han 5ff2f03e3f ci: add iscsi test
Add iscsi CI coverage, this will now deploy iscsi gateways in container.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-08 10:21:54 +02:00
Guillaume Abrioux 28d21b4e9c tests: update ooo inventory hostfile
Update the inventory host for tripleo testing scenario so it's the same
parameters than in tripleo CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 17:26:35 +02:00
Guillaume Abrioux c94ada69e8 tests: improve mds tests
the expected number of mds daemon consist of number of daemons that are
'up' + number of daemons 'up:standby'.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-07 14:01:58 +08:00
Guillaume Abrioux f0cd4b0651 tests: skip disabling fastest mirror detection on atomic host
There is no need to execute this task on atomic hosts.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-05 15:39:37 +02:00
Guillaume Abrioux 47276764f7 tests: fix rgw tests
41b4632 has introduced a change in functionnals tests.
Since the admin keyring isn't copied on rgw nodes anymore in tests, let's use
the rgw keyring to achieve them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-05 15:24:32 +02:00
Sébastien Han 41b4632abc test: do not always copy admin key
The admin key must be copied on the osd nodes only when we test the
shrink scenario. Shrink relies on ceph-disk commands that require the
admin key on the node where it's being executed.

Now we only copy the key when running on the shrink-osd scenario.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-05 09:39:30 +02:00
Guillaume Abrioux 2cf06b515f rgw: refact rgw pools creation
Refact of 8704144e31
There is no need to have duplicated tasks for this. The rgw pools
creation should be delegated on a monitor node se we don't have to care
if the admin keyring is present on rgw node.
By the way, only one task is needed to create the pools, we just need to
use the `docker_exec_cmd` fact already defined in `ceph-defaults` to
achieve it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-06-05 15:00:20 +08:00
Erwan Velu 493f615eae ceph-defaults: Enable local epel repository
During the tests, the remote epel repository is generating a lots of
errors leading to broken jobs (issue #2666)

This patch is about using a local repository instead of a random one.
To achieve that, we make a preliminary install of epel-release, remove
the metalink and enforce a baseurl to our local http mirror.

That should speed up the build process but also avoid the random errors
we face.

This patch is part of a patch series that tries to remove all possible yum failures.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-04 08:11:35 +02:00
jtudelag 600e1e2c26 rgws: renames create_pools variable with rgw_create_pools.
Renamed to be consistent with the role (rgw) and have a meaningful name.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
jtudelag 8704144e31 Adds RGWs pool creation to containerized installation.
ceph command has to be executed from one of the monitor containers
if not admin copy present in RGWs. Task has to be delegated then.

Adds test to check proper RGW pool creation for Docker container scenarios.

Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
2018-06-04 06:23:42 +02:00
Guillaume Abrioux c68126d6fd mdss: do not make pg_num a mandatory params
When playing ceph-mds role, mon nodes have set a fact with the default
pg num for osd pools, we can simply default to this value for cephfs
pools (`cephfs_pools` variable).

At the moment the variable definition for `cephfs_pools` looks like:

```
cephfs_pools:
  - { name: "{{ cephfs_data }}", pgs: "" }
  - { name: "{{ cephfs_metadata }}", pgs: "" }
```

and we have a task in `ceph-validate` to ensure `pgs` has been set to a
valid value.

We could simply avoid this check by setting the default value of `pgs`
to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and
let to users the possibility to override this value.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 16:20:34 +02:00
Guillaume Abrioux 34f7042852 tests: resize root partition when atomic host
For a few moment we can see failures in the CI for containerized
scenarios because VMs are running out of space at some point.

The default in the images used is to have only 3Gb for root partition
which doesn't sound like a lot.

Typical error seen:

```
STDERR:

failed to register layer: Error processing tar file(exit status 1): open /usr/share/zoneinfo/Atlantic/Canary: no space left on device
```

Indeed, on the machine we can see:
```
Every 2.0s: df -h                                                                                                                                                                                                                                       Tue May 29 17:21:13 2018
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/atomicos-root  3.0G  3.0G   14M 100% /
```

The idea here is to expand this partition with all the available space
remaining by issuing an `lvresize` followed by an `xfs_growfs`.

```
-bash-4.2# lvresize -l +100%FREE /dev/atomicos/root
  Size of logical volume atomicos/root changed from <2.93 GiB (750 extents) to 9.70 GiB (2484 extents).
  Logical volume atomicos/root successfully resized.
```

```
-bash-4.2# xfs_growfs /
meta-data=/dev/mapper/atomicos-root isize=512    agcount=4, agsize=192000 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=768000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 768000 to 2543616
```

```
-bash-4.2# df -h
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/atomicos-root  9.7G  1.4G  8.4G  14% /
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-30 10:54:35 +02:00
Guillaume Abrioux 98cb6ed8f6 tests: avoid yum failures
In the CI we can see at many times failures like following:

`Failure talking to yum: Cannot find a valid baseurl for repo:
base/7/x86_64`

It seems the fastest mirror detection is sometimes counterproductive and
leads yum to fail.

This fix has been added in the `setup.yml`.
This playbook was used until now only just before playing `testinfra`
and could be used before running ceph-ansible so we can add some
provisionning tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Erwan Velu <evelu@redhat.com>
2018-05-28 22:04:35 +02:00
Guillaume Abrioux a10e73d78d tests: move cephfs_pools variable
let's move this variable in group_vars/all.yml in all testing scenarios
accordingly to this commit 1f15a81c48 so
we keep consistency between the playbook and the tests.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Guillaume Abrioux 564a662baf osds: move openstack pools creation in ceph-osd
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move openstack pools creation at the end of `ceph-osd` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Luigi Toscano 43e96c1f98 ceph-radosgw: disable NSS PKI db when SSL is disabled
The NSS PKI database is needed only if radosgw_keystone_ssl
is explicitly set to true, otherwise the SSL integration is
not enabled.

It is worth noting that the PKI support was removed from Keystone
starting from the Ocata release, so some code paths should be
changed anyway.

Also, remove radosgw_keystone, which is not useful anymore.
This variable was used until fcba2c801a.
Now profiles drives the setting of rgw keystone *.

Signed-off-by: Luigi Toscano <ltoscano@redhat.com>
2018-05-23 23:24:09 -07:00
Guillaume Abrioux a68091c923 tests: update the type for the rule used in pools
As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but
an id, therefore an `int` is expected otherwise it will fail with the
following error

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-30 08:15:18 +02:00
Sébastien Han 71efa2eaf4 ci: bump client nodes to 2
In order to test the key distribution is correct we must have 2 client
nodes.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-23 18:34:58 +02:00
Guillaume Abrioux 77831ccb7a tests: update tests for mds to cover multimds case
in case of multimds we must check for the number of mds up instead of
just checking if the hostname of the node is in the fsmap.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-04-12 18:20:58 +02:00
Sébastien Han 82589021e0 ci: fix tripleO scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Sébastien Han 2011ec3bcd ci: client copy admin key
If we don't copy the admin key we can't add the key into ceph.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Sébastien Han cf73647e7a ci: remove useless tests
These are already handled by ceph-client/defaults/main.yml so the keys
will be created once user_config is set to True.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-11 12:18:34 +02:00
Andrew Schoen 98e237d234 tests: no need to remove partitions in lvm_setup.yml
Now that we are using ceph_volume_zap the partitions are
kept around and should be able to be reused.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-04-10 14:19:21 +02:00
Sébastien Han f3caee8460 ceph-iscsi: fix certificates generation and distribution
Prior to this patch, the certificates where being generated on a single
node only (because of the run_once: true). Thus certificates were not
distributed on all the gateway nodes.

This would require a second ansible run to work. This patches fix the
creation and keys's distribution on all the nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540845
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-04 09:27:39 +02:00
John Fulton e6e6bd078a Refer to expected-num-ojects as expected_num_objects, not size
Follow up patch to PR 2432 [1] which replaces "size" (sorry if
the original bug used that term, which can be confusing) with
expected_num_objects as is used in the Ceph documentation [2].

[1] https://github.com/ceph/ceph-ansible/pull/2432/files
[2] http://docs.ceph.com/docs/jewel/rados/operations/pools
2018-03-26 15:41:51 +02:00
Sébastien Han 3ab89ab48c ci: re-arrange group_vars files
We should stop putting everything in 'all'. This is too easy and this is
error prone as well for those who are separating variables into host
type, things that you should do.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han d5f8cac820 ci: remove left over iscsi_gws file
Wrong file that is not used, only iscsi-ggw that is present is correct.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han 8000ae342e remove unsed ceph_rgw_civetweb_port variable
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han f119b25bbe client: implement proper pools creation
Just like we did for the monitor and openstack_config we now have the
ability to precisely create pools.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han e302c1baae mon: add support for erasure code pool
You can now specify type: erasure and   erasure_profile to use when
declaring the pool dictionnary.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han 4806ff4ff8 ci: test pool creation on container
On containerized scenario we also want to test pool creation.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-14 14:22:00 +01:00
Sébastien Han fc0fa48e0d test: add tests for creating crush tree
We now run tests on the newly created ceph_crush module. Now the CI will
create a specific hierarchy for the OSD.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-06 15:24:31 +00:00
Sébastien Han fd94840a6e ci: add copy_admin_key test to container scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-03-02 20:59:10 +00:00
Sébastien Han 165d9dec10 remove kernel.pid_max
This is now managed by Ceph packages.

See: https://github.com/ceph/ceph/pull/18544/files

http://tracker.ceph.com/issues/21929

Closes: https://github.com/ceph/ceph-ansible/issues/2410

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-02-23 13:57:57 +01:00
Guillaume Abrioux 4a8986459f tests: change ceph_docker_image_tag for 2nd run
The ceph-ansible upstream CI runs severals tests, including a
'idempotency/handlers' test. It means the playbook is run a first time
and then a second time with an other container image version to ensure the
handlers run properly and the containers are well restarted.
This can cause issues.
For instance, in that specific case which drove me to submit this commit,
I've hit the case where `latest` image ships ceph 12.2.3 while the `stable-3.0`
(which is the image used for the second run) ships ceph 12.2.2.

The goal of this test is not to verify we can upgrade from a specific
version to another but to ensure handlers are working even if it's a valid
failure here.
It should be caught by a test dedicated to that usecase.

We just need to have a container image which has a different id for
the upstream CI, we need the same content in container imagebut a different
image id in the registry since the test relies on image id to decide whether
the container should be restarted.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-02-23 13:54:32 +01:00
Guillaume Abrioux 707458c979 ci: add tripleo scenario testing
This should help to see earlier any failure in a tripleo deployment scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-02-23 13:54:32 +01:00
Sébastien Han 7d690878df test: add test for containers resources changes
We change the ceph_mon_docker_memory_limit on the second run, this
should trigger a restart of services.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-02-14 02:01:29 +01:00
Sébastien Han 79864a8936 test: add test for restart on new container image
Since we have a task to test the handlers we can test a new container to
validate the service restart on a new container image.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-02-14 02:01:29 +01:00
Guillaume Abrioux deaf273b25 syntax: change local_action syntax
Use a nicer syntax for `local_action` tasks.
We used to have oneliner like this:
```
local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }}
```

The usual syntax:
```
    local_action:
      module: wait_for
      port: 22
      host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
      state: started
      delay: 10
      timeout: 500
```
is nicer and kind of way to keep consistency regarding the whole
playbook.

This also fix a potential issue about missing quotation :

```
Traceback (most recent call last):
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module>
    main()
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main
    rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin)
  File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command
  File "/usr/lib64/python2.7/shlex.py", line 279, in split
    return list(lex)                                                                                                                                                                                                                                                                                                            File "/usr/lib64/python2.7/shlex.py", line 269, in next
    token = self.get_token()
  File "/usr/lib64/python2.7/shlex.py", line 96, in get_token
    raw = self.read_token()
  File "/usr/lib64/python2.7/shlex.py", line 172, in read_token
    raise ValueError, "No closing quotation"
ValueError: No closing quotation
```

writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf`
can cause trouble because it's complaining with missing quotes, this fix solves this issue.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-31 10:45:34 +01:00
Andrew Schoen cfb75b8e29 tests: remove crush_device_class from lvm tests
The --crush-device-class flag for ceph-volume is not available in luminous so lets
remove this testing option for now until it's more widely available.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-01-18 15:03:38 +01:00
Andrew Schoen 64f5772140 tests: adds crush_device_class to lvm tests
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-01-17 13:49:29 +01:00
Sébastien Han 39f2bfd5d5 fix jewel scenarios on container
When deploying Jewel from master we still need to enable this code since
the container image has such check. This check still exists because
ceph-disk is not able to create a GPT label on a drive that does not
have one.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-12-20 13:43:19 +01:00
Guillaume Abrioux ab1dd3027a client: don't try to generate keys
the entrypoint to generate users keyring is `ceph-authtool`, therefore,
it can expand the `$(ceph-authtool --gen-print-key)` inside the
container. Users must generate a keyring themselves.
This commit also adds a check to ensure keyring are properly filled when
`user_config: true`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-12-14 17:22:07 +01:00
Guillaume Abrioux aa0b1ed118 tests: remove OSD_FORCE_ZAP variable from tests
according to ceph/ceph-container#840, this variable is no longer needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-14 17:55:01 +01:00
Sébastien Han d05206236c
Merge pull request #2124 from ceph/lvm-setup-fix
test: when creating the /dev/sdc2 partition specify label as gpt
2017-10-31 16:51:16 +01:00
Andrew Schoen 37a48209cc test: when creating the /dev/sdc2 partition specify label as gpt
ansible==2.4 requires that label be set to gpt, or it will be defaulted
to msdos.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-10-31 09:38:47 -05:00
Guillaume Abrioux c28882c1cd tests: add missing test for rbd
Add a missing test `test_rbd_mirror_service_is_running_from_luminous()`.
Also using bash -c "<cmd>" to make testinfra aware that later in
the upgrade process we are now running `luminous` ceph release so we
must skip the rbd tests related to `jewel` ceph release.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-30 19:44:56 +01:00
Sébastien Han faccd0acf0 Merge pull request #2100 from ceph/lvm-bluestore
ceph-volume lvm bluestore support
2017-10-27 17:36:16 +02:00
Major Hayden f73232caa4
Use check_mode instead of always_run
This patch changes the `always_run: yes` task option to
`check_mode: no` to avoid Ansible warnings.
2017-10-25 09:53:34 -05:00
Major Hayden c2b5118c1b
Revert "Avoid deprecated always_run"
This reverts commit 620fb37dd4.
2017-10-25 09:48:09 -05:00
Alfredo Deza 027d57dd29 tests create a bluestore osd scenario
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 06:46:39 -04:00
Sébastien Han a53aa9e8b4 ci: new osd scenarios
This commit add new osd scenarios, it aims to simplify the CI setup and
brings a better coverage on the OSD scenarios.
We decided to differentiate between filestore and bluestore, thinking
ahead when filestore won't be supported anymore.
So we now have two classes of tests:

* Filestore
* Bluestore

In each of those classes we have container and non-container.
Then for each we test the following:

* collocated
* collocated dmcrypt
* non-collocated
* non-collocated dmcrypt
* auto discovery collocated
* auto discovery collocated dmcrypt

This gives us a nice coverage and also reduces the footprint on the CI.
We are now up to 4 scenarios, each containing 6 OSD VMs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-18 09:26:06 +02:00
Guillaume Abrioux 7ee9aa94b5 Merge pull request #1963 from ceph/pull-in-para
site-docker.yml try to fetch images in //
2017-10-13 19:35:11 +02:00
Sébastien Han 71d819620c mds: fix fs pool creation
1. add the variables to docker_collocation
2. trigger the check when a MDS is part of the inventory file, not when
we run on an MDS...

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 16:03:04 +02:00
Sébastien Han 90ce4276ca ci: use a container client VM
The client won't run on centos7 anymore but on Atomic host just like the
rest of the daemons.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 15:26:03 +02:00
Sébastien Han 3e058bff06 ci: reboot with ansible instead of vagrant reload
vagrant is serialized and takes a lot of time compare to simple reboot.
See the benchmarks below for 3 VMs:

[leseb@rick docker]$ time ANSIBLE_SSH_ARGS="-F
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/vagrant_ssh_config"  ansible-playbook -i /home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/hosts reboot.yml

PLAY [mons]
****************************************************************************************************************************************************************************************************

TASK [Gathering Facts]
*****************************************************************************************************************************************************************************************
ok: [mon1]
ok: [mon2]
ok: [mon0]

TASK [restart machine]
*****************************************************************************************************************************************************************************************
changed: [mon2]
changed: [mon1]
changed: [mon0]

TASK [wait for server to boot]
*********************************************************************************************************************************************************************************
ok: [mon2 -> localhost]
ok: [mon0 -> localhost]
ok: [mon1 -> localhost]

TASK [uptime]
**************************************************************************************************************************************************************************************************
changed: [mon2]
changed: [mon0]
changed: [mon1]

PLAY RECAP
*****************************************************************************************************************************************************************************************************
mon0                       : ok=4    changed=2    unreachable=0
failed=0
mon1                       : ok=4    changed=2    unreachable=0
failed=0
mon2                       : ok=4    changed=2    unreachable=0
failed=0

real    0m35.112s
user    0m5.737s
sys     0m1.849s

[leseb@rick docker]$ time vagrant reload
==> mon0: Halting domain...
==> mon0: Starting domain.
==> mon0: Waiting for domain to get an IP address...
==> mon0: Waiting for SSH to become available...
==> mon0: Creating shared folders metadata...
==> mon0: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon0: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon0: flag to force provisioning. Provisioners marked to run always
will still run.
==> mon1: Halting domain...
==> mon1: Starting domain.
==> mon1: Waiting for domain to get an IP address...
==> mon1: Waiting for SSH to become available...
==> mon1: Creating shared folders metadata...
==> mon1: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon1: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon1: flag to force provisioning. Provisioners marked to run always
will still run.
==> mon2: Halting domain...
==> mon2: Starting domain.
==> mon2: Waiting for domain to get an IP address...
==> mon2: Waiting for SSH to become available...
==> mon2: Creating shared folders metadata...
==> mon2: Rsyncing folder:
/home/leseb/reproduce-ci/tmp.zgGC7d5mIC/build/workspace/ceph-ansible/tests/functional/centos/7/docker/
=> /home/vagrant/sync
==> mon2: Machine already provisioned. Run `vagrant provision` or use
the `--provision`
==> mon2: flag to force provisioning. Provisioners marked to run always
will still run.

real    1m31.850s
user    0m7.387s
sys     0m0.796s

Reboot via Ansible: 0m35.112s
Reboot via vagrant: 1m31.850s

We save 1/3 time.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-13 09:04:26 +02:00
Guillaume Abrioux 17623a2157 Merge pull request #2036 from ceph/cephfs-pool
mds: precisely define cephfs pool
2017-10-12 17:47:10 +02:00
Sébastien Han 6bd152d555 Merge pull request #2037 from major/remove-always-run
Avoid deprecated always_run
2017-10-12 17:15:28 +02:00
Sébastien Han b49f9bda21 mds: precisely define cephfs pool
We now have a variable called ceph_pools that is mandatory when
deploying a MDS.
It's a dictionnary that contains a pool name and a PG count. PG count is
mandatory and must be set, the playbook will fail otherwise.

Closes: https://github.com/ceph/ceph-ansible/issues/2017
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-12 15:56:04 +02:00
Major Hayden 620fb37dd4
Avoid deprecated always_run
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:

    [DEPRECATION WARNING]: always_run is deprecated.

This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
2017-10-12 08:29:44 -05:00
Guillaume Abrioux a179e312fd tests: add missing override for collocation scenario
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-12 14:43:25 +02:00
Guillaume Abrioux a2880e6345 tests: rbd/rgw adapt testinfra for jewel
- the rbd-mirror unit systemd name is not the same when running jewel vs
luminous.
- servicemap is not available on jewel.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-12 00:06:08 +02:00
Guillaume Abrioux a1ea6e7f59 tests: adapt current testing for collocation scenario
Since we introduced collocation testing scenario, we need to adapt
current tests to this new scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-09 17:25:45 +02:00
Sébastien Han 6d7b73fa91 ci: re-add osd_pool_default_size to 1 with the override
If we don't do this the client will create pools with a replica 3 since
osd_pool_default_size was gone in ceph-override.json. This was making
switch_to_containers failing.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-09 17:25:45 +02:00
Sébastien Han abb8c374cf ci: use by-id instead of by-path
by-id relies on the disk WWID which is more reliable then by-path
(pointing to the PCI info)

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:39:09 +02:00
Sébastien Han b6b24a5ca9 iscsi: fix wrong group name for iscsi
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498490
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 17:25:32 +02:00
Guillaume Abrioux 53a69640c9 tests: disable shared folder
Shared folder is not required for tests.
We should avoid hitting the error :
```
uninitialized constant VagrantPlugins::ProviderLibvirt::Action::ShareFolders
```
Also, disabling it might reduce the needed time in certains cases for the VMs
to be started.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 15:07:38 +02:00
Guillaume Abrioux 6aa7050acd tests: make all subnet uniq per scenario
If two environments are using the same subnet, we will get trouble
because of ips addresses conflicts.
This commit ensures each scenario has a uniq subnet for both public and cluster
network so we can setup several test environment at a time on a same hypervisor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 15:07:38 +02:00
Guillaume Abrioux 635111bf6a tests: add ceph-override.json for ubuntu/cluster
in addition to 18e2ab4d this commit adds the same file for ubuntu
testing scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 12:59:29 +02:00
Guillaume Abrioux 4135091c98 tests: fix broken osd test for xenial_cluster
the path `/dev/disk/by-path/pci-0000:00:01.1-ata-1.0` doesn't exist.
it has to be changed to `/dev/disk/by-path/pci-0000:00:01.1-ata-1`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 11:03:41 +02:00
Guillaume Abrioux cdb5023d84 tests: fix brokens tests for mds
5968cf0 broke the test on mds because of leftover.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 16:48:23 +02:00
Guillaume Abrioux 2c4258a0fd Refact code for set_osd_pool_default_*
This commit refacts the code regarding all `set_osd_pool_default_*`
related tasks by avoiding usage of useless `set_fact` to determine
whether a key is present in `ceph_conf_overrides`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 15:40:10 +02:00
Sébastien Han 5968cf09b1 ci: add collocation scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-04 11:19:12 +02:00
Sébastien Han 18e2ab4d07 test: add handler support
Add idempotency and handler test.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:44:00 +02:00
Sébastien Han 39ee25637b test: add test for device with 'by-path'
We now test devices to be passed like:
/dev/disk/by-path/pci-0000:00:01.1-ata-1.0

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-03 14:43:57 +02:00
Sébastien Han b4bec52442 tests: add tests for rgw-nfs
rgw-nfs is part of servicemap so we should use it to make sure the
process is up and running.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-29 02:38:24 +02:00
Sébastien Han 77fc8ba87f Merge pull request #1931 from ceph/re-enable-iscsi
iscsi: re-enable the scenario
2017-09-28 19:44:52 +02:00
Sébastien Han 67c78da056 iscsi: re-enable the scenario
CentOS 7.4 vagrant box is now available so re-enabling this scenario.
For more info:
https://seven.centos.org/2017/09/updated-centos-vagrant-images-available-v1708-01/

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-28 18:46:28 +02:00
Ali Maredia ae18cf24d2 test: add test making sure rgw http endpoints are enabled
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-25 14:41:18 -04:00
Sébastien Han d5bfc6f85d mgr: always bootstrap mgr right after the mon
If we don't bootstrap the mgr after the mon and the osds handler are
called, we will never be able to reach a clean state since the pgs
stats are handled by the mgr. This also happens when doing daemon
collocation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-22 17:26:28 +02:00
Sébastien Han c7d9838ad4 tests: add nfs container test
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-21 11:07:14 +02:00
Guillaume Abrioux a069a6fe63 tests: temporary disable `test_nfs_rgw_fsal_export`
This test doesn't work at the moment and need to be fixed.
Disabling it temporary to avoid errors in the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-21 09:56:37 +02:00
Guillaume Abrioux f4fc3bbfea ci: add precise tests to valide daemons are up
Add daemon health check for rgw, mds, mgr, rbd mirror.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-21 09:56:37 +02:00
Sébastien Han 66d41f342d Merge pull request #1889 from ceph/client-containers
client: ability to create keys and pool with no ceph binaries
2017-09-18 17:27:32 +02:00
Sébastien Han 85d73e3be2 client: ability to create keys and pool with no cpeh binaries
On a container env, machines don't have any ceph binaries so we need to
use a container to run the commands.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-18 14:41:52 +02:00
Andrew Schoen 5eff7e24b0 Merge pull request #1890 from ceph/lvm-setup
tests: fix lvm_setup.yml for purge_cluster.yml
2017-09-14 11:38:13 -05:00
Sébastien Han 2f51f0de28 Merge pull request #1880 from ceph/wip-rgw-nfs
nfs: configure RGW FSAL to start up correctly
2017-09-13 14:20:14 -06:00
Andrew Schoen 57f2ad7ef1 tests: delete journal partitions in lvm_setup.yml
Delete these before creating them incase they are left around in a purge
cluster testing scenario. The purge-cluster.yml playbook does not
currently remove partitions used for journals.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-09-13 15:02:54 -05:00
Sébastien Han f67b47d056 Merge pull request #1882 from ceph/multi-journal
osd: drop support for device partition
2017-09-13 11:43:48 -06:00
Sébastien Han aa364264cd resync ceph-iscsi-gw with old upstream
Taken from https://github.com/pcuzner/ceph-iscsi-ansible/tree/tcmu-fixes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1454945 and
https://bugzilla.redhat.com/show_bug.cgi?id=1484083
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-12 18:06:10 -06:00
Sébastien Han fdf924401f osd: drop support for device partition
We have been struggling with this, it's still broken and breaking other
things too now.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490283
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-12 17:42:07 -06:00
Ali Maredia 52efe92a87 nfs: configure RGW FSAL to start up correctly
- Add RGW keyring to nfs node
- Add RGW section to ganesha.conf
- Add RGW section to ceph.conf onf nfs node

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-12 16:27:16 -04:00
Andrew Schoen 61357c8e20 tests: no need to create a filesystem on /dev/sdc1 for lvm tests
The partition only needs created and given a gpt label so that a
PARTUUID will exist on the partition.

This task also makes the purge_lvm_osds scenario fail on the second
deployment after purging.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-09-12 15:14:21 -05:00
Sébastien Han 7054615551 ci: deploy rbd mirror
Deploy rbd mirorr in cluster scenario

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-09 01:17:10 +02:00
Sébastien Han 4f325c7ebe ci: remove scenario bluestore_docker_cluster
We don't need to bootstrap a full cluster to bootstrap bluestore. We
have individual scenarios for that.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-08 19:33:24 +02:00
Ali Maredia f8171e8b4a nfs: rename host to have ceph- prefix
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 11:38:05 -04:00
Ali Maredia f3e2235b3a nfs-ganesha: add config overrides section
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 11:37:58 -04:00
Ali Maredia c907ec41ae nfs: add automated testing for nfs-ganesha roles
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-09-08 09:14:01 -04:00
Sébastien Han 3753e6cfa7 ceph-osd: fix autodetection activation
Prior to this patch this activation sequence for autodetection was
always skipped because we were asking to activate on device without
partitions, which doesn't make sense.

We also fix the way we lookup for a device, since the data partition is
always numbered 1, we take the min element of the dict.

Closes: https://github.com/ceph/ceph-ansible/issues/1782
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-07 17:47:37 +02:00
Sébastien Han b04946430d Merge pull request #1812 from ceph/switch-migration-conta
switch-from-non-containerized-to-containerized: mask unit files
2017-09-07 07:30:34 +02:00
Guillaume Abrioux d987d26719 tests: force docker variable for switch-to-containers scenario
we need to force the value of `docker` variable which is initially set
to `false` since it's a migration from non-containerized to
containerized cluster.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-06 18:03:52 +02:00
Guillaume Abrioux c265180158 tests: Add mgr node for all scenarios
With Luminous we need to have mgr daemon.
This commit adds an mgr daemon for all scenarios.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-05 10:47:10 +02:00
Sébastien Han 11122a0101 client: copy admin key so we can create pools and keys
Needed when user_config is set to true

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-04 23:07:14 +02:00
Sébastien Han f7f4a61d7b Merge pull request #1836 from ceph/shrink-osd-mon
shrink mon and osd
2017-09-01 19:57:44 +02:00
Sébastien Han 298a63c437 shrink mon and osd
Rework shrinking a monitor and an OSD playbook. Also adding test
scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-01 19:12:00 +02:00
Alfredo Deza 1efea767fe tests parted should create gpt labels on new disk
But only for the first partition, so that a new label doesn't
blow away the previous partition created

Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-31 10:09:42 -04:00
Alfredo Deza fec030cd27 tests osds units are not enabled in lvm scenarios
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-31 08:47:42 -04:00
Andrew Schoen 3185a287de tests: add a filesystem on /dev/sdc1 for lvm osd testing
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-30 16:16:04 -05:00
Andrew Schoen fcba9d17f0 ceph-osd: add support for --journal vg/lv for lvm osds
This also updates the tests

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-30 15:55:16 -05:00
Andrew Schoen d026e04470 tests: create 2 partitions on /dev/sdc for lvm scenario testing
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-30 15:53:28 -05:00
Sébastien Han 13aac5027a Merge pull request #1741 from ceph/refactor-installation
common: refactor installation method
2017-08-30 17:42:29 +02:00
Sébastien Han e0a264c7e9 osd: allow multi dedicated journals for containers
Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1475820
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-30 12:34:06 +02:00
Sébastien Han ae2fd45994 common: refactor installation method
The installation process is now described as follow:

* you still have to choose a 'ceph_origin' installation method. The
origin can be a 'repository' (add a new repository), distro (it will use
the packages provided by the native repo source of your distribution),
local (only available on redhat system, it installs locally built
packages). This option is not well tested, so use it carefully

* if ceph_origin == 'repository' you will have to decide what kind of
repository you want to enable:
  - community: corresponds to the stable upstream/community version
  - enterprise: corresponds to the stable enterprise/downstream version
    (basically you are a red hat customer)
  - dev: it will install ceph from packages built out of the github
    development branches

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-30 10:52:01 +02:00
Ali Maredia eab9d52ddb tests: fix duplicate osd service test
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-08-29 21:24:13 -04:00
Guillaume Abrioux b264bfece0 tests: Update tests according to `ceph-config` role implementation
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-08-24 11:33:02 +02:00
Andrew Schoen 594d5e017a ceph-osd: restructure lvm_volumes variable for more flexiblity
The lvm_volumes variable is now a list of dictionaries that represent
each OSD you'd like to deploy using ceph-volume. Each dictionary must
have the following keys: data, journal and data_vg. Each dictionary also
can optionaly provide a journal_vg key.

The 'data' key represents the lv name used for the OSD and the 'data_vg'
key is the vg name that the given lv resides on. The 'journal' key is
either an lv, device or partition. The 'journal_vg' key is optional and
must be the vg name for the journal lv if given. This key is mainly used
for purging of the journal lv if purge-cluster.yml is run.

For example:

  lvm_volumes:
    - data: data_lv1
      journal: journal_lv1
      data_vg: vg1
      journal_vg: vg2
    - data: data_lv2
      journal: /dev/sdc
      data_vg: vg1

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-23 10:14:14 -05:00
SirishaGuduru 1359869497 Common: changed civetweb line in rgw section(conf)
Resolves issue: Multiple RGW Ceph.conf Issue #1258

In multi-RGW setup, in ceph.conf the RGW sections
contain identical bind IP in civetweb line. So this
modification fixes that issue and puts the right IP
for each RGW.

Signed-off-by: SirishaGuduru SGuduru@walmartlabs.com

Modified ceph-defaults and ran generate_group_vars_sample.sh

group_vars/osds.yml.sample and group_vars/rhcs.yml.sample are
not part of the changes. But they got modified when
generate_group_vars_sample.sh is ran to generate group_vars/
all.yml.sample.

Uncommented added variables in ceph-defaults

Updated tests by adding value for radosgw_interface

Added radosgw_interface to centos cluster tests

Modified ceph-rgw role,rebased and ran generate_group_vars_sample.sh

In ceph-rgw role removed check_mandatory_vars.yml.
Rebased on master.
Ran generate_group_vars_sample.sh and then the below files got
modified.
2017-08-23 15:03:37 +05:30
Alfredo Deza 9540f472a7 tests/rgw: update tests to use new host fixture
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-21 16:50:02 -04:00
Alfredo Deza e3ff46dce2 tests/osd add tests for ceph-volume* executables
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-21 16:50:02 -04:00
Alfredo Deza 75060119c1 tests/osd: update tests to use new host fixture
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-21 16:50:01 -04:00
Alfredo Deza 31a960c323 tests/mons: update tests to use new host fixture
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-21 16:50:01 -04:00
Alfredo Deza ecf917d354 tests/install: update tests to use new host fixture
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-08-21 16:50:01 -04:00
Andrew Schoen 7ab3711cf5 tests: do not use /dev/sda in the lvm scenario
When you udpate to the latest version of the centos/7 box it always puts
the OS on /dev/sda, so do not use it as an OSD.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-04 15:57:56 -05:00
Andrew Schoen e597628be9 lvm: update scenario for new osd_scenario variable
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-04 06:38:36 -05:00
Andrew Schoen 66df80d600 tests: do not use sudo with dev_setup.yml
This causes problems when the tests are run locally and not in the CI

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-04 06:13:09 -05:00
Andrew Schoen 661de0f3b0 tests: adds an lvm_osds testing scenario
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-08-04 06:13:09 -05:00
Sébastien Han 30991b1c0a osd: simplify scenarios
There is only two main scenarios now:

* collocated: everything remains on the same device:
  - data, db, wal for bluestore
  - data and journal for filestore
* non-collocated: dedicated device for some of the component

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-03 10:20:39 +02:00
Sébastien Han 8ac7d2e4c9 osd: do not enable osd@id unit file
ceph-disk is responsable for enabling the unit file if needed. Actually
since https://github.com/ceph/ceph/pull/12241 it seems that it's not
even needed. On an event of a restart, udev rules will be trigger and
they will ceph-disk activate the device too so the 'enabled' is not
needed.

Closes: https://github.com/ceph/ceph-ansible/issues/1142
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-26 17:17:57 +02:00
Alfredo Deza bc0678e17d Merge pull request #1687 from ceph/dev-tests
tests: run all existing tests with shaman repos
2017-07-18 08:53:25 -04:00
Guillaume Abrioux 0c1352277c tests: add config in ceph_conf_overrides to journal collocation tests
Add config in ceph_conf_overrides options to journal collocation tests.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-18 01:02:02 +02:00
Guillaume Abrioux 0570008718 tests: Add a client node to docker scenario
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-17 22:47:35 +02:00
Andrew Schoen 7293c40c40 tests: run all existing tests with shaman repos
If you use the 'dev' factor, the testing scenario will
use repos from shaman.ceph.com. You can define CEPH_DEV_BRANCH
and CEPH_DEV_SHA1 to specify which repo you'd like to test.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-07-17 14:42:32 -05:00
Guillaume Abrioux 14dbcb122c tests: fix test_osds_listen_on_* tests
the `test_osds_listen_on_*` consider OSDs will always listen on tcp port
with consecutive tcp port number starting from `6800`.

Eg.
If you have 2 OSDs, tests will assume it should listen on 2 ports for each
network (`public_network` and `cluster_network`), therefore:
`6800, 6801, 6802, 6803`

but sometime it doesn't happen this way and you can get OSDs listening
on tcp port like this :

`6800, 6801, 6802, 6805`

Then the test are failing while it shouldn't.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-13 12:37:15 +02:00
Andrew Schoen 7029be70fa tests: remove monitor_interface from centos/7/cluster/group_vars/all
This is to ensure that the template must use the values set in the
inventory.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-07-12 16:34:41 +02:00
Guillaume Abrioux 1183112ee7 Tests: Add an mgr node do dmcrypt-dedicated-journal
Add an mgr node to `dmcrypt-dedicated-journal` scenario testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-12 15:02:39 +02:00
Guillaume Abrioux f16841e09e Tests: rename tests directories
Since we are hitting this bug :

https://bugzilla.redhat.com/show_bug.cgi?id=1324587
eg:

`failed: internal error: Monitor path /var/lib/libvirt/qemu/domain-bs-docker-cl
uster-dmcrypt-journal-collocation_mon0_1499294943_ba9faf7bf296533177f6/monitor.
sock too big for destination`

and we can't upgrade libvirt in our CI for some reason

we need to get the directories name shorter in order to workaround this
issue

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-12 15:02:39 +02:00
Guillaume Abrioux 94c3756167 Tests: Add bluestore scenarios
Since we started testing against Luminous, we need to add more scenarios
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-12 15:02:32 +02:00
Sébastien Han 476d1677eb tests: allow bluestore devices
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-11 22:08:35 +02:00
Sébastien Han ed8a7cf1e0 tests: fix block.db partition size
Our devices in the CI are 12GB, there are not big enough for the default
size. Reducing its size.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-11 22:08:35 +02:00
Sébastien Han 035846217e Merge pull request #1627 from ceph/ceph-osd-prepare-script
osd: docker, refactor ceph-osd-run.sh.j2
2017-07-06 16:08:59 +02:00
Guillaume Abrioux 527eb050fc Tests: fix scenario for docker-cluster-dmcrypt-journal-collocation
The scenario set in `group_vars/all` for
docker-cluster-dmcrypt-journal-collocation is not the correct one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-06 08:10:50 +02:00
Sébastien Han 044802a979 test: fix docker dmcrypt collocated scenario
We were setting journal_collocation and used raw_journal_devices which
is definitely wrong. We should just stick with devices.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-04 19:34:26 +02:00
Guillaume Abrioux 1c8680ef2d Tests: Add bluestore tests
Add two scenarios bluestore_journal_collocation and bluestore_cluster.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-07-04 19:07:23 +02:00
Guillaume Abrioux 896d62d78b Refact: remove ceph_mon_docker_interface variable
remove `ceph_mon_docker_interface` and use `monitor_interface` instead
for both containerized and non-containerized deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-04 18:08:59 +02:00
Guillaume Abrioux 2a52d5b555 Tests: update tests according to ipv6 support
Since ceph.conf.j2 has been updated to add ipv6 support, the different
variables in many scenarios need to be updated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-07-04 10:57:27 +02:00
Guillaume Abrioux 35ad0e1c11 Remove duplicate entry in test Vagrantfile
remove some leftover since code has been refactored

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-06-15 16:42:58 +02:00
Andrew Schoen a9867e22fb tests: add PG config to the docker_cluster scenario
This is so we'll pass the PG check when performing a rolling update.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-06-13 08:56:43 -05:00
Guillaume Abrioux ddfe019342 Refact code
`ceph-docker-common`:
  At the moment there is a lot of duplicated tasks in each
  `./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in
  `./roles/ceph-docker-common/tasks/main.yml`.

`*_containerized_deployment` variables:
  All `*_containerized_deployment` have been refactored to a single
  variable `containerized_deployment`

duplicate `cephx` variables in `group_vars/* have been removed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-05-24 15:55:41 +02:00
Ali Maredia db19984010 test: check if default bucket object quota is applied
Resolves: rhbz#1391500

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-05-16 17:12:27 -04:00
Ali Maredia 734a07f0b9 rgw: test functionality of conf vars and pool creation for tuning
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-05-05 10:03:53 -04:00
Andrew Schoen ba5c40d45d tests: all tests must use the node fixture
Otherwise the logic we have for skipping tests does not work.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-05-04 16:37:28 -05:00
Andrew Schoen 2fada9bd6b tests: allow for insecure docker registries when testing rhcs
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-05-02 16:21:48 -05:00
Andrew Schoen 10d39e98dd tests: test that all docker OSDs are up and in
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-05-01 11:40:44 -05:00
Guillaume Abrioux 800b439667 Common: Restore check_socket
Restore the check_socket that was removed by `5bec62b`.
This commit also improves the logging in `restart_*_daemon.sh` scripts

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-04-24 12:31:49 +02:00
Andrew Schoen 8ba1624f21 tests: mgr nodes should have the ceph-mon repo install
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-04-21 10:45:33 -07:00
Sébastien Han dfd8f4d96e test: add mgr section to the host inventory file
Without this, we don't test the mgr role so we need to add it.

Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-04-15 00:16:10 +02:00
Andrew Schoen 0e6d89b9db tests: print contents of group_vars/all after modification
This is just nice to see in the test output so we know exactly what
configuration is going to be used.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-04-12 09:23:21 -05:00
Andrew Schoen 386afbec83 tests: set needed config in group_vars/all for rhcs testing
Instead of relying on environment variables and --extra-vars simply
modify the group_vars/all that ships with the specific testing scenario
to enable ceph_rchs testing.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-04-12 07:53:49 -05:00
Andrew Schoen 35aab1274a tests: enable the rhcs-7-extras-nightly repo on test nodes
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-04-10 13:46:03 -05:00
Andrew Schoen 6145a809ba tests: change centos/atomic-host box name to rhel7
We'd like to test everything rhcs on the same rhel7 box.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-04-10 13:39:16 -05:00
Sébastien Han 186a392656 Merge pull request #1425 from ceph/bump-kraken
common: bump ceph version to kraken
2017-04-10 19:03:39 +02:00
Sébastien Han e48c31c671 common: bump ceph version to kraken
Kraken has been out for a couple of weeks now and the CI can test both
Kraken and Jewel.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-04-10 18:05:19 +02:00
Sébastien Han 61a8b26d59 test: add mgr bootstrap support
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-04-10 15:13:13 +02:00
Sébastien Han 2aa5286544 mgr: add new role for ceph-mgr
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.

Only works as of the Kraken release.

Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-04-10 15:13:09 +02:00
Andrew Schoen f2aaaa4970 tests: change ceph/ubuntu-xenial boxes to rhel7
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-23 08:43:11 -05:00
Andrew Schoen 23ab14c105 tests: change hosts in first play in rhcs_setup to localhost
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 19:40:34 -05:00
Andrew Schoen a4f05e4926 tests: set MTU to 1400 on test node interfaces
In the environment we were testing on, MTU was set to 1500 which causes
download failures of our yum repos. There might be a better way to set
this instead of doing it here in ansible.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 16:44:29 -05:00
Andrew Schoen 66f3f31702 tests: adds a task to download a repo file for nightly rhel7 packages
This is a url to an actual repo file, not a baseurl to use in a repo.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 16:18:57 -05:00
Andrew Schoen ead8dcebcf tests: fix ceph tools baseurl
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 16:02:50 -05:00
Andrew Schoen 408cd61483 tests: enable the downstream rhcs repos
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 15:42:17 -05:00
Andrew Schoen b0caeed682 tests: fix task in rhcs_setup that changes vagrant box to rhel7
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 12:38:12 -05:00
Andrew Schoen d261da6679 tests: adds a rhcs_setup.yml playbook
This is used to configure the test nodes for testing Ret Had Ceph
Storage.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-03-22 10:13:51 -05:00
Andrew Schoen a551ad97bb tests: when using pytest mark decorators ensure all fixtures are defined
Decorating a test method directly with a pytest mark seems to break if
the test function does not explicitly define all pytest fixtures it
expects to recieve.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-02-22 13:43:59 -06:00
Sébastien Han 503ec9be57 ci: decorate the tests to not run on docker scenario
Certain scenario won't work on containerized deployment. So we decorate
them so they can be skipped.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-22 10:53:03 -05:00
Sébastien Han e22acb81e6 ci: fix issue on ansible2.2-docker_dedicated_journal
journal_collocation was enabled so the test suite was testing this
scenario and obviously failed since there is no second partition to
verify.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 16:14:39 -05:00
Sébastien Han 51b759fc16 ci: do not use atomic host for ansible2.2-docker_dedicated_journal
Switch to CentOS since Atomic host does not have the right Docker
version.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:56:09 -05:00
Sébastien Han 55bde0336f ci: set a different directory for ceph osd docker run script
/usr/share is not writable on Atomic Host so we use /var/tmp instead.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:56:09 -05:00
Sébastien Han e9311bcc74 ci: do not generate random hostname for ansible2.2-docker_dedicated_journal
This fixes the error: Call to virDomainCreateWithFlags failed: internal
error: Monitor path
/var/lib/libvirt/qemu/domain-docker-cluster-dedicated-journal_osd0_1487692576_dbfc21d851071d3e2cd2/monitor.sock
too big for destination

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:56:09 -05:00
Sébastien Han b91d227b99 docker: make ceph docker osd script path
Since distro will not allow /usr/share to be writable (e.g: atomic) so
we let the operator decide where to put that script.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:56:09 -05:00
Sébastien Han 7b216aa8e0 ci: add docker-cluster-dmcrypt-journal-collocation scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:56:09 -05:00
Sébastien Han 7aabbc931d tests: add scenario for dedicated-journal on docker
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-02-21 15:54:36 -05:00