Commit Graph

530 Commits (fbebe3a6979248c290ee0ef1fdf191b8f3394c20)

Author SHA1 Message Date
Guillaume Abrioux 5df6225ede tests: change subnet in lvm_osds container scenario
This commit changes the subnets in container-lvm_osds scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-04 14:00:05 +02:00
Guillaume Abrioux 7efea219d6 tests: refact shrink_osd scenario
This adds more coverage on the shrink_osd scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-03 14:46:56 +02:00
Dimitri Savineau 891234668e tests: install pyyaml on osd nodes
Due to [1], ceph-volume has now a dependency on pyyaml but it's not
installed by default via the package dependency.
This patch only add the required package on non containerized
deployment and as temporary workaround for the CI.

[1] https://tracker.ceph.com/issues/46759

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-29 12:49:15 -04:00
Guillaume Abrioux 8ef9fb68bc tests: lvm_setup.yml, add carriage return
This commit adds crlf between each task.
It makes the playbook more readable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-22 14:25:45 +02:00
Guillaume Abrioux 218f4ae361 tests: (lvm_setup.yml), don't shrink lvol
when rerunning lvm_setup.yml on existing cluster with OSDs already
deployed, it fails like following:

```
fatal: [osd0]: FAILED! => changed=false
  msg: Sorry, no shrinking of data-lv2 to 0 permitted.
```

because we are asking `lvol` module to create a volume on an empty VG
with size extents = `100%FREE`.

The default behavior of `lvol` is to shrink the volume if the LV's current
size is greater than the requested size.

Given the requested size is calculated like this:

`size_requested = size_percent * this_vg['free'] / 100`

in our case, it is similar to:

`size_requested = 100 * 0 / 100` which basically means `0`

So the current LV size is well greater than the requested size which
leads the module to attempt to shrink it to 0 which isn't obviously now
allowed.

Adding `shrink: false` to the module calls fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-22 14:25:45 +02:00
Guillaume Abrioux 9d2f2108e1 ceph-crash: introduce new role ceph-crash
This commit introduces a new role `ceph-crash` in order to deploy
everything needed for the ceph-crash daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-21 20:22:12 +02:00
Dimitri Savineau 957903d561 cephadm: add playbook
This adds a new playbook for deploying ceph via cephadm.

This also adds a new dedicated tox file for CI purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-16 11:40:45 -04:00
Dimitri Savineau fc599ed9f5 tests: remove nfs_ganesha_stable_branch variable
We don't need to override this variable in the group_vars but use the
default value instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-06 16:58:59 +02:00
Guillaume Abrioux 5b6f5486f7 tests: update nfs-ganesha to V3.3-stable
not really needed in master, commit intended to be backported in octopus
branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-05 17:10:40 +02:00
Dimitri Savineau 829990e60d ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-18 17:51:13 +02:00
Jan Fajerski 1fe8e819f9 lvm_setup: lookup device from inventory, default to /dev/sd* names
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2020-06-16 18:17:34 +02:00
Guillaume Abrioux 83faf94351 tests: update pools definitions
setting attributes with empty string is a bad user input.
Also, removing `rule_name` attribute when creating a code erasure pool.
(this rule isnt intended for code erasure pool type).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-16 07:31:57 +02:00
Dimitri Savineau 252e78b4e4 docker2podman: manage dashboard nodes
The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 12:02:00 +02:00
Dimitri Savineau 2547ab601a Readd CentOS 7 with conditions
The CentOS 7 distribution could still be used be deploying ceph if
  - it's a containerized deployment
  - it's a non containerized deployment without the dashboard (due to
missing python3 libraries).

The ceph_stable_redhat_distro variable has been remove because we can
rely on the ansible_distribution_major_version fact instead.

The copr el8 repository configuration is only applied for CentOS 8.

The ceph-mgr-dashboard package is only installed when the
dashboard_enabled variable is set to true.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 13:31:11 +02:00
Guillaume Abrioux 86959abf9b tests: add back nfs testing on master
This commit adds back nfs testing on master branch (containerized
scenario only).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-23 13:27:48 +02:00
Guillaume Abrioux 8c1c34b201 tests: add more coverage in external_clients scenario
Run create_users_keys.yml in external_clients scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-31 14:49:38 -04:00
Dimitri Savineau 0f0a14772c tests: update mgr dashboard socket listening test
Since 15ed9ee the ceph-mgr daemon binds on the IP address on the public
network instead of binding on all addresses.
This commit updates the testinfra code to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 21:11:02 +01:00
Dimitri Savineau f2c6281207 tests: add dashboard testinfra configuration
This commit adds basic tests for grafana, prometheus, node-exporter and
ceph mgr dashboard services.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 15:19:18 +01:00
Dimitri Savineau fb69f6990c dashboard: allow to set read-only admin user
This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 15:34:41 +01:00
Guillaume Abrioux 60a2e28189 rgw: add multi-instances support when deploying multisite
This commit adds the multi-instances when deploying rgw multisite

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-12 16:44:48 -04:00
Dimitri Savineau e62532de46 update osd pool set size command
Since [1] we can't use osd pool without replicas (size: 1) by default.
We now need to set the mon_allow_pool_size_one flag to true in the ceph
configuration and add the --yes-i-really-mean-it flag to the osd pool
set size cli.

[1] https://github.com/ceph/ceph/commit/21508bd

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-11 11:25:42 +01:00
Ali Maredia 71f55bd54d rgw multisite: enable more than 1 realm per cluster
Make it so that more than one realm, zonegroup,
or zone can be created during a run of the rgw
multisite ansible playbooks.

The rgw hosts now need to be grouped into zones
and realms in the inventory.

.yml files need to be created in group_vars
for the realms and zones. Sample yaml files
are available.

Also remove multsite destroy playbook
and add --cluster before radosgw-admin commands

remove manually added rgw_zone_endpoints var
and have ceph-ansible automatically add the
correct endpoints of all the rgws in a rgw_zone
from the information provided in that rgws hostvars.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2020-03-04 12:58:13 -05:00
Guillaume Abrioux 9f0c6df94f tests: add more osd nodes in all_daemons scenario
This commit adds more osd nodes in all_daemons scenario in order to test
erasure pool creation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 248978596a tests: update ooo job
This commit changes the value passed for the attribute 'rule_name' in
openstack_pools definition. It doesn't make sense to have emptry string
as passed value here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 8cacba1f54 tests: add erasure pool creation test in CI
This commit makes the CI testing an OSD pool erasure creation due to the
recent refact of the OSD pool creation tasks in the playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux a3b797e059 tests: enable pg autoscaler on 1 pool
This commit enables the pg autoscaler on 1 pool.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 896d00b50e tests: add lvm batch filestore testing
This commit adds an OSD node in lvm-batch scenario in order to test
filestore backend.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-03 13:50:19 -05:00
Guillaume Abrioux 0fc99bb6fa tests: increase journal_size value
Looks like we are still seeing issue [1].
Let's increase this value to unlock the CI (however, it still needs to
be investigated).

Typical error (see [1] for further details) :
```
[root@osd2 ~]# ceph-volume --cluster ceph lvm batch --filestore --yes --journal-size '2048' /dev/sda /dev/sdb --journal-devices /dev/sdc
Running command: /sbin/vgcreate --force --yes ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78 /dev/sdc
 stdout: Physical volume "/dev/sdc" successfully created.
 stdout: Volume group "ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78" successfully created
--> Refusing to continue with configured size for journal
-->  RuntimeError: journal sizes must be larger than 2GB, detected: 1024.00 MB
```

[1] https://tracker.ceph.com/issues/41374

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-03 13:23:57 -05:00
Dimitri Savineau ac0f68ccf0 ceph-dashboard: update create/get rgw user tasks
Since [1] if a rgw user already exists then the radosgw-admin user create
command will return an error instead of modifying the current user.
We were already doing separated tasks for create and get operation but
only for multisite configuration but it's not enough.
Instead we should do the get task first and depending on the result
execute the create.
This commit also adds missing run_once and delegate_to statement.

[1] https://github.com/ceph/ceph/commit/269e9b9

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-18 10:22:21 +01:00
Ali Maredia 1834c1e48d rgw: extend automatic rgw pool creation capability
Add support for erasure code pools.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148

Signed-off-by: Ali Maredia <amaredia@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 16:07:43 +01:00
Dimitri Savineau 779a4a6d71 tests: don't install s3cmd on containerized setup
The s3cmd package should only be installed on non containerized
deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 11:27:52 +01:00
Guillaume Abrioux 910fc61fdc tests: remove legacy `osd_scenario` variable
As of stable-4.0 most of these references aren't needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-02-04 10:05:33 +01:00
Guillaume Abrioux 641729357e tests: add external_clients scenario
This commit adds a new 'external ceph clients' scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-31 12:02:15 +01:00
Guillaume Abrioux c040199c8f tests: set dashboard|grafana_admin_password
Set these 2 variables in all test scenarios where `dashboard_enabled` is
`True`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-29 08:45:34 +01:00
Guillaume Abrioux 3e7dbb4b16 tests: add 'all_in_one' scenario
Add new scenario 'all_in_one' in order to catch more collocated related
issues.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-27 15:30:45 -05:00
Dimitri Savineau bb3eae0c80 filestore-to-bluestore: fix osd_auto_discovery
When osd_auto_discovery is set then we need to refresh the
ansible_devices fact between after the filestore OSD purge
otherwise the devices fact won't be populated.
Also remove the gpt header on ceph_disk_osds_devices because
the devices is empty at this point for osd_auto_discovery.
Adding the bool filter when needed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-22 09:36:09 +01:00
Dimitri Savineau f995b079a6 filestore-to-bluestore: --destroy with raw devices
We still need --destroy when using a raw device otherwise we won't be
able to recreate the lvm stack on that device with bluestore.

Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd
 stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  /dev/sdd: physical volume not initialized.
--> Was unable to complete a new OSD, will rollback changes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-21 11:37:39 -05:00
Dimitri Savineau 3900527e16 tests/setup: update mount options on EL 8
The nobarrier mount flag doesn't exist anymoer on XFS in the EL 8
kernel. That's why the task wasn't working on those systems.
We can still use the other options instead of skipping the task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-11 05:33:01 +01:00
Guillaume Abrioux dc672e86ec tests: add a docker2podman scenario
This commit adds a new scenario in order to test docker-to-podman.yml
migration playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-10 10:21:29 -05:00
Guillaume Abrioux 4f2baaab8c tests: disable nfs testing
nfs-ganesha makes the CI failing because of issue related to SELinux.

See:
- https://bugzilla.redhat.com/show_bug.cgi?id=1788563
- https://github.com/nfs-ganesha/nfs-ganesha/issues/527

Until we can get this fixed, let's disable nfs-ganesha testing
temporarily.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-08 11:13:46 +01:00
Dimitri Savineau 7b3e6b932c tests/functional: change docker to podman
Some docker commands were hardcoded in tests playbooks and some
conditions were not taking care of the containerized_deployment
variable but only the atomic fact.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Guillaume Abrioux 217d95abb2 common: add centos8 support
Ceph octopus only supports CentOS 8.

This commit adds CentOS 8 support:
  - update vagrant image in tox configurations.
  - add CentOS 8 repository for el8 dependencies.
  - CentOS 8 container engine is podman (same than RHEL 8).
  - don't use the epel mirror on sepia because it's epel7 only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-08 11:13:46 +01:00
Guillaume Abrioux 40de34fb5e tests: add filestore_to_bluestore job
This commit adds a new job in order to test the
filestore-to-bluestore.yml infrastructure playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-11 09:04:41 -05:00
Dimitri Savineau 4a6d19dae2 tests: reduce max_mds from 3 to 2
Having max_mds value equals to the number of mds nodes generates a
warning in the ceph cluster status:

cluster:
id:     6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a'
health: HEALTH_WARN'
        insufficient standby MDS daemons available'
(...)
services:
  mds:     cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}'

Let's use 2 active and 1 standby mds.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-12-04 14:07:29 -05:00
Dimitri Savineau 3f29b243ea tests: fix cluster health status
The current ceph cluster health is in warning state:

health: HEALTH_WARN
        13 pool(s) have no replicas configured
        2 pool(s) have non-power-of-two pg_num

Because we're using only 1 replica then we need to disable the redundancy
check.
The pool pg num should be a power of two number (like 16).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-27 16:20:17 +01:00
Dimitri Savineau ef2cb99f73 ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-11-14 16:25:46 +01:00
Guillaume Abrioux db77fbda15 tests: add coverage on purge playbook
This commit adds a playbook to be played before we run purge playbook,
it first creates an rbd image then map an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-11-08 09:06:11 -05:00
Guillaume Abrioux 384161edcd tests: fix keyring creation in ooo_collocation
This commit removes the backslash in allow command parameter, this was
needed before the ceph_key module integration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-22 13:45:19 +02:00
Dimitri Savineau 3c2840da03 tests: update container tag for ooo_collocation
It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-22 13:45:19 +02:00
Guillaume Abrioux 25b98b2ce3 tests: add multimds coverage
This commit makes the all_daemons scenario deploying 3 mds in order to
cover the multimds case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-18 13:43:13 -04:00
Dimitri Savineau 2c03c6fcd3 tests: fix the size on the second data LV
The commit replaces the pv/vg/lv commands used with the ansible command
module by the lvg and lvol modules.
This also fixes the size of the second data LV because we were only using
50% of the remaining space instead of 100%.

With a 50G device, the result was:
  - data-lv1 was 25G
  - data-lv2 was 12.5G
Instead of:
  - data-lv1 was 25G
  - data-lv2 was 25G

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-17 15:49:15 -04:00
Dimitri Savineau 04ec1ad3cc tests: reduce handler mon and osd delay
We don't need to have high handler delay in the CI so reducing to
10 seconds.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-09 09:08:20 +02:00
Dimitri Savineau 010158ff84 tests: fix rgw multisite vagrant variables
The secondary vagrant variables didn't have the grafana vm variable
set which create an vagrant error.

There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.

This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.

[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information

It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.

Finally changing the IP addresses due to the Vagrant refact done in the
commit 778c51a

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-04 15:12:50 -04:00
Guillaume Abrioux 01f6dd52b3 tests: remove debug log verbosity
This was added for debugging purpose.
It's generating very large log output, let's remove this now.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-28 11:20:49 +02:00
Guillaume Abrioux 5bb6a4da42 tests: set copy_admin_key at group_vars level
setting it at extra vars level prevent from setting it per node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux da094ac5ee tests: do not rely on pg_num to validate rgw_tuning_pools
Since the pg_autoscaler has been enabled recently in ceph, this check
should stick to validate the requested pools are well created only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-18 14:05:23 +02:00
Dimitri Savineau 825045f6b4 tests: use a single grafana node on podman
We don't use multiple grafana nodes for the moment on the others
scenarios and I don't think this is supposed to be working.
We can often see failure on grafana on that scenario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-28 11:42:48 -04:00
Guillaume Abrioux 05686509f3 tests: update test_mgr_is_up()
the data structure has changed in octopus:

```
    "mgrmap": {
        "available": true,
        "modules": [
            "dashboard",
            "prometheus"
        ],
        "num_standbys": 0,
        "services": {
            "prometheus": "http://mgr0:9283/"
        }
    },
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-08-14 16:42:02 +02:00
Dimitri Savineau 31bd5e08a6 Revert "tests: disable nfs-ganesha deployment"
This reverts commit 83940e624b.

Because nfs-ganesha@master (2.9-dev) build has been fixed by [1] then
we can test nfs-ganesha in the CI for master/octopus.

[1] https://github.com/ceph/ceph-build/pull/1346

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-08-07 10:40:43 +02:00
Dimitri Savineau 867583d5dd tests/shrink_rgw: Disable dashboard
The shrink_rgw scenario has been merge just after the PR about enable
ceph dashboard by default.
So right now the shrink_rgw scenrio doesn't have nodes in the grafana
group and fails.
We just need to set dashboard_enabled to false.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-31 14:53:05 -04:00
Guillaume Abrioux 0f620b2584 tests: add more memory in podman job
Typical error :

```
fatal: [mon1 -> mon0]: FAILED! => changed=true
  cmd:
  - podman
  - exec
  - ceph-mon-mon0
  - ceph
  - config
  - set
  - mgr
  - mgr/dashboard/ssl
  - 'false'
  delta: '0:00:00.644870'
  end: '2019-07-30 10:17:32.715639'
  msg: non-zero return code
  rc: 1
  start: '2019-07-30 10:17:32.070769'
  stderr: |-
    Traceback (most recent call last):
      File "/usr/bin/ceph", line 140, in <module>
        import rados
    ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory
    Error: exit status 1
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

Let's add more memory to get around this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-30 13:52:44 +02:00
Guillaume Abrioux d649e00893 tests: deploy dashboard on mons
there's no dedicated nodes for mgr, let's use monitor nodes.
The mgr0 instance spawned isn't used, so if this node is part of the
inventory for this scenario, testinfra will complain because there's no
ceph.conf on this node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-30 13:52:44 +02:00
Rishabh Dave 236b081a3a tests/functional: add a test for shrink-rgw.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-07-30 08:45:57 +02:00
Guillaume Abrioux 3c2fd337d9 tests: test dashboard deployment with podman scenario
This commit adds a grafana-server section in order to test dashboard
deployment with podman.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Guillaume Abrioux fb1b5b3251 dashboard: enable dashboard by default
This commit enables dashboard deployment by default.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1726739

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-29 14:42:45 +02:00
Dimitri Savineau 07c6695d16 Remove NBSP characters
Some NBSP are still present in the yaml files.
Adding a test in travis CI.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-26 16:09:23 -04:00
Guillaume Abrioux 83940e624b tests: disable nfs-ganesha deployment
nfs-ganesha repositories @ dev are broken, this commit disables the
nfs-ganesha deployment so the CI isn't stuck.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-07-24 14:13:06 +02:00
Dimitri Savineau a9a1f633a9 tests/dashboard: use the dedicated grafana node
The Vagrant dashboard scenario creates a dedicated grafana node but
was not use in the ansible inventory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-07-18 07:22:13 +02:00
Rishabh Dave f80521f773 tests/functional: add a test for shrink-rbdmirror.yml
Add a new functional test that deploys Ceph cluster with three nodes for
MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-07-15 11:22:17 +02:00
Rishabh Dave 5c95c34d4b tests/functional: add a test for shrink-mgr.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-07-09 14:37:02 +02:00
Rishabh Dave 324b3b4a6c tests/functional: add a test for shrink-mds.yml
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-07-08 11:05:28 +02:00
Mike Christie 1e64efc2f0 igw: Update tests to use ceph-iscsi package
gateway_ip_list is depreciated and is only used when using the old
ceph-iscsi-config/cli packages that are no longer being developed
(GH repos are archived). Because ceph-iscsi-config/cli is no longer
being worked on, this modifies the tests to stress the ceph-iscsi
based installs.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Mike Christie b7b2213be1 igw: drop gateway_ip_list for container setups
The gateway_ip_list is not used in container setups, so drop it
for that case.

Signed-off-by: Mike Christie <mchristi@redhat.com>
2019-07-03 22:13:19 +02:00
Guillaume Abrioux 45041f52fd tests: clean nfs_ganesha variables
- clean some leftover.
- move nfs_ganesha_[stable|dev] in group_vars so dev_setup.yml can modify them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-26 08:58:51 +02:00
Guillaume Abrioux 013ae62177 tests: test nfs-ganesha deployment
Add back the nfs-ganesha deployment testing which was removed because of
broken dependencies.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-26 08:58:51 +02:00
Guillaume Abrioux 9201674b5b tests: deploy nfs-ganesha in container-all_daemons
this commit bring back the nfs-ganesha testing in containerized
deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-24 10:05:11 +02:00
Dimitri Savineau da8b7ab7fb remove ceph restapi references
The ceph restapi configuration was only available until Luminous
release so we don't need those leftovers for nautilus+.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-18 09:13:19 +02:00
Guillaume Abrioux 1019e3b3dc tests: increase docker pull timeout
CI is facing issues where docker pull reach the timeout, let's increase
this to avoid CI failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-14 16:23:24 +02:00
Rishabh Dave 67071c3169 align cephfs pool creation
The definitions of cephfs pools should match openstack pools.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
2019-06-13 09:44:05 +02:00
Guillaume Abrioux 4cf17a6fdd iscsi: assign application (rbd) to pool 'rbd'
if we don't assign the rbd application tag on this pool,
the cluster will get `HEALTH_WARN` state like following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-13 07:35:39 +02:00
Guillaume Abrioux 9e4e692c61 tests: remove unused variable
`e MGR_DASHBOARD=0` isn't needed anymore here, let's remove this legacy.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-12 10:41:01 -04:00
Guillaume Abrioux 8dd774a99b tests: update docker image tag used in ooo job
ceph-ansible@master isn't intended to deploy luminous.
Let's use latest-master on ceph-ansible@master branch

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-12 10:41:01 -04:00
fmount 069076bbfd Fix units and add ability to have a dedicated instance
Few fixes on systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables resulted undefined. This commit adds the
ability to check if the group exist and create, accordingly,
different job groups in prometheus template.

Signed-off-by: fmount <fpantano@redhat.com>
2019-06-10 18:18:46 +02:00
L3D ab54fe20ec ansible: use 'bool' filter on boolean conditionals
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

Now appended ``| bool`` on a lot of the affected variables.

Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.

Closes: #4022

Signed-off-by: L3D <l3d@c3woc.de>
2019-06-06 10:21:17 +02:00
Guillaume Abrioux a78fb209b1 tests: test podman against atomic os instead rhel8
the rhel8 image used is an outdated beta version, it is not worth it to
maintain this image upstream, since it's possible to test podman with a
newer version of centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-06-04 11:32:41 -04:00
Dimitri Savineau de147469d7 tests: update testinfra release
In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding ssh-config option to py.test.
Also bumping the pytest and xdist version.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-20 13:04:58 +02:00
Guillaume Abrioux 17634fc3df tests: add dashboard scenario testing
This commit add a new scenario to test the dashboard deployment via
ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
Guillaume Abrioux 2798774e96 tests: fix a typo in dev_setup.yml
c907ec41ae introduced a typo.
This commit fixes it.

```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-15 11:33:26 +02:00
Dimitri Savineau 52b9f3fb28 tox: Refact lvm_osds scenario
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-05-09 09:38:20 +02:00
Rishabh Dave d2cfd8b780 allow adding a manager to a deployed cluster
Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-07 14:13:06 +02:00
Rishabh Dave f201222447 allow adding a RGW to already deployed cluster
Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-07 12:36:16 +02:00
Rishabh Dave 221b2b4988 allow adding a RBD mirror to already deployed cluster
Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-05-07 09:45:20 +02:00
Dimitri Savineau 564ec9c992 tests: group and parametrize tests
Instead of creating a dedicated test and using the same testinfra
module we can group them into a single test to avoid multiple ansible
connections and testinfra module execution.
This patch also adds parametrize pytest decorator when possible.
Finally fixing some flake minor issue.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-24 10:03:25 +02:00
Rishabh Dave 739a662c80 improve coding style
Keywords requiring only one item shouldn't express it by creating a
list with single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-23 15:37:07 +02:00
Andrew Schoen 399a821439 tests: adds the migrate_ceph_disk_to_ceph_volume scenario
This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-04-18 10:55:11 +02:00
Dimitri Savineau f601549a8a test_osds: remove scenario leftover
Since there's only only scenario available we don't need lvm_scenario
and no_lvm_scenario.
Also add missing assert for ceph-volume tests.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-17 17:28:12 +02:00
Dimitri Savineau 9f99f539f7 tests/functional/setup: change mount options
In the CI jobs we can change the mount options of the main partition
to avoid extra operations on disk.
Adding jmespath to tests/requirements.txt due to the json_query
filter usage.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-17 08:23:07 +02:00
Dimitri Savineau c84a74592a test_mons: test mon listening on port 3300
Since nautilus and msgr2 the monitors also bind on port 3300 in
addition of 6789.
This patch updates test_mons to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-04-17 08:19:48 +02:00
Rishabh Dave d5967af7fb allow adding a monitor to a deployed cluster
Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-15 10:00:50 +02:00
Guillaume Abrioux 83e84c6a4a tests: remove test_journal_collocation.py in OSD testing
this test is related to ceph-disk which is dropped as of stable-4.0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-04-11 11:57:02 -04:00