Commit Graph

491 Commits (ee6f0547ba17ec5111dae0d25dda5aed5631aa60)

Author SHA1 Message Date
Dimitri Savineau 78cb9f44bd tests: add quay registry for collocation baremetal
Even if the non containerized collocation scenario deploys ceph with
RPMs then we also deploy the dashboard/monitoring but with containers.
This requires to set the registry variable to ceph's quay.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-10 14:23:21 -04:00
Dimitri Savineau 98c9afceb9 tests: use grafana from quay.io
This changes the grafana container image regitry from docker.io to
quay.io to avoid rate limit.
This also adds the missing container image values for docker2podman
and podman scenarios.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-09 10:35:02 -04:00
Guillaume Abrioux 657e6c8c3b tests: clean legacy
clean some legacies since quay.ceph.io migration

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-09-09 14:42:41 +02:00
Guillaume Abrioux 7348e9a253 tests: disable nfs-ganesha testing
This commit diables nfs-ganesha testing on master for non-containerized
deployment because the dev repos are broken at the moment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-09-07 12:54:29 +02:00
Guillaume Abrioux 2cbb7de3b2 tests: migrate to quay.ceph.io registry
in order to avoid docker.io rate limiting

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-09-07 12:54:29 +02:00
Dimitri Savineau 4f308dcf4a tests: reenable ceph-iscsi testing
This re-adds the ceph-iscsi testing for both non containerized and
containerized deployment since the rados connection error on ceph
dev has been fixed [1].

[1] https://tracker.ceph.com/issues/47002

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-08-27 11:13:36 -04:00
Dimitri Savineau 6c11695fbe tests: reenable nfs-ganesha testing
This re-adds the nfs-ganesha testing in non containerized deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-08-20 16:58:54 +02:00
Guillaume Abrioux 8476beb5b1 tests: move erasure pool testing in lvm_osds
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of osd nodes we spawn so the
OVH Jenkins slaves aren't less overwhelmed when a `all_daemons` based
scenario is being tested.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-20 11:50:28 +02:00
Guillaume Abrioux 093e1dcb21 tests: remove hosts-ubuntu inventories
Since we've dropped ubuntu testing, we don't need these inventories
anymore. Let's remove this leftover.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-18 11:20:48 +02:00
Guillaume Abrioux bd9e126357 tests: disable iscsigw testing (container)
Temporarily disable iscsigw testing for containerized deployments
because it's broken upstream on ceph@master.
non-containerized deployments use stable build for iscsigw to get around
this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-18 11:20:48 +02:00
Guillaume Abrioux e256d8e948 tests: test iscsigw against stable
Since it is broken at the moment with dev repos, let's test against
stable builds so the CI is unlocked.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-13 09:49:00 +02:00
Guillaume Abrioux 5df6225ede tests: change subnet in lvm_osds container scenario
This commit changes the subnets in container-lvm_osds scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-04 14:00:05 +02:00
Guillaume Abrioux 7efea219d6 tests: refact shrink_osd scenario
This adds more coverage on the shrink_osd scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-03 14:46:56 +02:00
Dimitri Savineau 891234668e tests: install pyyaml on osd nodes
Due to [1], ceph-volume has now a dependency on pyyaml but it's not
installed by default via the package dependency.
This patch only add the required package on non containerized
deployment and as temporary workaround for the CI.

[1] https://tracker.ceph.com/issues/46759

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-29 12:49:15 -04:00
Guillaume Abrioux 8ef9fb68bc tests: lvm_setup.yml, add carriage return
This commit adds crlf between each task.
It makes the playbook more readable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-22 14:25:45 +02:00
Guillaume Abrioux 218f4ae361 tests: (lvm_setup.yml), don't shrink lvol
when rerunning lvm_setup.yml on existing cluster with OSDs already
deployed, it fails like following:

```
fatal: [osd0]: FAILED! => changed=false
  msg: Sorry, no shrinking of data-lv2 to 0 permitted.
```

because we are asking `lvol` module to create a volume on an empty VG
with size extents = `100%FREE`.

The default behavior of `lvol` is to shrink the volume if the LV's current
size is greater than the requested size.

Given the requested size is calculated like this:

`size_requested = size_percent * this_vg['free'] / 100`

in our case, it is similar to:

`size_requested = 100 * 0 / 100` which basically means `0`

So the current LV size is well greater than the requested size which
leads the module to attempt to shrink it to 0 which isn't obviously now
allowed.

Adding `shrink: false` to the module calls fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-22 14:25:45 +02:00
Guillaume Abrioux 9d2f2108e1 ceph-crash: introduce new role ceph-crash
This commit introduces a new role `ceph-crash` in order to deploy
everything needed for the ceph-crash daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-21 20:22:12 +02:00
Dimitri Savineau 957903d561 cephadm: add playbook
This adds a new playbook for deploying ceph via cephadm.

This also adds a new dedicated tox file for CI purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-16 11:40:45 -04:00
Dimitri Savineau fc599ed9f5 tests: remove nfs_ganesha_stable_branch variable
We don't need to override this variable in the group_vars but use the
default value instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-06 16:58:59 +02:00
Guillaume Abrioux 5b6f5486f7 tests: update nfs-ganesha to V3.3-stable
not really needed in master, commit intended to be backported in octopus
branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-05 17:10:40 +02:00
Dimitri Savineau 829990e60d ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-18 17:51:13 +02:00
Jan Fajerski 1fe8e819f9 lvm_setup: lookup device from inventory, default to /dev/sd* names
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2020-06-16 18:17:34 +02:00
Guillaume Abrioux 83faf94351 tests: update pools definitions
setting attributes with empty string is a bad user input.
Also, removing `rule_name` attribute when creating a code erasure pool.
(this rule isnt intended for code erasure pool type).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-05-16 07:31:57 +02:00
Dimitri Savineau 252e78b4e4 docker2podman: manage dashboard nodes
The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-05-13 12:02:00 +02:00
Dimitri Savineau 2547ab601a Readd CentOS 7 with conditions
The CentOS 7 distribution could still be used be deploying ceph if
  - it's a containerized deployment
  - it's a non containerized deployment without the dashboard (due to
missing python3 libraries).

The ceph_stable_redhat_distro variable has been remove because we can
rely on the ansible_distribution_major_version fact instead.

The copr el8 repository configuration is only applied for CentOS 8.

The ceph-mgr-dashboard package is only installed when the
dashboard_enabled variable is set to true.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-04-23 13:31:11 +02:00
Guillaume Abrioux 86959abf9b tests: add back nfs testing on master
This commit adds back nfs testing on master branch (containerized
scenario only).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-04-23 13:27:48 +02:00
Guillaume Abrioux 8c1c34b201 tests: add more coverage in external_clients scenario
Run create_users_keys.yml in external_clients scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-31 14:49:38 -04:00
Dimitri Savineau 0f0a14772c tests: update mgr dashboard socket listening test
Since 15ed9ee the ceph-mgr daemon binds on the IP address on the public
network instead of binding on all addresses.
This commit updates the testinfra code to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 21:11:02 +01:00
Dimitri Savineau f2c6281207 tests: add dashboard testinfra configuration
This commit adds basic tests for grafana, prometheus, node-exporter and
ceph mgr dashboard services.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-24 15:19:18 +01:00
Dimitri Savineau fb69f6990c dashboard: allow to set read-only admin user
This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-19 15:34:41 +01:00
Guillaume Abrioux 60a2e28189 rgw: add multi-instances support when deploying multisite
This commit adds the multi-instances when deploying rgw multisite

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-12 16:44:48 -04:00
Dimitri Savineau e62532de46 update osd pool set size command
Since [1] we can't use osd pool without replicas (size: 1) by default.
We now need to set the mon_allow_pool_size_one flag to true in the ceph
configuration and add the --yes-i-really-mean-it flag to the osd pool
set size cli.

[1] https://github.com/ceph/ceph/commit/21508bd

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-11 11:25:42 +01:00
Ali Maredia 71f55bd54d rgw multisite: enable more than 1 realm per cluster
Make it so that more than one realm, zonegroup,
or zone can be created during a run of the rgw
multisite ansible playbooks.

The rgw hosts now need to be grouped into zones
and realms in the inventory.

.yml files need to be created in group_vars
for the realms and zones. Sample yaml files
are available.

Also remove multsite destroy playbook
and add --cluster before radosgw-admin commands

remove manually added rgw_zone_endpoints var
and have ceph-ansible automatically add the
correct endpoints of all the rgws in a rgw_zone
from the information provided in that rgws hostvars.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2020-03-04 12:58:13 -05:00
Guillaume Abrioux 9f0c6df94f tests: add more osd nodes in all_daemons scenario
This commit adds more osd nodes in all_daemons scenario in order to test
erasure pool creation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 248978596a tests: update ooo job
This commit changes the value passed for the attribute 'rule_name' in
openstack_pools definition. It doesn't make sense to have emptry string
as passed value here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 8cacba1f54 tests: add erasure pool creation test in CI
This commit makes the CI testing an OSD pool erasure creation due to the
recent refact of the OSD pool creation tasks in the playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux a3b797e059 tests: enable pg autoscaler on 1 pool
This commit enables the pg autoscaler on 1 pool.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-04 09:29:01 -05:00
Guillaume Abrioux 896d00b50e tests: add lvm batch filestore testing
This commit adds an OSD node in lvm-batch scenario in order to test
filestore backend.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-03 13:50:19 -05:00
Guillaume Abrioux 0fc99bb6fa tests: increase journal_size value
Looks like we are still seeing issue [1].
Let's increase this value to unlock the CI (however, it still needs to
be investigated).

Typical error (see [1] for further details) :
```
[root@osd2 ~]# ceph-volume --cluster ceph lvm batch --filestore --yes --journal-size '2048' /dev/sda /dev/sdb --journal-devices /dev/sdc
Running command: /sbin/vgcreate --force --yes ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78 /dev/sdc
 stdout: Physical volume "/dev/sdc" successfully created.
 stdout: Volume group "ceph-journals-817ef90b-77ac-4f52-b8a9-30893849fb78" successfully created
--> Refusing to continue with configured size for journal
-->  RuntimeError: journal sizes must be larger than 2GB, detected: 1024.00 MB
```

[1] https://tracker.ceph.com/issues/41374

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-03-03 13:23:57 -05:00
Dimitri Savineau ac0f68ccf0 ceph-dashboard: update create/get rgw user tasks
Since [1] if a rgw user already exists then the radosgw-admin user create
command will return an error instead of modifying the current user.
We were already doing separated tasks for create and get operation but
only for multisite configuration but it's not enough.
Instead we should do the get task first and depending on the result
execute the create.
This commit also adds missing run_once and delegate_to statement.

[1] https://github.com/ceph/ceph/commit/269e9b9

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-18 10:22:21 +01:00
Ali Maredia 1834c1e48d rgw: extend automatic rgw pool creation capability
Add support for erasure code pools.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148

Signed-off-by: Ali Maredia <amaredia@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 16:07:43 +01:00
Dimitri Savineau 779a4a6d71 tests: don't install s3cmd on containerized setup
The s3cmd package should only be installed on non containerized
deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-02-17 11:27:52 +01:00
Guillaume Abrioux 910fc61fdc tests: remove legacy `osd_scenario` variable
As of stable-4.0 most of these references aren't needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-02-04 10:05:33 +01:00
Guillaume Abrioux 641729357e tests: add external_clients scenario
This commit adds a new 'external ceph clients' scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-31 12:02:15 +01:00
Guillaume Abrioux c040199c8f tests: set dashboard|grafana_admin_password
Set these 2 variables in all test scenarios where `dashboard_enabled` is
`True`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-29 08:45:34 +01:00
Guillaume Abrioux 3e7dbb4b16 tests: add 'all_in_one' scenario
Add new scenario 'all_in_one' in order to catch more collocated related
issues.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-27 15:30:45 -05:00
Dimitri Savineau bb3eae0c80 filestore-to-bluestore: fix osd_auto_discovery
When osd_auto_discovery is set then we need to refresh the
ansible_devices fact between after the filestore OSD purge
otherwise the devices fact won't be populated.
Also remove the gpt header on ceph_disk_osds_devices because
the devices is empty at this point for osd_auto_discovery.
Adding the bool filter when needed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-22 09:36:09 +01:00
Dimitri Savineau f995b079a6 filestore-to-bluestore: --destroy with raw devices
We still need --destroy when using a raw device otherwise we won't be
able to recreate the lvm stack on that device with bluestore.

Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd
 stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  /dev/sdd: physical volume not initialized.
--> Was unable to complete a new OSD, will rollback changes

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-21 11:37:39 -05:00
Dimitri Savineau 3900527e16 tests/setup: update mount options on EL 8
The nobarrier mount flag doesn't exist anymoer on XFS in the EL 8
kernel. That's why the task wasn't working on those systems.
We can still use the other options instead of skipping the task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-01-11 05:33:01 +01:00
Guillaume Abrioux dc672e86ec tests: add a docker2podman scenario
This commit adds a new scenario in order to test docker-to-podman.yml
migration playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-01-10 10:21:29 -05:00