Some docker commands were hardcoded in tests playbooks and some
conditions were not taking care of the containerized_deployment
variable but only the atomic fact.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit adds a new job in order to test the
filestore-to-bluestore.yml infrastructure playbook.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Having max_mds value equals to the number of mds nodes generates a
warning in the ceph cluster status:
cluster:
id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a'
health: HEALTH_WARN'
insufficient standby MDS daemons available'
(...)
services:
mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}'
Let's use 2 active and 1 standby mds.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The current ceph cluster health is in warning state:
health: HEALTH_WARN
13 pool(s) have no replicas configured
2 pool(s) have non-power-of-two pg_num
Because we're using only 1 replica then we need to disable the redundancy
check.
The pool pg num should be a power of two number (like 16).
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit adds a playbook to be played before we run purge playbook,
it first creates an rbd image then map an rbd device on client0 so the
purge playbook will try to unmap it.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit removes the backslash in allow command parameter, this was
needed before the ceph_key module integration.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The commit replaces the pv/vg/lv commands used with the ansible command
module by the lvg and lvol modules.
This also fixes the size of the second data LV because we were only using
50% of the remaining space instead of 100%.
With a 50G device, the result was:
- data-lv1 was 25G
- data-lv2 was 12.5G
Instead of:
- data-lv1 was 25G
- data-lv2 was 25G
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The secondary vagrant variables didn't have the grafana vm variable
set which create an vagrant error.
There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.
This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.
[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information
It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.
Finally changing the IP addresses due to the Vagrant refact done in the
commit 778c51a
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This was added for debugging purpose.
It's generating very large log output, let's remove this now.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since the pg_autoscaler has been enabled recently in ceph, this check
should stick to validate the requested pools are well created only.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We don't use multiple grafana nodes for the moment on the others
scenarios and I don't think this is supposed to be working.
We can often see failure on grafana on that scenario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This reverts commit 83940e624b.
Because nfs-ganesha@master (2.9-dev) build has been fixed by [1] then
we can test nfs-ganesha in the CI for master/octopus.
[1] https://github.com/ceph/ceph-build/pull/1346
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The shrink_rgw scenario has been merge just after the PR about enable
ceph dashboard by default.
So right now the shrink_rgw scenrio doesn't have nodes in the grafana
group and fails.
We just need to set dashboard_enabled to false.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
there's no dedicated nodes for mgr, let's use monitor nodes.
The mgr0 instance spawned isn't used, so if this node is part of the
inventory for this scenario, testinfra will complain because there's no
ceph.conf on this node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
nfs-ganesha repositories @ dev are broken, this commit disables the
nfs-ganesha deployment so the CI isn't stuck.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The Vagrant dashboard scenario creates a dedicated grafana node but
was not use in the ansible inventory.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add a new functional test that deploys Ceph cluster with three nodes for
MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
gateway_ip_list is depreciated and is only used when using the old
ceph-iscsi-config/cli packages that are no longer being developed
(GH repos are archived). Because ceph-iscsi-config/cli is no longer
being worked on, this modifies the tests to stress the ceph-iscsi
based installs.
Signed-off-by: Mike Christie <mchristi@redhat.com>
- clean some leftover.
- move nfs_ganesha_[stable|dev] in group_vars so dev_setup.yml can modify them.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The ceph restapi configuration was only available until Luminous
release so we don't need those leftovers for nautilus+.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
CI is facing issues where docker pull reach the timeout, let's increase
this to avoid CI failures.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The definitions of cephfs pools should match openstack pools.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
if we don't assign the rbd application tag on this pool,
the cluster will get `HEALTH_WARN` state like following:
```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
application not enabled on pool 'rbd'
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Few fixes on systemd unit templates for node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer grafana group and keep consistency with the other
groups.
During the integration with TripleO some grafana/prometheus
template variables resulted undefined. This commit adds the
ability to check if the group exist and create, accordingly,
different job groups in prometheus template.
Signed-off-by: fmount <fpantano@redhat.com>
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```
Now appended ``| bool`` on a lot of the affected variables.
Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.
Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
the rhel8 image used is an outdated beta version, it is not worth it to
maintain this image upstream, since it's possible to test podman with a
newer version of centos/atomic-host image.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding ssh-config option to py.test.
Also bumping the pytest and xdist version.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
c907ec41ae introduced a typo.
This commit fixes it.
```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Instead of creating a dedicated test and using the same testinfra
module we can group them into a single test to avoid multiple ansible
connections and testinfra module execution.
This patch also adds parametrize pytest decorator when possible.
Finally fixing some flake minor issue.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Since there's only only scenario available we don't need lvm_scenario
and no_lvm_scenario.
Also add missing assert for ceph-volume tests.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
In the CI jobs we can change the mount options of the main partition
to avoid extra operations on disk.
Adding jmespath to tests/requirements.txt due to the json_query
filter usage.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since nautilus and msgr2 the monitors also bind on port 3300 in
addition of 6789.
This patch updates test_mons to reflect that change.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
It's usefull to have logs in debug mode enabled in order to have
more information for developpers.
Also reindent to json file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We don't need to have multiple ceph-override.json copies. We
currently already have symlink to all_daemons/ceph-override.json so
we can do it for all scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
ceph-volume didn't work when the devices where passed by path.
Since it now support it, let's allow this feature in ceph-ansible
Closes: #3812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a tox scenario that adds an new MDS node as a part of already
deployed Ceph cluster and deploys MDS there.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Instead of hardcoding group names, import ceph-defaults earlier. Also,
rectify a minor mistake in vagrant_varaibles.yml for containerized
version of add_osds.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
os.path.join adds the separator (i.e. '/') between the provided path
components only if needed. Providing a single path component doesn't
lead to any checks.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
otherwise a bunch of jobs will fail like following:
```
[WARNING]: Unable to parse /home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-ubuntu-container-stable-3.2-bluestore_lvm_osds/tests/functional/bs-lvm-osds/container/hosts-ubuntu as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`test_mon_host_line_has_correct_value()` will cover this test in
anycase. It doesn't worth to have a dedicated test for this.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
nfs-ganesha v2.5 and 2.6 have hit EOL. Install nfs-ganesha v2.7
stable that is currently being maintained.
Signed-off-by: Ramana Raja <rraja@redhat.com>
using `!` mark in tox.ini doesn't work on comma separated list.
The idea here is to skip all containerized scenario in dev_setup.yml and
use the `!` for the update scenario.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
fix the wrong path used in various rgw testinfra tests.
set `1` as default value for `radosgw_num_instances`: if
`ansible_vars.get(radosgw_num_instances)` returns `None`, we can assume
there's only 1 instance since it's the default value in ceph-defaults.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since we now set this variable in inventory host, the regexp needs to be
updated, the assignment operator is `=`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit reorganizes the testing directory layout.
The idea is to have more consistency with the names of scenario and
their corresponding path, eg: non-container vs. container: each scenario
has a subdirectory for container deployment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
this commit refacts the way the environment are named by adding a factor
`{non_container,container}`. This will avoid a lot of duplicate
definition in tox.ini and bring kind of consistency.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>