The secondary vagrant variables didn't have the grafana vm variable
set which create an vagrant error.
There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.
This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.
[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information
It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.
Finally changing the IP addresses due to the Vagrant refact done in the
commit 778c51a
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This change implements a filter_plugin that is used in the
ceph-facts, ceph-validate roles and infrastucture-playbooks.
The new filter plugin will return a list of all IP address
that reside in any one of the given IP ranges. The new filter
replaces the use of the ipaddr filter.
ceph.conf already support a comma separated list of CIDRs
for the public_network and cluster_network options.
Changes: [1] and [2] introduced a regression in ceph-ansible
where public_network can no longer be a comma separated list
of cidrs.
With this change a comma separated list of subnet CIDRs can
also be used for monitor_address_block and radosgw_address_block.
[1] commit: d67230b2a2
[2] commit: 20e4852888
Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030
Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283Closes: #4333
Please backport to stable-4.0
Signed-off-by: Harald Jensås <hjensas@redhat.com>
The ANSIBLE_CONFIG value wasn't set correctly for two scenarios. This
environment variable doesn't use '-F'.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
test switch_to_containers job against the latest ceph@master
ceph-container image tag available.
In order to be sure the ceph release deployed in the first step (non
containerized deployment) isn't newer than the tag used for the
containerized migration (which would mean we try to downgrade the
version).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and RGW and then runs shrink-rgw.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
this setting is not needed here since we explicitely set it for
container and non container context.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a new functional test that deploys Ceph cluster with three nodes for
MON, OSD and RBD Mirror and, then, runs shrink-rbdmirror.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MGR and then runs shrink-mgr.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a new functional test that deploys a Ceph cluster with three nodes
for MON, OSD and MDS and then runs shrink-mds.yml to test it.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
adding back a sleep 30s after nodes have rebooted before running
testinfra.
This was removed accidentally by d5be83e
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Because we're using vagrant, a ssh config file will be created for
each nodes with options like user, host, port, identity, etc...
But via tox we're override ANSIBLE_SSH_ARGS to use this file. This
remove the default value set in ansible.cfg.
Also adding PreferredAuthentications=publickey because CentOS/RHEL
servers are configured with GSSAPIAuthenticationis enabled for ssh
server forcing the client to make a PTR DNS query.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
the rhel8 image used is an outdated beta version, it is not worth it to
maintain this image upstream, since it's possible to test podman with a
newer version of centos/atomic-host image.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit adds `pytest-rerunfailures` in requirements.txt so we can
retry failing test in testinfra to avoid false positive. (eg: sometimes it
can happen for some reason a service takes too much time to start)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding ssh-config option to py.test.
Also bumping the pytest and xdist version.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Since a1a871c we don't need to copy the infrastructure playbooks
under the ceph-ansible root directory.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
update scenario is now handled by tox-update.ini file so we shoudn't
have update reference in tox.ini file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
It's usefull to have logs in debug mode enabled in order to have
more information for developpers.
Also reindent to json file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
even on master, force the release to be nautilus.
this scenarios is failing because at multiple times this scenario is
actually downgrading the ceph version.
It might happen that the latest-master image is older than what was
deployed in the first step of the scenario (the RPM deployment).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 41b0fa15ddf2a45402d17faa3bd1e817692fc1d2)
Add a tox scenario that adds an new MDS node as a part of already
deployed Ceph cluster and deploys MDS there.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a script to retry several times to fire up VMs to avoid vagrant
failures.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Andrew Schoen <aschoen@redhat.com>
there's no need to test this on all scenarios.
testing idempotency on all_daemons should be enough and allow us to save
precious resources for the CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
update scenario has been moved to a dedicated tox ini file.
We shouldn't have any references to this scenario in the main tox ini
file.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
On containerized CI jobs the playbook executed is purge-cluster.yml
but it should be set to purge-docker-cluster.yml
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
As of testinfra 2.0.0, the binary name is `py.test`.
But let's pin the version to 1.19.0.
Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit.
Since we don't have the bandwidth ATM for this, it's better to simply
keep testing with testinfra 1.19.0.
Note that I've replaced all `testinfra` occurences by `py.test` anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
I didn't use the `ceph/ubuntu-bionic` image because it's broken at the
time of writing this commit. I'll switch back to `ceph/ubuntu-bionic` as
soon as it will be fixed.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
using `!` mark in tox.ini doesn't work on comma separated list.
The idea here is to skip all containerized scenario in dev_setup.yml and
use the `!` for the update scenario.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
- switch_to_containers, ooo_collocation, podman should be able to specify
which os they are running on.
- those scenarios are only container scenario.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
this is useful when trying to debug an ansible module issue.
This will allow us to replay a failing module on a node when needed.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit reorganizes the testing directory layout.
The idea is to have more consistency with the names of scenario and
their corresponding path, eg: non-container vs. container: each scenario
has a subdirectory for container deployment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
this commit refacts the way the environment are named by adding a factor
`{non_container,container}`. This will avoid a lot of duplicate
definition in tox.ini and bring kind of consistency.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
- update value for `CEPH_STABLE_RELEASE`: next release will ship with
`nautilus`. This variable is used for stable branch only, this way, it
will be ready when next stable version will be released.
- test upgrade from mimic to ceph@master: don't run dev_setup.yml on update
scenario, and run it in [update] section so we update from mimic to
ceph@master.
- run lvm_setup.yml for all scenarios except `lvm_batch`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
remove useless environment variable definition since we know we only
test ceph@master on ceph-ansible master branch.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When calling shrink on containerized deployment, we were first doing the
setup with `latest-master` and then when calling the playbook we were
using the default value for `ceph_docker_image_tag` that comes from
ceph-defaults. Now we pass
`ceph_docker_image_tag={env:CEPH_DOCKER_IMAGE_TAG:latest-master}` so the
play will run the right container image.
Signed-off-by: Sébastien Han <seb@redhat.com>
the first cluster is using `latest-master` while the second is using
`latest` which is not the right version to be used here.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
we must apply this playbook before deploying the secondary cluster.
Otherwise, there will be a mismatch between the two deployed cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a playbook that will upload a file on the master then try to get
info from the secondary node, this way we can check if the replication
is ok.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This will setup 2 cluster with rgw multisite enabled.
First cluster will act as the 'master', the 2nd will be the secondary
one.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we are removing the ceph-disk test from the ci in master then
there is no need to have the functionnal tests in master anymore.
Signed-off-by: Sébastien Han <seb@redhat.com>
the line setting `ANSIBLE_CONFIG` obviously contains a typo introduced
by 1e283bf69b
`ANSIBLE_CONFIG` has to point to a path only (path to an ansible.cfg)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Adding testing scenarios for day-2-operation playbook.
Steps:
- deploys a cluster,
- run testinfra,
- test idempotency,
- add a new osd node,
- run testinfra
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit does a couple of things:
* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself to the author of the module
Signed-off-by: Sébastien Han <seb@redhat.com>
Using `UPDATE_*` environment variables here will make an upgrade of the
ceph release when running switch_to_containers scenario which is not
correct.
Eg:
If ceph luminous was first deployed, then we should switch to ceph
luminous containers, not to mimic.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The copy module does in fact do variable interpolation so we do not need
to use the template module or keep a template in the source.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Using an explicitly named testing environment name allows us to have a
specific [testenv] block for this test. This greatly simplifies how it will
work as it doesn't really anything from the ceph cluster tests.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This adds the action 'batch' to the ceph-volume module so that we can
run the new 'ceph-volume lvm batch' subcommand. A functional test is
also included.
If devices is defind and osd_scenario is lvm then the 'ceph-volume lvm
batch' command will be used to create the OSDs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Let's create a dedicated environment for these scenarios, there is no
need to deploy everything.
By the way, doing so will save some times.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`test_rbd_mirror_is_up()` is failing on update scenarios because it
assumes the `ceph_stable_release` is still set to the value of the
original ceph release, it means it won't enter in the right part of the
condition and fails.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In addition to ceph/ceph-build#1082
Let's set the ansible version in each ceph-ansible branch's respective
requirements.txt.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since no latest-bis-jewel exists, it's using latest-bis which points to
ceph mimic. In our testing, using it for idempotency/handlers tests
means upgrading from jewel to mimic which is not what we want do.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We should test ceph-ansible against the latest ansible stable version on
master.
This commit also remove the pinning to 1.7.1 version of testinfra
because ansible 2.5 requires a newer version.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
that may be helpful to know why a test has been skipped.
from pytest doc:
```
-r chars show extra test summary info as specified by chars
(f)ailed, (E)error, (s)skipped, (x)failed, (X)passed,
(p)passed, (P)passed with output, (a)all except pP.
Warnings are displayed at all times except when
--disable-warnings is set
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since `latest` points to `mimic`, we need to force the test to keep the
same ceph release when testing anything else than `mimic`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
it looks "ansible~=2.4" install latest ansible release in 2.5 so we must
specify we want latest release but inferior to 2.5.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since latest points to mimic for the ceph container images, we need to
set `CEPH_DOCKER_IMAGE_TAG` to `latest-luminous` when ceph release is
luminous
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The admin key must be copied on the osd nodes only when we test the
shrink scenario. Shrink relies on ceph-disk commands that require the
admin key on the node where it's being executed.
Now we only copy the key when running on the shrink-osd scenario.
Signed-off-by: Sébastien Han <seb@redhat.com>
In the CI we can see at many times failures like following:
`Failure talking to yum: Cannot find a valid baseurl for repo:
base/7/x86_64`
It seems the fastest mirror detection is sometimes counterproductive and
leads yum to fail.
This fix has been added in the `setup.yml`.
This playbook was used until now only just before playing `testinfra`
and could be used before running ceph-ansible so we can add some
provisionning tasks.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Erwan Velu <evelu@redhat.com>
Currently tag-build-master-luminous-ubuntu-16.04 is not used anymore.
Also now, 'latest' points to CentOS so we need to make that switch here
too.
We know have latest tags for each stable release so let's use them and
point tox at them to deploy the right version.
Signed-off-by: Sébastien Han <seb@redhat.com>