This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 399a821439)
Since nautilus and msgr2 the monitors also bind on port 3300 in
addition of 6789.
This patch updates test_mons to reflect that change.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c84a74592a)
Since there's only only scenario available we don't need lvm_scenario
and no_lvm_scenario.
Also add missing assert for ceph-volume tests.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f601549a8a)
Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d5967af7fb)
this test is related to ceph-disk which is dropped as of stable-4.0
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 83e84c6a4a)
As of stable-4.0, the only valid scenario is `lvm`.
Thus, this makes this variable useless.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4d35e9eeed)
It's usefull to have logs in debug mode enabled in order to have
more information for developpers.
Also reindent to json file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d25af1b872)
We don't need to have multiple ceph-override.json copies. We
currently already have symlink to all_daemons/ceph-override.json so
we can do it for all scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a19054be18)
ceph-volume didn't work when the devices where passed by path.
Since it now support it, let's allow this feature in ceph-ansible
Closes: #3812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e0adca7a4)
Add a tox scenario that adds an new MDS node as a part of already
deployed Ceph cluster and deploys MDS there.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c0dfa9b61a)
looks like newer version of pytest-xdist requires pytest>=4.4.0
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ba0a95211c)
On stable-4.0 branch we don't want to use dev setup but stable
release (nautilus).
Also update the container image tag to reflect this change.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
os.path.join adds the separator (i.e. '/') between the provided path
components only if needed. Providing a single path component doesn't
lead to any checks.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
otherwise a bunch of jobs will fail like following:
```
[WARNING]: Unable to parse /home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-ubuntu-container-stable-3.2-bluestore_lvm_osds/tests/functional/bs-lvm-osds/container/hosts-ubuntu as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
As of testinfra 2.0.0, the binary name is `py.test`.
But let's pin the version to 1.19.0.
Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit.
Since we don't have the bandwidth ATM for this, it's better to simply
keep testing with testinfra 1.19.0.
Note that I've replaced all `testinfra` occurences by `py.test` anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`test_mon_host_line_has_correct_value()` will cover this test in
anycase. It doesn't worth to have a dedicated test for this.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
nfs-ganesha v2.5 and 2.6 have hit EOL. Install nfs-ganesha v2.7
stable that is currently being maintained.
Signed-off-by: Ramana Raja <rraja@redhat.com>
using `!` mark in tox.ini doesn't work on comma separated list.
The idea here is to skip all containerized scenario in dev_setup.yml and
use the `!` for the update scenario.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
fix the wrong path used in various rgw testinfra tests.
set `1` as default value for `radosgw_num_instances`: if
`ansible_vars.get(radosgw_num_instances)` returns `None`, we can assume
there's only 1 instance since it's the default value in ceph-defaults.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since we now set this variable in inventory host, the regexp needs to be
updated, the assignment operator is `=`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit reorganizes the testing directory layout.
The idea is to have more consistency with the names of scenario and
their corresponding path, eg: non-container vs. container: each scenario
has a subdirectory for container deployment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
this commit refacts the way the environment are named by adding a factor
`{non_container,container}`. This will avoid a lot of duplicate
definition in tox.ini and bring kind of consistency.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
sometimes it can fail because the version of `six` package is prior to
1.10.0.
This commit ensures the version is enforced.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
With this, we could have multiple rgw instances on a single host
with a single run, don't have to use rgw-standalone.yml which does not
seems able to bind ports separately.
If you want to have multiple rgw instances, just change 'radosgw_instances'
to the number you want, which defaults to 1.
Not compatible with Multi-Site yet.
Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
to avoid duplicating code in `site.yml.sample`, `site-docker.yml.sample`
and `setup.yml`, let's isolate this part of the code and simply include
it each time we need it.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since msgr2 changes got merged, the OSDs in master (to be nautilus) will
double the amount of ports they listen to.
Signed-off-by: Alfredo Deza <adeza@redhat.com>
Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages and reliable repository, we disable nfs
ganesha temporarly.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we now collocated mgrs and mons on the same machine we have to
remove the mgrs section, they are not needed anymore.
Signed-off-by: Sébastien Han <seb@redhat.com>
This will speed up the deployment and also deploy mon and mgr collocated
just as recommended.
This won't prevent you of adding more and dedicaded machines for mgr if
needed.
Signed-off-by: Sébastien Han <seb@redhat.com>
bring the recent refact about `osd_pool_default_pg_num` and
`osd_pool_default_size` into podman scenario as well.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we are now testing on docker and podman our functionnal tests must
reflect that. So now, if we detect the podman binary we will use it,
otherwise we default to docker.
Signed-off-by: Sébastien Han <seb@redhat.com>
We run an initial deployment with `osd_pool_default_size: 1` in
`ceph_conf_overrides`.
When re-running the playbook to test idempotency and handlers, we reset
`ceph_conf_overrides`, we must append a new value instead of just
overwritting it, otherwise, this can lead to error in the CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specifc group of nodes, it
will fail when trying to access this variables since it wouldn't be
defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
setting this setting to 1 makes the CI covering the related code in the
playbook without breaking the upgrade scenarios.
Those scenarios were broken because there is a check `TASK [waiting for
clean pgs...]` in rolling_update.yml, since the pool size for
`cephfs_metadata` and `cephfs_data` are updated to `2` in
`ceph-override.json` and there is not enough osd to honor this size,
some PGs are degraded and make the mentioned check failing.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
** configuration seems to be for filestore:
[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes
** Removing `radosgw_interface: eth1` to resolve:
The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'
The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.
The offending line appears to be:
- name: set_fact _radosgw_address to radosgw_interface - ipv4
^ here
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Let's test ceph-ansible master against ansible 2.7 to catch early any
potential issue with this ansible version.
Closes: #3148
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Adding more memory to VMs for rgw_multisite scenarios could avoid this error
I have recently hit in the CI:
(It is worth it to set 1024Mb since there is only 2 nodes in those
scenarios.)
```
fatal: [osd0]: FAILED! => {
"changed": false,
"cmd": [
"docker",
"run",
"--rm",
"--entrypoint",
"/usr/bin/ceph",
"docker.io/ceph/daemon:latest-luminous",
"--version"
],
"delta": "0:00:04.799084",
"end": "2018-10-29 17:10:39.136602",
"rc": 1,
"start": "2018-10-29 17:10:34.337518"
}
STDERR:
Traceback (most recent call last):
File "/usr/bin/ceph", line 125, in <module>
import rados
ImportError: libceph-common.so.0: cannot map zero-fill pages: Cannot allocate memory
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a playbook that will upload a file on the master then try to get
info from the secondary node, this way we can check if the replication
is ok.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This will setup 2 cluster with rgw multisite enabled.
First cluster will act as the 'master', the 2nd will be the secondary
one.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We do not use @<device> anymore so we don't need to perform the
readlink check anymore.
Also we are making an exception for ooo which is still using ceph-disk.
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-disk is now deprecated in ceph-ansible so let's convert all the ci
tests to use lvm instead of ceph-disk.
Signed-off-by: Sébastien Han <seb@redhat.com>
Since we are removing the ceph-disk test from the ci in master then
there is no need to have the functionnal tests in master anymore.
Signed-off-by: Sébastien Han <seb@redhat.com>
since we set `configure_firewall: true` in
`ceph-defaults/defaults/main.yml` there is no need to explicitly set it
in `centos7_cluster` and `docker_cluster` testing scenarios.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This approach doesn't work with all scenarios because it's comparing a
local OSD number expected to a global OSD number found in the whole
cluster.
This reverts commit b8ad35ceb9.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Let's get the osd tree from mons instead on osds.
This way we don't have to predict an OSD container name.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Three fixes:
- fix a typo in vagrant_variables that cause a networking issue for
containerized scenario.
- add containerized_deployment: true
- remove a useless block of code: the fact docker_exec_cmd is set in
ceph-defaults which is played right after.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Adding testing scenarios for day-2-operation playbook.
Steps:
- deploys a cluster,
- run testinfra,
- test idempotency,
- add a new osd node,
- run testinfra
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The CI is still running ceph-disk tests upstream. So until
https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will
pass anymore.
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit does a couple of things:
* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself to the author of the module
Signed-off-by: Sébastien Han <seb@redhat.com>
not gathering fact causes `package` module to fail because it needs to
detect which OS we are running on to select the right package manager.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
get more coverage by adding an RGW daemon collocated on osd0.
We've missed a bug in the past which could have been caught earlier in
the CI.
Let's add this additional daemon in order to have a better coverage.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
If this is set to anything other than the default value of 1 then the
--osds-per-device flag will be used by the batch command to define how
many osds will be created per device.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
There is no point of using hosts running on atomic AND centos hosts. So
let's run containerized scenarios on Atomic only.
This solves this error here:
```
fatal: [client2]: FAILED! => {
"failed": true
}
MSG:
The conditional check 'ceph_current_status.rc == 0' failed. The error was: error while evaluating conditional (ceph_current_status.rc == 0): 'dict object' has no attribute 'rc'
The error appears to have been in '/home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/roles/ceph-defaults/tasks/facts.yml': line 74, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
- name: set_fact ceph_current_status (convert to json)
^ here
```
From https://2.jenkins.ceph.com/view/ceph-ansible-stable3.1/job/ceph-ansible-nightly-luminous-stable-3.1-ooo_collocation/37/consoleFull#1765217701b5dd38fa-a56e-4233-a5ca-584604e56e3a
What's happening here is all the hosts excepts the clients are running atomic, so here: https://github.com/ceph/ceph-ansible/blob/master/site-docker.yml.sample#L62
The condition will skipped all the nodes excepts the clients, thus when running ceph-default, the task "is ceph running already?" is skipped but the task above needs the rc of the skipped task.
This is not an error from the playbook, it's a CI setup issue.
Signed-off-by: Sébastien Han <seb@redhat.com>
Using an explicitly named testing environment name allows us to have a
specific [testenv] block for this test. This greatly simplifies how it will
work as it doesn't really anything from the ceph cluster tests.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This adds the action 'batch' to the ceph-volume module so that we can
run the new 'ceph-volume lvm batch' subcommand. A functional test is
also included.
If devices is defind and osd_scenario is lvm then the 'ceph-volume lvm
batch' command will be used to create the OSDs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
jewel used to create a default `rbd` pool in the default crush root
`default`, we need to have at least 1 osd to satisfy the PGs for this
created pool, otherwise the cluster will be in HEALTH_ERR state because
of `pgs stuck unclean`/`pgs stuck inactive`
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Let's create a dedicated environment for these scenarios, there is no
need to deploy everything.
By the way, doing so will save some times.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Ansible 2.4 is currently end-of-life.
Ansible 2.5 will go end-of-life after Ansible 2.7 is released.
Fixes: #2901
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The group_vars/all file is not available on 'ooo-collocation' scenario,
it's making the `dev_setup.yml` failing because this path is hardcoded.
The idea here is to check if the pattern 'ooo-collocation' is present in
`change_dir` variable so we can set this path properly according to the
scenario being run.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`test_rbd_mirror_is_up()` is failing on update scenarios because it
assumes the `ceph_stable_release` is still set to the value of the
original ceph release, it means it won't enter in the right part of the
condition and fails.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
34f70428 has introduced a fix using `command` module while this could
have been achieved by using `lvol` module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We must initialize `children` variable in `_get_osd_id_from_host()`,
otherwise, if for any reason the deployment has failed and result with
an osd host with no OSD registered, we won't enter in the condition,
therefore, `children` is never set and the function tries to return
something undefined.
Typical error:
```
E UnboundLocalError: local variable 'children' referenced before assignment
```
Fixes: #2860
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We should test ceph-ansible against the latest ansible stable version on
master.
This commit also remove the pinning to 1.7.1 version of testinfra
because ansible 2.5 requires a newer version.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
avoid duplicating test unnecessarily just because of docker exec syntax.
Using the same logic than in the playbook with `docker_exec_cmd` allow us
to execute the same test on both containerized and non containerized environment.
The idea is to set a variable `docker_exec_cmd` with the
'docker exec <container-name>' string when containerized and
set it to '' when non containerized.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
these tests are skipped on bluestore osds scenarios.
they were going to fail anyway since they are run on mon nodes and
`devices` is defined in inventory for each osd node. It means
`num_devices * num_osd_hosts` returns `0`.
The result is that the test expects to have 0 OSDs up.
The idea here is to move these tests so they are run on OSD nodes.
Each OSD node checks their respective OSD to be UP, if an OSD has 2
devices defined in `devices` variable, it means we are checking for 2
OSD to be up on that node, if each node has all its OSD up, we can say
all OSD are up.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since ooo_collocation scenario is supposed to be the same scenario than the
one tested by OSP and they are not passing `rgw_create_pools` the test
`test_docker_rgw_tuning_pools_are_set` will fail:
```
> pools = node["vars"]["rgw_create_pools"]
E KeyError: 'rgw_create_pools'
```
skipping this test if `node["vars"]["rgw_create_pools"]` is not defined
fixes this failure.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
At the moment, a lot of tests are skipped when daemons are collocated.
Our tests consider a node belong to only 1 group while it's possible for
certain scenario it can belong to multiple groups.
Also pinning to pytest 3.6.1 so we can use `request.node.iter_markers()`
Co-Authored-by: Alfredo Deza <adeza@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
It might happen that the list of ips/hosts in following line (ceph.conf)
- `mon initial memebers = <hosts>`
- `mon host = <ips>`
are not ordered the same way depending on deployment.
This patch makes the tests looking for each ip or hostname in respective
lines.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
adding more node in this scenario could help to have a better coverage
so we can catch more potential bugs.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since `latest` points to `mimic`, we need to force the test to keep the
same ceph release when testing anything else than `mimic`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
we see more and more failure like `fatal: [mon0]: UNREACHABLE! => {}` in
`centos7_cluster` scenario, Since we have 30Gb RAM on hypervisors, we
can give monitors a bit more RAM. By the way, nodes on containerized cluster
testing scenario have already 1024Mb memory allocated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We don't have the same service running on non-container for now, this
will change soon but for let's only run the test on container.
Signed-off-by: Sébastien Han <seb@redhat.com>
Let's try to avoid using dashes as testinfra needs to be able to read
the groups.
Typically, with iscsi-gws we can't add a marker for these iscsi nodes,
using an underscore fixes the issue.
Signed-off-by: Sébastien Han <seb@redhat.com>
We test if:
* packages are installed
* services are runnning
* service units are enabled
Also fix linting issues
Signed-off-by: Sébastien Han <seb@redhat.com>
Update the inventory host for tripleo testing scenario so it's the same
parameters than in tripleo CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Functional tests are broken when testing against 'dev' release (ceph).
Adding a dummy value here will make it possible to run ceph-ansible CI
against dev ceph release.
Typical error:
```
> if request.node.get_marker("from_luminous") and ceph_release_num[ceph_stable_release] < ceph_release_num['luminous']:
E KeyError: 'dev'
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fd1487d93f21b609a637053f5b33cd2a4e408d00)
the expected number of mds daemon consist of number of daemons that are
'up' + number of daemons 'up:standby'.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
41b4632 has introduced a change in functionnals tests.
Since the admin keyring isn't copied on rgw nodes anymore in tests, let's use
the rgw keyring to achieve them.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The admin key must be copied on the osd nodes only when we test the
shrink scenario. Shrink relies on ceph-disk commands that require the
admin key on the node where it's being executed.
Now we only copy the key when running on the shrink-osd scenario.
Signed-off-by: Sébastien Han <seb@redhat.com>
Refact of 8704144e31
There is no need to have duplicated tasks for this. The rgw pools
creation should be delegated on a monitor node se we don't have to care
if the admin keyring is present on rgw node.
By the way, only one task is needed to create the pools, we just need to
use the `docker_exec_cmd` fact already defined in `ceph-defaults` to
achieve it.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1550281
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
During the tests, the remote epel repository is generating a lots of
errors leading to broken jobs (issue #2666)
This patch is about using a local repository instead of a random one.
To achieve that, we make a preliminary install of epel-release, remove
the metalink and enforce a baseurl to our local http mirror.
That should speed up the build process but also avoid the random errors
we face.
This patch is part of a patch series that tries to remove all possible yum failures.
Signed-off-by: Erwan Velu <erwan@redhat.com>
ceph command has to be executed from one of the monitor containers
if not admin copy present in RGWs. Task has to be delegated then.
Adds test to check proper RGW pool creation for Docker container scenarios.
Signed-off-by: Jorge Tudela <jtudelag@redhat.com>
When playing ceph-mds role, mon nodes have set a fact with the default
pg num for osd pools, we can simply default to this value for cephfs
pools (`cephfs_pools` variable).
At the moment the variable definition for `cephfs_pools` looks like:
```
cephfs_pools:
- { name: "{{ cephfs_data }}", pgs: "" }
- { name: "{{ cephfs_metadata }}", pgs: "" }
```
and we have a task in `ceph-validate` to ensure `pgs` has been set to a
valid value.
We could simply avoid this check by setting the default value of `pgs`
to `hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` and
let to users the possibility to override this value.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1581164
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`requirements2.5.txt` is pointing to `tests/requirements2.4.txt` while
it should point to `requirements2.4.txt` since they are in the same
directory.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
For a few moment we can see failures in the CI for containerized
scenarios because VMs are running out of space at some point.
The default in the images used is to have only 3Gb for root partition
which doesn't sound like a lot.
Typical error seen:
```
STDERR:
failed to register layer: Error processing tar file(exit status 1): open /usr/share/zoneinfo/Atlantic/Canary: no space left on device
```
Indeed, on the machine we can see:
```
Every 2.0s: df -h Tue May 29 17:21:13 2018
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/atomicos-root 3.0G 3.0G 14M 100% /
```
The idea here is to expand this partition with all the available space
remaining by issuing an `lvresize` followed by an `xfs_growfs`.
```
-bash-4.2# lvresize -l +100%FREE /dev/atomicos/root
Size of logical volume atomicos/root changed from <2.93 GiB (750 extents) to 9.70 GiB (2484 extents).
Logical volume atomicos/root successfully resized.
```
```
-bash-4.2# xfs_growfs /
meta-data=/dev/mapper/atomicos-root isize=512 agcount=4, agsize=192000 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0 spinodes=0
data = bsize=4096 blocks=768000, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 768000 to 2543616
```
```
-bash-4.2# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/atomicos-root 9.7G 1.4G 8.4G 14% /
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In the CI we can see at many times failures like following:
`Failure talking to yum: Cannot find a valid baseurl for repo:
base/7/x86_64`
It seems the fastest mirror detection is sometimes counterproductive and
leads yum to fail.
This fix has been added in the `setup.yml`.
This playbook was used until now only just before playing `testinfra`
and could be used before running ceph-ansible so we can add some
provisionning tasks.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Erwan Velu <evelu@redhat.com>
let's move this variable in group_vars/all.yml in all testing scenarios
accordingly to this commit 1f15a81c48 so
we keep consistency between the playbook and the tests.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.
The idea here is to move openstack pools creation at the end of `ceph-osd` role.
[1] e59258943b/src/mon/OSDMonitor.cc (L5673)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The NSS PKI database is needed only if radosgw_keystone_ssl
is explicitly set to true, otherwise the SSL integration is
not enabled.
It is worth noting that the PKI support was removed from Keystone
starting from the Ocata release, so some code paths should be
changed anyway.
Also, remove radosgw_keystone, which is not useful anymore.
This variable was used until fcba2c801a.
Now profiles drives the setting of rgw keystone *.
Signed-off-by: Luigi Toscano <ltoscano@redhat.com>
As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but
an id, therefore an `int` is expected otherwise it will fail with the
following error
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
in case of multimds we must check for the number of mds up instead of
just checking if the hostname of the node is in the fsmap.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>