Commit [1] seems to have broken an SELinux policy, preventing nfs-ganesha from
starting properly.
Since we can't address the issue in ceph-ansible, let's temporarily disable nfs-ganesha testing.
[1] dae2da63d5
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Configure the repository for cephadm installation and use the package install method in both containerized and non-containerized deployments.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Otherwise the clients won't be able to reconnect after the reboot in the
all_daemons and collocation jobs.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
For some reason, `quay.io/app-sre/grafana` no longer exists.
As a workaround, all dashboard-related images have been mirrored to
quay.ceph.io.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
nfs-ganesha builds in shaman are broken.
This commit disables nfs-ganesha testing in order to unlock the CI.
This is a temporary commit intended to be reverted.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant. In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.
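As a hedged illustration of the pattern (the task and fact below are just examples, not taken from the playbooks):

```yaml
# Before: relies on top-level injected facts (inject_facts_as_vars = True)
- name: print the node hostname (illustrative task)
  debug:
    msg: "{{ ansible_hostname }}"

# After: reads from the ansible_facts dict, also works with inject_facts_as_vars = False
- name: print the node hostname (illustrative task)
  debug:
    msg: "{{ ansible_facts['hostname'] }}"
```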
Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
We aren't deploying enough OSD daemons, so it fails like the following:
```
stderr: 'Error ERANGE: pool id 10 pg_num 256 size 2 would mean 1536 total pgs, which exceeds max 1500 (mon_max_pg_per_osd 250 * num_in_osds 6)'
```
Let's increase the value of `mon_max_pg_per_osd` in order to get around
this issue in the CI.
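One way to raise it in a test scenario is through `ceph_conf_overrides`; the value below is only illustrative, not necessarily what the CI ends up using:

```yaml
ceph_conf_overrides:
  global:
    # raise the per-OSD PG limit so the pools created by the tests fit
    mon_max_pg_per_osd: 300
```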
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This updates the grafana container tag to 6.7.4.
The RHCS version is now based on the RHCS 5 container image, which is
also based on 6.7.4.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The `all_daemons` scenario can't handle pools with `size: 3` because we have
one OSD node in root=HDD and two nodes in root=default.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This reverts commit 7348e9a253.
The nfs-ganesha rpm build for CentOS 8 has been fixed, and
the nfs-ganesha segfault caused by an issue in librgw has also been
fixed.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit changes default values in the default pool definitions.
There's no need to define `pg_num`, `pgp_num`, `size` and `min_size`;
the `ceph_pool` module will use the current defaults if needed (see the
example below).
This also drops the following 3 `set_fact` in `ceph-facts`:
- osd_pool_default_pg_num,
- osd_pool_default_pgp_num,
- osd_pool_default_size_num
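For illustration, a pool definition (hypothetical name and values) can now be trimmed down like this and left to the cluster defaults:

```yaml
# before
test_pool:
  name: test
  pg_num: "{{ osd_pool_default_pg_num }}"
  pgp_num: "{{ osd_pool_default_pgp_num }}"
  size: "{{ osd_pool_default_size }}"
  application: rbd

# after: ceph_pool falls back to the cluster defaults for the omitted keys
test_pool:
  name: test
  application: rbd
```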
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This changes the default value of the grafana-server group name.
Some tasks are added in ceph-defaults in order to keep backward
compatibility.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The `enable extras on centos` task just doesn't work when setting the
variable ceph_docker_enable_centos_extra_repo to true:
fatal: [xxx]; FAILED! => {"changed": false, "msg": "Parameter
'baseurl', 'metalink' or 'mirrorlist' is required."}
The CentOS extras repository is enabled by default so it's pretty
safe to remove this task and the associated variable.
This also removes the ceph_docker_on_openstack variable as it's a
leftover and it is unused.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Looks like nfs-ganesha 3.3 and 4.-dev don't work with recent changes
in librgw 16.0.0.
The nfs-ganesha daemon is segfaulting and restarting in a loop.
See https://tracker.ceph.com/issues/47520
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This changes the grafana container image registry from docker.io to
quay.io to avoid rate limit.
This also adds the missing container image values for docker2podman
and podman scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit disables nfs-ganesha testing on master for non-containerized
deployment because the dev repos are broken at the moment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This re-adds the ceph-iscsi testing for both non-containerized and
containerized deployments since the rados connection error on ceph
dev has been fixed [1].
[1] https://tracker.ceph.com/issues/47002
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of OSD nodes we spawn and the
OVH Jenkins slaves are less overwhelmed when an `all_daemons` based
scenario is being tested.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we've dropped ubuntu testing, we don't need these inventories
anymore. Let's remove this leftover.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Temporarily disable iscsigw testing for containerized deployments
because it's broken upstream on ceph@master.
Non-containerized deployments use a stable build for iscsigw to get around
this issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since it is broken at the moment with dev repos, let's test against
stable builds so the CI is unlocked.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we only have one scenario since nautilus, we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Setting attributes to an empty string is bad user input.
Also, remove the `rule_name` attribute when creating an erasure-coded pool
(this rule isn't intended for the erasure-coded pool type).
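A hedged sketch of an erasure-coded pool definition without `rule_name` (name and values are illustrative):

```yaml
test_ec_pool:
  name: ec_pool
  type: erasure
  erasure_profile: default
  application: rbd
  # no rule_name here: that attribute only makes sense for replicated pools
```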
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since [1], we can't use an OSD pool without replicas (size: 1) by default.
We now need to set the mon_allow_pool_size_one flag to true in the ceph
configuration and add the --yes-i-really-mean-it flag to the osd pool
set size CLI.
[1] https://github.com/ceph/ceph/commit/21508bd
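As a rough sketch, assuming the usual `ceph_conf_overrides` mechanism is used for the monitor option:

```yaml
ceph_conf_overrides:
  global:
    # allow pools with a single replica (size: 1)
    mon_allow_pool_size_one: true
```

The size change itself then needs the extra confirmation flag, e.g. `ceph osd pool set <pool> size 1 --yes-i-really-mean-it`.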
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit makes the CI test erasure-coded OSD pool creation, following the
recent refactor of the OSD pool creation tasks in the playbook.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Having the max_mds value equal to the number of mds nodes generates a
warning in the ceph cluster status:

  cluster:
    id:     6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a
    health: HEALTH_WARN
            insufficient standby MDS daemons available
  (...)
  services:
    mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}
Let's use 2 active and 1 standby mds.
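In ceph-ansible terms that would look something like the snippet below; the variable name is assumed from the mds role defaults, so treat it as a sketch:

```yaml
# with 3 mds nodes: 2 active ranks and 1 standby
mds_max_mds: 2
```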
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The current ceph cluster health is in warning state:

  health: HEALTH_WARN
          13 pool(s) have no replicas configured
          2 pool(s) have non-power-of-two pg_num

Because we're using only 1 replica, we need to disable the redundancy
check.
The pool pg_num should be a power of two (like 16).
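A sketch of how both warnings can be addressed, assuming `ceph_conf_overrides` is used for the monitor option (pool name and values are illustrative):

```yaml
ceph_conf_overrides:
  global:
    # silence the "pool(s) have no replicas configured" warning for size: 1 pools
    mon_warn_on_pool_no_redundancy: false

# illustrative pool definition using a power-of-two pg_num
test_pool:
  name: test
  pg_num: 16
  size: 1
  application: rbd
```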
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated subcommand.
If the class key isn't specified, we use the create-simple subcommand
for backward compatibility.
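Roughly, a rule dict with and without the `class` key maps to the two subcommands like this (field names follow the usual `crush_rules` structure, values are illustrative):

```yaml
crush_rules:
  # with "class": ceph osd crush rule create-replicated replicated_hdd default host hdd
  - name: replicated_hdd
    root: default
    type: host
    class: hdd
  # without "class": ceph osd crush rule create-simple simple_rule default host
  - name: simple_rule
    root: default
    type: host
```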
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This was added for debugging purposes.
It generates very large log output, so let's remove it now.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>