generate all group_vars sample files corresponding to new roles added
for ceph-dashboard implementation.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
since stable-4.0 isn't to deploy ceph releases prior to nautilus,
there's no need to add this complexity here.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the differents applications needed for ceph-dashboard.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
there is no need to add more complexity for this, let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This is needed for the ceph-dashboard implementation since it requires
to run containerized application which aren't packaged as RPMs.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This adds support for podman in dashboard-related roles. It also drops
the creation of custom network for the dashboard-related roles as this
functionality works in a different way with podman.
Signed-off-by: Boris Ranto <branto@redhat.com>
We cannot use the old fashioned config-key way, here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.
Signed-off-by: Boris Ranto <branto@redhat.com>
This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.
Signed-off-by: Boris Ranto <branto@redhat.com>
This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to setup ceph-dashboard
and the underlying technologies like prometheus and grafana server.
Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
c907ec41ae introduced a typo.
This commit fixes it.
```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We never clean the content of /var/lib/docker so we can still have
some data present in this directory after run the purge playbook.
Pip isn't used anymore.
Also update the docker package name (especially the python binding
one).
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Currently podman installation is very tied to RHEL 8 even if we're
able to install it on Debian/Ubuntu distribution.
This patch changes the way we are starting or not the (fat) container
daemon. Before the condition was based on the distribution release
and now on the container_service_name variable.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The shell module doesn't have a stdout_lines attributes. Instead of
using the shell module, we can use the find modules.
Also adding `become: false` to the local tmp directory creation
otherwise we won't have enough right to fetch the files into this
directory.
Resolves: #3966
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"
now it is
"when": [
"nfs_ganesha_stable",
"ceph_repository == 'community'"
]
Please backport to stable-4.0
Signed-off-by: Bruceforce <markus.greis@gmx.de>
RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Updated the default ceph_rhcs_version to 4 and update the rhcs
repositories to rhcs 4 with RHEL 8.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the heath is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool.
Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
We must stop tcmu-runner after the other rbd-target-* services
because they may need to interact with tcmu-runner during shutdown.
There is also a bug in some kernels where IO can get stuck in the
kernel and by stopping rbd-target-* first we can make sure all IO is
flushed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1659611
Signed-off-by: Mike Christie <mchristi@redhat.com>
The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
running an external ceph cluster deployment with (obviously) no
monitors defined in inventory breaks with an undefined error because
`_monitor_addresses` never get defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add code in ceph-mgr for creating a keyring for manager in so that
managers can be deployed on a separate node too.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
We don't need infrastructure-playbooks/rgw-standalone.yml since
site.yml.sample and site-cotainer.yml.sample can add a new RGW node to
an already deployed Ceph cluster.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Except for some corner case, it's not correct to access some other
node's copy of variable docker_exec_cmd. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by
"docker_exec_cmd".
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.
Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
In containerized deployment the default mds cpu quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
In containerized deployment the default osd cpu quota is too low
for production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Only rbd-target-api and rbd-target-gw were started/enabled for non
containerized deployment.
The issue doesn't happen with containerized setup.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Instead of creating a dedicated test and using the same testinfra
module we can group them into a single test to avoid multiple ansible
connections and testinfra module execution.
This patch also adds parametrize pytest decorator when possible.
Finally fixing some flake minor issue.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
update scenario is now handled by tox-update.ini file so we shoudn't
have update reference in tox.ini file.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
b2f2426 didn't use the generate_group_vars_sample.sh script so we
currently have a difference between the content in group_vars and the
ceph-defaults/defaults directories.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Currently only rbd-target-gw service is restarted during an update.
We also need to restart tcmu-runner and rbd-target-api services
during the ceph iscsi upgrade.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1659611
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Typical error:
```
AttributeError: 'Invalid' object has no attribute 'message'
```
As of python 2.6, `BaseException.message` has been deprecated.
When using python3, it fails because it has been removed.
Let's use `str(error)` instead so we don't hit this error when using
python3.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>