Since we've changed to podman configuration using the detach mode and
systemd type to forking then the container logs aren't present in the
journald anymore.
The default conmon log driver is using k8s-file.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890439
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 16cd183b9c)
libjemalloc1 package is not required neither for ganesha dependency nor
for the package build process. So this task can be simply dropped.
Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
(cherry picked from commit 297532ca41)
This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the
cluster doesn't contain a rgw node, which can be the case given we are
using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the
deployment will fail trying to copy a key that doesn't exist.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dd4b5b0328)
In case of failure, the systemd ExecStop isn't executed so the container
isn't removed. After a reboot of a failed node, the container doesn't
start because the old container is still present in created state.
We should always try to remove the container in ExecStartPre for this
situation.
A normal reboot doesn't trigger this issue and this also doesn't affect
nodes running containers via docker.
This behaviour was introduced by d43769d.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 47b7c00287)
This changes the way we are running the podman containers via systemd.
They are now in dettached mode and Type/PIDFile set.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d43769dc2a)
When using docker container engine then the systemd unit scripts only
use a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system then the ceph systemd units
should wait for the docker daemon to be fully restarted.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bd22f1d1ec)
The current ganesha log directory is only present in the container
and not bind mount on the host.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 222fe4abd8)
Since ea2b654d9 we're not running the rados command from the monitor
nodes but from the ganesha node. Unfortunately we don't have the
required keyring on that node to run the rados command as we don't
import the right keyring.
This commit restores the workflow for internal ganesha deployment like
before ea2b654d9 but keeps the rados commands from the ganesha node for
external deployment until we have a better design.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8a890306ad)
Fix the condition on the keyring copy task that prevent the ganesha
keyring to be created in the /var/lib/ceph directory.
Also ensure that the directory exists first.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 748ac4b928)
The condition is missing an index here which makes the playbook failing.
Typical error:
```
The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'",
```
Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cf460274c7)
This commit creates an empty rados index object even when deploying
standalone nfs-ganesha.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ea2b654d95)
Because we are relying on docker|podman for managing containers then we
don't need systemd to manage the process (like kill).
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5a03e0ee1c)
Since nfs-ganesha 2.8.3 the rados-urls library has been move to a
dedicated package.
We don't have the same nfs-ganesha 2.8.x between the community and rhcs
repositories.
community: 2.8.1
rhcs: 2.8.3
As a workaround we will install that package only for rhcs setup.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0a3e85e8ca)
The ceph_nfs_ceph_user variable is a string for the ceph-nfs role but a
list in ceph-client role.
6a6785b introduced a confusion between both variable type in the ceph-nfs
role for external ceph with ganesha.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1801319
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 10951eeea8)
When unsetting the noup flag, we must call container_exec_cmd from the
delegated node (first mon member)
Also, adding a `run_once: true` because this task needs to be run only 1
time.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 22865cde9c)
Since RHEL 8.1 we need to add the ganesha_t type to the permissive
SELinux list.
Otherwise the nfs-ganesha service won't start.
This was done on RHEL 7 previously and part of the nfs-ganesha-selinux
package on RHEL 8.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786110
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d758125290)
this task is a leftover and no longer needed.
It even causes bug when collocating nfs with mon.
Closes: #4609
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b63bd13073)
This commit removes some legacy tasks.
These tasks aren't needed, they cause the playbook to fail when
collocating daemons.
Closes: #4553
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 273413186a)
This commit moves this task in order to stop the nfs server service
regardless the deployment type desired (containerized or non
containerized).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6c6a512a72)
The syntax here wasn't working, this refact fixes this task.
Also, removing the `ignore_errors: true` which was hidding the failure.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47034effe0)
There is no need to get n * number of nodes the different keyrings.
Adding a `run_once: true` here avoid running a ceph command too many
times which could be impacting large cluster deployment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9bad239d77)
This commit isolates the systemd unit files generation for containers into
separate yml files in order to be able importing each corresponding roles
without playing all tasks.
This is needed so we can run ceph-ansible to render systemd unit files
so they call podman instead of docker.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bd64167469)
Depending on the infrastruture (w/o kerberos auth) then the SecType
value could be different.
Currently this value is hardcoded in the NFS Ganesha template. Instead
we can use a variable.
The default value is still the same to avoid breaking the backward
compatibility.
Closes: #4459
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ca77d7bd31)
Ganesha cannot be operated active/active, in those deployments
where it is managed by pacemaker the container name can be
different than the default.
This change uses "ceph_nfs_service_suffix" where previously
missing to ensure tasks will work with customized names.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1750005
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit d2a2bd7c42)
This commit makes it possible to parametrize the ceph directories modes.
So it changes hardocded mode for ceph related directories from 0755 to
customizable with `ceph_directories_mode` variable.
Closes: #2920
Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 011270ca69)
To address this warning:
```
[DEPRECATION WARNING]: evaluating nfs_ganesha_dev as a bare variable, this
behaviour will go away and you might need to add |bool to the expression in the
future
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2b9fb377a8)
This task is already present in pre_requisite_non_container.yml
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit edb8d42596)
The ansible_lsb fact is based on the lsb package (lsb-base,
lsb-release or redhat-lsb-core).
If the package isn't installed on the remote host then the fact isn't
populated.
--------
"ansible_lsb": {},
--------
Switching to the ansible_distribution_release fact instead.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dc187ea6fa)
We already set the become flag to true at a play level in the site*
playbooks so we don't need to set it at a task level.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7c3640177b)
The definitions of cephfs pools should match openstack pools.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
(cherry picked from commit 67071c3169)
When using podman, the systemd unit scripts don't have a dependency
on the network. So we're not sure that the network is up and running
when the containers are starting.
With docker this behaviour is already handled because the systemd
unit scripts depend on docker service which is started after the
network.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f49090df7e)
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```
Now appended ``| bool`` on a lot of the affected variables.
Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.
Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec)
This commits allows to deploy an internal ganesha with an external ceph
cluster.
This requires to define `external_cluster_mon_ips` with a comma
separated list of external monitors.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710358
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6a6785b719)
789cef7 introduces a regression in the ganesha configuration file
generation. The new config_template module version broke it.
But the ganesha.conf file isn't an ini file and doesn't really
need to use the config_template module. Instead we can use the
classic template module.
Resolves: #4045
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 616c484698)
Because ansible_distribution_version doesn't return minor version on
CentOS with ansible 2.8 we can apply the selinux anyway but only for
CentOS/RHEL 7.
Starting RHEL 8, there's a dedicated package for selinux called
nfs-ganesha-selinux [1].
Also replace the command module + semanage by the selinux_permissive
module.
[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0ee833432e)
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e74d80e72f)
If we do this in one line we get the error described in #3968fixes#3968
Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit c3b0ee30a1)
The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"
now it is
"when": [
"nfs_ganesha_stable",
"ceph_repository == 'community'"
]
Please backport to stable-4.0
Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit 29f2c953b4)
Keywords requiring only one item shouldn't express it by creating a
list with single item.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c80)
Conflicts:
roles/ceph-mon/tasks/ceph_keys.yml
roles/ceph-validate/tasks/check_devices.yml
Otherwise the reader is forced to search for "when" when blocks are too
long.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e0beaf123a)
Conflicts:
roles/ceph-config/tasks/main.yml
roles/ceph-container-common/tasks/pre_requisites/prerequisites.yml
roles/ceph-validate/tasks/check_devices.yml
This prevents the packaging from restarting services before we do need
to restart them in the rolling update sequence.
We want to handle services restart at rolling_update playbook.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We don't need to set After=docker.service when the container_binary
variable isn't set to docker.
It doesn't break anything currently but it could be confusing when
using podman.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
instead of using `RuntimeDirectory` parameter in systemd unit files,
let's use a systemd `tmpfiles.d` to ensure `/run/ceph`.
Explanation:
`podman` doesn't create the `/var/run/ceph` if it doesn't exist the time
where the container is run while `docker` used to create it.
In case of `switch_to_containers` scenario, `/run/ceph` gets created by
a tmpfiles.d systemd file; when switching to containers, the systemd
unit file complains because `/run/ceph` already exists
The better fix would be to ensure `/usr/lib/tmpfiles.d/ceph-common.conf`
is removed and only rely on `RuntimeDirectory` from systemd unit file parameter
but we come from a non-containerized environment which is already running,
it means `/run/ceph` is already created and when starting the unit to
start the container, systemd will still complain and we can't simply
remove the directory if daemons are collocated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
/var/run/ceph resides in a non persistent filesystem (tmpfs)
After a reboot, all daemons won't start because this directory will be
missing.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>