This ensures the `docker_exec_cmd` fact is set to the correct value
when daemons are collocated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We also now play ceph-config so that everything needed to bootstrap new
daemons is generated during an upgrade.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1497959
Signed-off-by: Sébastien Han <seb@redhat.com>
`generate_crt|bool|default(false)` won't apply the default value;
`generate_crt|default(false)|bool` will.
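As a minimal sketch of the corrected filter ordering, assuming a hypothetical
task guarded by `generate_crt` (the task name and command are illustrative and
not taken from the role):

```yaml
# Hypothetical task: only the `when` expression mirrors this commit.
- name: generate the certificate (illustrative)
  command: /bin/true
  # default(false) comes first so an undefined variable is replaced
  # before the value is cast with bool.
  when: generate_crt | default(false) | bool
```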
Signed-off-by: Sébastien Han <seb@redhat.com>
The task which sets `ceph_current_fsid` in `facts.yml` will fail in a
containerized deployment because `docker_exec_cmd` is not set yet.
This commit simply makes `facts.yml` play after `check_socket.yml` so
`docker_exec_cmd` is set properly.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit refactors the ceph-mds role.
The goal here is to create cephfs in `ceph-mon` for both containerized
and non-containerized cases so we don't need the admin keyring on mds
nodes anymore.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Using the systemd module allows us to do in one task what we did in three
tasks:
- enable unit file,
- issue a `daemon-reload`,
- start the service
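A minimal sketch of such a single task, assuming a hypothetical unit name
(`ceph-example.service` is a placeholder, not the unit touched by this commit):

```yaml
# One task covering the three steps listed above.
- name: enable, reload and start the service (unit name is hypothetical)
  systemd:
    name: ceph-example.service
    enabled: yes        # enable the unit file
    daemon_reload: yes  # issue a daemon-reload
    state: started      # start the service
```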
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Running the socket check on all the hosts overrides the default
value of docker_exec_cmd, leaving it with the last value (currently
rbd-mirror); as a result, the subsequent docker_exec_cmd usage for the
Signed-off-by: Sébastien Han <seb@redhat.com>
There is a bug in the rbd mirror unit file; the upstream fix is here:
https://github.com/ceph/ceph/pull/17969.
This should be reverted once the patch is merged and the backport is done.
Signed-off-by: Sébastien Han <seb@redhat.com>
This fixes the error:
```
The conditional check 'sestatus.stdout != 'Disabled'' failed.
```
that occurs when running on a non-RHEL-based system, since the
`sestatus` fact is registered only on RHEL-based distributions.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Specify the timeout flag to ceph-create-keys, which causes it to time out
if a monitor quorum isn't achieved. This overrides the default timeout
of 10 minutes, causing ceph-ansible to fail faster in the event of cluster
network issues.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
We have seen issues with leftover sockets. So now, if a socket is found,
we also check whether it's accessed by a process. If so, we can run the
handler; if not, we remove it and continue the playbook.
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
It's sad, but we cannot rely on the prepare container anymore since the
logs are flushed after reboot. So inspecting the container does not return
anything.
Now, instead, we use an ephemeral container to look up the
journal/block.db/block.wal (depending on whether filestore or bluestore is
used) and build the activate command accordingly.
Signed-off-by: Sébastien Han <seb@redhat.com>
We need to use `hostvars[host]['XXX']` to retrieve the monitor
interface and/or the radosgw interface.
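As an illustrative sketch of this lookup pattern, assuming a `mons` inventory
group and a `monitor_interface` variable defined per host (both names are
assumptions for the example, and the task itself is hypothetical):

```yaml
# Read a per-host variable through hostvars rather than from the
# host the task is currently running on.
- name: show the monitor interface of each monitor (illustrative)
  debug:
    msg: "{{ hostvars[item]['monitor_interface'] }}"
  with_items: "{{ groups['mons'] }}"
```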
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
- move the file fetch/push to the existing task
- rename the include
- generate the ganesha template from ansible
- re-arrange role structure
- re-use tasks for non-container and container
- configure keys for non-container and container
- fix rgw container key collection
Signed-off-by: Sébastien Han <seb@redhat.com>
The rbd key was not pushed to rbd nodes because its keyring path was not
added to `ceph_config_keys`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We generate the ceph.conf on all the nodes through the
ceph-docker-common so there is no need to push it to the Ansible file.
Also this is breaking the ceph.conf template generation since we only
generate sections based on the host the ansible task is running on.
For example, what typically happens: we bootstrap the monitor and get a
ceph.conf generated for a mon only; we then move on to an osd and generate
the ceph.conf with an osd section (done by ceph-docker-common), but this
gets overwritten by the copy_config task of the ceph-osd role.
Signed-off-by: Sébastien Han <seb@redhat.com>
- Change capitalization of config options to be
in line with what config.txt in the nfs-ganesha
tree says
Signed-off-by: Ali Maredia <amaredia@redhat.com>
RHCS install wasn't working at all prior to this commit as the name of
the include was pointing to a non-existing file.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492056
Signed-off-by: Sébastien Han <seb@redhat.com>
When the ceph-nfs service is managed by pacemaker, it's useful not to
enable and start the ceph-nfs service through systemd but to let
pacemaker start the service in a later step.
In analogy to ceph_nfs_rgw_user, we should be able to define a user
with which the nfs-ganesha Ceph FSAL connects to the cluster.
Introduce a ceph_nfs_ceph_user variable, setting its default to
"admin" (which preserves the prior behavior of always connecting as
client.admin).
Fixes #1910.
When Ansible is not run with verbose options, it's difficult to see which
include and/or set_fact does what. Adding a name to each one clarifies this.
Signed-off-by: Sébastien Han <seb@redhat.com>
The variable "statleftover" was removed by commit a60c74f61e and never
added back to the new playbook, yet it is still being referenced.
Adding it back.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492224
Signed-off-by: Sébastien Han <seb@redhat.com>
In a containerized environment, machines don't have any ceph binaries, so
we need to use a container to run the commands.
Signed-off-by: Sébastien Han <seb@redhat.com>
Use default delay since the mon (in particular) can take more time to
restart.
Solves error with:
STDERR:
Error response from daemon: No such container: ceph-mon-mon0
Signed-off-by: Sébastien Han <seb@redhat.com>
All keys are currently copied to all nodes.
This commit splits that task across the roles so keys are copied only to
their respective nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Less configuration for the user: containers inherit from the global
variables. No more container-specific variables.
Signed-off-by: Sébastien Han <seb@redhat.com>
In a collocated scenario, where you might put a rgw, a mds and a mon on
the same node, you don't want the handler to blindly restart all the daemons
on the node. Indeed, some of them might not be configured yet.
This implements a more precise socket detection for each daemon type.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488813
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this patch, the activation sequence for autodetection was
always skipped because we were asking to activate on a device without
partitions, which doesn't make sense.
We also fix the way we look up a device: since the data partition is
always numbered 1, we take the min element of the dict.
Closes: https://github.com/ceph/ceph-ansible/issues/1782
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>