After adopting a monitor we need to wait that monitor to join back
the quorum before moving to the next node.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Like rolling_update or switch2container playbooks, we need to set/unset
some osd flags before and after the OSD daemons adoption.
This also adds a task for waiting for clean pgs at then of an OSd node.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
At the end of the playbook we can show the orchestrator status like
we do with the ceph status in initial deployment.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
It's better to use the --placement parameter when using ceph orch apply
commands to avoid confusion in the parameters.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
cephadm uses default value for dashboard container images which need to
be customized by ansible for upstream or downstream purpose.
This feature wasn't present when cephadm-adopt.yml has been designed.
Also set the container_image_base variable for upgrade purpose.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
It looks like we can't run the ceph orch apply commands on nodes other
than monitors even if it used to work in the past.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The ceph config assimilate-conf command requires the client.admin
keyring which isn't present on all nodes most of the time.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't
available anymore.
Let's switch back to shaman since we have builds available now.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit adds a note about `stable-3.0` `stable-3.1` branches which
are deprecated and not maintained anymore.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
nfs-ganesha 3.3 is the latest 3.x release available for octopus so we
should update to this version.
https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus
This will also match the version used in RHCS 5.
Ceph container already uses that version too.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
By default, ansible gathers facts from facter and ohai if installed on
the remote nodes, given we don't need them, let's exclude these facts
from our facts gathering
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
When rgw and osd are collocated, the current workflow prevents from
scaling out the radosgw_num_instances parameter when rerunning the
playbook.
The environment file used in the rgw systemd template is rendered when
executing the `ceph-rgw` role but during a new run of the playbook (in
order to scale out rgw instances), handlers are triggered from `ceph-osd`
role which is run before `ceph-rgw`, therefore it tries to start the new
rgw daemon whereas its corresponding environment file hasn't been
rendered yet and fails like following:
```
ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory
```
This commit moves the tasks generating this file in `ceph-config` role
so it is generated early.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit enforces the pytest-rerunfailures installed so it's <9.0
This is to avoid the following error:
```
ERROR: pytest-rerunfailures 9.0 has requirement pytest>=5.0, but you'll have pytest 4.6.11 which is incompatible.
```
latest version of pytest-rerunfailures isn't compatible with the version
of pytest we are using.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
All EL8 dependencies are now present on EPEL 8 so we don't need the
additional repositories that were only a temporary solution.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
If a failure occurs in ceph-validate, the upgrade playbook keeps running
where we expect it to fail.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The commit adds a new playbook for converting an existing ceph cluster
deployed by ceph-ansible to the cephadm orchestrator.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We need to set the mgr dashboard server ip address before restarting the
dashboard module otherwise we can try to bind the dashboard module on an
already used address.
We already do this configuration for the dashboard port value and ssl
setup so we should do the same for server address too.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This is now a dependancy of ceph-mgr so will be installed automatically
and does not need a specific task.
This change means that ceph-mgr installs correctly on Ubuntu Focal where
the python3-routes package is necessary.
Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
This commit makes the playbook copying self-signed generated certificate
to monitors.
When mons and mgrs are deployed on dedicated nodes the playbook will
fail when trying to import certificate and key files since they are
generated on mgrs whereas we try to import them from a monitor.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This changes the way we are running the podman containers via systemd.
They are now in dettached mode and Type/PIDFile set.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
When using docker container engine then the systemd unit scripts only
use a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system then the ceph systemd units
should wait for the docker daemon to be fully restarted.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This commit makes the images pulling skipped if podman isn't installed
on the machine.
In OSP context, the podman installation is done later in the workflow,
it means all `podman pull` commands will fail.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1849559
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Explicitly install ansible-lint pytest pytest-cov via pip results of a
specific pytest version (4.3.1) which is not supported for pytest-cov
(2.10).
Because we are already defining a specific pytest version in the tests
requirements then we can install all the python dependencies from that
file and remove this from the pip install command.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
ansible 2.9.10 seems to have introduced a bug.
See https://github.com/ansible/ansible/issues/70168
This commit excludes this version from ceph-ansible requirements.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The "update apt cache" in the ceph-handler role was never called and the
handler trigger after adding the uca repository doesn't exist at all.
Instead of using a handler for that we can just set the update_cache
parameter to true like the other apt_repository tasks.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
We shouldn't set this flag when running switch_to_containers playbook.
Otherwise the playbook fails waiting for pgs to be clean.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843569
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>