Commit Graph

5387 Commits (ed6ae6815d665069c9b6f7d790390dceb0f037c9)
 

Author SHA1 Message Date
Dimitri Savineau 48baf63bc2 cephadm-adopt: wait for monitor in quorum
After adopting a monitor we need to wait that monitor to join back
the quorum before moving to the next node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0c3a2b72ff)
2020-07-13 10:17:56 -04:00
Dimitri Savineau 980d1a8365 cephadm-adopt: add osd flags during adoption
Like rolling_update or switch2container playbooks, we need to set/unset
some osd flags before and after the OSD daemons adoption.
This also adds a task for waiting for clean pgs at then of an OSd node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d3b3c8948e)
2020-07-13 10:17:56 -04:00
Dimitri Savineau f4a9f00f20 cephadm-adopt: add iscsi support
The iSCSI support has been added recently in cephadm.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9fe2694711)
2020-07-13 10:17:56 -04:00
Dimitri Savineau d8a8d74625 cephadm-adopt: remove the cephadm script
At the end of the process when don't need the cephadm script.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c3bbc6b13c)
2020-07-13 10:17:56 -04:00
Dimitri Savineau 90f974abb0 cephadm-adopt: show orchestrator status
At the end of the playbook we can show the orchestrator status like
we do with the ceph status in initial deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 381201a394)
2020-07-13 10:17:56 -04:00
Dimitri Savineau c5009101f1 cephadm-adopt: use placement parameter
It's better to use the --placement parameter when using ceph orch apply
commands to avoid confusion in the parameters.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 91a6c79e41)
2020-07-10 14:53:39 -04:00
Dimitri Savineau 3b9ff9ae26 cephadm-adopt: use custom dashboard images
cephadm uses default value for dashboard container images which need to
be customized by ansible for upstream or downstream purpose.
This feature wasn't present when cephadm-adopt.yml has been designed.
Also set the container_image_base variable for upgrade purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2d997396e)
2020-07-10 11:08:30 -04:00
Dimitri Savineau f4d62212c6 cephadm-adopt: run orch apply from monitors
It looks like we can't run the ceph orch apply commands on nodes other
than monitors even if it used to work in the past.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b38019e3ca)
2020-07-10 11:08:30 -04:00
Dimitri Savineau 9d6a33e114 cephadm-adopt: don't fail on systemd reset-failed
If the systemd service exists successfully then we don't need to reset
the failed state.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 27efcbc0e5)
2020-07-10 11:08:30 -04:00
Dimitri Savineau 0af87be5fc cephadm-adopt: copy client.admin keyring
The ceph config assimilate-conf command requires the client.admin
keyring which isn't present on all nodes most of the time.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fd36433826)
2020-07-10 11:08:30 -04:00
Dimitri Savineau da1dd2a538 tox: add cephadm_adopt scenario
This adds an optional cephadm_adopt scenario which is based on
all_daemons.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 14eed63921)
2020-07-10 11:08:30 -04:00
Guillaume Abrioux c0b8edfea2 rgw: set container memory limit to 4g
This commit changes the container memory limit for rgw daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86edae724f)
2020-07-10 10:10:32 -04:00
Guillaume Abrioux b0613bdd36 tests: add docker hub authentication in jobs
This commit makes all jobs authenticating to docker hub in order to
avoid the rate limit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 40307f810c)
2020-07-09 09:28:05 -04:00
Guillaume Abrioux 25e9abcdc7 ceph_volume: fix regression
do not skip zapping if osd_fsid is passed

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f402ab2b87)
2020-07-08 12:00:07 -04:00
Guillaume Abrioux 885a70374f doc: add stable-5.0 note in release section
This commit adds the missing stable-5.0 details about what it is
supported in this branch regarding ceph/ansible.

Fixes: #5519

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-07 15:09:52 +02:00
Dimitri Savineau 16c5c93411 ceph-nfs: change ganesha devel source
The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't
available anymore.
Let's switch back to shaman since we have builds available now.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1438ca0120)
2020-07-06 11:01:13 -04:00
Dimitri Savineau 3a2bda2b11 tests: remove nfs_ganesha_stable_branch variable
We don't need to override this variable in the group_vars but use the
default value instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fc599ed9f5)
2020-07-06 11:00:44 -04:00
Guillaume Abrioux 07f1ec287b tests: update nfs-ganesha to V3.3-stable
not really needed in master, commit intended to be backported in octopus
branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5b6f5486f7)
2020-07-06 10:37:35 -04:00
Guillaume Abrioux 64412353a7 doc: add a note about deprecated branches
This commit adds a note about `stable-3.0` `stable-3.1` branches which
are deprecated and not maintained anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bbe30bcc69)
2020-07-03 14:44:35 +02:00
Guillaume Abrioux 22efbc2f6d doc: add a note about containerized deployments
This commit updates the documentation to add a note about containerized
deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e61488507b)
2020-07-03 14:44:35 +02:00
Guillaume Abrioux 6ddf17f1ba doc: fix warning treated as an error
Typical error:

```
Warning, treated as error:
/home/jenkins-build/build/workspace/ceph-ansible-docs-pull-requests/docs/source/day-2/upgrade.rst:2:Title underline too short.
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5c254861bd)
2020-07-03 09:50:42 +02:00
Dimitri Savineau df3acbd974 ceph-defaults: update nfs-ganesha to 3.3
nfs-ganesha 3.3 is the latest 3.x release available for octopus so we
should update to this version.

https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus

This will also match the version used in RHCS 5.

Ceph container already uses that version too.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 93754bd70c)
2020-07-03 09:05:12 +02:00
Dimitri Savineau 101412de78 vagrant: update centos image to 8.2
CentOS 8.2 (2004) has been relesed so we should switch to this image
when using vagrant.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 72293b6614)
2020-07-03 06:37:58 +02:00
Dimitri Savineau 596f3fa161 radosgw: remove INST_PORT environment variable
This variable isn't consumed by the container so we can remove it.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1361e84a4e)
2020-07-03 06:37:34 +02:00
Guillaume Abrioux cdf61540d8 rgw: fix multi instances scaleout
When rgw and osd are collocated, the current workflow prevents from
scaling out the radosgw_num_instances parameter when rerunning the
playbook.

The environment file used in the rgw systemd template is rendered when
executing the `ceph-rgw` role but during a new run of the playbook (in
order to scale out rgw instances), handlers are triggered from `ceph-osd`
role which is run before `ceph-rgw`, therefore it tries to start the new
rgw daemon whereas its corresponding environment file hasn't been
rendered yet and fails like following:

```
ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory
```

This commit moves the tasks generating this file in `ceph-config` role
so it is generated early.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7dd68b9ac1)
2020-07-03 06:37:34 +02:00
Dimitri Savineau 503bc893fb facts: explicitly disable facter and ohai
By default, ansible gathers facts from facter and ohai if installed on
the remote nodes, given we don't need them, let's exclude these facts
from our facts gathering

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c95adc564b)
2020-07-03 06:37:08 +02:00
Dimitri Savineau 48fd6e6b16 ceph-common: remove copr and sepia repositories
All EL8 dependencies are now present on EPEL 8 so we don't need the
additional repositories that were only a temporary solution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3592ba1d61)
2020-06-30 17:08:53 +02:00
Jan Fajerski 7410c6eebe ceph-volume.py: add support for batch refactored code
See https://github.com/ceph/ceph/pull/34740 for the batch changes.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit d90834b77f)
2020-06-30 13:57:20 +02:00
Guillaume Abrioux 688d5eebf7 rolling_update: add any_errors_fatal
If a failure occurs in ceph-validate, the upgrade playbook keeps running
where we expect it to fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f9cdf4b10)
2020-06-29 17:13:03 -04:00
Dimitri Savineau e5eba9555b dashboard: configure mgr backend before restart
We need to set the mgr dashboard server ip address before restarting the
dashboard module otherwise we can try to bind the dashboard module on an
already used address.
We already do this configuration for the dashboard port value and ssl
setup so we should do the same for server address too.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 03cd75845f)
2020-06-29 12:34:26 -04:00
George Shuklin cfc808804f Add container settings for Ubuntu 20 (the same as Ubuntu 18)
Signed-off-by: George Shuklin <george.shuklin@gmail.com>
(cherry picked from commit 3e87f53875)
2020-06-29 12:33:49 -04:00
Dimitri Savineau c3e89983fc Add playbook for converting cluster to cephadm
The commit adds a new playbook for converting an existing ceph cluster
deployed by ceph-ansible to the cephadm orchestrator.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 548ff26256)
2020-06-29 09:45:22 -04:00
Jan Fajerski 5505b713f2 lvm_setup: lookup device from inventory, default to /dev/sd* names
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 1fe8e819f9)
2020-06-27 09:31:47 -04:00
Jonathan Rosser 28ee26ae58 Ansible tests are not filters
The use of "| success" and "| changed" are not valid syntax for modern
ansible releases.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
(cherry picked from commit 42884e8175)
2020-06-26 13:36:42 -04:00
Jonathan Rosser 77002c12c8 Install python routes package as a dependancy rather than directly
This is now a dependancy of ceph-mgr so will be installed automatically
and does not need a specific task.

This change means that ceph-mgr installs correctly on Ubuntu Focal where
the python3-routes package is necessary.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
(cherry picked from commit 92288c11c5)
2020-06-26 13:36:42 -04:00
Dimitri Savineau 7eddd89afa podman: Add Type and PIDFile value to unit files
This changes the way we are running the podman containers via systemd.
They are now in dettached mode and Type/PIDFile set.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d43769dc2a)
2020-06-23 17:35:24 +02:00
Dimitri Savineau 51cfb89501 ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 829990e60d)
2020-06-23 17:35:24 +02:00
Guillaume Abrioux e1c8a0daf6 dashboard: copy self-signed generated crt to mons
This commit makes the playbook copying self-signed generated certificate
to monitors.
When mons and mgrs are deployed on dedicated nodes the playbook will
fail when trying to import certificate and key files since they are
generated on mgrs whereas we try to import them from a monitor.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b7539eb275)
2020-06-23 15:43:26 +02:00
Guillaume Abrioux e4f972004b ceph_volume: make zap function idempotent
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3f47236470)
2020-06-23 09:50:01 +02:00
Dimitri Savineau 5428a41fcf docker: Add Requires on docker service
When using docker container engine then the systemd unit scripts only
use a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system then the ceph systemd units
should wait for the docker daemon to be fully restarted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bd22f1d1ec)
2020-06-22 17:30:28 -04:00
Guillaume Abrioux a7fc4af06e docker2podman: make images pulling optional
This commit makes the images pulling skipped if podman isn't installed
on the machine.

In OSP context, the podman installation is done later in the workflow,
it means all `podman pull` commands will fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1849559

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 37b20b6525)
2020-06-22 14:19:44 -04:00
Guillaume Abrioux 06cfbb10d4 requirements: exclude ansible 2.9.10
ansible 2.9.10 seems to have introduced a bug.

See https://github.com/ansible/ansible/issues/70168

This commit excludes this version from ceph-ansible requirements.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1525990f39)
2020-06-22 13:05:33 -04:00
Dimitri Savineau 1d93b166fb travis: use tests/requirements.txt
Explicitly install ansible-lint pytest pytest-cov via pip results of a
specific pytest version (4.3.1) which is not supported for pytest-cov
(2.10).
Because we are already defining a specific pytest version in the tests
requirements then we can install all the python dependencies from that
file and remove this from the pip install command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 296aa13b3c)
2020-06-19 18:18:44 -04:00
Dimitri Savineau bc1cc666e4 docs: Add upgrade operation.
This commit adds a chapter about the ceph upgrade process.

Closes: #5393

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e41487dbce)
2020-06-18 17:58:44 +02:00
Dimitri Savineau 4a8c4446a4 update the release note.
This updates the release note for ceph_pool module.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-18 17:56:15 +02:00
Dimitri Savineau b7f38ab1a3 library/ceph_pool: set name parameter as required
The name parameter is required.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d67759611e)
2020-06-17 12:00:59 -04:00
Guillaume Abrioux 4fe8e12484 switch_to_containers: don't set noup flag
We shouldn't set this flag when running switch_to_containers playbook.
Otherwise the playbook fails waiting for pgs to be clean.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843569

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b91d60d384)
2020-06-17 09:24:02 -04:00
Dimitri Savineau c6e60db2fb container: inspect Id field instead of RepoDigests
When a container image managed by podman isn't tag anymore then the
RepoDigests field when inspecting the image doesn't return any value.
This is different from docker workflow and it breaks the ceph-ansible
container upgrade when collocated multiple services and using a non
fix container tag (like latest or 4).

$ podman images
REPOSITORY              TAG      IMAGE ID       CREATED        SIZE
docker.io/ceph/daemon   latest   680c9c0d38c3   8 days ago     957 MB
<none>                  <none>   011ee108bfc9   2 months ago   1.01 GB

$ podman inspect 680c9c0d38c3 | jq .[0].RepoDigests[0]
"docker.io/ceph/daemon@sha256:20cf789235e23ddaf38e109b391d1496bb88011239d16862c4c106d0e05fea9e"
$ podman inspect 011ee108bfc9 | jq .[0].RepoDigests[0]
null

Because this field returns "null" then the ansible task trying to
determine this value is failing

-----------------------------
fatal: [foo]: FAILED! =>
  msg: |-
    The task includes an option with an undefined variable. The error
    was: None has no element 0

    The error appears to be in
    'roles/ceph-container-common/tasks/fetch_image.yml': line 137,
    column 3, but may be elsewhere in the file depending on the exact
    syntax problem.

    The offending line appears to be:

    - name: set_fact ceph_osd_image_repodigest_before_pulling
      ^ here
-----------------------------

We don't have this behaviour with docker.

$ docker images
REPOSITORY              TAG      IMAGE ID       CREATED        SIZE
docker.io/ceph/daemon   latest   680c9c0d38c3   8 days ago     928 MB
docker.io/ceph/daemon   <none>   011ee108bfc9   2 months ago   986 MB

$ docker inspect 680c9c0d38c3 | jq .[0].RepoDigests[0]
"docker.io/ceph/daemon@sha256:45e6f28bb67c81b826acb64fad5c0da1cac3dffb41a88992fe4ca2be79575fa6"
$ docker inspect 011ee108bfc9 | jq .[0].RepoDigests[0]
"docker.io/ceph/daemon@sha256:b393a73309d72e43ca7d65cd3519036007947671e373eb59aa75a46185c52231"

Instead we should just get the Id field.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1844496

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>

(cherry picked from commit cdb30bd125)
2020-06-16 13:12:26 -04:00
Dimitri Savineau b219b1abed switch_to_container: fix osd systemd regex
The systemd LOAD and ACTIVE fileds could have more than one space between
both values.
This update the systemd regex the same way we're using it in different
part of the code.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843500

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 50140c9b5d)
2020-06-16 18:10:28 +02:00
Ali Maredia 5b76ba12f7 rgw multisite: add master zone endpoints to zonegroup
We were only adding the endpoints to the master zone but not to the
zonegroup.
This patch fixes the issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1839228

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 0175c205fa)
2020-06-09 12:29:56 -04:00