Commit Graph

5497 Commits (df98746378e50cbc059bb60da2fada4de91cfbf0)
 

Author SHA1 Message Date
Guillaume Abrioux a4efb521f5 tests: (lvm_setup.yml), don't shrink lvol
when rerunning lvm_setup.yml on existing cluster with OSDs already
deployed, it fails like following:

```
fatal: [osd0]: FAILED! => changed=false
  msg: Sorry, no shrinking of data-lv2 to 0 permitted.
```

because we are asking `lvol` module to create a volume on an empty VG
with size extents = `100%FREE`.

The default behavior of `lvol` is to shrink the volume if the LV's current
size is greater than the requested size.

Given the requested size is calculated like this:

`size_requested = size_percent * this_vg['free'] / 100`

in our case, it is similar to:

`size_requested = 100 * 0 / 100` which basically means `0`

So the current LV size is well greater than the requested size which
leads the module to attempt to shrink it to 0 which isn't obviously now
allowed.

Adding `shrink: false` to the module calls fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 218f4ae361)
2020-07-22 18:46:49 -04:00
Guillaume Abrioux bd12158a1c facts: fix broken facts when using --limit
This commit fixes these tasks when --limit is used.

It makes sure the fact is set on right nodes even when the playbook is
run with `--limit`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8a951f50c)
2020-07-20 22:49:43 -04:00
Dimitri Savineau b11eeed833 ceph-dashboard: copy TLS cert/key on monitor
The ceph-dashboard role is executed on the mgr nodes so the TLS cert/key
files are copied to those nodes.
But we are running importing the cert/key files into the ceph
configuration on the monitor.

Closes: #5557

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2b8ebf1457)
2020-07-20 22:49:20 -04:00
Dimitri Savineau 0178114f3b cephadm: set the command as a fact
Set the cephadm cmd as a fact instead of rewriting the same command
over and over.
This also fix an issue when using docker as container engine because
the --docker cephadm parameter should be use before the subcommand
not after.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5ef965c4dc)
2020-07-20 22:48:07 -04:00
Dimitri Savineau b7fd3bc844 cephadm: add playbook
This adds a new playbook for deploying ceph via cephadm.

This also adds a new dedicated tox file for CI purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 957903d561)
2020-07-16 12:00:14 -04:00
Dimitri Savineau a22855319b cephadm-adopt: delegate task for orch apply
This is a partial revert of b38019e because we don't want to execute
the whole play on the monitor otherwise if we have some empty group
like rgws or mdss then the orchestrator commands will still be
executed.
Instead we should keep the real target group name at play level and
delegate the orchestator commands to the monitor. The whole play
will be skipped is the group is empty.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9596494911)
2020-07-16 10:50:53 -04:00
Dimitri Savineau 585b3e476c cephadm-adopt: inform users about cephadm
Print a message at the end of the playbook to inform users that they
don't have to user ceph-ansible playbooks anymore as everything else
need to be done via cephadm (day 2 operation).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 75ae1b7e90)
2020-07-15 17:57:41 -04:00
Dimitri Savineau 4e4748b58d cephadm-adopt: refresh the service/daemon list
When reporting the orchestrator service/daemon list at the end of the
playbook, we can use the --refresh option otherwise we could have
an outdated output.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 7164426456)
2020-07-15 17:57:41 -04:00
Dimitri Savineau bc2aebaa26 Revert "cephadm-adopt: remove the cephadm script"
This reverts commit c3bbc6b13c.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ceac81cd24)
2020-07-15 17:57:41 -04:00
Guillaume Abrioux 636fd26848 ceph_key: fix bug in 'info' feature
Fix 'info' feature from ceph_key.py module

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9417ecf0c5)
2020-07-15 13:19:49 -04:00
Dimitri Savineau 48baf63bc2 cephadm-adopt: wait for monitor in quorum
After adopting a monitor we need to wait that monitor to join back
the quorum before moving to the next node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0c3a2b72ff)
2020-07-13 10:17:56 -04:00
Dimitri Savineau 980d1a8365 cephadm-adopt: add osd flags during adoption
Like rolling_update or switch2container playbooks, we need to set/unset
some osd flags before and after the OSD daemons adoption.
This also adds a task for waiting for clean pgs at then of an OSd node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d3b3c8948e)
2020-07-13 10:17:56 -04:00
Dimitri Savineau f4a9f00f20 cephadm-adopt: add iscsi support
The iSCSI support has been added recently in cephadm.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9fe2694711)
2020-07-13 10:17:56 -04:00
Dimitri Savineau d8a8d74625 cephadm-adopt: remove the cephadm script
At the end of the process when don't need the cephadm script.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c3bbc6b13c)
2020-07-13 10:17:56 -04:00
Dimitri Savineau 90f974abb0 cephadm-adopt: show orchestrator status
At the end of the playbook we can show the orchestrator status like
we do with the ceph status in initial deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 381201a394)
2020-07-13 10:17:56 -04:00
Dimitri Savineau c5009101f1 cephadm-adopt: use placement parameter
It's better to use the --placement parameter when using ceph orch apply
commands to avoid confusion in the parameters.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 91a6c79e41)
2020-07-10 14:53:39 -04:00
Dimitri Savineau 3b9ff9ae26 cephadm-adopt: use custom dashboard images
cephadm uses default value for dashboard container images which need to
be customized by ansible for upstream or downstream purpose.
This feature wasn't present when cephadm-adopt.yml has been designed.
Also set the container_image_base variable for upgrade purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2d997396e)
2020-07-10 11:08:30 -04:00
Dimitri Savineau f4d62212c6 cephadm-adopt: run orch apply from monitors
It looks like we can't run the ceph orch apply commands on nodes other
than monitors even if it used to work in the past.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b38019e3ca)
2020-07-10 11:08:30 -04:00
Dimitri Savineau 9d6a33e114 cephadm-adopt: don't fail on systemd reset-failed
If the systemd service exists successfully then we don't need to reset
the failed state.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 27efcbc0e5)
2020-07-10 11:08:30 -04:00
Dimitri Savineau 0af87be5fc cephadm-adopt: copy client.admin keyring
The ceph config assimilate-conf command requires the client.admin
keyring which isn't present on all nodes most of the time.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fd36433826)
2020-07-10 11:08:30 -04:00
Dimitri Savineau da1dd2a538 tox: add cephadm_adopt scenario
This adds an optional cephadm_adopt scenario which is based on
all_daemons.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 14eed63921)
2020-07-10 11:08:30 -04:00
Guillaume Abrioux c0b8edfea2 rgw: set container memory limit to 4g
This commit changes the container memory limit for rgw daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86edae724f)
2020-07-10 10:10:32 -04:00
Guillaume Abrioux b0613bdd36 tests: add docker hub authentication in jobs
This commit makes all jobs authenticating to docker hub in order to
avoid the rate limit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 40307f810c)
2020-07-09 09:28:05 -04:00
Guillaume Abrioux 25e9abcdc7 ceph_volume: fix regression
do not skip zapping if osd_fsid is passed

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f402ab2b87)
2020-07-08 12:00:07 -04:00
Guillaume Abrioux 885a70374f doc: add stable-5.0 note in release section
This commit adds the missing stable-5.0 details about what it is
supported in this branch regarding ceph/ansible.

Fixes: #5519

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-07 15:09:52 +02:00
Dimitri Savineau 16c5c93411 ceph-nfs: change ganesha devel source
The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't
available anymore.
Let's switch back to shaman since we have builds available now.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1438ca0120)
2020-07-06 11:01:13 -04:00
Dimitri Savineau 3a2bda2b11 tests: remove nfs_ganesha_stable_branch variable
We don't need to override this variable in the group_vars but use the
default value instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fc599ed9f5)
2020-07-06 11:00:44 -04:00
Guillaume Abrioux 07f1ec287b tests: update nfs-ganesha to V3.3-stable
not really needed in master, commit intended to be backported in octopus
branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5b6f5486f7)
2020-07-06 10:37:35 -04:00
Guillaume Abrioux 64412353a7 doc: add a note about deprecated branches
This commit adds a note about `stable-3.0` `stable-3.1` branches which
are deprecated and not maintained anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bbe30bcc69)
2020-07-03 14:44:35 +02:00
Guillaume Abrioux 22efbc2f6d doc: add a note about containerized deployments
This commit updates the documentation to add a note about containerized
deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e61488507b)
2020-07-03 14:44:35 +02:00
Guillaume Abrioux 6ddf17f1ba doc: fix warning treated as an error
Typical error:

```
Warning, treated as error:
/home/jenkins-build/build/workspace/ceph-ansible-docs-pull-requests/docs/source/day-2/upgrade.rst:2:Title underline too short.
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5c254861bd)
2020-07-03 09:50:42 +02:00
Dimitri Savineau df3acbd974 ceph-defaults: update nfs-ganesha to 3.3
nfs-ganesha 3.3 is the latest 3.x release available for octopus so we
should update to this version.

https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus

This will also match the version used in RHCS 5.

Ceph container already uses that version too.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 93754bd70c)
2020-07-03 09:05:12 +02:00
Dimitri Savineau 101412de78 vagrant: update centos image to 8.2
CentOS 8.2 (2004) has been relesed so we should switch to this image
when using vagrant.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 72293b6614)
2020-07-03 06:37:58 +02:00
Dimitri Savineau 596f3fa161 radosgw: remove INST_PORT environment variable
This variable isn't consumed by the container so we can remove it.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1361e84a4e)
2020-07-03 06:37:34 +02:00
Guillaume Abrioux cdf61540d8 rgw: fix multi instances scaleout
When rgw and osd are collocated, the current workflow prevents from
scaling out the radosgw_num_instances parameter when rerunning the
playbook.

The environment file used in the rgw systemd template is rendered when
executing the `ceph-rgw` role but during a new run of the playbook (in
order to scale out rgw instances), handlers are triggered from `ceph-osd`
role which is run before `ceph-rgw`, therefore it tries to start the new
rgw daemon whereas its corresponding environment file hasn't been
rendered yet and fails like following:

```
ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory
```

This commit moves the tasks generating this file in `ceph-config` role
so it is generated early.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7dd68b9ac1)
2020-07-03 06:37:34 +02:00
Dimitri Savineau 503bc893fb facts: explicitly disable facter and ohai
By default, ansible gathers facts from facter and ohai if installed on
the remote nodes, given we don't need them, let's exclude these facts
from our facts gathering

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c95adc564b)
2020-07-03 06:37:08 +02:00
Dimitri Savineau 48fd6e6b16 ceph-common: remove copr and sepia repositories
All EL8 dependencies are now present on EPEL 8 so we don't need the
additional repositories that were only a temporary solution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3592ba1d61)
2020-06-30 17:08:53 +02:00
Jan Fajerski 7410c6eebe ceph-volume.py: add support for batch refactored code
See https://github.com/ceph/ceph/pull/34740 for the batch changes.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit d90834b77f)
2020-06-30 13:57:20 +02:00
Guillaume Abrioux 688d5eebf7 rolling_update: add any_errors_fatal
If a failure occurs in ceph-validate, the upgrade playbook keeps running
where we expect it to fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8f9cdf4b10)
2020-06-29 17:13:03 -04:00
Dimitri Savineau e5eba9555b dashboard: configure mgr backend before restart
We need to set the mgr dashboard server ip address before restarting the
dashboard module otherwise we can try to bind the dashboard module on an
already used address.
We already do this configuration for the dashboard port value and ssl
setup so we should do the same for server address too.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 03cd75845f)
2020-06-29 12:34:26 -04:00
George Shuklin cfc808804f Add container settings for Ubuntu 20 (the same as Ubuntu 18)
Signed-off-by: George Shuklin <george.shuklin@gmail.com>
(cherry picked from commit 3e87f53875)
2020-06-29 12:33:49 -04:00
Dimitri Savineau c3e89983fc Add playbook for converting cluster to cephadm
The commit adds a new playbook for converting an existing ceph cluster
deployed by ceph-ansible to the cephadm orchestrator.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 548ff26256)
2020-06-29 09:45:22 -04:00
Jan Fajerski 5505b713f2 lvm_setup: lookup device from inventory, default to /dev/sd* names
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 1fe8e819f9)
2020-06-27 09:31:47 -04:00
Jonathan Rosser 28ee26ae58 Ansible tests are not filters
The use of "| success" and "| changed" are not valid syntax for modern
ansible releases.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
(cherry picked from commit 42884e8175)
2020-06-26 13:36:42 -04:00
Jonathan Rosser 77002c12c8 Install python routes package as a dependancy rather than directly
This is now a dependancy of ceph-mgr so will be installed automatically
and does not need a specific task.

This change means that ceph-mgr installs correctly on Ubuntu Focal where
the python3-routes package is necessary.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
(cherry picked from commit 92288c11c5)
2020-06-26 13:36:42 -04:00
Dimitri Savineau 7eddd89afa podman: Add Type and PIDFile value to unit files
This changes the way we are running the podman containers via systemd.
They are now in dettached mode and Type/PIDFile set.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d43769dc2a)
2020-06-23 17:35:24 +02:00
Dimitri Savineau 51cfb89501 ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 829990e60d)
2020-06-23 17:35:24 +02:00
Guillaume Abrioux e1c8a0daf6 dashboard: copy self-signed generated crt to mons
This commit makes the playbook copying self-signed generated certificate
to monitors.
When mons and mgrs are deployed on dedicated nodes the playbook will
fail when trying to import certificate and key files since they are
generated on mgrs whereas we try to import them from a monitor.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b7539eb275)
2020-06-23 15:43:26 +02:00
Guillaume Abrioux e4f972004b ceph_volume: make zap function idempotent
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3f47236470)
2020-06-23 09:50:01 +02:00
Dimitri Savineau 5428a41fcf docker: Add Requires on docker service
When using docker container engine then the systemd unit scripts only
use a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system then the ceph systemd units
should wait for the docker daemon to be fully restarted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bd22f1d1ec)
2020-06-22 17:30:28 -04:00