Commit Graph

5499 Commits (d7fd46842d5e723af86daac94ed83942344e659f)
 

Author SHA1 Message Date
Dimitri Savineau 957903d561 cephadm: add playbook
This adds a new playbook for deploying ceph via cephadm.

This also adds a new dedicated tox file for CI purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-16 11:40:45 -04:00
Dimitri Savineau 9596494911 cephadm-adopt: delegate task for orch apply
This is a partial revert of b38019e because we don't want to execute
the whole play on the monitor otherwise if we have some empty group
like rgws or mdss then the orchestrator commands will still be
executed.
Instead we should keep the real target group name at play level and
delegate the orchestator commands to the monitor. The whole play
will be skipped is the group is empty.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-16 09:44:33 -04:00
Dimitri Savineau 75ae1b7e90 cephadm-adopt: inform users about cephadm
Print a message at the end of the playbook to inform users that they
don't have to user ceph-ansible playbooks anymore as everything else
need to be done via cephadm (day 2 operation).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-15 17:04:59 -04:00
Dimitri Savineau 7164426456 cephadm-adopt: refresh the service/daemon list
When reporting the orchestrator service/daemon list at the end of the
playbook, we can use the --refresh option otherwise we could have
an outdated output.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-15 17:04:59 -04:00
Dimitri Savineau ceac81cd24 Revert "cephadm-adopt: remove the cephadm script"
This reverts commit c3bbc6b13c.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-15 17:04:59 -04:00
Guillaume Abrioux 9417ecf0c5 ceph_key: fix bug in 'info' feature
Fix 'info' feature from ceph_key.py module

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-15 17:06:17 +02:00
Dimitri Savineau 0c3a2b72ff cephadm-adopt: wait for monitor in quorum
After adopting a monitor we need to wait that monitor to join back
the quorum before moving to the next node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-13 09:16:11 -04:00
Dimitri Savineau d3b3c8948e cephadm-adopt: add osd flags during adoption
Like rolling_update or switch2container playbooks, we need to set/unset
some osd flags before and after the OSD daemons adoption.
This also adds a task for waiting for clean pgs at then of an OSd node.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-13 09:16:11 -04:00
Dimitri Savineau 9fe2694711 cephadm-adopt: add iscsi support
The iSCSI support has been added recently in cephadm.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-13 09:16:11 -04:00
Dimitri Savineau c3bbc6b13c cephadm-adopt: remove the cephadm script
At the end of the process when don't need the cephadm script.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-13 09:16:11 -04:00
Dimitri Savineau 381201a394 cephadm-adopt: show orchestrator status
At the end of the playbook we can show the orchestrator status like
we do with the ceph status in initial deployment.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-13 09:16:11 -04:00
Dimitri Savineau 91a6c79e41 cephadm-adopt: use placement parameter
It's better to use the --placement parameter when using ceph orch apply
commands to avoid confusion in the parameters.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 14:05:15 -04:00
Dimitri Savineau f2d997396e cephadm-adopt: use custom dashboard images
cephadm uses default value for dashboard container images which need to
be customized by ansible for upstream or downstream purpose.
This feature wasn't present when cephadm-adopt.yml has been designed.
Also set the container_image_base variable for upgrade purpose.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 16:00:24 +02:00
Dimitri Savineau b38019e3ca cephadm-adopt: run orch apply from monitors
It looks like we can't run the ceph orch apply commands on nodes other
than monitors even if it used to work in the past.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 16:00:24 +02:00
Dimitri Savineau 27efcbc0e5 cephadm-adopt: don't fail on systemd reset-failed
If the systemd service exists successfully then we don't need to reset
the failed state.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 16:00:24 +02:00
Dimitri Savineau fd36433826 cephadm-adopt: copy client.admin keyring
The ceph config assimilate-conf command requires the client.admin
keyring which isn't present on all nodes most of the time.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 16:00:24 +02:00
Dimitri Savineau 14eed63921 tox: add cephadm_adopt scenario
This adds an optional cephadm_adopt scenario which is based on
all_daemons.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-10 16:00:24 +02:00
Guillaume Abrioux 0f3bae09bd play: followup on cc0d969
Remove two other pattern 'iscsigws' in main playbook that have been
missed in cc0d9697c5

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-09 10:01:05 -04:00
Guillaume Abrioux 86edae724f rgw: set container memory limit to 4g
This commit changes the container memory limit for rgw daemons.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-09 15:31:10 +02:00
Guillaume Abrioux bcc673f66c facts: refact `ceph_uid` fact
There's no need to set this fact with a `set_fact`
We can achieve this in `ceph-defaults`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-09 13:37:29 +02:00
Guillaume Abrioux f402ab2b87 ceph_volume: fix regression
do not skip zapping if osd_fsid is passed

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-08 09:52:53 -04:00
Guillaume Abrioux 40307f810c tests: add docker hub authentication in jobs
This commit makes all jobs authenticating to docker hub in order to
avoid the rate limit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-08 09:52:53 -04:00
Guillaume Abrioux cc0d9697c5 play: remove backward compatibility group name
It's time to remove this old group name.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-08 09:21:19 -04:00
Dimitri Savineau 1438ca0120 ceph-nfs: change ganesha devel source
The download.nfs-ganesha.org source for nfs-ganesha on CentOS isn't
available anymore.
Let's switch back to shaman since we have builds available now.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-06 16:59:25 +02:00
Dimitri Savineau fc599ed9f5 tests: remove nfs_ganesha_stable_branch variable
We don't need to override this variable in the group_vars but use the
default value instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-06 16:58:59 +02:00
Guillaume Abrioux 5b6f5486f7 tests: update nfs-ganesha to V3.3-stable
not really needed in master, commit intended to be backported in octopus
branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-05 17:10:40 +02:00
Guillaume Abrioux bbe30bcc69 doc: add a note about deprecated branches
This commit adds a note about `stable-3.0` `stable-3.1` branches which
are deprecated and not maintained anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-03 09:27:12 +02:00
Guillaume Abrioux e61488507b doc: add a note about containerized deployments
This commit updates the documentation to add a note about containerized
deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-03 09:27:12 +02:00
Guillaume Abrioux 5c254861bd doc: fix warning treated as an error
Typical error:

```
Warning, treated as error:
/home/jenkins-build/build/workspace/ceph-ansible-docs-pull-requests/docs/source/day-2/upgrade.rst:2:Title underline too short.
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-03 09:22:42 +02:00
Dimitri Savineau 93754bd70c ceph-defaults: update nfs-ganesha to 3.3
nfs-ganesha 3.3 is the latest 3.x release available for octopus so we
should update to this version.

https://download.ceph.com/nfs-ganesha/rpm-V3.3-stable/octopus

This will also match the version used in RHCS 5.

Ceph container already uses that version too.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-03 06:36:49 +02:00
Dimitri Savineau c95adc564b facts: explicitly disable facter and ohai
By default, ansible gathers facts from facter and ohai if installed on
the remote nodes, given we don't need them, let's exclude these facts
from our facts gathering

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-02 17:46:12 +02:00
Dimitri Savineau 1361e84a4e radosgw: remove INST_PORT environment variable
This variable isn't consumed by the container so we can remove it.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-07-02 16:52:29 +02:00
Guillaume Abrioux 7dd68b9ac1 rgw: fix multi instances scaleout
When rgw and osd are collocated, the current workflow prevents from
scaling out the radosgw_num_instances parameter when rerunning the
playbook.

The environment file used in the rgw systemd template is rendered when
executing the `ceph-rgw` role but during a new run of the playbook (in
order to scale out rgw instances), handlers are triggered from `ceph-osd`
role which is run before `ceph-rgw`, therefore it tries to start the new
rgw daemon whereas its corresponding environment file hasn't been
rendered yet and fails like following:

```
ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory
```

This commit moves the tasks generating this file in `ceph-config` role
so it is generated early.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-02 10:39:50 -04:00
Guillaume Abrioux 19097026fb tests: enforce pytest-rerunfailures version
This commit enforces the pytest-rerunfailures installed so it's <9.0

This is to avoid the following error:

```
ERROR: pytest-rerunfailures 9.0 has requirement pytest>=5.0, but you'll have pytest 4.6.11 which is incompatible.
```

latest version of pytest-rerunfailures isn't compatible with the version
of pytest we are using.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-07-02 15:57:39 +02:00
Dimitri Savineau 72293b6614 vagrant: update centos image to 8.2
CentOS 8.2 (2004) has been relesed so we should switch to this image
when using vagrant.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-30 13:56:55 +02:00
Jan Fajerski d90834b77f ceph-volume.py: add support for batch refactored code
See https://github.com/ceph/ceph/pull/34740 for the batch changes.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2020-06-30 09:46:27 +02:00
Dimitri Savineau 3592ba1d61 ceph-common: remove copr and sepia repositories
All EL8 dependencies are now present on EPEL 8 so we don't need the
additional repositories that were only a temporary solution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-30 08:35:19 +02:00
Guillaume Abrioux 8f9cdf4b10 rolling_update: add any_errors_fatal
If a failure occurs in ceph-validate, the upgrade playbook keeps running
where we expect it to fail.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-06-29 12:58:53 -04:00
George Shuklin 3e87f53875 Add container settings for Ubuntu 20 (the same as Ubuntu 18)
Signed-off-by: George Shuklin <george.shuklin@gmail.com>
2020-06-29 12:18:58 -04:00
Dimitri Savineau 548ff26256 Add playbook for converting cluster to cephadm
The commit adds a new playbook for converting an existing ceph cluster
deployed by ceph-ansible to the cephadm orchestrator.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-29 09:21:38 -04:00
Dimitri Savineau 03cd75845f dashboard: configure mgr backend before restart
We need to set the mgr dashboard server ip address before restarting the
dashboard module otherwise we can try to bind the dashboard module on an
already used address.
We already do this configuration for the dashboard port value and ssl
setup so we should do the same for server address too.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851455

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-29 14:59:01 +02:00
Jonathan Rosser 42884e8175 Ansible tests are not filters
The use of "| success" and "| changed" are not valid syntax for modern
ansible releases.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
2020-06-26 12:26:25 -04:00
Jonathan Rosser 92288c11c5 Install python routes package as a dependancy rather than directly
This is now a dependancy of ceph-mgr so will be installed automatically
and does not need a specific task.

This change means that ceph-mgr installs correctly on Ubuntu Focal where
the python3-routes package is necessary.

Signed-off-by: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
2020-06-26 12:26:25 -04:00
Guillaume Abrioux b7539eb275 dashboard: copy self-signed generated crt to mons
This commit makes the playbook copying self-signed generated certificate
to monitors.
When mons and mgrs are deployed on dedicated nodes the playbook will
fail when trying to import certificate and key files since they are
generated on mgrs whereas we try to import them from a monitor.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846995

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-06-23 09:37:21 -04:00
Dimitri Savineau d43769dc2a podman: Add Type and PIDFile value to unit files
This changes the way we are running the podman containers via systemd.
They are now in dettached mode and Type/PIDFile set.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-23 09:37:50 +02:00
Guillaume Abrioux 3f47236470 ceph_volume: make zap function idempotent
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-06-22 22:16:29 -04:00
Dimitri Savineau bd22f1d1ec docker: Add Requires on docker service
When using docker container engine then the systemd unit scripts only
use a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system then the ceph systemd units
should wait for the docker daemon to be fully restarted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-22 23:08:50 +02:00
Guillaume Abrioux 37b20b6525 docker2podman: make images pulling optional
This commit makes the images pulling skipped if podman isn't installed
on the machine.

In OSP context, the podman installation is done later in the workflow,
it means all `podman pull` commands will fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1849559

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-06-22 12:19:38 -04:00
Dimitri Savineau 296aa13b3c travis: use tests/requirements.txt
Explicitly install ansible-lint pytest pytest-cov via pip results of a
specific pytest version (4.3.1) which is not supported for pytest-cov
(2.10).
Because we are already defining a specific pytest version in the tests
requirements then we can install all the python dependencies from that
file and remove this from the pip install command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-06-19 18:10:55 -04:00
Guillaume Abrioux 1525990f39 requirements: exclude ansible 2.9.10
ansible 2.9.10 seems to have introduced a bug.

See https://github.com/ansible/ansible/issues/70168

This commit excludes this version from ceph-ansible requirements.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-06-19 17:32:33 -04:00