Commit Graph

4177 Commits (4dd46ec396dc3dd9bb7fb752331f44038ff9022c)
 

Author SHA1 Message Date
Guillaume Abrioux 4dd46ec396 add-osd: gather facts in second part of playbook
otherwise, it will end up with error like following:

```
FAILED! => {"msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_hostname'"}
```

because facts won't have been gathered.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1670663

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a440878533)
2019-03-04 15:48:44 +00:00
Guillaume Abrioux 06ad7e0b57 purge: fix rbd-mirror group name
the default is rbdmirrors in ceph-defaults

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47ebef374f)
2019-03-01 22:16:19 +00:00
Guillaume Abrioux a8467d8f33 purge: fix rbd mirror purge
as of b70d54ac80 the service launched isn't
ceph-rbd-mirror@admin.service.

it's now `ceph-rbd-mirror@rbd-mirror.{{ ansible_hostname }}`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a915308477)
2019-03-01 22:16:19 +00:00
Guillaume Abrioux 5470e6fa42 purge: do not remove /var/lib/apt/lists/*
removing the content of this directory seems a bit agressive and cause a
redeployment to fail after a purge on debian based distrubition.

Typical error:
```
fatal: [mon0]: FAILED! => changed=false
  attempts: 3
  msg: No package matching 'ceph' is available
```

The following task will consider the cache is still valid, so apt
doesn't refresh it:
```
- name: update apt cache if cache_valid_time has expired
  apt:
    update_cache: yes
    cache_valid_time: 3600
  register: result
  until: result is succeeded
```

since the task installing ceph packages has a `update_cache: no` it
fails:

```
- name: install ceph for debian
  apt:
    name: "{{ debian_ceph_pkgs | unique }}"
    update_cache: no
    state: "{{ (upgrade_ceph_packages|bool) | ternary('latest','present') }}"
    default_release: "{{ ceph_stable_release_uca | default('') }}{{ ansible_distribution_release ~ '-backports' if ceph_origin == 'distro' and ceph_use_distro_backports else '' }}"
  register: result
  until: result is succeeded
```

/tmp/* isn't specific to ceph as well, so we shouldn't remove everything
in this directory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3849f30f58)
2019-03-01 22:16:19 +00:00
Guillaume Abrioux 255eab59ac purge: fix purge of lvm devices
using `shell` module seems to be the only way to make this task working
on rhel based distribution AND debian based distributions.

on ubuntu, using `command` ansible module fails like following
(not due to `sudo` usage or not):
```
ok: [osd1] => changed=false
  cmd: command -v ceph-volume
  failed_when_result: false
  msg: '[Errno 2] No such file or directory: ''command'': ''command'''
  rc: 2
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1653307

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 89f77589fa)
2019-03-01 22:16:19 +00:00
VasishtaShastry 2393d82306 Extends check_devices tasks to non-collocated an lvm-batch scenarios
Tuned name of a task and error message to make it more user understandable

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 34c25ef49b)
2019-03-01 04:06:57 +00:00
ToprHarley d1051c8e55 Convert interface names to underscores
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881

Signed-off-by: Tomas Petr <tpetr@redhat.com>
(cherry picked from commit 573adce7dd)
2019-02-28 19:02:32 +00:00
Guillaume Abrioux de3465b6a3 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d5be83e504)
2019-02-28 13:48:39 +00:00
Guillaume Abrioux 320af524d8 tests: update ceph_volume tests
accordingly to change introduced by b5548ea9412cd7741bee993dddcbfd9daa34cb02

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f2dcb02d21)
2019-02-28 13:48:39 +00:00
Noah Watkins 58a527c192 cv: expose host ipc namespace to ceph-volume container
this is needed to properly handle semaphore synchronization for udev
actions via dmcrypt/cryptsetup.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683770

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 15812970f0)

# Conflicts:
#	library/ceph_volume.py
2019-02-28 13:48:39 +00:00
Guillaume Abrioux b7f5233d07 tests: add lvm bluestore dmcrypt support
Add coverage for container / non container lvm bluestore dmcrypt OSDs

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 207fae38d4)
2019-02-28 13:48:39 +00:00
fpantano 1033411512 Removed not needed mountpoint and removed ubuntu section
Referring to BZ#1683290, as dsavineau suggests, being this
bug tripleO specific, removed the ubuntu section and removed
useless mountpoints.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 21fad7ced3)
2019-02-28 12:31:23 +00:00
fpantano 9b843c24f9 Added to the ceph-radosgw service template the ca-trust
volume avoiding to expose useless information.
This bug is referred to the following bugzilla:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 0c1944236b)
2019-02-28 12:31:23 +00:00
Kevin Coakley 2005d857df Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Set directories to 755 and files to 644 to
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively instead of
setting files and directories to 755 recursively. The ceph mon
process writes files to this path with permissions 644. This update stops
ansible from updating the permissions in
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes
a file and increases idempotency.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683997

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
(cherry picked from commit d327681b99)
2019-02-28 10:52:04 +00:00
Dimitri Savineau 77596c791d mon: Move client admin variable to defaults
There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 58a9d310d5)
2019-02-27 20:03:13 +00:00
Dimitri Savineau 05c6ac4d78 mon: Add mds permissions to client.admin
The administrator keyring needs full capabilities on mds like mon,
osd and mgr.
Whithout this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dd7b7604de)
2019-02-27 20:03:13 +00:00
Guillaume Abrioux 8cc75e516c common: do not override ceph_release when ceph_repository is 'rhcs'
We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2b60a35634)
2019-02-21 13:03:16 +00:00
Guillaume Abrioux d15b055854 osd: make the 'wait for all osd to be up' task configurable
introduce two new variables to make the check that 'wait for all osd to
be up' configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 21e5db8982)
2019-02-20 16:53:06 +00:00
David Waiting eba80adb1a ensure at least one osd is up
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.

The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).

This is related to Bugzilla 1578086.

In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.

Signed-off-by: David Waiting <david_waiting@comcast.com>
(cherry picked from commit 3930791cb7)
2019-02-19 19:02:16 +00:00
Sébastien Han 2c1a690774 ceph_key: fix rstrip for python 3
Removing bytes literals since rstrip only supports type String or None.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit f5c2ca3710)
2019-02-18 16:39:38 +00:00
Patrick C. F. Ernzer a43c68df7d setup_ntp: call handler to disable ntpd if chronyd used
The task setup chronyd called the handler disable chronyd, which of
course defeats the purpose.

Changing the task to disable ntpd instead fixes the issue of chronyd
being disabled after it got enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1673664
Fixes: #3582

Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com
(cherry picked from commit c605ff6a68)
2019-02-15 09:09:36 +00:00
Guillaume Abrioux 6200f90ab2 iscsi: fix permission denied error
Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`.

Since it's already set in the main playbook (site.yml/site-container.yml)

The thing is that the files get generated in the 'fetch_directory' with
root user because there is a 'delegate_to' + we run the playbook with
`become: True` (from main playbook).

The idea here is to create files under ansible user so we can open them
later to copy them on the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d590f4339)
2019-02-11 16:17:44 +00:00
Justin Riley bde156352b add 'custom' as valid ceph_repository value
This is documented as valid:

561746f75e/group_vars/all.yml.sample (L245)

Signed-off-by: Justin Riley <justin.t.riley@gmail.com>
(cherry picked from commit 6a79870d62)
2019-02-11 10:07:24 +00:00
Leah Neukirchen d855cb2595 Fix uses of default(omit) with string concatenation
When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.

Regression introduced in d53f55e807.

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
2019-02-08 11:01:11 +00:00
Guillaume Abrioux 15b1f22ca3 tests: do not deploy iscsigw on ubuntu
not supported on non rhel based distribution

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 14:48:21 +01:00
Guillaume Abrioux 2738a945a3 tests: add inventory file
add missing inventory file for ubuntu-container-all_daemons job

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 14:48:21 +01:00
Guillaume Abrioux 0317831b5b ansible: increase fact cache timeout
10m seems a bit low, indeed, a complete run can take more than 1h.
Let's increase it to 2h

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b37c4adb32)
2019-02-06 10:22:14 +00:00
Sébastien Han 7db797d8df osd: expose udev into the container
In order to be able to retrieve udev information, we must expose its
socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 997667a873)
2019-02-06 00:37:11 +00:00
Guillaume Abrioux 303cc85754 osd: bind mount /var/run/udev/
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ade032807)
2019-02-06 00:37:11 +00:00
Noah Watkins be59e0b451 shrink_osd: use cv zap by fsid to remove parts/lvs
Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1569413
  https://bugzilla.redhat.com/show_bug.cgi?id=1572933

Note: rebased

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 9a43674d2e)
2019-02-06 00:37:11 +00:00
Noah Watkins ebd72708b1 test: add missing test dependency
[nwatkins@smash ceph-ansible]$ virtualenv env
[nwatkins@smash ceph-ansible]$ env/bin/pip install -r tests/requirements.txt
[nwatkins@smash ceph-ansible]$ env/bin/python -c "import mock"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'mock'

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 8a5530ee98)
2019-02-06 00:37:11 +00:00
Noah Watkins 8454f0144a cv: support zap by osd fsid
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit fce9f6ef60)
2019-02-06 00:37:11 +00:00
Rishabh Dave 701ea71392 set `any_errors_fatal` true for left out host sections
Many hosts sections in site.yml.sample were left out during the
backport commit 6e2cd0930f.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-01-31 07:52:31 +00:00
Patrick Donnelly befdb1e48b use shortname in keyring path
socket.gethostname may return a FQDN. Problem found in Linode.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 8cd0308f5f)
2019-01-31 00:56:29 +00:00
Guillaume Abrioux 1877e1b330 tests: run lvm_setup.yml only when osd_scenario is lvm
especially for ooo_collocation scenario which is still using ceph-disk
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 00:33:10 +01:00
Guillaume Abrioux 2abde600cd tests: add nodes for container-all_daemons scenario
add back iscsigw and rbdmirror vm in all_daemons testing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Noah Watkins b8c39d7613 Add a ceph-volume aware shrink-osd playbook
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit f5dacbf7de)
2019-01-30 14:58:59 +01:00
Noah Watkins 8f57a95048 Rename ceph-disk version of shrink-osd playbook
This will be replaced by a ceph-volume aware verison.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 0782cfc546)
2019-01-30 14:58:59 +01:00
Guillaume Abrioux 802e692b7b tests: specify docker params for shrink-osd
Otherwise, it will go with the default values, eg:

"latest" for `ceph_docker_image_tag`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Noah Watkins fc6bae26ac Fixup shrink_osd[_container] scenario config
** configuration seems to be for filestore:

[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes

** Removing `radosgw_interface: eth1` to resolve:

The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'

The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.

The offending line appears to be:

  - name: set_fact _radosgw_address to radosgw_interface - ipv4
    ^ here

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 50255b9640)
2019-01-30 14:58:59 +01:00
Guillaume Abrioux 299baed635 tests: refact testing in stable-3.2
Apply the same refact recently introduced in master to stable-3.2

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 14:58:59 +01:00
Guillaume Abrioux af17e0dfbb override ceph_release with ceph_stable_release
when `ceph_origin` is set to `'repository'` and `ceph_repository` to
`'community'` we need to ensure `ceph_release` reflect
`ceph_stable_release`.

4a3f180f9d simply removed the override
while it should just have to be run only when the condition mentioned
above is satisfied.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bfefdd5bc)
2019-01-24 14:18:34 +00:00
Guillaume Abrioux e29cdd0a61 config: remove code related to ceph release prior to luminous
This part of the code is not needed since ceph-ansible@master is
intended to deploy ceph@master only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bbdde272f)
2019-01-24 14:18:34 +00:00
Guillaume Abrioux eaa92f7e55 ceph-default: rm useless condition
This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

(cherry picked from commit e9188cd202)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 14:18:34 +00:00
Giulio Fidente 75855b2d58 Preserve rolling_update backward compatibility with ansible < 2.5
Let's enforce the default value for `client_update_batch` to 20 since
`ansible_forks` isn't always available.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184

Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit ff8dbe114c)
2019-01-21 14:28:07 +00:00
Guillaume Abrioux 44afe16568 Vagrantfile: remove useless default values
Those default values are useless and might cause issues.

- `osd_scenario` should be mandatory anyway.
- `pool_default_size` is not used anymore (this has been refactored
recently.

Closes: #3468

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c7a929b2dc)
2019-01-21 10:22:18 +01:00
Noah Watkins e57e2d98a1 start_osds: use list instead of keys (re-introduce)
the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was reintroduced a few days later by:

  82a6b5adec

and this patch fixes it again :)

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 3cf5fd2c3e)
2019-01-16 15:48:35 +00:00
Brad Hubbard 53650b2fc4 site: Make sure is_atomic is defined
configure_firewall tests the is_atomic variable if the firewalld package
is not present. is_atomic is defined in ceph_facts so include that.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 55fab6f547)
2019-01-15 15:16:40 +00:00
Sébastien Han a08d4c072f ceph-facts: resync group_vars file
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-14 18:54:46 +00:00
Sébastien Han 04d8002614 switch: do not fail on missing key
Some people use the switch playbook to perform upgrade so they end up in
the same situation than https://bugzilla.redhat.com/show_bug.cgi?id=1650572
This is applying the same fix as
729744c6a8.

We don't want to fail on key that are not present since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Signed-off-by: Sébastien Han <seb@redhat.com>
2019-01-14 18:54:46 +00:00