Commit Graph

4630 Commits (d36bab5557e9fc9c50c5db29d1331f90e9285ced)
 

Author SHA1 Message Date
Guillaume Abrioux b3eb9206fa osd: support numactl options on OSD activate
This commit adds OSD containers activate with numactl support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1684146

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-11 10:14:50 +01:00
Dimitri Savineau b23c05ae52 add-osd.yml: Add become flag for ceph-validate
The check_devices task fails if the ceph-validate role isn't executed
as a privileged user (Permission denied).

failed: [osd0] (item=/dev/sdb) => {"changed": false, "err": "Error:
Error opening /dev/sdb: Permission denied\n", "item": "/dev/sdb",
"msg": "Error while getting device information with parted script:
'/sbin/parted -s -m /dev/sdb -- unit 'MiB' print'", "out": "", "rc": 1}

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-09 05:54:46 +00:00
Dimitri Savineau a089e1ec23 systemd/service: Set docker.service conditionally
We don't need to set After=docker.service when the container_binary
variable isn't set to docker.
It doesn't break anything currently but it could be confusing when
using podman.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 20:56:11 +00:00
Dimitri Savineau d6e71d769c common: Use rhsm_repository module for RHCS
Instead of using subscription-manager with command module we can use
the rhsm_repository ansible module.
This module already uses repos list feature to determine if a
repository is enabled or not. That way this module is idempotent so
we don't need changed_when: false anymore.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 19:15:42 +00:00
Dimitri Savineau 5da9a7dec5 ceph_key: Use client name to build key path
Because the client name is part of the client key path we can reuse
the user variable to build this path.
Also remove a duplicate user variable declaration.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-07 08:59:04 +00:00
Dimitri Savineau 676b4c979b travis: Add python 2.7
Because we're still using Linux distributions with python 2.7 (like
CentOS/RHEL 7), it is useful to run travis tests against python
2.7 even though its support ends in 2020.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-06 02:25:55 +00:00
Dimitri Savineau 53514a5b50 common: Add noarch to community repository
The ceph stable community repository only enables the basearch
packages url.
Add the noarch url because, starting with the nautilus release, some
packages useful for mgr or grafana are published there.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-06 00:25:11 +00:00
Dimitri Savineau 4d32ecc980 Force osd pool min_size value to integer
After b8d580b and e9e5d5a, either item.min_size or
osd_pool_default_min_size could be a string instead of an int, causing
the condition to evaluate to true when it should be false.
As a result, the task could try to set the pool min_size value to
0, which leads to:

Error EINVAL: pool min_size must be between 1 and 1
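
A minimal sketch of the string-versus-integer pitfall described above, written in Python because Jinja2 conditions follow Python comparison rules. The condition `min_size != 0` and both function names are assumptions made for illustration; the actual task condition is not shown in this commit message.

```python
def should_set_min_size(min_size):
    """Hypothetical condition: set min_size only when it differs from 0."""
    return min_size != 0


def should_set_min_size_fixed(min_size):
    """Same hypothetical condition, with the value forced to an integer first."""
    return int(min_size) != 0


# A value arriving as the string "0" is not equal to the integer 0,
# so the unfixed condition is true even though min_size is effectively unset.
buggy = should_set_min_size("0")
fixed = should_set_min_size_fixed("0")
```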

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 19:48:09 +00:00
Dimitri Savineau cb381b41fe Add CONTAINER_IMAGE env var to ceph daemons
Ceph daemons will set the CONTAINER_IMAGE environment variable value
in the daemon metadata.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-05 15:07:05 +00:00
Guillaume Abrioux e9e5d5a39a fix pool min_size customization
b8d580b3f4 introduced a bug when
`min_size` isn't set (default to 0).

Typical error:

```
Error EINVAL: pool min_size must be between 1 and 1
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-05 13:29:34 +00:00
Radu Toader b8d580b3f4 Customize pools min_size
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 10:57:15 +00:00
Radu Toader 2048255f61 When creating pool, read pool.application and make the call to ceph osd pool enable application
Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
2019-03-05 09:16:03 +00:00
Kevin Coakley b11dc13476 Updated 7 ansible-lint issues in the ceph-mon, ceph-osd, and ceph-rgw roles
The following lint issues have been resolved:

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2

[305] Use shell only when shell functionality is required
/home/travis/build/ceph/ceph-ansible/roles/ceph-osd/tasks/start_osds.yml:47

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:2

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:7

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:14

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:19

[301] Commands should not change things if nothing needs doing
/home/travis/build/ceph/ceph-ansible/roles/ceph-rgw/tasks/multisite/destroy.yml:24

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-03-04 22:25:35 +00:00
Guillaume Abrioux 359f8a9a4a nfs: fix systemd template service for ubuntu
`mkdir` is located in `/bin` on Ubuntu.
Let's use some jinja to support Ubuntu.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 19:54:25 +00:00
Guillaume Abrioux 3d3eee8f38 tests: add symlink for ubuntu hosts inventory
otherwise a number of jobs will fail as follows:

```
 [WARNING]: Unable to parse /home/jenkins-build/build/workspace/ceph-ansible-nightly-luminous-ubuntu-container-stable-3.2-bluestore_lvm_osds/tests/functional/bs-lvm-osds/container/hosts-ubuntu as an inventory source
 [WARNING]: No inventory was parsed, only implicit localhost is available
 [WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 18:05:26 +01:00
Guillaume Abrioux b42250332a tests: pin testinfra version
As of testinfra 2.0.0, the binary name is `py.test`.

But let's pin the version to 1.19.0.
Indeed, migrating to 2.0.0 requires our current testing to be reworked a bit.
Since we don't have the bandwidth ATM for this, it's better to simply
keep testing with testinfra 1.19.0.

Note that I've replaced all `testinfra` occurrences by `py.test` anyway.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 14:44:27 +01:00
Guillaume Abrioux a440878533 add-osd: gather facts in second part of playbook
otherwise, it will end up with an error like the following:

```
FAILED! => {"msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_hostname'"}
```

because facts won't have been gathered.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1670663

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-04 14:44:27 +01:00
Guillaume Abrioux 47ebef374f purge: fix rbd-mirror group name
the default is rbdmirrors in ceph-defaults

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-01 20:31:14 +00:00
Guillaume Abrioux a915308477 purge: fix rbd mirror purge
as of b70d54ac80 the service launched isn't
ceph-rbd-mirror@admin.service.

it's now `ceph-rbd-mirror@rbd-mirror.{{ ansible_hostname }}`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-01 20:31:14 +00:00
Guillaume Abrioux 3849f30f58 purge: do not remove /var/lib/apt/lists/*
removing the content of this directory seems a bit aggressive and causes a
redeployment to fail after a purge on debian based distributions.

Typical error:
```
fatal: [mon0]: FAILED! => changed=false
  attempts: 3
  msg: No package matching 'ceph' is available
```

The following task considers the cache still valid, so apt
doesn't refresh it:
```
- name: update apt cache if cache_valid_time has expired
  apt:
    update_cache: yes
    cache_valid_time: 3600
  register: result
  until: result is succeeded
```

since the task installing ceph packages has `update_cache: no`, it
fails:

```
- name: install ceph for debian
  apt:
    name: "{{ debian_ceph_pkgs | unique }}"
    update_cache: no
    state: "{{ (upgrade_ceph_packages|bool) | ternary('latest','present') }}"
    default_release: "{{ ceph_stable_release_uca | default('') }}{{ ansible_distribution_release ~ '-backports' if ceph_origin == 'distro' and ceph_use_distro_backports else '' }}"
  register: result
  until: result is succeeded
```
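
The interaction between the two tasks above can be sketched in Python. This is a simplified model, not the apt module's implementation: it assumes the cache is judged fresh from a timestamp alone, and the function names and the `lists_present` flag are hypothetical.

```python
import time


def cache_is_valid(last_update, cache_valid_time, now=None):
    """Return True when the last recorded update is recent enough."""
    if now is None:
        now = time.time()
    return (now - last_update) < cache_valid_time


def packages_installable(lists_present, last_update, cache_valid_time, now):
    """Model the two tasks: the first refreshes the package lists only when
    the cache looks stale, and the install task never refreshes them itself."""
    refreshed = not cache_is_valid(last_update, cache_valid_time, now)
    return lists_present or refreshed
```

Under these assumptions, a purge that empties the package lists ten minutes after the last update leaves a cache that still looks valid, so no refresh happens and the install task finds no matching package.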

/tmp/* isn't specific to ceph either, so we shouldn't remove everything
in this directory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-01 20:31:14 +00:00
Guillaume Abrioux 89f77589fa purge: fix purge of lvm devices
using the `shell` module seems to be the only way to make this task work
on both rhel based and debian based distributions.

on ubuntu, using the `command` ansible module fails as follows
(whether or not `sudo` is used):
```
ok: [osd1] => changed=false
  cmd: command -v ceph-volume
  failed_when_result: false
  msg: '[Errno 2] No such file or directory: ''command'': ''command'''
  rc: 2
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1653307

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-03-01 20:31:14 +00:00
Dimitri Savineau 45a7082712 lint: Fix spaces before and after variables
ansible-lint reports:

[206] Variables should have spaces after {{ and before }}

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-03-01 17:22:24 +00:00
VasishtaShastry 34c25ef49b Extends check_devices tasks to non-collocated and lvm-batch scenarios
Tuned the name of a task and an error message to make them easier for users to understand

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
2019-03-01 02:13:51 +00:00
Kevin Coakley 038401fef2 Add changed_when: false to the "get osd ids" statement
The "get osd ids" statement only registers the osd_ids_non_container variable. Running "ls /var/lib/ceph/osd/ | sed 's/.*-//'" should never produce a change on the system. Adding changed_when: false prevents irrelevant change messages from Ansible.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-28 22:46:19 +00:00
ToprHarley 573adce7dd Convert interface names to underscores
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881

Signed-off-by: Tomas Petr <tpetr@redhat.com>
2019-02-28 17:07:34 +00:00
Guillaume Abrioux d5be83e504 osd: add ipc=host in systemd template for containers
in addition to 15812970f0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 13:14:09 +00:00
Guillaume Abrioux 1c30c76f8c mergify: need 2 approvals to merge a 'skip ci' PR
This will avoid merging PR with 1 approval + [skip ci]

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 13:07:51 +01:00
Guillaume Abrioux f2dcb02d21 tests: update ceph_volume tests
according to the change introduced by b5548ea9412cd7741bee993dddcbfd9daa34cb02

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 12:01:18 +00:00
Noah Watkins 15812970f0 cv: expose host ipc namespace to ceph-volume container
this is needed to properly handle semaphore synchronization for udev
actions via dmcrypt/cryptsetup.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683770

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2019-02-28 12:01:18 +00:00
Guillaume Abrioux 207fae38d4 tests: add lvm bluestore dmcrypt support
Add coverage for container / non container lvm bluestore dmcrypt OSDs

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-28 12:01:18 +00:00
fpantano 21fad7ced3 Removed not needed mountpoint and removed ubuntu section
Referring to BZ#1683290: as dsavineau suggests, this bug is
tripleO specific, so the ubuntu section and the useless
mountpoints were removed.

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
fpantano 0c1944236b Added to the ceph-radosgw service template the ca-trust
volume, which avoids exposing useless information.
This bug is tracked in the following bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1683290

Signed-off-by: fpantano <fpantano@redhat.com>
2019-02-28 09:46:10 +00:00
Dimitri Savineau 58a9d310d5 mon: Move client admin variable to defaults
There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Dimitri Savineau dd7b7604de mon: Add mds permissions to client.admin
The administrator keyring needs full capabilities on mds like mon,
osd and mgr.
Without this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-27 18:39:39 +00:00
Guillaume Abrioux 4ab02d2cd1 tests: set ceph_origin and ceph_repository for non_container-collocation
those variables are mandatory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Guillaume Abrioux f68ad10bc9 mon: do not create unnecessarily mgr keyrings
there's no need to generate mgr keyrings 'mgr.monX' when mgrs aren't
collocated with monitors.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-27 15:58:35 +00:00
Kevin Coakley d327681b99 Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Set directories to 755 and files to 644 in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively, instead of setting both files and directories to 755 recursively. The ceph mon process writes files to this path with permissions 644. This update stops ansible from resetting the permissions in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes a file, which improves idempotency.
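
As an illustration of why the symbolic mode in the title yields 755 for directories and 644 for plain files, here is a small Python sketch of how the capital `X` is resolved under standard chmod semantics. The function name is hypothetical and the sketch only computes the resulting permission bits; it does not touch the filesystem.

```python
import stat


def resolve_symbolic_mode(mode):
    """Return the permission bits produced by u=rwX,g=rX,o=rX.

    The capital X grants execute/search only when the target is a
    directory or already has an execute bit set for some user.
    """
    is_dir = stat.S_ISDIR(mode)
    has_exec = bool(mode & 0o111)
    if is_dir or has_exec:
        return 0o755
    return 0o644
```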

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
2019-02-27 10:48:19 +00:00
Guillaume Abrioux 7fd92348bb tests: add mgr node for all_daemons scenario
add a mgr node to cover in the CI the case where mgrs and monitors
are not collocated

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 13:19:06 +00:00
Guillaume Abrioux 299c7b670e site.yml: do not bootstrap mgrs on monitors by default
Let's bootstrap mgrs on monitors only if there's no mgrs section in
inventory hostfile.
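
The selection rule can be sketched in Python as follows. The dictionary-shaped inventory, the `mons` group name, and the function name are assumptions for illustration; the playbook itself expresses this with Ansible host patterns and conditions that are not shown here.

```python
def mgr_hosts(inventory):
    """Pick the hosts that should run mgrs.

    Use the dedicated 'mgrs' group when it exists and is non-empty;
    otherwise fall back to the monitors.
    """
    mgrs = inventory.get("mgrs", [])
    if mgrs:
        return list(mgrs)
    return list(inventory.get("mons", []))
```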

Closes: #3613

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 13:19:06 +00:00
Dimitri Savineau dc1c0dcee2 ceph-osd: Drop memory flag with bluestore
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-26 07:27:06 +00:00
Guillaume Abrioux 8f42007272 facts: fix auto_discovery exclude
the previous approach was wrong.
checking if `item.key` is in `osd_auto_discovery_exclude` (`['dm-',
'loop']`) is incorrect because it will obviously not match. Therefore,
the condition will return `True` whatever the device we are checking.
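
To make the mistake concrete, here is a Python sketch contrasting list membership with pattern matching. The prefix-based matching is an assumption about the intended behaviour, and the function names are hypothetical; the actual fix in the role is not shown in this commit message.

```python
osd_auto_discovery_exclude = ["dm-", "loop"]


def excluded_by_membership(device):
    """Previous approach: exact membership, which never matches 'dm-0'."""
    return device in osd_auto_discovery_exclude


def excluded_by_prefix(device):
    """Assumed intent: exclude devices whose name starts with a listed prefix."""
    return any(device.startswith(prefix) for prefix in osd_auto_discovery_exclude)
```

With exact membership, a device such as `dm-0` is never found in the list, so a "not excluded" condition holds for every device.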

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-26 03:16:33 +00:00
Guillaume Abrioux c43b30e3ce common: fix retry on raw install python for rhel
when the following failure is thrown
```
    rhel-8.0.0-beta-1.7-     [===                 ] ---  B/s |   0  B     --:-- ETArhel-8.0.0-beta-1.7-appstream                   0.0  B/s |   0  B     00:00
    rhel-8.0.0-beta-1.7-     [===                 ] ---  B/s |   0  B     --:-- ETArhel-8.0.0-beta-1.7-baseos                      0.0  B/s |   0  B     00:00
    rhel-8.0.0-beta-1.7-     [   ===              ] ---  B/s |   0  B     --:-- ETArhel-8.0.0-beta-1.7-builder                     0.0  B/s |   0  B     00:00
    Failed to synchronize cache for repo 'rhel-8.0.0-beta-1.7-appstream', ignoring this repo.
    Failed to synchronize cache for repo 'rhel-8.0.0-beta-1.7-baseos', ignoring this repo.
    Failed to synchronize cache for repo 'rhel-8.0.0-beta-1.7-builder', ignoring this repo.
    No match for argument: python3
    Error: Unable to find a match
```

dnf returns 0 anyway.

Let's ensure the pattern 'Failed' isn't present in the output.
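
A minimal Python sketch of the success check described above, assuming the retry decision is driven by both the return code and the command output. The function name is hypothetical.

```python
def install_succeeded(return_code, output):
    """Treat the install as successful only if the command returned 0
    and the word 'Failed' does not appear anywhere in its output."""
    return return_code == 0 and "Failed" not in output
```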

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-25 10:05:34 +00:00
Guillaume Abrioux 83d7ef777e osd: add possibility to exclude device in osd_auto_discovery
Add a new `osd_auto_discovery_exclude` variable to allow excluding
some devices in the auto_discovery scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-25 10:05:34 +00:00
Dimitri Savineau b7338d438a ceph-infra: Remove restart firewalld handler
There's no need to restart firewalld service when a new rule is
added due to the usage of the immediate flag.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-02-22 18:15:47 +00:00
Guillaume Abrioux fa13289c65 tests: fix network interfaces names in conftest.py
Set network interfaces names according to the OS distribution in
conftest.py

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-22 16:24:18 +01:00
Guillaume Abrioux 2ed203da61 Revert "tests: add ubuntu bionic support"
This reverts commit 33c09af250.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-22 16:24:18 +01:00
Guillaume Abrioux 3d66e913f4 tests: switch ubuntu image to bionic
I didn't use the `ceph/ubuntu-bionic` image because it's broken at the
time of writing this commit. I'll switch back to `ceph/ubuntu-bionic` as
soon as it is fixed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-22 09:20:16 +01:00
Guillaume Abrioux 33c09af250 tests: add ubuntu bionic support
This commit brings all modifications needed to test against ubuntu
bionic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-22 09:20:16 +01:00
Guillaume Abrioux 2b60a35634 common: do not override ceph_release when ceph_repository is 'rhcs'
We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-21 11:58:49 +01:00
Guillaume Abrioux 21e5db8982 osd: make the 'wait for all osd to be up' task configurable
introduce two new variables to make the 'wait for all osd to
be up' check configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.
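
A configurable wait of this kind can be sketched in Python as a retry loop. This assumes the two variables control the number of retries and the delay between them, which the commit message does not state; the function and parameter names are hypothetical.

```python
import time


def wait_until(check, retries, delay, sleep=time.sleep):
    """Call check() up to `retries` times, sleeping `delay` seconds
    between attempts, and report whether it ever returned True."""
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False
```

A slow deployment would raise `retries` or `delay` rather than change the check itself.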

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-20 16:06:04 +00:00