Commit Graph

572 Commits (a8fe484e1533db760b35fd948a30cc7fd512df2f)

Author SHA1 Message Date
Guillaume Abrioux 9167e19b1c tests: skip rbdmirror tests on non-secondary daemon
the daemon is not running on the 'primary' daemon.
Therefore, these tests are not needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 37e67fb672)
2022-08-08 13:06:50 +02:00
Guillaume Abrioux ee5303bd0f rbd-mirror: major refactor
- Use config-key store to add cluster peer.
- Support multiple pools mirroring.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b74ff6e22c)
2022-08-08 13:05:20 +02:00
Guillaume Abrioux a15448e5cc tests: add new scenario subset_update
new scenario in order to test the subset upgrade approach using tags.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb8a66149b)
2021-10-25 23:22:35 +02:00
Guillaume Abrioux c5a8343417 tests: ensure ca-certificates is up to date
otherwise the `rpm_key` module fails because it can't verify the
certificate.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-02 15:48:31 +02:00
Guillaume Abrioux a890d6a043 tests: remove all references to ceph_stable_release
this is legacy and not needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f277a39dfe)
2021-10-02 15:48:31 +02:00
Guillaume Abrioux a8026b3516 tests: set rgw_instances in collect-logs.yml
in order to gather rgw logs, we need rgw_instances to be set.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c2e46fe5a5)
2021-09-30 17:54:31 +02:00
Guillaume Abrioux dda2368892 tests: update collect-logs.yml playbook
- change `ceph -s` output to json-pretty.
- gather rgw logs
- add `health detail` command

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b2ccc7234a)
2021-09-30 17:49:29 +02:00
Guillaume Abrioux 6dd85f3344 tests: move collect-logs.yml to ceph-ansible repo
related ceph-build PR: ceph/ceph-build#1914

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 702564518b)
2021-09-29 16:41:40 +02:00
Dimitri Savineau 3aba6c8e4e tests/rgw: use json format output for user info
If the radosgw user already exists then we need to have the output in json
format because we are expecting to load the output with json.loads()
Otherwise we have pytest failure like:

```console
self = <json.decoder.JSONDecoder object at 0x7fa2f00a5fd0>, s = '', idx = 0

    def raw_decode(self, s, idx=0):
        """Decode a JSON document from ``s`` (a ``str`` beginning with
        a JSON document) and return a 2-tuple of the Python
        representation and the index in ``s`` where the document ended.

        This can be used to decode a JSON document from a string that may
        have extraneous data at the end.

        """
        try:
            obj, end = self.scan_once(s, idx)
        except StopIteration as err:
>           raise JSONDecodeError("Expecting value", s, err.value) from None
E           json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2bd8ae70f)
2021-08-27 14:40:37 -04:00
Dimitri Savineau e1dd35f6d6 tests/rgw: add timeout 5s to radosgw-admin command
If the radosgw daemons aren't up and running correctly (like not registered
in the servicemap or the OSD are down) then the radosgw-admin will hang
forever.
Jenkins will kill the jobs after 3h but we don't want to wait until this global
timeout.
Adding the timeout 5 command to the radosgw-admin commands (which is already
present on other ceph calls) allows the job to fail earlier.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f01ae82eec)
2021-08-27 14:40:37 -04:00
Guillaume Abrioux 79b8f3a74d lib/ceph-volume: support zapping by osd_id
This commit adds the support for zapping an osd by osd_id in the
ceph_volume module.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 70f1d6e2cd)
2021-07-26 17:49:42 +02:00
Guillaume Abrioux 5fa7102b1f tests: remove legacy file
This inventory isn't used anywhere.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 304d1cbb97)
2021-06-30 16:12:07 +02:00
Guillaume Abrioux 31ad2d9338 workflows: add signed-off check
This adds a github workflow for checking the signed off line in commit
messages.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8c09497567)
2021-06-30 09:52:44 +02:00
Guillaume Abrioux e899b84f6f workflow: add group_vars/defaults checks
let's use github workflow for checking defaults values.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d71db816c6)
2021-06-30 09:52:44 +02:00
Guillaume Abrioux 71764c3440 tests: disable test_mgr_dashboard_is_listening
Due to a recent commit that has introduced a regression in ceph, this
test is failing.
Temporarily disabling it to unblock the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2e19d1705e)
2021-06-07 15:12:43 +02:00
Alex Schultz 7ddbe74712 Use ansible_facts
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant.  In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.

Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
(cherry picked from commit a7f2fa73e6)
2021-03-26 00:16:58 +01:00
Guillaume Abrioux 93defc4f4b tests: switch to quay.ceph.io for dashboard images
for some reason, `quay.io/app-sre/grafana` no longer exist.
as a workaround, all dashboard related images have been mirrored on
quay.ceph.io.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c90b0985e5)
2021-03-25 14:11:11 +01:00
Guillaume Abrioux bd6cc79fa0 tests: fix `test_rgw_is_up` test
The data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b8080bac41)
2021-03-25 14:11:11 +01:00
Guillaume Abrioux 358ea3853a tests: fix `test_nfs_is_up` test
the data structure seems to have been modified in ceph@master (quincy).

This commit update the test accordingly.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e1db0b599)
2021-03-25 14:11:11 +01:00
Florian Haas 6fe14c6d01 requirements.txt: Move the six dependency into the general requirements
config_template.py depends on six, which isn't listed in the default
requirements.txt. This previously frequently wasn't a problem, because
six used to be a standard package being installed into a venv, and
lots of other projects depended on it.

It also does get installed for unit and integration tests via
tests/requirements.txt, so any broken dependency on six wouldn't be
detected by tox runs.

However, as other projects and distributions have phased out Python
2.7 support the dependency on six becomes less common. Thus, as long
as ceph-ansible does require it for config_template.py, add it to the
base requirements.

Signed-off-by: Florian Haas <florian@citynetwork.eu>
(cherry picked from commit d49ea9818b)
2021-03-03 13:22:29 +01:00
Guillaume Abrioux 8fada83589 tests: set `mon_max_pg_per_osd` in rgw_multisite
Otherwise, the job fails when it tries to create a bucket with `s3cmd mb`
command because we have too many PGs per OSD.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 54bae480d2)
2021-02-10 08:32:24 +01:00
Guillaume Abrioux 14267fe0c4 rgw: multisite refact
Add the possibility to deploy rgw multisite configuration with a mix of
secondary and primary zones on a same rgw node.
Before that, on a same node, all instances were either primary
zones *OR* secondary.

Now you can define a rgw instance like following:

```
rgw_instances:
  - instance_name: 'rgw0'
    rgw_zonemaster: false
    rgw_zonesecondary: true
    rgw_zonegroupmaster: false
    rgw_realm: 'france'
    rgw_zonegroup: 'zonegroup-france'
    rgw_zone: paris-00
    radosgw_address: "{{ _radosgw_address }}"
    radosgw_frontend_port: 8080
    rgw_zone_user: jacques.chirac
    rgw_zone_user_display_name: "Jacques Chirac"
    system_access_key: P9Eb6S8XNyo4dtZZUUMy
    system_secret_key: qqHCUtfdNnpHq3PZRHW5un9l0bEBM812Uhow0XfB
    endpoint: http://192.168.101.12:8080
```

Basically it's now possible to define `rgw_zonemaster`,
`rgw_zonesecondary` and `rgw_zonegroupmaster` at the intsance
level instead of the whole node level.

Also, this commit adds an option `deploy_secondary_zones` (default True)
which can be set to `False` in order to explicitly ask the playbook to
not deploy secondary zones in case where the corresponding endpoint are
not deployed yet.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1915478

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 71a5e666e3)
2021-01-28 16:37:50 -05:00
Dimitri Savineau 3f16132e44 library: add ceph_osd_flag module
This adds ceph_osd_flag ansible module for replacing the command module
usage with the ceph osd set/unset commands.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5da593604a)
2020-12-15 17:36:28 +01:00
Guillaume Abrioux 8106dcff44 tests: rgw_multisite playbook test refactor
Currently we create an object from the primary sites but we try to read
that object still from the master which doesn't make sense, we should
try to read it from a secondary site.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e2ea403d5e)
2020-12-15 17:30:04 +01:00
Guillaume Abrioux d14723d5b4 mon: refact initial keyring generation
adding monitor is no longer possible because we generate a new mon
keyring each time the playbook is run.

Fixes: #5864
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1902281

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 970c6a4ee6)
2020-12-01 09:53:26 -05:00
Guillaume Abrioux 18b34a5bef ceph_key: support using different keyring
Currently the `ceph_key` module doesn't support using a different
keyring than `client.admin`.
This commit adds the possibility to use a different keyring.

Usage:
```
      ceph_key:
        name: "client.rgw.myrgw-node.rgw123"
        cluster: "ceph"
        user: "client.bootstrap-rgw"
        user_key: /var/lib/ceph/bootstrap-rgw/ceph.keyring
        dest: "/var/lib/ceph/radosgw/ceph-rgw.myrgw-node.rgw123/keyring"
        caps:
          osd: 'allow rwx'
          mon: 'allow rw'
          import_key: False
        owner: "ceph"
        group: "ceph"
        mode: "0400"
```

Where:
`user` corresponds to `-n (--name)`
`user_key` corresponds to `-k (--keyring)`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 12e6260266)
2020-12-01 09:53:26 -05:00
Guillaume Abrioux 41c7c77817 Revert "ceph_key: support using different keyring"
This reverts commit 74eb7cbecb.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-12-01 09:53:26 -05:00
Dimitri Savineau fcf260b65b tests: use github workflow for pytest
Move the pytest testing from TravisCI to Github workflow.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3e79f0322a)
2020-11-18 10:49:30 -05:00
Guillaume Abrioux 04484f5c52 tests: enforce pytest-rerunfailures version
This commit enforces the pytest-rerunfailures installed so it's <9.0

This is to avoid the following error:

```
ERROR: pytest-rerunfailures 9.0 has requirement pytest>=5.0, but you'll have pytest 4.6.11 which is incompatible.
```

latest version of pytest-rerunfailures isn't compatible with the version
of pytest we are using.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 19097026fb)
2020-11-18 10:49:30 -05:00
Guillaume Abrioux 7ffc8534ef tests: change cephfs pool size
`all_daemons` scenario can't handle pools with `size: 3` because we have
1 osd node in root=HDD and two nodes in root=default.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5713ea5d5)
2020-10-06 17:54:51 -04:00
Guillaume Abrioux 74eb7cbecb ceph_key: support using different keyring
Currently the `ceph_key` module doesn't support using a different
keyring than `client.admin`.
This commit adds the possibility to use a different keyring.

Usage:
```
      ceph_key:
        name: "client.rgw.myrgw-node.rgw123"
        cluster: "ceph"
        user: "client.bootstrap-rgw"
        user_key: /var/lib/ceph/bootstrap-rgw/ceph.keyring
        dest: "/var/lib/ceph/radosgw/ceph-rgw.myrgw-node.rgw123/keyring"
        caps:
          osd: 'allow rwx'
          mon: 'allow rw'
          import_key: False
        owner: "ceph"
        group: "ceph"
        mode: "0400"
```

Where:
`user` corresponds to `-n (--name)`
`user_key` corresponds to `-k (--keyring)`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 12e6260266)
2020-10-06 09:21:58 -04:00
Guillaume Abrioux b13f0d12e7 tests: reboot and test idempotency on collocation
test reboot and idempotency on collocation scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f83f798206)
2020-10-06 09:21:58 -04:00
Guillaume Abrioux 765db7ceec flake8: fix pep8 syntax on tests/functional/tests/
tests/conftest.py and tests present in tests/functional/tests/ has been
missed from previous commit

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8596f1d52c)

# Conflicts:
#	.github/workflows/flake8.yml
2020-10-06 10:04:01 +02:00
Guillaume Abrioux d0f29e08d8 flake8: fix all tests/library/*.py files
This commit modifies all *.py files in ./tests/library/ so flake8
passes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e49a5241f0)
2020-10-06 08:56:45 +02:00
Dimitri Savineau fd0b9491b6 ansible: bump to ansible 2.9
Prior this commit we were supporting both ansible 2.8 and 2.9.
Let's drop 2.8 now.

Closes: #5459
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1879178

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-15 13:13:09 -04:00
Guillaume Abrioux f31258d604 tests: do not run node_exporter test on clients
We need to skip these tests on client nodes since we don't deploy
node_exporter on them anymore

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5650a6d7d0)
2020-09-14 16:13:25 -04:00
Dimitri Savineau 47f24ec047 Add CentOS 8 support for rpm deployment
We were only supporting CentOS 8 for containerized deployment.
Since Nautilus 14.2.10 we now have el8 rpm packages so we should be
able to deploy a nautilus ceph cluster with el8.
Note that the nfs-ganesha isn't supported because there's no el8 rpm
packages for nfs-ganesha V2.8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-10 20:38:34 -04:00
Dimitri Savineau 0f7da8b9d1 pytest: register ceph_crash mark
Otherwise we see some pytest warning.

PytestUnknownMarkWarning: Unknown pytest.mark.ceph_crash - is this a typo?
You can register custom marks to avoid this warning - for details,
see https://docs.pytest.org/en/latest/mark.html

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 03d4620269)
2020-09-10 20:35:04 -04:00
Guillaume Abrioux 66dde0034b ceph-crash: introduce new role ceph-crash
This commit introduces a new role `ceph-crash` in order to deploy
everything needed for the ceph-crash daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d2f2108e1)
2020-09-10 20:35:04 -04:00
Dimitri Savineau d461631c86 tests: use grafana from quay.io
This changes the grafana container image regitry from docker.io to
quay.io to avoid rate limit.
This also adds the missing container image values for docker2podman
and podman scenarios.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dd05d8ba90)
2020-09-10 21:37:06 +02:00
Guillaume Abrioux 2001039c0e tests: migrate to quay.ceph.io registry
in order to avoid docker.io rate limiting

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 218aedaab6)
2020-09-10 21:37:06 +02:00
Guillaume Abrioux 2754895b89 tests: move erasure pool testing in lvm_osds
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of osd nodes we spawn so the
OVH Jenkins slaves aren't less overwhelmed when a `all_daemons` based
scenario is being tested.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8476beb5b1)
2020-08-20 14:16:57 +02:00
Guillaume Abrioux bd5cde631b tests: refact shrink_osd scenario
This adds more coverage on the shrink_osd scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7efea219d6)
2020-08-06 13:10:42 +02:00
Guillaume Abrioux 9e40062570 tests: lvm_setup.yml, add carriage return
This commit adds crlf between each task.
It makes the playbook more readable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8ef9fb68bc)
2020-07-22 18:47:27 -04:00
Guillaume Abrioux 53793b352e tests: (lvm_setup.yml), don't shrink lvol
when rerunning lvm_setup.yml on existing cluster with OSDs already
deployed, it fails like following:

```
fatal: [osd0]: FAILED! => changed=false
  msg: Sorry, no shrinking of data-lv2 to 0 permitted.
```

because we are asking `lvol` module to create a volume on an empty VG
with size extents = `100%FREE`.

The default behavior of `lvol` is to shrink the volume if the LV's current
size is greater than the requested size.

Given the requested size is calculated like this:

`size_requested = size_percent * this_vg['free'] / 100`

in our case, it is similar to:

`size_requested = 100 * 0 / 100` which basically means `0`

So the current LV size is well greater than the requested size which
leads the module to attempt to shrink it to 0 which isn't obviously now
allowed.

Adding `shrink: false` to the module calls fixes this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 218f4ae361)
2020-07-22 18:47:27 -04:00
Dimitri Savineau 056a4fe866 ceph-dashboard: update create/get rgw user tasks
Since [1] if a rgw user already exists then the radosgw-admin user create
command will return an error instead of modifying the current user.
We were already doing separated tasks for create and get operation but
only for multisite configuration but it's not enough.
Instead we should do the get task first and depending on the result
execute the create.
This commit also adds missing run_once and delegate_to statement.

[1] https://github.com/ceph/ceph/commit/269e9b9

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ac0f68ccf0)
2020-07-20 21:21:57 +02:00
Jan Fajerski 14e9672f00 lvm_setup: lookup device from inventory, default to /dev/sd* names
This fixes a long standing fail in ceph-volumes lvm test suite.
Otherwise the default behaviour should not change.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 1fe8e819f9)
2020-06-29 10:25:58 +02:00
Dimitri Savineau a99c94ea11 ceph-osd: remove ceph-osd-run.sh script
Since we only have one scenario since nautilus then we can just move
the container start command from ceph-osd-run.sh to the systemd unit
service.
As a result, the ceph-osd-run.sh.j2 template and the
ceph_osd_docker_run_script_path variable are removed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 829990e60d)
2020-06-23 17:35:01 +02:00
Guillaume Abrioux 8ef3fee41b ceph_volume: make zap function idempotent
This commit makes the zap function idempotent, especially when using
lvm_volumes variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1845668

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3f47236470)
2020-06-23 10:49:07 +02:00
Dimitri Savineau a97e24fee9 docker2podman: manage dashboard nodes
The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 252e78b4e4)
2020-06-03 13:20:24 -04:00