Commit Graph

6072 Commits (a95726c40900d30bb3728fc99e7487c65ea2d0a1)
 

Author SHA1 Message Date
Guillaume Abrioux f6b49f78a9 facts: fix deployments with different net interface names
Deployments when radosgws don't have the same names for
network interface.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2095605

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-05 10:01:40 +02:00
Guillaume Abrioux e223630cf0 tests: add yes_i_know=true in tox-shrink_osd.ini
main branch requires it. Otherwise the playbook won't run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-05 10:00:49 +02:00
Guillaume Abrioux 6623f34679 tests: drop shrink_osd from tox.ini
shrink_osd has its own tox config file (tox-shrink_osd.ini)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-05 10:00:49 +02:00
Guillaume Abrioux 2e823b117e common: fix a typo
s/of/or ..

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2099828#c25

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-07-03 07:22:21 +02:00
Guillaume Abrioux b18a1aa3ca backup-and-restore: fix check on 'target_node' variable
If the user doesn't pass a valid name (present in the inventory)
the playbook will fail like following:

```
fatal: [localhost -> {{ target_node }}]: FAILED! =>
  msg: |-
    The task includes an option with an undefined variable. The error was: "hostvars['10.70.46.40']" is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 09:05:19 +02:00
Guillaume Abrioux 848dd03fa6 backup-and-restore: fix check on 'mode' variable
Typical failure:

```
fatal: [localhost]: FAILED! =>
  msg: |-
    The conditional check 'mode not in ['backup', 'restore']' failed. The error was: error while evaluating conditional (mode not in ['backup', 'restore']): 'mode' is undefined
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 08:47:54 +02:00
Guillaume Abrioux 7d848fa19e Revert "upgrade: block upgrade when rgw multisite is active"
This reverts commit 51bc8cb636.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-29 06:55:31 +02:00
Guillaume Abrioux 564684b14a Revert "tests: use build main/6f765e2"
This reverts commit a568314eda.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-22 14:29:03 +02:00
Guillaume Abrioux 19fedfbac5 nfs: use repo from SIG
RPMs for nfs-ganesha aren't hosted anymore at https://download.ceph.com

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-22 01:17:20 +02:00
Guillaume Abrioux 3962cf6cb9 Revert "tests: temporarily disable nfs-ganesha"
This reverts commit 3f923d69ac.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-22 01:17:20 +02:00
Guillaume Abrioux 77fc0e105e add better clarification on ceph-ansible current status
Given that the project is still maintained for the time being, let's add a better
comment about that.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 13:02:15 +02:00
Guillaume Abrioux 11c0e93165 mergify: reindent file properly
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 11:28:40 +02:00
Guillaume Abrioux a571a9128e mergify: add backport configuration for stable-7.0
So we can ask mergify to do automatic backport in stable-7.0

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 11:28:40 +02:00
Guillaume Abrioux e28c486e52 backup-and-restore: fix a typo
Typo introduced during initial implementation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 10:51:54 +02:00
Guillaume Abrioux a568314eda tests: use build main/6f765e2
shaman is broken at the moment, this is the last working build on 'main'

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 08:09:10 +02:00
Guillaume Abrioux aa68b06c99 ansible: bump to ansible 2.12
Add required changes to support ansible 2.12

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 08:09:10 +02:00
Michael Wagner 4edaab5f4c fix(ceph-grafana): make dashboard download work again
This fixes the dashboard download for pacific and later.

Signed-off-by: Michael Wagner <mitch.wagna@gmail.com>
2022-06-14 14:36:23 +02:00
Guillaume Abrioux 8a5fb702f2 mergify: add restriction on backport command
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-14 10:02:19 +02:00
Guillaume Abrioux 5d9001b8e9 doc: update ansible version requirement
This updates the documentation regarding the Ansible version
requirement for `stable-6.0` branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-13 09:59:21 +02:00
Seena Fallah bb849a5586 ceph_pool: set target size ratio on both 'on' and 'warn' mode
when we set target_size_ratio to warn it means that the administrator wants to get suggestion from the mgr module but apply it manually when he/she wants. So it's in the same approach as 'on' mode just triggered by hand.
So there is no need to set pg_num when target_size_ratio is 'warn' and the mgr module will calculate the correct pg_num and the administrator will adjust it whenever he/she wants.

It is the same approach that was in #6471

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2022-05-31 14:37:04 +02:00
David Galloway 245133f9e0 whitelist->allowlist
https://github.com/tox-dev/tox/blob/master/docs/changelog.rst#v3180-2020-07-23

Signed-off-by: David Galloway <dgallowa@redhat.com>
2022-05-30 15:15:15 +02:00
David Galloway a698c07716 master->root
Signed-off-by: David Galloway <dgallowa@redhat.com>
2022-05-30 15:15:15 +02:00
David Galloway bcedff95bd master->main
Signed-off-by: David Galloway <dgallowa@redhat.com>
2022-05-30 15:15:15 +02:00
Guillaume Abrioux 41d62596fc cephadm_adopt: set autotune_memory_target_ratio
This adds a task that sets `autotune_memory_target_ratio` depending on the
value of `is_hci`.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028693

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-30 14:56:42 +02:00
Francesco Pantano 0e9b3902b0 Add ceph_infra tag to rolling_update
When the upgrade from Ceph 4 to 5 is performed in the OpenStack context,
ceph-ansible triggers the rolling_update playbook, which is supposed to
rollout new Ceph containers.  The ceph-infra role tries to take care
about firewall, ntp config and logrotate; however, TripleO manages them
through tripleo-heat-templates.  This patch just add an additional tag
to skip the ceph-infra role in the OpenStack context.

Closes: https://bugzilla.redhat.com/2090456
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
2022-05-27 15:05:16 +02:00
Guillaume Abrioux 5ab46f836d purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-23 10:37:42 +02:00
Guillaume Abrioux dee49779c9 tests: use latest version for pytest
with the bump of py version, let's use newer version for pytest.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux ef1d76f351 tests: install ansible.utils collection
otherwise, it's missing for external_clients and subset_update jobs

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux ff424b1c39 collections: install ansible.utils
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux c1649862a9 common: move to `ansible.utils.ipwrap`
ipwrap has moved to ansible.utils

see
db4920ebf6

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux 1e11f879f6 common: config rhcs tools repo on all nodes
Otherwise `cephadm` can't be installed during cephadm-adopt.yml
playbook execution.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2073480

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 22:51:31 +02:00
Guillaume Abrioux 6e2ebe857d cephadm-adopt: remove legacy directory after adoption
When this directory is left after the osd adoption, it leads to the following error:

```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```

this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.

Note: this doesn't fix the root cause, this is a workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-05-12 09:58:14 +02:00
Guillaume Abrioux ed0bba4d77 contrib: add a playbook
this playbook can backup or restore some ceph files.
(/etc/ceph, /var/lib/ceph, ...)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-28 22:57:27 +02:00
Guillaume Abrioux 8939441027 Warn about ceph-ansible deprecation
The official installer is now cephadm. stable-6.0 is the last
release of ceph-ansible such as we know it.

It will become a playbook intended for deploying minimal
Ceph cluster (mostly for development/testing purposes)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-27 18:32:43 +02:00
Guillaume Abrioux ef0455a0b1 tests: update vagrant_box default value
This updates the default value for the vagrant_box variable
in all vagrant_variables.yml files

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-27 15:01:56 +02:00
David Galloway 1c740c424a tests/setup: Use local mirror of centos 8 stream repo
The mirrors provided by CentOS' mirrorlists are super slow

Signed-off-by: David Galloway <dgallowa@redhat.com>
2022-04-21 11:43:39 +02:00
Ingo Ebel c5bb450f87 added AlmaLinux and Rocky for iscsi deploy
Signed-off-by: Ingo Ebel <ingo.ebel@desy.de>
2022-04-14 00:35:48 +02:00
Guillaume Abrioux 0f34cd16d8 dashboard: allow collecting stats from the host
This commit makes podman bindmount `/:/rootfs:ro` so the container can
collect data from the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028775

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-13 20:35:01 +02:00
pinotelio f288364c5c ceph-facts: fix ansible templating error for auto osd discovery
This commit fixes templating error that occurs when using auto osd discovery. Getting the len before converting the result to a list causes "object of type generator has no len()" error.

Signed-off-by: pinotelio <ahmadreza.mollapour@gmail.com>
2022-04-13 14:26:35 +02:00
Guillaume Abrioux 1cd1fa0560 validate: drop a check
Since the ISO install method removal, ceph-ansible isn't able
to detect wheter the user is deploying in a 'disconnected environment'.
By the way, given that ceph-ansible is available only for upgrading to RHCS 5,
this check doesn't make sense anymore, let's drop it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2062147

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-04-07 13:49:51 +02:00
insatomcat 58fdc03e63 do not update Debian cache when package-install is disabled
When deploying with --skip-tags=package-install (when there is no access to a repository), the playbook is still trying to update the package cache, which makes the playbook fail.
This change prevents the playbook to try to update the cache when the package-install tag is skipped.

Signed-off-by: Florent CARLI <florent.carli@rte-france.com>
2022-03-31 15:28:44 +02:00
Guillaume Abrioux 72e4654aae dashboard: always set `dashboard_server_addr`
When running the playbook with `--limit`, if the play targeted doesn't match
hosts present in the mgr group the playbook can fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-25 22:36:23 +01:00
Teoman ONAY f851d3232c Using another user than root for cephadm ssh connections fails
Fixes commit da42f3d139

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2048734

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-20 12:51:16 +01:00
Guillaume Abrioux 3e87df5e8f tests: update the system before deploying
Having a system up-to-date is usually a good idea.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-20 08:13:38 +01:00
Guillaume Abrioux 51bc8cb636 upgrade: block upgrade when rgw multisite is active
With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-20 08:13:38 +01:00
Teoman ONAY 7e8ce2567e Turn off SELinux separation for containers MON and RGW
Initially MONs and RGW binded /etc/pki/ca-trust/extracted using the :z flag
(introduced to solve an OSP TripleO issue on RHEL - #3638) but using
this flag prevents local services (like sssd) running on the host from accessing
the certificates/files in that folder.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-08 14:45:45 +01:00
Guillaume Abrioux 266b6e739c adopt: fix node labelling
When using group of group, the playbook will apply undesired
labels on nodes.
This commit fixes it by applying only the expected labels.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-03 15:52:00 +01:00
Teoman ONAY f8c6bba657 Add cluster custom name support
When using cluster custom names, cephadm commands are executed using
the default admin keyring name which fails.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-03 15:52:00 +01:00
Teoman ONAY da42f3d139 Enable user to change the account used for ssh connection
By default cephadm uses root account to connect remotely
to other nodes in the cluster. This change allows to choose
another account.
This commit also allows to use a dedicated subnet for cephadm mgmt.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-03-03 15:52:00 +01:00
Guillaume Abrioux 2f11982590 purge: ceph-crash purge fixes
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because in Ceph
Nautilus, ceph-crash doesn't handle `SIGTERM` signal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-03-02 10:08:35 +01:00