Commit Graph

775 Commits (371c25f0ef154ec72b386a869f0ea8aa0429898d)

Author SHA1 Message Date
Guillaume Abrioux 371c25f0ef adopt: fix bug in mon_ip_list set_fact
`default('{}')` must be before `| from_json`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f30767432b)
2022-02-09 12:40:09 +01:00
Guillaume Abrioux cb197575dd adopt: check for POOL_APP_NOT_ENABLED warning
This commit makes the cephadm-adopt playbook fail if the cluster
has the `POOL_APP_NOT_ENABLED` warning raised.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2040243

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddae06e1a2)
2022-02-09 12:40:09 +01:00
jowsiewski 3c38b1e410 Remove the remaining packages
Signed-off-by: jowsiewski <owsiewski@gmail.com>
(cherry picked from commit 1dfd195c7e)
2022-02-04 11:14:38 +01:00
Francesco Pantano 8f15179d57 Add with_pkg tag on package related tasks
In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO skipping
both the repositories and packages installation.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 12dd8b5df1)
2022-02-04 09:52:07 +01:00
Guillaume Abrioux fa281c7538 adopt: create nfs exports at the user level
The current implementation is wrong.
ceph-ansible lists all existing buckets and try to create
an export for each of them.
Instead, it's easier to create the export at the user level.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7f517cdd22)
2022-01-29 15:25:46 +01:00
Guillaume Abrioux 17d8351971 cephadm-adopt: use named args in rgw export creation
In order to avoid breaking changes, let's use named argument
instead of positional argument syntax in the command line
used to create rgw export.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aee1f06497)
2022-01-06 16:52:05 +01:00
Guillaume Abrioux e676502c8f purge: remove ceph directories on client nodes
Otherwise any ceph directories are left over on client nodes
after the purge.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 20035852a4)
2022-01-06 10:33:31 +01:00
Guillaume Abrioux 7791fac222 update: speed up client play
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 817c03bc0e)
2021-12-15 13:48:14 +01:00
Guillaume Abrioux 8a32576d20 cephadm-adopt: ensure /etc/ceph is present on monitoring node
When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ece59b41d)
2021-12-07 23:09:42 +01:00
Guillaume Abrioux b16d9fc289 cephadm-adopt: bindmount /var/lib/ceph with 'ro'
When collocating osds with iscsigw daemons, cephadm bindmounts the
following:

```
-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config
```

this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z`

since 'ro' is enough in this playbook, let's replace the ':z' option on
this bindmount with ':ro'

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c4fdf956bd)
2021-11-30 21:04:31 +01:00
Guillaume Abrioux 1628347253 adopt: fix ceph_origin and ceph_repository defaults
This is overriding those variables because the precedence at the 'block
var' level is greater than the group_vars/host_vars.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5ea2ece99)
2021-11-30 13:02:24 +01:00
Guillaume Abrioux 6bdaa9e3d5 cephadm: support adding hosts with ipv6
The current implementation doesn't support adding hosts when using ipv6
addresses.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4f2c2af9b4)
2021-11-08 10:36:14 +01:00
Guillaume Abrioux 0097cb09f1 cephadm: use public_network when adding hosts
When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f34531304)
2021-11-08 10:36:14 +01:00
Dimitri Savineau 041e8b0eaa cephadm-adopt: remove logrotate configuration
cephadm uses its own logrotate configuration file so ceph-ansible needs
to remove that custom file during the cephadm-adopt playbook.

Closes: #6944

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c41241244e)
2021-11-03 11:51:03 +01:00
Guillaume Abrioux 19dadc98da update: move a set_fact
ceph-facts roles makes decisions based on the fact `rolling_update` so
it must be called before we run this role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5edcc4214)
2021-11-03 11:50:27 +01:00
Guillaume Abrioux 8f648269ec update: support --limit on monitor nodes
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:

```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 82eee4303b)
2021-11-03 08:48:38 +01:00
Guillaume Abrioux a752edbd29 Revert "update: block upgrade when nfs+rgw is deployed"
This reverts commit 93f1765259.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2017508

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-28 08:13:05 +02:00
Guillaume Abrioux f7d67f7669 rolling_update: modify default health_osd_check_*
let's do more retries with a shorter delay.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 50a21d695e)
2021-10-25 21:08:44 +02:00
Guillaume Abrioux e5ef104c57 adopt: fix rbd mirror adoption
The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore.
It means the keyrings and ceph config file aren't available after the
migration.
The idea here is to remove the current rbd mirror peer and add it back
to the mon config store so we aren't bound to the /etc/ceph directory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9c794aa9bc)
2021-10-25 20:14:07 +02:00
Guillaume Abrioux b1bdb708d0 adopt: use mgr/nfs volume
use the mgr 'nfs' module to recreate nfs exports.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4257410dcd)
2021-10-25 17:16:15 +02:00
Guillaume Abrioux efc6979db5 rolling_update: fix pre and post osd upgrade play
when using --limit osds, the play before and after osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`
using `hosts: "{{ osds_group_name | default('osds') }}" with
`delegate_to` to the first monitor addresses this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc9f87c45f)
2021-10-25 15:33:18 +02:00
Guillaume Abrioux ca25ebb323 update: support upgrading a subset of nodes
It can be useful in a large cluster deployment to split the upgrade and
only upgrade a group of nodes at a time.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5cf9db2b0)
2021-10-25 15:33:18 +02:00
Per Abildgaard Toft 3edc6ac5f2 shrink-osd: fix regression because of a wrong regex
968891f449 introduced a regression.
The regex is wrong because it doesn't allow to shrink osds with id
greater than 9

Fixes: #6950

Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
(cherry picked from commit 84118a3063)
2021-10-21 12:38:25 +02:00
Seena Fallah fde6354dcd cephadm: set ssh configs at bootstrap step
Add support ssh_user and ssh_config to cephadm bootstrap plugin

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ae6be71b08)
2021-10-15 16:15:38 +02:00
Guillaume Abrioux 86ab9e44b6 shrink-osd: check osd id format
This adds a check early in order to ensure the format of osd ids passed
is correct.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 968891f449)
2021-10-15 14:35:23 +02:00
Seena Fallah 191ec4f40f cephadm: install cephadm from repository
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 5822936252)
2021-10-13 08:10:05 +02:00
Seena Fallah 7b19748304 cephadm-adopt: configure repository for cephadm installation
Configure repository for cephadm installation and use package install in both containerized and non containerized deployment

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 339212a7c6)
2021-10-13 08:10:05 +02:00
Francesco Pantano 642a83dc6b Add ceph_nfs_adopt tag to the cephadm-adopt playbook
There are existing OpenStack scenarios where nfs is still not managed
by cephadm. For this reason sometimes is useful skip the nfs part of
the adoption playbook and leave this daemon unmanaged.
The purpose of this patch is providing a tag to enable the OpenStack
operators to skip this playbook section.

Closes: https://bugzilla.redhat.com/2009212
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit b7299f258b)
2021-10-01 23:32:33 +02:00
Seena Fallah c3fe1a6206 cephadm: use cephadm_ssh_user for ssh user
Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 0b78faa723)
2021-10-01 23:31:39 +02:00
Guillaume Abrioux 4b5a0c0443 cephadm: add admin label on mon nodes
This is needed if you want a copy of the admin keyring on the admin
nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b555f1d1cd)
2021-10-01 23:23:06 +02:00
Guillaume Abrioux d196881ebb cephadm-adopt: add no_log: true
Let's add a `no_log: true` on the `cephadm registry-login` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0a3b916ee7)
2021-09-28 21:15:02 +02:00
Guillaume Abrioux a053adbe84 adopt: stop iscsi services in the first place
If old containers are still running, it can make tcmu-runner process
unable to open devices and there's nothing else to do than restarting
the container.

Also, as per discussion with iscsi experts, iscsi should be migrated before
OSDs. (the client should be closed before the server)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d12efa1ab4)
2021-09-28 18:46:49 +02:00
Seena Fallah cb5a675e49 cephadm-adopt: use cephadm_ssh_user for ssh user
Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 67389d08d4)
2021-09-13 16:26:24 +02:00
Daniel Pivonka 969e41fa2e cephadm-adopt: set cephadm registry login info
registry login info needs to be stored in cluster for cephadm and future hosts

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103
Signed-off-by: Daniel Pivonka <dpivonka@redhat.com>
(cherry picked from commit 1c50dc29cf)
2021-09-13 16:18:40 +02:00
Seena Fallah 432ab37c6b purge: add remove_docker tag
This can help to skip docker removal tasks

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ff39c8d70b)
2021-09-09 16:41:32 +02:00
Seena Fallah 0897c08518 purge: add container_binary needed for zap osds
`container_binary` isn't set anymore in the purge osd play because of a
regression introduced by 60aa70a.
The CI didn't catch it because the play purging node-exporter sets this
variable for all nodes before we run the purge osd play.

This commit fixes this regression.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit a51ce767ca)
2021-09-09 14:40:30 +02:00
Dimitri Savineau ac6604ab61 purge-dashboard: remove cid files
This adds the service cid file cleanup as supported in the classic purge
playbook since b9dd253

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cddc23f511)
2021-09-08 12:05:22 -04:00
Dimitri Savineau ac5353a2d8 cephadm-adopt: fix orch host add with FQDN
When a node is configured with FQDN as the hostname value then the
`ceph orch host add` command will fail because the `ansible_hostname` used
by that command contains the short hostname which won't match the current
hostname (FQDN)
Instead we can use the ansible_nodename fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1997083

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2630f8d47a)
2021-08-26 17:10:55 -04:00
Dimitri Savineau e3e849378e cephadm-adopt: remove ceph-nfs.target
This systemd target doesn't exist at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8ba6101bbb)
2021-08-18 15:29:03 -04:00
Guillaume Abrioux d7311aeefc containers: introduce target systemd unit
This adds ceph-*.target systemd unit files support for containerized
deployments.
This also fixes a regression introduced by PR #6719 (rgw and nfs systemd
units not getting purged)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 09ef465f62)
2021-08-18 13:42:50 -04:00
Guillaume Abrioux 056b18aa0e update: gather facts only one time
this play doesn't need to gather facts from localhost

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c14e9114ba)
2021-08-17 15:31:34 -04:00
VasishtaShastry 6ed0919796 Fixes typo in rgw-add-users-buckets playbook
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 478d9fdcb6)
2021-08-09 14:31:42 -04:00
Guillaume Abrioux 6e9cf80747 adopt: import rgw ssl certificate into kv store
Without this, when rgw is managed by cephadm, it fails to start because
the ssl certificate isn't present in the kv store.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 930fc4c850)
2021-08-05 14:47:47 -04:00
Dimitri Savineau 2377da8f9b infra: use dedicated variables for balancer status
The balancer status is registered during the cephadm-adopt, rolling_update
and swith2container playbooks. But it is also used in the ceph-handler role
which is included in those playbooks too.
Even if the ceph-handler tasks are skipped for rolling_update and
switch2container, the balancer_status variable is erased with the skip task
result.

play1:
  register: balancer_status
play2:
  register: balancer_status <-- skipped
play3:
  when: (balancer_status.stdout | from_json)['active'] | bool

This leads to issue like:

The conditional check '(balancer_status.stdout | from_json)['active'] | bool'
failed. The error was: Unexpected templating type error occurred on
({% if (balancer_status.stdout | from_json)['active'] | bool %} True
{% else %} False {% endif %}): expected string or buffer.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 386661699b)
2021-08-04 11:43:47 -04:00
Dimitri Savineau 31cc8bd2aa osds: use osd pool ls instead of osd dump command
The ceph osd pool ls detail command is a subset of the ceph osd dump
command.

$ ceph osd dump --format json|wc -c
10117
$ ceph osd pool ls detail --format json|wc -c
4740

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 06471a4b82)
2021-08-03 13:57:14 -04:00
Dimitri Savineau 7f5b986e01 rolling_update: get ceph version when mons exist
eec3878 introduced a regression for upgrade scenarios where there's no
monitor nodes at all (like ganesha standalone, external clients, etc..)

TASK [get the ceph release being deployed] ************************************
task path: infrastructure-playbooks/rolling_update.yml:121
Thursday 29 July 2021  15:55:29 +0000 (0:00:00.484)       0:00:15.802 *********
fatal: [client0]: FAILED! =>
  msg: '''dict object'' has no attribute ''mons'''

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e87a47cf0c)
2021-08-03 12:17:47 -04:00
Benoît Knecht c8348ab0d9 infrastructure-playbooks: Get Ceph info in check mode
In the `set osd flags` block, run the Ceph commands that gather information
from the cluster (and don't make any changes to it) even when running in check
mode.

This allows the tasks that depend on the variables set by those tasks to
succeed in check mode.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit d7653dca95)
2021-08-02 15:53:49 +02:00
Guillaume Abrioux 76f68843e5 update: check the ceph release
Check early which Ceph release is going to be deployed and fail if it
doesn't correspond to the ceph-ansible version being used.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit eec38784ec)
2021-07-26 13:35:30 -04:00
Guillaume Abrioux 036b03a7bb purge: support osd_auto_discovery
This adds a task that zaps by osd id so we can support the scenario
where osds were deployed with `osd_auto_discovery` is true.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4144074a50)
2021-07-22 12:43:57 -04:00
Guillaume Abrioux b36e8ec935 purge: merge playbooks
This refactor merges the two playbooks so we only have to maintain 1
playbook.
(Symlink the old purge-container-cluster.yml playbook for backward
 compatibility).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 17cd83bf3a)
2021-07-22 12:43:57 -04:00