Commit Graph

50 Commits (32c2a18f8c803e6b670ae20763e712c0308672e5)

Author SHA1 Message Date
Guillaume Abrioux e2cbffc1c7 drop variables from 'hosts' key
The linter complains about that.
It doesn't work anyway so it doesn't make sense to leave these variables
here.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-02-14 09:54:13 +01:00
Teoman ONAY 49da07df68 shrink-osd fails when the OSD container is stopped
ceph-volume simple scan cannot be executed as it is meant to be
run inside the OSD container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2164414

Signed-off-by: Teoman ONAY <tonay@ibm.com>
2023-03-15 16:00:22 +01:00
Teoman ONAY 64e08f2c0b Refresh /etc/ceph/osd json files content before zapping the disks
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035

Signed-off-by: Teoman ONAY <tonay@redhat.com>
2022-07-11 09:14:40 +02:00
Guillaume Abrioux aa68b06c99 ansible: bump to ansible 2.12
Add required changes to support ansible 2.12

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2022-06-15 08:09:10 +02:00
Per Abildgaard Toft 84118a3063 shrink-osd: fix regression because of a wrong regex
968891f449 introduced a regression.
The regex is wrong because it doesn't allow to shrink osds with id
greater than 9

Fixes: #6950

Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
2021-10-21 10:01:23 +02:00
Guillaume Abrioux 968891f449 shrink-osd: check osd id format
This adds a check early in order to ensure the format of osd ids passed
is correct.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2021-10-12 18:26:18 +02:00
Alex Schultz a7f2fa73e6 Use ansible_facts
It has come to our attention that using ansible_* vars that are
populated with INJECT_FACTS_AS_VARS=True is not very performant.  In
order to be able to support setting that to off, we need to update the
references to use ansible_facts[<thing>] instead of ansible_<thing>.

Related: ansible#73654
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1935406
Signed-off-by: Alex Schultz <aschultz@redhat.com>
2021-03-08 20:54:02 +01:00
Dimitri Savineau cf7345f143 consume ceph_volume module when possible
We should always use the ceph_volume ansible module when possible.
This patch replace the ceph-volume inventory and lvm {list,zap} commands
called via the command/shell modules by the corresponding call with the
ceph_volume module.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-12-01 17:54:10 +01:00
Dimitri Savineau 0b5b1de963 library: add ceph_osd module
This adds ceph_osd ansible module for replacing the command module
usage with the ceph osd destroy/down/in/out/purge/rm commands.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-11-30 16:53:45 +01:00
Guillaume Abrioux 9fba6eecfa lint: variables should have spaces before and after
Fix ansible lint 206 error:

[206] Variables should have spaces before and after: {{ var_name }}

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-11-23 08:33:47 +01:00
Guillaume Abrioux 5450de58b3 lint: commands should not change things
Fix ansible lint 301 error:

[301] Commands should not change things if nothing needs doing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-11-23 08:33:47 +01:00
Dimitri Savineau 50104650e7 add missing boolean filter
Otherwise this will generate an ansible warning about the missing
filter.

[DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour
will go away and you might need to add |bool to the expression in the
future.
Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will
be removed in version 2.12.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-09-28 20:45:01 +02:00
Guillaume Abrioux 448cc280b7 common: don't enable debug log on ceph-volume calls by default
ceph-volume can generate large logs at some point.

debug logs by definition should be enabled only when debugging.

Let's make it customizable with a variable which is set to `False` by
default.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-11 15:03:20 +02:00
Benoît Knecht fe8fbd3ee2 shrink-osd: various fixes
This handles missing /etc/ceph/osd, by ensuring we actually found files in
`/etc/ceph/osd` before trying to slurp their content.

This also add a missing `| default(False)` to avoid fowlloing error:

```
fatal: [ceph01]: FAILED! =>
  msg: |-
    The conditional check 'ceph_osd_data_json[item.2]['encrypted'] | bool' failed. The error was: error while evaluating conditional (ceph_osd_data_json[item.2]['encrypted'] | bool): 'dict object' has no attribute 'encrypted'
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1862416

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-08-05 01:30:57 +02:00
Guillaume Abrioux 8933bfde33 shrink_osd: remove osd data directory
Otherwise it leaves an empty directory.
When shrinking and redeploying multiple OSDs you have no guarantee it
will reuse the same osd id.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-08-03 14:46:56 +02:00
Dimitri Savineau 64701437de container: remove ulimit nofile parameter
Since Ceph Octopus is python3 only we don't need to specify the max open
files anymore with the container engine.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-30 09:54:23 +02:00
Dimitri Savineau 08ac2e3034 shrink: don't use localhost node
The ceph-facts are running on localhost so if this node is using a
different OS/release that the ceph node we can have a mismatch between
docker/podman container binary.
This commit also reduces the scope of the ceph-facts role because we only
need the container_binary tasks.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2020-03-03 10:32:15 +01:00
Guillaume Abrioux 1de2bf9991 shrink-osd: support shrinking ceph-disk prepared osds
This commit adds the ceph-disk prepared osds support

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796453

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-02-26 11:45:41 -05:00
Guillaume Abrioux 55970b18f1 shrink-osd: don't run ceph-facts entirely
We need to call ceph-facts only for setting `container_binary`.
Since this task has been isolated we can use `tasks_from` to only execute the
needed task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2020-02-26 11:45:41 -05:00
Benoît Knecht 8b3df4e418 infrastructure-playbooks: Run shrink-osd tasks on monitor
Instead of running shring-osd tasks on localhost and delegating most of
them to the first monitor, run all of them on the first monitor
directly.

This has the added advantage of becoming root on the monitor only, not
on localhost.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2020-02-19 11:16:30 -05:00
Guillaume Abrioux 6d9ca6b05b shrink-osd: support fqdn in inventory
When using fqdn in inventory, that playbook fails because of some tasks
using the result of ceph osd tree (which returns shortname) to get
some datas in hostvars[].

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779021

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-12-09 10:52:38 -05:00
L3D ab54fe20ec ansible: use 'bool' filter on boolean conditionals
By running ceph-ansible there are a lot ``[DEPRECATION WARNING]`` like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

Now appended ``| bool`` on a lot of the affected variables.

Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` *(with spaces at the pipe)*.

Closes: #4022

Signed-off-by: L3D <l3d@c3woc.de>
2019-06-06 10:21:17 +02:00
Guillaume Abrioux e74d80e72f rename docker_exec_cmd variable
This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-05-16 16:39:13 +02:00
wumingqiao 5320aa11c4 shrink_osd: mark all osd(s) out in one command
Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
2019-05-15 16:04:28 +02:00
letterwuyu d57f6fcdc6 Fix comment content
Signed-off-by: lishuhao letterwuyu@gmail.com
2019-05-07 10:54:44 +02:00
Rishabh Dave 739a662c80 improve coding style
Keywords requiring only one item shouldn't express it by creating a
list with single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2019-04-23 15:37:07 +02:00
Noah Watkins 9a43674d2e shrink_osd: use cv zap by fsid to remove parts/lvs
Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1569413
  https://bugzilla.redhat.com/show_bug.cgi?id=1572933

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2019-01-24 16:34:13 +01:00
Guillaume Abrioux 0eb56e36f8 introduce new role ceph-facts
sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speedup
the playbook.

Closes: #3282

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-12-12 11:18:01 +01:00
Rishabh Dave 2fb12ae554 use pre_tasks and post_tasks when necessary
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-05 08:17:10 +00:00
Rishabh Dave e4f0af2b78 don't use private option for import_role
Since sharing variables amongst roles has been made default since
Ansible 2.6, private option has been deprecated; so stop using it.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-12-04 23:45:59 +00:00
Sébastien Han e5d5dffeb5 shrink-osd: add missing CEPH_BINARY
We need to add the right binary to do the docker exec.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-11-27 16:47:40 +00:00
Noah Watkins b848d2be4c don't use "role" or "roles" to include roles
see 3f62fc585f

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-11-08 17:45:37 +01:00
Noah Watkins f5dacbf7de Add a ceph-volume aware shrink-osd playbook
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-11-08 17:45:37 +01:00
Noah Watkins 0782cfc546 Rename ceph-disk version of shrink-osd playbook
This will be replaced by a ceph-volume aware verison.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-11-08 17:45:37 +01:00
Rishabh Dave 3f62fc585f don't use "role" or "roles" to include roles
Since import_role and include_role are more readable, explicit (about
the nature of inclusion) and flexible (allows placibf inclusion
anywhere) amongst the tasks, use them instead of using roles or role
keyword. Besides, these keywords also allow more arguments.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-10-31 09:38:59 +01:00
Guillaume Abrioux 57f0b6a476 shrink-osd: follow up on 36fb3cde
- Adds loop in bash to satisfy the 1:n relation between `osd_hosts` and the
different device lists.
- Fixes some container name which were using the host hostname instead
of the actual container one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-09-18 07:27:41 +00:00
Sébastien Han 735e1917db shrink-osd: purge dedicated devices
Once the OSD is destroyed we also have to purge the associated devices,
this means purging journal, db , wal partitions too.

This now works for container and non-container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572933
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-09-18 07:27:41 +00:00
Guillaume Abrioux 4159326a18 shrink-osd: fix purge osd on containerized deployment
ce1dd8d introduced the purge osd on containers but it was incorrect.

`resolve parent device` and `zap ceph osd disks` tasks must be delegated to
their respective OSD nodes.
Indeed, they were run on the ansible node, it means it was trying to
resolve parent devices from this node where it should be done on OSD
nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1612095

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-09-13 18:14:01 +02:00
Sébastien Han ce1dd8d2b3 shrink-osd: purge osd on containerized deployment
Prior to this commit we were only stopping the container, but now we
also purge the devices.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572933
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-18 14:26:22 +00:00
Sébastien Han 66c1ea8cd5 shrink-osd: ability to shrink NVMe drives
Now if the service name contains nvme we know we need to remove the last
2 character instead of 1.

If nvme then osd_to_kill_disks is nvme0n1, we need nvme0
If ssd or hdd then osd_to_kill_disks is sda1, we need sda

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1561456
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:08:29 +02:00
Sébastien Han 2fb4981ca9 shrink-osd: admin key not needed for container shrink
Also do some clean

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 00:20:43 +02:00
Sébastien Han 6bac613611 shrink: support for container
We can now shrink mon and osds on containerized deployment.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492115
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-20 16:25:07 +02:00
Sébastien Han 3031e51778 shrink-osd: fix when multiple osds
The loop was being built properly so we were always getting the last
item as osd host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1490355
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-13 15:20:11 -06:00
Sébastien Han 298a63c437 shrink mon and osd
Rework shrinking a monitor and an OSD playbook. Also adding test
scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1366807
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-01 19:12:00 +02:00
Sébastien Han fdc6aebd62 infrastructure-playbooks: update with ceph-defaults roles
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-08-02 17:12:20 +02:00
WingkaiHo 0b9f322ca0 improve shrink-osd.yml can shrink osd when disk damage 2017-04-27 10:26:26 +08:00
chenyanshan 7eab2529ed this patch fix the regex pattern in infrastructure-playbooks/shrink-osd.yml when the osd's pid num is bigger than 9999
Signed-off-by: chenyanshan <yanshanchen@139.com>
2016-12-05 13:40:38 +08:00
Guillaume Abrioux a680707f6f All `include_vars` need to have `*.yml`, `*.yaml` or `*.json` extension.
As introduced in the following PR:
- https://github.com/ansible/ansible/pull/17207
we need to refactor our code.
2016-11-24 14:03:49 +01:00
Sébastien Han a2fcd222d2 moving to ansible v2.2 compatibility
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Julien Francoz julien@francoz.net
2016-11-04 10:09:38 +01:00
Sébastien Han fde819d1a8 create a directory for infrastructure playbooks
Since we have a couple of infrastructure related playbooks
(additionnally to the roles we are using to deploy Ceph), it makes sense
to have them located in a separate directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
2016-08-17 11:53:34 +02:00