Commit Graph

4858 Commits (f59dad620d43a740466a26f7fb8eba1ffc5ba0af)
 

Author SHA1 Message Date
Guillaume Abrioux f59dad620d playbook: add missing tags
Add missing tag on ceph-handler role call.
Otherwise, we can't use `--tags='ceph_update_config'` for updating the
ceph configuration file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-04 17:01:02 +02:00
Guillaume Abrioux ccc11cfc93 handler: followup on #4519
This commit adds some missing `| bool` filters.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-04 10:49:15 -04:00
Dimitri Savineau dd526cfe4e ceph-dashboard: add cluster parameter to ceph cmd
The ceph dashboard tasks didn't use the cluster option if the cluster
name isn't the default value.

Closes: #4529

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-04 16:10:22 +02:00
Guillaume Abrioux 411bd07d54 handlers: refact osd handler
This commit merges the two restart tasks into a single one, this way
it's one task less to notify.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-04 09:42:20 -04:00
Dimitri Savineau cf47594b47 dashboard: remove useless block section
The block section were used with the dashboard_enabled condition when
the code was included in the main playbooks.
Because this condition isn't present in the dashboard playbook anymore
we can remove the block section.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-04 07:42:05 +02:00
Dimitri Savineau 0346871fb5 ceph-handler: don't restart all OSDs with limit
When using the ansible --limit option on one or few OSD nodes and if the
handler is triggered then we will restart the OSD service on all OSDs
nodes instead of the hosts limited by the limit value.
Even if the play is limited by the --limit value we are using all OSD
nodes from the OSD group.

  with_items: '{{ groups[osd_group_name] }}'

Instead we should iterate only on the nodes present in both OSD group and
limit list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-03 14:52:27 -04:00
Dimitri Savineau 780cf36a59 ceph-facts: fix _radosgw_address with block
e695efc introduced a regression in the _radosgw_address fact when using
the radosgw_address_block variable.
There's no item there because we don't use the items lookup. This is
only used for _monitor_address with monitor_address_block.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-03 16:15:45 +02:00
Guillaume Abrioux 778c51a0ff Vagrantfile: support more than 9 nodes per daemon type
because of the current ip address assignation, it's not possible to
deploy more than 9 nodes per daemon type.
This commit refact a bit and allows us to get around this limitation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-02 13:10:52 +02:00
Guillaume Abrioux 9bad239d77 common: improve keyrings generation
There is no need to get n * number of nodes the different keyrings.
Adding a `run_once: true` here avoid running a ceph command too many
times which could be impacting large cluster deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-02 13:09:50 +02:00
Dimitri Savineau ec3b687dc4 ceph-facts: use --admin-daemon to get fsid
During the rolling_update scenario, the fsid value is retrieve from the
current ceph cluster configuration via the ceph daemon config command.
This command tries first to resolve the admin socket path via the
ceph-conf command.
Unfortunately this command won't work if you have a duplicate key in the
ceph configuration even if it only produces a warning. As a result the
task will fail.

Can't get admin socket path: unable to get conf option admin_socket for
mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined

Instead of using ceph daemon we can use the --admin-daemon option
because we already know what the socket admin path value based on the
ceph cluster and mon hostname values.

Closes: #4492

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-02 10:07:13 +02:00
Guillaume Abrioux 272d16e101 validate: fix gpt header check
Check for gpt header when osd scenario is lvm or lvm batch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 11:46:17 -04:00
Guillaume Abrioux ed8616aa66 rbdmirror: rename a file
rename this file to be more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux e08194dd67 rgw: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux c69816c6b7 rbdmirror: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux 4636f3f7e2 iscsigw: refact tasks directory layout
This commit moves containerized deployment related files to `./tasks/
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux f2017dcda2 upgrade: add an infra playbook to migrate systemd units to podman
this commit adds a new playbook to force systemd units for containers to
use podman instead of docker.
This is needed in the rhel8 upgrade context so after the base OS is upgraded
containers can be started using podman.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Guillaume Abrioux bd64167469 container: isolate systemd tasks
This commit isolates the systemd unit files generation for containers into
separate yml files in order to be able importing each corresponding roles
without playing all tasks.
This is needed so we can run ceph-ansible to render systemd unit files
so they call podman instead of docker.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-10-01 10:27:51 -04:00
Dimitri Savineau 20b1a464ec ceph-facts: update external grafana fact filter
e695efc hasn't been updated with the changes introduced in 9bb11c7 so
the ips_in_ranges filter isn't used for an external grafana instance.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-10-01 10:47:14 +02:00
Guillaume Abrioux 669b337ee3 stalebot: decrease daysUntilStale parameter
30d + 7d for closing an issue that is inactive sounds enough.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-30 14:38:26 +02:00
Guillaume Abrioux 89fbbab616 repo: enable stalebot
enable stale bot in order to close PR with no recent activity.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-30 12:07:03 +02:00
Guillaume Abrioux 01f6dd52b3 tests: remove debug log verbosity
This was added for debugging purpose.
It's generating very large log output, let's remove this now.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-28 11:20:49 +02:00
Guillaume Abrioux e4444d29e0 Revert "ceph-common: install only necesarry ceph-* packages on debian"
This reverts commit 58b27ef0b3.
This is breaking debian based OS deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-28 08:04:12 +02:00
Boris Ranto b96c6da832 ceph-defaults: Change the default prometheus port
The old default prometheus port 9090 clashes with cockpit in rhel 8. The
9090 port is reserved for web service administration of machines. We
should change the default to something that does not clash with other
ports used in rhel 8, at least by default. The port 9092 seems like a
good choice in my testing.

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-09-28 04:40:42 +02:00
Boris Ranto f067e53c6e rhcs_edits: Fix ose container versions
For some reason, the floating tags were changed from v4.1 to just 4.1
for these images when switching ti registry.redhat.io. We should fix
the locations.

We are also changing the downstream grafana image to the one we used for
rhcs 3. The ose grafana image lacks the support for a lot of features
that we need (e.g. vonage and piechart grafana plugins, grafana-cli
binary and others).

Signed-off-by: Boris Ranto <branto@redhat.com>
2019-09-28 04:40:42 +02:00
Guillaume Abrioux d84160a170 update: reset mon_host after mons upgrade
after all mon are upgraded, let's reset mon_host which is used in the
rest of the playbook for setting `container_exec_cmd` so we are sure to
use the right value.

Typical error:

```
failed: [mds0 -> mon0] (item={u'path': u'/var/lib/ceph/bootstrap-mds/ceph.keyring', u'name': u'client.bootstrap-mds', u'copy_key': True}) => changed=true
  ansible_loop_var: item
  cmd:
  - docker
  - exec
  - ceph-mon-mon2
  - ceph
  - --cluster
  - ceph
  - auth
  - get
  - client.bootstrap-mds
  delta: '0:00:00.016294'
  end: '2019-09-27 13:54:58.828835'
  item:
    copy_key: true
    name: client.bootstrap-mds
    path: /var/lib/ceph/bootstrap-mds/ceph.keyring
  msg: non-zero return code
  rc: 1
  start: '2019-09-27 13:54:58.812541'
  stderr: 'Error response from daemon: No such container: ceph-mon-mon2'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-28 04:37:47 +02:00
Johannes Kastl a1811ca097 install python-xml on SUSE/openSUSE only if python2 is installed
raw_install_python.yml: on SUSE/openSUSE, install python-xml package only
if python2 is installed already

Background:
On SLES 15.x / openSUSE Leap 15.x, the python2 package `python-base` provides
/usr/bin/python, while python3 only provides /usr/bin/python3.

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-09-27 14:19:32 +02:00
Johannes Kastl 5cf22e9b31 move python-xml to raw_install_python.yml
The package python-xml is needed for ansible's zypper module to interact with
the zypper package management tool.

roles/ceph-defaults/defaults/main.yml:
Remove python-xml from variable suse_package_dependencies to only
install python-xml on SUSE/openSUSE if python is not found.
raw_install_python.yml already contains all the logic needed to check
if there is a valid python installation, so this is better suited there.

openSUSE Leap 15.x / SLES 15.x do no longer have /usr/bin/python,
only /usr/bin/python3, which already contains the xml module, so
nothing needs to be installed in that case.

Signed-off-by: Johannes Kastl <kastl@b1-systems.de>
2019-09-27 14:19:32 +02:00
Harald Jensås e695efcaf7 Replace ipaddr() with ips_in_ranges()
This change implements a filter_plugin that is used in the
ceph-facts, ceph-validate roles and infrastucture-playbooks.
The new filter plugin will return a list of all IP address
that reside in any one of the given IP ranges. The new filter
replaces the use of the ipaddr filter.

ceph.conf already support a comma separated list of CIDRs
for the public_network and cluster_network options.

Changes: [1] and [2] introduced a regression in ceph-ansible
where public_network can no longer be a comma separated list
of cidrs.

With this change a comma separated list of subnet CIDRs can
also be used for monitor_address_block and radosgw_address_block.

[1] commit: d67230b2a2
[2] commit: 20e4852888

Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030
Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283

Closes: #4333
Please backport to stable-4.0

Signed-off-by: Harald Jensås <hjensas@redhat.com>
2019-09-27 10:11:53 +02:00
Dimitri Savineau 74ab59c4f3 ceph-dashboard: Add prometheus api host
The set-prometheus-api-host ceph dashboard subcommand was missing in
ceph-dashboard role. Only grafana and alermanager were present.
This commit also remove the trailing slash at the end of the host/url
values.

Closes: #4453

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-27 09:16:12 +02:00
Anthony Rusdi 58b27ef0b3 ceph-common: install only necesarry ceph-* packages on debian
Currently, ceph package only an meta-package that do not contain
actual software, but simply depend on other packages. It's been few
release since debian stretch (official), ubuntu bionic (official),
ubuntu uca repository and upstream debian-jewel.
As we only support nautilus and higher release for master branch,
I propose to drop ceph package and use ceph-base instead for repository
model other than rhcs so debian ceph install will be more minimalis.

Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
2019-09-27 01:11:22 +02:00
Dimitri Savineau ca77d7bd31 ceph-nfs: Allow to configure SecType value
Depending on the infrastruture (w/o kerberos auth) then the SecType
value could be different.
Currently this value is hardcoded in the NFS Ganesha template. Instead
we can use a variable.
The default value is still the same to avoid breaking the backward
compatibility.

Closes: #4459

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-27 00:33:18 +02:00
liuxu 195f70897c dashboard: add grafana dashboard support on Debian based OS
download grafana dashboard files from github when running on Debian based OS

Signed-off-by: liuxu <liuxu623@gmail.com>
2019-09-26 18:49:56 +02:00
fmount 9bb11c7b2a Inject ceph grafana dashboard layouts
This change just adds the task to inject from the
ceph dashboard mgr module the required layouts
to show all the cluster metrics on the grafana
instance.
Since we're now able to push grafana layouts through
the ceph mgr module command, the dashboards configuration
template is no longer needed on containerized environments.
This commit also fixes the Vagrantfile IP static assigment
in the grafana section because it generates an issue (it's
the same of the mgr instance).
Finally, considering some deployments that use an external
grafana server instance, we reworked the 'grafana_server_addr'
assignment to address these requirements.

Signed-off-by: fmount <fpantano@redhat.com>
2019-09-26 11:12:20 -04:00
Sam Choraria 7cc9f93680 rolling_update.yml: force ceph-volume scan on osds
The rolling_update.yml playbook fails when scanning ceph-disk osds while
deploying nautilus. The --force flag is required to scan existing osds
and rewrite their json metadata.

Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk>
2019-09-26 16:53:25 +02:00
Guillaume Abrioux 167737dd3d iscsigw: install python-requests
Typical error at rbd-target-api startup:

```
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: Traceback (most recent call last):
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/bin/rbd-target-api", line 39, in <module>
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: from gwcli.utils import (APIRequest, valid_gateway, valid_client,
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: File "/usr/lib/python2.7/site-packages/gwcli/utils.py", line 1, in <module>
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: import requests
Sep 25 12:12:29 iscsi-gw0 rbd-target-api[9959]: ImportError: No module named requests
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 006df148d0 tests: pin jinja2 version
ensure we get the latest jinja2 version.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 5bb6a4da42 tests: set copy_admin_key at group_vars level
setting it at extra vars level prevent from setting it per node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux ab370b6ad8 global: remove fetch_directory dependency
This commit drops the fetch_directory dependency.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 3f9ccdaa8a infrastructure-playbooks: add filestore-to-bluestore.yml
This playbook helps to migrate all osds on a node from filestore to
bluestore backend.
Note that *ALL* osd on the specified osd nodes will be shrinked and
redeployed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 09e04a9197 osd: add wal_devices option support to ceph_volume module
This commit adds the `wal_devices` option support to the
ceph_volume module.
passing a devices list in `bluestore_wal_devices` will make ceph-volume
creating 1 vg using these devices to create block.wal partitions.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 70f1b37097 osd: update doc text in defaults/main.yml
This commit removes ceph-disk references.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux 7b836eaa47 osd: add block_db_devices option support to ceph_volume module
This commit adds the `block_db_devices` option support to the
ceph_volume module.
passing a devices list in `dedicated_devices` will make ceph-volume
creating 1 vg using these devices to create block.db partitions for data
devices.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Guillaume Abrioux c785ad3637 lv-create: fix a typo
This commit fixes a typo.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-26 11:35:24 +02:00
Mehdy 9fa98d79fd shrink-rgw.yml: fix confirmation play's name
the confirmation play's name should confirm removing rgw instead of
monitor

Signed-off-by: Mehdy Khoshnoody <mehdy.khoshnoody@gmail.com>
2019-09-24 07:47:56 +02:00
Dimitri Savineau ec56a95013 group_vars: remove useless dashboard files
The only useful ansible group for the grafana/prometheus stack is
grafana-server so no one of those files are actually needed.
The default values for all dashboard roles are present in ceph-defaults
role so it's also present in in group_vars/all.yml.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-18 16:16:02 +02:00
Guillaume Abrioux 2b97ac921b validate: check ceph_docker_registry_* length
This commit adds a condition to check whether these variables are empty.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-18 16:03:18 +02:00
Dimitri Savineau 9f4a99fb24 container: Allow to use registry authentication
The registry.redhat.io regsitry requires authentication so before pulling
the RHCS 4 container images from the registry we need to do the login
step.
This is done via the new ceph_docker_registry_auth variable. The
default value is false but true for RHCS setup.
When set to true, you need to provide the username and password
for the registry via the associated variables.
This patch also updates the ceph_docker_registry value for RHCS setup.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1748911

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-18 16:03:18 +02:00
Dimitri Savineau f90696c36e rhel8: add default python bin path
On RHEL 8 system we should check the /usr/libexec/platform-python path
instead of installing python36 package.

[DEPRECATION WARNING]: Distribution redhat 8.0 on host xxxxx should use
/usr/libexec/platform-python, but is using /usr/bin/python for backward
compatibility with prior Ansible releases. A future Ansible release will
default to using the discovered platform python for this host.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-18 14:35:53 +02:00
Dimitri Savineau 734c0dc310 shrink-mon: search mon in the quorum_names list
If we're looking at the mon hostname in the ceph status output then
there's some scenarios where this could be true.
If we collocate some services (mons, mgrs, etc..) then the hostname of
the monitor to shrink will still be present in the ceph status (like
in mgrs or other).
Instead we should check the hostame only in the mon part of the output.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-09-18 14:35:02 +02:00
Guillaume Abrioux da094ac5ee tests: do not rely on pg_num to validate rgw_tuning_pools
Since the pg_autoscaler has been enabled recently in ceph, this check
should stick to validate the requested pools are well created only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-09-18 14:05:23 +02:00