Commit Graph

162 Commits (cb0926262d9beb21f2c7746ca56a1adc6ec987c4)

Author SHA1 Message Date
Guillaume Abrioux 1884506189 update: follow new recommandation to upgrade mds cluster
Refact the mds cluster upgrade code in order to follow the documented
recommandation.
See: https://github.com/ceph/ceph/blob/luminous/doc/cephfs/upgrading.rst

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 71cebf80a6)
2019-10-21 15:44:38 -04:00
Dimitri Savineau 0b653ee5b4 update default rhcs values and docs
The RHCS documentation mentionned in the default values and
group_vars directory are referring to RHCS 2.x while it should be
3.x.

Revolves: https://bugzilla.redhat.com/show_bug.cgi?id=1702732

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2019-06-04 14:18:23 +02:00
Dimitri Savineau 7418999638 ceph-mds: Increase cpu limit to 4
In containerized deployment the default mds cpu quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1999cf3d19)
2019-04-24 21:44:23 +00:00
Dimitri Savineau 56215d7688 ceph-mds: Set application pool to cephfs
We don't need to use the cephfs variable for the application pool
name because it's always cephfs.
If the cephfs variable is set to something else than the default
value it will break the appplication pool task.

Resolves: #3790

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d2efb7f02b)
2019-04-11 15:38:14 +00:00
Matthew Vernon a8c9b65d13 UCA: Uncomment UCA variables in defaults, fix consequent breakage
The Ubuntu Cloud Archive-related (UCA) defaults in
roles/ceph-defaults/defaults/main.yml were commented out, which means
if you set `ceph_repository` to "uca", you get undefined variable
errors, e.g.

```
The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined

The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: add ubuntu cloud archive repository
  ^ here

```

Unfortunately, uncommenting these results in some other breakage,
because further roles were written that use the fact of
`ceph_stable_release_uca` being defined as a proxy for "we're using
UCA", so try and install packages from the bionic-updates/queens
release, for example, which doesn't work. So there are a few `apt` tasks
that need modifying to not use `ceph_stable_release_uca` unless
`ceph_origin` is `repository` and `ceph_repository` is `uca`.

Closes: #3475
Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
(cherry picked from commit 9dd913cf8a)
2019-04-09 16:54:37 +00:00
Leah Neukirchen d855cb2595 Fix uses of default(omit) with string concatenation
When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.

Regression introduced in d53f55e807.

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
2019-02-08 11:01:11 +00:00
Guillaume Abrioux f0195e97ed refact osd pool size customization
Add real default value for osd pool size customization.
Ceph itself has an `osd_pool_default_size` default value to `3`.

If users don't specify a pool size in various pools definition within
ceph-ansible, we should default to `3`.

By the way, this kind of condition isn't really clear:
```
when:
  - rbd_pool_size | default ("")
```

we should try to get the customized value then default to what is in
`osd_pool_default_size` (which has its default value pointing to
`ceph_osd_pool_default_size` (`3`) as well) and compare it to
`ceph_osd_pool_default_size`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7774069d45)
2018-11-29 01:49:05 +00:00
Guillaume Abrioux 68b2ad11ee mon: move `osd_pool_default_pg_num` in `ceph-defaults`
`osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specifc group of nodes, it
will fail when trying to access this variables since it wouldn't be
defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d4c0960f04)
2018-11-29 01:49:05 +00:00
Guillaume Abrioux 748342f5b6 roles: fix *_docker_memory_limit default value
append 'm' suffix to specify the unit size used in all
`*_docker_memory_limit`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-29 14:59:09 +01:00
Neha Ojha b7e4d4eb84 roles: do not limit docker_memory_limit for various daemons
Since we do not have enough data to put valid upper bounds for the memory
usage of these daemons, do not put artificial limits by default. This will
help us avoid failures like OOM kills due to low default values.

Whenever required, these limits can be manually enforced by the user.

More details in
https://bugzilla.redhat.com/show_bug.cgi?id=1638148

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Signed-off-by: Neha Ojha <nojha@redhat.com>
2018-10-29 14:59:09 +01:00
Rishabh Dave ee2d52d33d allow custom pool size
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-10-22 16:00:21 +02:00
Guillaume Abrioux 40b7747af7 remove jewel support
As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-10-12 23:38:17 +00:00
Noah Watkins 306e308f13 Avoid using tests as filter
Fixes the deprecation warning:

  [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
  using `result|search` use `result is search`.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-10-10 04:26:33 +00:00
Rishabh Dave 380168dadc don't use "include" to include tasks
Use "import_tasks" or "include_tasks" instead.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2018-09-27 17:53:40 +02:00
Sébastien Han a629408967 ceph-mds: enable application pool
We now enable the application type 'cephfs' for each cephfs pools we
create.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1590275
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-07-02 10:28:34 +00:00
Sébastien Han f623997271 systemd: remove changed_when: false
When using a module there is no need to apply this Ansible option. The
module will handle the idempotency on its own. So the module decides
wether or not the task has changed during the execution.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-06-28 17:54:13 +02:00
George Shuklin 653b483fc3 Add ceph_keyring_permissions variable to control permissions for
keyring files in /etc/ceph. Default value is the same as it was (0600),
but this variable allows user to override it (f.e. set it to 0640).

Signed-off-by: George Shuklin <george.shuklin@gmail.com>
2018-06-28 15:48:39 +00:00
Patrick Donnelly 9ce81ae845 ceph-mds: do not enable multimds on jewel
Multiple active MDS became stable in Luminous.

Introduced-by: c8573fe0d7
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-12 10:47:34 +02:00
Guillaume Abrioux 608ea947a9 mds: move mds fs pools creation
When collocating mds on monitor node, the cephpfs will fail
because `docker_exec_cmd` is reset to `ceph-mds-monXX` which is
incorrect because we need to delegate the task on `ceph-mon-monXX`.
In addition, it wouldn't have worked since `ceph-mds-monXX` container
isn't started yet.

Moving the task earlier in the `ceph-mds` role will fix this issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-25 11:16:56 +02:00
Guillaume Abrioux 3a0e168a76 mdss: move cephfs pools creation in ceph-mds
When deploying a large number of OSD nodes it can be an issue because the
protection check [1] won't pass since it tries to create pools before all
OSDs are active.

The idea here is to move cephfs pools creation in `ceph-mds` role.

[1] e59258943b/src/mon/OSDMonitor.cc (L5673)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1578086

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-05-24 09:39:38 -07:00
Sébastien Han 65ba85aff6 Expose /var/run/ceph
Useful for softwares that do data collection/monitoring like collectd.
They can connect to the socket and then retrieve information.

Even though the sockets are exposed now, I'm keeping the docker exec to
check the socket, this will allow newer version of ceph-ansible to work
with older versions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-20 15:48:32 +02:00
Sébastien Han 641f141c0f selinux: remove chcon calls
We know bindmount with the :z option at the end of the -v command so
this will basically run the exact same command as we used to run. So to
speak:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-19 14:59:37 +02:00
Sébastien Han d2a2793cb0 refactor the way we copy keys
This commit does a couple of things:

* use a common.yml file that contains things that can be played on both
container and non-container

* refactor the ability to copy the admin key to the nodes

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-04-18 16:46:33 +02:00
vasishta p shastry db3a5ce6d9 mds: to support copy_admin_keyring 2018-04-11 14:21:15 +02:00
Randy J. Martinez ca572a11f1 ceph-mds: delete duplicate tasks which cause multimds container deployments to fail.
This update will resolve error['cephfs' is undefined.] in multimds container deployments.
See: roles/ceph-mon/tasks/create_mds_filesystems.yml. The same last two tasks are present there, and actully need to happen in that role since "{{ cephfs }}" gets defined in
roles/ceph-mon/defaults/main.yml, and not roles/ceph-mds/defaults/main.yml.

Signed-off-by: Randy J. Martinez <ramartin@redhat.com>
2018-03-29 09:32:40 +02:00
Grant Slater 1e1b26ca4d mds: fix ansible_service_mgr typo
This commit fixes a typo introduced by 4671b9e74e
2018-02-26 13:05:14 +01:00
Giulio Fidente bdcc52b96d Check for docker sockets named after both _hostname or _fqdn
While hostname -f will always return an hostname including its
domain part and -s without the domain part, the behavior when
no arguments are given can include or not include the domain part
depending on how the system is configured; the socket name might
not match the instance name then.
2018-02-06 14:16:54 +01:00
Guillaume Abrioux deaf273b25 syntax: change local_action syntax
Use a nicer syntax for `local_action` tasks.
We used to have oneliner like this:
```
local_action: wait_for port=22 host={{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }} state=started delay=10 timeout=500 }}
```

The usual syntax:
```
    local_action:
      module: wait_for
      port: 22
      host: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
      state: started
      delay: 10
      timeout: 500
```
is nicer and kind of way to keep consistency regarding the whole
playbook.

This also fix a potential issue about missing quotation :

```
Traceback (most recent call last):
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 213, in <module>
    main()
  File "/tmp/ansible_wQtWsi/ansible_module_command.py", line 185, in main
    rc, out, err = module.run_command(args, executable=executable, use_unsafe_shell=shell, encoding=None, data=stdin)
  File "/tmp/ansible_wQtWsi/ansible_modlib.zip/ansible/module_utils/basic.py", line 2710, in run_command
  File "/usr/lib64/python2.7/shlex.py", line 279, in split
    return list(lex)                                                                                                                                                                                                                                                                                                            File "/usr/lib64/python2.7/shlex.py", line 269, in next
    token = self.get_token()
  File "/usr/lib64/python2.7/shlex.py", line 96, in get_token
    raw = self.read_token()
  File "/usr/lib64/python2.7/shlex.py", line 172, in read_token
    raise ValueError, "No closing quotation"
ValueError: No closing quotation
```

writing `local_action: shell echo {{ fsid }} | tee {{ fetch_directory }}/ceph_cluster_uuid.conf`
can cause trouble because it's complaining with missing quotes, this fix solves this issue.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1510555

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-31 10:45:34 +01:00
Guillaume Abrioux 70401f955b container: trigger handlers on systemd file change
When a systemd unit file is changed we should trigger handlers to
restart the services.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2018-01-10 16:46:42 +01:00
Sébastien Han 97f520bc74 containers: bump memory limit
A default value of 4GB for MDS is more appropriate and 3GB for OSD also.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1531607
Signed-off-by: Sébastien Han <seb@redhat.com>
2018-01-09 11:26:50 +01:00
Christian Berendt 50a848dc40 Rename fact docker_version to ceph_docker_version
The name docker_version is very generic and is also used by other
roles. As a result, there may be name conflicts. To avoid this a
ceph_ prefix should be used for this fact. Since it is an internal
fact renaming is not a problem.
2017-12-15 20:12:21 +01:00
Guillaume Abrioux 26afe46e13 docker: add missing condition for selinux tasks
on `client` and `mds` roles, it tries to set selinux even on non rhel
based distributions.`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-12-14 17:00:14 +01:00
Markos Chandras 8135638c58 ceph-mds: Add support for openSUSE Leap distributions
Add support for openSUSE Leap distributions

Signed-off-by: Markos Chandras <mchandras@suse.de>
2017-11-14 10:51:23 +00:00
Major Hayden f73232caa4
Use check_mode instead of always_run
This patch changes the `always_run: yes` task option to
`check_mode: no` to avoid Ansible warnings.
2017-10-25 09:53:34 -05:00
Major Hayden c2b5118c1b
Revert "Avoid deprecated always_run"
This reverts commit 620fb37dd4.
2017-10-25 09:48:09 -05:00
Christian Berendt 4c380c9ef8 Cleanup readme files in roles directories
The contents of the README files are no longer up to date.
Documentation for all roles is located below the docs directory.
2017-10-17 11:22:06 +02:00
Christian Berendt cf901f0171 In docker start scripts replace \u00a0 with \u0020
This will solve the following issue when starting docker containers on ubuntu:

invalid argument "1\u00a0" for --cpus=1 : failed to parse 1  as a rational number

Closes-bug: #2056
2017-10-16 15:16:48 +02:00
Major Hayden 620fb37dd4
Avoid deprecated always_run
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:

    [DEPRECATION WARNING]: always_run is deprecated.

This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
2017-10-12 08:29:44 -05:00
Guillaume Abrioux 3c64abe07d mds: move installation packages in role itself
Make role `ceph-mds` handling itself the installation of `ceph-mds`
package.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-09 17:25:46 +02:00
Guillaume Abrioux 784cc73da0 set docker_exec_cmd fact early in each role
This is to ensure `docker_exec_cmd` fact is set with the correct value
in case of daemons collocation

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 11:31:09 +02:00
Sébastien Han 5968cf09b1 ci: add collocation scenario
Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-04 11:19:12 +02:00
Guillaume Abrioux 62770cd7de refact MDS role
This commits refacts the role ceph-mds

The goal here is to create cephfs in `ceph-mon` for both containerized
and non-containerized cases so we don't need the admin keyring on mds
nodes anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-02 09:12:31 +02:00
Guillaume Abrioux 466f6f35b7 Use systemd module instead of service.
Using systemd module allows us to do in one task what we did in three
tasks:

- enable unit file,
- issue a `daemon-reload`,
- start the service

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-29 14:54:00 +02:00
Guillaume Abrioux 913ad53709 docker: add condition to run selinux tasks only on rhel os family
This fixes the error :

```
The conditional check 'sestatus.stdout != 'Disabled'' failed.
```

that occurs when running on non rhel based system since the
`sestatus` fact is registered only on rhel based distribution.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-29 02:35:07 +02:00
Sébastien Han cb05172605 docker: we don't need to copy the ceph.conf on all the nodes
We generate the ceph.conf on all the nodes through the
ceph-docker-common so there is no need to push it to the Ansible file.

Also this is breaking the ceph.conf template generation since we only
generate sections based on the host the ansible task is running on.

For example, what's typically happening, we bootstrap the monitor, we
get a ceph.conf generated for a mon only, we go on an osd, we generate
the ceph.conf with osd section (done by ceph-docker-common) but this
gets overwritten by the copy_config task of the ceph-osd role.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-20 16:33:29 +02:00
Sébastien Han d100b4e596 name includes and set_fact for clarity
When Ansible is not run with verbose options it's difficult to see which
include and/or set_fact does what. So adding a name for each clarifies.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-18 23:39:46 +02:00
Guillaume Abrioux 0f506f4f0a Docker: split the task 'copy ceph configs&keys'
All keys are copied to all nodes.
This commit split that task in each roles so keys are copied to their
respective nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1488999

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-11 21:14:13 +02:00
Sébastien Han 2ea7f287fa docker: simplify variable declaration
Less configuration for the user, the container inherit from the global
variables. No more container specific variables.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-09 01:22:06 +02:00
Guillaume Abrioux 44fd928e23 mds: rename mds_socket fact
Rename this fact to keep consistency with handlers in `ceph-defaults`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-09-08 15:57:58 +02:00
Sébastien Han 2fa151b9e8 container: introduce resource limitation for containers
This can be controlled via 2 options:

* ceph_$DAEMON_docker_memory_limit
* ceph_$DAEMON_docker_cpu_limit

All daemons default to 1GB for memory and 1 CPU by default.
Recommendations from:
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/red_hat_ceph_storage_hardware_guide/minimum_recommendations

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-09-06 14:52:21 +02:00