ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	cb8f0237e1	ceph-rgw: allow specifying crush rule on pool We already support specifying a custom crush rule during pool creation in ceph-osd role but not in ceph-rgw role. This patch adds the missing code to implement this feature. Note this is only available for replicated pool not erasure. The rule must also exist prior to the pool creation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-08-17 22:59:06 +02:00
Dimitri Savineau	47b7c00287	podman: always remove container on start In case of failure, the systemd ExecStop isn't executed so the container isn't removed. After a reboot of a failed node, the container doesn't start because the old container is still present in created state. We should always try to remove the container in ExecStartPre for this situation. A normal reboot doesn't trigger this issue and this also doesn't affect nodes running containers via docker. This behaviour was introduced by `d43769d`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1858865 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:00:38 +02:00
Guillaume Abrioux	86edae724f	rgw: set container memory limit to 4g This commit changes the container memory limit for rgw daemons. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707488 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-09 15:31:10 +02:00
Dimitri Savineau	1361e84a4e	radosgw: remove INST_PORT environment variable This variable isn't consumed by the container so we can remove it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-02 16:52:29 +02:00
Guillaume Abrioux	7dd68b9ac1	rgw: fix multi instances scaleout When rgw and osd are collocated, the current workflow prevents from scaling out the radosgw_num_instances parameter when rerunning the playbook. The environment file used in the rgw systemd template is rendered when executing the `ceph-rgw` role but during a new run of the playbook (in order to scale out rgw instances), handlers are triggered from `ceph-osd` role which is run before `ceph-rgw`, therefore it tries to start the new rgw daemon whereas its corresponding environment file hasn't been rendered yet and fails like following: ``` ceph-radosgw@rgw.ceph4osd3.rgw1.service failed to run 'start-pre' task: No such file or directory ``` This commit moves the tasks generating this file in `ceph-config` role so it is generated early. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1851906 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-02 10:39:50 -04:00
Dimitri Savineau	d43769dc2a	podman: Add Type and PIDFile value to unit files This changes the way we are running the podman containers via systemd. They are now in detached mode and Type/PIDFile set. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1834974 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-23 09:37:50 +02:00
Dimitri Savineau	bd22f1d1ec	docker: Add Requires on docker service When using docker container engine then the systemd unit scripts only use a dependency on the docker daemon via the After parameter. But if docker is restarted on a live system then the ceph systemd units should wait for the docker daemon to be fully restarted. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-06-22 23:08:50 +02:00
Ali Maredia	0175c205fa	rgw multisite: add master zone endpoints to zonegroup We were only adding the endpoints to the master zone but not to the zonegroup. This patch fixes the issue. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1839228 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-06-09 09:50:18 -04:00
Benoît Knecht	d2b7670c7d	ceph-rgw: Make sure pool name templates are expanded It is common to set templated pool names in `rgw_create_pools`, e.g. ```yaml rgw_create_pools: "{{ rgw_zone }}.rgw.buckets.index": pg_num: 16 size: 3 type: replicated ``` This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in the way `with_dict` works [1]. This commit replaces the use of `with_dict` with ```yaml loop: "{{ rgw_create_pools | dict2items }}" ``` which works as intended and expands the template in the pool name. [1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops Closes #5348 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-05-11 11:51:27 -04:00
Dimitri Savineau	34e6e8e06c	ceph-rgw: use match instead of equalto from jinja2 The '==' jinja2 operator (or 'equalto') has been introduced in jinja2 2.8. On EL7, jinja2 version is 2.7 so the operator isn't present creating templating error like: The error was: TemplateRuntimeError: no test named '==' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-05-06 14:23:10 -04:00
Guillaume Abrioux	60a2e28189	rgw: add multi-instances support when deploying multisite This commit adds the multi-instances when deploying rgw multisite Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-12 16:44:48 -04:00
Dimitri Savineau	e62532de46	update osd pool set size command Since [1] we can't use osd pool without replicas (size: 1) by default. We now need to set the mon_allow_pool_size_one flag to true in the ceph configuration and add the --yes-i-really-mean-it flag to the osd pool set size cli. [1] https://github.com/ceph/ceph/commit/21508bd Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-11 11:25:42 +01:00
Guillaume Abrioux	b3bbd6bb77	rgw: fix a typo in create_realm_zonegroup_zone_lists This commit fixes a typo. `s/realms/secondary_realms` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-10 14:13:30 +01:00
Guillaume Abrioux	7a8a719e75	rgw: add retry/until on pools tasks Sometimes, these tasks can time out for some reason. Adding these retries can help to avoid unexpected failures. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-06 08:55:13 -05:00
Ali Maredia	71f55bd54d	rgw multisite: enable more than 1 realm per cluster Make it so that more than one realm, zonegroup, or zone can be created during a run of the rgw multisite ansible playbooks. The rgw hosts now need to be grouped into zones and realms in the inventory. .yml files need to be created in group_vars for the realms and zones. Sample yaml files are available. Also remove multisite destroy playbook and add --cluster before radosgw-admin commands. Remove manually added rgw_zone_endpoints var and have ceph-ansible automatically add the correct endpoints of all the rgws in a rgw_zone from the information provided in that rgws hostvars. Signed-off-by: Ali Maredia <amaredia@redhat.com>	2020-03-04 12:58:13 -05:00
Dimitri Savineau	9d3b49293d	purge: stop rgw instances by iteration It looks like that the service module doesn't support wildcard anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found '''' in service name: ceph-radosgw@' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	44e750ee5d	ceph-rgw: increase connection timeout to 10 5s as a connection timeout could be low in some setup. Let's increase it to 10s. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-24 16:01:36 +01:00
Sam Choraria	2a2656a985	ceph-rgw: allow SSL certificate content to be supplied Allow SSL certificate & key contents to be written to the path specified by radosgw_frontend_ssl_certificate. This permits a certificate to be deployed & renewal of expired certificates through ceph-ansible. Signed-off-by: Sam Choraria <sam.choraria@bbc.co.uk>	2020-02-17 16:22:11 +01:00
Ali Maredia	1834c1e48d	rgw: extend automatic rgw pool creation capability Add support for erasure code pools. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731148 Signed-off-by: Ali Maredia <amaredia@redhat.com> Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-17 16:07:43 +01:00
Dimitri Savineau	5a03e0ee1c	containers: add KillMode=none to systemd templates Because we are relying on docker|podman for managing containers then we don't need systemd to manage the process (like kill). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-13 16:11:33 +01:00
Guillaume Abrioux	483adb5d79	common: add a default value for ceph_directories_mode Since this variable makes it possible to customize the mode for ceph directories, let's make it a bit more explicit by adding a default value in ceph-defaults. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-22 09:35:35 +01:00
Benoît Knecht	3842aa1a30	ceph-rgw: Fix customize pool size "when" condition In `3c31b19ab3`, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-01-20 09:26:53 -05:00
Guillaume Abrioux	3e262e072b	containers: use --cpus instead --cpu-quota When using docker 1.13.1, the current condition: ``` {% if (container_binary == 'docker' and ceph_docker_version.split('.')[0] is version_compare('13', '>=')) or container_binary == 'podman' -%} ``` is wrong because it compares the first digit (1) whereas it should compare the second one. It means we always use `--cpu-quota` although documentation recommend using `--cpus` when docker version is 1.13.1 or higher. From the doc: > --cpu-quota=<value> Impose a CPU CFS quota on the container. The number of > microseconds per --cpu-period that the container is limited to before > throttled. As such acting as the effective ceiling. > If you use Docker 1.13 or higher, use --cpus instead. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-16 13:51:43 -05:00
Benoît Knecht	3c31b19ab3	ceph-rgw: Fix custom pool size setting RadosGW pools can be created by setting ```yaml rgw_create_pools: .rgw.root: pg_num: 512 size: 2 ``` for instance. However, doing so would create pools of size `osd_pool_default_size` regardless of the `size` value. This was due to the fact that the Ansible task used ``` {{ item.size | default(osd_pool_default_size) }} ``` as the pool size value, but `item.size` is always undefined; the correct variable is `item.value.size`. Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-01-08 16:16:38 -05:00
Guillaume Abrioux	9bad239d77	common: improve keyrings generation There is no need to get n * number of nodes the different keyrings. Adding a `run_once: true` here avoid running a ceph command too many times which could be impacting large cluster deployment. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-02 13:09:50 +02:00
Guillaume Abrioux	e08194dd67	rgw: refact tasks directory layout This commit moves containerized deployment related files to `./tasks/` directory. This is needed to make `docker-to-podman.yml` working since we use `tasks_from:` option. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-01 10:27:51 -04:00
Guillaume Abrioux	bd64167469	container: isolate systemd tasks This commit isolates the systemd unit files generation for containers into separate yml files in order to be able importing each corresponding roles without playing all tasks. This is needed so we can run ceph-ansible to render systemd unit files so they call podman instead of docker. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-01 10:27:51 -04:00
Guillaume Abrioux	ab370b6ad8	global: remove fetch_directory dependency This commit drops the fetch_directory dependency. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-09-26 11:35:24 +02:00
Dimitri Savineau	42082c0a27	lint: fix error [201,206] [201] Trailing whitespace [206] Variables should have spaces before and after: {{ var_name }} Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-08-29 14:28:35 -04:00
Artur Fijalkowski	011270ca69	global: make directories mode parameterizable This commit makes it possible to parametrize the ceph directories modes. So it changes hardcoded mode for ceph related directories from 0755 to customizable with `ceph_directories_mode` variable. Closes: #2920 Signed-off-by: Artur Fijalkowski <artur.fijalkowski@ing.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-23 09:38:17 +02:00
guihecheng	a0590cae9d	rgw/multisite: assign 'rgw_zone' to the exact section in ceph.conf since the following commit: commit `1ac94c048f` rgw: add support for multiple rgw instances on a single host we have multi-instance rgw support on a single host and the config section name of the rgw changed from [client.rgw.$(hostname)] -> [client.rgw.$(hostname).rgwX] when X is the sequence number: 0,1,2,... So we should assign 'rgw_zone' item to the exact rgw instance config section in ceph.conf Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>	2019-08-23 08:14:10 +02:00
Guillaume Abrioux	19950b5170	container: rename docker directories Those 2 directories should be renamed to be more generic (docker vs. podman). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-24 16:31:46 +02:00
Giulio Fidente	d526803c6c	Add radosgw_frontend_ssl_certificate parameter This is necessary when configuring RGW with SSL because in addition to passing specific frontend options, civetweb appends the 's' character to the binding port and beast uses ssl_endpoint instead of endpoint. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1722071 Signed-off-by: Giulio Fidente <gfidente@redhat.com>	2019-07-02 14:14:37 -04:00
Guillaume Abrioux	33eed78d17	containers: improve logging bindmount /var/log/ceph on all containers so it's possible to retrieve logs from the host. related ceph-container PR: ceph/ceph-container#1408 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710548 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-28 13:30:36 -04:00
Dimitri Savineau	7c3640177b	roles: Remove useless become (true) flag We already set the become flag to true at a play level in the site* playbooks so we don't need to set it at a task level. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-19 10:31:32 +02:00
Dimitri Savineau	f49090df7e	podman: Add systemd dependency on network.target When using podman, the systemd unit scripts don't have a dependency on the network. So we're not sure that the network is up and running when the containers are starting. With docker this behaviour is already handled because the systemd unit scripts depend on docker service which is started after the network. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-07 09:28:58 +02:00
L3D	ab54fe20ec	ansible: use 'bool' filter on boolean conditionals By running ceph-ansible there are a lot of ``[DEPRECATION WARNING]`` like these: ``` [DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. ``` Now appended ``| bool`` on a lot of the affected variables. Sometimes the coding style from ``variable|bool`` changed to ``variable | bool`` (with spaces at the pipe). Closes: #4022 Signed-off-by: L3D <l3d@c3woc.de>	2019-06-06 10:21:17 +02:00
Guillaume Abrioux	e74d80e72f	rename docker_exec_cmd variable This commit renames the `docker_exec_cmd` variable to `container_exec_cmd` so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-05-16 16:39:13 +02:00
Kevin Coakley	381c58ca3e	Set the rgw_create_pools pools application to rgw Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the health is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool. Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>	2019-05-13 09:48:25 +02:00
Rishabh Dave	739a662c80	improve coding style Keywords requiring only one item shouldn't express it by creating a list with single item. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-04-23 15:37:07 +02:00
Kyle Bader	0bee90b201	rgw: add cpuset support 1/ The OSD already supports cpuset to be used for containerized deployments through the use of the ceph_osd_docker_cpuset_cpus variable. This adds similar support to the RGW service for containerized deployments by setting a new variable named ceph_rgw_docker_cpuset_cpus. Like the OSD, there are times where using distinct cores has advantages over using the CFS in kernel scheduler. ceph_rgw_docker_cpuset_cpus accepts a comma delimited set of CPU ids 2/ Add support for specifying --cpuset-mem variable to restrict the cgroup's memory allocations to a particular numa node, which should typically correspond with the cpu ids of that numa node that were provided with --cpuset-cpus. To ensure the correct cpu ids are used one can run `numactl --hardware` to list the nodes and which cpu ids correspond to each. Signed-off-by: Kyle Bader <kbader@redhat.com> Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-18 15:55:19 +02:00
François Lafont	4c3e77d869	ceph-rgw: Fix bad paths which depend on the clustername The path of the RGW environment file (in the /var/lib/ceph/radosgw/ directory) depends on the Ceph clustername. It was not taken into account in the Ansible role `ceph-rgw`. Signed-off-by: flaf <francois.lafont.1978@gmail.com>	2019-04-09 06:16:31 +02:00
Ali Maredia	37f46a8c5d	rgw multisite: add more than 1 rgw to the master or secondary zone Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869 Signed-off-by: Ali Maredia <amaredia@redhat.com>	2019-04-06 08:01:19 +02:00
Dimitri Savineau	d3ae9fd05f	radosgw: Raise cpu limit to 8 In containerized deployment the default radosgw quota is too low for production environment. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680171 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-04 18:50:48 +02:00
Guillaume Abrioux	6f47c20c3a	rgw: fix a typo `ee2d52d33d` introduced a typo. This commit fixes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-25 16:02:56 -04:00
Guillaume Abrioux	3c4f464c54	rgw: cleanup legacy task this task was here for backward compatibility. It's time to remove it in the next release. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-25 16:02:56 -04:00
Guillaume Abrioux	9134624578	rgw: add a retry on pool related tasks sometimes those tasks might fail because of a timeout. I've been facing this several times in the CI, adding this retry might help and won't hurt in any case. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-25 16:02:56 -04:00
Guillaume Abrioux	82764afe8d	update: mask systemd service units during upgrade This prevents the packaging from restarting services before we do need to restart them in the rolling update sequence. We want to handle services restart at rolling_update playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-25 16:02:56 -04:00
Dimitri Savineau	a089e1ec23	systemd/service: Set docker.service conditionally We don't need to set After=docker.service when the container_binary variable isn't set to docker. It doesn't break anything currently but it could be confusing when using podman. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-03-07 20:56:11 +00:00
Dimitri Savineau	cb381b41fe	Add CONTAINER_IMAGE env var to ceph daemons Ceph daemons will set the CONTAINER_IMAGE environment variable value in the daemon metadata. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-03-05 15:07:05 +00:00
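
Note on commit 3e262e072b (containers: use --cpus instead --cpu-quota): the version check it describes can be illustrated outside of Jinja2. The following is a minimal Python sketch, not the code from the role. The function names are hypothetical, and it assumes the Docker version string has at least two numeric dot-separated components (for example `1.13.1`). The first function mirrors the flawed comparison on the first component only; the second compares major and minor together.

```python
def buggy_supports_cpus(docker_version: str) -> bool:
    """Mirror of the flawed check: only the first component is compared with 13."""
    # For "1.13.1" the first component is 1, so this wrongly reports False.
    return int(docker_version.split(".")[0]) >= 13


def supports_cpus(docker_version: str) -> bool:
    """Compare (major, minor) as a tuple against (1, 13)."""
    # Assumption: the version string starts with "<major>.<minor>".
    major, minor = (int(part) for part in docker_version.split(".")[:2])
    return (major, minor) >= (1, 13)


if __name__ == "__main__":
    for version in ("1.12.6", "1.13.1", "17.03.0"):
        print(version, buggy_supports_cpus(version), supports_cpus(version))
```

With the tuple comparison, `1.13.1` is recognised as supporting `--cpus`, while `1.12.6` still falls back to `--cpu-quota`.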


275 Commits (51c382677dfa5db8fc39ca9c3c4898e017f3c189)