ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	14f2d616ee	ceph-nfs: use template module for configuration `789cef7` introduces a regression in the ganesha configuration file generation. The new config_template module version broke it. But the ganesha.conf file isn't an ini file and doesn't really need to use the config_template module. Instead we can use the classic template module. Resolves: #4045 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `616c484698`)	2019-06-24 20:47:25 +02:00
Dimitri Savineau	d08af0a654	ceph-disk: Set max open files limit on container Same behaviour as ceph-volume (`b987534`). The ceph-disk command runs faster when using ulimit nofile with container cli. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-24 10:06:11 +02:00
Dimitri Savineau	2b492e3de1	ceph-handler: Fix OSD restart script There are two big issues with the current OSD restart script. 1/ We try to test if the ceph osd daemon socket exists but we use a wildcard for the socket name: /var/run/ceph/*.asok. This fails because we usually have multiple ceph osd sockets (or other ceph daemon collocated) present in /var/run/ceph directory. Currently the test fails with: bash: line xxx: [: too many arguments But it doesn't stop the script execution. Instead we can specify the full ceph osd socket name because we already know the OSD id. 2/ The container filter pattern is wrong and could match multiple containers, causing the script to fail. We use the filter with two different patterns. One is with the device name (sda, sdb, ..) and the other one is with the OSD id (ceph-osd-0, ceph-osd-15, ..). In both cases we could match more than needed. $ docker container ls CONTAINER ID IMAGE NAMES 958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda 589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb 46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa 877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab $ docker container ls -q -f "name=sda" 958121a7cc7d 46c7240d71f3 877985ec3aca $ docker container ls CONTAINER ID IMAGE NAMES 2db399b3ee85 ceph-daemon:latest ceph-osd-5 099dc13f08f1 ceph-daemon:latest ceph-osd-13 5d0c2fe8f121 ceph-daemon:latest ceph-osd-17 d6c7b89db1d1 ceph-daemon:latest ceph-osd-1 $ docker container ls -q -f "name=ceph-osd-1" 099dc13f08f1 5d0c2fe8f121 d6c7b89db1d1 Adding an extra '$' character at the end of the pattern solves the problem. Finally removing the get_container_osd_id function because it's not used in the script at all. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `45d46541cb`)	2019-06-21 14:49:55 -04:00
Dimitri Savineau	f4212b20e5	ceph-volume: Set max open files limit on container The ceph-volume lvm list command takes ages to complete when having a lot of LV devices on containerized deployment. For instance, with 25 OSDs on a node it takes 3 mins 44s to list the OSD. Adding the max open files limit to the container engine cli when executing the ceph-volume command seems to improve the execution time a lot (~30s). This was impacting the OSDs creation with ceph-volume (both filestore and bluestore) when using multiple LV devices. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b987534881`)	2019-06-20 20:01:13 -04:00
Guillaume Abrioux	f29366b848	ceph-osd: do not relabel /run/udev in containerized context Otherwise content in /run/udev is mislabeled and prevents some services like NetworkManager from starting. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `80875adba7`)	2019-06-19 23:46:46 +02:00
Rishabh Dave	114078bfa1	ceph-infra: make chronyd default NTP daemon Since timesyncd is not available on RHEL-based OSs, change the default to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so set the Ansible fact accordingly. Fixes: https://github.com/ceph/ceph-ansible/issues/3628 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `9d88d3199f`)	2019-06-18 10:46:34 +02:00
Rishabh Dave	93c7d8d79d	don't install NTPd on Atomic Since Atomic doesn't allow any installations and NTPd is not present on Atomic image we are using, abort when ntp_daemon_type is set to ntpd. https://github.com/ceph/ceph-ansible/issues/3572 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `bdff3e48fd`)	2019-06-18 10:46:34 +02:00
Dimitri Savineau	81de8a8106	remove ceph-agent role and references The ceph-agent role was used only for RHCS 2 (jewel) so it's not useful anymore. The current code will fail on CentOS distribution because the rhscon package is only available on Red Hat with the RHCS 2 repository and this ceph release is supported on stable-3.0 branch. Resolves: #4020 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `7503098ca0`)	2019-06-17 14:42:08 -04:00
Dimitri Savineau	ed9b594b80	tests: Update ansible ssh_args variable Because we're using vagrant, an ssh config file will be created for each node with options like user, host, port, identity, etc... But via tox we override ANSIBLE_SSH_ARGS to use this file. This removes the default value set in ansible.cfg. Also adding PreferredAuthentications=publickey because CentOS/RHEL servers are configured with GSSAPIAuthentication enabled for the ssh server, forcing the client to make a PTR DNS query. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `34f9d51178`)	2019-06-17 12:02:36 -04:00
Guillaume Abrioux	64659d2c82	iscsi: assign application (rbd) to pool 'rbd' if we don't assign the rbd application tag on this pool, the cluster will get `HEALTH_WARN` state like following: ``` HEALTH_WARN application not enabled on 1 pool(s) POOL_APP_NOT_ENABLED application not enabled on 1 pool(s) application not enabled on pool 'rbd' ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `4cf17a6fdd`)	2019-06-13 14:43:25 +02:00
Dimitri Savineau	95f3908e44	ceph-handler: replace fuser by /proc/net/unix We're using fuser command to see if a process is using a ceph unix socket file. But the fuser command runs through every PID present in /proc/<PID> to see if one of them is using the file. On a system running thousands processes, the fuser command can take a long time to finish. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `da9891da1e`)	2019-06-12 23:00:21 +02:00
Guillaume Abrioux	db90debcc7	validate: fail in check_devices at the right task see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `771648304d`)	2019-06-10 08:09:58 +02:00
Dimitri Savineau	0b653ee5b4	update default rhcs values and docs The RHCS documentation mentioned in the default values and group_vars directory refers to RHCS 2.x while it should be 3.x. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1702732 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-06-04 14:18:23 +02:00
Guillaume Abrioux	5053f32c15	osds: allow passing devices by path ceph-volume didn't work when the devices were passed by path. Since it now supports it, let's allow this feature in ceph-ansible. Closes: #3812 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `8f2c45dfd3`)	2019-05-09 14:21:43 +02:00
Dimitri Savineau	2fa8099fa7	osd: set default bluestore_wal_devices empty We only need to set the wal dedicated device when there's three tiers of storage used. Currently the block.wal partition will also be created on the same device as block.db. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1685253 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-25 07:13:38 +00:00
Dimitri Savineau	7418999638	ceph-mds: Increase cpu limit to 4 In containerized deployment the default mds cpu quota is too low for production environment. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `1999cf3d19`)	2019-04-24 21:44:23 +00:00
Dimitri Savineau	54128db5cd	ceph-osd: Fix merge conflict from mergify The PR #3916 was merged automatically by mergify even if there was a conflict in the ceph-osd-run.sh.j2 template. This commit resolves the conflict. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-24 12:41:23 -04:00
Dimitri Savineau	3ae2a687ed	ceph-osd: Increase cpu limit to 4 In containerized deployment the default osd cpu quota is too low for production environment using NVMe devices. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `c17106874c`) # Conflicts: # roles/ceph-osd/templates/ceph-osd-run.sh.j2	2019-04-24 16:02:28 +00:00
Matthew Vernon	1556d802ff	ceph-mon: increase timeout waiting for admin and bootstrap keys With a large and/or busy cluster, it can take significantly more than 30s for a restarted monitor to get to the point where `ceph-create-keys` returns successfully. A recent upgrade of our production cluster failed here because it took a couple of minutes for the newly-upgraded `mon` to be ready. So increase the timeout significantly. This patch is applied to stable-3.2, because the affected code is refactored in stable-4.0 and ceph-create-keys is no longer called. Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>	2019-04-12 17:03:39 +00:00
Dimitri Savineau	56215d7688	ceph-mds: Set application pool to cephfs We don't need to use the cephfs variable for the application pool name because it's always cephfs. If the cephfs variable is set to something other than the default value it will break the application pool task. Resolves: #3790 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d2efb7f02b`)	2019-04-11 15:38:14 +00:00
Guillaume Abrioux	c5c354a61a	remove all NBSPs char in stable-3.2 branch this can cause issues, let's replace all of these chars with real spaces. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-04-10 13:27:48 +02:00
Matthew Vernon	a8c9b65d13	UCA: Uncomment UCA variables in defaults, fix consequent breakage The Ubuntu Cloud Archive-related (UCA) defaults in roles/ceph-defaults/defaults/main.yml were commented out, which means if you set `ceph_repository` to "uca", you get undefined variable errors, e.g. ``` The task includes an option with an undefined variable. The error was: 'ceph_stable_repo_uca' is undefined The error appears to have been in '/nfs/users/nfs_m/mv3/software/ceph-ansible/roles/ceph-common/tasks/installs/debian_uca_repository.yml': line 6, column 3, but may be elsewhere in the file depending on the exact syntax problem. The offending line appears to be: - name: add ubuntu cloud archive repository ^ here ``` Unfortunately, uncommenting these results in some other breakage, because further roles were written that use the fact of `ceph_stable_release_uca` being defined as a proxy for "we're using UCA", so try and install packages from the bionic-updates/queens release, for example, which doesn't work. So there are a few `apt` tasks that need modifying to not use `ceph_stable_release_uca` unless `ceph_origin` is `repository` and `ceph_repository` is `uca`. Closes: #3475 Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk> (cherry picked from commit `9dd913cf8a`)	2019-04-09 16:54:37 +00:00
Dimitri Savineau	efa0083f3c	ceph-osd: Drop memory flag with bluestore Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dc1c0dcee2`)	2019-04-09 13:26:20 +00:00
Dimitri Savineau	bbb8ca6643	mon/rgw: use last ipv6 address When using monitor_address_block or radosgw_address_block variables to configure the mon/rgw address we're getting the first ip address from the ansible facts present in that cidr. When there's VIP on that network the first filter could return the wrong value. This seems to affect only IPv6 setup because the VIP addresses are added to the ansible facts at the beginning of the list. This is the opposite (at the end) when using IPv4. This causes the mon/rgw processes to bind on the VIP address. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680155 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-04-09 06:17:27 +02:00
Ali Maredia	e943288cae	rgw multisite: add more than 1 rgw to the master or secondary zone Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664869 Signed-off-by: Ali Maredia <amaredia@redhat.com> (cherry picked from commit `37f46a8c5d`)	2019-04-06 08:50:30 +00:00
Dimitri Savineau	d1b3d18af1	radosgw: Raise cpu limit to 8 In containerized deployment the default radosgw quota is too low for production environment. This is causing performance degradation compared to bare-metal. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1680171 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d3ae9fd05f`)	2019-04-04 19:14:28 +02:00
Guillaume Abrioux	b92c826661	defaults: change default value for ceph_docker_image_tag Since nautilus has been released, it's now the latest stable release, it means the tag `latest` now refers to nautilus. `stable-3.2` isn't intended to deploy nautilus, therefore, we should change the default value for this variable to the latest release stable-3.2 is able to deploy (mimic). Closes: #3734 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-03-21 18:37:21 +00:00
Dimitri Savineau	e4a71eabd9	ceph-osd: Ensure lvm2 is installed When using osd_scenario lvm, we never check if the lvm2 package is present on the host. When using containerized deployment and docker on CentOS/RedHat this package will be automatically installed as a dependency but not for Ubuntu distribution. OSD deployed via ceph-volume require the lvmetad.socket to be active and running. Resolves: #3728 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `179fdfbc19`)	2019-03-20 22:59:28 +00:00
Guillaume Abrioux	d3f6556041	osd: backward compatibility with old disk_list.sh location Since all files in container image have moved to `/opt/ceph-container` this check must look for new AND the old path so it's backward compatible. Otherwise it could end up by templating an inconsistent `ceph-osd-run.sh`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `987bdac963`)	2019-03-18 21:56:53 +00:00
Dimitri Savineau	46e8898093	ceph-validate: fail if there's no ipaddr available in monitor_address_block subnet When using monitor_address_block to determine the ip address of the monitor node, we need an ip address available in that cidr to be present in the ansible facts (ansible_all_ipv[46]_addresses). Currently we don't check if there's an ip address available during the ceph-validate role. As a result, the ceph-config role fails due to an empty list during ceph.conf template creation but the error isn't explicit. TASK [ceph-config : generate ceph.conf configuration file] ***** fatal: [0]: FAILED! => {"msg": "No first item, sequence was empty."} With this patch we will fail before the ceph deployment with an explicit failure message. Resolves: rhbz#1673687 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `5c39735be5`)	2019-03-18 18:31:18 +00:00
Gregory Orange	86e39a29c8	Change docker_container parameter network to network_mode Addressing "populate kv_store with custom ceph.conf": Unsupported parameters for (docker_container) module. Looking at https://docs.ansible.com/ansible/latest/modules/docker_container_module.html shows that the correct parameter is network_mode, not network. Signed-off-by: Gregory Orange <gregoryo2014@users.noreply.github.com>	2019-03-18 13:23:10 +00:00
Dimitri Savineau	bfa99cdd53	Set the default crush rule in ceph.conf Currently the default crush rule value is added to the ceph config on the mon nodes as an extra configuration applied after the template generation via the ansible ini module. This implies two behaviors: 1/ On each ceph-ansible run, the ceph.conf will be regenerated via ceph-config+template and then ceph-mon+ini_file. This leads to a non necessary daemons restart. 2/ When other ceph daemons are collocated on the monitor nodes (like mgr or rgw), the default crush rule value will be erased by the ceph.conf template (mon -> mgr -> rgw). This patch adds the osd_pool_default_crush_rule config to the ceph template and only for the monitor nodes (like crush_rules.yml). The default crush rule id is read (if exist) from the current ceph configuration. The default configuration is -1 (ceph default). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1638092 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `d8538ad4e1`)	2019-03-14 14:48:03 +00:00
Dimitri Savineau	2f3206abeb	ceph-osd: Install numactl package when needed With `3e32dce` we can run OSD containers with numactl support. When using numactl command in a containerized deployment we need to be sure that the corresponding package is installed on the host. The package installation is only executed when the ceph_osd_numactl_opts variable isn't empty. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `b7f4e3e7c7`)	2019-03-12 08:14:47 +00:00
Guillaume Abrioux	34086ec233	osd: support numactl options on OSD activate This commit adds OSD containers activate with numactl support. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1684146 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `b3eb9206fa`)	2019-03-11 09:50:29 +00:00
VasishtaShastry	2393d82306	Extends check_devices tasks to non-collocated and lvm-batch scenarios Tuned name of a task and error message to make it more user understandable Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit `34c25ef49b`)	2019-03-01 04:06:57 +00:00
ToprHarley	d1051c8e55	Convert interface names to underscores Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881 Signed-off-by: Tomas Petr <tpetr@redhat.com> (cherry picked from commit `573adce7dd`)	2019-02-28 19:02:32 +00:00
Guillaume Abrioux	de3465b6a3	osd: add ipc=host in systemd template for containers in addition to `15812970f0` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d5be83e504`)	2019-02-28 13:48:39 +00:00
fpantano	1033411512	Removed not needed mountpoint and removed ubuntu section Referring to BZ#1683290, as dsavineau suggests, being this bug tripleO specific, removed the ubuntu section and removed useless mountpoints. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290 Signed-off-by: fpantano <fpantano@redhat.com> (cherry picked from commit `21fad7ced3`)	2019-02-28 12:31:23 +00:00
fpantano	9b843c24f9	Added to the ceph-radosgw service template the ca-trust volume avoiding to expose useless information. This bug is referred to the following bugzilla: Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290 Signed-off-by: fpantano <fpantano@redhat.com> (cherry picked from commit `0c1944236b`)	2019-02-28 12:31:23 +00:00
Kevin Coakley	2005d857df	Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive Set directories to 755 and files to 644 to /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively instead of setting files and directories to 755 recursively. The ceph mon process writes files to this path with permissions 644. This update stops ansible from updating the permissions in /var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes a file and increases idempotency. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683997 Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu> (cherry picked from commit `d327681b99`)	2019-02-28 10:52:04 +00:00
Dimitri Savineau	77596c791d	mon: Move client admin variable to defaults There's no need to set the client_admin_ceph_authtool_cap variable via a set_fact task. Instead we can set this in the role defaults. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `58a9d310d5`)	2019-02-27 20:03:13 +00:00
Dimitri Savineau	05c6ac4d78	mon: Add mds permissions to client.admin The administrator keyring needs full capabilities on mds like mon, osd and mgr. Without this, the client.admin key won't be able to run commands against mds (like ceph tell mds.0 session ls) Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit `dd7b7604de`)	2019-02-27 20:03:13 +00:00
Guillaume Abrioux	8cc75e516c	common: do not override ceph_release when ceph_repository is 'rhcs' We shouldn't reset `ceph_release` with `ceph_stable_release` when `ceph_repository` is `rhcs` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `2b60a35634`)	2019-02-21 13:03:16 +00:00
Guillaume Abrioux	d15b055854	osd: make the 'wait for all osd to be up' task configurable introduce two new variables to make the check that 'wait for all osd to be up' configurable. It's possible that for some deployments, OSDs can take longer to be seen as UP and IN. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `21e5db8982`)	2019-02-20 16:53:06 +00:00
David Waiting	eba80adb1a	ensure at least one osd is up The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing. The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0). This is related to Bugzilla 1578086. In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing. Signed-off-by: David Waiting <david_waiting@comcast.com> (cherry picked from commit `3930791cb7`)	2019-02-19 19:02:16 +00:00
Patrick C. F. Ernzer	a43c68df7d	setup_ntp: call handler to disable ntpd if chronyd used The task setup chronyd called the handler disable chronyd, which of course defeats the purpose. Changing the task to disable ntpd instead fixes the issue of chronyd being disabled after it got enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1673664 Fixes: #3582 Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com (cherry picked from commit `c605ff6a68`)	2019-02-15 09:09:36 +00:00
Guillaume Abrioux	6200f90ab2	iscsi: fix permission denied error Typical error: ``` fatal: [iscsi-gw0]: FAILED! => msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key''' ``` `become: True` is not needed on the following task: `copy crt file(s) to gateway nodes`. Since it's already set in the main playbook (site.yml/site-container.yml) The thing is that the files get generated in the 'fetch_directory' with root user because there is a 'delegate_to' + we run the playbook with `become: True` (from main playbook). The idea here is to create files under ansible user so we can open them later to copy them on the remote machine. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `9d590f4339`)	2019-02-11 16:17:44 +00:00
Leah Neukirchen	d855cb2595	Fix uses of default(omit) with string concatenation When {{omit}} is concatenated with another string, it expands to something like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d. However, in these use-cases we need an empty string. Regression introduced in `d53f55e807`. Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>	2019-02-08 11:01:11 +00:00
Sébastien Han	7db797d8df	osd: expose udev into the container In order to be able to retrieve udev information, we must expose its socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `997667a873`)	2019-02-06 00:37:11 +00:00
Guillaume Abrioux	303cc85754	osd: bind mount /var/run/udev/ without this, the command `ceph-volume lvm list --format json` hangs and takes a very long time to complete. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7ade032807`)	2019-02-06 00:37:11 +00:00
Guillaume Abrioux	af17e0dfbb	override ceph_release with ceph_stable_release when `ceph_origin` is set to `'repository'` and `ceph_repository` to `'community'` we need to ensure `ceph_release` reflect `ceph_stable_release`. `4a3f180f9d` simply removed the override while it should just have to be run only when the condition mentioned above is satisfied. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0bfefdd5bc`)	2019-01-24 14:18:34 +00:00
Guillaume Abrioux	e29cdd0a61	config: remove code related to ceph release prior to luminous This part of the code is not needed since ceph-ansible@master is intended to deploy ceph@master only. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1bbdde272f`)	2019-01-24 14:18:34 +00:00
Guillaume Abrioux	eaa92f7e55	ceph-default: rm useless condition This condition is useless and it's also creating issues we don't see in our CI. ceph_release is set by either ceph-common or ceph-docker-common so let's keep it this way. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379 (cherry picked from commit `e9188cd202`) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-01-24 14:18:34 +00:00
Noah Watkins	e57e2d98a1	start_osds: use list instead of keys (re-introduce) the python3 fix merged by: https://github.com/ceph/ceph-ansible/pull/3346 was reintroduced a few days later by: `82a6b5adec` and this patch fixes it again :) Signed-off-by: Noah Watkins <nwatkins@redhat.com> (cherry picked from commit `3cf5fd2c3e`)	2019-01-16 15:48:35 +00:00
Sébastien Han	04d8002614	switch: do not fail on missing key Some people use the switch playbook to perform upgrade so they end up in the same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1650572 This is applying the same fix as `729744c6a8`. We don't want to fail on keys that are not present since they will get created after the mons are updated. They will be created by the task "create potentially missing keys (rbd and rbd-mirror)". Signed-off-by: Sébastien Han <seb@redhat.com>	2019-01-14 18:54:46 +00:00
Rishabh Dave	4e94d11aa7	ceph-infra: remove ntp_rpm.yml and ntp_debian.yml This commit fixes the merge conflict that occurred during the auto-backport and auto-merge of the commit `488281187e`. Also please note that the commit `488281187e` was merged (on PR 3477) "as it is" (despite merge conflicts) which was not supposed to be the case ideally. This had a side-effect that the feature of supporting multiple NTP daemons (new ones are namely chronyd and timesyncd) was also backported which is itself against the convention. For consistency's sake the feature was backported to stable-3.1 as well. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-01-09 22:15:18 +01:00
Guillaume Abrioux	416b503476	introduce new role ceph-facts sometimes we play the whole role `ceph-defaults` just to access the default value of some variables. It means we play the `facts.yml` part in this role while it's not desired. Splitting this role will speedup the playbook. Closes: #3282 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `0eb56e36f8`)	2019-01-07 09:14:10 +01:00
Bruceforce	5c618d7084	The nfs_ganesha_dev_apt_repo variable was set incorrectly in task "fetch nfs-ganesha development repository" This has to be pushed directly to stable-3.2 since master has diverged Signed-off-by: Bruceforce <Bruceforce@users.noreply.github.com>	2019-01-07 08:03:19 +00:00
Rishabh Dave	b2024899b9	ceph-infra: disable unrequired NTP services When one of the currently supported NTP services has been set up, disable rest of the NTP services on Ceph nodes. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `6fa757d343`)	2019-01-04 13:52:19 +00:00
Rishabh Dave	488281187e	ceph-infra: merge ntp_debian.yml and ntp_rpm.yml Merge ntp_debian.yml and ntp_rpm.yml into one (the new file is called setup_ntp.yml) since they are almost identical. Also avoid repetition of the common setup step for ntpd and chronyd services. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit `b03ab60742`) # Conflicts: # roles/ceph-infra/tasks/ntp_debian.yml # roles/ceph-infra/tasks/ntp_rpm.yml	2019-01-04 13:52:19 +00:00
Kai Wembacher	e2852eb40e	add support for rocksdb and wal on the same partition in non-collocated Signed-off-by: Kai Wembacher <kai@ktwe.de> (cherry picked from commit `a273ed7f60`)	2018-12-20 14:21:14 +01:00
Guillaume Abrioux	c3a2320e01	revert infra: don't restart firewalld if unit is masked If firewalld unit is masked, setting `configure_firewall: false` is enough Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655059 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `1cff1f9806`)	2018-12-04 17:31:31 +01:00
Sébastien Han	8d1c67beb2	osd: discover osd_objectstore on the fly Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for existing clusters as their config will be changed. Typically, if an OSD was prepared with ceph-disk on filestore and we change the default objectstore to bluestore, the activation will fail. The flag osd_objectstore should only be used for the preparation, not activation. The activate in this case detects the osd objectstore which prevents failures like the one described above. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `4c51130198`)	2018-12-04 09:01:50 +00:00
Sébastien Han	1151521784	ceph-osd: change jinja condition If an existing cluster runs this config, and has ceph-disk OSD, the `expose_partitions` won't be expected by jinja since it's inside the 'old' if. We need it as part of the osd_scenario != 'lvm' condition. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `bef522627e`)	2018-12-04 09:01:50 +00:00
Sébastien Han	729744c6a8	rolling_update: do not fail on missing keys We don't want to fail on keys that are not present since they will get created after the mons are updated. They will be created by the task "create potentially missing keys (rbd and rbd-mirror)". Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572 Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `ebc901c6af`)	2018-12-03 13:03:33 +01:00
Noah Watkins	e8b10f47dc	rgw: use correct default rgw frontend address since 0.0.0.0 is the default radosgw address (not 'address'), not configuring an address explicitly, and instead configuring the radosgw interface, would result in 0.0.0.0 being used, instead of falling through to section that inspects the interface config option. backport note: this cannot be cherry-picked from master since this code doesn't exist in master. fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1655131 Signed-off-by: Noah Watkins <nwatkins@redhat.com>	2018-12-01 20:09:46 +00:00
Sébastien Han	452069cb3a	osd: manage legacy ceph-disk non-container startup The code is now able (again) to start osds that were configured with ceph-disk on a non-container scenario. Closes: https://github.com/ceph/ceph-ansible/issues/3388 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 23:30:21 +01:00
Guillaume Abrioux	8d93007e56	config: write jinja comment with appropriate syntax jinja comment should be written using the jinja syntax `{# ... #}` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `a86c2b8526`)	2018-11-29 21:19:41 +01:00
Guillaume Abrioux	316e49c6d7	client: change default pool size default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `ed42262b37`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	1077ae0060	defaults: change default size for openstack pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `6d1fe32998`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	a4db9bd6e8	defaults: change for default pool size for cephfs_pools default pool size should match the real default that is defined in ceph itself. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `fdc438dd0d`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	65699e4558	defaults: add ceph related vars file This is to add a granularity level. We can have ceph specific variables that user shouldn't have to change here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f1735e9bb0`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	f0195e97ed	refact osd pool size customization Add real default value for osd pool size customization. Ceph itself has an `osd_pool_default_size` default value to `3`. If users don't specify a pool size in various pools definition within ceph-ansible, we should default to `3`. By the way, this kind of condition isn't really clear: ``` when: - rbd_pool_size | default ("") ``` we should try to get the customized value then default to what is in `osd_pool_default_size` (which has its default value pointing to `ceph_osd_pool_default_size` (`3`) as well) and compare it to `ceph_osd_pool_default_size`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `7774069d45`)	2018-11-29 01:49:05 +00:00
Guillaume Abrioux	68b2ad11ee	mon: move `osd_pool_default_pg_num` in `ceph-defaults` `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it wouldn't be defined. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `d4c0960f04`)	2018-11-29 01:49:05 +00:00
Sébastien Han	9b5a93e3a5	osd: re-introduce disk_list check This commit `4cc1506303 (diff-51bbe3572e46e3b219ad726da44b64ebL13)` accidentally removed this check. This is a must have for ceph-disk based containerized OSDs. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-11-29 00:31:13 +01:00
Guillaume Abrioux	659f2c60b5	validate: change default value for `radosgw_address` change default value of `radosgw_address` to keep consistency with `monitor_address`. Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine if it has to run `check_eth_rgw.yml`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1600227 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `e4869ac8bd`)	2018-11-28 23:54:06 +01:00
Guillaume Abrioux	4cc1506303	osd: commonize start_osd code since `ceph-volume` introduction, there is no need to split those tasks. Let's refact this part of the code so it's clearer. By the way, this was breaking rolling_update.yml when `openstack_config: true` playbook because nothing ensured OSDs were started in ceph-osd role (In `openstack_config.yml` there is a check ensuring all OSD are UP which was obviously failing) and resulted with OSDs on the last OSD node not started anyway. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `f7fcc012e9`)	2018-11-28 23:11:46 +01:00
Sébastien Han	2fca8555cc	handler: show unit logs on error This will tremendously help debugging daemons that fail on restart by showing the systemd unit logs. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `a9b337ba66`)	2018-11-27 12:44:15 +00:00
Guillaume Abrioux	1a1886a442	config: convert _osd_memory_target to int ceph.conf doesn't accept float value. Typical error seen: ``` $ sudo ceph daemon osd.2 config get osd_memory_target Can't get admin socket path: unable to get conf option admin_socket for osd.2: parse error setting 'osd_memory_target' to '7823740108,8' (strict_si_cast: unit prefix not recognized) ``` This commit ensures the value inserted in ceph.conf will be an integer. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `68dde424f6`)	2018-11-21 15:35:55 +00:00
Guillaume Abrioux	abdc245ceb	infra: don't restart firewalld if unit is masked if firewalld.service systemd unit is masked, the handler will fail when trying to restart it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650281 (cherry picked from commit `63b9835cbb`) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-19 17:32:44 +01:00
Neha Ojha	c96af4bac9	osd_memory_target: standardize unit and fix calculation * The default value of osd_memory_target used by ceph is 4294967296 bytes, so use the same as ceph-ansible default. * Convert ansible_memtotal_mb to bytes to calculate osd_memory_target Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit `10538e9a23`)	2018-11-19 10:51:05 +00:00
Guillaume Abrioux	f5d8701ed8	client: fix a typo in create_users_keys.yml `cd1e4ee024` introduced a typo. This commit fixes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit `393ab94728`)	2018-11-17 20:59:11 +00:00
Guillaume Abrioux	62d2ddafd4	validate: allow stable-3.2 to run with ansible 2.4 Although this is not officially supported, this commit allows `stable-3.2` to run against ansible 2.4. This should ease the transition in RHOSP. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-11-16 08:57:00 +00:00
Jason Dillaman	3b40e2bc87	igw: add support for IPv6 Signed-off-by: Jason Dillaman <dillaman@redhat.com> (cherry picked from commit `0aff0e9ede`) Conflicts: library/igw_purge.py: trivial resolution roles/ceph-iscsi-gw/library/igw_purge.py: trivial resolution	2018-11-13 17:35:58 +00:00
Mike Christie	702f2baccc	igw: open iscsi target port Open the port the iscsi target uses for iscsi traffic. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `5ba7d1671e`)	2018-11-12 10:46:41 +00:00
Mike Christie	44ee5c7495	igw: use api_port variable for firewall port setting Don't hard code api port because it might be overridden by the user. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `e2f1f81de4`)	2018-11-12 10:46:41 +00:00
Mike Christie	db576f6f0e	igw: fix firewall iscsi_group_name check The firewall setup for igw is not getting set up because iscsi_group_name does not exist. It should be iscsi_gw_group_name. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `a4ff52842c`)	2018-11-12 10:46:41 +00:00
Mike Christie	c843ea1d92	igw: Fix default api port The default igw api port is 5000 in the manual setup docs and ceph-iscsi-config package so this syncs up ansible. Signed-off-by: Mike Christie <mchristi@redhat.com> (cherry picked from commit `a10853c5f8`)	2018-11-12 10:46:41 +00:00
Sébastien Han	12ce311da5	rbd-mirror: enable ceph-rbd-mirror.target Without this the daemon will never start after reboot. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit `b7a791e902`)	2018-11-09 16:48:35 +01:00
Guillaume Abrioux	d5409109fb	rgw: move multisite default variables in ceph-defaults Move all rgw multisite variables in ceph-defaults so ceph-validate can go through them. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 17:41:35 +01:00
Guillaume Abrioux	547e90f281	rgw: move multisite related tasks after docker/main.yml We must play this task after the container has started otherwise rgw_multisite tasks will fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	710e11668d	rgw: add rgw_multisite for containerized deployments run commands on containers when containerized deployments. (At the moment, all commands are run on the host only) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	fe88c89c9c	validate: remove check on rgw_multisite_endpoint_addr definition since `rgw_multisite_endpoint_addr` has a default value to `{{ ansible_fqdn }}`, it shouldn't be mandatory to set this variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Ali Maredia	59e6d04f9b	rgw: add ceph-validate tasks for multisite, other fixes - updated README-MULTISITE - re-added destroy.yml - added tasks in ceph-validate to make sure the rgw multisite vars are set Signed-off-by: Ali Maredia <amaredia@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	77d5d128c3	rgw: add a dedicated variable for multisite endpoint We should give users the possibility to set the IP they want as multisite endpoint, setting the default value to `{{ ansible_fqdn }}` to not force them to set this variable. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-30 14:00:28 +01:00
Ali Maredia	474f151450	rgw: update rgw multisite tasks - remove destroy tasks - cleanup conditionals and syntax - remove unnecessary realm pulls - enable multisite to be tested in automated testing infra - add multisite related vars to main.yml and group_vars - update README-MULTISITE - ensure all `radosgw-admin` commands are being run on a mon Signed-off-by: Ali Maredia <amaredia@redhat.com>	2018-10-30 14:00:28 +01:00
Guillaume Abrioux	748342f5b6	roles: fix _docker_memory_limit default value append 'm' suffix to specify the unit size used in all `_docker_memory_limit`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2018-10-29 14:59:09 +01:00
Neha Ojha	b7e4d4eb84	roles: do not limit docker_memory_limit for various daemons Since we do not have enough data to put valid upper bounds for the memory usage of these daemons, do not put artificial limits by default. This will help us avoid failures like OOM kills due to low default values. Whenever required, these limits can be manually enforced by the user. More details in https://bugzilla.redhat.com/show_bug.cgi?id=1638148 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1638148 Signed-off-by: Neha Ojha <nojha@redhat.com>	2018-10-29 14:59:09 +01:00
Sébastien Han	0e63f0f3c9	Merge branch 'master' into wip-rm-calamari	2018-10-29 14:50:37 +01:00
Sébastien Han	5ab90b358c	nfs: do not create the nfs user if already present Check if the user exists and skip its creation if true. Closes: https://github.com/ceph/ceph-ansible/issues/3254 Signed-off-by: Sébastien Han <seb@redhat.com>	2018-10-26 16:24:38 +00:00


2106 Commits (c409d6e96008cd431f1679d2582325f174c47879)