ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	e150df789e	ceph-facts: fix read osd pool default crush fact We don't need to use run_once on that task when having running monitors otherwise the read task could be skipped and the set task will fail. The conditional check 'crush_rule_variable.rc == 0' failed. The error was: error while evaluating conditional (crush_rule_variable.rc == 0): 'dict object' has no attribute 'rc' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1898856 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-18 12:55:43 -05:00
Benoît Knecht	c5f7343a2f	ceph-facts: Fix osd_pool_default_crush_rule fact The `osd_pool_default_crush_rule` is set based on `crush_rule_variable`, which is the output of a `grep` command. However, two consecutive tasks can set that variable, and if the second task is skipped, it still overwrites the `crush_rule_variable`, leading the `osd_pool_default_crush_rule` to be set to `ceph_osd_pool_default_crush_rule` instead of the output of the first task. This commit ensures that the fact is set right after the `crush_rule_variable` is assigned, before it can be overwritten. Closes #5912 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>	2020-11-13 09:36:49 +01:00
Dimitri Savineau	3f9081931f	rgw/rbdmirror: use service dump instead of ceph -s The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the rgw/rbdmirror services status, we're only using the servicemap structure in the ceph status output. To optimize this, we could use the ceph service dump command which contains the same needed information. This command returns less information and is slightly faster than the ceph status command. $ ceph status -f json \| wc -c 2001 $ ceph service dump -f json \| wc -c 1105 $ time ceph status -f json > /dev/null real 0m0.557s user 0m0.516s sys 0m0.040s $ time ceph service dump -f json > /dev/null real 0m0.454s user 0m0.434s sys 0m0.020s Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-11-03 09:05:33 +01:00
Guillaume Abrioux	1cc9666c09	common: drop `fetch_directory` feature This commit drops the `fetch_directory` feature. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-21 13:22:16 +02:00
Guillaume Abrioux	c101cb3931	defaults: change defaults value this commit changes defaults value in default pool definitions. there's no need to define `pg_num`, `pgp_num`, `size` and `min_size`, `ceph_pool` module will use the current default if needed. This also drops the 3 following `set_fact` in `ceph-facts`: - osd_pool_default_pg_num, - osd_pool_default_pgp_num, - osd_pool_default_size_num Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-10-02 07:42:40 +02:00
Seena Fallah	ff9f4d138f	ceph-facts: add get default crush rule from running monitor In case of deploying a new monitor node to an existing cluster, osd_pool_default_crush_rule should be taken from a running monitor because the ceph-osd role won't be run and the new monitor will have a different osd_pool_default_crush_rule from other monitors. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 09:27:58 -04:00
Seena Fallah	69f7e35382	ceph-facts: check for mon socket in its own host delegate to its own host after checking mon socket to find out if mon socket is in use or not. Signed-off-by: Seena Fallah <seenafallah@gmail.com>	2020-09-29 00:21:12 +02:00
Dimitri Savineau	50104650e7	add missing boolean filter Otherwise this will generate an ansible warning about the missing filter. [DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour will go away and you might need to add \|bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-28 20:45:01 +02:00
Tyler Bishop	ee4b8804ae	facts: support device aliases for (dedicated\|bluestore_wal)_devices Just like `devices`, this commit adds support for linux device aliases for `dedicated_devices` and `bluestore_wal_devices`. Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>	2020-09-25 19:59:45 +02:00
Dimitri Savineau	f63022dfec	ceph-facts: only get fsid when monitor are present When running the rolling_update playbook with an inventory without monitor nodes defined (like external scenario) then we can't retrieve the cluster fsid from the running monitor. In this scenario we have to pass this information manually (group_vars or host_vars). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-09-10 13:19:44 -04:00
Guillaume Abrioux	f0fe193d8e	facts: refact and optimize memory consumption there's no need to run this task on all nodes. This uses too much memory for nothing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-08-20 11:16:26 -04:00
Dimitri Savineau	4e84b4beed	ceph-facts: remove mds_name fact The mds_name fact always gets the ansible_hostname value so we don't need to have a dedicated fact for this and use the ansible_hostname fact instead. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-07-23 17:02:43 +02:00
Guillaume Abrioux	f8a951f50c	facts: fix broken facts when using --limit This commit fixes these tasks when --limit is used. It makes sure the fact is set on right nodes even when the playbook is run with `--limit` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-20 10:56:10 -04:00
Guillaume Abrioux	bcc673f66c	facts: refact `ceph_uid` fact There's no need to set this fact with a `set_fact` We can achieve this in `ceph-defaults` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-07-09 13:37:29 +02:00
Guillaume Abrioux	e5e81843e9	facts: always set ceph_run_cmd and ceph_admin_command always set these facts on monitor nodes whatever we run with `--limit`. Otherwise, playbook will fail when using `--limit` on nodes where these facts are used on a delegated task to monitor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-05-15 10:53:15 +02:00
Guillaume Abrioux	378405e328	mds: fix --limit run against mds nodes This commit fixes --limit runs against mds nodes. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-14 10:42:43 -04:00
Guillaume Abrioux	4bcc52cb2a	osd: fix monitor_name error when scaling out OSDs This commit fixes a bug when trying to scale out osd nodes when `crush_rule_config` is enabled. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-04-09 13:46:40 -04:00
Guillaume Abrioux	1bb9860dfd	osd: use default crush rule name when needed When `rule_name` isn't set in `crush_rules` the osd pool creation will fail. This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with the default crush rule name. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-03-31 14:49:38 -04:00
Dimitri Savineau	9d3b49293d	purge: stop rgw instances by iteration It looks like that the service module doesn't support wildcard anymore for stopping/disabling multiple services. fatal: [rgw0]: FAILED! => changed=false msg: 'This module does not currently support using glob patterns, found '''' in service name: ceph-radosgw@' ...ignoring Instead we should iterate over the rgw_instances list. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-03-02 16:32:06 +01:00
Dimitri Savineau	298ba0bf03	ceph-facts: set devices osd_auto_discovery on OSDs We only need to set the devices fact with osd_auto_discovery on OSD nodes. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-03 16:23:38 +01:00
Dimitri Savineau	ed461544a7	ceph-facts: remove is_podman fact This was used before the CentOS 8 requirement when using CentOS 7 atomic which has both docker and podman installed. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-02-03 10:11:03 -05:00
Dimitri Savineau	1fcafffdad	ceph-facts: fix _container_exec_cmd fact value When using different name between the inventory_hostname and the ansible_hostname then the _container_exec_cmd fact will get a wrong value based on the inventory_hostname instead of the ansible_hostname. This happens when the ceph cluster is already running (update/upgrade). Later the container exec commands will fail because the container name is wrong. We should always set the _container_exec_cmd based on the ansible_hostname fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1795792 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-29 08:44:59 +01:00
Vytenis Sabaliauskas	ed1eaa1f38	ceph-facts: Fix for 'running_mon is undefined' error, so that fact 'running_mon' is set once 'grep' successfully exits with 'rc == 0' Signed-off-by: Vytenis Sabaliauskas <vytenis.sabaliauskas@protonmail.com>	2020-01-23 16:27:11 +01:00
Dimitri Savineau	7f997e623a	ceph-facts: move facts to defaults value There's no need to define a variable via a fact if we can do it via a default value. Using a fact could be interesting to override the default value on some condition. - ceph_uid could be set to 167 by default because it's only different on non containerized deployment on Debian/Ubuntu. - rbd_client_directory_{owner,group,mode} could be set to ceph,ceph,0770 by default instead of null as we are doing in the facts. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-16 13:57:11 -05:00
Dimitri Savineau	4e7fb5d45a	drop use_fqdn variables This has been deprecated in the previous releases. Let's drop it. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-15 11:32:39 +01:00
Guillaume Abrioux	2592a1e1e8	facts: fix osp/ceph external use case `d6da508a9b` broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-13 12:06:06 -05:00
Dimitri Savineau	f940e695ab	ceph-facts: move grafana fact to dedicated file We don't need to execute the grafana fact every time but only during the dashboard deployment. Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2020-01-13 12:05:57 -05:00
Guillaume Abrioux	86f3eeb717	mon: support replacing a mon We must pick up a mon which actually exists in ceph-facts in order to detect if a cluster is running. Otherwise, it will state no cluster is already running which will end up deploying a new monitor isolated in a new quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-09 12:59:12 -05:00
Guillaume Abrioux	5adb735c78	facts: use correct python interpreter that task is delegated on the first mon so we should always use the `discovered_interpreter_python` from that node. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2020-01-08 10:06:43 -05:00
Dimitri Savineau	68c6f39349	ceph-facts: set use_new_ceph_iscsi on iscsi nodes We don't need to set the use_new_ceph_iscsi fact on other nodes than those present in the iscsigws group. Also remove the duplicate iscsi_gw_group_name condition already present on the include_task. Finally validate the ansible distribution as the first task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-10 23:57:03 +01:00
Dimitri Savineau	12aa8f4025	ceph-facts: move ntp/chrony facts to ceph-infra The ntp/chrony facts are only used in the ceph-infra role so we don't really need to set them in the ceph-facts roles. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-05 19:46:59 +01:00
Guillaume Abrioux	fe5ffe589e	facts: isolate container_binary facts in order to be able to call container_binary without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 13:29:52 +01:00
Guillaume Abrioux	23b1f43897	facts: avoid duplicated element in devices list When using `osd_auto_discovery`, `devices` is built multiple times due to multiple runs of `ceph-facts` role. It ends up with duplicate instances of the same device in the list. Using `unique` filter when building the list fixes this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-27 16:35:41 +01:00
Guillaume Abrioux	fa9b42e98e	switch_to_containers: do not re-set `ceph_uid` This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook to avoid duplicated code. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-10-07 14:15:56 +02:00
Dimitri Savineau	ec3b687dc4	ceph-facts: use --admin-daemon to get fsid During the rolling_update scenario, the fsid value is retrieved from the current ceph cluster configuration via the ceph daemon config command. This command tries first to resolve the admin socket path via the ceph-conf command. Unfortunately this command won't work if you have a duplicate key in the ceph configuration even if it only produces a warning. As a result the task will fail. Can't get admin socket path: unable to get conf option admin_socket for mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined Instead of using ceph daemon we can use the --admin-daemon option because we already know the admin socket path value based on the ceph cluster and mon hostname values. Closes: #4492 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-02 10:07:13 +02:00
Dimitri Savineau	20b1a464ec	ceph-facts: update external grafana fact filter `e695efc` hasn't been updated with the changes introduced in `9bb11c7` so the ips_in_ranges filter isn't used for an external grafana instance. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-10-01 10:47:14 +02:00
Harald Jensås	e695efcaf7	Replace ipaddr() with ips_in_ranges() This change implements a filter_plugin that is used in the ceph-facts, ceph-validate roles and infrastructure-playbooks. The new filter plugin will return a list of all IP addresses that reside in any one of the given IP ranges. The new filter replaces the use of the ipaddr filter. ceph.conf already supports a comma separated list of CIDRs for the public_network and cluster_network options. Changes: [1] and [2] introduced a regression in ceph-ansible where public_network can no longer be a comma separated list of cidrs. With this change a comma separated list of subnet CIDRs can also be used for monitor_address_block and radosgw_address_block. [1] commit: `d67230b2a2` [2] commit: `20e4852888` Related-To: https://bugs.launchpad.net/tripleo/+bug/1840030 Related-To: https://bugzilla.redhat.com/show_bug.cgi?id=1740283 Closes: #4333 Please backport to stable-4.0 Signed-off-by: Harald Jensås <hjensas@redhat.com>	2019-09-27 10:11:53 +02:00
fmount	9bb11c7b2a	Inject ceph grafana dashboard layouts This change just adds the task to inject from the ceph dashboard mgr module the required layouts to show all the cluster metrics on the grafana instance. Since we're now able to push grafana layouts through the ceph mgr module command, the dashboards configuration template is no longer needed on containerized environments. This commit also fixes the Vagrantfile IP static assignment in the grafana section because it generates an issue (it's the same as the mgr instance). Finally, considering some deployments that use an external grafana server instance, we reworked the 'grafana_server_addr' assignment to address these requirements. Signed-off-by: fmount <fpantano@redhat.com>	2019-09-26 11:12:20 -04:00
fmount	81eb091533	Fix discovered_interpreter_python variable This change fixes the discovered_interpreter_python variable name that was "discovered_python_interpreter" and caused a failure in OSP deployments. Signed-off-by: fmount <fpantano@redhat.com>	2019-09-04 09:55:30 -04:00
Johannes Kastl	bd507fa147	set discovered_python_interpreter if ansible_python_interpreter is defined If the user has set the `ansible_python_interpreter`, ansible will not try to discover python, so `discovered_python_interpreter` will not be set. Solution: Set `discovered_python_interpreter` to `ansible_python_interpreter` if `ansible_python_interpreter` is defined Signed-off-by: Johannes Kastl <kastl@b1-systems.de>	2019-08-27 20:54:59 +02:00
Johannes Kastl	e1b9312084	facts: fix a typo This commit fixes a typo in roles/ceph-facts/tasks/facts.yml Signed-off-by: Johannes Kastl <kastl@b1-systems.de>	2019-08-22 18:08:28 +02:00
Guillaume Abrioux	4df92152c0	common: replace shell module there is no need to use `shell` in these tasks. Let's use `command`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	13815ad3ca	common: use discovered_interpreter_python fact in order to use the right binary name when using python cli in command or shell module. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-08-14 16:42:02 +02:00
Guillaume Abrioux	d67230b2a2	dashboard: use dedicated group only There's no need to add complexity and trying to fallback on other group. Let's deploy dashboard on all nodes present in grafana-server group. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-29 14:42:45 +02:00
Guillaume Abrioux	a781ce881c	iscsi: refact deprecated variables This commit moves some old variables into ceph-defaults so we can move the `use_new_ceph_iscsi` fact in ceph-facts role in order. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-07-03 22:13:19 +02:00
fmount	e655038743	Set grafana_server_addr fact for ipv6 scenarios. As the bz1721914 describes, the grafana_server_addr fact is not defined if ip_version used is ipv6. This commit adds the ip_version condition to set correctly this fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1721914 Signed-off-by: fmount <fpantano@redhat.com>	2019-06-26 15:47:22 +02:00
Guillaume Abrioux	366b309c12	facts: fix bug in grafana_server_addr fact setting If no grafana-server group is defined while an mgr group is, that task will fail because `hostvars[groups[grafana_server_group_name][0]` can't return anything since `groups['grafana-server']` will be a non existing key. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-26 10:49:30 +02:00
Guillaume Abrioux	46a2683944	facts: add a retry on get current fsid task sometimes it can happen the following task fails: ``` TASK [ceph-facts : get current fsid] ***************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-centos-container-update/roles/ceph-facts/tasks/facts.yml:78 Wednesday 19 June 2019 18:12:49 +0000 (0:00:00.203) 0:02:39.995 **** fatal: [mon2 -> mon1]: FAILED! => changed=true cmd: - timeout - --foreground - -s - KILL - 600s - docker - exec - ceph-mon-mon1 - ceph - --cluster - ceph - daemon - mon.mon1 - config - get - fsid delta: '0:00:00.239339' end: '2019-06-19 18:12:49.812099' msg: non-zero return code rc: 22 start: '2019-06-19 18:12:49.572760' stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory' stderr_lines: <omitted> stdout: '' stdout_lines: <omitted> ``` not sure exactly why since just before this task, mon1 seems to be well UP otherwise it wouldn't have passed the task `waiting for the containerized monitor to join the quorum`. As a quick fix/workaround, let's add a retry which allows us to get around this situation: ``` TASK [ceph-facts : get current fsid] *************************************** task path: /home/jenkins-build/build/workspace/ceph-ansible-scenario/roles/ceph-facts/tasks/facts.yml:78 Thursday 20 June 2019 15:35:07 +0000 (0:00:00.201) 0:03:47.288 ******* FAILED - RETRYING: get current fsid (3 retries left). changed: [mon2 -> mon1] => changed=true attempts: 2 cmd: - timeout - --foreground - -s - KILL - 600s - docker - exec - ceph-mon-mon1 - ceph - --cluster - ceph - daemon - mon.mon1 - config - get - fsid delta: '0:00:00.290252' end: '2019-06-20 15:35:13.960188' rc: 0 start: '2019-06-20 15:35:13.669936' stderr: '' stderr_lines: <omitted> stdout: \|- { "fsid": "153e159d-7ade-42a7-842c-4d04348b901e" } stdout_lines: <omitted> ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-06-20 13:13:04 -04:00
Rishabh Dave	9d88d3199f	ceph-infra: make chronyd default NTP daemon Since timesyncd is not available on RHEL-based OSs, change the default to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so set the Ansible fact accordingly. Fixes: https://github.com/ceph/ceph-ansible/issues/3628 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-06-13 14:53:22 -04:00
fmount	069076bbfd	Fix units and add ability to have a dedicated instance Few fixes on systemd unit templates for node_exporter and alertmanager container parameters. Added the ability to use a dedicated instance to deploy the dashboard components (prometheus and grafana). This commit also introduces the grafana_group_name variable to refer grafana group and keep consistency with the other groups. During the integration with TripleO some grafana/prometheus template variables resulted undefined. This commit adds the ability to check if the group exist and create, accordingly, different job groups in prometheus template. Signed-off-by: fmount <fpantano@redhat.com>	2019-06-10 18:18:46 +02:00


75 Commits (11b4bf5083639abea66a874ba86ac38a1b706ca6)