ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Dimitri Savineau	68c6f39349	ceph-facts: set use_new_ceph_iscsi on iscsi nodes We don't need to set the use_new_ceph_iscsi fact on other nodes than those present in the iscsigws group. Also remove the duplicate iscsi_gw_group_name condition already present on the include_task. Finally validate the ansible distribution as the first task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-10 23:57:03 +01:00
Guillaume Abrioux	8d0dc34ebe	defaults: fix a typo s/above/below Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-10 09:32:02 -05:00
Guillaume Abrioux	d682412e2a	ansible.cfg: do not enforce PreferredAuthentications There's no need to enforce PreferredAuthentications by default. Users can still choose to override the ansible.cfg with any additional parameter like this one to fit their infrastructure. Fixes: #4826 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-09 17:15:09 -05:00
Guillaume Abrioux	a234338eff	defaults: add a comment This commit isolates and adds an explicit comment about variables not intended to be modified by the user. Fixes: #4828 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-09 13:50:43 -05:00
Guillaume Abrioux	6d9ca6b05b	shrink-osd: support fqdn in inventory When using fqdn in inventory, that playbook fails because some tasks use the result of ceph osd tree (which returns the shortname) to look up data in hostvars[]. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779021 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-09 10:52:38 -05:00
Guillaume Abrioux	332c39376b	switch_to_containers: exclude clients nodes from facts gathering Just like site.yml and rolling_update, let's exclude client nodes from fact gathering. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-09 10:49:13 -05:00
Guillaume Abrioux	d245eb7e7d	dashboard: run node_export as privileged container Typical error: ``` type=AVC msg=audit(1575367499.582:3210): avc: denied { search } for pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0 ``` node_exporter needs to be run as privileged to avoid avc denied error since it gathers lot of information on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-09 09:40:13 -05:00
Dimitri Savineau	1a77dd7e91	ceph-validate: start with ansible version test It doesn't make sense to start validating the configuration if the ansible version isn't the right one. This commit moves the check_system task to be the first task in the ceph-validate role. The ansible version test tasks are moved to the top of this file. Also moving the iscsi kernel tests from check_system to check_iscsi file. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-09 09:35:03 +01:00
Dimitri Savineau	12aa8f4025	ceph-facts: move ntp/chrony facts to ceph-infra The ntp/chrony facts are only used in the ceph-infra role so we don't really need to set them in the ceph-facts roles. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-05 19:46:59 +01:00
Guillaume Abrioux	0756fa467d	defaults: change default value for dashboard_admin_password A recent change in ceph/ceph prevents the password from containing the username: `Error EINVAL: Password cannot contain username.` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-05 13:02:06 -05:00
Guillaume Abrioux	c7708eb458	update: restart iscsigws daemons after upgrade In containerized context, containers aren't stopped early in the sequence. It means they aren't restarted after the upgrade because the task is just checking the daemon status is started (eg: `state: started`). This commit also removes the task which ensures services are started because it's already done in the role ceph-iscsigw. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-05 13:02:06 -05:00
Guillaume Abrioux	451c5ca934	upgrade: add dashboard deployment when upgrading from RHCS 3, dashboard has obviously never been deployed and it forces us to deploy it later manually. This commit adds the dashboard deployment as part of the upgrade to RHCS 4. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779092 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-05 13:02:06 -05:00
Dimitri Savineau	014f51c2a4	ceph-defaults: exclude md devices from discovery The md devices (RAID software) aren't excluded from the devices list in the auto discovery scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-05 10:14:25 +01:00
Dimitri Savineau	89f6cc54a2	purge-cluster: add podman support The podman support was added to the purge-container-cluster playbook but containers are always used for the dashboard even on non containerized deployment. This commit adds podman support for purging the dashboard resources in the purge-cluster playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-04 14:15:12 -05:00
Dimitri Savineau	4a6d19dae2	tests: reduce max_mds from 3 to 2 Setting max_mds equal to the number of mds nodes generates a warning in the ceph cluster status: cluster: id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a health: HEALTH_WARN insufficient standby MDS daemons available (...) services: mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active} Let's use 2 active and 1 standby mds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-04 14:07:29 -05:00
Guillaume Abrioux	f5a81b1790	purge: fix symlink to purge-container-cluster ceph/ceph-ansible#4805 introduced a symlink to purge-container-cluster.yml playbook which is broken. This commit fixes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-04 09:38:34 +01:00
Guillaume Abrioux	7bc7e3669d	purge: rename playbook (container) Since we now support podman, let's rename the playbook so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 11:10:21 -05:00
Guillaume Abrioux	a8d76d72d7	dashboard: use fqdn url for active alert When using the shortname, the URL for active alert launches with short hostname and fails to connect to the server. This commit changes the template in order to use the fqdn. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 14:30:32 +01:00
Guillaume Abrioux	b18476a1a6	purge: do not try to stop docker when binary is podman If the container binary is podman, we shouldn't try to stop docker here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 13:29:52 +01:00
Guillaume Abrioux	fe5ffe589e	facts: isolate container_binary facts in order to be able to call container_binary without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 13:29:52 +01:00
Guillaume Abrioux	d23383a820	purge: remove docker_* task All containers are removed when systemd stops them. There is no need to call this module in purge container playbook. This commit also removes all docker_image task and remove all container images in the final cleanup play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-03 13:29:52 +01:00
Stanley Lam	ad7a5dad3f	Add option for HAproxy to act as an SSL frontend termination point for loadbalanced RGW instances. Signed-off-by: Stanley Lam <stanleylam_604@hotmail.com>	2019-12-02 16:54:33 -05:00
Guillaume Abrioux	a43a872105	docker2podman: import ceph-handler role This is needed to avoid following error: ``` ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list ``` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-02 09:11:12 -05:00
Guillaume Abrioux	7fe0d55eff	docker2podman: do not hardcode group name let's use `client_group_name` instead of hardcoding the name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-02 09:11:12 -05:00
Guillaume Abrioux	6526a25ab5	docker2podman: import ceph-defaults in first play We must import this role in the first play otherwise the first call to `client_group_name` fails. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-12-02 09:11:12 -05:00
Dimitri Savineau	39cfe0aa65	switch_to_containers: fix umount ceph partitions When a container is already running on a non containerized node then the umount ceph partition task is skipped. This is due to the container ps command which always returns 0 even if the filter matches nothing. We should run the umount task when: 1/ the container command is failing (not installed) : rc != 0 2/ the container command reports running ceph-osd containers : rc == 0 Also we should not fail on the ceph directory listing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-12-02 09:19:50 +01:00
Dimitri Savineau	5bd1cf40eb	ceph-osd: wait for all osds once `cf8c6a3` moves the 'wait for all osds' task from openstack_config to the main tasks list. But the openstack_config code was executed only on the last OSD node. We don't need to do this check on all OSD nodes, so we need to set run_once to true on that task. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-27 13:05:42 -05:00
Guillaume Abrioux	23b1f43897	facts: avoid duplicated element in devices list When using `osd_auto_discovery`, `devices` is built multiple times due to multiple runs of `ceph-facts` role. It ends up with duplicate instances of the same device in the list. Using `unique` filter when building the list fixes this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-27 16:35:41 +01:00
Guillaume Abrioux	cc0c1ce301	dashboard: only print dashboard url of the grafana-server node This commit makes the ceph-dashboard role print the ceph-dashboard URL only for the nodes present in the grafana-server group. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-27 10:28:23 -05:00
Guillaume Abrioux	0441812959	purge/update: remove backward compatibility legacy This was introduced in 3.1 and marked as deprecated. We can definitely drop it in stable-4.0. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-27 10:27:43 -05:00
Dimitri Savineau	3f29b243ea	tests: fix cluster health status The current ceph cluster health is in warning state: health: HEALTH_WARN 13 pool(s) have no replicas configured 2 pool(s) have non-power-of-two pg_num Because we're using only 1 replica then we need to disable the redundancy check. The pool pg num should be a power of two number (like 16). Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-27 16:20:17 +01:00
Guillaume Abrioux	f19a2aef1a	Revert "tox-podman: use centos 8 vagrant image" This reverts commit `19e9a06ab1`.	2019-11-27 16:19:58 +01:00
Dimitri Savineau	cf8c6a3849	ceph-osd: wait for all osd before crush rules When creating crush rules with device class parameter we need to be sure that all OSDs are up and running because the device class list is populated with this information. This is now enabled for all scenarios, not openstack_config only. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-27 07:43:07 +01:00
Dimitri Savineau	55adc10be3	ceph-grafana: remove ipv6 brackets on wait_for The wait_for ansible module doesn't support brackets around an IPv6 address, so we need to remove them. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-26 10:08:17 +01:00
Guillaume Abrioux	5353ab8a23	tests: revert vagrant_variable file name detection This commit reverts the following change: `fcf181342a (diff-23b6f443c01ea2efcb4f36eedfea9089R7-R14)` this is causing CI failures so this commit is intended to unlock the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-25 11:22:04 +01:00
Dimitri Savineau	dd97353574	travis: add python 3.7 and 3.8 Add both python 3.7 and 3.8 in the travis matrix testing. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-25 09:17:19 +01:00
Guillaume Abrioux	33bfb10af9	nfs: remove legacy file this file is provided by the packaging (nfs-ganesha) so there's no need to maintain it in ceph-ansible Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-22 05:11:41 +01:00
Guillaume Abrioux	d06158e9d9	nfs: do not run privileged nfs container At the moment, we bindmount the dbus socket from the host, which requires running the container with --privileged. Since we now run a dedicated dbus daemon inside the same container, we can stop running privileged nfs-ganesha containers. Related ceph-container PR : ceph/ceph-container#1517 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1725254 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-22 05:11:41 +01:00
Guillaume Abrioux	c878e99589	update: only run post osd upgrade play on 1 mon There is no need to run these tasks n times from each monitor. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-20 09:22:19 -05:00
Guillaume Abrioux	548db78b95	update: use flags noout and nodeep-scrub only 1. set noout and nodeep-scrub flags, 2. upgrade each OSD node, one by one, wait for active+clean pgs 3. after all osd nodes are upgraded, unset flags Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by: Rachana Patel <racpatel@redhat.com>	2019-11-20 09:22:19 -05:00
Dimitri Savineau	19e9a06ab1	tox-podman: use centos 8 vagrant image Switch the podman scenario from atomic centos 7 to centos 8 (not atomic) Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-20 10:34:34 +01:00
VasishtaShastry	72c43cc5d9	Fixes failure of cephfs configuration using --limit Configuration of cephfs with an existing cluster using --limit used to fail at different tasks while running with site-docker.yml This commit addresses both of those tasks Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>	2019-11-18 16:44:47 +01:00
Dimitri Savineau	d7fd769b6d	container: add always tag on gather fact tasks If we execute the site-container.yml playbook with specific tags (like ceph_update_config) then we need to be sure to gather the facts otherwise we will see error like: The task includes an option with an undefined variable. The error was: 'ansible_hostname' is undefined This commit also adds missing 'gather_facts: false' to mons plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 11:50:24 -05:00
Dimitri Savineau	ef2cb99f73	ceph-osd: add device class to crush rules This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 16:25:46 +01:00
Dimitri Savineau	ed36a11eab	move crush rule creation from mon to osd role If we want to create crush rules with the create-replicated sub command and device class then we need to have the OSD created before the crush rules otherwise the device classes won't exist. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 16:25:46 +01:00
Dimitri Savineau	3e29b8d5ff	ceph-defaults: pin prometheus container tags In addition to the grafana container tag change, we need to do the same for the prometheus container stack based on the release present in the OSE 4.1 container image. $ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version node_exporter, version 0.17.0 build user: root@67fee13ed48f build date: 20191023-14:38:12 go version: go1.11.13 $ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version alertmanager, version 0.16.2 build user: root@70b79a3f29b6 build date: 20191023-14:57:30 go version: go1.11.13 $ docker run --rm openshift4/ose-prometheus:4.1 --version prometheus, version 2.7.2 build user: root@12da054778a3 build date: 20191023-14:39:36 go version: go1.11.13 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>	2019-11-14 16:11:14 +01:00
VasishtaShastry	9a1f1626c3	Evades validation of ceph_repository_type in containerized scenario This will prevent failure of site-docker.yml with configs in doc. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-14 15:53:22 +01:00
Guillaume Abrioux	b717b5f736	ceph_key: restore file mode after a key is fetched when `import_key` is enabled, if the key already exists, it will only be fetched using ceph cli, if the mode specified in the `ceph_key` task is different from what is applied by the ceph cli, the mode isn't restored because we don't call `module.set_fs_attributes_if_different()` before `module.exit_json(**result)` Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1734513 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-14 14:58:37 +01:00
Guillaume Abrioux	16bcef4f28	tests: add time command in vagrant_up.sh monitor how long it takes to get all VMs up and running Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-08 15:47:46 +01:00
Guillaume Abrioux	1a5d32dda5	tests: remove legacy in tox-update.ini This variable isn't used in tox-update.ini so this commit removes it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-11-08 09:27:28 -05:00
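
Note on commit ef2cb99f73 (ceph-osd: add device class to crush rules): the message describes choosing between the create-replicated and create-simple sub commands depending on whether the rule dict has a class key. The following is a minimal Python sketch of that selection, not the role's actual task. The function name crush_rule_command, the rule dict keys other than class (name, root, type), and the use of --cluster are assumptions made for illustration.

```python
from typing import Dict, List


def crush_rule_command(rule: Dict[str, str], cluster: str = "ceph") -> List[str]:
    """Build the ceph CLI argument list for creating one CRUSH rule.

    Hypothetical helper: the dict keys "name", "root" and "type" are
    assumed here; only the "class" key is named in the commit message.
    """
    base = ["ceph", "--cluster", cluster, "osd", "crush", "rule"]
    if rule.get("class"):
        # A device class is given: use create-replicated, which accepts it.
        return base + [
            "create-replicated",
            rule["name"],
            rule["root"],
            rule["type"],
            rule["class"],
        ]
    # No class key: fall back to create-simple for backward compatibility.
    return base + ["create-simple", rule["name"], rule["root"], rule["type"]]


if __name__ == "__main__":
    hdd_rule = {"name": "HDD", "root": "default", "type": "host", "class": "hdd"}
    plain_rule = {"name": "replicated_custom", "root": "default", "type": "host"}
    print(" ".join(crush_rule_command(hdd_rule)))
    print(" ".join(crush_rule_command(plain_rule)))
```

As commits ed36a11eab and cf8c6a3849 note, a rule that uses a device class can only be created once the OSDs exist and are up, since the device class list is populated from them.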

5137 Commits (eac207091b0574e16084c62e71372b510f25de6b)
