This commit changes the bind mount option for the mount point
`/var/lib/ceph` in the systemd template for mon and mgr containers. This
is needed when collocating mon/mgr with OSDs in a dmcrypt scenario.
Once the mon/mgr are converted to containers, the dmcrypt layer sub mount
is still visible in `/var/lib/ceph`. For some reason it keeps the
corresponding devices busy, so no other container can open/close them.
As a result, it prevents OSDs from starting properly.
Since this only happens on the nodes converted before the OSD play, the
idea is to bind mount `/var/lib/ceph` on mon and mgr with the `rshared`
option so that, once the sub mount is unmounted, the change is propagated
inside the container and that mount point is no longer seen there.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896392
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f5ba6d9b01)
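For illustration, the change amounts to adding the `rshared` propagation flag to the existing bind mount in the mon/mgr systemd unit templates; a minimal sketch of the relevant line (exact file and surrounding flags are illustrative):
```
# mon/mgr container unit template (sketch)
-v /var/lib/ceph:/var/lib/ceph:z,rshared \
```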
fa2bb3a only fixes the symlink owner/group issue in the OSD play. If the
OSDs are collocated with other services like MONs and MGRs then the
chown command will fail:
```
$ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown 167:167 {} +
chown: cannot dereference './block': Permission denied
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896448
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 35ed9977aa)
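A minimal sketch of the kind of adjustment this calls for, assuming the fix is to chown the symlinks themselves rather than dereferencing them (the `-h` flag is an assumption, not necessarily the actual change):
```
$ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown -h 167:167 {} +
```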
When using `monitor_interface`, if nodes don't have the same interface
names, this task will fail like the following:
```
fatal: [argo010]: FAILED! => {
"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_enp1s0f0'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mon/tasks/docker/main.yml': line 19, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: ipv4 - force peer addition as potential bootstrap peer for cluster bringup - monitor_interface\n ^ here\n"
}
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876551
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
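A hedged sketch of the kind of per-host lookup that avoids this, resolving each monitor's own `monitor_interface` instead of assuming the current host's interface name (task name and expression are illustrative):
```
- name: collect monitor addresses per host (sketch)
  set_fact:
    _monitor_addresses: >-
      {{ _monitor_addresses | default([]) +
         [hostvars[item]['ansible_' + (hostvars[item]['monitor_interface'] | default(monitor_interface)) | replace('-', '_')]['ipv4']['address']] }}
  loop: "{{ groups['mons'] }}"
```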
90f3f61 introduced the docker-to-podman.yml playbook but the
ceph-osd-run.sh.j2 template still has `docker` hardcoded in some places
instead of using the `container_binary` variable.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
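A sketch of the template fix, using the existing `container_binary` variable instead of the hardcoded binary (the surrounding arguments are illustrative):
```
# ceph-osd-run.sh.j2 (sketch)
{{ container_binary }} run --rm --net=host \
  "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
```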
We don't need to show this information during the module execution.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a3f4e2b4d1)
When running the switch2container playbook on a Debian based system
then the systemd unit path isn't the same as on Red Hat based systems.
Because the old systemd unit files aren't removed, the new container
systemd units aren't taken into account.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c1af69a7e7)
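A hedged sketch of what handling both unit paths can look like when removing the old non-containerized units (the task wording and unit name are illustrative; Debian installs units under /lib/systemd/system, Red Hat under /usr/lib/systemd/system):
```
- name: remove old ceph-osd systemd unit file (sketch)
  file:
    path: "{{ '/lib/systemd/system' if ansible_os_family == 'Debian' else '/usr/lib/systemd/system' }}/ceph-osd@.service"
    state: absent
```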
This, together with the condition `ansible_play_hosts_all | last`, causes that task to be skipped on the first host.
Signed-off-by: RPietrzak <rp.pietrzak@gmail.com>
This node was needed for the upgrade job in stable-4.0.
Since we moved the erasure coded pool testing into lvm_osds, we don't
need to fire up that node anymore.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit moves the systemd rendering task into `systemd.yml` file.
Otherwise, when running the docker-to-podman playbook, the systemd unit
file isn't updated as it should be.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1870141
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit makes the bindmount a bit more generic; otherwise the OSDs
currently fail to start in an OSP FFU upgrade (with a RHEL7 to RHEL8 OS
upgrade).
The docker2podman playbook is run from the ceph-ansible stable-3.2 branch
against RHEL7 nodes where `/var/run/lvmetad.socket` exists, but once the
system is upgraded to RHEL8 this socket doesn't exist anymore, which
prevents OSDs from starting after the reboot.
As a workaround we can make this bindmount a bit more generic like what
is done in `stable-4.0` branch by mounting `/run/lvm` instead.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866252
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
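For illustration, the bind mount in the OSD container unit template moves from the RHEL7-specific socket to the generic directory (exact template lines are illustrative):
```
# before
-v /var/run/lvmetad.socket:/var/run/lvmetad.socket \
# after
-v /run/lvm/:/run/lvm/ \
```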
When using non-lvm scenarios (collocated or non-collocated), the
disk_list variable isn't set because this is done in the ceph-osd
role (start_osds.yml), which isn't executed in the docker2podman
playbook.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1862046
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
The pytest-forked 1.3.0 release isn't compatible with the pytest release
we are using in that branch.
```
pytest-forked 1.3.0 requires pytest>=3.10, but you'll have pytest 3.6.1
which is incompatible.
```
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
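A minimal way to address it is to pin the dependency in the testing requirements; the file and exact pin below are assumptions:
```
# tests/requirements.txt (sketch)
pytest-forked<1.3.0
```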
The automatic backport [1] done by mergify has merged the backport PR
even if a conflict was present in the documentation.
[1] https://github.com/ceph/ceph-ansible/pull/3803
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
By default, Ansible gathers facts from facter and ohai if they are
installed on the remote nodes. Given we don't need them, let's exclude
these facts from our fact gathering.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c95adc564b)
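A hedged sketch of the exclusion using the `setup` module's `gather_subset` option (where exactly this lands in the plays is an assumption):
```
- name: gather facts without facter and ohai (sketch)
  setup:
    gather_subset:
      - 'all'
      - '!facter'
      - '!ohai'
```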
When using collocated or non-collocated osd_scenarios (ceph-disk) and
trying to determine the OSD_DEVICE from the OSD_ID passed to the systemd
unit, we can be in a situation where the OSD hasn't been activated
but the OSD ID exists.
This means the data partition isn't in the activated state and the
ceph-disk list command won't show the OSD ID on the data partition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1850377
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
This isn't backported from master because there are too many changes
between stable-3.2 and other newer branches.
NOTE:
This playbook *doesn't* add podman support in stable-3.2 at all.
This is a TripleO dedicated playbook which is intended to be run
early during the FFU workflow in order to prepare the OS upgrade.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1853457
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit adds a note about the `stable-3.0` and `stable-3.1` branches,
which are deprecated and not maintained anymore.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bbe30bcc69)
This commit updates the documentation to add a note about containerized
deployments.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e61488507b)
We shouldn't set this flag when running switch_to_containers playbook.
Otherwise the playbook fails waiting for pgs to be clean.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843569
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b91d60d384)
The workflow in this playbook should be the same as in rolling_update:
we should first set the noout and nodeep-scrub flags before migrating the
first OSD and unset the OSD flags after the last OSD is migrated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2cfaa056e0)
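For illustration, the intended ordering boils down to the following commands, run on a monitor before the first OSD is migrated and after the last one (how they are wrapped into tasks is an assumption):
```
ceph osd set noout
ceph osd set nodeep-scrub
# ... migrate every OSD ...
ceph osd unset noout
ceph osd unset nodeep-scrub
```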
When using the docker container engine, the systemd unit scripts only
declare a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system, the ceph systemd units
should wait for the docker daemon to be fully restarted.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1846830
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bd22f1d1ec)
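A hedged sketch of the unit dependency this implies, assuming the fix adds a hard dependency on the docker daemon in addition to the ordering directive (the exact directive used is an assumption):
```
[Unit]
After=docker.service
Requires=docker.service
```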
This commit adds a chapter about the ceph upgrade process.
Closes: #5393
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e41487dbce)
This commit moves the dummy container creation task right before the
cephx keys creation task so it can't be run at the wrong time.
Also, this commit makes the dummy container run forever.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1828105
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
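A hedged sketch of what "running forever" can look like for such a placeholder container; the container name, image variables and `sleep infinity` entrypoint are assumptions, not the actual change:
```
docker run -d --name ceph-dummy-keepalive --entrypoint=sleep \
  "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}" infinity
```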
This commit is the first of a series describing all day-2 operations
that are possible via ceph-ansible using the set of playbooks provided in
the `infrastructure-playbooks` directory.
Fixes: #5061
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e800303e9)
We were not testing the right ansible_distribution fact value for the
RHEL distribution.
This commit also updates the minimal RHEL version supported by RHCS.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5de74fe512)
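For reference, the `ansible_distribution` fact value on RHEL is `RedHat`; a hedged sketch of such a check (the version floor shown is illustrative, not the actual minimum):
```
- name: fail on unsupported RHEL release (sketch)
  fail:
    msg: "This RHEL release is not supported by RHCS"
  when:
    - ansible_distribution == 'RedHat'
    - ansible_distribution_version is version('7.7', '<')
```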
There's no need to have two plays anymore since we now set/unset osd
flags in the `ceph-osd` role.
Also, this commit makes the `ceph-facts` role be called after
`ceph-defaults`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit makes this playbook gather facts from all nodes except
clients.
When OSDs are collocated on other nodes, it can fail like the following:
```
fatal: [vm252-11]: FAILED! => {
"msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_hostname'"
}
```
In that case, a fact from an RGW node is referenced when rendering
`ceph.conf.j2`, but it fails because facts are gathered only from mon and
osd nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1806765
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
This commit fixes a bug when using the `add-osd.yml` playbook.
The `noup` flag is set early but never gets unset before the "wait for pgs
clean" check, so the playbook always fails because the OSDs are never
seen UP.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816023
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
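For illustration, the missing step amounts to unsetting the flag before the check (how it is wrapped into a task is an assumption):
```
ceph osd unset noup
```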
Fetch the key when it is present in the cluster but not on the node.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccfa249919)
553584cbd0 introduced a regression when no secret is passed: the secret
is overwritten each time the task is run.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 003defec03)
With this change, the state `present` is enough to update a keyring.
If the keyring already exists, it will be updated if the caps or secret
passed to the module are different.
If the keyring doesn't exist, it will be created.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808367
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 553584cbd0)
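A hedged usage sketch of the behaviour described above, assuming the module takes `name`, `state`, `caps` and `secret` as in ceph-ansible's `ceph_key` library module (the key name, caps and secret variable are illustrative):
```
- name: create or update a keyring with state present (sketch)
  ceph_key:
    name: client.myapp
    state: present
    caps:
      mon: "allow r"
      osd: "allow rw pool=myapp"
    secret: "{{ myapp_keyring_secret | default(omit) }}"
```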
Since 306ce82 we explicitly fail when there's no mgr node present in the
inventory:
```
fatal: [mon0]: FAILED! => {
    "changed": false
}
MSG:
Please add a mgr host to your inventory.
```
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Creating crush rules even with no crush hierarchy configuration is a
valid scenario, so we shouldn't be bound to the first task result (which
configures the crush hierarchy) to be able to add new crush rules.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5b0476385c)
Just like site.yml and rolling_update, let's exclude client nodes from
the fact gathering.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 332c39376b)
(cherry picked from commit 5c3ba0787c)
This commit excludes client nodes from fact gathering; their facts are
not needed and excluding them speeds up this task.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 865d2eac9b)