ceph-ansible

Commit Graph

Author	SHA1	Message	Date
Guillaume Abrioux	d4e31b90a6	Revert "osd: container remove --pid=host" This reverts commit `bb2bbeb941`. Looks like when not passing `--pid=host` we are facing some issues when deploying more than 2 OSDs in containerized environment. At the moment, we are still troubleshooting this issue but we prefer to revert this commit so it doesn't block any PR in the CI. As soon as we have a fix; we will push a new PR to remove `--pid=host` (a revert of revert...) Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	187b2bc9d9	tests: avoid 'Cannot allocate memory' error in testinfra ``` ------------------------------ Captured log setup ------------------------------ display.py 174 INFO Wednesday 13 February 2019 15:54:15 +0000 (0:00:07.787) 0:02:11.607 **** ansible.py 61 INFO RUN Ansible('setup', None, {'check': True, 'become': False}): {'_ansible_no_log': False, '_ansible_parsed': False, '_ansible_verbose_override': True, 'changed': False, 'module_stderr': u'Connection to 192.168.121.87 closed.\r\n', 'module_stdout': u'bash: /bin/sh: Cannot allocate memory\r\n', 'msg': u'MODULE FAILURE\nSee stdout/stderr for the exact error', 'rc': 126} ``` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	4a1bafdc21	tests: use memory backend for cache fact force ansible to generate facts for each run. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	500256cdab	validate: fix ntp_daemon_type check in validate is_atomic is defined in ceph-facts or very early in main playbook. In non containerized deployment, is_atomic is only set in ceph-facts which is played after ceph-validate. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	9c10affb69	site.yml: run ceph-validate before facts/defaults roles ceph-validate must be run before any other role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	ac4aded4aa	tests: fix ubuntu-container-all_daemons the public_network subnet used for this scenario was wrong. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	76303b457c	container: create ceph-common.conf tmpfiles.d if it doesn't exist Otherwise the task will fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	b24202f6a4	facts: move two set_fact into ceph-facts those two set_fact tasks should be moved in ceph-facts. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-14 10:34:37 +00:00
Guillaume Abrioux	69310a5cd6	switch_to_containers: support multiple rgw instances per host add multiple rgw instances per host in switch_to_containers playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Guillaume Abrioux	70f1eea9b2	switch_to_containers: remove non-containerized systemd unit files remove old systemd unit files (non-containerized) during the switch_to_containers transition. We have seen sometimes the unit started is the old one instead of the new systemd unit generated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Guillaume Abrioux	4064035a54	switch_to_containers: use ceph binary from container use the ceph binary from the container instead of the host. If the ceph CLI version isn't compatible between host and container image, it can cause the CLI to hang. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Guillaume Abrioux	8c8ec63633	container: use tmpfiles.d to creates /run/ceph instead of using `RuntimeDirectory` parameter in systemd unit files, let's use a systemd `tmpfiles.d` to ensure `/run/ceph`. Explanation: `podman` doesn't create the `/var/run/ceph` if it doesn't exist the time where the container is run while `docker` used to create it. In case of `switch_to_containers` scenario, `/run/ceph` gets created by a tmpfiles.d systemd file; when switching to containers, the systemd unit file complains because `/run/ceph` already exists The better fix would be to ensure `/usr/lib/tmpfiles.d/ceph-common.conf` is removed and only rely on `RuntimeDirectory` from systemd unit file parameter but we come from a non-containerized environment which is already running, it means `/run/ceph` is already created and when starting the unit to start the container, systemd will still complain and we can't simply remove the directory if daemons are collocated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Guillaume Abrioux	7e0a70f7a8	switch_to_containers: do not try to redeploy monitors `ceph-mon` tries to redeploy monitors because it assumes it was not yet deployed since `mon_socket_stat` and `ceph_mon_container_stat` are undefined (indeed, we stop the daemon before calling `ceph-mon` in the switch_to_containers playbook). Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Guillaume Abrioux	ad3a489847	tests: rename switch_to_containers rename switch_to_containers scenario Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-13 09:42:27 +01:00
Rishabh Dave	05ea783eff	fix mistake in task that aborts when ntpd is chosen on Atomic Since it's already confusing whether ntp_daemon_type should be "ntp" or "ntpd", fix the mistake in the title of the task that aborts if ntp_daemon_type is set to "ntpd" and OS being used is Atomic. Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-02-12 09:09:27 +01:00
Guillaume Abrioux	54f5dc3aab	doc: resync group_vars sample files resync group_vars sample files with their corresponding original files. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-11 17:19:27 +01:00
Rishabh Dave	bdff3e48fd	don't install NTPd on Atomic Since Atomic doesn't allow any installations and NTPd is not present on Atomic image we are using, abort when ntp_daemon_type is set to ntpd. https://github.com/ceph/ceph-ansible/issues/3572 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2019-02-11 12:02:30 +01:00
Sébastien Han	c69c8c9ac1	mon: do not hardcode ceph uid 167 is the ceph uid for Red Hat based system, thus trying to deploy a monitor on Debian fail since the ceph user id on that system is 64045. This commit uses the ceph_uid variable which contains the right uid based on system/container detection. Closes: https://github.com/ceph/ceph-ansible/issues/3589 Signed-off-by: Sébastien Han <seb@redhat.com>	2019-02-11 09:09:40 +00:00
Leah Neukirchen	4fe7f37849	Fix uses of default(omit) with string concatenation When {{omit}} is concatenated with another string, it expands to something like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d. However, in these use-cases we need an empty string. Regression introduced in `d53f55e807`. Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>	2019-02-08 16:18:15 +00:00
Patrick C. F. Ernzer	c605ff6a68	setup_ntp: call handler to disable ntpd if chronyd used The task setup chronyd called the handler disable chronyd, which of course defeats the purpose. Changing the task to disable ntpd instead fixes the issue of chronyd being disabled after it got enabled. Fixes: #3582 Signed-off-by: Patrick C. F. Ernzer pcfe@redhat.com	2019-02-08 12:04:44 +01:00
Guillaume Abrioux	d4b3c1d409	iscsi-gws: remove a leftover remove leftover introduced by `9d590f4` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-08 01:11:42 +01:00
Guillaume Abrioux	9d590f4339	iscsi: fix permission denied error Typical error: ``` fatal: [iscsi-gw0]: FAILED! => msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key''' ``` `become: True` is not needed on the following task: `copy crt file(s) to gateway nodes`. Since it's already set in the main playbook (site.yml/site-container.yml) The thing is that the files get generated in the 'fetch_directory' with root user because there is a 'delegate_to' + we run the playbook with `become: True` (from main playbook). The idea here is to create files under ansible user so we can open them later to copy them on the remote machine. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-07 17:57:22 +01:00
Sébastien Han	bb2bbeb941	osd: container remove --pid=host Let's try again with the Nautilus release. Closes: https://github.com/ceph/ceph-ansible/issues/1297 Signed-off-by: Sébastien Han <seb@redhat.com>	2019-02-07 12:13:51 +00:00
Guillaume Abrioux	708e13e7bb	repo: update gitignore file - do not ignore raw_install_python.yml Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-06 15:06:27 +00:00
Guillaume Abrioux	b37c4adb32	ansible: increase fact cache timeout 10m seems a bit low, indeed, a complete run can take more than 1h. Let's increase it to 2h Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-06 08:58:52 +00:00
John Fulton	cc0bf197e1	Fix CNI error when net=host is not used on OSD calls Follow up fix that `410abd7` missed. Related: ceph#3561 Signed-off-by: John Fulton <fulton@redhat.com>	2019-02-05 22:49:01 +00:00
John Fulton	719a25b571	Create Ceph Initial Dirs earlier Include tasks from create_ceph_initial_dirs earlier during ceph config role. Fixes: #3568 Signed-off-by: John Fulton <fulton@redhat.com>	2019-02-05 18:38:05 +00:00
John Fulton	dab3f6ee3f	Fix CNI error when net=host is not used in some podman calls With 'podman version 1.0.0' on RHEL8 beta the 'get ceph version' and 'ceph monitor mkfs' commands fail [1] with "error configuring network namespace for container Missing CNI default network". When net=host is added these errors are resolved. net=host is used in many other calls (grep -R net=host \| wc -l --> 38). Fixes: #3561 Signed-off-by: John Fulton <fulton@redhat.com> (cherry picked from commit `410abd7745`)	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	a31058374b	site-container: import role ceph-facts when ceph-container-common notifies handlers because a new container image has been pulled, ceph-handler will throw an error because of undefined variables since they are set in ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	914d94cae8	set RuntimeDirectory in all systemd unit templates /var/run/ceph resides in a non persistent filesystem (tmpfs) After a reboot, all daemons won't start because this directory will be missing. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	ac7f4b3a01	tests: increase amount of memory for all vms double the amount of memory from 512m to 1024m. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	7ade032807	osd: bind mount /var/run/udev/ without this, the command `ceph-volume lvm list --format json` hangs and takes a very long time to complete. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	ff5509295a	tests: remove useless test `test_mon_host_line_has_correct_value()` will cover this test in any case. It isn't worth having a dedicated test for this. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	efc051d17c	tests: update test_mon_host_line_has_correct_value() since msgr2 introduction, this test must be updated. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	0d72fe9b30	tests: add a rhel8 scenario testing test upstream with rhel8 vagrant image Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	fdca29f2a7	facts: set timeout_command fact in ceph-defaults - also add `--foreground` which seems to fix some issue we are facing when using timeout with `podman`. - use this fact in the `is ceph running already?` task. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Guillaume Abrioux	16efdbc59b	podman: support podman installation on rhel8 Add required changes to support podman on rhel8 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1667101 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-05 18:14:28 +01:00
Patrick C. F. Ernzer	fd222a8bbf	ansible.cfg: change log_path to directory used by fact_caching_connection Since fact_caching_connection uses a directory in $HOME already, write the ansible.log to the same directory. Fixes: #3509 Signed-off-by: Patrick C. F. Ernzer <pcfe@redhat.com>	2019-02-05 13:53:28 +00:00
John Fulton	37b5d1084a	Make python print statements python3 compatible The restart_osd_daemon.sh generated from the j2 template contains a python call which uses 'print x' instead of 'print(x)'. Add the missing parentheses to make this call compatible with both 2 and 3. Also add parentheses to other python print calls found in roles/ceph-client/defaults/main.yml and infrastructure-playbooks/cluster-os-migration.yml. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1671721 Signed-off-by: John Fulton <fulton@redhat.com>	2019-02-01 15:23:27 +00:00
Andrew Schoen	7b411b93d5	tests: do not run lvm_setup.yml on lvm_auto_discovery tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	70a4368bc5	ceph-config: do not always assume containers when calculating num_osds CEPH_CONTAINER_IMAGE should be None if containerized_deployment is False. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	e0dcd9f2c7	tests: fix Vagrantfile symlink for lvm-auto-discovery tests Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	03d61a8842	docs: using osd_auto_discovery and osd_scenario lvm Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	fc9502039d	tests: adds the lvm_auto_discovery container testing scenario This tests osd_auto_discovery: True, containerized_deployment: True and the lvm osd scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	3d4beaf952	tests: create as many drives for virtualbox as libvirt This just ensures that virtualbox and libvirt are making the same amount of devices for tests. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	c53ccab0b2	tests: adds the lvm_auto_discovery scenario Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	88eda479a9	ceph-facts: generate devices when osd_auto_discovery is true This task used to live in ceph-osd, but we need it defined here so that ceph-config can use it when trying to determine the number of osds. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Andrew Schoen	c5b082848f	validate: do not validate lvm config if osd_auto_discovery is true If osd_auto_discovery is set with the lvm scenario it's expected for lvm_volumes and devices to be empty. Signed-off-by: Andrew Schoen <aschoen@redhat.com>	2019-02-01 12:28:12 +01:00
Guillaume Abrioux	46db5a1e38	tests: ensure iptables rule is inserted for rgw_multisite job wip Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-01 10:20:28 +01:00
Guillaume Abrioux	773a7608d1	tests: run dev_setup and lvm_setup on secondary cluster for rgw_multisite Otherwise, the deployment of the second cluster fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>	2019-02-01 10:20:28 +01:00
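
Commit 4fe7f37849 above describes `{{omit}}` expanding to a string like `__omit_place_holder__…` when it is concatenated with another string. The following is a minimal Python sketch of why that happens. It is a simplified model, not Ansible's actual code. It assumes an argument is dropped only when its value is exactly the placeholder token. The names `OMIT`, `drop_omitted`, `render_buggy`, `render_fixed`, and the `--cluster ` prefix are hypothetical and used for illustration only.

```python
import hashlib

# Hypothetical stand-in for the omit token: a fixed prefix plus a hash,
# matching the shape quoted in the commit message.
OMIT = "__omit_place_holder__" + hashlib.sha1(b"example").hexdigest()


def drop_omitted(args):
    """Drop arguments whose value is exactly the omit token (model only)."""
    return {key: value for key, value in args.items() if value != OMIT}


def render_buggy(prefix, value=None):
    # Models `prefix + (value | default(omit))`: the token ends up embedded
    # in a longer string, so it no longer compares equal to OMIT.
    return prefix + (value if value is not None else OMIT)


def render_fixed(prefix, value=None):
    # Models `prefix + (value | default(''))`: an empty string concatenates
    # cleanly and leaves only the prefix.
    return prefix + (value if value is not None else "")


if __name__ == "__main__":
    args = {
        "plain": OMIT,                        # dropped: exact token
        "buggy": render_buggy("--cluster "),  # kept, with the placeholder leaked
        "fixed": render_fixed("--cluster "),  # kept, prefix only
    }
    for key, value in drop_omitted(args).items():
        print(key, repr(value))
```

In this model the bare token is filtered out, while the concatenated value survives with the placeholder text inside it, which is the symptom the commit message reports. Defaulting to an empty string sidesteps the problem whenever the surrounding expression is a string concatenation.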


5920 Commits (5cd25ea8c1df3d193410ac2f5234c712e4496742)