Commit Graph

4449 Commits (8add55451c25d09009cbd0618ed27e9937ea62df)
 

Author SHA1 Message Date
Guillaume Abrioux 9d590f4339 iscsi: fix permission denied error
Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`.

Since it's already set in the main playbook (site.yml/site-container.yml)

The thing is that the files get generated in the 'fetch_directory' with
root user because there is a 'delegate_to' + we run the playbook with
`become: True` (from main playbook).

The idea here is to create files under ansible user so we can open them
later to copy them on the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-07 17:57:22 +01:00
Sébastien Han bb2bbeb941 osd: container remove --pid=host
Let's try again with the Nautilus release.

Closes: https://github.com/ceph/ceph-ansible/issues/1297
Signed-off-by: Sébastien Han <seb@redhat.com>
2019-02-07 12:13:51 +00:00
Guillaume Abrioux 708e13e7bb repo: update gitignore file
- do not ignore raw_install_python.yml

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 15:06:27 +00:00
Guillaume Abrioux b37c4adb32 ansible: increase fact cache timeout
10m seems a bit low, indeed, a complete run can take more than 1h.
Let's increase it to 2h

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-06 08:58:52 +00:00
John Fulton cc0bf197e1 Fix CNI error when net=host is not used on OSD calls
Follow up fix that 410abd7 missed.

Related: ceph#3561

Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 22:49:01 +00:00
John Fulton 719a25b571 Create Ceph Initial Dirs earlier
Include tasks from create_ceph_initial_dirs earlier during
ceph config role.

Fixes: #3568
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-05 18:38:05 +00:00
John Fulton dab3f6ee3f Fix CNI error when net=host is not used in some podman calls
With 'podman version 1.0.0' on RHEL8 beta the 'get ceph version' and
'ceph monitor mkfs' commands fail [1] with "error configuring network
namespace for container Missing CNI default network".

When net=host is added these errors are resolved. net=host is used in
many other calls (grep -R net=host | wc -l --> 38).

Fixes: #3561
Signed-off-by: John Fulton <fulton@redhat.com>
(cherry picked from commit 410abd7745)
2019-02-05 18:14:28 +01:00
Guillaume Abrioux a31058374b site-container: import role ceph-facts
when ceph-container-common notifies handlers because a new container
image has been pulled, ceph-handler will throw an error because of
undefined variables since they are set in ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 914d94cae8 set RuntimeDirectory in all systemd unit templates
/var/run/ceph resides in a non persistent filesystem (tmpfs)
After a reboot, all daemons won't start because this directory will be
missing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux ac7f4b3a01 tests: increase amount of memory for all vms
double the amount of memory from 512m to 1024m.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 7ade032807 osd: bind mount /var/run/udev/
without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux ff5509295a tests: remove useless test
`test_mon_host_line_has_correct_value()` will cover this test in
anycase. It doesn't worth to have a dedicated test for this.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux efc051d17c tests: update test_mon_host_line_has_correct_value()
since msgr2 introduction, this test must be updated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 0d72fe9b30 tests: add a rhel8 scenario testing
test upstream with rhel8 vagrant image

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux fdca29f2a7 facts: set timeout_command fact in ceph-defaults
- also add `--foreground` which seems to fix some issue we are facing when
using timeout with `podman`.
- use this fact in the `is ceph running already?` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Guillaume Abrioux 16efdbc59b podman: support podman installation on rhel8
Add required changes to support podman on rhel8

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1667101

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-05 18:14:28 +01:00
Patrick C. F. Ernzer fd222a8bbf ansible.cfg: change log_path to directory used by fact_caching_connection
Since fact_caching_connection uses a directory in $HOME already, write the
ansible.log to the same directory.

Fixes: #3509

Signed-off-by: Patrick C. F. Ernzer <pcfe@redhat.com>
2019-02-05 13:53:28 +00:00
John Fulton 37b5d1084a Make python print statements python3 compatible
The restart_osd_daemon.sh generated from the j2 template
contains a python call which uses 'print x' instead of
'print(x)'. Add the missing parentheses to make this call
compatible with both 2 and 3.

Also add parentheses to other python print calls found
in roles/ceph-client/defaults/main.yml and
infrastructure-playbooks/cluster-os-migration.yml.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1671721
Signed-off-by: John Fulton <fulton@redhat.com>
2019-02-01 15:23:27 +00:00
Andrew Schoen 7b411b93d5 tests: do not run lvm_setup.yml on lvm_auto_discovery tests
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 70a4368bc5 ceph-config: do not always assume containers when calculating num_osds
CEPH_CONTAINER_IMAGE should be None if containerized_deployment
is False.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen e0dcd9f2c7 tests: fix Vagrantfile symlink for lvm-auto-discovery tests
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 03d61a8842 docs: using osd_auto_discovery and osd_scenario lvm
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen fc9502039d tests: adds the lvm_auto_discovery container testing scenario
This tests osd_auto_discovery: True, containerized_deployment: True
and the lvm osd scenario

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 3d4beaf952 tests: create as many drives for virtualbox as libvirt
This just ensures that virtualbox and libvirt are making
the same amount of devices for tests.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen c53ccab0b2 tests: adds the lvm_auto_discovery scenario
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen 88eda479a9 ceph-facts: generate devices when osd_auto_discovery is true
This task used to live in ceph-osd, but we need it defined here to that
ceph-config can use it when trying to determine the number of osds.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Andrew Schoen c5b082848f validate: do not validate lvm config if osd_auto_discovery is true
If osd_auto_discovery is set with the lvm scenario it's expected for
lvm_volumes and devices to be empty.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2019-02-01 12:28:12 +01:00
Guillaume Abrioux 46db5a1e38 tests: ensure iptables rule is inserted for rgw_multisite job
wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-01 10:20:28 +01:00
Guillaume Abrioux 773a7608d1 tests: run dev_setup and lvm_setup on secondary cluster for rgw_multisite
Otherwise, the deployment of the second cluster fails.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-02-01 10:20:28 +01:00
John Fulton cba9b23363 Do not timeout podman/docker pull if timeout value is '0'
If user sets "docker_pull_timeout: '0'" then do not use the
timeout command when running podman/docker pull. Also, use
"timeout -s KILL"; without KILL, podman on RHEL8 beta does
not timeout and deployment can hang.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1670625
Signed-off-by: John Fulton <fulton@redhat.com>
2019-01-31 15:35:09 +00:00
Guillaume Abrioux fe1528adb4 config: support num_osds fact setting in containerized deployment
This part of the code must be supported in containerized deployment

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1664112

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 14:30:23 +00:00
Guillaume Abrioux c0ad91957c tests: add missing hosts file for ubuntu testing
we need it for dev-ubuntu-container-all_daemons job

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 08:19:17 +01:00
Guillaume Abrioux fd8856b80a tests: do not deploy iscsigw node on ubuntu
iSCSI gateways can only be deployed on Red Hat OS family.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 08:19:17 +01:00
Guillaume Abrioux 02b18b15c0 tests: run lvm_setup.yml only when osd_scenario is lvm
especially for ooo_collocation scenario which is still using ceph-disk
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-31 08:19:17 +01:00
Ramana Raja dfff89ce67 Install nfs-ganesha stable v2.7
nfs-ganesha v2.5 and 2.6 have hit EOL. Install nfs-ganesha v2.7
stable that is currently being maintained.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-01-30 14:57:26 +01:00
Guillaume Abrioux 9f16501747 common: clean monitor initial keyring code
let's not be blocked by the fact we don't have the initial keyring in
`{{ fetch_directory }}`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 10:36:02 +01:00
Guillaume Abrioux f8aa8cdf60 facts: clean fsid generation code
clean some leftover and duplicate code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-30 10:36:02 +01:00
Patrick Donnelly 080ce7dd72 do not set ceph_release to dummy
When we're deploying a dev branch.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-01-29 16:04:44 +00:00
Guillaume Abrioux 25d8198c5d tests: do not play dev_setup on containerized deployment
using `!` mark in tox.ini doesn't work on comma separated list.
The idea here is to skip all containerized scenario in dev_setup.yml and
use the `!` for the update scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-29 14:12:05 +00:00
Guillaume Abrioux 0745fefb88 tests: update envlist
- switch_to_containers, ooo_collocation, podman should be able to specify
which os they are running on.
- those scenarios are only container scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-28 14:45:32 +01:00
Patrick Donnelly 8cd0308f5f use shortname in keyring path
socket.gethostname may return a FQDN. Problem found in Linode.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-01-28 09:00:35 +00:00
Guillaume Abrioux 312867af56 tests: fix rgw testinfra failure
fix the wrong path used in various rgw testinfra tests.
set `1` as default value for `radosgw_num_instances`: if
`ansible_vars.get(radosgw_num_instances)` returns `None`, we can assume
there's only 1 instance since it's the default value in ceph-defaults.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-25 11:11:16 +00:00
Zack Cerza 82897c76fb Fix ceph.conf generation
877979c78 in #3486 broke ceph.conf generation entirely. Remove the stray
curly brace.

Signed-off-by: Zack Cerza <zack@redhat.com>
2019-01-25 09:19:24 +01:00
Noah Watkins 9a43674d2e shrink_osd: use cv zap by fsid to remove parts/lvs
Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1569413
  https://bugzilla.redhat.com/show_bug.cgi?id=1572933

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2019-01-24 16:34:13 +01:00
Noah Watkins 8a5530ee98 test: add missing test dependency
[nwatkins@smash ceph-ansible]$ virtualenv env
[nwatkins@smash ceph-ansible]$ env/bin/pip install -r tests/requirements.txt
[nwatkins@smash ceph-ansible]$ env/bin/python -c "import mock"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'mock'

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2019-01-24 16:34:13 +01:00
Noah Watkins fce9f6ef60 cv: support zap by osd fsid
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2019-01-24 16:34:13 +01:00
Guillaume Abrioux 973f316595 tests: fix symlink to Vagrantfile
fix the symplink for Vagrantfile in rgw-multisite/container/secondary

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 11:46:32 +01:00
Guillaume Abrioux 7c7b1ad931 tests: set CEPH_STABLE_RELEASE to mimic for dev-* jobs
CEPH_STABLE_RELEASE must be set to mimic to perform a mimic to nautilus
upgrade on dev-* jobs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 11:46:32 +01:00
Guillaume Abrioux 8564e7e7c5 tests: remove an outdated comment
This comment is outdated, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 11:46:32 +01:00
Guillaume Abrioux 5ccfa95b9a tests: update dev_setup for nfs_ganesha_stable value
since we now set this variable in inventory host, the regexp needs to be
updated, the assignment operator is `=`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2019-01-24 11:46:32 +01:00