If we don't bootstrap the mgr after the mon and osd handlers are
called, we will never be able to reach a clean state since the pg
stats are handled by the mgr. This also happens when doing daemon
collocation.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1493920
Signed-off-by: Sébastien Han <seb@redhat.com>
This test doesn't work at the moment and needs to be fixed.
Disabling it temporarily to avoid errors in the CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
On a container env, machines don't have any ceph binaries so we need to
use a container to run the commands.
Signed-off-by: Sébastien Han <seb@redhat.com>
Delete these before creating them in case they are left around in a purge
cluster testing scenario. The purge-cluster.yml playbook does not
currently remove partitions used for journals.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The partition only needs to be created and given a gpt label so that a
PARTUUID will exist on the partition.
This task also makes the purge_lvm_osds scenario fail on the second
deployment after purging.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Prior to this patch this activation sequence for autodetection was
always skipped because we were asking to activate on device without
partitions, which doesn't make sense.
We also fix the way we look up a device: since the data partition is
always numbered 1, we take the min element of the dict.
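
The lookup described above can be sketched in Python as follows. This is
only an illustration: the `partitions` mapping and the `data_partition`
helper are hypothetical names, and the playbook itself works on Ansible
facts rather than on this function.

```python
def data_partition(partitions):
    """Return the lowest-sorted partition name, assumed to be the data partition."""
    if not partitions:
        # A device without partitions has nothing to activate.
        raise ValueError("device has no partitions, nothing to activate")
    # min() over a dict iterates its keys; "sdb1" sorts before "sdb2".
    return min(partitions)


# Hypothetical partition mapping for one device, keyed by partition name.
partitions = {"sdb2": {"size": "5.00 GB"}, "sdb1": {"size": "95.00 GB"}}
print(data_partition(partitions))
```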
Closes: https://github.com/ceph/ceph-ansible/issues/1782
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We need to force the value of the `docker` variable, which is initially
set to `false`, since it's a migration from a non-containerized to a
containerized cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The installation process is now described as follows:
* you still have to choose a 'ceph_origin' installation method. The
origin can be 'repository' (add a new repository), 'distro' (it will use
the packages provided by the native repo source of your distribution), or
'local' (only available on Red Hat systems, it installs locally built
packages). The 'local' option is not well tested, so use it carefully.
* if ceph_origin == 'repository' you will have to decide what kind of
repository you want to enable:
- community: corresponds to the stable upstream/community version
- enterprise: corresponds to the stable enterprise/downstream version
(basically you are a Red Hat customer)
- dev: it will install ceph from packages built out of the github
development branches
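
As a rough sketch of the choices listed above, the check below rejects
combinations that the description does not mention. The function name and
the `repository_kind` parameter are hypothetical and only used for
illustration; the role's own variable names may differ.

```python
# Values taken from the description above.
ORIGINS = ("repository", "distro", "local")
REPOSITORY_KINDS = ("community", "enterprise", "dev")


def check_install_choice(ceph_origin, repository_kind=None):
    """Raise ValueError if the combination is not one described above.

    `repository_kind` is a hypothetical parameter name.
    """
    if ceph_origin not in ORIGINS:
        raise ValueError("unknown ceph_origin: %r" % (ceph_origin,))
    # A repository kind only has to be chosen for the 'repository' origin.
    if ceph_origin == "repository" and repository_kind not in REPOSITORY_KINDS:
        raise ValueError("unknown repository kind: %r" % (repository_kind,))
    return True


print(check_install_choice("repository", "community"))
print(check_install_choice("distro"))
```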
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The lvm_volumes variable is now a list of dictionaries that represent
each OSD you'd like to deploy using ceph-volume. Each dictionary must
have the following keys: data, journal and data_vg. Each dictionary also
can optionally provide a journal_vg key.
The 'data' key represents the lv name used for the OSD and the 'data_vg'
key is the vg name that the given lv resides on. The 'journal' key is
either an lv, device or partition. The 'journal_vg' key is optional and
must be the vg name for the journal lv if given. This key is mainly used
for purging of the journal lv if purge-cluster.yml is run.
For example:
  lvm_volumes:
    - data: data_lv1
      journal: journal_lv1
      data_vg: vg1
      journal_vg: vg2
    - data: data_lv2
      journal: /dev/sdc
      data_vg: vg1
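
A minimal sketch of the key requirements described above, written in
Python for illustration. The `validate_lvm_volumes` helper is hypothetical
and is not part of the playbooks; it only checks that the mandatory keys
are present and treats journal_vg as optional.

```python
REQUIRED_KEYS = {"data", "journal", "data_vg"}


def validate_lvm_volumes(lvm_volumes):
    """Return a list of error strings; an empty list means nothing is missing."""
    errors = []
    for index, volume in enumerate(lvm_volumes):
        missing = REQUIRED_KEYS - set(volume)
        if missing:
            errors.append(
                "entry %d is missing: %s" % (index, ", ".join(sorted(missing)))
            )
    return errors


# Same data as the example above, expressed as Python structures.
lvm_volumes = [
    {"data": "data_lv1", "journal": "journal_lv1", "data_vg": "vg1", "journal_vg": "vg2"},
    {"data": "data_lv2", "journal": "/dev/sdc", "data_vg": "vg1"},
]
print(validate_lvm_volumes(lvm_volumes))
```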
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves issue: Multiple RGW Ceph.conf Issue #1258
In a multi-RGW setup, the RGW sections in ceph.conf
contain an identical bind IP in the civetweb line. This
modification fixes that issue and puts the right IP
for each RGW.
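
The intended result can be sketched as below: one section per RGW, each
with its own bind IP. This is an illustration in Python rather than the
actual ceph.conf.j2 template; the section naming, the port value and the
`rgw_sections` helper are assumptions.

```python
def rgw_sections(rgw_hosts, port=8080):
    """Render one section per RGW host, each bound to that host's own IP.

    `rgw_hosts` maps a hostname to its IP. The section name and the
    default port are illustrative assumptions.
    """
    lines = []
    for hostname, address in sorted(rgw_hosts.items()):
        lines.append("[client.rgw.%s]" % hostname)
        # Each RGW gets its own address instead of a shared one.
        lines.append("rgw frontends = civetweb port=%s:%d" % (address, port))
    return "\n".join(lines)


print(rgw_sections({"rgw0": "192.168.1.10", "rgw1": "192.168.1.11"}))
```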
Signed-off-by: SirishaGuduru SGuduru@walmartlabs.com
Modified ceph-defaults and ran generate_group_vars_sample.sh
group_vars/osds.yml.sample and group_vars/rhcs.yml.sample are
not part of the changes, but they got modified when
generate_group_vars_sample.sh was run to generate
group_vars/all.yml.sample.
Uncommented added variables in ceph-defaults
Updated tests by adding value for radosgw_interface
Added radosgw_interface to centos cluster tests
Modified ceph-rgw role, rebased and ran generate_group_vars_sample.sh
In ceph-rgw role removed check_mandatory_vars.yml.
Rebased on master.
Ran generate_group_vars_sample.sh and then the below files got
modified.
When you update to the latest version of the centos/7 box it always puts
the OS on /dev/sda, so do not use it as an OSD.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
There are only two main scenarios now:
* collocated: everything remains on the same device:
- data, db, wal for bluestore
- data and journal for filestore
* non-collocated: dedicated device for some of the components
Signed-off-by: Sébastien Han <seb@redhat.com>
ceph-disk is responsible for enabling the unit file if needed. Actually,
since https://github.com/ceph/ceph/pull/12241 it seems that it's not
even needed. In the event of a restart, udev rules will be triggered and
they will run ceph-disk activate on the device too, so the 'enabled' is
not needed.
Closes: https://github.com/ceph/ceph-ansible/issues/1142
Signed-off-by: Sébastien Han <seb@redhat.com>
If you use the 'dev' factor, the testing scenario will
use repos from shaman.ceph.com. You can define CEPH_DEV_BRANCH
and CEPH_DEV_SHA1 to specify which repo you'd like to test.
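
A small sketch of how the two variables could be read, in Python for
illustration. The `dev_repo_selection` helper is hypothetical and the
fallback values used when a variable is unset are assumptions.

```python
import os


def dev_repo_selection(environ=None):
    """Return the (branch, sha1) pair selecting the dev repo to test.

    The fallback values are assumptions made for this sketch.
    """
    environ = os.environ if environ is None else environ
    branch = environ.get("CEPH_DEV_BRANCH", "master")
    sha1 = environ.get("CEPH_DEV_SHA1", "latest")
    return branch, sha1


print(dev_repo_selection({"CEPH_DEV_BRANCH": "luminous"}))
```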
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The `test_osds_listen_on_*` tests assume OSDs always listen on
consecutive tcp ports starting from `6800`.
E.g.
If you have 2 OSDs, the tests assume they listen on 2 ports for each
network (`public_network` and `cluster_network`), therefore:
`6800, 6801, 6802, 6803`
but sometimes it doesn't happen this way and you can get OSDs listening
on tcp ports like this:
`6800, 6801, 6802, 6805`
The tests then fail when they shouldn't.
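
One way to relax such a check is to count listening ports inside the
range OSDs may bind to, instead of expecting consecutive numbers. The
sketch below is an assumption about how this could be done, not the
actual test code; `osd_ports_ok` is a hypothetical helper and the range
is Ceph's default `ms_bind_port_min`/`ms_bind_port_max`.

```python
# Default Ceph messenger bind range (ms_bind_port_min / ms_bind_port_max).
PORT_MIN, PORT_MAX = 6800, 7300


def osd_ports_ok(listening_ports, num_osds, networks=2):
    """Check that enough ports in the OSD range are open, in any order."""
    in_range = {p for p in listening_ports if PORT_MIN <= p <= PORT_MAX}
    # One port per OSD and per network, as in the description above.
    return len(in_range) >= num_osds * networks


print(osd_ports_ok([6800, 6801, 6802, 6805], num_osds=2))
```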
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since we are hitting this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1324587
eg:
`failed: internal error: Monitor path /var/lib/libvirt/qemu/domain-bs-docker-cl
uster-dmcrypt-journal-collocation_mon0_1499294943_ba9faf7bf296533177f6/monitor.
sock too big for destination`
and we can't upgrade libvirt in our CI for some reason,
we need to make the directory names shorter in order to work around this
issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
The scenario set in `group_vars/all` for
docker-cluster-dmcrypt-journal-collocation is not the correct one.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
We were setting journal_collocation and used raw_journal_devices which
is definitely wrong. We should just stick with devices.
Signed-off-by: Sébastien Han <seb@redhat.com>
Remove `ceph_mon_docker_interface` and use `monitor_interface` instead
for both containerized and non-containerized deployments.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Since ceph.conf.j2 has been updated to add ipv6 support, the different
variables in many scenarios need to be updated.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
`ceph-docker-common`:
At the moment there are a lot of duplicated tasks in each
`./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored into
`./roles/ceph-docker-common/tasks/main.yml`.
`*_containerized_deployment` variables:
All `*_containerized_deployment` variables have been refactored into a
single variable `containerized_deployment`.
Duplicate `cephx` variables in `group_vars/*` have been removed.
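
The effect of the variable refactoring on an existing set of variables
can be sketched like this. It is only an illustration in Python: the
`merge_containerized_flags` helper is hypothetical, and the per-daemon key
used in the example is just one instance of the
`*_containerized_deployment` pattern.

```python
SUFFIX = "_containerized_deployment"


def merge_containerized_flags(variables):
    """Collapse per-daemon flags into one `containerized_deployment` value."""
    legacy = {k: v for k, v in variables.items() if k.endswith(SUFFIX)}
    merged = {k: v for k, v in variables.items() if k not in legacy}
    # The deployment is containerized if any of the old flags said so.
    merged["containerized_deployment"] = bool(
        variables.get("containerized_deployment", False) or any(legacy.values())
    )
    return merged


print(merge_containerized_flags({"mon_containerized_deployment": True, "cephx": True}))
```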
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Restore the check_socket that was removed by `5bec62b`.
This commit also improves the logging in `restart_*_daemon.sh` scripts
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Without this, we don't test the mgr role so we need to add it.
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
This is just nice to see in the test output so we know exactly what
configuration is going to be used.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Instead of relying on environment variables and --extra-vars, simply
modify the group_vars/all that ships with the specific testing scenario
to enable ceph_rhcs testing.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.
Only works as of the Kraken release.
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
In the environment we were testing on, MTU was set to 1500 which causes
download failures of our yum repos. There might be a better way to set
this instead of doing it here in ansible.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Decorating a test method directly with a pytest mark seems to break if
the test function does not explicitly define all pytest fixtures it
expects to receive.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
journal_collocation was enabled so the test suite was testing this
scenario and obviously failed since there is no second partition to
verify.
Signed-off-by: Sébastien Han <seb@redhat.com>
This fixes the error: Call to virDomainCreateWithFlags failed: internal
error: Monitor path
/var/lib/libvirt/qemu/domain-docker-cluster-dedicated-journal_osd0_1487692576_dbfc21d851071d3e2cd2/monitor.sock
too big for destination
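
The error comes from the length limit on UNIX socket paths. A quick
Python sketch of the check, using the path from the message above; the
`monitor_socket_fits` helper and the shortened directory name are
hypothetical.

```python
# On Linux, sockaddr_un.sun_path holds 108 bytes including the trailing NUL.
SUN_PATH_MAX = 108


def monitor_socket_fits(path):
    """Return True if the path is short enough to be used as a UNIX socket."""
    return len(path.encode()) < SUN_PATH_MAX


long_path = (
    "/var/lib/libvirt/qemu/"
    "domain-docker-cluster-dedicated-journal_osd0_1487692576_dbfc21d851071d3e2cd2"
    "/monitor.sock"
)
# Hypothetical shorter scenario directory name.
short_path = (
    "/var/lib/libvirt/qemu/"
    "domain-short_osd0_1487692576_dbfc21d851071d3e2cd2"
    "/monitor.sock"
)
print(monitor_socket_fits(long_path), monitor_socket_fits(short_path))
```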
Signed-off-by: Sébastien Han <seb@redhat.com>
Some distros do not allow /usr/share to be writable (e.g. atomic), so
we let the operator decide where to put that script.
Signed-off-by: Sébastien Han <seb@redhat.com>
Prior to this patch we had several ways to run containers: we could use
ansible's docker module on some distros, and on container distros we were
using systemd. We strongly believe treating containers as services with
systemd is the right approach, so this patch generalizes it to all the
distros. These days most distros are running systemd, so it's a fair
assumption.
Signed-off-by: Sébastien Han <seb@redhat.com>
Just for clarity, and because we can, we now show the name of the
ceph configuration file that is generated.
Signed-off-by: Sébastien Han <seb@redhat.com>
We need to test the cluster name support in this CI as well. This
commit might be prone to debate because it tests 2 things in a single
scenario. We first test our ability to deploy a cluster AND the cluster
name support. However it's easier to do it this way and will reduce the
amount of time for testing. If we don't do this we will have to
duplicate those 2 existing tests into new ones 'only' to test the
cluster name support.
Signed-off-by: Sébastien Han <seb@redhat.com>
The osds are named differently for systemd in containerized deployments
so this new parameter is used to make that change transparent in the
tests.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This playbook could be used in the future to install anything else we
need on these nodes for testing purposes.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This also makes the conf_tests take the subnet as input
so multiple scenarios on differing subnets can use these tests.
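
A subnet-based check of this kind could look like the sketch below. It is
written with Python's standard `ipaddress` module for illustration; the
`address_in_subnet` helper and the example addresses are hypothetical and
not taken from the test suite.

```python
import ipaddress


def address_in_subnet(address, subnet):
    """Return True if `address` belongs to `subnet` (given in CIDR notation)."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(subnet)


# The same test can serve scenarios deployed on different subnets.
print(address_in_subnet("192.168.3.10", "192.168.3.0/24"))
print(address_in_subnet("192.168.4.10", "192.168.3.0/24"))
```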
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
We want to test journal collocation here because we're gonna switch
xenial-cluster and centos7-cluster to use a dedicated journal.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
We really only need to test the raw-multi-journal OSD scenario on one
OS and it needed a better name to use with the CI.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This scenario duplicates what we are currently doing with our
ceph-ansible testing using OVH and a single node, except now
we are using 4 separate nodes.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves: vagrant#boxes
This box supports both libvirt and virtualbox. Eventually we want to
be building our own boxes but this should work in the short term.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
This box supports both virtualbox and libvirt. Eventually
we want to be building our own vagrant boxes, but this might
work for now.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves: vagrant#boxes
This can be used to test if mon hosts and
mon initial members are being set properly with
multiple hosts.
Also, to verify that monitor_address and monitor_interface
options both work as described.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves: testing#updates
This was just a placeholder until we could get more valid scenarios in
place.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Resolves: testing#updates
This will allow for no changes needed in the ansible playbook command
when adding new scenarios. Each scenario will just need a hosts file and
a group_vars directory to define how the cluster should be set up.
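
The layout described above can be sketched as follows, assuming each
scenario lives in its own directory. The `playbook_command` helper and the
`site.yml` playbook name are assumptions made for illustration.

```python
import os


def playbook_command(scenario_dir, playbook="site.yml"):
    """Build the ansible-playbook command for one scenario directory."""
    hosts = os.path.join(scenario_dir, "hosts")
    group_vars = os.path.join(scenario_dir, "group_vars")
    if not os.path.isfile(hosts):
        raise FileNotFoundError(hosts)
    if not os.path.isdir(group_vars):
        raise FileNotFoundError(group_vars)
    # Ansible picks up a group_vars/ directory located next to the inventory,
    # so the command itself stays the same for every scenario.
    return ["ansible-playbook", "-i", hosts, playbook]
```

Adding a scenario then only means adding a directory with those two
entries; the command line does not change.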
Signed-off-by: Andrew Schoen <aschoen@redhat.com>