Commit Graph

363 Commits (d9cfe5f6df131e7a3bdf8807187d123b0b7e9315)

Author SHA1 Message Date
Guillaume Abrioux efe06be10f osd: ensure a gpt label is set on device
ceph-disk prepare will fail on jewel if a GPT label is not present on
device.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-17 17:32:23 +01:00
Sébastien Han 932345ab2a osd: remove leftover from osd partition
We used to support osds that are a partition. This is long gone so
removing this task.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-11-16 14:58:40 +01:00
Sébastien Han b1c1322357 osd: remove failed_when on activation
There is no need to continue if the activation fails.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-11-16 14:57:49 +01:00
Sébastien Han 80d3a242d0 osd: fix bad activation for dmcrypt
We were activating dmcrypt devices with the wrong command. Basically the
first task execute the wrong activate command. The task fails but
continues because of the 'failed_when: false'. Then the right activation
sequence is being done by the next task.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-11-16 14:55:08 +01:00
Sébastien Han cc264d6ba6
Merge pull request #2151 from hwoarang/add-opensuse
Add openSUSE Leap 42.3 support
2017-11-16 14:35:28 +01:00
Andrew Schoen 3c604f1115 lvm: support --data as a raw device or partition in ceph-volume
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-11-15 09:36:17 -06:00
Andrew Schoen 04f02910a9 lvm: ensure the data_vg exists before using it
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2017-11-15 09:36:17 -06:00
Guillaume Abrioux aa0b1ed118 tests: remove OSD_FORCE_ZAP variable from tests
according to ceph/ceph-container#840, this variable is no longer needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-14 17:55:01 +01:00
Markos Chandras fb46950373 ceph-osd: Add support for openSUSE Leap distributions
Add support for openSUSE Leap distributions

Signed-off-by: Markos Chandras <mchandras@suse.de>
2017-11-14 10:51:23 +00:00
Guillaume Abrioux 0369bd59e2
Merge pull request #2146 from mslovy/wip-fix-crush-location
osd: fix crush location for non-containerized deployment
2017-11-13 12:23:44 +01:00
Guillaume Abrioux c06faf2deb
Merge pull request #2154 from ceph/fix_auto_discover
osd: avoid using non desired loop device in autodiscovery
2017-11-10 01:19:20 +01:00
Guillaume Abrioux 591d77220e osd: always run disk_list test
there is no need to have a condition on this task, this test should be
always run since the result will be interpreted later.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-09 11:51:16 +01:00
Guillaume Abrioux 43975a7332 osd: avoid using non desired loop device in autodiscovery
This will prevent ceph-ansible from using a loop device while it
shouldn't in auto_discovery mode.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-09 10:26:24 +01:00
Guillaume Abrioux d5dfc63c89 osd: fix automatic prepare when auto_discover
Use `devices` variable instead of `ansible_devices`, otherwise it means
we are not using the devices which have been 'auto discovered'

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-08 10:20:44 +01:00
yaoning d82a09dddd fix crush location for non-containerized deployment
crush location only set for containerized deployment

Signed-off-by: yaoning <yaoning@unitedstack.com>
2017-11-08 12:05:10 +11:00
Sébastien Han 0930f14915 osd: do not use dm when osd_auto_discovery
The current code will also return lvm devices such as /dev/dm-2, this
kind of device type is not supported by ceph-disk at the moment. Now we
just ignore them.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-11-08 11:33:10 +11:00
Guillaume Abrioux 39b584e540 osd: fix a typo in roles/ceph-osd/defaults/main.yml
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-11-07 10:06:16 +01:00
Sébastien Han d4ed9a2064 osd: enhance backward compatibility
During the initial implementation of this 'old' thing we were falling
into this issue without noticing
https://github.com/moby/moby/issues/30341 and where blindly using --rm,
now this is fixed the prepare container disappears and thus activation
fail.
I'm fixing this for old jewel images.

Also this fixes the machine reboot case where the docker logs are
purgend. In the old scenario, we now store the log locally in the same
directory as the ceph-osd-run.sh script.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-11-03 11:15:23 +01:00
Sébastien Han faccd0acf0 Merge pull request #2100 from ceph/lvm-bluestore
ceph-volume lvm bluestore support
2017-10-27 17:36:16 +02:00
Alfredo Deza 517a2b3feb ceph-osd skip lvm creation if they are already in use
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-27 11:33:54 -04:00
Sébastien Han 5a10b048b0 Merge pull request #2105 from major/really-fix-always-run
Really fix always run
2017-10-27 09:33:47 +02:00
Sébastien Han 5f9e50dabe Merge pull request #2103 from andymcc/tcmalloc_settings
Option to set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
2017-10-25 17:36:04 +02:00
Sébastien Han 07e2a783f8 Merge pull request #2084 from ceph/backward-osd-2.4
osd: bring backward compatibility with old Jewel images
2017-10-25 17:33:49 +02:00
Major Hayden f73232caa4
Use check_mode instead of always_run
This patch changes the `always_run: yes` task option to
`check_mode: no` to avoid Ansible warnings.
2017-10-25 09:53:34 -05:00
Major Hayden c2b5118c1b
Revert "Avoid deprecated always_run"
This reverts commit 620fb37dd4.
2017-10-25 09:48:09 -05:00
Andy McCrae 7f6c39102d Option to set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
Use "ceph_tcmalloc_max_total_thread_cache" to set the
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES value inside /etc/default/ceph for
Debian installs, or /etc/sysconfig/ceph for Red Hat/CentOS installs.

By default this is set to 0, so the default package value will be used,
if specified this value will be changed to match the variable, and ceph
osd services will be restarted.
2017-10-25 14:38:36 +01:00
Alfredo Deza d3b427e169 ceph-osd lvm scnearios are no longer limited to filestore
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 08:23:45 -04:00
Alfredo Deza df05e63c10 ceph-osd use --cluster in ceph-volume calls
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 08:23:45 -04:00
Alfredo Deza 628d98a92c ceph-osd add the CEPH_VOLUME_DEBUG env var to all ceph-volume commands
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 06:50:22 -04:00
Alfredo Deza b89309e2a3 ceph-osd update the examples in defaults for lvm bluestore
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 06:46:39 -04:00
Alfredo Deza bbc3672253 ceph-osd: lvm support for bluestore
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2017-10-25 06:46:39 -04:00
Guillaume Abrioux f21859656b Merge pull request #2102 from yanyixing/fix_miss_word
add the miss word
2017-10-25 10:49:38 +02:00
Yixing Yan b6296c13ac update sample file 2017-10-25 16:39:08 +08:00
John Fulton 7a7ddab6c2 Require osd_scenario parameter to be provided in containerized deploy
Fixes: #2095
2017-10-23 15:16:03 +00:00
Sébastien Han 968ef04324 osd: bring backward compatibility with old Jewel images
There was a huge resync from luminous to jewel in ceph-docker:
https://github.com/ceph/ceph-docker/pull/797

This change brought a new handy function to discover partitions tight to
an OSD. This function doesn't exist in the old image so the
ceph-osd-run.sh script breaks when trying to deploy Jewel OSD with that
old Jewel image version.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-20 16:26:41 +02:00
Sébastien Han 4413511b66 all: backward compatibility between stable-2.2 and 3.0
stable-3.0 brought numerous changes in ceph-ansible variables, this PR
aims to maintain backward compatibility for someone running stable-2.2
upgrading to stable-3.0 but keeps its groups_vars untouched.
We will then determine the right options to make sure the upgrade works
but we are expecting that new variables should be used.

We will drop this in a near future, maybe 3.1 or 3.2.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-20 11:54:10 +02:00
Sébastien Han c527515502 Merge pull request #2000 from ceph/merge-osd-scenarios
[skip ci] ci: new osd scenarios
2017-10-19 09:18:02 +02:00
Sébastien Han a53aa9e8b4 ci: new osd scenarios
This commit add new osd scenarios, it aims to simplify the CI setup and
brings a better coverage on the OSD scenarios.
We decided to differentiate between filestore and bluestore, thinking
ahead when filestore won't be supported anymore.
So we now have two classes of tests:

* Filestore
* Bluestore

In each of those classes we have container and non-container.
Then for each we test the following:

* collocated
* collocated dmcrypt
* non-collocated
* non-collocated dmcrypt
* auto discovery collocated
* auto discovery collocated dmcrypt

This gives us a nice coverage and also reduces the footprint on the CI.
We are now up to 4 scenarios, each containing 6 OSD VMs.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-18 09:26:06 +02:00
Christian Berendt 4c380c9ef8 Cleanup readme files in roles directories
The contents of the README files are no longer up to date.
Documentation for all roles is located below the docs directory.
2017-10-17 11:22:06 +02:00
Christian Berendt cf901f0171 In docker start scripts replace \u00a0 with \u0020
This will solve the following issue when starting docker containers on ubuntu:

invalid argument "1\u00a0" for --cpus=1 : failed to parse 1  as a rational number

Closes-bug: #2056
2017-10-16 15:16:48 +02:00
Major Hayden c01851325e
Remove jinja2 delimiters from `when` keys
This patch changes the `when:` keys so that they have no jinja2
delimiters. This avoids Ansible warnings which could turn into
errors in a future Ansible release.
2017-10-12 11:27:42 -05:00
Major Hayden 620fb37dd4
Avoid deprecated always_run
The `always_run` key is deprecated and being removed in Ansible 2.4.
Using it causes a warning to be displayed:

    [DEPRECATION WARNING]: always_run is deprecated.

This patch changes all instances of `always_run` to use the `always`
tag, which causes the task to run each time the playbook runs.
2017-10-12 08:29:44 -05:00
Sébastien Han d0a9e57bfc osd: rollback bindmount of /run/udev
This is causing unknown issues when trying to start a dmcrypt container.
Basically the container is stuck at mount opening the LUKS device. This
is still unknown why this is causing trouble but we need to move
forward. Also, this doesn't seem to help in any ways to fix the race
condition we've seen.

Here is the log for dmcrypt:

cryptsetup 1.7.4 processing "cryptsetup --debug --verbose --key-file
key luksClose fbf8887d-8694-46ca-b9ff-be79a668e2a9"
Running command close.
Locking memory.
Installing SIGINT/SIGTERM handler.
Unblocking interruption on signal.
Allocating crypt device context by device
fbf8887d-8694-46ca-b9ff-be79a668e2a9.
Initialising device-mapper backend library.
dm version   [ opencount flush ]   [16384] (*1)
dm versions   [ opencount flush ]   [16384] (*1)
Detected dm-crypt version 1.14.1, dm-ioctl version 4.35.0.
Device-mapper backend running with UDEV support enabled.
dm status fbf8887d-8694-46ca-b9ff-be79a668e2a9  [ opencount flush ]
[16384] (*1)
Releasing device-mapper backend.
Trying to open and read device /dev/sdc1 with direct-io.
Allocating crypt device /dev/sdc1 context.
Trying to open and read device /dev/sdc1 with direct-io.
Initialising device-mapper backend library.
dm table fbf8887d-8694-46ca-b9ff-be79a668e2a9  [ opencount flush
securedata ]   [16384] (*1)
Trying to open and read device /dev/sdc1 with direct-io.
Crypto backend (gcrypt 1.5.3) initialized in cryptsetup library
version 1.7.4.
Detected kernel Linux 3.10.0-693.el7.x86_64 x86_64.
Reading LUKS header of size 1024 from device /dev/sdc1
Key length 32, device size 1943016847 sectors, header size 2050
sectors.
Deactivating volume fbf8887d-8694-46ca-b9ff-be79a668e2a9.
dm status fbf8887d-8694-46ca-b9ff-be79a668e2a9  [ opencount flush ]
[16384] (*1)
Udev cookie 0xd4d14e4 (semid 32769) created
Udev cookie 0xd4d14e4 (semid 32769) incremented to 1
Udev cookie 0xd4d14e4 (semid 32769) incremented to 2
Udev cookie 0xd4d14e4 (semid 32769) assigned to REMOVE task(2) with
flags         (0x0)
dm remove fbf8887d-8694-46ca-b9ff-be79a668e2a9  [ opencount flush
retryremove ]   [16384] (*1)
fbf8887d-8694-46ca-b9ff-be79a668e2a9: Stacking NODE_DEL [verify_udev]
Udev cookie 0xd4d14e4 (semid 32769) decremented to 1
Udev cookie 0xd4d14e4 (semid 32769) waiting for zero

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-11 13:21:37 +02:00
Sébastien Han bf99751ce1 osd: bindmount /run/udev
Ensures that "udevadm" is able to check the status of udev's event queue.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-09 17:25:45 +02:00
Sébastien Han c693e95cbf purge-docker: rework device detection
we don't need "devices" and other device variable anymore, the playbook
detects that for us.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-07 03:39:04 +02:00
Guillaume Abrioux 6b027557e6 osd: fix `set_fact build dedicated_devices`
Use an intermediate variable to build the final `dedicated_devices` list
to avoid duplicate entry in that array. (We need a 1:1 relation between
`dedicated_devices` and `devices` since we are using a `with_together`
later.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-06 15:00:32 +02:00
Sébastien Han 29888649e5 osd: do not do unique on dedicated_devices
This is needed later, if we do unique, only the first OSD will get a
journal.

Signed-off-by: Sébastien Han <seb@redhat.com>
2017-10-05 18:20:18 +02:00
Michel Rode b462b68e65 Fixing path to osd_fragment.yml 2017-10-05 14:42:10 +02:00
Guillaume Abrioux 70e2787fe2 docker: fix keyrings copied on all nodes
All keyring are getting copied to all nodes.
This commit fixes a leftover from a previous code refactor.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498583

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-05 09:23:22 +02:00
Guillaume Abrioux 784cc73da0 set docker_exec_cmd fact early in each role
This is to ensure `docker_exec_cmd` fact is set with the correct value
in case of daemons collocation

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
2017-10-04 11:31:09 +02:00