Commit Graph

179 Commits (5364160d6a6091b7c9945e84896a0100d03bf8ba)

Author SHA1 Message Date
RongZhang 388b627f72
Enable OOM killing for etcd-events
Enable OOM killing like docker run etcd
2018-03-05 20:46:39 -06:00
Antoine Legrand 5cc77eb6fd
Merge pull request #2294 from Nowaker/patch-1
Enable OOM killing
2018-03-01 14:56:26 +01:00
RongZhang 67ffd8e923 Add etcd-events cluster for kube-apiserver (#2385)
Add etcd-events cluster for kube-apiserver
2018-03-01 11:39:14 +03:00
Maxim Krasilnikov ba91304636 Fixed generate front proxy client certs with vault (#2359)
* Fixed generate front proxy client certs with vault

* fix vault cert management

* Distrebute etcd node certs to vault hosts
2018-02-22 15:08:50 +03:00
RongZhang c0aad0a6d5 Fix install etcd by host service (#2297)
Fix bug issues #2289
2018-02-12 17:34:01 +01:00
Damian Nowak f8a59446e8 Enable OOM killing
When etcd exceeds its memory limit, it becomes useless but keeps running.
We should let OOM killer kill etcd process in the container, so systemd can spot
the problem and restart etcd according to "Restart" setting in etcd.service unit file.
If OOME problem keep repeating, i.e. it happens every single restart,
systemd will eventually back off and stop restarting it anyway.

--restart=on-failure:5 in this file has no effect because memory allocation error
doesn't by itself cause the process to die

Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2

This kind of reverts a change introduced in #1860.
2018-02-09 11:00:13 -06:00
Antoine Legrand fe57c13b51
Merge pull request #2172 from leseb/etcd-auth
etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH
2018-02-07 11:25:56 +01:00
Dmitri Rubinstein 331f141f63 Fix DNS entries in etcd's openssl.conf by adding a newline. (#2208)
DNS entries generated from 'etcd_cert_alt_names' variable in etcd's
openssl.conf are not terminated by a newline.

This fixes issue #2207.
2018-01-30 16:26:58 +03:00
Sébastien Han fa8a128e49 etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH
Some installation are failing to authenticate with peers due to
etcd picking up/resoling the wrong node.

By setting 'etcd_peer_client_auth' to "False" you can disable peer client cert
authentication.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-01-30 11:19:12 +01:00
Matthew Mosesohn dc6a17e092
Use include/import tasks (#2192)
import_tasks will consume far less memory, so it should be
used whenever it is compatible.
2018-01-29 14:37:48 +03:00
Chad Swenson c6e0fcea31
Merge pull request #1948 from sgmitchell/secured-etcd
Enable etcd secure client to prevent etcdctl access without cert and key
2018-01-25 09:35:51 -06:00
Matthew Mosesohn 1401286910
Add support for cert alt names for etcd (#2139)
* Add support for cert alt names for etcd

* Update gen_certs_vault.yml
2018-01-09 14:37:34 +03:00
Steve Mitchell e45b30d033 Add etcd key and cert environment variables for use with client auth 2018-01-02 13:52:17 -05:00
Bogdan Dobrelya 8aafe64397
Defaults for apiserver_loadbalancer_domain_name (#1993)
* Defaults for apiserver_loadbalancer_domain_name

When loadbalancer_apiserver is defined, use the
apiserver_loadbalancer_domain_name with a given default value.

Fix unconsistencies for checking if apiserver_loadbalancer_domain_name
is defined AND using it with a default value provided at once.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>

* Define defaults for LB modes in common defaults

Adjust the defaults for apiserver_loadbalancer_domain_name and
loadbalancer_apiserver_localhost to come from a single source, which is
kubespray-defaults. Removes some confusion and simplefies the code.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-11-23 16:15:48 +00:00
chenhonggc c7910b51a1 --peers DEPRECATED - --endpoints should be used instead (#1943) 2017-11-14 11:28:35 +00:00
Spencer Smith 0126168472 provide environment for rkt trust and run with etcd 2017-11-08 12:57:22 -05:00
Matthew Mosesohn 86fb669fd3 Idempotency fixes (#1838) 2017-10-25 21:19:40 +01:00
Matthew Mosesohn acb63a57fa Only limit etcd memory on small hosts (#1860)
Also disable oom killer on etcd
2017-10-25 10:25:15 +01:00
Matthew Mosesohn 0b4fcc83bd Fix up warnings and deprecations (#1848) 2017-10-20 08:25:57 +01:00
Matthew Mosesohn 514359e556 Improve etcd scale up (#1846)
Now adding unjoined members to existing etcd cluster
occurs one at a time so that the cluster does not
lose quorum.
2017-10-20 08:02:31 +01:00
Matthew Mosesohn fc9a65be2b Refactor downloads to use download role directly (#1824)
* Refactor downloads to use download role directly

Also disable fact delegation so download delegate works acros OSes.

* clean up bools and ansible_os_family conditionals
2017-10-19 09:17:11 +01:00
Matthew Mosesohn 10dd049912 Revert "Security fixes for etcd (#1778)" (#1786)
This reverts commit 4209f1cbfd.
2017-10-12 14:02:51 +01:00
Matthew Mosesohn 4209f1cbfd Security fixes for etcd (#1778)
* Security fixes for etcd

* Use certs when querying etcd
2017-10-12 13:32:54 +01:00
Matthew Mosesohn 83be0735cd Fix setting etcd client cert serial (#1775) 2017-10-11 19:47:11 +01:00
Spencer Smith f2db15873d Merge pull request #1754 from ArchiFleKs/rkt-kubelet-fix
add hosts to rkt kubelet
2017-10-10 10:37:36 -04:00
ArchiFleKs 7c663de6c9 add /etc/hosts volume to rkt templates 2017-10-09 16:41:51 +02:00
Aivars Sterns 9c86da1403 Normalize tags in all places to prepare for tag fixing in future (#1739) 2017-10-05 08:43:04 +01:00
Matthew Mosesohn a56738324a Move set_facts to kubespray-defaults defaults
These facts can be generated in defaults with a performance
boost.

Also cleaned up duplicate etcd var names.
2017-10-04 14:02:47 +01:00
Brad Beam 14c232e3c4 Merge pull request #1663 from foxyriver/fix-shell
use command module instead of shell module
2017-09-25 13:24:45 -05:00
Bogdan Dobrelya bcddfb786d Merge pull request #1692 from mattymo/old-etcd-logic
drop unused etcd logic
2017-09-25 17:44:33 +02:00
Hassan Zamani b23d81f825 Add etcd_blkio_weight var (#1690) 2017-09-25 12:20:24 +01:00
Matthew Mosesohn 126f42de06 drop unused etcd logic
Fixes #1660
2017-09-25 07:52:55 +01:00
foxyriver 30b5493fd6 use command module instead of shell module 2017-09-22 15:47:03 +08:00
Brad Beam ac281476c8 Prune unnecessary certs from vault setup (#1652)
* Cleaning up cert checks for vault

* Removing all unnecessary etcd certs from each node

* Removing all unnecessary kube certs from each node
2017-09-14 12:28:11 +01:00
Matthew Mosesohn 6744726089 kubeadm support (#1631)
* kubeadm support

* move k8s master to a subtask
* disable k8s secrets when using kubeadm
* fix etcd cert serial var
* move simple auth users to master role
* make a kubeadm-specific env file for kubelet
* add non-ha CI job

* change ci boolean vars to json format

* fixup

* Update create-gce.yml

* Update create-gce.yml

* Update create-gce.yml
2017-09-13 19:00:51 +01:00
Matthew Mosesohn 5d99fa0940 Purge old upgrade hooks and unused tasks (#1641) 2017-09-09 23:41:20 +03:00
mkrasilnikov bf0af1cd3d Vault role updates:
* using separated vault roles for generate certs with different `O` (Organization) subject field;
  * configure vault roles for issuing certificates with different `CN` (Common name) subject field;
  * set `CN` and `O` to `kubernetes` and `etcd` certificates;
  * vault/defaults vars definition was simplified;
  * vault dirs variables defined in kubernetes-defaults foles for using
  shared tasks in etcd and kubernetes/secrets roles;
  * upgrade vault to 0.8.1;
  * generate random vault user password for each role by default;
  * fix `serial` file name for vault certs;
  * move vault auth request to issue_cert tasks;
  * enable `RBAC` in vault CI;
2017-09-05 09:07:35 +03:00
Brad Beam 8ae77e955e Adding in certificate serial numbers to manifests (#1392) 2017-09-01 09:02:23 +03:00
sgmitchell 783924e671 Change backup handler to only run v2 data backup if snap directory exists (#1594) 2017-08-31 18:23:24 +03:00
Maxim Krasilnikov 6eb22c5db2 Change single Vault pki mount to multi pki mounts paths for etcd and kube CA`s (#1552)
* Added update CA trust step for etcd and kube/secrets roles

* Added load_balancer_domain_name to certificate alt names if defined. Reset CA's in RedHat os.

* Rename kube-cluster-ca.crt to vault-ca.crt, we need separated CA`s for vault, etcd and kube.

* Vault role refactoring, remove optional cert vault auth because not not used and worked. Create separate CA`s fro vault and etcd.

* Fixed different certificates set for vault cert_managment

* Update doc/vault.md

* Fixed condition create vault CA, wrong group

* Fixed missing etcd_cert_path mount for rkt deployment type. Distribute vault roles for all vault hosts

* Removed wrong when condition in create etcd role vault tasks.
2017-08-30 16:03:22 +03:00
Brad Beam 8b151d12b9 Adding yamllinter to ci steps (#1556)
* Adding yaml linter to ci check

* Minor linting fixes from yamllint

* Changing CI to install python pkgs from requirements.txt

- adding in a secondary requirements.txt for tests
- moving yamllint to tests requirements
2017-08-24 12:09:52 +03:00
Anton 1e07ee6cc4 etcd_compaction_retention every 8 hour (#1527) 2017-08-20 13:55:48 +03:00
Maxim Krasilnikov 2ba285a544 Fixed deploy cluster with vault cert manager (#1548)
* Added custom ips to etcd vault distributed certificates

* Added custom ips to kube-master vault distributed certificates

* Added comment about issue_cert_copy_ca var in vault/issue_cert role file

* Generate kube-proxy, controller-manager and scheduler certificates by vault

* Revert "Disable vault from CI (#1546)"

This reverts commit 781f31d2b8.

* Fixed upgrade cluster with vault cert manager

* Remove vault dir in reset playbook
2017-08-20 13:53:58 +03:00
Matthew Mosesohn 2645e88b0c Fix vault setup partially (#1531)
This does not address per-node certs and scheduler/proxy/controller-manager
component certs which are now required. This should be handled in a
follow-up patch.
2017-08-18 15:09:45 +03:00
Spencer Smith e55f8a61cd Merge pull request #1482 from bradbeam/fix1393
Removing run_once in these tasks so that etcd ca certs get propogated…
2017-07-31 13:47:18 -04:00
Spencer Smith cb6892d2ed Merge pull request #1469 from hzamani/etcd_metrics
Add etcd metrics flag
2017-07-31 09:04:07 -04:00
Brad Beam d09222c900 Removing run_once in these tasks so that etcd ca certs get propogated properly to worker nodes
without this etcd ca certs dont exist on worker nodes causing calico to fail
2017-07-28 14:34:47 -05:00
Anton e0960f6288 FIX: Unneded (extra) cycles in some tasks (#1393) 2017-07-27 20:46:21 +03:00
Hassan Zamani 3fb0383df4 Add etcd metrics flag 2017-07-25 20:00:30 +04:30
Spencer Smith 955c5549ae Merge pull request #1402 from Lendico/fix_failed_when
"failed_when: false" and "|succeeded" checks for registered vars
2017-07-25 09:33:43 -04:00