Commit Graph

320 Commits (adb8ff14b9481843a3874c6e007ad183445e2f8c)

Author SHA1 Message Date
Florian Ruynat 05c9169c70
Fix example value for etcd_quota_backend_bytes (#6724) 2020-09-21 05:42:31 -07:00
Maxime Guyot 648fcf3a2e
Fix E306 in roles/etcd (#6515) 2020-08-31 03:20:20 -07:00
Barry Melbourne 058438a25d
Remove support for CoreOS Container Linux (#6576) 2020-08-28 02:28:53 -07:00
Florian Ruynat 706c7cb4f1
etcd should not fail when adding an already existing member (#6587) 2020-08-27 02:33:01 -07:00
Maxime Guyot e70f27dd79
Add noqa and disable .ansible-lint global exclusions (#6410) 2020-07-27 06:24:17 -07:00
chenguoquan1024 9c48f666ec
change /etc/ssl/etcd to etcd_config_dir param (#6408)
* change /etc/ssl/etcd to etcd_config_dir param

* add use etcd_events_data_dir param
2020-07-21 23:58:05 -07:00
Florent Monbillard bf8c8976dd
Upgrade etcd to 3.4.3 (#5998) 2020-07-20 07:26:51 -07:00
Maxime Guyot 00fe3d5094
Explicitly set ETCDCTL_API and use ETCDCTL_ENDPOINTS (#6327) 2020-07-01 04:56:16 -07:00
Joel Seguillon 4c1e0b188d
Add .editorconfig file (#6307) 2020-06-29 12:39:59 -07:00
Andrew DeMaria af1c93cdfc
Add option to expose metrics on separate port (#6092) 2020-05-10 12:21:51 -07:00
Etienne Champetier a35b6dc1af
Fix scaling (#5889)
* etcd: etcd-events doesn't depend on etcd_cluster_setup

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: remove condition already present on include_tasks

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: fix scaling up

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: use *access_addresses, do not delegate to etcd[0]

We want to wait for the full cluster to be healthy,
so use all the cluster addresses
Also we should be able to run the playbook when etcd[0] is down
(not tested), so do not delegate to etcd[0]

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* etcd: use failed_when for health check

unhealthy cluster is expected on first run, so use failed_when
instead of ignore_errors to remove scary red messages

Also use run_once

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* kubernetes/preinstall: ensure ansible_fqdn is up to date after changing /etc/hosts

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* kubernetes/master: regenerate apiserver cert if needed

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-04-08 01:27:43 -07:00
Xiaodu 63fa406c3c
Move host_architecture to kubespray-defaults (#5811)
The variable is defined in `kubernetes/preinstall` role and used in several roles. Since `kubernetes/preinstall` is not always included when `ansible-playbook` is run with tag selectors (see #5734 for reason), they will fail, or individual roles must copy the same fact definitions (as in #3846). Moving the definition to the always-included `kubespray-defaults` role will resolve the dependency problem.
2020-03-25 12:58:25 -07:00
Etienne Champetier 6ad6609872
Fix certificates checking when adding etcd node to existing k8s node (#5807)
Co-authored-by: alexkomrakov <alexkomrakov@gmail.com>
2020-03-25 12:46:25 -07:00
Stephen Schmidt 0379a52f03
Fix etcd install with docker and etcd_kubeadm_enabled (#5777)
- This solves issue #5721 & #5713 (dupes)
  - Provide a cleaner default usage pattern for the download role
    around etcd that supports 'host' and 'docker' properly
  - Extract the 'etcdctl' as a separate task install piece and reuse it where
    appropriate
  - Update the kubeadm-etcd task to reflect the above change
2020-03-24 08:12:47 -07:00
spaced 8ce5a9dd19
remove atomic support because reached end of live (#5783) 2020-03-17 14:31:27 -07:00
spaced 876d4de6be
Fedora CoreOS support (#5657)
* fedora coreos support
- bootstrap and new fact for

* fedora coreos support
- fix bootstrap condition

* fedora coreos support
- allow customize packages for fedora coreos bootstrap

* fedora coreos support
- prevent install ptyhon3 and epel via dnf for fedora coreos

* fedora coreos support
- handle all ostree like os in same way

* fedora coreos support
- handle all ostree like os in same way for crio

* fedora coreos support
- add fcos documentations
2020-03-17 03:12:21 -07:00
Jakub Husák 2beffe688a
Make etcdctl connect to localhost out of the box (#5643)
* Make etcdctl connect to localhost out of the box

* etcdctl envs: use admin-.pem instead of member-.pem
2020-03-06 02:05:23 -08:00
Sylvain Chateau 0ca7aa126b
added "Flatcar", "Flatcar Container Linux by Kinvolk" for all coreOS role (#5607) 2020-02-18 00:15:29 -08:00
Erwan Miran 339e36fbe6
Files to archive can be passed directly (#5571) 2020-02-12 07:50:51 -08:00
qvicksilver ac2135e450
Fix recover-control-plane to work with etcd 3.3.x and add CI (#5500)
* Fix recover-control-plane to work with etcd 3.3.x and add CI

* Set default values for testcase

* Add actual test jobs

* Attempt to satisty gitlab ci linter

* Fix ansible targets

* Set etcd_member_name as stated in the docs...

* Recovering from 0 masters is not supported yet

* Add other master to broken_kube-master group as well

* Increase number of retries to see if etcd needs more time to heal

* Make number of retries for ETCD loops configurable, increase it for recovery CI and document it
2020-02-11 01:38:01 -08:00
Erwan Miran 303c3654a1 Set pipefail in case tar fails (#5506) 2020-01-06 02:25:34 -08:00
Maxime Guyot b15d41a96a Add support to Ansible 2.9 (#5361) 2019-12-05 07:24:32 -08:00
Sergey 932935ecc7 fix wrong path in include install_host.yml in etcd role (#5256) 2019-10-13 18:16:34 -07:00
Matthew Mosesohn 7abf6a6958 Allow etcd member join by checking cluster health only on first etcd (#5032)
Change-Id: I9cc01cef3a437893225e2d9f58495826bbce7be9
2019-08-07 04:44:50 -07:00
Matthew Mosesohn 4348e78b24 Enable kubeadm etcd mode (#4818)
* Enable kubeadm etcd mode

Uses cert commands from kubeadm experimental control plane to
enable non-master nodes to obtain etcd certs.

Related story: PROD-29434

Change-Id: Idafa1d223e5c6ceadf819b6f9c06adf4c4f74178

* Add validation checks and exclude calico kdd mode

Change-Id: Ic234f5e71261d33191376e70d438f9f6d35f358c

* Move etcd mode test to ubuntu flannel HA job

Change-Id: I9af6fd80a1bbb1692ab10d6da095eb368f6bc732

* rename etcd_mode to etcd_kubeadm_enabled

Change-Id: Ib196d6c8a52f48cae370b026f7687ff9ca69c172
2019-06-20 11:12:51 -07:00
MarkusTeufelberger 73c2ff17dd Fix Ansible-lint error [E502] (#4743) 2019-05-16 00:27:43 -07:00
MarkusTeufelberger e67f848abc ansible-lint: add spaces around variables [E206] (#4699) 2019-05-02 14:24:21 -07:00
Stas 50bdaa573c Apply etcd_extra_vars to etcd-events.env as well. (#4219)
This change ensures that etcd_extra_vars variable applies
to events etcd as well.
2019-05-02 12:24:27 -07:00
Christoffer Anselm dcd9c9509b Add etcd role dependency on kube user to avoid etcd role failure when running scale.yml with a fresh node. (#3240) (#4479) 2019-04-30 04:01:36 -07:00
Andreas Krüger 38af93b60c Remove rkt support (#4671) 2019-04-29 01:14:20 -07:00
gitareest 6ca2019002 Fix issue with etcd arm host installation case (#4589)
Use host_architecture variable.
2019-04-25 04:58:47 -07:00
Attilio Greco 6243467856 remove duble check for run this task just one time (#4613) 2019-04-24 05:38:01 -07:00
Matthew Mosesohn fc072300ea Purge legacy cleanup tasks from older than 1 year (#4450)
We don't need to support upgrades from 2 year old installs,
just from the last major version.

Also changed most retried tasks to 1s delay instead of longer.
2019-04-24 00:08:05 -07:00
MarkusTeufelberger a65605b17a ansible-lint: Don't use bare variables (#4608)
Circumvented one false positive from ansible-lint
Moved a block of jinja magic into its own variable
2019-04-23 22:20:00 -07:00
MarkusTeufelberger 424e59805f ansible-lint: Fix commands that are also available as module (#4619) 2019-04-23 22:18:00 -07:00
Qasim Sarfraz 0a3cf1a087 Fix CA cert environment variable for ectd v3 (#4381) 2019-03-28 00:18:43 -07:00
Matthew Mosesohn acbf3db233 Remove hard dependence on facts for all nodes (#4304)
* Remove hard dependence on facts for all nodes

* Update main.yaml

* Update main.yaml
2019-03-05 03:04:39 -08:00
Ganesh Maharaj Mahalingam 73aee004ac Enable ClearLinux as a distro in kubespray (#3855)
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
2018-12-18 01:39:25 -08:00
Andrey Zhelnin 1712314fab Setting host_architecture var (#3846)
Setting host_architecture to allow etcd upgrade working through: ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=etcd (on other case host_architecture is missing)
2018-12-07 05:41:30 -08:00
Chad Swenson 145687a48e Reduce log spam of verbose tasks (#3806)
Added a loop_control label to a few tasks that flood our logs.
2018-12-04 10:35:44 -08:00
Zohar Mamedov af5e05d08d etcd_log_package_levels for /etc/etcd.env (#3700) 2018-11-16 23:59:40 -08:00
Antoine Legrand 3dcb914607 Remove Vault (#3684)
* Remove Vault

* Remove reference to 'kargo' in the doc

* change check order
2018-11-10 08:51:24 -08:00
Bily Zhang b2b421840c Fix some typos (#3690)
Signed-off-by: mooncake <xcoder@tenxcloud.com>
2018-11-10 15:53:58 +01:00
ankitcharolia 9c83551a0e add certificate authority file (#3433) 2018-11-02 08:27:53 -07:00
Matthew Mosesohn bc74a37696 Calculate etcd client cert serial for appropriate groups (#3605)
Standalone etcd nodes do not generate node-$hostname certs and do
not need this serial calculated.
2018-11-01 05:50:26 -07:00
Bart Laarhoven 0acb823d96 Distribute node etcd certificates like it's done in kubernetes/secrets (#3486)
* do it like in kubernetes/secrets

* fix indentation

* processed comments

* missed one, sorry

* trailing space fix
2018-10-29 11:45:32 +01:00
Erwan Miran b4e2b85745 Replace shell with command in order to allow the task to fail when openssl x509 does return zero (#3516) 2018-10-15 23:48:12 -07:00
Erwan Miran fcd8d850dc Fix ansible syntax to avoid ansible warnings (again) (#3509)
* Fix ansible syntax to avoid ansible warnings (again)

* warn: false on tar -cfz

* wrong placement of warn:false
2018-10-15 23:47:04 -07:00
Erwan Miran 2ab2f3a0a3 Ability to define SSL certificates duration and SSL key size (#3482)
* Ability to specify ssl certificate duration and ssl key size - etcd/secrets

* Ability to specify ssl certificate duration and ssl key size - helm/contiv + fix contiv missing copy certs generation script
2018-10-09 04:43:30 -07:00
刘旭 145e5c8943 use copy and slurp module (#3313) 2018-09-27 02:12:02 -07:00
rongzhang 84c4c7dc82 Use synchronize module 2018-09-16 20:36:44 +08:00
Matthew Mosesohn aaa9a4efac Ensure vault file permissions are correct 2018-09-10 12:04:04 +03:00
k8s-ci-robot db11394711
Merge pull request #3200 from pablodav/feature/k8s_win_v1.11
Required support to start working on windows node support
2018-09-03 04:51:23 -07:00
Pablo Estigarribia 7cbe3c2171 ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version

remove empty when line

ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version

force kubeadm upgrade due to failure without --force flag

ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version

added nodeSelector to have compatibility with hybrid cluster with win nodes, also fix for download with missing container type

fixes in syntax and LF for newline in files

fix on yamllint check

ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version

some cleanup for innecesary lines

remove conditions for nodeselector
2018-09-02 12:47:06 -03:00
k8s-ci-robot 6e7100f283
Merge pull request #3208 from mirwan/etcd_ha_doc_n_cleaning
Add documentation about having HA for etcd
2018-08-31 08:06:05 -07:00
Erwan Miran 82a28d6bb3 Add documentation about having HA for etcd 2018-08-31 14:40:25 +02:00
Antoine Legrand da06c8e5a9 etcd UNSUPPORTED for all arch 2018-08-31 13:45:08 +02:00
Antoine Legrand 19268ded23 Fix some arm64 errors 2018-08-31 13:45:08 +02:00
Antoine Legrand f67933d2ac add ETCD_UNSUPPORTED_ARCH=arm64 flag 2018-08-31 13:45:08 +02:00
Takashi Okamoto 359009bb05 Download etcd and hyperkube binary. 2018-08-28 01:24:26 +00:00
Vasilis Remmas b61eb7d7f3 Add ETCD_QUOTA_BACKEND_BYTES environment variable 2018-08-24 12:17:34 +02:00
Aivars Sterns 1567a977c3
Revert "gen_certs_script: refactor using stdin (Ansible 2.4+)" 2018-08-24 12:35:31 +03:00
Tatsuyuki Ishi 69786b2d16 gen_certs_script: refactor using stdin (Ansible 2.4+) 2018-08-23 11:19:17 +09:00
Antoine Legrand e51c5dc0a6
Merge pull request #3123 from mathieuherbert/until-restart-etcd
add until option for etcd backup commands
2018-08-17 22:09:08 +02:00
Sergey Bondarev ce6854e726 add version to environment file
Trigger reboot handler when version upgrade during update script
2018-08-17 17:25:35 +03:00
Mathieu Herbert 59d89a37cc add until option for etcd backup commands 2018-08-17 11:05:57 +02:00
Matthew Mosesohn 97e0de7e29
Fix vault file owner issues and k8s apiserver cert creation (#2985)
apiserver cert should be created only once
2018-07-11 14:58:02 +03:00
Matthew Mosesohn 5c617c5a8b
Add tags to deploy components by --tags option (#2960)
* Add tags for cert serial tasks

This will help facilitate tag-based deployment of specific components.

* fixup kubernetes node
2018-07-06 09:12:13 +03:00
elementyang d6f2dbc723 fix the time of ca files are changed in make-ssl-etcd 2018-06-24 13:05:43 +08:00
Matthew Mosesohn 61e97251a5 Improve variable handling for disabling etcd events cluster 2018-06-18 16:58:29 +03:00
Brad Beam 63a458063b Adding missing rkt template for etcd-events 2018-06-06 10:43:30 -05:00
Matthew Mosesohn 59be578842
Revert "wip pr for improved cert sync" (#2849) 2018-06-06 17:22:25 +03:00
Matthew Mosesohn 7433348aae wip pr for improved cert sync 2018-05-30 12:15:11 +03:00
Andreas Krüger e60a63ea51
Merge pull request #2577 from woopstar/etcd-fix-4
Makeover of etcd- and etcd-cluster setup.
2018-05-16 20:49:54 +02:00
Matthew Mosesohn 07cc981971
refactor vault role (#2733)
* Move front-proxy-client certs back to kube mount

We want the same CA for all k8s certs

* Refactor vault to use a third party module

The module adds idempotency and reduces some of the repetitive
logic in the vault role

Requires ansible-modules-hashivault on ansible node and hvac
on the vault hosts themselves

Add upgrade test scenario
Remove bootstrap-os tags from tasks

* fix upgrade issues

* improve unseal logic

* specify ca and fix etcd check

* Fix initialization check

bump machine size
2018-05-11 19:11:38 +03:00
woopstar 4c81cd2a71 Merge branch 'master' of https://github.com/kubernetes-incubator/kubespray into etcd-fix-4 2018-05-02 14:45:58 +02:00
Andreas Kruger 32a8ea8094 Fix wrong var used 2018-05-02 12:44:05 +02:00
ashon fb465f8b4b Use 'items()' for python compatibility 2018-05-01 16:55:50 +09:00
Markos Chandras 9168c71359 Revert "Revert "Add openSUSE support" (#2697)" (#2699)
This reverts commit 51f4e6585a.
2018-04-26 12:52:06 +03:00
Matthew Mosesohn 51f4e6585a
Revert "Add openSUSE support" (#2697) 2018-04-23 14:28:24 +03:00
Spencer Smith 49c6bf8fa6 support custom env vars for etcd 2018-04-18 14:03:24 -04:00
Markos Chandras 2d34781259 roles: etcd: Add support for SUSE distributions
Add path for certificate location for SUSE distributions. Also make sure
the 'update-ca-certificates' command is executed on SUSE hosts as well.
2018-04-11 20:53:43 +01:00
woopstar 86e3506ae6 Etcd cluster setup makeover
The current way to setup the etc cluster is messy and buggy.

- It checks for cluster is healthy before the cluster is even created.
- The unit files are started on handlers, not in the task, so you mess with "flush handlers".
- The join_member.yml is not used.
- etcd events cluster is not configured for kubeadm
- remove duplicate runs between running the role on etcd nodes and k8s nodes
2018-04-01 21:38:33 +02:00
Andreas Krüger b9b028a735 Update etcd deployment to use correct cert and key (#2572)
* Update etcd deployment to use correct cert and key

* Update to use admin cert for etcdctl commands

* Update handler to use admin cert too
2018-03-31 14:06:09 -04:00
Wong Hoi Sing Edison 195d6d791a Integrate jetstack/cert-manager 0.2.3 to Kubespray 2018-03-31 19:29:11 +08:00
woopstar 859a7f32fb Fix import task. Has to be include task to evalutate etcd_cluster_setup variable at run time 2018-03-31 00:06:34 +02:00
Andreas Krüger 76cb37d6b5
Merge pull request #2544 from woopstar/cert-fix-2
Update openssl.conf to count better and work with Jinja 2.9
2018-03-30 21:57:17 +02:00
Matthew Mosesohn 03bcfa7ff5
Stop templating kube-system namespace and creating it (#2545)
Kubernetes makes this namespace automatically, so there is
no need for kubespray to manage it.
2018-03-30 14:29:13 +03:00
woopstar 0df32b03ca Update openssl.conf to count better and work with Jinja 2.9 2018-03-28 17:48:56 +02:00
Sergey Bondarev 4f7479d94d add etc tunning options
https://coreos.com/etcd/docs/latest/tuning.html

etcd_snapshot_count
and
ionice priority
2018-03-26 17:25:51 +03:00
Sergey Bondarev f8fed0f308 change expirations period for generated certificate from 10 years to 100 years 2018-03-14 13:33:36 +03:00
RongZhang 388b627f72
Enable OOM killing for etcd-events
Enable OOM killing like docker run etcd
2018-03-05 20:46:39 -06:00
Antoine Legrand 5cc77eb6fd
Merge pull request #2294 from Nowaker/patch-1
Enable OOM killing
2018-03-01 14:56:26 +01:00
RongZhang 67ffd8e923 Add etcd-events cluster for kube-apiserver (#2385)
Add etcd-events cluster for kube-apiserver
2018-03-01 11:39:14 +03:00
Maxim Krasilnikov ba91304636 Fixed generate front proxy client certs with vault (#2359)
* Fixed generate front proxy client certs with vault

* fix vault cert management

* Distrebute etcd node certs to vault hosts
2018-02-22 15:08:50 +03:00
RongZhang c0aad0a6d5 Fix install etcd by host service (#2297)
Fix bug issues #2289
2018-02-12 17:34:01 +01:00
Damian Nowak f8a59446e8 Enable OOM killing
When etcd exceeds its memory limit, it becomes useless but keeps running.
We should let OOM killer kill etcd process in the container, so systemd can spot
the problem and restart etcd according to "Restart" setting in etcd.service unit file.
If OOME problem keep repeating, i.e. it happens every single restart,
systemd will eventually back off and stop restarting it anyway.

--restart=on-failure:5 in this file has no effect because memory allocation error
doesn't by itself cause the process to die

Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2

This kind of reverts a change introduced in #1860.
2018-02-09 11:00:13 -06:00
Antoine Legrand fe57c13b51
Merge pull request #2172 from leseb/etcd-auth
etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH
2018-02-07 11:25:56 +01:00
Dmitri Rubinstein 331f141f63 Fix DNS entries in etcd's openssl.conf by adding a newline. (#2208)
DNS entries generated from 'etcd_cert_alt_names' variable in etcd's
openssl.conf are not terminated by a newline.

This fixes issue #2207.
2018-01-30 16:26:58 +03:00
Sébastien Han fa8a128e49 etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH
Some installation are failing to authenticate with peers due to
etcd picking up/resoling the wrong node.

By setting 'etcd_peer_client_auth' to "False" you can disable peer client cert
authentication.

Signed-off-by: Sébastien Han <seb@redhat.com>
2018-01-30 11:19:12 +01:00