kubespray

Commit Graph

Author	SHA1	Message	Date
Damian Nowak	f8a59446e8	Enable OOM killing When etcd exceeds its memory limit, it becomes useless but keeps running. We should let OOM killer kill etcd process in the container, so systemd can spot the problem and restart etcd according to "Restart" setting in etcd.service unit file. If OOME problem keeps repeating, i.e. it happens every single restart, systemd will eventually back off and stop restarting it anyway. --restart=on-failure:5 in this file has no effect because memory allocation error doesn't by itself cause the process to die. Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2 This kind of reverts a change introduced in #1860.	2018-02-09 11:00:13 -06:00
Antoine Legrand	fe57c13b51	Merge pull request #2172 from leseb/etcd-auth etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH	2018-02-07 11:25:56 +01:00
Dmitri Rubinstein	331f141f63	Fix DNS entries in etcd's openssl.conf by adding a newline. (#2208) DNS entries generated from 'etcd_cert_alt_names' variable in etcd's openssl.conf are not terminated by a newline. This fixes issue #2207.	2018-01-30 16:26:58 +03:00
Sébastien Han	fa8a128e49	etcd: ability to enable/disable ETCD_PEER_CLIENT_CERT_AUTH Some installations are failing to authenticate with peers due to etcd picking up/resolving the wrong node. By setting 'etcd_peer_client_auth' to "False" you can disable peer client cert authentication. Signed-off-by: Sébastien Han <seb@redhat.com>	2018-01-30 11:19:12 +01:00
Matthew Mosesohn	dc6a17e092	Use include/import tasks (#2192) import_tasks will consume far less memory, so it should be used whenever it is compatible.	2018-01-29 14:37:48 +03:00
Chad Swenson	c6e0fcea31	Merge pull request #1948 from sgmitchell/secured-etcd Enable etcd secure client to prevent etcdctl access without cert and key	2018-01-25 09:35:51 -06:00
Matthew Mosesohn	1401286910	Add support for cert alt names for etcd (#2139) * Add support for cert alt names for etcd * Update gen_certs_vault.yml	2018-01-09 14:37:34 +03:00
Steve Mitchell	e45b30d033	Add etcd key and cert environment variables for use with client auth	2018-01-02 13:52:17 -05:00
Bogdan Dobrelya	8aafe64397	Defaults for apiserver_loadbalancer_domain_name (#1993) * Defaults for apiserver_loadbalancer_domain_name When loadbalancer_apiserver is defined, use the apiserver_loadbalancer_domain_name with a given default value. Fix inconsistencies for checking if apiserver_loadbalancer_domain_name is defined AND using it with a default value provided at once. Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru> * Define defaults for LB modes in common defaults Adjust the defaults for apiserver_loadbalancer_domain_name and loadbalancer_apiserver_localhost to come from a single source, which is kubespray-defaults. Removes some confusion and simplifies the code. Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2017-11-23 16:15:48 +00:00
chenhonggc	c7910b51a1	--peers DEPRECATED - --endpoints should be used instead (#1943)	2017-11-14 11:28:35 +00:00
Spencer Smith	0126168472	provide environment for rkt trust and run with etcd	2017-11-08 12:57:22 -05:00
Matthew Mosesohn	86fb669fd3	Idempotency fixes (#1838)	2017-10-25 21:19:40 +01:00
Matthew Mosesohn	acb63a57fa	Only limit etcd memory on small hosts (#1860) Also disable oom killer on etcd	2017-10-25 10:25:15 +01:00
Matthew Mosesohn	0b4fcc83bd	Fix up warnings and deprecations (#1848)	2017-10-20 08:25:57 +01:00
Matthew Mosesohn	514359e556	Improve etcd scale up (#1846) Now adding unjoined members to existing etcd cluster occurs one at a time so that the cluster does not lose quorum.	2017-10-20 08:02:31 +01:00
Matthew Mosesohn	fc9a65be2b	Refactor downloads to use download role directly (#1824) * Refactor downloads to use download role directly Also disable fact delegation so download delegate works across OSes. * clean up bools and ansible_os_family conditionals	2017-10-19 09:17:11 +01:00
Matthew Mosesohn	10dd049912	Revert "Security fixes for etcd (#1778)" (#1786) This reverts commit `4209f1cbfd`.	2017-10-12 14:02:51 +01:00
Matthew Mosesohn	4209f1cbfd	Security fixes for etcd (#1778) * Security fixes for etcd * Use certs when querying etcd	2017-10-12 13:32:54 +01:00
Matthew Mosesohn	83be0735cd	Fix setting etcd client cert serial (#1775)	2017-10-11 19:47:11 +01:00
Spencer Smith	f2db15873d	Merge pull request #1754 from ArchiFleKs/rkt-kubelet-fix add hosts to rkt kubelet	2017-10-10 10:37:36 -04:00
ArchiFleKs	7c663de6c9	add /etc/hosts volume to rkt templates	2017-10-09 16:41:51 +02:00
Aivars Sterns	9c86da1403	Normalize tags in all places to prepare for tag fixing in future (#1739)	2017-10-05 08:43:04 +01:00
Matthew Mosesohn	a56738324a	Move set_facts to kubespray-defaults defaults These facts can be generated in defaults with a performance boost. Also cleaned up duplicate etcd var names.	2017-10-04 14:02:47 +01:00
Brad Beam	14c232e3c4	Merge pull request #1663 from foxyriver/fix-shell use command module instead of shell module	2017-09-25 13:24:45 -05:00
Bogdan Dobrelya	bcddfb786d	Merge pull request #1692 from mattymo/old-etcd-logic drop unused etcd logic	2017-09-25 17:44:33 +02:00
Hassan Zamani	b23d81f825	Add etcd_blkio_weight var (#1690)	2017-09-25 12:20:24 +01:00
Matthew Mosesohn	126f42de06	drop unused etcd logic Fixes #1660	2017-09-25 07:52:55 +01:00
foxyriver	30b5493fd6	use command module instead of shell module	2017-09-22 15:47:03 +08:00
Brad Beam	ac281476c8	Prune unnecessary certs from vault setup (#1652) * Cleaning up cert checks for vault * Removing all unnecessary etcd certs from each node * Removing all unnecessary kube certs from each node	2017-09-14 12:28:11 +01:00
Matthew Mosesohn	6744726089	kubeadm support (#1631) * kubeadm support * move k8s master to a subtask * disable k8s secrets when using kubeadm * fix etcd cert serial var * move simple auth users to master role * make a kubeadm-specific env file for kubelet * add non-ha CI job * change ci boolean vars to json format * fixup * Update create-gce.yml * Update create-gce.yml * Update create-gce.yml	2017-09-13 19:00:51 +01:00
Matthew Mosesohn	5d99fa0940	Purge old upgrade hooks and unused tasks (#1641)	2017-09-09 23:41:20 +03:00
mkrasilnikov	bf0af1cd3d	Vault role updates: * using separated vault roles for generate certs with different `O` (Organization) subject field; * configure vault roles for issuing certificates with different `CN` (Common name) subject field; * set `CN` and `O` to `kubernetes` and `etcd` certificates; * vault/defaults vars definition was simplified; * vault dirs variables defined in kubernetes-defaults foles for using shared tasks in etcd and kubernetes/secrets roles; * upgrade vault to 0.8.1; * generate random vault user password for each role by default; * fix `serial` file name for vault certs; * move vault auth request to issue_cert tasks; * enable `RBAC` in vault CI;	2017-09-05 09:07:35 +03:00
Brad Beam	8ae77e955e	Adding in certificate serial numbers to manifests (#1392)	2017-09-01 09:02:23 +03:00
sgmitchell	783924e671	Change backup handler to only run v2 data backup if snap directory exists (#1594)	2017-08-31 18:23:24 +03:00
Maxim Krasilnikov	6eb22c5db2	Change single Vault pki mount to multi pki mounts paths for etcd and kube CA`s (#1552) * Added update CA trust step for etcd and kube/secrets roles * Added load_balancer_domain_name to certificate alt names if defined. Reset CA's in RedHat os. * Rename kube-cluster-ca.crt to vault-ca.crt, we need separated CA`s for vault, etcd and kube. * Vault role refactoring, remove optional cert vault auth because not not used and worked. Create separate CA`s for vault and etcd. * Fixed different certificates set for vault cert_managment * Update doc/vault.md * Fixed condition create vault CA, wrong group * Fixed missing etcd_cert_path mount for rkt deployment type. Distribute vault roles for all vault hosts * Removed wrong when condition in create etcd role vault tasks.	2017-08-30 16:03:22 +03:00
Brad Beam	8b151d12b9	Adding yamllinter to ci steps (#1556) * Adding yaml linter to ci check * Minor linting fixes from yamllint * Changing CI to install python pkgs from requirements.txt - adding in a secondary requirements.txt for tests - moving yamllint to tests requirements	2017-08-24 12:09:52 +03:00
Anton	1e07ee6cc4	etcd_compaction_retention every 8 hour (#1527)	2017-08-20 13:55:48 +03:00
Maxim Krasilnikov	2ba285a544	Fixed deploy cluster with vault cert manager (#1548) * Added custom ips to etcd vault distributed certificates * Added custom ips to kube-master vault distributed certificates * Added comment about issue_cert_copy_ca var in vault/issue_cert role file * Generate kube-proxy, controller-manager and scheduler certificates by vault * Revert "Disable vault from CI (#1546)" This reverts commit `781f31d2b8`. * Fixed upgrade cluster with vault cert manager * Remove vault dir in reset playbook	2017-08-20 13:53:58 +03:00
Matthew Mosesohn	2645e88b0c	Fix vault setup partially (#1531) This does not address per-node certs and scheduler/proxy/controller-manager component certs which are now required. This should be handled in a follow-up patch.	2017-08-18 15:09:45 +03:00
Spencer Smith	e55f8a61cd	Merge pull request #1482 from bradbeam/fix1393 Removing run_once in these tasks so that etcd ca certs get propagated…	2017-07-31 13:47:18 -04:00
Spencer Smith	cb6892d2ed	Merge pull request #1469 from hzamani/etcd_metrics Add etcd metrics flag	2017-07-31 09:04:07 -04:00
Brad Beam	d09222c900	Removing run_once in these tasks so that etcd ca certs get propagated properly to worker nodes; without this, etcd ca certs don't exist on worker nodes, causing calico to fail	2017-07-28 14:34:47 -05:00
Anton	e0960f6288	FIX: Unneeded (extra) cycles in some tasks (#1393)	2017-07-27 20:46:21 +03:00
Hassan Zamani	3fb0383df4	Add etcd metrics flag	2017-07-25 20:00:30 +04:30
Spencer Smith	955c5549ae	Merge pull request #1402 from Lendico/fix_failed_when "failed_when: false" and "|succeeded" checks for registered vars	2017-07-25 09:33:43 -04:00
Brad Beam	20f29327e9	Merge pull request #1379 from gdmello/etcd_data_dir_fix Custom `etcd_data_dir` saves etcd data to host, not container	2017-07-20 09:30:18 -05:00
Anton Nerozya	1fedbded62	ignore_errors instead of failed_when: false	2017-06-29 20:15:14 +02:00
gdmelloatpoints	649654207f	mount the etcd data directory in the container with the same path as on the host.	2017-06-27 09:29:47 -04:00
gdmelloatpoints	3123502f4c	move `etcd_backup_prefix` to new home.	2017-06-27 09:12:34 -04:00
gdmelloatpoints	4ba237c5d8	Make etcd_backup_prefix configurable. Ensures that backups can be stored on a different location other than ${HOST}/var/backups, say an EBS volume on AWS.	2017-06-26 09:42:30 -04:00
gdmelloatpoints	5c1891ec9f	In the etcd container, the etcd data directory is always /var/lib/etcd. Reverting to this value, since `etcd_data_dir` on the host maps to `/var/lib/etcd` in the container.	2017-06-23 13:49:31 -04:00
Gregory Storme	fff0aec720	add configurable parameter for etcd_auto_compaction_retention	2017-06-14 10:39:38 +02:00
Brad Beam	db3e8edacd	Fixing up vault variables	2017-06-08 16:15:33 -05:00
Matthew Mosesohn	ae7f59e249	Skip vault cert task evaluation completely when using script cert generation	2017-04-13 19:29:07 +03:00
Matthew Mosesohn	798f90c4d5	Merge pull request #1153 from mattymo/graceful_drain Move graceful upgrade test to Ubuntu canal HA, adjust drain	2017-04-04 17:33:53 +03:00
Aleksandr Didenko	58acbe7caf	Fix multiline when condition in sync_certs task Folded style in multiline 'when' condition causes error with unexpected indent. Changing it to literal style should fix the issue. Closes #1190	2017-03-30 22:21:04 +02:00
Matthew Mosesohn	fb467df47c	fix etcd restart	2017-03-29 23:22:49 +04:00
Sergii Golovatiuk	f144fd1ed3	Refactor etcd role - Run docker run from script rather than directly from systemd target - Refactoring styling/templates Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>	2017-03-24 12:34:15 +01:00
Sergii Golovatiuk	c04a6254b9	Backup etcd data before restarting etcd etcd is crucial part of kubernetes cluster. Ansible restarts etcd on reconfiguration. Backup helps operator to restore cluster manually in case of any issues. Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>	2017-03-20 14:50:52 +01:00
Matthew Mosesohn	8195957461	Merge branch 'master' into idempotency2	2017-03-16 09:29:43 +03:00
Matthew Mosesohn	a422ad0d50	More idempotency fixes Fixed sync_tokens fact Fixed sync_certs for k8s tokens fact Disabled register docker images changability Fixed CNI dir permission Fix idempotency for etcd pre upgrade checks	2017-03-15 19:06:39 +03:00
Matthew Mosesohn	4c6829513c	Fix etcd idempotency	2017-03-14 17:23:29 +03:00
Matthew Mosesohn	02a8e78902	Remove standalone etcd specific play, cleanup host mode Now etcd role can optionally disable etcd cluster setup for faster deployment when it is combined with etcd role.	2017-03-04 00:34:26 +04:00
Matthew Mosesohn	8f3d9e93ce	Merge pull request #1111 from mattymo/use_find_for_certs Use find module for checking for certificates	2017-03-03 20:08:33 +03:00
Matthew Mosesohn	d176818c44	Use find module for checking for certificates Also generate certs only when absent on master (rather than when absent on target node)	2017-03-03 16:21:01 +03:00
Vijay Katam	a0b1eda1d0	Add support for atomic host Updates based on feedback Simplify checks for file exists remove invalid char Review feedback. Use regular systemd file. Add template for docker systemd atomic	2017-03-01 09:38:19 -08:00
Antoine Legrand	77e5171679	Merge pull request #1076 from VincentS/etcd_openssl_count_fix Fixed counter in ETCD Openssl.conf	2017-03-01 14:17:27 +01:00
Sergii Golovatiuk	f9ff93c606	Make etcd data dir configurable. Closes: #1073 Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>	2017-02-27 21:35:51 +01:00
Vincent Schwarzer	0cbc3d8df6	Fixed counter in ETCD Openssl.conf When an apiserver_loadbalancer_domain_name is added to the Openssl.conf the counter is not increased correctly. This didn't seem to have an effect at the current kargo version.	2017-02-27 12:01:09 +01:00
Sergii Golovatiuk	00cfead9bb	Increase SSL TTL to 3650 days In real scenarios 365 days is short period of time. 3650 days is good enough for long running k8s environments	2017-02-24 15:38:13 +01:00
Matthew Mosesohn	d821448e2f	Merge branch 'master' into synthscale	2017-02-21 22:17:43 +03:00
Matthew Mosesohn	0afadb9149	Merge pull request #1046 from skyscooby/pedantic-syntax-cleanup Cleanup legacy syntax, spacing, files all to yml	2017-02-21 17:03:16 +03:00
Matthew Mosesohn	d19e6dec7a	speed up etcd preupgrade check	2017-02-20 20:18:10 +03:00
Matthew Mosesohn	a21eb036ee	Add no_log to cert tar tasks This works around 4MB limit for gitlab CI runner.	2017-02-18 14:09:57 +04:00
Matthew Mosesohn	9c1701f2aa	Add synthetic scale deployment mode New deploy modes: scale, ha-scale, separate-scale Creates 200 fake hosts for deployment with fake hostvars. Useful for testing certificate generation and propagation to other master nodes. Updated test cases descriptions.	2017-02-18 14:09:55 +04:00
Andrew Greenwood	ca9ea097df	Cleanup legacy syntax, spacing, files all to yml Migrate older inline= syntax to pure yml syntax for module args as to be consistent with most of the rest of the tasks Cleanup some spacing in various files Rename some files named yaml to yml for consistency	2017-02-17 16:22:34 -05:00
Antoine Legrand	e16ebcad6e	Merge pull request #1042 from holser/fix_facts Fix fact tags	2017-02-17 17:56:29 +01:00
Sergii Golovatiuk	e91e58aec9	Fix fact tags Ansible playbook fails when tags are limited to "facts,etcd" or to "facts". This patch allows to run ansible-playbook to gather facts only that don't require calico/flannel/weave components to be verified. This allows to run ansible with 'facts,bootstrap-os' or just 'facts' to gather facts that don't require specific components. Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>	2017-02-17 12:32:33 +01:00
Matthew Mosesohn	80c0e747a7	Fix references to CoreOS and Container Linux by CoreOS Fixes #967	2017-02-16 19:25:17 +03:00
Vladimir Rutsky	09847567ae	set "check_mode: no" for read-only "shell" steps that register a result "shell" step doesn't support check mode, which currently leads to failures, when Ansible is being run in check mode (because Ansible doesn't run command, assuming that command might have effect, and no "rc" or "output" is registered). Setting "check_mode: no" allows to run those "shell" commands in check mode (which is safe, because those shell commands don't have side effects).	2017-02-13 18:53:41 +03:00
Josh Conant	245e05ce61	Vault security hardening and role isolation	2017-02-08 21:41:36 +00:00
Josh Conant	f4ec2d18e5	Adding the Vault role	2017-02-08 21:31:28 +00:00
Matthew Mosesohn	0180ad7f38	Merge pull request #990 from mattymo/fix_cert_upgrade Fix check for node-NODEID certs existence	2017-02-08 14:44:09 +03:00
Matthew Mosesohn	e5779ab786	Fix check for node-NODEID certs existence Fixes upgrade from pre-individual node cert envs.	2017-02-07 21:06:48 +03:00
Matthew Mosesohn	71e14a13b4	Re-tune ETCD performance params Reduce election timeout to 5000ms (was 10000ms) Raise heartbeat interval to 250ms (was 100ms) Remove etcd cpu share (was 300) Make etcd_cpu_limit and etcd_memory_limit optional.	2017-02-07 20:15:14 +03:00
Matthew Mosesohn	fd30131dc2	Revert "Drop linux capabilities and rework users/groups"	2017-02-06 15:58:54 +03:00
Bogdan Dobrelya	cb2e5ac776	Drop linux capabilities and rework users/groups * Drop linux capabilities for unprivileged containerized workloads Kargo configures for deployments. * Configure required securityContext/user/group/groups for kube components' static manifests, etcd, calico-rr and k8s apps, like dnsmasq daemonset. * Rework cloud-init (etcd) users creation for CoreOS. * Fix nologin paths, adjust defaults for addusers role and ensure supplementary groups membership added for users. * Add netplug user for network plugins (yet unused by privileged networking containers though). * Grant the kube and netplug users read access for etcd certs via the etcd certs group. * Grant group read access to kube certs via the kube cert group. * Remove privileged mode for calico-rr and run it under its uid/gid and supplementary etcd_cert group. * Adjust docs. * Align cpu/memory limits and dropped caps with added rkt support for control plane. Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2017-01-20 08:50:42 +01:00
Matthew Mosesohn	8ce32eb3e1	Merge pull request #905 from galthaus/async-runs Add tasks to ensure that the first nodes have their directories for cert gen	2017-01-19 18:32:27 +03:00
Greg Althaus	0d44599a63	Add explicit name printing in task names for delegated task during cert creation	2017-01-18 14:06:50 -06:00
Sergii Golovatiuk	43fa72b7b7	Flush handlers before etcd restart systemctl daemon-reload should be run before when task modifies/creates union for etcd. Otherwise etcd won't be able to start Closes #892 Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>	2017-01-17 15:04:25 +01:00
Greg Althaus	6c69da1573	This PR adds/or modifies a few tasks to allow for the playbook to be run by limit on each node without regard for order. The changes make sure that all of the directories needed to do certificate management are on the master[0] or etcd[0] node regardless of when the playbook gets run on each node. This allows for separate ansible playbook runs in parallel that don't have to be synchronized.	2017-01-14 23:24:34 -06:00
Greg Althaus	95bf380d07	If the inventory name of the host exceeds 63 characters, the openssl tools will fail to create signing requests because the CN is too long. This is mainly a problem when FQDNs are used in the inventory file. This will truncate the hostname for the CN field only at the first dot. This should handle the issue for most cases.	2017-01-13 10:02:23 -06:00
Aleksandr Didenko	d9539e0f27	Fix etcd cert generation for calico-rr role "etcd_node_cert_data" variable is undefined for "calico-rr" role. This patch adds "calico-rr" nodes to task where "etcd_node_cert_data" variable is registered.	2017-01-09 12:06:25 +01:00
Bogdan Dobrelya	5af2c42bde	Better fix for different CoreOS os family facts Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2017-01-05 16:32:08 +01:00
Bogdan Dobrelya	f7447837c5	Rename CoreOS fact Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2017-01-05 14:02:29 +01:00
Brad Beam	4b6f29d5e1	Adding kubelet in rkt	2017-01-03 14:49:48 -06:00
Brad Beam	8dc19374cc	Allowing etcd to run via rkt	2017-01-03 10:10:38 -06:00
Bogdan Dobrelya	58062be2a3	Drop non systemd OS types support Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2017-01-02 12:14:03 +01:00
Matthew Mosesohn	1f9f885379	Fix etcd cert generation to support large deployments Due to bash max args limits, we should pass all node filenames and base64-encoded tar data through stdin/stdout instead. Fixes #832	2016-12-30 12:55:26 +03:00
Bogdan Dobrelya	a56d9de502	Systemd units, limits, and bin path fixes * Add restart for weave service unit * Reuse docker_bin_dir everywhere * Limit systemd managed docker containers by CPU/RAM. Do not configure native systemd limits due to the lack of consensus in the kernel community requires out-of-tree kernel patches. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-28 15:49:42 +01:00
Matthew Mosesohn	f0c0390646	Fix creation and sync of etcd certs Admin certs only go to etcd nodes Only generate cert-data for nodes that need sync	2016-12-28 14:21:17 +04:00
Matthew Mosesohn	e7a1949d85	Merge pull request #818 from mattymo/calico-rr-certs Fix calico-rr to use etcd certs instead of kube certs	2016-12-28 08:47:16 +03:00
Matthew Mosesohn	6d9cd2d720	Fix calico-rr to use etcd certs instead of kube certs	2016-12-27 17:04:50 +03:00
Bogdan Dobrelya	79996b557b	Rework ignore_errors to report no reds Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2016-12-27 13:00:50 +01:00
Matthew Mosesohn	385f7f6e75	Update etcd.j2	2016-12-22 22:29:24 +03:00
Matthew Mosesohn	9f1e3db906	Adjust etcd server certificates ETCD doesn't need cert/key options set. It only requires peer cert options.	2016-12-22 23:05:17 +04:00
Spencer Smith	b63d900625	Workaround etcdctl not yet being installed (#797) workaround case for etcdctl not yet being installed, only allow for return code of 0 (no error)	2016-12-22 12:41:38 -05:00
Matthew Mosesohn	ad796d188d	Individual etcd ssl certs Includes hooks for triggering calico, kubelet, and kube-apiserver restarts if etcd certs changed.	2016-12-22 13:31:11 +03:00
Matthew Mosesohn	348fc5b109	Fix etcd to-SSL upgrade and task register vars	2016-12-19 15:05:49 +03:00
Matthew Mosesohn	9cc73bdf08	Fix etcd member list when upgrading ETCD from an old version	2016-12-15 12:00:45 +04:00
Bogdan Dobrelya	a15d626771	Preconfigure DNS stack and docker early In order to enable offline/intranet installation cases: * Move DNS/resolvconf configuration to preinstall role. Remove skip_dnsmasq_k8s var as not needed anymore. * Preconfigure DNS stack early, which may be the case when downloading artifacts from intranet repositories. Do not configure K8s DNS resolvers for hosts /etc/resolv.conf yet early (as they may be not existing). * Reconfigure K8s DNS resolvers for hosts only after kubedns/dnsmasq was set up and before K8s apps to be created. * Move docker install task to early stage as well and unbind it from the etcd role's specific install path. Fix external flannel dependency on docker role handlers. Also fix the docker restart handlers' steps ordering to match the expected sequence (the socket then the service). * Add default resolver fact, which is the cloud provider specific and remove hardcoded GCE resolver. * Reduce default ndots for hosts /etc/resolv.conf to 2. Multiple search domains combined with high ndots values lead to poor performance of DNS stack and make ansible workers to fail very often with the "Timeout (12s) waiting for privilege escalation prompt:" error. * Update docs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-09 17:30:55 +01:00
Bogdan Dobrelya	8cc84e132a	Add tags Add tags to allow more granular tasks filtering. Add generator script for MD formatted tags found. Add docs for tags how-to. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-12-09 12:14:28 +01:00
Spencer Smith	8870178a2d	Merge pull request #627 from kubernetes-incubator/issue-626 add restart flag for docker run kubelet	2016-12-06 08:47:18 -08:00
Dan Bode	ff675d40f9	Ensure that etcd health checks always pass in the etcd handler, the reload etcd action was called after ansible waits for etcd to be up, this means that the health checks which are called immediately after fail (resulting in the etcd role always failing and never finishing) This patch changes the order to move the 'wait for etcd up' resource after the 'reload etcd resource', ensuring that the service is up before the health check is called.	2016-11-18 14:15:00 -08:00
Spencer Smith	0eebe43c08	updated all instances of restart always to restart on-failure with a max of 5 times	2016-11-18 14:33:22 -05:00
Matthew Mosesohn	46ee9faca9	Fix ca certificate loading on CoreOS	2016-11-14 08:47:09 +04:00
Matthew Mosesohn	a32cd85eb7	Add etcd TLS support	2016-11-09 18:38:28 +03:00
Matthew Mosesohn	95b460ae94	Remove etcd-proxy from all nodes and use etcd multiaccess	2016-11-09 13:31:12 +03:00
Smaine Kahlouch	4c0bf6225a	Merge pull request #562 from kubespray/enable_standalone_node Enable standalone node deployment	2016-10-24 13:10:53 +02:00
Matthew Mosesohn	65d2a3b0e5	Use only native cachable hostvars for etcd set_facts	2016-10-21 14:39:58 +03:00
Chad Swenson	e6902d8ecc	Use absolute path for etcdctl Small fix. The shell module won't automatically resolve the path to the etcdctl binary, so i prefixed with {{ bin_dir }}/	2016-10-20 14:56:52 -05:00
Bogdan Dobrelya	390764c2b4	Add retry_stagger var for failed download/pushes. * Add the retry_stagger var to tweak push and retry time strategies. * Add large deployments related docs. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-15 16:43:58 +02:00
Bogdan Dobrelya	422428908a	Download containers and save all Move version/repo vars to download role. Add container to download params, which overrides url/source_url, if enabled. Fix networking plugins download depending on kube_network_plugin. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-15 16:43:56 +02:00
Bogdan Dobrelya	6fdcaa1a63	Add retries for copying binaries from containers Closes issue: https://github.com/kubespray/kargo/issues/479 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-09-13 15:09:34 +02:00
Smaine Kahlouch	311baeed5d	Merge pull request #448 from kubespray/etcdnosync Add --no-sync to etcdctl member list	2016-08-29 18:58:14 +02:00
Matthew Mosesohn	256a4e1f29	Rebase etcd to v3.0.6 Fixes #450	2016-08-29 15:31:05 +03:00
Matthew Mosesohn	1345dd07f7	Add --no-sync to etcdctl member list Fixes #447	2016-08-29 12:51:43 +03:00
Bogdan Dobrelya	8168689caa	Refactor roles and hosts Shorten deployment time with: - Remove redundant roles if duplicated by a dependency and vice versa - When a member of k8s-cluster, always install docker as a dependency of the etcd role and drop the docker role from cluster.yaml. - Drop etcd and node role dependencies from master role as they are covered by the node role in k8s-cluster group as well. Copy defaults for master from node role. - Decouple master, node, secrets roles handlers and vars to be used w/o cross references. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-08-25 13:27:57 +02:00
Matthew Mosesohn	0c953101ff	Fix init scripts for etcd. Fixes #383 Fixes Ubuntu 14.04 deployment of etcd.	2016-08-15 14:09:42 +03:00
Matthew Mosesohn	5668e5f767	Fix etcd restart and handler systemd tasks Changed Wants=docker.service to docker.socket Renamed handlers for reloading systemd to contain role in task name.	2016-07-29 16:32:35 +03:00
Matthew Mosesohn	90fc407420	Fix etcd user for etcd-proxy service Only affects sys V OSes (Ubuntu 14.04) Fixes #383	2016-07-27 11:54:47 +03:00
Matthew Mosesohn	1b1f5f22d4	Fix etcd standalone deployment etcd facts are generated in kubernetes/preinstall, so etcd nodes need to be evaluated first before the rest of the deployment. Moved several directory facts from kubernetes/node to kubernetes/preinstall because they are not backward dependent.	2016-07-26 18:15:06 +03:00
mattymo	8141b72d5e	Merge branch 'master' into etcddockerdefault	2016-07-20 19:16:47 +03:00
Matthew Mosesohn	7a86b6c73e	Set default etcd deployment to docker Improved docker reload command to wait for etcd to be up before proceeding. Switched reload to run restart because it can't reload if it is not guaranteed to be in running state.	2016-07-20 18:26:16 +03:00
Bogdan Dobrelya	a76e5dbb11	Fix set_facts visibility Move set_facts to the preinstall scope, so every role may see it. For example, network plugins to see the etcd_endpoint. Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-07-20 11:41:09 +02:00
Bogdan Dobrelya	32cd6e99b2	Add etcd proxy support * Enforce a etcd-proxy role to a k8s-cluster group members. This provides an HA layout for all of the k8s cluster internal clients. * Proxies to be run on each node in the group as a separate etcd instances with a readwrite proxy mode and listen the given endpoint, which is either the access_ip:2379 or the localhost:2379. * A notion for the 'kube_etcd_multiaccess' is: ignore endpoints and loadbalancers and use the etcd members IPs as a comma-separated list. Otherwise, clients shall use the local endpoint provided by a etcd-proxy instances on each etcd node. A Networking plugins always use that access mode. * Fix apiserver's etcd servers args to use the etcd_access_endpoint. * Fix networking plugins flannel/calico to use the etcd_endpoint. * Fix name env var for non masters to be set as well. * Fix etcd_client_url was not used anywhere and other etcd_* facts evaluation was duplicated in a few places. * Define proxy modes only in the env file, if not a master. Del an automatic proxy mode decisions for etcd nodes in init/unit scripts. * Use Wants= instead of Requires= as "This is the recommended way to hook start-up of one unit to the start-up of another unit" * Make apiserver/calico Wants= etcd-proxy to keep it always up Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com> Co-authored-by: Matthew Mosesohn <mmosesohn@mirantis.com>	2016-07-19 14:09:40 +02:00
Bogdan Dobrelya	0b874e8db2	Fix systemd service unit for etcd See https://github.com/coreos/etcd/issues/4308 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>	2016-07-15 16:22:17 +02:00
Smana	ab8fdba484	deployment idempotent	2016-07-14 21:33:24 +02:00
Matthew Mosesohn	b3282cd0bb	Add optional deployment mode for Docker etcd_deployment_type Running etcd in Docker reduces the number of individual file downloads and services running on the host. Note: etcd container v3.0.1 moves bindir to /usr/local/bin Fixes: #298	2016-07-07 19:31:28 +03:00
Smana	40fbb3691d	upgrade to etcd v3.0.1	2016-07-02 14:14:32 +02:00
Smana	536454b079	upgrade etcd version to 2.3.7	2016-06-28 12:31:57 +02:00
Evgeny L	0500f27db8	Scale-up functionality for etcd cluster * Set ETCD_INITIAL_CLUSTER_STATE from `new` to `existing`, because parameter `new` makes sense only on cluster assembly stage. * If cluster exists and current node is not a part of the cluster, add it with command `etcdctl add member name url`. Closes kubespray/kargo/#270	2016-05-31 18:23:46 +03:00
Paul Czarkowski	7de87d958e	turn adduser/download roles into meta roles This should make things a little more composable, by making these roles meta roles that perform no actions by default we allow each role to own its own resources.	2016-05-22 17:25:52 -05:00
David Reuss	180f2d1fde	Pull correct variable for etcd initial variable This shouldn't use the `inventory_hostname` variable, as that will just yield the same variable, but rather use the `host` which we're looping over.	2016-04-29 14:37:01 +02:00
Christopher M Luciano	47982ea21c	Use ansible array format instead of dot-notation. This fixes the ansible error ```'dict object' has no attribute 'ansible_default_ipv4'"}```. Closes #215	2016-04-25 08:45:58 -04:00
Smana	fca384e24c	first version of CoreOS on GCE Please enter the commit message for your changes. Lines starting	2016-02-21 00:06:36 +01:00
Smana	b013b125bc	Upgrade Calico and etcd	2016-02-15 12:41:27 +01:00
Smana	a649aa8b7e	use ansible_service_mgr to detect init system	2016-02-13 11:46:53 +01:00
Smaine Kahlouch	6358cf788f	etcd initd startup command fix	2016-01-30 22:31:41 +01:00
Greg Althaus	bedcca922c	Add variables and defaults for multiple types of ip addresses. Each node can have 3 IPs. 1. ansible_default_ip4 - whatever ansible thinks is the first IPv4 address usually with the default gw. 2. ip - An address to use on the local node to bind listeners and do local communication. For example, Vagrant boxes have a first address that is the NAT bridge and is common for all nodes. The second address/interface should be used. 3. access_ip - An address to use for node-to-node access. This is assumed to be used by other nodes to access the node and may not be actually assigned on the node. For example, AWS public ip that is not assigned to node. This updates the places addresses are used to use either ip or access_ip and walk up the list to find an address.	2016-01-27 16:05:39 -06:00
Antoine Legrand	b9781fa7c2	Symlink dnsmasq conf	2016-01-26 00:30:29 +01:00
Smaine Kahlouch	baaa6efc2b	workaround_ha_apiserver	2016-01-25 12:07:32 +01:00
ant31	56b92812fa	Fix systemd reload and calico unit	2016-01-25 10:54:07 +01:00
Smaine Kahlouch	4984b57aa2	use rsync instead of command	2016-01-23 18:26:07 +01:00
Smaine Kahlouch	283c4169ac	run apiserver as a service reorder master handlers typo for sysvinit	2016-01-23 14:21:04 +01:00
Smaine Kahlouch	cb59559835	use command instead of synchronize	2016-01-22 16:37:07 +01:00
Antoine Legrand	078b67c50f	Remove downloader host	2016-01-22 09:59:39 +01:00
Greg Althaus	28e530e005	Fix etcd synchronize to other nodes from the downloader	2016-01-21 11:21:25 -06:00
Smaine Kahlouch	9715962356	etcd directly in host fix etcd configuration for nodes fix wrong calico checksums using a var name etcd_bin_dir fix etcd handlers for sysvinit using a var name etcd_bin_dir sysvinit script review etcd configuration	2016-01-21 11:36:11 +01:00
Smaine Kahlouch	8127e8f8e8	Flannel running as pod	2016-01-15 13:03:27 +01:00
Smaine Kahlouch	2bd6b83656	increase etcd timeout value again	2015-12-30 14:02:22 +01:00
Smaine Kahlouch	3f3b03bc99	increase timeout value for etcd wait_for	2015-12-29 21:37:17 +01:00
Antoine Legrand	5c15d14f12	Run etcd as pod	2015-12-28 22:04:39 +01:00
Smaine Kahlouch	7315d33e3c	use ip for etcd proxies even when hostnames are used in the inventory	2015-12-21 14:24:10 +01:00
Smaine Kahlouch	ab694ee291	Install python-httplib2 required packaged	2015-12-21 12:00:42 +01:00
Smaine Kahlouch	b155e8cc7b	Fix error in ETCD_INITIAL_CLUSTER loop	2015-12-18 11:22:56 +01:00
Antoine Legrand	3c450191ea	User etcd node ip in initial cluster	2015-12-17 22:47:19 +01:00
Antoine Legrand	184bb8c94d	Use 0755 mode for binaries	2015-12-17 22:46:50 +01:00
Smaine Kahlouch	3a349b8519	Using var file for etcd service	2015-12-16 21:43:29 +01:00
Smaine Kahlouch	e2984b4fdb	ha etcd with calico	2015-12-15 11:49:11 +01:00
Smaine Kahlouch	ef8a46b8c5	Doesn't manage firewall, note: has to be disabled before running the playbook	2015-12-12 19:37:08 +01:00
Smaine Kahlouch	3014dfef24	Clustering etcd for ha masters	2015-12-12 19:37:08 +01:00
ant31	c352df6fc8	Add Backup	2015-11-20 11:18:37 +01:00
Smaine Kahlouch	00c562828f	Initial commit	2015-10-03 22:19:50 +02:00

324 Commits (e250bb65bb08fb142b4911e00f0873379bd51b06)