* rename ansible groups to use _ instead of -
k8s-cluster -> k8s_cluster
k8s-node -> k8s_node
calico-rr -> calico_rr
no-floating -> no_floating
Note: kube-node,k8s-cluster groups in upgrade CI
need clean-up after v2.16 is tagged
* ensure old groups are mapped to the new ones
* Remove contrib/vault
This is marked as broken since 2018 / 3dcb914607
This still reference apiserver.pem, not used since ddffdb63bf
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
* Finish nuking vault from the codebase
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
By default Ansible stat module compute checksum, list extended attributes and find mime type
To find all stat invocations that really use one of those:
git grep -F stat. | grep -vE 'stat.(islnk|exists|lnk_source|writeable)'
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
* Added force_etcd_cert_refresh var to maintain existing functionality. Broke out etcd node cert syncing from member and admin cert sync logic. Now first etcd will sync node certs to other etcd members on every run to keep all etcds up to date after adding additional worker nodes to the cluster
* Updated etcd cert check tasks to better detect when new certificates need to be generated
* Move usage of force_etcd_cert_refresh var to gen_certs fact set
* Force etcd cert generation per server if force_etcd_cert_refresh is set to true
* Include gathering of node certs even if k8s-cluster member and in etcd group.
* Removed run_once due to when statement
* etcd: etcd-events doesn't depend on etcd_cluster_setup
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: remove condition already present on include_tasks
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: fix scaling up
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use *access_addresses, do not delegate to etcd[0]
We want to wait for the full cluster to be healthy,
so use all the cluster addresses
Also we should be able to run the playbook when etcd[0] is down
(not tested), so do not delegate to etcd[0]
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use failed_when for health check
unhealthy cluster is expected on first run, so use failed_when
instead of ignore_errors to remove scary red messages
Also use run_once
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/preinstall: ensure ansible_fqdn is up to date after changing /etc/hosts
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/master: regenerate apiserver cert if needed
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
The variable is defined in `kubernetes/preinstall` role and used in several roles. Since `kubernetes/preinstall` is not always included when `ansible-playbook` is run with tag selectors (see #5734 for reason), they will fail, or individual roles must copy the same fact definitions (as in #3846). Moving the definition to the always-included `kubespray-defaults` role will resolve the dependency problem.
- This solves issue #5721 & #5713 (dupes)
- Provide a cleaner default usage pattern for the download role
around etcd that supports 'host' and 'docker' properly
- Extract the 'etcdctl' as a separate task install piece and reuse it where
appropriate
- Update the kubeadm-etcd task to reflect the above change
* fedora coreos support
- bootstrap and new fact for
* fedora coreos support
- fix bootstrap condition
* fedora coreos support
- allow customize packages for fedora coreos bootstrap
* fedora coreos support
- prevent install ptyhon3 and epel via dnf for fedora coreos
* fedora coreos support
- handle all ostree like os in same way
* fedora coreos support
- handle all ostree like os in same way for crio
* fedora coreos support
- add fcos documentations
* Fix recover-control-plane to work with etcd 3.3.x and add CI
* Set default values for testcase
* Add actual test jobs
* Attempt to satisty gitlab ci linter
* Fix ansible targets
* Set etcd_member_name as stated in the docs...
* Recovering from 0 masters is not supported yet
* Add other master to broken_kube-master group as well
* Increase number of retries to see if etcd needs more time to heal
* Make number of retries for ETCD loops configurable, increase it for recovery CI and document it
We don't need to support upgrades from 2 year old installs,
just from the last major version.
Also changed most retried tasks to 1s delay instead of longer.
Setting host_architecture to allow etcd upgrade working through: ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=etcd (on other case host_architecture is missing)
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
remove empty when line
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
force kubeadm upgrade due to failure without --force flag
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
added nodeSelector to have compatibility with hybrid cluster with win nodes, also fix for download with missing container type
fixes in syntax and LF for newline in files
fix on yamllint check
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
some cleanup for innecesary lines
remove conditions for nodeselector
* Move front-proxy-client certs back to kube mount
We want the same CA for all k8s certs
* Refactor vault to use a third party module
The module adds idempotency and reduces some of the repetitive
logic in the vault role
Requires ansible-modules-hashivault on ansible node and hvac
on the vault hosts themselves
Add upgrade test scenario
Remove bootstrap-os tags from tasks
* fix upgrade issues
* improve unseal logic
* specify ca and fix etcd check
* Fix initialization check
bump machine size
The current way to setup the etc cluster is messy and buggy.
- It checks for cluster is healthy before the cluster is even created.
- The unit files are started on handlers, not in the task, so you mess with "flush handlers".
- The join_member.yml is not used.
- etcd events cluster is not configured for kubeadm
- remove duplicate runs between running the role on etcd nodes and k8s nodes