Deploy a Production Ready Kubernetes Cluster


If you have questions, check the documentation at kubespray.io and join us on the Kubernetes Slack, channel #kubespray. You can get your invite here

- Can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (experimental), or bare metal
- Highly available cluster
- Composable (choice of the network plugin, for instance)
- Supports most popular Linux distributions
- Continuous integration tests

Quick Start

To deploy the cluster you can use:

Ansible

Usage

# Install dependencies from requirements.txt
sudo pip3 install -r requirements.txt

# Copy inventory/sample as inventory/mycluster
cp -rfp inventory/sample inventory/mycluster

# Update the Ansible inventory file with the inventory builder
declare -a IPS=(10.10.1.3 10.10.1.4 10.10.1.5)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py "${IPS[@]}"

# Review and change parameters under inventory/mycluster/group_vars
cat inventory/mycluster/group_vars/all/all.yml
cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml

# Deploy Kubespray with an Ansible playbook, run as root.
# The --become option is required, for example for writing SSL keys in /etc/,
# installing packages and interacting with various systemd daemons.
# Without --become the playbook will fail to run!
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml

Note: When Ansible is already installed via system packages on the control machine, other Python packages installed via sudo pip install -r requirements.txt will go to a different directory tree (e.g. /usr/local/lib/python2.7/dist-packages on Ubuntu) from Ansible's (e.g. /usr/lib/python2.7/dist-packages/ansible, still on Ubuntu). As a consequence, the ansible-playbook command will fail with:

ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

probably pointing to a task that depends on a module present in requirements.txt (e.g. "unseal vault").

One way of solving this is to uninstall the Ansible system package and then install Ansible via pip, but this is not always possible. A workaround is to set the ANSIBLE_LIBRARY and ANSIBLE_MODULE_UTILS environment variables, before executing ansible-playbook, to the ansible/modules and ansible/module_utils subdirectories of the pip package installation location, respectively. That location is shown in the Location field of the output of pip show [package].
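The workaround above could be scripted roughly as follows. This is only a sketch: the helper names pip_location and ansible_env_exports are made up for illustration, and it assumes the package passed to pip show is ansible and that the Location field appears on a line of its own.

```shell
# Extract the value of the "Location:" field from `pip show` output on stdin.
pip_location() {
  sed -n 's/^Location: //p' | head -n 1
}

# Print export statements for the two variables, given the pip install location.
ansible_env_exports() {
  printf 'export ANSIBLE_LIBRARY="%s/ansible/modules"\n' "$1"
  printf 'export ANSIBLE_MODULE_UTILS="%s/ansible/module_utils"\n' "$1"
}

# Usage (assumes the pip-installed package is named "ansible"):
#   eval "$(ansible_env_exports "$(pip show ansible | pip_location)")"
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
```

Printing the export lines instead of exporting directly makes it possible to inspect the paths before applying them.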

Vagrant

For Vagrant we need to install python dependencies for provisioning tasks. Check if Python and pip are installed:

python -V && pip -V

If this returns the version of the software, you're good to go. If not, download and install Python from https://www.python.org/downloads/source/. Then install the necessary requirements:

sudo pip install -r requirements.txt
vagrant up

Documents

Supported Linux Distributions

Flatcar Container Linux by Kinvolk
Debian Buster, Jessie, Stretch, Wheezy
Ubuntu 16.04, 18.04, 20.04
CentOS/RHEL 7, 8 (experimental: see centos 8 notes)
Fedora 31, 32
Fedora CoreOS (experimental: see fcos Note)
openSUSE Leap 42.3/Tumbleweed
Oracle Linux 7, 8 (experimental: centos 8 notes apply)

Note: Upstart/SysV init based OS types are not supported.

Supported Components

Core
- kubernetes v1.19.6
- etcd v3.4.13
- docker v19.03 (see note)
- containerd v1.3.9
- cri-o v1.19 (experimental: see CRI-O Note. Only on fedora, ubuntu and centos based OS)
Network Plugin
- cni-plugins v0.9.0
- calico v3.16.5
- canal (given calico/flannel versions)
- cilium v1.8.6
- flanneld v0.13.0
- kube-ovn v1.5.2
- kube-router v1.1.1
- multus v3.6.0
- ovn4nfv v1.1.0
- weave v2.7.0
Application
- ambassador v1.5
- cephfs-provisioner v2.1.0-k8s1.11
- rbd-provisioner v2.1.1-k8s1.11
- cert-manager v0.16.1
- coredns v1.7.0
- ingress-nginx v0.41.2

Note: The list of validated docker versions is 1.13.1, 17.03, 17.06, 17.09, 18.06, 18.09 and 19.03. The recommended docker version is 19.03. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the yum versionlock plugin or apt pinning.
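As a rough illustration of the apt pin approach mentioned in the note, the sketch below prints an apt preferences stanza that holds a package at a version pattern. The helper name apt_pin_stanza, the package name docker-ce, the version pattern 5:19.03.* and the target file name are all assumptions, not values taken from Kubespray; adjust them to what your distribution actually ships.

```shell
# Print an apt preferences stanza pinning a package to a version pattern.
#   $1: package name (e.g. docker-ce, an assumed name)
#   $2: version pattern (e.g. '5:19.03.*', an assumed pattern)
apt_pin_stanza() {
  printf 'Package: %s\n' "$1"
  printf 'Pin: version %s\n' "$2"
  # A priority above 1000 makes apt keep this version even if that means a downgrade.
  printf 'Pin-Priority: 1001\n'
}

# Example (hypothetical file name, requires root):
#   apt_pin_stanza docker-ce '5:19.03.*' | sudo tee /etc/apt/preferences.d/docker-ce
```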

Requirements

Minimum required version of Kubernetes is v1.17
Ansible v2.9.x, Jinja 2.11+ and python-netaddr are installed on the machine that will run Ansible commands. Ansible 2.10.x is not supported for now.
The target servers must have access to the Internet in order to pull docker images. Otherwise, additional configuration is required (See Offline Environment)
The target servers are configured to allow IPv4 forwarding.
The firewalls are not managed; you'll need to implement your own rules the way you used to. To avoid any issues during deployment, you should disable your firewall.
If Kubespray is run from a non-root user account, the correct privilege escalation method should be configured on the target servers. Then the ansible_become flag or the command parameters --become or -b should be specified.
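Two of the requirements above lend themselves to a quick pre-flight check. The following is a minimal sketch, not part of Kubespray: the function names are hypothetical, the version check only handles plain release versions (not pre-releases), and the forwarding check assumes a Linux host exposing /proc/sys/net/ipv4/ip_forward.

```shell
# Return 0 when the version string is a 2.9.x release (2.9.0 <= version < 2.10.0).
ansible_version_supported() {
  case "$1" in
    2.9|2.9.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 when IPv4 forwarding is enabled. The optional first argument is the
# file to read; it defaults to the kernel's setting.
ipv4_forwarding_enabled() {
  [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Usage on the control machine (assumes `ansible --version` prints
# "ansible <version>" on its first line, as Ansible 2.9 does):
#   ansible_version_supported "$(ansible --version | awk 'NR==1 {print $2}')" \
#     || echo "unsupported Ansible version" >&2
# Usage on a target server:
#   ipv4_forwarding_enabled || echo "IPv4 forwarding is disabled" >&2
```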

Hardware: These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the Building Large Clusters guide.

Master
- Memory: 1500 MB
Node
- Memory: 1024 MB
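The memory minimums above can be checked ahead of a deployment with something like the sketch below. The function name memory_sufficient and the role labels master and node are illustrative; how Kubespray itself measures memory is not shown here.

```shell
# Return 0 when the given memory (in MB) meets the minimum for the role,
# 1 when it does not, and 2 for an unknown role.
#   $1: role, "master" or "node"
#   $2: available memory in MB
memory_sufficient() {
  case "$1" in
    master) [ "$2" -ge 1500 ] ;;
    node)   [ "$2" -ge 1024 ] ;;
    *)      return 2 ;;
  esac
}

# Usage on a Linux host (MemTotal in /proc/meminfo is reported in kB):
#   mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
#   memory_sufficient node "$mem_mb" || echo "not enough memory for a node" >&2
```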

Network Plugins

You can choose between 10 network plugins. (default: calico, except Vagrant uses flannel)

flannel: gre/vxlan (layer 2) networking.
calico: Calico is a networking and network policy provider. Calico supports a flexible set of networking options designed to give you the most efficient networking across a range of situations, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio and Envoy) applications at the service mesh layer.
canal: a composition of calico and flannel plugins.
cilium: layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
ovn4nfv: ovn4nfv-k8s-plugins is the network controller, OVS agent and CNI server to offer basic SFC and OVN overlay networking.
weave: Weave is a lightweight container overlay network that doesn't require an external K/V database cluster. (Please refer to weave troubleshooting documentation).
kube-ovn: Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.
kube-router: Kube-router is a L3 CNI for Kubernetes networking aiming to provide operational simplicity and high performance: it uses IPVS to provide Kube Services Proxy (if set up to replace kube-proxy), iptables for network policies, and BGP for Pods L3 networking (optionally with BGP peering with out-of-cluster BGP peers). It can also optionally advertise routes to Kubernetes cluster Pods CIDRs, ClusterIPs, ExternalIPs and LoadBalancerIPs.
macvlan: Macvlan is a Linux network driver. Pods have their own unique MAC and IP address, connected directly to the physical (layer 2) network.
multus: Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.

The choice is defined with the variable kube_network_plugin. There is also an option to leverage built-in cloud provider networking instead. See also Network checker.
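One way to select a plugin without editing the group_vars files is to override kube_network_plugin on the command line with Ansible's -e (extra vars) option. The helper below is a hypothetical sketch: it assumes the plugin names listed above are accepted verbatim as values, and it leaves out multus on the assumption that a meta plugin is enabled through a separate setting (check the multus documentation).

```shell
# Print the extra-vars arguments that select a network plugin, or fail for a
# name that is not in the accepted set.
# Assumption: each name below is a valid value for kube_network_plugin.
network_plugin_args() {
  case "$1" in
    flannel|calico|canal|cilium|ovn4nfv|weave|kube-ovn|kube-router|macvlan)
      printf '%s\n' "-e kube_network_plugin=$1" ;;
    *)
      echo "unknown network plugin: $1" >&2
      return 1 ;;
  esac
}

# Usage (the substitution is left unquoted on purpose so it splits into two arguments):
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root \
#     $(network_plugin_args cilium) cluster.yml
```

Setting the variable in inventory/mycluster/group_vars instead keeps the choice with the rest of the cluster configuration, which is usually easier to track over time.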

Ingress Plugins

ambassador: the Ambassador Ingress Controller and API gateway.
nginx: the NGINX Ingress Controller.

Community docs and resources

Tools and projects on top of Kubespray

CI Tests

CI/end-to-end tests sponsored by: CNCF, Packet, OVHcloud, ELASTX.

See the test matrix for details.