Deploy a Production Ready Kubernetes Cluster

If you have questions, check the documentation at kubespray.io and join us on the kubernetes slack, channel #kubespray. You can get your invite here

  • Can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Equinix Metal (bare metal), Oracle Cloud Infrastructure (Experimental), or Baremetal
  • Highly available cluster
  • Composable (Choice of the network plugin for instance)
  • Supports most popular Linux distributions
  • Continuous integration tests

Quick Start

To deploy the cluster you can use:

Ansible

Usage

Install Ansible according to the Ansible installation guide, then run the following steps:

# Copy ``inventory/sample`` as ``inventory/mycluster``
cp -rfp inventory/sample inventory/mycluster

# Update Ansible inventory file with inventory builder
declare -a IPS=(10.10.1.3 10.10.1.4 10.10.1.5)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}

# Review and change parameters under ``inventory/mycluster/group_vars``
cat inventory/mycluster/group_vars/all/all.yml
cat inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml

# Deploy Kubespray with Ansible Playbook - run the playbook as root
# The option `--become` is required, as for example writing SSL keys in /etc/,
# installing packages and interacting with various systemd daemons.
# Without --become the playbook will fail to run!
ansible-playbook -i inventory/mycluster/hosts.yaml  --become --become-user=root cluster.yml

Note: When Ansible is already installed via system packages on the control machine, other Python packages installed via sudo pip install -r requirements.txt will go to a different directory tree (e.g. /usr/local/lib/python2.7/dist-packages on Ubuntu) from Ansible's (e.g. /usr/lib/python2.7/dist-packages/ansible, still on Ubuntu). As a consequence, the ansible-playbook command will fail with:

ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

probably pointing to a task that depends on a module present in requirements.txt.

One way of solving this is to uninstall the Ansible system package and install Ansible via pip instead, but this is not always possible. A workaround is to set the ANSIBLE_LIBRARY and ANSIBLE_MODULE_UTILS environment variables before executing ansible-playbook. Point them respectively to the ansible/modules and ansible/module_utils subdirectories of the pip packages installation location, which can be found in the Location field of the output of pip show [package].
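The workaround above can be sketched as two small shell helpers. This is a minimal sketch, not part of Kubespray: the function names pip_location_from_show and set_ansible_paths are hypothetical, and the package name netaddr in the usage comment is an assumption (any package installed from requirements.txt should report the same Location).

```shell
# Hypothetical helper: print the "Location" field from `pip show` output on stdin.
pip_location_from_show() {
  awk -F': ' '/^Location:/ { print $2; exit }'
}

# Hypothetical helper: export both variables for a given pip install location.
set_ansible_paths() {
  export ANSIBLE_LIBRARY="$1/ansible/modules"
  export ANSIBLE_MODULE_UTILS="$1/ansible/module_utils"
}

# Usage, run before ansible-playbook (the package name is an assumption):
#   set_ansible_paths "$(pip show netaddr | pip_location_from_show)"
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
```

Keeping the parsing separate from the export makes it easy to inspect the detected location before changing the environment.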

A simple way to ensure you get the correct version of Ansible is to use the pre-built docker image from Quay. You will then need to use bind mounts to get the inventory and SSH key into the container, like this:

docker pull quay.io/kubespray/kubespray:v2.19.0
docker run --rm -it --mount type=bind,source="$(pwd)"/inventory/sample,dst=/inventory \
  --mount type=bind,source="${HOME}"/.ssh/id_rsa,dst=/root/.ssh/id_rsa \
  quay.io/kubespray/kubespray:v2.19.0 bash
# Inside the container you may now run the kubespray playbooks:
ansible-playbook -i /inventory/inventory.ini --private-key /root/.ssh/id_rsa cluster.yml

Vagrant

For Vagrant we need to install python dependencies for provisioning tasks. Check if Python and pip are installed:

python -V && pip -V

If this returns the version of the software, you're good to go. If not, download and install Python from here https://www.python.org/downloads/source/

Install Ansible according to the Ansible installation guide, then run the following step:

vagrant up

Documents

Supported Linux Distributions

  • Flatcar Container Linux by Kinvolk
  • Debian Bullseye, Buster, Jessie, Stretch
  • Ubuntu 16.04, 18.04, 20.04
  • CentOS/RHEL 7, 8
  • Fedora 34, 35
  • Fedora CoreOS (see fcos Note)
  • openSUSE Leap 15.x/Tumbleweed
  • Oracle Linux 7, 8
  • Alma Linux 8
  • Rocky Linux 8
  • Amazon Linux 2 (experimental: see amazon linux notes)

Note: Upstart/SysV init based OS types are not supported.

Supported Components

Container Runtime Notes

  • The available docker versions are 18.09, 19.03 and 20.10. The recommended docker version is 20.10. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the yum versionlock plugin or apt pin.
  • The cri-o version should be aligned with the respective kubernetes version (e.g. kube_version=1.20.x, crio_version=1.20)
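The cri-o alignment rule in the list above amounts to dropping the patch component of the Kubernetes version. A minimal sketch, assuming version strings of the form 1.20.x with an optional leading "v"; the function name crio_version_for is hypothetical and not a Kubespray helper:

```shell
# Hypothetical helper: derive the major.minor cri-o version
# that matches a given Kubernetes version string.
crio_version_for() {
  crio_v="${1#v}"              # drop an optional leading "v"
  crio_major="${crio_v%%.*}"   # text before the first dot
  crio_rest="${crio_v#*.}"     # text after the first dot
  crio_minor="${crio_rest%%.*}"
  printf '%s.%s\n' "$crio_major" "$crio_minor"
}

# Usage:
#   crio_version_for 1.20.7
```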

Requirements

  • Minimum required version of Kubernetes is v1.21
  • Ansible v2.9.x, Jinja 2.11+ and python-netaddr are installed on the machine that will run Ansible commands
  • The target servers must have access to the Internet in order to pull docker images. Otherwise, additional configuration is required (See Offline Environment)
  • The target servers are configured to allow IPv4 forwarding.
  • If using IPv6 for pods and services, the target servers are configured to allow IPv6 forwarding.
  • The firewalls are not managed; you'll need to implement your own rules the way you used to. To avoid any issues during deployment, you should disable your firewall.
  • If kubespray is run from a non-root user account, the correct privilege escalation method should be configured on the target servers. Then the ansible_become flag or the command parameters --become or -b should be specified.
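The forwarding requirements in the list above can be checked on a target server by reading the kernel's procfs switches. This is a minimal sketch under the assumption of a standard Linux /proc layout; the function name forwarding_state is hypothetical:

```shell
# Hypothetical helper: print "enabled" when the given procfs file contains "1",
# otherwise "disabled" (also when the file is missing).
forwarding_state() {
  if [ "$(cat "$1" 2>/dev/null)" = "1" ]; then
    echo enabled
  else
    echo disabled
  fi
}

# Usage on a target server:
#   forwarding_state /proc/sys/net/ipv4/ip_forward
#   forwarding_state /proc/sys/net/ipv6/conf/all/forwarding
```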

Hardware: These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the Building Large Clusters guide.

  • Master
    • Memory: 1500 MB
  • Node
    • Memory: 1024 MB
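To see ahead of time whether a host meets the memory minimums listed above, one option is to read MemTotal from /proc/meminfo. This is a rough sketch, not how Kubespray itself performs the check; the function names mem_total_mb and check_min_memory are hypothetical, and the measured value may differ slightly from what Ansible reports.

```shell
# Hypothetical helper: print total memory in MB from a meminfo-format file.
# MemTotal is reported in kB, so divide by 1024.
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: compare a memory size in MB against the minimum
# for a role ("master" needs 1500 MB, anything else is treated as a node, 1024 MB).
check_min_memory() {
  case "$1" in
    master) mem_required=1500 ;;
    *)      mem_required=1024 ;;
  esac
  if [ "$2" -ge "$mem_required" ]; then
    echo ok
  else
    echo "too small"
  fi
}

# Usage on a target server:
#   check_min_memory master "$(mem_total_mb)"
```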

Network Plugins

You can choose between 10 network plugins. (default: calico, except Vagrant uses flannel)

  • flannel: gre/vxlan (layer 2) networking.

  • Calico is a networking and network policy provider. Calico supports a flexible set of networking options designed to give you the most efficient networking across a range of situations, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio and Envoy) applications at the service mesh layer.

  • canal: a composition of calico and flannel plugins.

  • cilium: layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.

  • weave: Weave is a lightweight container overlay network that doesn't require an external K/V database cluster. (Please refer to weave troubleshooting documentation).

  • kube-ovn: Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.

  • kube-router: Kube-router is an L3 CNI for Kubernetes networking aiming to provide operational simplicity and high performance: it uses IPVS to provide Kube Services Proxy (if set up to replace kube-proxy), iptables for network policies, and BGP for pods L3 networking (with optional BGP peering with out-of-cluster BGP peers). It can also optionally advertise routes to Kubernetes cluster Pods CIDRs, ClusterIPs, ExternalIPs and LoadBalancerIPs.

  • macvlan: Macvlan is a Linux network driver. Pods have their own unique MAC and IP address, connected directly to the physical (layer 2) network.

  • multus: Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.

The choice is defined with the variable kube_network_plugin. There is also an option to leverage built-in cloud provider networking instead. See also Network checker.
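To see which plugin an inventory currently selects, the kube_network_plugin value can be read from the group_vars file reviewed in the Quick Start. A minimal sketch: the function name current_network_plugin is hypothetical, and it assumes the variable is set on a single unquoted top-level line of the form "kube_network_plugin: value".

```shell
# Hypothetical helper: print the value of kube_network_plugin from a YAML vars file.
# Only matches the exact key, so kube_network_plugin_multus is ignored.
current_network_plugin() {
  awk -F': *' '/^kube_network_plugin:/ { print $2; exit }' "$1"
}

# Usage:
#   current_network_plugin inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
# A one-off override can also be passed as an Ansible extra variable:
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root \
#     -e kube_network_plugin=flannel cluster.yml
```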

Ingress Plugins

  • nginx: the NGINX Ingress Controller.

  • metallb: the MetalLB bare-metal service LoadBalancer provider.

Community docs and resources

Tools and projects on top of Kubespray

CI Tests

Build graphs

CI/end-to-end tests sponsored by: CNCF, Equinix Metal, OVHcloud, ELASTX.

See the test matrix for details.