#!/bin/bash
# Create & manage k8s clusters

set -o nounset
set -o errexit
#set -o xtrace

function usage() {
    echo -e "\033[33mUsage:\033[0m ezctl COMMAND [args]"
    cat <<EOF
    checkout    to switch default kubeconfig of the cluster
    new         to start a new k8s deploy with name 'cluster'
    setup       to set up a cluster, also supporting a step-by-step way
    start       to start all of the k8s services stopped by 'ezctl stop'
    stop        to stop all of the k8s services temporarily
    upgrade     to upgrade the k8s cluster
    destroy     to destroy the k8s cluster
    backup      to back up the cluster state (etcd snapshot)
    restore     to restore the cluster state from backups
    start-aio   to quickly set up an all-in-one cluster with 'default' settings

Cluster ops:
    add-etcd    to add an etcd node to the etcd cluster
    add-master  to add a master node to the k8s cluster
    add-node    to add a work node to the k8s cluster
    del-etcd    to delete an etcd node from the etcd cluster
    del-master  to delete a master node from the k8s cluster
    del-node    to delete a work node from the k8s cluster

Extra operation:
    kcfg-adm    to manage client kubeconfig of the k8s cluster

Use "ezctl help <command>" for more information about a given command.
EOF
}

# Print a timestamped, colorized log line: logger <debug|info|warn|error> <message>
function logger() {
    TIMESTAMP=$(date +'%Y-%m-%d %H:%M:%S')
    case "$1" in
        debug)
            echo -e "$TIMESTAMP \033[36mDEBUG\033[0m $2"
            ;;
        info)
            echo -e "$TIMESTAMP \033[32mINFO\033[0m $2"
            ;;
        warn)
            echo -e "$TIMESTAMP \033[33mWARN\033[0m $2"
            ;;
        error)
            echo -e "$TIMESTAMP \033[31mERROR\033[0m $2"
            ;;
        *)
            ;;
    esac
}

function help-info() {
    case "$1" in
        (setup)
            usage-setup
            ;;
        (add-etcd)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-etcd.md'"
            ;;
        (add-master)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-master.md'"
            ;;
        (add-node)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-node.md'"
            ;;
        (del-etcd)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-etcd.md'"
            ;;
        (del-master)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-master.md'"
            ;;
        (del-node)
            echo -e "read more > 'https://github.com/easzlab/kubeasz/blob/master/docs/op/op-node.md'"
            ;;
        (kcfg-adm)
            usage-kcfg-adm
            ;;
        (*)
            echo -e "todo: help info $1"
            ;;
    esac
}

function usage-kcfg-adm(){
    echo -e "\033[33mUsage:\033[0m ezctl kcfg-adm "
    cat <<EOF
    -A          to add a client kubeconfig with a newly created user
    -D          to delete a client kubeconfig with the existing user
    -L          to list all of the users
    -e          to set expiry of the user certs in hours (ex. 24h, 8h, 240h)
    -t          to set a user-type (admin or view)
    -u          to set a user-name prefix

examples: ./ezctl kcfg-adm test-k8s -L
          ./ezctl kcfg-adm default -A -e 240h -t admin -u jack
          ./ezctl kcfg-adm default -D -u jim-202101162141
EOF
}

function usage-setup(){
    echo -e "\033[33mUsage:\033[0m ezctl setup "
    cat < /dev/null 2>&1 || { logger debug "disable registry mirrors"; registryMirror=false; }

    sed -i -e "s/__flannel__/$flannelVer/g" \
        -e "s/__calico__/$calicoVer/g" \
        -e "s/__cilium__/$ciliumVer/g" \
        -e "s/__kube_ovn__/$kubeOvnVer/g" \
        -e "s/__kube_router__/$kubeRouterVer/g" \
        -e "s/__coredns__/$corednsVer/g" \
        -e "s/__dns_node_cache__/$dnsNodeCacheVer/g" \
        -e "s/__dashboard__/$dashboardVer/g" \
        -e "s/__dash_metrics__/$dashboardMetricsScraperVer/g" \
        -e "s/__prom_chart__/$promChartVer/g" \
        -e "s/__traefik_chart__/$traefikChartVer/g" \
        -e "s/^ENABLE_MIRROR_REGISTRY.*$/ENABLE_MIRROR_REGISTRY: $registryMirror/g" \
        -e "s/__metrics__/$metricsVer/g" "clusters/$1/config.yml"

    logger debug "cluster $1: files successfully created."
    logger info "next steps 1: to config '$BASE/clusters/$1/hosts'"
    logger info "next steps 2: to config '$BASE/clusters/$1/config.yml'"
}

function setup() {
    [[ -d "clusters/$1" ]] || { logger error "invalid config, run 'ezctl new $1' first"; return 1; }
    [[ -f "bin/kube-apiserver" ]] || { logger error "no binaries found, run 'ezdown -D' first"; return 1; }

    # map the requested step to its playbook
    PLAY_BOOK="dummy.yml"
    case "$2" in
        (01)
            PLAY_BOOK="01.prepare.yml"
            ;;
        (02)
            PLAY_BOOK="02.etcd.yml"
            ;;
        (03)
            PLAY_BOOK="03.runtime.yml"
            ;;
        (04)
            PLAY_BOOK="04.kube-master.yml"
            ;;
        (05)
            PLAY_BOOK="05.kube-node.yml"
            ;;
        (06)
            PLAY_BOOK="06.network.yml"
            ;;
        (07)
            PLAY_BOOK="07.cluster-addon.yml"
            ;;
        (90)
            PLAY_BOOK="90.setup.yml"
            ;;
        (all)
            PLAY_BOOK="90.setup.yml"
            ;;
        (*)
            usage-setup
            exit 1
            ;;
    esac

    # 'read' succeeds only if a key is pressed within 5s; a timeout lets the run continue
    logger info "cluster:$1 setup step:$2 begins in 5s, press any key to abort:\n"
    ! (read -r -t5 -n1) || { logger warn "setup abort"; return 1; }

    ansible-playbook -i "clusters/$1/hosts" -e "@clusters/$1/config.yml" "playbooks/$PLAY_BOOK" || return 1
}

function cmd() {
    [[ -d "clusters/$1" ]] || { logger error "invalid config, run 'ezctl new $1' first"; return 1; }

    PLAY_BOOK="dummy.yml"
    case "$2" in
        (start)
            PLAY_BOOK="91.start.yml"
            ;;
        (stop)
            PLAY_BOOK="92.stop.yml"
            ;;
        (upgrade)
            PLAY_BOOK="93.upgrade.yml"
            ;;
        (backup)
            PLAY_BOOK="94.backup.yml"
            ;;
        (restore)
            PLAY_BOOK="95.restore.yml"
            ;;
        (destroy)
            PLAY_BOOK="99.clean.yml"
            ;;
        (*)
            usage
            exit 1
            ;;
    esac

    logger info "cluster:$1 $2 begins in 5s, press any key to abort:\n"
    ! (read -r -t5 -n1) || { logger warn "$2 abort"; return 1; }

    ansible-playbook -i "clusters/$1/hosts" -e "@clusters/$1/config.yml" "playbooks/$PLAY_BOOK" || return 1
}

function list() {
    [[ -d ./clusters ]] || { logger error "cluster not found, run 'ezctl new' first"; return 1; }
    [[ -f ~/.kube/config ]] || { logger error "kubeconfig not found, run 'ezctl setup' first"; return 1; }
    which md5sum > /dev/null 2>&1 || { logger error "md5sum not found"; return 1; }

    CLUSTERS=$(cd clusters && echo -- *)
    CFG_MD5=$(md5sum -t ~/.kube/config |cut -d' ' -f1)
    cd "$BASE"

    # the current cluster is the one whose kubeconfig matches ~/.kube/config
    logger info "list of managed clusters:"
    i=1; for c in $CLUSTERS;
    do
        if [[ -f "clusters/$c/kubectl.kubeconfig" ]];then
            c_md5=$(md5sum -t "clusters/$c/kubectl.kubeconfig" |cut -d' ' -f1)
            if [[ "$c_md5" = "$CFG_MD5" ]];then
                echo -e "==> cluster $i:\t$c (\033[32mcurrent\033[0m)"
            else
                echo -e "==> cluster $i:\t$c"
            fi
            let "i++"
        fi
    done
}

function checkout() {
    [[ -d "clusters/$1" ]] || { logger error "invalid config, run 'ezctl new $1' first"; return 1; }
    [[ -f "clusters/$1/kubectl.kubeconfig" ]] || { logger error "invalid kubeconfig, run 'ezctl setup $1' first"; return 1; }
    logger info "set default kubeconfig: cluster $1 (\033[32mcurrent\033[0m)"
    /bin/cp -f "clusters/$1/kubectl.kubeconfig" ~/.kube/config
}

### in-cluster operation functions ##############################
function add-node() {
    # check that the new node's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 1; }

    # check if the new node already exists
    sed -n '/^\[kube_master/,/^\[harbor/p' "$BASE/clusters/$1/hosts"|grep "^$2[^0-9]" && { logger error "node $2 already exists in $BASE/clusters/$1/hosts"; return 2; }

    logger info "add $2 into 'kube_node' group"
    NODE_INFO="${*:2}"
    sed -i "/\[kube_node/a $NODE_INFO" "$BASE/clusters/$1/hosts"

    logger info "start to add a work node:$2 into cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/22.addnode.yml" -e "NODE_TO_ADD=$2" -e "@clusters/$1/config.yml"
}

function add-master() {
    # check that the new master's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 1; }

    # check if the new master already exists
    sed -n '/^\[kube_master/,/^\[kube_node/p' "$BASE/clusters/$1/hosts"|grep "^$2[^0-9]" && { logger error "master $2 already exists!"; return 2; }

    logger info "add $2 into 'kube_master' group"
    MASTER_INFO="${*:2}"
    sed -i "/\[kube_master/a $MASTER_INFO" "$BASE/clusters/$1/hosts"

    logger info "start to add a master node:$2 into cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/23.addmaster.yml" -e "NODE_TO_ADD=$2" -e "@clusters/$1/config.yml"

    logger info "reconfigure and restart the haproxy service on 'kube_node' nodes"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/05.kube-node.yml" -t restart_lb -e MASTER_CHG=yes -e "@clusters/$1/config.yml"
}

function add-etcd() {
    # check that the new node's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 1; }

    # check if the new node already exists
    sed -n '/^\[etcd/,/^\[kube_master/p' "$BASE/clusters/$1/hosts"|grep "^$2[^0-9]" && { logger error "etcd $2 already exists!"; return 2; }

    logger info "add $2 into 'etcd' group"
    ETCD_INFO="${*:2}"
    sed -i "/\[etcd/a $ETCD_INFO" "$BASE/clusters/$1/hosts"

    logger info "start to add an etcd node:$2 into cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/21.addetcd.yml" -e "NODE_TO_ADD=$2" -e "@clusters/$1/config.yml"

    logger info "reconfigure and restart the etcd cluster"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/02.etcd.yml" -t restart_etcd -e "@clusters/$1/config.yml"

    logger info "restart apiservers to use the new etcd cluster"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/04.kube-master.yml" -t restart_master -e "@clusters/$1/config.yml"
}

function del-etcd() {
    # check that the node's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 1; }

    logger warn "start to delete the etcd node:$2 from cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/31.deletcd.yml" -e "ETCD_TO_DEL=$2" -e "CLUSTER=$1" -e "@clusters/$1/config.yml"

    logger info "reconfigure and restart the etcd cluster"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/02.etcd.yml" -t restart_etcd -e "@clusters/$1/config.yml"

    logger info "restart apiservers to use the new etcd cluster"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/04.kube-master.yml" -t restart_master -e "@clusters/$1/config.yml"
}

function del-node() {
    # check that the node's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 2; }

    logger warn "start to delete the node:$2 from cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/32.delnode.yml" -e "NODE_TO_DEL=$2" -e "CLUSTER=$1" -e "@clusters/$1/config.yml"
}

function del-master() {
    # check that the node's address is a valid IPv4 address
    [[ $2 =~ ^(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})(\.(2(5[0-5]{1}|[0-4][0-9]{1})|[0-1]?[0-9]{1,2})){3}$ ]] || { logger error "invalid IP address: $2"; return 2; }

    logger warn "start to delete the master:$2 from cluster:$1"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/33.delmaster.yml" -e "NODE_TO_DEL=$2" -e "CLUSTER=$1" -e "@clusters/$1/config.yml"

    logger info "reconfigure kubeconfig on the ansible manage node"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/roles/deploy/deploy.yml" -t create_kctl_cfg -e "@clusters/$1/config.yml"

    logger info "reconfigure and restart the haproxy service on 'kube_node' nodes"
    ansible-playbook -i "$BASE/clusters/$1/hosts" "$BASE/playbooks/05.kube-node.yml" -t restart_lb -e MASTER_CHG=yes -e "@clusters/$1/config.yml"
}

function start-aio(){
    set +u
    # ENV 'HOST_IP' is set when 'ezctl' runs inside a docker container
    if [[ -z $HOST_IP ]];then
        # ezctl runs on a host machine, get the host's ip from the default route's interface
        HOST_IF=$(ip route|grep default|cut -d' ' -f5)
        HOST_IP=$(ip a|grep "$HOST_IF$"|head -n1|awk '{print $2}'|cut -d'/' -f1)
    fi
    set -u
    logger info "get local host IP address: $HOST_IP"

    # allow ssh login using key locally
    if [[ ! -e /root/.ssh/id_rsa ]]; then
        logger debug "generate ssh key pair"
        ssh-keygen -t rsa -b 2048 -N '' -f /root/.ssh/id_rsa > /dev/null
        cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
        ssh-keyscan -t ecdsa -H "$HOST_IP" >> /root/.ssh/known_hosts
    fi

    new default
    /bin/cp -f example/hosts.allinone "clusters/default/hosts"
    sed -i "s/_cluster_name_/default/g" "clusters/default/hosts"
    sed -i "s/192\.168\.1\.1/$HOST_IP/g" "clusters/default/hosts"

    setup default all
}

### Extra functions #############################################
EXPIRY=4800h          # default cert will expire in 200 days
USER_TYPE=admin       # admin/view, admin=clusterrole:cluster-admin view=clusterrole:view
USER_NAME=user

function kcfg-adm() {
    # $1 is the cluster name, so option parsing starts at the second argument
    OPTIND=2
    ACTION=""
    while getopts "ADLe:t:u:" OPTION; do
        case $OPTION in
            A)
                ACTION="add-kcfg $1"
                ;;
            D)
                ACTION="del-kcfg $1"
                ;;
            L)
                ACTION="list-kcfg $1"
                ;;
            e)
                EXPIRY="$OPTARG"
                [[ $OPTARG =~ ^[1-9][0-9]*h$ ]] || { logger error "'-e' must be set like '2h, 5h, 50000h, ...'"; exit 1; }
                ;;
            t)
                USER_TYPE="$OPTARG"
                [[ $OPTARG =~ ^(admin|view)$ ]] || { logger error "'-t' can only be set as 'admin' or 'view'"; exit 1; }
                ;;
            u)
                USER_NAME="$OPTARG"
                ;;
            ?)
                help-info kcfg-adm
                return 1
                ;;
        esac
    done

    [[ "$ACTION" == "" ]] && { logger error "illegal option"; help-info kcfg-adm; exit 1; }

    logger info "$ACTION"
    ${ACTION} || { logger error "$ACTION fail"; return 1; }
    logger info "$ACTION success"
}

function add-kcfg(){
    # append a minute-resolution timestamp so every generated user name is unique
    USER_NAME="$USER_NAME"-$(date +'%Y%m%d%H%M')
    logger info "add-kcfg in cluster:$1 with user:$USER_NAME"
    ansible-playbook -i "clusters/$1/hosts" -e "@clusters/$1/config.yml" -e "CUSTOM_EXPIRY=$EXPIRY" \
        -e "USER_TYPE=$USER_TYPE" -e "USER_NAME=$USER_NAME" -e "ADD_KCFG=true" \
        -t add-kcfg "roles/deploy/deploy.yml"
}

function del-kcfg(){
    logger info "del-kcfg in cluster:$1 with user:$USER_NAME"
    CRB=$(bin/kubectl --kubeconfig="clusters/$1/kubectl.kubeconfig" get clusterrolebindings -ojsonpath="{.items[?(@.subjects[0].name == '$USER_NAME')].metadata.name}") && \
    bin/kubectl --kubeconfig="clusters/$1/kubectl.kubeconfig" delete clusterrolebindings "$CRB" && \
    /bin/rm -f "clusters/$1/ssl/users/$USER_NAME"*
}

function list-kcfg(){
    logger info "list-kcfg in cluster:$1"
    ADMINS=$(bin/kubectl --kubeconfig="clusters/$1/kubectl.kubeconfig" get clusterrolebindings -ojsonpath='{.items[?(@.roleRef.name == "cluster-admin")].subjects[*].name}')
    VIEWS=$(bin/kubectl --kubeconfig="clusters/$1/kubectl.kubeconfig" get clusterrolebindings -ojsonpath='{.items[?(@.roleRef.name == "view")].subjects[*].name}')
    ALL=$(bin/kubectl --kubeconfig="clusters/$1/kubectl.kubeconfig" get clusterrolebindings -ojsonpath='{.items[*].subjects[*].name}')

    printf "\n%-30s %-15s %-20s\n" USER TYPE "EXPIRY(+8h if in Asia/Shanghai)"
    echo "---------------------------------------------------------------------------------"

    # only users created by 'add-kcfg' (name ends with a 12-digit timestamp) are listed
    for u in $ADMINS; do
        if [[ $u =~ ^.*-[0-9]{12}$ ]];then
            t=$(bin/cfssl-certinfo -cert "clusters/$1/ssl/users/$u.pem"|grep not_after|awk '{print $2}'|sed 's/"//g'|sed 's/,//g')
            printf "%-30s %-15s %-20s\n" "$u" cluster-admin "$t"
        fi
    done;

    for u in $VIEWS; do
        if [[ $u =~ ^.*-[0-9]{12}$ ]];then
            t=$(bin/cfssl-certinfo -cert "clusters/$1/ssl/users/$u.pem"|grep not_after|awk '{print $2}'|sed 's/"//g'|sed 's/,//g')
            printf "%-30s %-15s %-20s\n" "$u" view "$t"
        fi
    done;

    for u in $ALL; do
        if [[ $u =~ ^.*-[0-9]{12}$ ]];then
            [[ $ADMINS == *$u* ]] || [[ $VIEWS == *$u* ]] || {
                t=$(bin/cfssl-certinfo -cert "clusters/$1/ssl/users/$u.pem"|grep not_after|awk '{print $2}'|sed 's/"//g'|sed 's/,//g')
                printf "%-30s %-15s %-20s\n" "$u" unknown "$t"
            }
        fi
    done;
    echo ""
}

### Main Lines ##################################################
function main() {
    BASE="/etc/kubeasz"
    [[ -d "$BASE" ]] || { logger error "invalid dir:$BASE, try: 'ezdown -D'"; exit 1; }
    cd "$BASE"

    # check bash shell
    readlink /proc/$$/exe|grep -q "dash" && { logger error "you should use bash shell only"; exit 1; }

    # check 'ansible' executable
    which ansible > /dev/null 2>&1 || { logger error "need 'ansible', try: 'pip install ansible'"; usage; exit 1; }

    [ "$#" -gt 0 ] || { usage >&2; exit 2; }

    case "$1" in
        ### in-cluster operations #####################
        (add-etcd)
            [ "$#" -gt 2 ] || { usage >&2; exit 2; }
            add-etcd "${@:2}"
            ;;
        (add-master)
            [ "$#" -gt 2 ] || { usage >&2; exit 2; }
            add-master "${@:2}"
            ;;
        (add-node)
            [ "$#" -gt 2 ] || { usage >&2; exit 2; }
            add-node "${@:2}"
            ;;
        (del-etcd)
            [ "$#" -eq 3 ] || { usage >&2; exit 2; }
            del-etcd "$2" "$3"
            ;;
        (del-master)
            [ "$#" -eq 3 ] || { usage >&2; exit 2; }
            del-master "$2" "$3"
            ;;
        (del-node)
            [ "$#" -eq 3 ] || { usage >&2; exit 2; }
            del-node "$2" "$3"
            ;;
        ### cluster-wide operations #######################
        (checkout)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            checkout "$2"
            ;;
        (list)
            [ "$#" -eq 1 ] || { usage >&2; exit 2; }
            list
            ;;
        (new)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            new "$2"
            ;;
        (setup)
            [ "$#" -eq 3 ] || { usage-setup >&2; exit 2; }
            setup "${@:2}"
            ;;
        (start)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" start
            ;;
        (stop)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" stop
            ;;
        (upgrade)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" upgrade
            ;;
        (backup)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" backup
            ;;
        (restore)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" restore
            ;;
        (destroy)
            [ "$#" -eq 2 ] || { usage >&2; exit 2; }
            cmd "$2" destroy
            ;;
        (start-aio)
            [ "$#" -eq 1 ] || { usage >&2; exit 2; }
            start-aio
            ;;
        ### extra operations ##############################
        (kcfg-adm)
            [ "$#" -gt 2 ] || { usage-kcfg-adm >&2; exit 2; }
            kcfg-adm "${@:2}"
            ;;
        (help)
            [ "$#" -gt 1 ] || { usage >&2; exit 2; }
            help-info "$2"
            exit 0
            ;;
        (*)
            usage
            exit 0
            ;;
    esac
}

main "$@"