331 lines
9.1 KiB
Markdown
331 lines
9.1 KiB
Markdown
|
# StatefulSet
|
|||
|
|
|||
|
StatefulSet是为了解决有状态服务的问题(对应Deployments和ReplicaSets是为无状态服务而设计),其应用场景包括:
|
|||
|
|
|||
|
- 稳定的持久化存储,即Pod重新调度后还是能访问到相同的持久化数据,基于PVC来实现
|
|||
|
- 稳定的网络标志,即Pod重新调度后其PodName和HostName不变,基于Headless Service(即没有Cluster IP的Service)来实现
|
|||
|
- 有序部署,有序扩展,即Pod是有顺序的,在部署或者扩展的时候要依据定义的顺序依次依次进行(即从0到N-1,在下一个Pod运行之前所有之前的Pod必须都是Running和Ready状态),基于init containers来实现
|
|||
|
- 有序收缩,有序删除(即从N-1到0)
|
|||
|
|
|||
|
从上面的应用场景可以发现,StatefulSet由以下几个部分组成:
|
|||
|
|
|||
|
- 用于定义网络标志(DNS domain)的Headless Service
|
|||
|
- 用于创建PersistentVolumes的volumeClaimTemplates
|
|||
|
- 定义具体应用的StatefulSet
|
|||
|
|
|||
|
StatefulSet中每个Pod的DNS格式为`statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local`,其中
|
|||
|
|
|||
|
- `serviceName`为Headless Service的名字
|
|||
|
- `0..N-1`为Pod所在的序号,从0开始到N-1
|
|||
|
- `statefulSetName`为StatefulSet的名字
|
|||
|
- `namespace`为服务所在的namespace,Headless Servic和StatefulSet必须在相同的namespace
|
|||
|
- `.cluster.local`为Cluster Domain,
|
|||
|
|
|||
|
## 简单示例
|
|||
|
|
|||
|
以一个简单的nginx服务[web.yaml](../manifests/test/web.yaml)为例:
|
|||
|
|
|||
|
```yaml
|
|||
|
---
|
|||
|
apiVersion: v1
|
|||
|
kind: Service
|
|||
|
metadata:
|
|||
|
name: nginx
|
|||
|
labels:
|
|||
|
app: nginx
|
|||
|
spec:
|
|||
|
ports:
|
|||
|
- port: 80
|
|||
|
name: web
|
|||
|
clusterIP: None
|
|||
|
selector:
|
|||
|
app: nginx
|
|||
|
---
|
|||
|
apiVersion: apps/v1beta1
|
|||
|
kind: StatefulSet
|
|||
|
metadata:
|
|||
|
name: web
|
|||
|
spec:
|
|||
|
serviceName: "nginx"
|
|||
|
replicas: 2
|
|||
|
template:
|
|||
|
metadata:
|
|||
|
labels:
|
|||
|
app: nginx
|
|||
|
spec:
|
|||
|
containers:
|
|||
|
- name: nginx
|
|||
|
image: gcr.io/google_containers/nginx-slim:0.8
|
|||
|
ports:
|
|||
|
- containerPort: 80
|
|||
|
name: web
|
|||
|
volumeMounts:
|
|||
|
- name: www
|
|||
|
mountPath: /usr/share/nginx/html
|
|||
|
volumeClaimTemplates:
|
|||
|
- metadata:
|
|||
|
name: www
|
|||
|
annotations:
|
|||
|
volume.alpha.kubernetes.io/storage-class: anything
|
|||
|
spec:
|
|||
|
accessModes: [ "ReadWriteOnce" ]
|
|||
|
resources:
|
|||
|
requests:
|
|||
|
storage: 1Gi
|
|||
|
```
|
|||
|
|
|||
|
```sh
|
|||
|
$ kubectl create -f web.yaml
|
|||
|
service "nginx" created
|
|||
|
statefulset "web" created
|
|||
|
|
|||
|
# 查看创建的headless service和statefulset
|
|||
|
$ kubectl get service nginx
|
|||
|
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
|||
|
nginx None <none> 80/TCP 1m
|
|||
|
$ kubectl get statefulset web
|
|||
|
NAME DESIRED CURRENT AGE
|
|||
|
web 2 2 2m
|
|||
|
|
|||
|
# 根据volumeClaimTemplates自动创建PVC(在GCE中会自动创建kubernetes.io/gce-pd类型的volume)
|
|||
|
$ kubectl get pvc
|
|||
|
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
|
|||
|
www-web-0 Bound pvc-d064a004-d8d4-11e6-b521-42010a800002 1Gi RWO 16s
|
|||
|
www-web-1 Bound pvc-d06a3946-d8d4-11e6-b521-42010a800002 1Gi RWO 16s
|
|||
|
|
|||
|
# 查看创建的Pod,他们都是有序的
|
|||
|
$ kubectl get pods -l app=nginx
|
|||
|
NAME READY STATUS RESTARTS AGE
|
|||
|
web-0 1/1 Running 0 5m
|
|||
|
web-1 1/1 Running 0 4m
|
|||
|
|
|||
|
# 使用nslookup查看这些Pod的DNS
|
|||
|
$ kubectl run -i --tty --image busybox dns-test --restart=Never --rm /bin/sh
|
|||
|
/ # nslookup web-0.nginx
|
|||
|
Server: 10.0.0.10
|
|||
|
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
|
|||
|
|
|||
|
Name: web-0.nginx
|
|||
|
Address 1: 10.244.2.10
|
|||
|
/ # nslookup web-1.nginx
|
|||
|
Server: 10.0.0.10
|
|||
|
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
|
|||
|
|
|||
|
Name: web-1.nginx
|
|||
|
Address 1: 10.244.3.12
|
|||
|
/ # nslookup web-0.nginx.default.svc.cluster.local
|
|||
|
Server: 10.0.0.10
|
|||
|
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
|
|||
|
|
|||
|
Name: web-0.nginx.default.svc.cluster.local
|
|||
|
Address 1: 10.244.2.10
|
|||
|
```
|
|||
|
|
|||
|
还可以进行其他的操作
|
|||
|
|
|||
|
```sh
|
|||
|
# 扩容
|
|||
|
$ kubectl scale statefulset web --replicas=5
|
|||
|
|
|||
|
# 缩容
|
|||
|
$ kubectl patch statefulset web -p '{"spec":{"replicas":3}}'
|
|||
|
|
|||
|
# 镜像更新(目前还不支持直接更新image,需要patch来间接实现)
|
|||
|
$ kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google_containers/nginx-slim:0.7"}]'
|
|||
|
|
|||
|
# 删除StatefulSet和Headless Service
|
|||
|
$ kubectl delete statefulset web
|
|||
|
$ kubectl delete service nginx
|
|||
|
|
|||
|
# StatefulSet删除后PVC还会保留着,数据不再使用的话也需要删除
|
|||
|
$ kubectl delete pvc www-web-0 www-web-1
|
|||
|
```
|
|||
|
|
|||
|
## zookeeper
|
|||
|
|
|||
|
另外一个更能说明StatefulSet强大功能的示例为[zookeeper.yaml](../manifests/test/zookeeper.yaml)。
|
|||
|
|
|||
|
```yaml
|
|||
|
---
|
|||
|
apiVersion: v1
|
|||
|
kind: Service
|
|||
|
metadata:
|
|||
|
name: zk-headless
|
|||
|
labels:
|
|||
|
app: zk-headless
|
|||
|
spec:
|
|||
|
ports:
|
|||
|
- port: 2888
|
|||
|
name: server
|
|||
|
- port: 3888
|
|||
|
name: leader-election
|
|||
|
clusterIP: None
|
|||
|
selector:
|
|||
|
app: zk
|
|||
|
---
|
|||
|
apiVersion: v1
|
|||
|
kind: ConfigMap
|
|||
|
metadata:
|
|||
|
name: zk-config
|
|||
|
data:
|
|||
|
ensemble: "zk-0;zk-1;zk-2"
|
|||
|
jvm.heap: "2G"
|
|||
|
tick: "2000"
|
|||
|
init: "10"
|
|||
|
sync: "5"
|
|||
|
client.cnxns: "60"
|
|||
|
snap.retain: "3"
|
|||
|
purge.interval: "1"
|
|||
|
---
|
|||
|
apiVersion: policy/v1beta1
|
|||
|
kind: PodDisruptionBudget
|
|||
|
metadata:
|
|||
|
name: zk-budget
|
|||
|
spec:
|
|||
|
selector:
|
|||
|
matchLabels:
|
|||
|
app: zk
|
|||
|
minAvailable: 2
|
|||
|
---
|
|||
|
apiVersion: apps/v1beta1
|
|||
|
kind: StatefulSet
|
|||
|
metadata:
|
|||
|
name: zk
|
|||
|
spec:
|
|||
|
serviceName: zk-headless
|
|||
|
replicas: 3
|
|||
|
template:
|
|||
|
metadata:
|
|||
|
labels:
|
|||
|
app: zk
|
|||
|
annotations:
|
|||
|
pod.alpha.kubernetes.io/initialized: "true"
|
|||
|
scheduler.alpha.kubernetes.io/affinity: >
|
|||
|
{
|
|||
|
"podAntiAffinity": {
|
|||
|
"requiredDuringSchedulingRequiredDuringExecution": [{
|
|||
|
"labelSelector": {
|
|||
|
"matchExpressions": [{
|
|||
|
"key": "app",
|
|||
|
"operator": "In",
|
|||
|
"values": ["zk-headless"]
|
|||
|
}]
|
|||
|
},
|
|||
|
"topologyKey": "kubernetes.io/hostname"
|
|||
|
}]
|
|||
|
}
|
|||
|
}
|
|||
|
spec:
|
|||
|
containers:
|
|||
|
- name: k8szk
|
|||
|
imagePullPolicy: Always
|
|||
|
image: gcr.io/google_samples/k8szk:v1
|
|||
|
resources:
|
|||
|
requests:
|
|||
|
memory: "4Gi"
|
|||
|
cpu: "1"
|
|||
|
ports:
|
|||
|
- containerPort: 2181
|
|||
|
name: client
|
|||
|
- containerPort: 2888
|
|||
|
name: server
|
|||
|
- containerPort: 3888
|
|||
|
name: leader-election
|
|||
|
env:
|
|||
|
- name : ZK_ENSEMBLE
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: ensemble
|
|||
|
- name : ZK_HEAP_SIZE
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: jvm.heap
|
|||
|
- name : ZK_TICK_TIME
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: tick
|
|||
|
- name : ZK_INIT_LIMIT
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: init
|
|||
|
- name : ZK_SYNC_LIMIT
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: tick
|
|||
|
- name : ZK_MAX_CLIENT_CNXNS
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: client.cnxns
|
|||
|
- name: ZK_SNAP_RETAIN_COUNT
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: snap.retain
|
|||
|
- name: ZK_PURGE_INTERVAL
|
|||
|
valueFrom:
|
|||
|
configMapKeyRef:
|
|||
|
name: zk-config
|
|||
|
key: purge.interval
|
|||
|
- name: ZK_CLIENT_PORT
|
|||
|
value: "2181"
|
|||
|
- name: ZK_SERVER_PORT
|
|||
|
value: "2888"
|
|||
|
- name: ZK_ELECTION_PORT
|
|||
|
value: "3888"
|
|||
|
command:
|
|||
|
- sh
|
|||
|
- -c
|
|||
|
- zkGenConfig.sh && zkServer.sh start-foreground
|
|||
|
readinessProbe:
|
|||
|
exec:
|
|||
|
command:
|
|||
|
- "zkOk.sh"
|
|||
|
initialDelaySeconds: 15
|
|||
|
timeoutSeconds: 5
|
|||
|
livenessProbe:
|
|||
|
exec:
|
|||
|
command:
|
|||
|
- "zkOk.sh"
|
|||
|
initialDelaySeconds: 15
|
|||
|
timeoutSeconds: 5
|
|||
|
volumeMounts:
|
|||
|
- name: datadir
|
|||
|
mountPath: /var/lib/zookeeper
|
|||
|
securityContext:
|
|||
|
runAsUser: 1000
|
|||
|
fsGroup: 1000
|
|||
|
volumeClaimTemplates:
|
|||
|
- metadata:
|
|||
|
name: datadir
|
|||
|
annotations:
|
|||
|
volume.alpha.kubernetes.io/storage-class: anything
|
|||
|
spec:
|
|||
|
accessModes: [ "ReadWriteOnce" ]
|
|||
|
resources:
|
|||
|
requests:
|
|||
|
storage: 20Gi
|
|||
|
```
|
|||
|
|
|||
|
```sh
|
|||
|
kubectl create -f zookeeper.yaml
|
|||
|
```
|
|||
|
|
|||
|
详细的使用说明见[zookeeper stateful application](https://kubernetes.io/docs/tutorials/stateful-application/zookeeper/)。
|
|||
|
|
|||
|
## StatefulSet注意事项
|
|||
|
|
|||
|
1. 还在beta状态,需要kubernetes v1.5版本以上才支持
|
|||
|
2. 所有Pod的Volume必须使用PersistentVolume或者是管理员事先创建好
|
|||
|
3. 为了保证数据安全,删除StatefulSet时不会删除Volume
|
|||
|
4. StatefulSet需要一个Headless Service来定义DNS domain,需要在StatefulSet之前创建好
|
|||
|
5. 目前StatefulSet还没有feature complete,比如更新操作还需要手动patch。
|
|||
|
|
|||
|
|
|||
|
更多可以参考[Kubernetes文档](https://kubernetes.io/docs/concepts/abstractions/controllers/statefulsets/)。
|