Use Heapster to get metric data for cluster objects

pull/57/head
Jimmy Song 2017-10-16 17:34:53 +08:00
parent 267bd09101
commit af8b75f354
7 changed files with 308 additions and 6 deletions

View File

@ -77,6 +77,7 @@
- [4.3.6 Data persistence problems](practice/data-persistence-problem.md)
- [4.3.7 Managing compute resources for containers](practice/manage-compute-resources-container.md)
- [4.3.8 Monitoring a kubernetes cluster with Prometheus](practice/using-prometheus-to-monitor-kuberentes-cluster.md)
- [4.3.9 Using Heapster to get metric data for the cluster and its objects](practice/using-heapster-to-get-object-metrics.md)
- [4.4 Storage management](practice/storage.md)
- [4.4.1 GlusterFS](practice/glusterfs.md)
- [4.4.1.1 Using GlusterFS for persistent storage](practice/using-glusterfs-for-persistent-storage.md)

View File

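@ -29,7 +29,7 @@ Horizontal Pod Autoscaling is implemented jointly by the API server and the controller.
As an API resource, Horizontal Pod Autoscaling can be managed with kubectl commands just like Pod and Deployment. The usage is the same, and the resource name is `hpa`.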
```
```bash
kubectl create hpa
kubectl get hpa
kubectl describe hpa
@ -40,14 +40,14 @@ kubectl delete hpa
Usage:
```
```bash
kubectl autoscale (-f FILENAME | TYPE NAME | TYPE/NAME) [--min=MINPODS] --max=MAXPODS
[--cpu-percent=CPU] [flags] [options]
```
For example:
```
```bash
kubectl autoscale deployment foo --min=2 --max=5 --cpu-percent=80
```

View File

@ -25,13 +25,13 @@ A Node includes the following status information
Prevent pods from being scheduled onto the node:
```
```bash
kubectl cordon <node>
```
Evict all pods from the node:
```
```bash
kubectl drain <node>
```

Binary file not shown.


View File

@ -0,0 +1,151 @@
---
apiVersion: v1
kind: Service
metadata:
  name: zk-svc
  labels:
    app: zk-svc
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: zk-cm
data:
  jvm.heap: "1G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "0"
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  minAvailable: 2
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-svc
  replicas: 3
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: k8szk
        imagePullPolicy: Always
        image: gcr.io/google_samples/k8szk:v2
        resources:
          requests:
            memory: "2Gi"
            cpu: "500m"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        env:
        - name: ZK_REPLICAS
          value: "3"
        - name: ZK_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: jvm.heap
        - name: ZK_TICK_TIME
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: tick
        - name: ZK_INIT_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: init
        - name: ZK_SYNC_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: sync
        - name: ZK_MAX_CLIENT_CNXNS
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: client.cnxns
        - name: ZK_SNAP_RETAIN_COUNT
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: snap.retain
        - name: ZK_PURGE_INTERVAL
          valueFrom:
            configMapKeyRef:
              name: zk-cm
              key: purge.interval
        - name: ZK_CLIENT_PORT
          value: "2181"
        - name: ZK_SERVER_PORT
          value: "2888"
        - name: ZK_ELECTION_PORT
          value: "3888"
        command:
        - sh
        - -c
        - zkGenConfig.sh && zkServer.sh start-foreground
        readinessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 10
          timeoutSeconds: 5
#       volumeMounts:
#       - name: datadir
#         mountPath: /var/lib/zookeeper
#     securityContext:
#       runAsUser: 1000
#       fsGroup: 1000
# volumeClaimTemplates:
# - metadata:
#     name: datadir
#   spec:
#     accessModes: [ "ReadWriteOnce" ]
#     resources:
#       requests:
#         storage: 10Gi

View File

@ -201,4 +201,8 @@ monitoring-influxdb 10.254.22.46 <nodes> 8086:32299/TCP,8083:30269/T
![Modifying the grafana template](../images/grafana-dashboard-setting.jpg)
Under Templating, set the Data source of namespace to influxdb-datasource and set Refresh to on Dashboard Load. Save the settings and refresh the browser, and the other namespace options will appear.
## References
[Use Heapster to get metric data for cluster objects](../practice/using-heapster-to-get-object-metrics.md)

View File

@ -0,0 +1,146 @@
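# Use Heapster to get metric data for cluster objects
Heapster is an add-on installed by default during the kubernetes installation process; see [Installing the heapster add-on](practice/heapster-addon-installation.md). It is very useful for cluster monitoring, and it is also used by [Horizontal Pod Autoscaling](../concepts/horizontal-pod-autoscaling.md): HPA treats Heapster as the `Resource Metrics API` and obtains metrics from it. This is done by configuring `--api-server` in `kube-controller-manager` to point to [kube-aggregator](https://github.com/kubernetes/kube-aggregator). Alternatively, heapster itself can serve this API if `--api-server=true` is specified when heapster is started.
Heapster can collect the cAdvisor data from each Node, and it can also aggregate resources by kubernetes resource type, such as Pod and Namespace, so that CPU, memory, network, and disk metrics can be obtained for each of them. The default metric aggregation interval is 1 minute.
## Architecture
The Heapster architecture is shown below:
![Heapster architecture diagram](../images/heapster-architecture.png)
[Heapster](https://github.com/kubernetes/heapster) is a tool written in Go that collects data on compute resource usage in a Kubernetes cluster. Once compiled, it runs as a single binary. Its collection behavior is controlled by the arguments passed to heapster, and the collected data can be sent to a choice of sinks, such as Graphite, influxDB, OpenTSDB, ElasticSearch, and Kafka.
## Usage
Heapster is simple to use. It is just a binary that can be started directly from the command line, and it can also run in a container. When it runs as a kubernetes add-on, we run it in a container; see [Installing the heapster add-on](practice/heapster-addon-installation.md).
### Running
The heapster startup flags are listed below: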
| **Flag** | **Description** |
| ---------------------------------------- | ---------------------------------------- |
| --allowed-users string | comma-separated list of allowed users |
| --alsologtostderr | log to standard error as well as files |
| --api-server | Enable API server for the Metrics API. If set, the Metrics API will be served on --insecure-port (internally) and --secure-port (externally). |
| --authentication-kubeconfig string | kubeconfig file pointing at the 'core' kubernetes server with enough rights to create [tokenaccessreviews.authentication.k8s.io](http://tokenaccessreviews.authentication.k8s.io). |
| --authentication-token-webhook-cache-ttl duration | The duration to cache responses from the webhook token authenticator. (default 10s) |
| --authorization-kubeconfig string | kubeconfig file pointing at the 'core' kubernetes server with enough rights to create [subjectaccessreviews.authorization.k8s.io](http://subjectaccessreviews.authorization.k8s.io). |
| --authorization-webhook-cache-authorized-ttl duration | The duration to cache 'authorized' responses from the webhook authorizer. (default 10s) |
| --authorization-webhook-cache-unauthorized-ttl duration | The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s) |
| --bind-address ip | The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank, all interfaces will be used (0.0.0.0). (default 0.0.0.0) |
| --cert-dir string | The directory where the TLS certs are located (by default /var/run/kubernetes). If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "/var/run/kubernetes") |
| --client-ca-file string | If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate. |
| --contention-profiling | Enable contention profiling. Requires --profiling to be set to work. |
| --disable-export | Disable exporting metrics in api/v1/metric-export |
| --enable-swagger-ui | Enables swagger ui on the apiserver at /swagger-ui |
| --heapster-port int | port used by the Heapster-specific APIs (default 8082) |
| --historical-source string | which source type to use for the historical API (should be exactly the same as one of the sink URIs), or empty to disable the historical API |
| --label-seperator string | seperator used for joining labels (default ",") |
| --listen-ip string | IP to listen on, defaults to all IPs |
| --log-backtrace-at traceLocation | when logging hits line file:N, emit a stack trace (default :0) |
| --log-dir string | If non-empty, write log files in this directory |
| --log-flush-frequency duration | Maximum number of seconds between log flushes (default 5s) |
| --logtostderr | log to standard error instead of files (default true) |
| --max-procs int | max number of CPUs that can be used simultaneously. Less than 1 for default (number of cores) |
| --metric-resolution duration | The resolution at which heapster will retain metrics. (default 1m0s) |
| --profiling | Enable profiling via web interface host:port/debug/pprof/ (default true) |
| --requestheader-allowed-names stringSlice | List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed. |
| --requestheader-client-ca-file string | Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers |
| --requestheader-extra-headers-prefix stringSlice | List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-]) |
| --requestheader-group-headers stringSlice | List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group]) |
| --requestheader-username-headers stringSlice | List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user]) |
| --secure-port int | The port on which to serve HTTPS with authentication and authorization. If 0, don't serve HTTPS at all. (default 6443) |
| --sink *flags.Uris | external sink(s) that receive data (default []) |
| --source *flags.Uris | source(s) to watch (default []) |
| --stderrthreshold severity | logs at or above this threshold go to stderr (default 2) |
| --tls-ca-file string | If set, this certificate authority will be used for secure access from Admission Controllers. This must be a valid PEM-encoded CA bundle. Alternatively, the certificate authority can be appended to the certificate provided by --tls-cert-file. |
| --tls-cert string | file containing TLS certificate |
| --tls-cert-file string | File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to /var/run/kubernetes. |
| --tls-client-ca string | file containing TLS client CA for client cert validation |
| --tls-key string | file containing TLS key |
| --tls-private-key-file string | File containing the default x509 private key matching --tls-cert-file. |
| --tls-sni-cert-key namedCertKey | A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.key,example.crt" or "*.foo.com,foo.com:foo.key,foo.crt". (default []) |
| --v Level | log level for V logs |
| --version | print version info and exit |
| --vmodule moduleSpec | comma-separated list of pattern=N settings for file-filtered logging |
**Version**
version: v1.4.0
commit: 546ab66f
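As a rough illustration of how a few of the flags above combine, the sketch below assembles a heapster command line and prints it for review instead of running it. The source and sink URIs are assumptions for illustration (an in-cluster API server address and an InfluxDB service named `monitoring-influxdb`), and `heapster_cmd` is a hypothetical helper name; adjust all of them to your own cluster.

```bash
#!/usr/bin/env bash
# Assumed endpoints (not taken from the flag table): the in-cluster
# API server address and an InfluxDB service named monitoring-influxdb.
SOURCE="kubernetes:https://kubernetes.default"
SINK="influxdb:http://monitoring-influxdb:8086"

# Print the heapster command line instead of executing it,
# so that it can be reviewed before use.
heapster_cmd() {
  printf '%s\n' "heapster --source=${SOURCE} --sink=${SINK} --metric-resolution=1m --api-server=true"
}

heapster_cmd
```

Here `--metric-resolution=1m` simply restates the default, and `--api-server=true` enables the Metrics API described in the table.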
### Using the API
Heapster exposes a RESTful API. The following walks through the Heapster API, using the memory usage of the `spark-cluster` namespace as an example.
**Constructing the URL**
https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/spark-cluster/metrics/memory/usage?start=2017-10-16T09:14:00Z&end=2017-10-16T09:16:00Z
**Result**
Requesting this URL returns a result like this:
```json
{
  "metrics": [
    {
      "timestamp": "2017-10-16T09:14:00Z",
      "value": 322592768
    },
    {
      "timestamp": "2017-10-16T09:15:00Z",
      "value": 322592768
    },
    {
      "timestamp": "2017-10-16T09:16:00Z",
      "value": 322592768
    }
  ],
  "latestTimestamp": "2017-10-16T09:16:00Z"
}
```
Note: all values queried from Heapster are expressed in the smallest unit, for example millicores for CPU and bytes for memory.
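Because the values come back in the smallest unit, they usually need converting before they are readable. The following is a minimal sketch of that arithmetic; `bytes_to_mib` and `millicores_to_cores` are hypothetical helper names, and the CPU value is made up for illustration.

```bash
#!/usr/bin/env bash
# Convert a byte count to whole MiB using integer arithmetic (rounds down).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Convert millicores to cores with three decimal places.
millicores_to_cores() {
  printf '%d.%03d\n' $(( $1 / 1000 )) $(( $1 % 1000 ))
}

bytes_to_mib 322592768      # the memory/usage value from the response above
millicores_to_cores 500     # hypothetical CPU value
```

The memory value in the sample response therefore corresponds to roughly 307 MiB.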
1. **Part 1: the Heapster API address**
https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster/
It can be obtained with the following command:
```bash
$ kubectl cluster-info
Heapster is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster
...
```
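2. **Part 2: the Heapster API parameters**
`/api/v1/model/namespaces/spark-cluster/metrics/memory/usage`
This queries the `memory/usage` metrics of the `spark-cluster` namespace.
3. **Part 3: the time range**
The query parameters specify the time range, consisting of start and end.
`?start=2017-10-16T09:14:00Z&end=2017-10-16T09:16:00Z`
The `RFC-3339` time format is used. On Linux it can be obtained like this: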
```bash
$ date --rfc-3339="seconds"
2017-10-16 17:23:20+08:00
```
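Replace the space in this time with T. The trailing `+08:00` is the time zone offset, while Z denotes UTC, so convert the time to UTC before appending Z (the example above becomes `2017-10-16T09:23:20Z`). You can specify only the start time, in which case the end time is automatically set to the current time.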
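The three parts above can also be joined in a script. The sketch below is one possible way to do it, assuming GNU `date` (as in the example above) and the API address shown in Part 1. `to_rfc3339_utc` and `heapster_metric_url` are hypothetical helper names. Since Z denotes UTC, the local times are converted to UTC first instead of just swapping the offset for Z.

```bash
#!/usr/bin/env bash
# Convert a timestamp with a zone offset to RFC 3339 in UTC (needs GNU date).
to_rfc3339_utc() {
  date -u -d "$1" +%Y-%m-%dT%H:%M:%SZ
}

# Join the three parts into one query URL.
# $1: Heapster API address, $2: namespace, $3: metric name,
# $4: start time, $5: end time (optional).
heapster_metric_url() {
  local base="${1%/}" namespace="$2" metric="$3" start="$4" end="${5:-}"
  local url="${base}/api/v1/model/namespaces/${namespace}/metrics/${metric}?start=${start}"
  if [ -n "$end" ]; then
    url="${url}&end=${end}"
  fi
  printf '%s\n' "$url"
}

# Address as reported by `kubectl cluster-info` in Part 1.
HEAPSTER="https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster"
start="$(to_rfc3339_utc '2017-10-16 17:14:00+08:00')"
end="$(to_rfc3339_utc '2017-10-16 17:16:00+08:00')"
heapster_metric_url "$HEAPSTER" spark-cluster memory/usage "$start" "$end"
```

The printed URL can then be passed to an HTTP client such as `curl`, together with whatever credentials your API server requires.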
## References
- [kubernetes metrics](https://github.com/kubernetes/metrics)
- [Heapster metric model](https://github.com/kubernetes/heapster/blob/master/docs/model.md)
- [Heapster storage schema](https://github.com/kubernetes/heapster/blob/master/docs/storage-schema.md)