# 使用 operator 部署 VictoriaMetrics ## VictoriaMetrics 架构概览 以下是 VictoriaMetrics 的核心组件架构图: ![](https://image-host-1251893006.cos.ap-chengdu.myqcloud.com/20220904161934.png) * `vmstorage` 负责存储数据,是有状态组件。 * `vmselect` 负责查询数据,Grafana 添加 Prometheus 数据源时使用 `vmselect` 地址,查询数据时,`vmselect` 会调用各个 `vmstorage` 的接口完成数据的查询。 * `vminsert` 负责写入数据,采集器将采集到的数据 "吐到" `vminsert`,然后 `vminsert` 会调用各个 `vmstorage` 的接口完成数据的写入。 * 各个组件都可以水平伸缩,但不支持自动伸缩,因为伸缩需要修改启动参数。 ## 安装 operator 使用 helm 安装: ```bash helm repo add vm https://victoriametrics.github.io/helm-charts helm repo update helm install victoria-operator vm/victoria-metrics-operator ``` 检查 operator 是否成功启动: ```bash $ kubectl -n monitoring get pod NAME READY STATUS RESTARTS AGE victoria-operator-victoria-metrics-operator-7b886f85bb-jf6ng 1/1 Running 0 20s ``` ## 安装 VMSorage, VMSelect 与 VMInsert 准备 `vmcluster.yaml`: ```yaml apiVersion: operator.victoriametrics.com/v1beta1 kind: VMCluster metadata: name: vmcluster namespace: monitoring spec: retentionPeriod: "1" # 默认单位是月,参考 https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#retention vmstorage: replicaCount: 2 storage: volumeClaimTemplate: metadata: name: data spec: accessModes: [ "ReadWriteOnce" ] storageClassName: cbs resources: requests: storage: 100Gi vmselect: replicaCount: 2 vminsert: replicaCount: 2 ``` 安装: ```bash $ kubectl apply -f vmcluster.yaml vmcluster.operator.victoriametrics.com/vmcluster created ``` 检查组件是否启动成功: ```bash $ kubectl -n monitoring get pod | grep vmcluster vminsert-vmcluster-77886b8dcb-jqpfw 1/1 Running 0 20s vminsert-vmcluster-77886b8dcb-l5wrg 1/1 Running 0 20s vmselect-vmcluster-0 1/1 Running 0 20s vmselect-vmcluster-1 1/1 Running 0 20s vmstorage-vmcluster-0 1/1 Running 0 20s vmstorage-vmcluster-1 1/1 Running 0 20s ``` ## 安装 VMAlertmanager 与 VMAlert 准备 `vmalertmanager.yaml`: ```yaml apiVersion: operator.victoriametrics.com/v1beta1 kind: VMAlertmanager metadata: name: vmalertmanager namespace: monitoring spec: replicaCount: 1 selectAllByDefault: true ``` 安装 `VMAlertmanager`: ```bash $ kubectl apply -f vmalertmanager.yaml vmalertmanager.operator.victoriametrics.com/vmalertmanager created ``` 准备 `vmalert.yaml`: ```yaml apiVersion: operator.victoriametrics.com/v1beta1 kind: VMAlert metadata: name: vmalert namespace: monitoring spec: replicaCount: 1 selectAllByDefault: true notifier: url: http://vmalertmanager-vmalertmanager:9093 resources: requests: cpu: 10m memory: 10Mi remoteWrite: url: http://vminsert-vmcluster:8480/insert/0/prometheus/ remoteRead: url: http://vmselect-vmcluster:8481/select/0/prometheus/ datasource: url: http://vmselect-vmcluster:8481/select/0/prometheus/ ``` 安装 `VMAlert`: ```bash $ kubectl apply -f vmalert.yaml vmalert.operator.victoriametrics.com/vmalert created ``` 检查组件是否启动成功: ```bash $ kubectl -n monitoring get pod | grep vmalert vmalert-vmalert-5987fb9d5f-9wt6l 2/2 Running 0 20s vmalertmanager-vmalertmanager-0 2/2 Running 0 40s ``` ## 安装 VMAgent vmagent 用于采集监控数据并发送给 VictoriaMetrics 进行存储,对于腾讯云容器服务上的容器监控数据采集,需要用自定义的 `additionalScrapeConfigs` 配置,准备自定义采集规则配置文件 `scrape-config.yaml`: ```yaml apiVersion: v1 kind: Secret type: Opaque metadata: name: additional-scrape-configs namespace: monitoring stringData: additional-scrape-configs.yaml: |- - job_name: "tke-cadvisor" scheme: https metrics_path: /metrics/cadvisor tls_config: insecure_skip_verify: true authorization: credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_instance_type] regex: eklet action: drop - action: labelmap regex: __meta_kubernetes_node_label_(.+) - job_name: "tke-kubelet" scheme: https metrics_path: /metrics tls_config: insecure_skip_verify: true authorization: credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_instance_type] regex: eklet action: drop - action: labelmap regex: __meta_kubernetes_node_label_(.+) - job_name: "tke-probes" scheme: https metrics_path: /metrics/probes tls_config: insecure_skip_verify: true authorization: credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_instance_type] regex: eklet action: drop - action: labelmap regex: __meta_kubernetes_node_label_(.+) - job_name: eks honor_timestamps: true metrics_path: '/metrics' params: collect[]: ['ipvs'] # - 'cpu' # - 'meminfo' # - 'diskstats' # - 'filesystem' # - 'load0vg' # - 'netdev' # - 'filefd' # - 'pressure' # - 'vmstat' scheme: http kubernetes_sd_configs: - role: pod relabel_configs: - source_labels: [__meta_kubernetes_pod_annotation_tke_cloud_tencent_com_pod_type] regex: eklet action: keep - source_labels: [__meta_kubernetes_pod_phase] regex: Running action: keep - source_labels: [__meta_kubernetes_pod_ip] separator: ; regex: (.*) target_label: __address__ replacement: ${1}:9100 action: replace - source_labels: [__meta_kubernetes_pod_name] separator: ; regex: (.*) target_label: pod replacement: ${1} action: replace - source_labels: [__meta_kubernetes_namespace] separator: ; regex: (.*) target_label: namespace replacement: ${1} action: replace metric_relabel_configs: - source_labels: [__name__] separator: ; regex: (container_.*|pod_.*|kubelet_.*) replacement: $1 action: keep ``` 再准备 `vmagent.yaml`: ```yaml apiVersion: operator.victoriametrics.com/v1beta1 kind: VMAgent metadata: name: vmagent namespace: monitoring spec: selectAllByDefault: true additionalScrapeConfigs: key: additional-scrape-configs.yaml name: additional-scrape-configs resources: requests: cpu: 10m memory: 10Mi replicaCount: 1 remoteWrite: - url: "http://vminsert-vmcluster:8480/insert/0/prometheus/api/v1/write" ``` 安装: ```bash $ kubectl apply -f scrape-config.yaml secret/additional-scrape-configs created $ kubectl apply -f vmagent.yaml vmagent.operator.victoriametrics.com/vmagent created ``` 检查组件是否启动成功: ```bash $ kubectl -n monitoring get pod | grep vmagent vmagent-vmagent-cf9bbdbb4-tm4w9 2/2 Running 0 20s vmagent-vmagent-cf9bbdbb4-ija8r 2/2 Running 0 20s ``` ## 配置 Grafana ### 添加数据源 VictoriaMetrics 兼容 Prometheus,在 Grafana 添加数据源时,使用 Prometheus 类型,如果 Grafana 跟 VictoriaMetrics 安装在同一集群中,可以使用 service 地址,如: ```txt http://vmselect-vmcluster:8481/select/0/prometheus/ ``` ![](https://image-host-1251893006.cos.ap-chengdu.myqcloud.com/20220904160422.png) ### 添加 Dashboard VictoriaMetrics 官方提供了几个 Grafana Dashboard,id 分别是: 1. 11176 2. 12683 3. 14205 可以将其导入 Grafana: ![](https://image-host-1251893006.cos.ap-chengdu.myqcloud.com/20220904160727.png) ![](https://image-host-1251893006.cos.ap-chengdu.myqcloud.com/20220904161558.png) ![](https://image-host-1251893006.cos.ap-chengdu.myqcloud.com/20220904161641.png)