I have a question about custom metrics with Kubernetes and Prometheus on Amazon AWS. The default CPU and memory metrics work fine. The Prometheus http_requests metric does not; this is the error:
$ kubectl describe hpa hpa-deploy
Name: hpa-deploy
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-deploy","namespace":"default...
CreationTimestamp: Thu, 06 Jun 2019 11:06:48 +0000
Reference: Deployment/django
Metrics: ( current / target )
"http_requests" on pods: <unknown> / 2k
Min replicas: 1
Max replicas: 10
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetPodsMetric the HPA was unable to compute the replica count: unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests for pods
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetPodsMetric 8m53s (x414 over 114m) horizontal-pod-autoscaler unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server is currently unable to handle the request (get pods.custom.metrics.k8s.io *)
Warning FailedGetPodsMetric 3m48s (x12 over 6m36s) horizontal-pod-autoscaler unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests for pods
I installed Prometheus with Helm, as suggested by the GitHub project, and checked the API:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "custom.metrics.k8s.io/v1beta1",
"resources": []
}
Then I added the following rule:
$ kubectl edit cm my-release-prometheus-adapter
rules:
- seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
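The walkthrough says that after adding the new rule, the API check should return values inside "resources": [], but it does not, and I don't know why.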
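To check that I am reading the name part of the rule correctly, here is a small Python sketch of the rename as I understand it. The helper rename_series is my own hypothetical name, not part of the adapter, and I am assuming the adapter's ${1} group reference behaves like an ordinary regex backreference.

```python
import re


def rename_series(series_name, matches, as_template):
    """Apply an adapter-style name rule (matches/as) to a series name.

    This only imitates the rename step; it is not the adapter's own code.
    """
    # Translate Go-style ${1} group references into Python's \g<1> form.
    python_template = re.sub(
        r"\$\{(\d+)\}",
        lambda m: r"\g<" + m.group(1) + ">",
        as_template,
    )
    # Names that do not match the pattern are returned unchanged.
    return re.sub(matches, python_template, series_name)


# The values below are copied from the rule in the ConfigMap.
exposed_name = rename_series(
    "http_requests_total", "^(.*)_total", "${1}_per_second"
)
print(exposed_name)
```

If that reading is right, the series http_requests_total would show up in the custom metrics API under the name http_requests_per_second.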
Here is my HPA definition:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: django
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests
      target:
        type: Value
        averageValue: 2k
In addition, I am using an Nginx-based ingress controller, but kubectl describe on the HPA for the ingress and service shows:
$ kubectl describe hpa hpa-ingress
Name: hpa-ingress
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-ingress","namespace":"defaul...
CreationTimestamp: Thu, 06 Jun 2019 11:06:48 +0000
Reference: Ingress/test-ingress
Metrics: ( current / target )
"http_requests" on Ingress/test-ingress (target value): <unknown> / 2k
Min replicas: 1
Max replicas: 10
Ingress pods: 0 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale False FailedGetScale the HPA controller was unable to get the target's current scale: the server could not find the requested resource
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 2m40s (x473 over 122m) horizontal-pod-autoscaler the server could not find the requested resource
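I'm not sure whether I have to export the http_requests metric from the pods manually; if so, how do I do that? The documentation all amounts to "copy and paste and everything will work", but that has not been the case. Please be as detailed as possible; I am really new to this topic. Thank you very much.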
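For reference, this is roughly what I imagine "exporting manually" would mean: the app keeps a request counter and serves it in the Prometheus text format on some endpoint. It is only a standard-library sketch; RequestCounter and render_metrics are hypothetical names of mine, not taken from Django or any Prometheus client library, and I am assuming the kubernetes_namespace and kubernetes_pod_name labels would be attached by Prometheus at scrape time rather than by the app.

```python
import threading


class RequestCounter:
    """Thread-safe counter that only ever goes up."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def inc(self):
        with self._lock:
            self._count += 1

    def value(self):
        with self._lock:
            return self._count


def render_metrics(counter):
    """Render the counter in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total number of HTTP requests handled.",
        "# TYPE http_requests_total counter",
        f"http_requests_total {counter.value()}",
    ]
    return "\n".join(lines) + "\n"


# Simulate three handled requests, then print what a scrape would return.
counter = RequestCounter()
for _ in range(3):
    counter.inc()
metrics_text = render_metrics(counter)
print(metrics_text, end="")
```

In a real app, inc() would be called once per handled request (for example from middleware), and the rendered text would be returned from a /metrics route that Prometheus scrapes.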