
Autoscaling Kubernetes workloads with Prometheus Adapter and custom metrics

Published: at 10:00 AM

How to autoscale Kubernetes workloads using Prometheus Adapter and custom metrics

Introduction

Kubernetes’ Horizontal Pod Autoscaler (HPA) supports CPU and memory metrics by default, but many workloads need to scale based on application-level metrics such as queue depth or backlog size.

This guide walks through how to use Prometheus Adapter to expose custom Prometheus metrics to Kubernetes and use them for autoscaling.

By the end, you’ll have:

- Prometheus Adapter installed and configured with a custom metric rule
- A custom Prometheus metric exposed through the Kubernetes custom metrics API
- An HPA scaling a Deployment based on that metric

Prerequisites

Before starting, ensure you have:

- A running Kubernetes cluster and kubectl access
- Prometheus installed and scraping your application
- Helm installed
- An application exposing a custom metric (this guide uses pending_jobs_total)

Step 1: Verify your metric exists in Prometheus

Before configuring Prometheus Adapter, confirm your metric is being scraped.

Example metric:

pending_jobs_total{namespace="example-app", service="worker"}

Query Prometheus directly:

curl -G https://prometheus.example.com/api/v1/query \
 --data-urlencode 'query=pending_jobs_total'

You should see at least one active time series.
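
If several services expose the same metric, you can narrow the query to the exact labels the adapter will later match on (the namespace and service values here are the example ones used throughout this guide):

curl -G https://prometheus.example.com/api/v1/query \
 --data-urlencode 'query=pending_jobs_total{namespace="example-app",service="worker"}'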


Step 2: Install Prometheus Adapter

Prometheus Adapter is typically installed using the Helm chart from prometheus-community.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 --create-namespace
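
The chart’s default Prometheus URL may not match your cluster. If Prometheus lives elsewhere, point the adapter at it explicitly; the service name, namespace, and port below are assumptions, so adjust them to your installation:

helm install prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 --create-namespace \
 --set prometheus.url=http://prometheus.monitoring.svc \
 --set prometheus.port=9090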

Verify the adapter is running:

kubectl -n prometheus-adapter get pods

Step 3: Configure custom metric rules

Prometheus Adapter uses rules to map Prometheus metrics to Kubernetes objects.

Create a values file with custom rules:

rules:
  default: false
  custom:
    - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
      name:
        matches: "^pending_jobs_total$"
        as: "pending_jobs_total"
      metricsQuery: >
        sum(
          max_over_time(
            <<.Series>>{<<.LabelMatchers>>}[5m]
          )
        ) by (<<.GroupBy>>)

Why max_over_time?

Prometheus Adapter issues instant queries against Prometheus. If a metric has no recent samples, the instant vector comes back empty and the adapter returns a 404 for that metric.

Using a short range query (e.g. [5m]) avoids issues caused by brief scrape gaps.
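
For the worker Service in the example-app namespace, the adapter expands the template into a concrete PromQL query roughly like this (a sketch of the substitution; the exact label matchers and grouping depend on the request):

sum(
  max_over_time(
    pending_jobs_total{namespace="example-app",service="worker"}[5m]
  )
) by (service)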


Step 4: Apply the updated configuration

Upgrade Prometheus Adapter with your custom rules:

helm upgrade prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 -f values.yaml

Restart the adapter to ensure it reloads configuration:

kubectl -n prometheus-adapter rollout restart deploy/prometheus-adapter
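
To confirm the rules actually landed in the adapter’s configuration, inspect the ConfigMap the chart renders. Its name follows the Helm release name, so adjust it if your release is named differently:

kubectl -n prometheus-adapter get configmap prometheus-adapter -o yaml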

Step 5: Verify the custom metrics API

Check that the custom metrics API is available:

kubectl get apiservice v1beta1.custom.metrics.k8s.io

List available custom metrics:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources'

You should see entries like:

services/pending_jobs_total

Query the metric for a specific Service:

kubectl get --raw \
"/apis/custom.metrics.k8s.io/v1beta1/namespaces/example-app/services/worker/pending_jobs_total"

If this returns a value, Prometheus Adapter is working correctly.
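
The response is a list of metric values for the matched object. To pull out just the numbers, pipe it through jq (the items[].value path follows the v1beta1 custom metrics API):

kubectl get --raw \
"/apis/custom.metrics.k8s.io/v1beta1/namespaces/example-app/services/worker/pending_jobs_total" \
 | jq '.items[].value'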


Step 6: Create an HPA using the custom metric

Define an HPA that references the custom metric:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: v1
        kind: Service
        name: worker
      metric:
        name: pending_jobs_total
      target:
        type: Value
        value: "2"
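
With a target of type Value, the HPA applies the standard scaling formula to the metric reported for the Service: desiredReplicas = ceil(currentReplicas × currentValue / targetValue). For example, with 2 replicas running, a reported pending_jobs_total of 8, and a target of 2, the HPA requests ceil(2 × 8 / 2) = 8 replicas, capped at maxReplicas.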

Apply it:

kubectl apply -f worker-hpa.yaml

Step 7: Confirm autoscaling behavior

Describe the HPA to confirm it can read the metric:

kubectl -n example-app describe hpa worker-hpa

Look for:

- The current metric value and target in the Metrics section (for example, pending_jobs_total on Service/worker)
- The ScalingActive and AbleToScale conditions reporting True
- Scaling events once the metric moves past the target
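
To watch scaling happen live as the metric changes, keep an eye on the HPA:

kubectl -n example-app get hpa worker-hpa --watch
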
Common troubleshooting tips

Metric returns 404 in custom metrics API

The adapter only serves series that its queries actually return. Confirm the metric exists in Prometheus, that the seriesQuery and label overrides in your rule match the metric’s labels, and that the range in max_over_time is long enough to cover scrape gaps.

HPA shows old errors

HPA conditions and events can lag behind a fix. Wait for the next sync interval and describe the HPA again before concluding the metric is still unreadable.

Adapter runs but metrics are missing

Check that the adapter points at the correct Prometheus (the prometheus.url and prometheus.port Helm values) and check the adapter logs for rule parsing or query errors.

Automating with GitOps

Prometheus Adapter can be fully automated using GitOps tools such as Argo CD: keep the Helm release and the values.yaml containing your custom rules in Git, and let the controller apply and sync them to the cluster.

This removes the need for manual Helm commands and ensures consistency across environments.
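
As a sketch of what this can look like with Argo CD (the chart version is a placeholder, and the custom rules from Step 3 are inlined via helm.values):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus-adapter
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: prometheus-adapter
    targetRevision: 4.11.0   # placeholder: pin to the chart version you use
    helm:
      values: |
        rules:
          default: false
          custom:
            - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
              # ...same rule as in Step 3...
  destination:
    server: https://kubernetes.default.svc
    namespace: prometheus-adapter
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true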


Conclusion

Prometheus Adapter enables Kubernetes autoscaling based on real application signals, making scaling decisions more meaningful than CPU or memory alone.

While the setup requires careful attention to PromQL, labels, and staleness, the resulting autoscaling behavior is far more aligned with how applications actually behave.

For queue-based or event-driven workloads, custom metrics autoscaling is a powerful and flexible approach.

