
Autoscaling Kubernetes workloads with Prometheus Adapter and custom metrics

Published: at 10:00 AM

How to autoscale Kubernetes workloads using Prometheus Adapter and custom metrics

Introduction

Kubernetes’ Horizontal Pod Autoscaler (HPA) supports CPU and memory metrics by default, but many workloads need to scale based on application-level metrics such as queue depth or backlog size.

This guide walks through how to use Prometheus Adapter to expose custom Prometheus metrics to Kubernetes and use them for autoscaling.

By the end, you’ll have:

- Prometheus Adapter installed and configured with a custom metric rule
- A custom Prometheus metric exposed through the Kubernetes custom metrics API
- An HPA scaling a Deployment based on that metric

Prerequisites

Before starting, ensure you have:

- A running Kubernetes cluster and kubectl access
- Prometheus installed and scraping your application
- Helm installed
- An application exposing a custom metric (this guide uses pending_jobs_total)

Step 1: Verify your metric exists in Prometheus

Before configuring Prometheus Adapter, confirm your metric is being scraped.

Example metric:

pending_jobs_total{namespace="example-app", service="worker"}

Query Prometheus directly:

curl -G https://prometheus.example.com/api/v1/query \
 --data-urlencode 'query=pending_jobs_total'

You should see at least one active time series.
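
If several services expose the same metric, you can narrow the query to the exact labels the adapter will later match on (the namespace and service values here are the example ones used throughout this guide):

curl -G https://prometheus.example.com/api/v1/query \
 --data-urlencode 'query=pending_jobs_total{namespace="example-app",service="worker"}'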


Step 2: Install Prometheus Adapter

Prometheus Adapter is typically installed using the Helm chart from prometheus-community.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 --create-namespace
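
The chart’s default Prometheus URL may not match your cluster. If Prometheus lives elsewhere, point the adapter at it explicitly; the service name, namespace, and port below are assumptions, so adjust them to your installation:

helm install prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 --create-namespace \
 --set prometheus.url=http://prometheus.monitoring.svc \
 --set prometheus.port=9090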

Verify the adapter is running:

kubectl -n prometheus-adapter get pods

Step 3: Configure custom metric rules

Prometheus Adapter uses rules to map Prometheus metrics to Kubernetes objects.

Create a values file with custom rules:

rules:
  default: false
  custom:
    - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
      name:
        matches: "^pending_jobs_total$"
        as: "pending_jobs_total"
      metricsQuery: >
        sum(
          max_over_time(
            <<.Series>>{<<.LabelMatchers>>}[5m]
          )
        ) by (<<.GroupBy>>)

Why max_over_time?

Prometheus Adapter issues instant queries against Prometheus. If a metric has no recent samples, the instant vector comes back empty and the adapter returns a 404 for that metric.

Using a short range query (e.g. [5m]) avoids issues caused by brief scrape gaps.
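
For the worker Service in the example-app namespace, the adapter expands the template into a concrete PromQL query roughly like this (a sketch of the substitution; the exact label matchers and grouping depend on the request):

sum(
  max_over_time(
    pending_jobs_total{namespace="example-app",service="worker"}[5m]
  )
) by (service)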


Step 4: Apply the updated configuration

Upgrade Prometheus Adapter with your custom rules:

helm upgrade prometheus-adapter prometheus-community/prometheus-adapter \
 --namespace prometheus-adapter \
 -f values.yaml

Restart the adapter to ensure it reloads configuration:

kubectl -n prometheus-adapter rollout restart deploy/prometheus-adapter
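
To confirm the rules actually landed in the adapter’s configuration, inspect the ConfigMap the chart renders. Its name follows the Helm release name, so adjust it if your release is named differently:

kubectl -n prometheus-adapter get configmap prometheus-adapter -o yaml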

Step 5: Verify the custom metrics API

Check that the custom metrics API is available:

kubectl get apiservice v1beta1.custom.metrics.k8s.io

List available custom metrics:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources'

You should see entries like:

services/pending_jobs_total

Query the metric for a specific Service:

kubectl get --raw \
"/apis/custom.metrics.k8s.io/v1beta1/namespaces/example-app/services/worker/pending_jobs_total"

If this returns a value, Prometheus Adapter is working correctly.
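
The response is a list of metric values for the matched object. To pull out just the numbers, pipe it through jq (the items[].value path follows the v1beta1 custom metrics API):

kubectl get --raw \
"/apis/custom.metrics.k8s.io/v1beta1/namespaces/example-app/services/worker/pending_jobs_total" \
 | jq '.items[].value'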


Step 6: Create an HPA using the custom metric

Define an HPA that references the custom metric:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: v1
        kind: Service
        name: worker
      metric:
        name: pending_jobs_total
      target:
        type: Value
        value: "2"
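
With a target of type Value, the HPA applies the standard scaling formula to the metric reported for the Service: desiredReplicas = ceil(currentReplicas × currentValue / targetValue). For example, with 2 replicas running, a reported pending_jobs_total of 8, and a target of 2, the HPA requests ceil(2 × 8 / 2) = 8 replicas, capped at maxReplicas.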

Apply it:

kubectl apply -f worker-hpa.yaml

Step 7: Confirm autoscaling behavior

Describe the HPA to confirm it can read the metric:

kubectl -n example-app describe hpa worker-hpa

Look for:

- The current metric value and target in the Metrics section (for example, pending_jobs_total on Service/worker)
- The ScalingActive and AbleToScale conditions reporting True
- Scaling events once the metric moves past the target
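
To watch scaling happen live as the metric changes, keep an eye on the HPA:

kubectl -n example-app get hpa worker-hpa --watch
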
Common troubleshooting tips

Metric returns 404 in custom metrics API

The adapter only serves series that its queries actually return. Confirm the metric exists in Prometheus, that the seriesQuery and label overrides in your rule match the metric’s labels, and that the range in max_over_time is long enough to cover scrape gaps.

HPA shows old errors

HPA conditions and events can lag behind a fix. Wait for the next sync interval and describe the HPA again before concluding the metric is still unreadable.

Adapter runs but metrics are missing

Check that the adapter points at the correct Prometheus (the prometheus.url and prometheus.port Helm values) and check the adapter logs for rule parsing or query errors.

Automating with GitOps

Prometheus Adapter can be fully automated using GitOps tools such as Argo CD: keep the Helm release and the values.yaml containing your custom rules in Git, and let the controller apply and sync them to the cluster.

This removes the need for manual Helm commands and ensures consistency across environments.
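
As a sketch of what this can look like with Argo CD (the chart version is a placeholder, and the custom rules from Step 3 are inlined via helm.values):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus-adapter
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: prometheus-adapter
    targetRevision: 4.11.0   # placeholder: pin to the chart version you use
    helm:
      values: |
        rules:
          default: false
          custom:
            - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
              # ...same rule as in Step 3...
  destination:
    server: https://kubernetes.default.svc
    namespace: prometheus-adapter
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true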


Conclusion

Prometheus Adapter enables Kubernetes autoscaling based on real application signals, making scaling decisions more meaningful than CPU or memory alone.

While the setup requires careful attention to PromQL, labels, and staleness, the resulting autoscaling behavior is far more aligned with how applications actually behave.

For queue-based or event-driven workloads, custom metrics autoscaling is a powerful and flexible approach.

