How to autoscale Kubernetes workloads using Prometheus Adapter and custom metrics
Introduction
Kubernetes’ Horizontal Pod Autoscaler (HPA) supports CPU and memory metrics by default, but many workloads need to scale based on application-level metrics such as queue depth or backlog size.
This guide walks through how to use Prometheus Adapter to expose custom Prometheus metrics to Kubernetes and use them for autoscaling.
By the end, you’ll have:
- Prometheus Adapter installed
- A custom metric exposed via custom.metrics.k8s.io
- An HPA scaling a workload based on that metric
Prerequisites
Before starting, ensure you have:
- A Kubernetes cluster
- A Prometheus-compatible metrics backend
- Metrics that include labels for:
  - namespace
  - service (or another Kubernetes object label)
- Helm installed
- Access to create:
  - Deployments
  - HPAs
  - APIService resources
Step 1: Verify your metric exists in Prometheus
Before configuring Prometheus Adapter, confirm your metric is being scraped.
Example metric:
pending_jobs_total{namespace="example-app", service="worker"}
Query Prometheus directly:
curl -G https://prometheus.example.com/api/v1/query \
--data-urlencode 'query=pending_jobs_total'
You should see at least one active time series.
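To check this from a script rather than by eye, you can count the returned series (assuming jq is available):
curl -sG https://prometheus.example.com/api/v1/query \
  --data-urlencode 'query=pending_jobs_total' | jq '.data.result | length'
A count of 0 means nothing is currently being scraped for this metric.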
Step 2: Install Prometheus Adapter
Prometheus Adapter is typically installed using the Helm chart from prometheus-community.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter prometheus-community/prometheus-adapter \
--namespace prometheus-adapter \
--create-namespace
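In practice you will usually also point the adapter at your metrics backend via the chart's prometheus.url and prometheus.port values, since the default Prometheus address rarely matches a real cluster. For example (the monitoring-namespace address below is a placeholder):
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace prometheus-adapter \
  --create-namespace \
  --set prometheus.url=http://prometheus.monitoring.svc \
  --set prometheus.port=9090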
Verify the adapter is running:
kubectl -n prometheus-adapter get pods
Step 3: Configure custom metric rules
Prometheus Adapter uses rules to map Prometheus metrics to Kubernetes objects.
Create a values file with custom rules:
rules:
  default: false
  custom:
    - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
      name:
        matches: "^pending_jobs_total$"
        as: "pending_jobs_total"
      metricsQuery: >
        sum(
          max_over_time(
            <<.Series>>{<<.LabelMatchers>>}[5m]
          )
        ) by (<<.GroupBy>>)
Why max_over_time?
Prometheus Adapter issues instant queries against Prometheus.
If a series has no recent samples, the instant query returns nothing and the adapter responds with a 404 for that metric.
Wrapping the series in a short range query (e.g. [5m]) and taking max_over_time smooths over brief scrape gaps.
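To make the templating concrete: when the HPA asks for pending_jobs_total on the worker Service in example-app, the adapter expands the rule into a PromQL query roughly like the following (the exact matchers and grouping depend on the request):
sum(
  max_over_time(
    pending_jobs_total{namespace="example-app",service="worker"}[5m]
  )
) by (service)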
Step 4: Apply the updated configuration
Upgrade Prometheus Adapter with your custom rules:
helm upgrade prometheus-adapter prometheus-community/prometheus-adapter \
--namespace prometheus-adapter \
-f values.yaml
Restart the adapter to ensure it reloads configuration:
kubectl -n prometheus-adapter rollout restart deploy/prometheus-adapter
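To confirm the new rules were actually rendered into the adapter's configuration, inspect the ConfigMap created by the chart. With the release name used above it is typically called prometheus-adapter; adjust the name if yours differs:
kubectl -n prometheus-adapter get configmap prometheus-adapter -o yaml | grep -A 3 pending_jobs_total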
Step 5: Verify the custom metrics API
Check that the custom metrics API is available:
kubectl get apiservice v1beta1.custom.metrics.k8s.io
List available custom metrics:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources'
You should see entries like:
services/pending_jobs_total
Query the metric for a specific Service:
kubectl get --raw \
"/apis/custom.metrics.k8s.io/v1beta1/namespaces/example-app/services/worker/pending_jobs_total"
If this returns a value, Prometheus Adapter is working correctly.
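For reference, a successful response is a MetricValueList shaped roughly like this (object names, timestamps, and values will differ):
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "example-app",
        "name": "worker",
        "apiVersion": "/v1"
      },
      "metricName": "pending_jobs_total",
      "timestamp": "2024-05-01T12:00:00Z",
      "value": "4"
    }
  ]
}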
Step 6: Create an HPA using the custom metric
Define an HPA that references the custom metric:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Object
      object:
        describedObject:
          apiVersion: v1
          kind: Service
          name: worker
        metric:
          name: pending_jobs_total
        target:
          type: Value
          value: "2"
Apply it:
kubectl apply -f worker-hpa.yaml
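You can then watch the HPA pick up the metric and adjust replicas as the backlog changes:
kubectl -n example-app get hpa worker-hpa --watch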
Step 7: Confirm autoscaling behavior
Describe the HPA to confirm it can read the metric:
kubectl -n example-app describe hpa worker-hpa
Look for:
- ValidMetricFound
- A current metric value
- No FailedGetObjectMetric errors
Common troubleshooting tips
Metric returns 404 in custom metrics API
- The metric exists historically but has no recent samples (see the check below)
- Use max_over_time in the adapter rule
- Ensure the workload is still emitting metrics
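To confirm whether Prometheus still has fresh samples, run the same range-wrapped query the adapter uses; an empty result means the source data is gone, not that the adapter is misconfigured (the Prometheus URL is the same placeholder as in Step 1):
curl -sG https://prometheus.example.com/api/v1/query \
  --data-urlencode 'query=max_over_time(pending_jobs_total[5m])' | jq '.data.result'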
HPA shows old errors
- HPA events are historical and can outlive the problem they describe
- Focus on the current metric value and condition statuses (see the check below)
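To see only the current state rather than past events, read the HPA's status stanza directly and check status.currentMetrics and status.conditions:
kubectl -n example-app get hpa worker-hpa -o yaml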
Adapter runs but metrics are missing
- Check the ConfigMap loaded by the adapter (see below)
- Verify label names match exactly (namespace, service)
- Confirm Prometheus queries return results at “now”
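If the rules and labels look correct but metrics still don't show up, the adapter's logs usually explain why a series was dropped or a query failed:
kubectl -n prometheus-adapter logs deploy/prometheus-adapter --tail=100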
Automating with GitOps
Prometheus Adapter can be fully automated using GitOps tools such as Argo CD:
- Install the adapter via a Helm source
- Store rules and values in Git
- Manage credentials with Sealed Secrets
- Let Argo CD continuously reconcile the state
This removes the need for manual Helm commands and ensures consistency across environments.
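As a sketch, an Argo CD Application that installs the adapter from the same Helm repository and carries the Step 3 rules inline might look like the following; the project, destination, and chart version are assumptions to adapt to your setup:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus-adapter
  namespace: argocd
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
    namespace: prometheus-adapter
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: prometheus-adapter
    targetRevision: "*"   # pin a specific chart version in practice
    helm:
      values: |
        rules:
          default: false
          custom:
            - seriesQuery: 'pending_jobs_total{namespace!="",service!=""}'
              resources:
                overrides:
                  namespace: { resource: namespace }
                  service: { resource: service }
              name:
                matches: "^pending_jobs_total$"
                as: "pending_jobs_total"
              metricsQuery: >
                sum(max_over_time(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>)
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
With this in place, changes to the rules are made by editing the Application (or a values file it references) in Git rather than re-running Helm by hand.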
Conclusion
Prometheus Adapter enables Kubernetes autoscaling based on real application signals, making scaling decisions more meaningful than CPU or memory alone.
While the setup requires careful attention to PromQL, labels, and staleness, the resulting autoscaling behavior is far more aligned with how applications actually behave.
For queue-based or event-driven workloads, custom metrics autoscaling is a powerful and flexible approach.