Kubernetes Autoscaling

Horizontal Pod Autoscaler

Contents

  1. ⌛ History of Scaling
  2. 🤷 Reasons for Scaling
  3. 👨‍👧‍👦 Maturity Model
  4. 📈 Scaling in Kubernetes

Timelines

⌛ History of Scaling


| Class               | Lead Time               | Level of Automation |
| ------------------- | ----------------------- | ------------------- |
| Self-hosted Servers | Weeks to Months         | Low                 |
| Virtualisation      | Days to Weeks           | Low                 |
| VPS                 | Hours to Days           | Moderate            |
| Instances           | Minutes to Hours        | Moderate to High    |
| Pods                | Seconds to Minutes      | High                |
| Functions           | Milliseconds to Seconds | High                |
| #nocode             | Speed of thought **     | Infinite ∞          |

** ( ͡° ͜ʖ ͡°)


Drivers

🤷 Reasons for Scaling

Scale Up

  1. Latency.
  2. Availability.
  3. Throughput.

Scale Down

  1. Costs.
  2. Density.
  3. Sharing.

👨‍👧‍👦 Autoscaling Maturity Model


| Level            | Monitoring       | Scaling                   | Benchmarking                |
| ---------------- | ---------------- | ------------------------- | --------------------------- |
|  0 - Static      | No observability | Best guess provisioning   | No performance/load testing |
|  1 - Coarse      | CPU/Memory       | Based on CPU/Memory       | Manual load tests           |
|  2 - Qualitative | Calls/Latency    | Based on calls/latency    | Automatic but periodic      |
|  3 - Optimising  | Tracing          | Adaptive                  | Automatic per commit        |

Level 0 - Static

👨‍👧‍👦 Autoscaling Maturity Model


Level 1 - Coarse

👨‍👧‍👦 Autoscaling Maturity Model


Level 2 - Qualitative

👨‍👧‍👦 Autoscaling Maturity Model


Level 3 - Optimising

👨‍👧‍👦 Autoscaling Maturity Model


📈 Scaling in Kubernetes


Scaling Maths 101

📈 Scaling in Kubernetes


desiredReplicas = ceil(currentReplicas * (currentMetric/desiredMetric))

    Given a target utilization of 60%
    And a replica count of 4 pods
    And an average utilization of 80%
    When the HPA evaluates the metrics
    Then it should scale to 6 pods.

 ceil(4 * (80/60)) = ceil(5.33) = 6.
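
The formula can be checked with a few lines of Python. This is only a sketch of the arithmetic, not the controller's actual implementation; the function name `desired_replicas` is made up for illustration, and the inputs are the scenario's numbers (4 replicas, 80% average utilization, 60% target).

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     desired_metric: float) -> int:
    """Apply desiredReplicas = ceil(currentReplicas * (currentMetric/desiredMetric))."""
    return math.ceil(current_replicas * (current_metric / desired_metric))


# 4 pods averaging 80% utilization against a 60% target.
print(desired_replicas(4, 80, 60))
```

Because the result is rounded up, any load above the target that needs even a fraction of a pod adds a whole pod.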


Level 0 - Static

📈 Scaling in Kubernetes

kubectl scale --replicas=2 -n instana-dev deployment/fizzbuzz 

Level 1 - Coarse

📈 Scaling in Kubernetes

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fizzbuzz
  namespace: instana-dev
spec:
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 60
        type: Utilization
    type: Resource
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fizzbuzz
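
To see how `minReplicas`, `maxReplicas` and `averageUtilization` in this manifest bound the outcome, here is a rough Python sketch. It assumes the scaling formula is applied first and the result is then clamped to the configured range; it deliberately ignores other controller behaviour such as tolerance and stabilization. The function name `clamp_replicas` is hypothetical, and its defaults simply mirror the manifest values (60, 3, 10).

```python
import math


def clamp_replicas(current_replicas: int,
                   current_utilization: float,
                   target_utilization: float = 60,  # averageUtilization
                   min_replicas: int = 3,           # minReplicas
                   max_replicas: int = 10) -> int:  # maxReplicas
    """Compute the raw replica count, then keep it within [min, max]."""
    raw = math.ceil(current_replicas * (current_utilization / target_utilization))
    return max(min_replicas, min(max_replicas, raw))


print(clamp_replicas(4, 80))    # within range: raw value is used
print(clamp_replicas(8, 120))   # raw value of 16 is capped at maxReplicas
print(clamp_replicas(3, 10))    # raw value of 1 is raised to minReplicas
```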

Level 2 - Qualitative - Landscape

📈 Scaling in Kubernetes


Thank you

Horizontal Pod Autoscaler

09 Jan 2022