The Docker Hubpocalypse

Nov 6, 2020

As of Nov 2nd, 2020, Docker Hub has begun rate-limiting image pull requests from its registry. This may have a broad impact, particularly on large Kubernetes clusters that depend on public images. Docker Hub will gradually reduce the limits to:

  • 100 container image requests per six hours for anonymous usage.
  • 200 container image requests per six hours for free Docker accounts.
  • Unlimited container pull requests for Pro and Team accounts.

These limits are applied per IP address. How much impact this will have on a cluster will depend on a variety of factors such as:

  • Does every node use a public IP to pull images, or are the nodes NATed?
  • How often do the public images upstream change?
  • Do you use an auto-scaler with frequent resizing?
  • How often do pods move between hosts in your cluster?
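To see where a given node or NAT gateway currently stands against these limits, Docker Hub returns rate-limit headers on manifest requests. The sketch below assumes `curl` and `jq` are installed and uses the `ratelimitpreview/test` repository that Docker's documentation describes for this purpose; the function names are hypothetical, and Docker's documentation (not this post) is the source for the claim that a HEAD request does not count as a pull.

```shell
# Sketch only: inspect Docker Hub rate-limit headers for the current source IP.
# Function names are hypothetical; requires curl and jq.

# Split a header value such as "100;w=21600" into "<count> <window-seconds>".
parse_ratelimit() {
  # ${1%%;*} keeps the text before the first ';', ${1##*w=} the text after 'w='.
  printf '%s %s\n' "${1%%;*}" "${1##*w=}"
}

# Fetch an anonymous token, then send a HEAD request for a manifest and
# print only the ratelimit-* response headers.
check_docker_hub_limit() {
  local token
  token=$(curl -fsSL \
    "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" \
    | jq -r .token)
  curl -fsSI -H "Authorization: Bearer ${token}" \
    "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" \
    | grep -i '^ratelimit-'
}
```

Running `check_docker_hub_limit` from a node shows the limit and remaining count for whatever public IP that node's traffic leaves from, which is also a quick way to confirm whether nodes share a NAT address.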

By far the biggest impact will be on private-IP clusters that make requests via NAT, because every node then shares one public IP address and therefore one quota. For a fixed-size cluster the impact is expected to be minimal. A common configuration is imagePullPolicy: Always, and it is a common misconception that this always downloads the Docker image. In reality it checks the image digest (SHA) and only pulls the image when it differs.

Another consideration is that the components most commonly consumed from public repositories are infrastructure components such as service meshes, egresses, log forwarders, and APM agents such as Instana. Relative to customer applications, which may be deployed many times a day, these often have longer release cycles, meaning that in practice your clusters might not be heavily impacted by this change in service.
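As a rough back-of-the-envelope check, the NAT case can be sketched as simple arithmetic: all nodes behind one gateway draw from a single quota, so their pulls add up. The node counts and per-node pull rates below are hypothetical inputs, not measurements, and the function names are made up for illustration.

```shell
# Sketch only: estimate whether a NATed cluster stays within the anonymous limit.
ANON_LIMIT=100   # anonymous image requests per six hours, as listed earlier

# $1 = number of nodes behind one NAT IP
# $2 = assumed image pulls per node in a six-hour window (hypothetical figure)
estimate_pulls() {
  echo $(( $1 * $2 ))
}

# $1 = estimated pulls, $2 = limit; prints "ok" or "over".
within_limit() {
  if [ "$1" -le "$2" ]; then echo "ok"; else echo "over"; fi
}
```

For example, `within_limit "$(estimate_pulls 40 3)" "$ANON_LIMIT"` compares 120 estimated pulls against the limit of 100, while the same 40 nodes with individual public IPs would each be far below it.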

What can you do?

There are a few different options you can consider to minimise your exposure:

  1. Use image pull secrets.
  2. Use Service Accounts.
  3. Use a caching proxy.
  4. Use a mirror.

Image Pull Secrets

Benefits

  • Unlimited pull requests with a Pro/Team account.
  • Simple and familiar approach for anyone using private repositories.

Tradeoffs

  • Requires pull secrets per namespace.
  • Requires updating workloads.
  • Potentially needs an admission controller to auto-inject imagePullSecrets.

How To

See the official Kubernetes documentation on creating image pull secrets and associating them with a workload.
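A minimal sketch of that flow is shown here, assuming `kubectl` is configured against the target cluster. The secret name, namespace, pod name, and helper function names are all hypothetical; the `kubectl create secret docker-registry` subcommand and the `imagePullSecrets` pod field are the parts taken from Kubernetes itself.

```shell
# Sketch only: create a Docker Hub pull secret and reference it from a Pod.

# $1 = secret name, $2 = namespace, $3 = Docker Hub user, $4 = password or access token
create_pull_secret() {
  kubectl create secret docker-registry "$1" \
    --namespace "$2" \
    --docker-server=https://index.docker.io/v1/ \
    --docker-username="$3" \
    --docker-password="$4"
}

# Print a minimal Pod manifest that uses the secret.
# $1 = pod name, $2 = image, $3 = secret name
render_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  containers:
    - name: $1
      image: $2
  imagePullSecrets:
    - name: $3
EOF
}
```

Usage might look like `create_pull_secret regcred default myuser "$DOCKER_TOKEN"` followed by `render_pod demo nginx:1.19 regcred | kubectl apply -n default -f -`. Because secrets are namespaced, the first step has to be repeated in every namespace that pulls from Docker Hub.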

Service Account

Benefits

  • Many infrastructure components will have Service Accounts (SA) attached.

Tradeoffs

  • Lesser-known approach to associating pull secrets with a pod.
  • Potentially needs an admission controller to auto-inject the SA.

How To

See the Kubernetes documentation on adding secrets to a Service Account.
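As a sketch of what that looks like in practice, the Kubernetes documentation patches the Service Account with an `imagePullSecrets` entry. The helper names and the `regcred` secret name below are hypothetical, and the secret is assumed to already exist in the same namespace.

```shell
# Sketch only: attach an existing pull secret to a Service Account.

# Build the JSON patch body. $1 = secret name
sa_patch() {
  printf '{"imagePullSecrets": [{"name": "%s"}]}\n' "$1"
}

# $1 = service account, $2 = namespace, $3 = secret name
attach_pull_secret() {
  kubectl patch serviceaccount "$1" --namespace "$2" -p "$(sa_patch "$3")"
}
```

For example, `attach_pull_secret default my-namespace regcred` patches the namespace's default Service Account. Pods created afterwards with that Service Account should pick up the secret without any change to their own spec; existing pods would need to be recreated.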

Caching Proxy

Benefits

  • Reduces total external bandwidth/latency for commonly used images.

Tradeoffs

  • Requires hooking into the Docker daemon configuration on every node.
  • If the proxy pulls unauthenticated, it is still possible to hit the rate limit when consuming a high number of images and versions.
  • Non-trivial to configure with managed Kubernetes services such as GKE, AKS, and EKS.
  • Additional infrastructure to manage and monitor.

How To

See the Docker documentation on configuring a pull-through cache. The key steps to be aware of are running a proxy and updating the Docker daemon configuration (daemon.json):

  1. Create a proxy:
    
     docker run -d -p 6000:5000 \
     -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
     --restart always \
     --name registry registry:2
    
  2. Update daemon.json on each node so the Docker daemon uses the proxy as a registry mirror.
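For the second step, the Docker documentation uses the `registry-mirrors` key in `/etc/docker/daemon.json`. The sketch below only prints such a fragment; the host name `cache.internal.example` is a hypothetical placeholder for wherever the proxy container runs, and port 6000 matches the `-p 6000:5000` mapping in the `docker run` command.

```shell
# Sketch only: print a daemon.json fragment that points the Docker daemon
# at a pull-through cache. $1 = mirror URL (hypothetical host in the example)
render_daemon_json() {
  cat <<EOF
{
  "registry-mirrors": ["$1"]
}
EOF
}
```

One way to apply it is `render_daemon_json http://cache.internal.example:6000 | sudo tee /etc/docker/daemon.json` followed by a daemon restart such as `sudo systemctl restart docker`. This overwrites the file, so any existing settings need to be merged by hand. The URL scheme has to match how the cache is actually served; the registry container as started here serves plain HTTP, so a production setup would normally put TLS in front of it.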

Mirror

Benefits

  • Independence from upstream maintainers.
  • Potential to integrate Docker security scanners.

Tradeoffs

  • Additional overhead for keeping images up to date.
  • Requires overriding image name for any related workloads.

How To

  1. Pull the image from Docker Hub (docker pull).
  2. Retag it for your own registry and push it there (docker tag, docker push).
  3. Update image references for workloads.
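These steps can be sketched as a small script. The registry host `registry.example.com`, the function names, and the deployment and container names in the usage note are hypothetical; the `docker pull`, `docker tag`, and `docker push` commands are standard Docker CLI.

```shell
# Sketch only: copy a Docker Hub image into a private registry.

# Map a source image reference to its name in the private registry.
# $1 = private registry host, $2 = source image (optionally prefixed with docker.io/)
mirror_ref() {
  printf '%s/%s\n' "$1" "${2#docker.io/}"
}

# Pull, retag, and push one image; stops at the first failing step.
# $1 = private registry host, $2 = source image
mirror_image() {
  local target
  target="$(mirror_ref "$1" "$2")"
  docker pull "$2" &&
    docker tag "$2" "$target" &&
    docker push "$target"
}
```

After `mirror_image registry.example.com nginx:1.19`, a workload can be repointed with something like `kubectl set image deployment/web web=registry.example.com/nginx:1.19`. Keeping the mirror current means rerunning this whenever upstream publishes a tag you depend on, which is the overhead noted under the tradeoffs.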

Conclusions

It’s hard to be prescriptive about any single one of the solutions above. You may choose to mix and match them as appropriate for your IT organisation.

tags: [ docker kubernetes ]