Project author: jessemillar

Project description:
Researching how to best apply stress to Kubernetes pods/nodes using stress-ng
Programming language:
Project URL: git://github.com/jessemillar/k8s-stress-research.git
Created: 2020-10-21T15:43:50Z
Project community: https://github.com/jessemillar/k8s-stress-research

License: MIT License



k8s-stress-research

Yes, I’m aware there are existing projects created to apply stress to Kubernetes clusters, but I’m working on some internal Microsoft projects needed for custom infrastructure. As such, the research presented in this repository will be largely based around raw stress-ng with no non-Kubernetes wrappers.

The way I’m looking at it currently, there are three options for applying stress using a wrapper around stress-ng (option 1 is sketched after the list):

  1. Use a daemonset to run a container with stress-ng that tries to vampire suck all the resources of the node
  2. Use a sidecar of sorts to get stress-ng inside the application pods and apply stress directly
  3. Somehow install stress-ng on the actual VM (potentially via a daemonset) and stress the whole box
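
As a rough sketch of option 1, a daemonset along these lines would land one stress-ng pod on every node; the names, image, and stress-ng arguments below are assumptions borrowed from the tests later in this document, not a settled design:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: stress-ng
    spec:
      selector:
        matchLabels:
          app: stress-ng
      template:
        metadata:
          labels:
            app: stress-ng
        spec:
          containers:
          - name: stress-ng
            image: alexeiled/stress-ng:latest-ubuntu
            # Same arguments as the local tests below; note the kubelet will
            # restart the container once stress-ng exits, so a longer (or no)
            # timeout probably makes more sense for sustained stress.
            args: ["--cpu", "0", "--cpu-load", "80", "--timeout", "60s"]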

Setup

  1. Install the Kubernetes Metrics Server
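
For reference, the Metrics Server is typically installed from its published components manifest, and kubectl top can then be used to watch CPU during the tests below; the URL is the upstream default, and Kind clusters often also need --kubelet-insecure-tls added to the metrics-server container args:

    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    kubectl top nodes
    kubectl top pods --all-namespaces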

Tests Performed

Local Docker Test

  1. Used Docker on my development machine to run a stress-ng load test at 80% CPU for a minute:
    1. docker run -it --rm alexeiled/stress-ng:latest-ubuntu --cpu 0 --cpu-load 80 --timeout 60s --metrics-brief

Saw host CPU usage hold at 80% across all cores as expected (--cpu 0 starts one stressor per online CPU, so every core gets loaded).
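
For a per-container rather than host-wide view, docker stats in a second terminal shows the same load (the container name is whatever Docker assigned, since the run above uses --rm without --name):

    docker stats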

Local Kubernetes Test

  1. Created a local Kubernetes cluster using Kind
  2. Applied a pod spec containing stress-ng to the cluster:
    1. kubectl apply -f pod.yaml
  3. Triggered the stress-ng pod to load test at 80% CPU for a minute:
    1. kubectl exec -it <stress-ng-pod> -- stress-ng --cpu 0 --cpu-load 80 --timeout 60s

Saw host CPU usage hold at 80% across all cores as expected.
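
pod.yaml lives in this repo and isn't reproduced here; as a sketch, a spec along these lines supports the kubectl exec step above (the image and the sleep-to-keep-the-container-alive trick are assumptions, and the Kind cluster itself comes from a plain kind create cluster):

    apiVersion: v1
    kind: Pod
    metadata:
      name: stress-ng
    spec:
      containers:
      - name: stress-ng
        image: alexeiled/stress-ng:latest-ubuntu
        # Keep the container idle so stress-ng can be started on demand via kubectl exec
        command: ["sleep", "infinity"]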

Local Kubernetes Test with Multiple Stressor Pods

  1. Created a local Kubernetes cluster using Kind
  2. Applied multiple pod specs containing stress-ng to the cluster:
    1. kubectl apply -f pod.yaml
    2. kubectl apply -f pod2.yaml
  3. Triggered the stress-ng pods to load test at 80% CPU for a minute:
    1. kubectl exec -it <stress-ng-pod1> -- stress-ng --cpu 0 --cpu-load 80 --timeout 60s
    2. kubectl exec -it <stress-ng-pod2> -- stress-ng --cpu 0 --cpu-load 80 --timeout 60s

Saw host CPU usage hold at 100% across all cores; the competing pods split the available CPU between them rather than pushing usage above 100%.
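
With the Metrics Server from the Setup section installed, the split between the two pods can also be watched from inside the cluster while the test runs:

    kubectl top nodes
    kubectl top pods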

Create Namespace with CPU Limit

  1. kubectl create namespace limitrange-cpu-50-percent
  2. kubectl apply -f limitrange-cpu-50-percent.yaml --namespace=limitrange-cpu-50-percent
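
limitrange-cpu-50-percent.yaml is in the repo and not reproduced here; a LimitRange along these lines is the likely shape, though the 500m values are an assumption based on the file name (roughly half a core per container):

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: limitrange-cpu-50-percent
    spec:
      limits:
      - type: Container
        default:
          cpu: 500m         # limit applied to containers that don't set one
        defaultRequest:
          cpu: 500m         # request applied to containers that don't set one
        max:
          cpu: 500m         # maximum any container in the namespace may claim

With this in place, a stress-ng container created in the namespace should be throttled at its CPU limit regardless of the --cpu-load it asks for.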

Questions to Answer

  • Are we trying to chaos test Kubernetes as a service, or chaos test a service that runs on Kubernetes?
  • Is it possible to guarantee that a daemonset has unlimited resource constraints?
  • Do we want a daemonset to have unlimited resource constraints, or to respect the cluster resource quota in the event of multitenant situations?
  • Do daemonsets get rescheduled to other nodes when resources are consumed?
  • Does Kubernetes react differently when the machine is stressed directly vs stressed from inside a container (e.g., will it reschedule pods in one situation and not the other; we want to simulate real load as much as possible)?
  • How does networking inside a daemonset work?
  • How do we make sure the communication cert gets installed inside a daemonset container?
  • Would we save time/effort by instead wrapping an existing k8s stressing framework like Pumba?

Arguments for Containerized Daemonset

  • Would work and be able to more easily respect multitenant resource namespacing
  • Might be simpler to interact with Kubernetes networking from inside the Kubernetes network