Project author: webdevops

Project description:
Automatic repair of K8s cluster nodes in Azure
Language: Go
Repository: git://github.com/webdevops/azure-k8s-autorepair.git
Created: 2020-05-17T16:23:22Z
Project community: https://github.com/webdevops/azure-k8s-autorepair

License: MIT License


Azure Kubernetes AutoRepair


--> replaced by azure-k8s-autopilot <--


A service that checks node status and triggers an automatic Azure VM redeployment to try to resolve VM issues.

Supports Azure AKS and custom Azure Kubernetes clusters.

Supports shoutrrr notifications.

Configuration

  Usage:
    azure-k8s-autorepair [OPTIONS]

  Application Options:
    -v, --verbose  Verbose mode [$VERBOSE]
        --dry-run  Dry run (no redeploy triggered) [$DRY_RUN]
        --k8s.node.labelselector=  Node label selector defining which nodes should be checked [$K8S_NODE_LABELSELECTOR]
        --repair.interval=  Interval between check runs (default: 30s) [$REPAIR_INTERVAL]
        --repair.notready-threshold=  Threshold (duration) after which the automatic repair should be tried (e.g. after 10 minutes of NotReady state since the last successful heartbeat) (default: 10m) [$REPAIR_NOTREADY_THRESHOLD]
        --repair.concurrency=  How many VMs should be redeployed concurrently (default: 1) [$REPAIR_CONCURRENCY]
        --repair.lock-duration=  How long to wait before another redeploy (default: 30m) [$REPAIR_LOCK_DURATION]
        --repair.lock-duration-error=  How long to wait before another redeploy if an error occurred (default: 5m) [$REPAIR_LOCK_DURATION_ERROR]
        --repair.azure.vmss.action=[restart|redeploy|reimage]  Action which should be tried to repair the node (VMSS) (default: redeploy) [$REPAIR_AZURE_VMSS_ACTION]
        --repair.azure.vm.action=[restart|redeploy]  Action which should be tried to repair the node (VM) (default: redeploy) [$REPAIR_AZURE_VM_ACTION]
        --repair.azure.provisioningstate=  Azure VM provisioning states in which a repair should be tried (e.g. avoid repair in "upgrading" state; "*" to accept all states) (default: succeeded, failed) [$REPAIR_AZURE_PROVISIONINGSTATE]
        --notification=  Shoutrrr URL for notifications (https://containrrr.github.io/shoutrrr/) [$NOTIFCATION]
        --bind=  Server address (default: :8080) [$SERVER_BIND]

  Help Options:
    -h, --help  Show this help message

For Azure API authentication (using environment variables), see https://github.com/Azure/azure-sdk-for-go#authentication
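
As a rough pre-flight check for environment-based authentication, the sketch below reports which client-credential variables are unset. It assumes the service-principal variables commonly read by the Azure SDK for Go (AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET); the helper missingEnv is hypothetical, and other authentication methods such as managed identity do not need these variables at all.

```go
package main

import (
	"fmt"
	"os"
)

// clientCredentialVars lists the variables assumed for service-principal
// (client credential) authentication. Other methods use different settings.
var clientCredentialVars = []string{
	"AZURE_TENANT_ID",
	"AZURE_CLIENT_ID",
	"AZURE_CLIENT_SECRET",
}

// missingEnv returns the names whose value is empty according to getenv.
// Hypothetical helper; getenv is injected so the check is easy to test.
func missingEnv(getenv func(string) string, names []string) []string {
	var missing []string
	for _, name := range names {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingEnv(os.Getenv, clientCredentialVars); len(missing) > 0 {
		fmt.Println("not set:", missing)
		return
	}
	fmt.Println("client-credential variables are set")
}
```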

For Kubernetes, the ServiceAccount is discovered automatically (or you can set the KUBECONFIG environment variable to the path of your kubeconfig file).
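
The lookup order could look roughly like the following sketch, which uses only the standard library instead of the Kubernetes client. The function kubeConfigSource is hypothetical, and giving KUBECONFIG precedence over the in-cluster ServiceAccount is an assumption. The token path is the standard location where Kubernetes mounts ServiceAccount credentials into a pod.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Standard mount path of the ServiceAccount token inside a pod.
const serviceAccountTokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"

// kubeConfigSource describes where cluster credentials would come from.
// Hypothetical helper: it prefers an explicit KUBECONFIG path (an assumed
// precedence) and falls back to the mounted ServiceAccount token.
func kubeConfigSource(getenv func(string) string, fileExists func(string) bool) (string, error) {
	if path := getenv("KUBECONFIG"); path != "" {
		return "kubeconfig:" + path, nil
	}
	if fileExists(serviceAccountTokenPath) {
		return "in-cluster", nil
	}
	return "", errors.New("no KUBECONFIG set and no ServiceAccount token found")
}

func main() {
	exists := func(path string) bool {
		_, err := os.Stat(path)
		return err == nil
	}
	source, err := kubeConfigSource(os.Getenv, exists)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using", source)
}
```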

Metrics

Standard metrics only (see :8080/metrics).