Author: bakito

Description:
👷‍♂️ A batch job controller for k8s / OpenShift
Language: Go
Repository: git://github.com/bakito/batch-job-controller.git
Created: 2020-08-19T17:58:46Z
Project page: https://github.com/bakito/batch-job-controller

License: Apache License 2.0

Batch Job Controller

The batch job controller runs pods on the nodes of a cluster, with a configurable limit on the number of concurrently
running pods. Each pod can report its results back to the controller, which exposes them as metrics.

Deployment

The controller expects the following environment variables

Name             Value
NAMESPACE        The current namespace
CONFIG_MAP_NAME  The name of the configmap to read the config from
POD_IP           The IP of the controller pod. If defined, this IP is used for the callback URL of the job pods (should be injected via the Downward API).

Configuration

The configuration has to be stored in a configmap with the following values

config.yaml

Controller configuration

  name: "" # name of the controller; will also be used as prefix for the job pods
  jobServiceAccount: "" # service account to be used for the job pods. If empty, the default will be used
  jobImagePullSecrets: # pull secrets to be used for the job pods for pulling the image
    - name: secret_name
  jobNodeSelector: { } # node selector labels to define on which nodes to run the jobs
  runOnUnscheduledNodes: true # if true, jobs are also started on nodes that are unschedulable
  cronExpression: "42 3 * * *" # the cron expression to trigger the job execution
  reportDirectory: "/var/www" # directory to store and serve the reports
  reportHistory: 30 # number of execution reports to keep
  podPoolSize: 10 # number of concurrent job pods to run
  runOnStartup: true # if 'true' the jobs are triggered on startup of the controller
  startupDelay: 10s # the delay as duration that is used to start the jobs if runOnStartup is enabled. default is '10s'
  callbackServiceName: "" # name of the controller service
  callbackServicePort: 8090 # port of the controller callback api service
  custom: { } # additional properties that can be used in a custom implementation
  latestMetricsLabel: false # if 'true' each result metric is also created with executionID='latest'
  leaderElectionResourceLock: "" # type of leader election resource lock to be used. ('configmapsleases' (default), 'configmaps', 'endpoints', 'leases', 'endpointsleases')
  savePodLog: false # if enabled, pod logs are saved along with other job files
  metrics:
    prefix: "foo_...." # prefix for the metrics exposed by the controller
    gauges: # metric gauges that will be exposed by the jobs. The key is used as suffix for the metrics.
      test: # suffix of the metric
        help: "help ..." # help text for the metric
        labels: # list of labels to be used with the metric. node and executionID are automatically added
          - label_a
          - label_b

pod-template.yaml

The template of the pod to be started for each job. When a pod is created, the controller enriches it with the
controller-specific configuration (see pkg\job\job.go).

Job Pod

The job pod has the following env variables provided by the controller:

Environment

Name                         Value
NAMESPACE                    The current namespace
NODE_NAME                    The name of the node it is running on
EXECUTION_ID                 The id of the current job execution
CALLBACK_SERVICE_NAME        The name/host/ip of the callback service to send the report to
CALLBACK_SERVICE_PORT        The port of the callback service to send the report to
CALLBACK_SERVICE_RESULT_URL  The fully qualified URL of the result callback service
CALLBACK_SERVICE_FILE_URL    The fully qualified URL of the file callback service, to send files to the controller
CALLBACK_SERVICE_EVENT_URL   The fully qualified URL of the event callback service, to create a k8s event

Callback

By default, the controller exposes an endpoint that receives job results. Each report is stored locally, and the
metrics it contains are exposed by the controller.

URL

The report URL is by default: ${CALLBACK_SERVICE_RESULT_URL}

Body

The body of the report is keyed by the metric suffixes that are also defined in the controller config. Each metric
entry has a decimal value and a map in which each key is a label name and each value is the value to use for that label.

  {
    "test": [
      {
        "value": 1.0,
        "labels": {
          "label_a": "AAA",
          "label_b": "BBB"
        }
      },
      {
        "value": 2.554,
        "labels": {
          "label_a": "AAA2",
          "label_b": "BBB2"
        }
      }
    ]
  }

Example job script: helm\batch-job-controller\bin\run.sh

Upload additional files

Job pods can upload additional files to the controller.

Use the standard ‘Content-Disposition’ header or the name query parameter to define the name of the file. If no name
is defined, a UUID is generated. Each file name is prefixed with the node name.

URL

The file upload URL is by default: ${CALLBACK_SERVICE_FILE_URL}
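
A possible upload from a Go job, using the name query parameter, is sketched below. The fileURL helper, the file name summary.txt and its content are hypothetical; sending the file as the raw body of an HTTP POST with a text/plain content type is an assumption, as the text above does not specify the method or encoding.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// fileURL appends the name query parameter to the base upload URL.
func fileURL(base, name string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("name", name)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	base := os.Getenv("CALLBACK_SERVICE_FILE_URL")
	if base == "" {
		fmt.Println("CALLBACK_SERVICE_FILE_URL is not set; skipping upload")
		return
	}
	target, err := fileURL(base, "summary.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid file URL:", err)
		os.Exit(1)
	}
	// Assumption: the raw file content is sent as the POST body.
	resp, err := http.Post(target, "text/plain", strings.NewReader("all checks passed\n"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "upload failed:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	fmt.Println("upload status:", resp.Status)
}
```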

Create k8s Events from job pod

A k8s event can be created from each job pod by calling the event endpoint.

The ‘reason’ should be short and unique; it must be in UpperCamelCase format (starting with a capital letter).

Simple Message:

  {
    "warning": false,
    "reason": "TestReason",
    "message": "test message"
  }

Message with parameters:

  {
    "warning": true,
    "reason": "TestReason",
    "message": "test message: %s",
    "args": [
      "a1"
    ]
  }

URL

The event URL is by default: ${CALLBACK_SERVICE_EVENT_URL}

Examples

test-queries.http