Project author: afritzler

Project description:
Install NVIDIA GPU Support on CoreOS based Kubernetes Clusters
Language:
Repository: git://github.com/afritzler/kubernetes-gpu.git
Created: 2018-05-15T08:08:02Z
Project community: https://github.com/afritzler/kubernetes-gpu

License: Apache License 2.0


kubernetes-gpu

Install NVIDIA GPU Support on CoreOS based Kubernetes Cluster

Prerequisites

  • CoreOS based Kubernetes cluster with GPU nodes (e.g. AWS P2 instances)

Installation

First, install the NVIDIA driver via this DaemonSet:

  kubectl apply -f https://raw.githubusercontent.com/afritzler/kubernetes-gpu/master/k8s-nvidia-driver.yaml

Wait until the init container has finished on each node, then install the device plugin:

  kubectl apply -f https://raw.githubusercontent.com/afritzler/kubernetes-gpu/master/k8s-nvidia-deviceplugin.yaml
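
The wait between the two installation steps can be scripted rather than checked by hand. The following is a minimal sketch, not part of the project: the function names are hypothetical, and the label selector and namespace of the driver pods are assumptions that need to be looked up in k8s-nvidia-driver.yaml. It also assumes one init container per driver pod, as the text above suggests.

```shell
# Hypothetical helper: reads "pod<TAB>reason" lines on stdin and succeeds only
# if at least one pod is listed and every init container reports "Completed".
all_init_completed() {
  awk -F '\t' '
    NF { n++; if ($2 != "Completed") bad++ }
    END { if (n > 0 && bad == 0) exit 0; exit 1 }'
}

# Prints each pod matching a label selector together with the termination
# reason of its init container. The selector ($1) and namespace ($2) are
# assumptions; take the real values from k8s-nvidia-driver.yaml.
driver_init_status() {
  selector="$1"
  namespace="${2:-default}"
  kubectl get pods -n "$namespace" -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.initContainerStatuses[*].state.terminated.reason}{"\n"}{end}'
}

# Polls every 10 seconds until all matching pods have finished their init container.
wait_for_driver_init() {
  until driver_init_status "$@" | all_init_completed; do
    sleep 10
  done
}
```

A call such as wait_for_driver_init "<label-selector>" "<namespace>" would be placed between the two kubectl apply commands; both arguments are placeholders here.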

Run

To run an example training job on a GPU node, first start a base image that includes TensorFlow with GPU support and Keras:

  kubectl apply -f https://raw.githubusercontent.com/afritzler/deeplearning-workbench/master/manifests/dl-workbench.yaml

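Now exec into the container and start an example Keras training run. The pod name below is only an example; substitute the name reported by kubectl get pods for your cluster: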

  kubectl exec -it deeplearning-workbench-8676458f5d-p4d2v -- /bin/bash
  cd /keras/example
  python imdb_cnn.py
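
Because the suffix of the pod name is generated, it differs on every cluster. As a minimal sketch, the name can be looked up instead of copied by hand. The helper name first_pod_with_prefix is hypothetical, and the sketch assumes the workbench pod runs in the current namespace and that its name starts with deeplearning-workbench, as in the example above.

```shell
# Hypothetical helper: prints the name of the first pod in the current
# namespace whose name starts with "<prefix>-".
first_pod_with_prefix() {
  prefix="$1"
  # "kubectl get pods -o name" prints entries of the form "pod/<name>".
  kubectl get pods -o name | sed 's|^pod/||' | grep "^${prefix}-" | head -n 1
}
```

It could then be used as kubectl exec -it "$(first_pod_with_prefix deeplearning-workbench)" -- /bin/bash.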

Open Issues

  • Label GPU nodes and add NodeSelector to daemonset
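
One possible way to approach this open issue is sketched below. It is not the project's implementation: the helper names, the label key and value, and the DaemonSet name are all hypothetical, and the real DaemonSet names have to be taken from the manifests. The sketch labels the given nodes and then merges a nodeSelector into the DaemonSet's pod template.

```shell
# Hypothetical helper: builds a JSON merge patch that adds a nodeSelector
# with the given label key and value to a DaemonSet's pod template.
node_selector_patch() {
  key="$1"
  value="$2"
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}' "$key" "$value"
}

# Hypothetical helper: labels every listed node, then pins the DaemonSet to
# nodes carrying that label. Add "-n <namespace>" to the patch command if the
# DaemonSet does not live in the current namespace.
label_and_pin() {
  daemonset="$1"
  key="$2"
  value="$3"
  shift 3
  for node in "$@"; do
    kubectl label nodes "$node" "${key}=${value}" --overwrite
  done
  kubectl patch daemonset "$daemonset" --type merge \
    -p "$(node_selector_patch "$key" "$value")"
}
```

For example, label_and_pin "<daemonset-name>" gpu true "<node-1>" "<node-2>" would label two nodes with gpu=true and restrict the DaemonSet to them; every name in that call is a placeholder.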

Acknowledgments & References