Project author: AdamPaternostro

Project description:
Demo of using Docker containers in Azure Batch (using Shipyard)
Project URL: git://github.com/AdamPaternostro/Azure-Docker-Shipyard.git
Created: 2017-07-06T13:54:01Z
Project community: https://github.com/AdamPaternostro/Azure-Docker-Shipyard


Azure-Docker-Shipyard

Demo of using Docker containers in Azure Batch (using Shipyard). This also shows how to mount a Docker volume using Shipyard.

Create Azure Resource Group

  • Create a Resource Group (e.g. AdamShipyardDemo)

Create Azure Batch Service (create in the resource group above)

  • Create a “Batch Service” account (e.g. adamshipyardbatchservice)
  • When creating the “Batch Service” create a storage account (e.g. adamshipyardstorage)

Create a Linux VM (create in the resource group above)

  • Create Ubuntu Server 16.04 LTS (e.g. adamshipyardvm)
    1. Username: shipyarduser Password: <<REMOVED>>
    2. Hard disk: HDD
    3. Size: D1_V2 (does not need to be powerful)
    4. Use Managed Disk: Yes
    5. Monitoring: Disabled
    6. Use the defaults for Networking

All Resources

(Screenshot: all resources in the resource group.)

Install Docker and Shipyard

ssh to the Linux computer

Install Docker

  1. sudo apt-get -y install apt-transport-https ca-certificates curl
  2. curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
  3. sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
  4. sudo apt-get update
  5. sudo apt-get -y install docker-ce

Test docker

  1. sudo docker run hello-world

Pull the Shipyard image

  1. sudo docker pull alfpark/batch-shipyard:latest-cli

Create a Docker program

  • Create a directory and change into it: mkdir docker && cd docker
  • Create a file named Dockerfile and place this in the file

      # Small base image; bash and curl are needed by download.sh
      FROM alpine
      WORKDIR /app
      RUN apk add --no-cache bash curl
      COPY download.sh /app/
      CMD ["bash", "./download.sh"]
  • Create a file named download.sh and place this in the file

      #!/bin/bash
      # Download the same 20 MB test file 10 times into the mounted /share volume.
      # Each copy is named after the current timestamp.
      for i in {1..10}
      do
        filename=$(date +"%m-%d-%y-%T").pdf
        echo "Starting Download: $filename"
        curl -o "/share/$filename" http://ipv4.download.thinkbroadband.com/20MB.zip
        echo "Downloaded: $filename"
      done
  • Build the Docker image

    1. sudo docker build -t adamshipyarddockerimage .

  • List the images
    1. sudo docker images
  • Run the image locally

    1. sudo docker run -v ~/mnt/share:/share adamshipyarddockerimage
  • Create a repository on Docker Hub (e.g. adamshipyardrepository)

  • Upload image to repository

    1. sudo docker login
    2. sudo docker tag adamshipyarddockerimage adampaternostro/adamshipyarddockerimage:latest
    3. sudo docker push adampaternostro/adamshipyarddockerimage:latest

Create Shipyard files

  • Go back to your home directory (cd ..)
  • Make a directory (e.g. config)
  • cd config

  • Create a file named “config.yaml”

    • Description: This file references the storage account that Shipyard uses for its own internal purposes. It also references our Docker image and defines a data volume, which determines where the downloaded files are written.
    • For full schema: https://github.com/Azure/batch-shipyard/blob/master/docs/12-batch-shipyard-configuration-global.md
      ```
      batch_shipyard:
        storage_account_settings: mystorageaccount
      global_resources:
        docker_images:
        - adampaternostro/adamshipyarddockerimage:latest
        docker_volumes:
          data_volumes:
            ephemeraldisk:
              host_path: "/mnt/docker-tmp"
              container_path: "/share"
      ```
  • Create a file named “credentials.yaml”

    • For full schema: https://github.com/Azure/batch-shipyard/blob/master/docs/11-batch-shipyard-configuration-credentials.md
    • Set account_key and account_service_url under batch to the values for your Batch account
    • Set account and account_key under mystorageaccount to the values for your storage account (NOTE: keep the name mystorageaccount since it is referenced in “config.yaml”)

      credentials:
        batch:
          account_key: "<<REMOVED>>"
          account_service_url: https://adamshipyardbatchservice.eastus2.batch.azure.com
        storage:
          mystorageaccount:
            account: adamshipyardstorage
            account_key: "<<REMOVED>>"
            endpoint: core.windows.net
  • Create a file named “jobs.yaml”

      job_specifications:
      - id: adamshipyardjob
        data_volumes:
        - ephemeraldisk
        tasks:
        - docker_image: adampaternostro/adamshipyarddockerimage:latest
          remove_container_after_exit: true
          command: bash /app/download.sh
  • Create a file named “pool.yaml”

    • For full schema: https://github.com/Azure/batch-shipyard/blob/master/docs/13-batch-shipyard-configuration-pool.md

      pool_specification:
        id: adampool
        vm_size: STANDARD_D1_V2
        vm_count:
          dedicated: 2
          low_priority: 0
        vm_configuration:
          platform_image:
            publisher: Canonical
            offer: UbuntuServer
            sku: 16.04-LTS
        reboot_on_start_task_failed: false
        block_until_all_global_resources_loaded: true
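
The Shipyard files refer to each other by name: config.yaml points at a storage entry in credentials.yaml, and the jobs file points at a data volume and a Docker image declared in config.yaml. The following is a minimal sketch of a cross-reference check, assuming the files have already been parsed into plain dictionaries. The function name check_references and the sample dictionaries are hypothetical, the top-level job_specifications key is assumed from the Batch Shipyard jobs schema, and whether Shipyard itself treats each mismatch as an error is not verified here.

```python
# Plain dicts standing in for the parsed YAML files (values copied from this walkthrough).
CONFIG = {
    "batch_shipyard": {"storage_account_settings": "mystorageaccount"},
    "global_resources": {
        "docker_images": ["adampaternostro/adamshipyarddockerimage:latest"],
        "docker_volumes": {
            "data_volumes": {
                "ephemeraldisk": {
                    "host_path": "/mnt/docker-tmp",
                    "container_path": "/share",
                }
            }
        },
    },
}
CREDENTIALS = {
    "credentials": {
        "storage": {"mystorageaccount": {"account": "adamshipyardstorage"}}
    }
}
JOBS = {
    "job_specifications": [
        {
            "id": "adamshipyardjob",
            "data_volumes": ["ephemeraldisk"],
            "tasks": [
                {"docker_image": "adampaternostro/adamshipyarddockerimage:latest"}
            ],
        }
    ]
}


def check_references(config, credentials, jobs):
    """Return a list of mismatches between the files; an empty list means none found."""
    problems = []

    # config.yaml names a storage entry that must exist in credentials.yaml.
    link = config["batch_shipyard"]["storage_account_settings"]
    if link not in credentials["credentials"]["storage"]:
        problems.append(f"storage link {link!r} is not defined in credentials")

    resources = config["global_resources"]
    images = set(resources.get("docker_images", []))
    volumes = set(resources.get("docker_volumes", {}).get("data_volumes", {}))

    # Each job names data volumes and images that config.yaml is expected to declare.
    for job in jobs["job_specifications"]:
        for volume in job.get("data_volumes", []):
            if volume not in volumes:
                problems.append(
                    f"job {job['id']!r}: data volume {volume!r} is not defined in config"
                )
        for task in job["tasks"]:
            if task["docker_image"] not in images:
                problems.append(
                    f"job {job['id']!r}: image {task['docker_image']!r} is not listed in config"
                )
    return problems


if __name__ == "__main__":
    for problem in check_references(CONFIG, CREDENTIALS, JOBS) or ["no mismatches found"]:
        print(problem)
```

Renaming mystorageaccount in only one of the two files is the kind of mistake this check is meant to surface before a pool is created.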

Run Shipyard

Create the batch pool to run our Docker container on

  1. sudo docker run --rm -it -v /home/shipyarduser/config:/configs -e SHIPYARD_CONFIGDIR=/configs alfpark/batch-shipyard:latest-cli pool add

![Create pools 1](https://raw.githubusercontent.com/AdamPaternostro/Azure-Docker-Shipyard/master/images/Create-Pools-1.png)

![Create pools 2](https://raw.githubusercontent.com/AdamPaternostro/Azure-Docker-Shipyard/master/images/Create-Pools-2.png)

Run the Docker container (you can tail stdout.txt or stderr.txt)

  1. sudo docker run --rm -it -v /home/shipyarduser/config:/configs -e SHIPYARD_CONFIGDIR=/configs alfpark/batch-shipyard:latest-cli jobs add --tail stdout.txt

![Run job 1](https://raw.githubusercontent.com/AdamPaternostro/Azure-Docker-Shipyard/master/images/Run-Job-1.png)

Delete our pool

Do not do this until you are done exploring the Azure Portal. Deleting the pool is optional: you can instead scale the pool down to zero nodes when it is not in use.

  1. sudo docker run --rm -it -v /home/shipyarduser/config:/configs -e SHIPYARD_CONFIGDIR=/configs alfpark/batch-shipyard:latest-cli pool del
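
The Shipyard commands in this section share the same docker run prefix and differ only in the trailing Shipyard arguments (pool add, jobs add, pool del). A minimal sketch of a helper that assembles them is shown here; shipyard_command is a hypothetical name, and the script only prints the commands rather than running them.

```python
import shlex

# Image tag as used by the run commands in this walkthrough.
SHIPYARD_IMAGE = "alfpark/batch-shipyard:latest-cli"


def shipyard_command(config_dir, *shipyard_args, image=SHIPYARD_IMAGE):
    """Build the docker run argument list for one Shipyard CLI call."""
    return [
        "sudo", "docker", "run", "--rm", "-it",
        # Mount the local config directory where the container expects it.
        "-v", f"{config_dir}:/configs",
        "-e", "SHIPYARD_CONFIGDIR=/configs",
        image,
        *shipyard_args,
    ]


if __name__ == "__main__":
    config_dir = "/home/shipyarduser/config"
    for args in (
        ("pool", "add"),
        ("jobs", "add", "--tail", "stdout.txt"),
        ("pool", "del"),
    ):
        # Print instead of executing; the list could be passed to subprocess.run.
        print(shlex.join(shipyard_command(config_dir, *args)))
```

Building an argument list and joining it with shlex.join keeps the double hyphens in --rm and --tail intact, which is easy to lose when a command is copied from a rendered page.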


In the Azure Portal

  • You can see the pools, jobs, and output files (stdout, stderr), and you can ssh into each node.

What’s Next

  • Change the Docker image to do some processing

  • In jobs.yaml you can pass parameters to your container. Typically you stage the data to be processed in Blob storage or Azure Data Lake Storage (ADLS), download it to the Batch worker node, process it, and then upload the results. The parameter is typically the name of the blob or ADLS data to process.

  • To use this for a real workload you would need a small program that generates a jobs.yaml and then submits the job to Azure Batch

  • You can also use Azure Functions and go serverless to submit jobs through Azure Batch: https://github.com/Azure/batch-shipyard/blob/master/docs/60-batch-shipyard-site-extension.md
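
As a rough illustration of the "small program" idea above, the sketch below builds a jobs document with one task per input name and passes that name to the container as a command-line argument. The function build_jobs and the input names are hypothetical, download.sh would have to be changed to read its first argument, and the top-level job_specifications key is assumed from the Batch Shipyard jobs schema. The output is printed as JSON only because the json module is in the standard library; converting it to YAML, or checking that Shipyard accepts it as is, is left open.

```python
import json
import shlex

# Image from this walkthrough.
IMAGE = "adampaternostro/adamshipyarddockerimage:latest"


def build_jobs(job_id, blob_names, image=IMAGE, volume="ephemeraldisk"):
    """Return a jobs document with one task per blob name.

    Task keys mirror the jobs file in this walkthrough; the blob name is
    appended to the command as a single shell-quoted argument.
    """
    tasks = [
        {
            "docker_image": image,
            "remove_container_after_exit": True,
            "command": f"bash /app/download.sh {shlex.quote(name)}",
        }
        for name in blob_names
    ]
    return {
        "job_specifications": [
            {"id": job_id, "data_volumes": [volume], "tasks": tasks}
        ]
    }


if __name__ == "__main__":
    # Hypothetical input names; in practice these would come from a storage listing.
    doc = build_jobs("adamshipyardjob", ["input-001.csv", "input-002.csv"])
    print(json.dumps(doc, indent=2))
```

Generating one task per input keeps each unit of work independent, so Azure Batch can schedule the tasks across the nodes in the pool.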