Project author: himewel

Project description:
Infrastructure of Airflow with Celery workers built with Terraform resources based on GCP Compute Engine VM instances
Language: HCL
Project address: git://github.com/himewel/gcp-terraform-airflow.git
Created: 2020-12-30T03:00:35Z
Project community: https://github.com/himewel/gcp-terraform-airflow

License:



Terraform build of Airflow in GCP Compute Engine


Docker
Apache Airflow
Celery
Google Cloud
Terraform

The infrastructure constructed with this project consists of a set of VM instances running the Airflow services. Three main instances are built:

  1. a proxy running Nginx and redirecting access to the Webserver UI and Flower UI;
  2. the Docker containers of the Postgres and Redis (used as the Celery broker) databases, together with an instance of the official Apache Airflow image running the Scheduler service;
  3. another Docker container from Apache Airflow running the Webserver UI and Flower UI.

Besides these VM instances, each instantiated worker takes a new VM instance running the Airflow Docker image as a Celery worker. To connect the VM instances, a VPC and some firewall rules are set up to enable communication on ports 80 (http), 21 (ssh), 8080 (webserver), 5555 (flower), 5432 (postgres), 6379 (redis) and 8793 (worker logs).
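
As a rough illustration of how such a rule can be declared, the sketch below opens those ports with a google_compute_firewall resource; the network name, rule name and source range are assumptions for the example, not the project's actual resources:

    # Hypothetical VPC; the project's actual network name may differ
    resource "google_compute_network" "airflow" {
      name                    = "airflow-network"
      auto_create_subnetworks = true
    }

    # Allow the ports listed above; the source range is only an example
    resource "google_compute_firewall" "airflow_internal" {
      name    = "airflow-internal"
      network = google_compute_network.airflow.name

      allow {
        protocol = "tcp"
        ports    = ["80", "21", "8080", "5555", "5432", "6379", "8793"]
      }

      source_ranges = ["10.0.0.0/8"]
    }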

How to use

An example of terraform.tfvars is presented in the following code block. The GCP credentials need to be provided by gcloud auth login (you can use make gcloud for that). Besides these values, you can set the login data of the UIs, the number of workers, the OS image of the VMs, and the machine types and sizes:

    number_of_workers = 4
    webserver = {
      username  = "admin"
      password  = "admin"
      firstname = "Welbert"
      lastname  = "Castro"
      email     = "welberthime@hotmail.com"
      role      = "Admin"
    }
    flower = {
      username = "admin"
      password = "admin"
    }
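
For reference, a variables.tf able to receive this terraform.tfvars could declare the values roughly as in the sketch below; the variable names follow the example above, but the actual module may declare them differently (for instance, with extra fields for OS image and machine type):

    # Number of Celery worker VM instances to create
    variable "number_of_workers" {
      type    = number
      default = 1
    }

    # Login data for the Airflow Webserver UI
    variable "webserver" {
      type = object({
        username  = string
        password  = string
        firstname = string
        lastname  = string
        email     = string
        role      = string
      })
    }

    # Login data for the Flower UI
    variable "flower" {
      type = object({
        username = string
        password = string
      })
    }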

To build this project, use Terraform to set up the providers and modules and then apply the configured infrastructure:

    # build docker container
    make build
    # start terraform daemon
    make start
    # enter into the shell container
    make shell
    # terraform workflow
    terraform init apache-airflow
    terraform plan apache-airflow
    terraform apply apache-airflow
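
The provider setup resolved by terraform init usually boils down to a google provider block like the sketch below; the project and region values are placeholders, not the repository's actual configuration:

    # Placeholder project and region; gcloud auth login supplies the credentials
    provider "google" {
      project = "my-gcp-project"
      region  = "us-central1"
      zone    = "us-central1-a"
    }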

Then, wait a few moments and open in the browser the IP address output as proxy_external_ip. The Airflow Webserver UI can be found at the root of that address and Flower can be accessed at the /flower location. The Terraform output should look like this:

    Outputs:

    proxy_external_ip = <EXTERNAL IP>
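
That value would come from an output block along the lines of the sketch below; the resource name proxy is an assumption about the module's internals:

    # Expose the public IP of the Nginx proxy instance
    output "proxy_external_ip" {
      value = google_compute_instance.proxy.network_interface[0].access_config[0].nat_ip
    }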