Project author: gadiener

Project description:
Makes Kubernetes pod logs visible in the Airflow UI at runtime on GKE.
Language: Python
Repository: git://github.com/gadiener/bigquery-airflow-logger.git
Created: 2020-04-03T11:23:37Z
Project page: https://github.com/gadiener/bigquery-airflow-logger

License: MIT License

BigQuery logger handler for Airflow

Installation

pip install airflow-bigquerylogger

Configuration

  1. AIRFLOW__CORE__REMOTE_LOGGING='true'
  2. AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER='gs://bucket/path'
  3. AIRFLOW__CORE__REMOTE_LOG_CONN_ID='gcs_log'
  4. AIRFLOW__CORE__LOGGING_CONFIG_CLASS='bigquerylogger.config.LOGGING_CLASS'
  5. AIRFLOW__CORE__LOG_BIGQUERY_DATASET='dataset.table'
  6. AIRFLOW__CORE__LOG_BIGQUERY_LIMIT=50
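A missing variable is easy to overlook when these settings are spread across deployment manifests, so a small pre-flight check can help. The following is a minimal sketch and not part of the package: the helper names `missing_settings` and `split_dataset_table` are hypothetical, and it assumes that all six variables listed above are required and that the dataset setting always has the form `dataset.table`.

```python
import os

# Variable names copied from the configuration list above.
# Treating all six as required is an assumption of this sketch.
REQUIRED_SETTINGS = (
    "AIRFLOW__CORE__REMOTE_LOGGING",
    "AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER",
    "AIRFLOW__CORE__REMOTE_LOG_CONN_ID",
    "AIRFLOW__CORE__LOGGING_CONFIG_CLASS",
    "AIRFLOW__CORE__LOG_BIGQUERY_DATASET",
    "AIRFLOW__CORE__LOG_BIGQUERY_LIMIT",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


def split_dataset_table(value):
    """Split a 'dataset.table' value into (dataset, table)."""
    dataset, sep, table = value.partition(".")
    if not sep or not dataset or not table:
        raise ValueError(f"expected 'dataset.table', got {value!r}")
    return dataset, table


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All logger settings are present.")
```

Running the script in the same environment as the Airflow scheduler or worker prints any variable that still needs to be set.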

Google Cloud BigQuery

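Rows that were recently written to a table via streaming (using the tabledata.insertAll method) cannot be modified using UPDATE, DELETE, or MERGE statements. Because old log rows cannot always be cleaned up with DML, I recommend setting up a retention policy on the table so that they expire automatically.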
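One way to apply the retention recommendation, if the log table is partitioned, is BigQuery's `partition_expiration_days` table option. The sketch below only builds the DDL statement as a string; the function name `partition_expiration_ddl` is hypothetical, and whether the log table is partitioned at all is an assumption that the project does not confirm. For an unpartitioned table, a dataset-level default table expiration would be the closer equivalent.

```python
def partition_expiration_ddl(dataset_table, days):
    """Build a BigQuery DDL statement that expires partitions after `days` days.

    Assumes `dataset_table` is a 'dataset.table' name and that the table
    is partitioned; this helper is illustrative, not part of the package.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    if "`" in dataset_table:
        raise ValueError("table name must not contain backticks")
    return (
        f"ALTER TABLE `{dataset_table}` "
        f"SET OPTIONS (partition_expiration_days = {days})"
    )


if __name__ == "__main__":
    # Prints a statement that can be run in the BigQuery console or with `bq query`.
    print(partition_expiration_ddl("dataset.table", 30))
```

The statement uses the same `dataset.table` value as `AIRFLOW__CORE__LOG_BIGQUERY_DATASET`, so both can be kept in one place.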

Credits

Thanks to the Bluecore engineering team for this useful article.