Author: rhythmv

Description:
Use Embulk and Digdag to load CSV to PostgreSQL. Prepare the data and run SQL queries
Repository: git://github.com/rhythmv/embulk-digdag.git
Created: 2019-07-05T17:01:14Z
Project page: https://github.com/rhythmv/embulk-digdag


embulk-digdag

Use Embulk and Digdag to load CSV to PostgreSQL. Prepare the data and run SQL queries

Prerequisites for this assignment (Linux environment)

  1. Java 8 (set the Java path)
  2. Install PostgreSQL
  3. Install pgAdmin (keep the DB user as “postgres” and the password as “admin”)
  4. Create the database “td_coding_challenge”
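
The prerequisites above can be checked from a terminal. The following is a minimal sketch, assuming PostgreSQL listens on localhost and that the `java` and `psql` commands are the ones you intend to use; the helper names `check_tool` and `create_db` are made up for this illustration.

```shell
#!/bin/sh
# Database name and user come from the prerequisites list;
# the host (localhost) is an assumption.
DB_NAME="td_coding_challenge"
DB_USER="postgres"

# Report whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Create the assignment database with psql (prompts for the password).
# Defined here but not called automatically.
create_db() {
  psql -h localhost -U "$DB_USER" -c "CREATE DATABASE $DB_NAME;"
}

check_tool java
check_tool psql
```

If both tools are reported as found, calling `create_db` once is an alternative to creating the database through pgAdmin.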

*Note: Run all commands as superuser

  1. Install Embulk (use the following commands)
    1. $ curl --create-dirs -o ~/.embulk/bin/embulk -L "https://dl.embulk.org/embulk-latest.jar"
    2. $ chmod +x ~/.embulk/bin/embulk
    3. $ echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc
    4. $ source ~/.bashrc
  2. Install the PostgreSQL JDBC input plugin for Embulk
    1. $ embulk gem install embulk-input-postgresql
  3. Install the PostgreSQL JDBC output plugin for Embulk
    1. $ embulk gem install embulk-output-postgresql
  4. Install Digdag (use the following commands)
    1. $ curl -o ~/bin/digdag --create-dirs -L "https://dl.digdag.io/digdag-latest"
    2. $ chmod +x ~/bin/digdag
    3. $ echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc

*Note: The Embulk and Digdag commands can be tested using their respective examples:

  1. For Embulk: https://github.com/embulk/embulk#linux--mac--bsd
  2. For Digdag: http://docs.digdag.io/getting_started.html#downloading-the-latest-version
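
As a rough sketch of what those quick-start examples look like, the two functions below wrap the commands the linked pages described at the time this README was written. The directory names `try1` and `mydag` follow those guides; the function names are made up for this illustration, and newer Embulk releases may no longer ship every subcommand.

```shell
#!/bin/sh
# Embulk quick start: generate sample data, guess a config, preview, then run.
embulk_smoke_test() {
  embulk example ./try1 &&                       # sample CSV plus seed.yml
  embulk guess ./try1/seed.yml -o config.yml &&  # infer the CSV schema
  embulk preview config.yml &&                   # dry run: show parsed rows
  embulk run config.yml                          # run the sample load
}

# Digdag quick start: generate a sample workflow and run it.
digdag_smoke_test() {
  digdag init mydag &&
  ( cd mydag && digdag run mydag.dig )           # subshell keeps the caller's directory
}
```

Call `embulk_smoke_test` or `digdag_smoke_test` from a scratch directory once the corresponding tool is installed; nothing runs until a function is called.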

*Note: Keep all the CSV, Embulk, and Digdag files in one folder; otherwise, provide their paths.
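
To illustrate why a single folder is convenient, here is a minimal sketch that writes a sample CSV and an Embulk config side by side, so `path_prefix` can stay relative. The folder `embulk_sketch`, the file names, the table `customers`, and its columns are all hypothetical and are not taken from the repository; the connection values come from the prerequisites.

```shell
#!/bin/sh
# Sketch only: folder, file names, table, and columns are hypothetical.
WORK_DIR="${WORK_DIR:-./embulk_sketch}"
mkdir -p "$WORK_DIR"

# A tiny sample CSV stored next to the config.
cat > "$WORK_DIR/customers.csv" <<'EOF'
id,name
1,alice
2,bob
EOF

# Embulk config in the same folder. path_prefix is resolved relative to
# the directory embulk is run from, so run embulk inside $WORK_DIR.
cat > "$WORK_DIR/load_customers.yml" <<'EOF'
in:
  type: file
  path_prefix: ./customers
  parser:
    type: csv
    skip_header_lines: 1
    columns:
      - {name: id, type: long}
      - {name: name, type: string}
out:
  type: postgresql
  host: localhost
  user: postgres
  password: admin
  database: td_coding_challenge
  table: customers
  mode: insert
EOF

echo "wrote $WORK_DIR/load_customers.yml"
```

From inside that folder, `embulk preview load_customers.yml` shows the parsed rows and `embulk run load_customers.yml` performs the load. If the CSV lives elsewhere, `path_prefix` has to carry the full path instead.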

Run the following commands to get the results:

  1. $ sudo -s (to get superuser privileges)
  2. $ digdag secrets --local --set pg.password=admin (set the secret key)
  3. $ digdag run tdcc.dig --rerun -O log/task (run the workflow to get the results and generate event logs)
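
The contents of `tdcc.dig` are not shown here. As a rough idea of how such a workflow can tie Embulk and PostgreSQL together, the sketch below writes a hypothetical workflow file. It is not the repository's `tdcc.dig`: the task names, the Embulk config name `load_customers.yml`, and the SQL file are assumptions. It relies on Digdag's `sh>` and `pg>` operators, and `pg>` reads the `pg.password` secret set in step 2.

```shell
#!/bin/sh
# Sketch only: NOT the repository's tdcc.dig. Task names, the Embulk config
# name, and the SQL file are hypothetical.
SKETCH_DIR="${SKETCH_DIR:-./digdag_sketch}"
mkdir -p "$SKETCH_DIR/queries"

cat > "$SKETCH_DIR/tdcc_sketch.dig" <<'EOF'
timezone: UTC

_export:
  pg:
    host: localhost
    port: 5432
    user: postgres
    database: td_coding_challenge

# Step 1: load the CSV with Embulk through the sh> operator.
+load_csv:
  sh>: embulk run load_customers.yml

# Step 2: run a SQL file with the pg> operator (uses the pg.password secret).
+run_query:
  pg>: queries/count_rows.sql
EOF

cat > "$SKETCH_DIR/queries/count_rows.sql" <<'EOF'
SELECT COUNT(*) FROM customers;
EOF

echo "wrote $SKETCH_DIR/tdcc_sketch.dig"
```

Such a file would be run the same way as in step 3, for example `digdag run tdcc_sketch.dig --rerun -O log/task` from inside the sketch folder, after setting the secret there.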