Project author: evdubs

Project description:
ETL for the SPDR ETF holdings XLS documents
Language: Racket
Project URL: git://github.com/evdubs/spdr-etf-holdings.git
Created: 2017-12-06T08:06:47Z
Project community: https://github.com/evdubs/spdr-etf-holdings

License: Mozilla Public License 2.0


spdr-etf-holdings

These Racket programs will download the SPDR ETF holdings XLS documents and insert the holding data into a PostgreSQL database.
The intended usage on Windows with Microsoft Excel is:

  $ racket extract.rkt
  $ racket transform-load-com.rkt

On other platforms, you will need a separate tool for the XLS->CSV conversion. With libreoffice, for example, the steps look like the following, where date is the folder date:

  $ racket extract.rkt
  $ for f in /var/local/spdr/etf-holdings/date/* ; do libreoffice --headless --convert-to csv --outdir /var/local/spdr/etf-holdings/date "$f" ; done
  $ racket transform-load-csv.rkt
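
The conversion loop above can be made safe to rerun by skipping files that already have a CSV beside them. The following is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes the downloaded files end in .xls (adjust the pattern if yours differ). It relies on libreoffice writing each CSV to the output folder under the same base name.

```shell
# Sketch: list spreadsheets in a dated folder that have no CSV next to them yet.
# Assumption: downloaded files end in .xls.
pending_xls() (
  dir="$1"
  for f in "$dir"/*.xls; do
    [ -e "$f" ] || continue                 # the glob matched nothing
    [ -e "${f%.xls}.csv" ] || printf '%s\n' "$f"
  done
)

# Convert only the pending files; fail early if libreoffice is missing.
convert_pending() (
  dir="$1"
  command -v libreoffice >/dev/null 2>&1 || { echo "libreoffice not found" >&2; exit 1; }
  pending_xls "$dir" | while IFS= read -r f; do
    libreoffice --headless --convert-to csv --outdir "$dir" "$f"
  done
)
```

After sourcing these functions, `convert_pending /var/local/spdr/etf-holdings/date` would convert only the files that still lack a CSV.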

If you have libreoffice installed, you can skip the manual conversion: transform-load-csv.rkt can run the XLS->CSV conversion itself when given the -c switch:

  $ racket extract.rkt
  $ racket transform-load-csv.rkt -c

You will need to provide a database password for the transform-load-*.rkt programs. The available parameters are:

  $ racket transform-load-csv.2019-11-02.rkt -h
  racket transform-load-csv.2019-11-02.rkt [ <option> ... ]
   where <option> is one of
    -b <folder>, --base-folder <folder> : SPDR ETF Holdings base folder. Defaults to /var/local/spdr/etf-holdings
    -c, --convert-xls : Convert XLS documents to CSV for handling. This requires libreoffice to be installed
    -d <date>, --folder-date <date> : SPDR ETF Holdings folder date. Defaults to today
    -n <name>, --db-name <name> : Database name. Defaults to 'local'
    -p <password>, --db-pass <password> : Database password
    -u <user>, --db-user <user> : Database user name. Defaults to 'user'
    --help, -h : Show this help
    -- : Do not treat any remaining argument as a switch (at this level)
   Multiple single-letter switches can be combined after one `-`. For
    example: `-h-` is the same as `-h --`
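
As an illustration of how these options fit together, the sketch below wraps one invocation with every documented option spelled out and reads the password from an environment variable rather than typing it inline. The function name and the SPDR_DB_PASS and RACKET variables are hypothetical, and the YYYY-MM-DD form of the folder date is an assumption, since the help text does not state the expected format.

```shell
# Sketch: run a transform-load script with the documented long options.
# Assumptions (not from the project): the SPDR_DB_PASS and RACKET variable
# names, and the YYYY-MM-DD form of the folder date.
run_transform_load() (
  script="$1"
  folder_date="$2"
  # Abort with a message if the password variable is unset or empty.
  : "${SPDR_DB_PASS:?set SPDR_DB_PASS before running}"
  # RACKET can be overridden (for example with echo) to preview the command.
  "${RACKET:-racket}" "$script" \
    --base-folder /var/local/spdr/etf-holdings \
    --folder-date "$folder_date" \
    --db-name local \
    --db-user user \
    --db-pass "$SPDR_DB_PASS"
)
```

Setting `RACKET=echo` turns a call such as `run_transform_load transform-load-csv.rkt 2019-11-04` into a dry run that prints the arguments instead of launching Racket.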

The provided schema.sql file shows the expected schema within the target PostgreSQL instance.
This process assumes you can write to a /var/local/spdr folder. It also assumes your database is already loaded with NASDAQ symbol file information, which the nasdaq-symbols project provides.
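
The writable-folder assumption can be checked before the first run. This is a minimal sketch with a hypothetical function name; it only tests the folder and does not touch the database.

```shell
# Sketch: make sure the base folder exists (creating it if needed) and is writable.
check_base_folder() (
  base="${1:-/var/local/spdr}"
  mkdir -p "$base" 2>/dev/null || { echo "cannot create $base" >&2; exit 1; }
  [ -w "$base" ] || { echo "$base is not writable" >&2; exit 1; }
  echo "$base is writable"
)
```

Calling `check_base_folder` with no argument checks /var/local/spdr; a non-zero exit status means the folder has to be created or its permissions changed first.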

Dependencies

It is recommended that you start with the standard Racket distribution. With that, you will need to install the following packages:

  $ raco pkg install --skip-installed gregor http-easy tasks threading

Format and URL updates

On 2020-01-01, the URL for SPDR ETF documents changed; extract.2020-01-01.rkt uses this new location.

On 2019-11-02, columns were added to the SPDR ETF documents; transform-load.2019-11-02.rkt can process these new columns.