Web crawler in Python
A web crawler written in Python. Its purpose is to obtain information from guatecompras.gt and generate CSV files from it, so that users have data suitable for both machine and human analysis.
To install the dependencies, run:
pip install -r requirements.txt
Open a command terminal inside the folder where you cloned the project and run the script. Its usage is:
EntryPoint.py [-h] [-y YEAR] [-m MONTH] [-d DAY]
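For readers who want to see how a usage line like the one above is typically produced, here is a minimal argparse sketch. It is an assumption about how the options could be declared, not the actual contents of EntryPoint.py; the function name `build_parser`, the help strings, and the integer types are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser: mirrors the usage line, not the project's real code.
    parser = argparse.ArgumentParser(prog="EntryPoint.py")
    parser.add_argument("-y", dest="year", metavar="YEAR", type=int,
                        help="year to crawl (assumed to be an integer)")
    parser.add_argument("-m", dest="month", metavar="MONTH", type=int,
                        help="month to crawl, 1-12 (assumed)")
    parser.add_argument("-d", dest="day", metavar="DAY", type=int,
                        help="day of the month to crawl (assumed)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(["-y", "2016", "-m", "1"])
print(args.year, args.month, args.day)  # prints: 2016 1 None
```

Options that are not passed are left as None, which is one simple way for a script to tell "not specified" apart from a real value.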
$ python EntryPoint.py -y 2016
Running the command above obtains data for the whole of 2016.
$ python EntryPoint.py -y 2016 -m 1
Running the command above obtains data for January 2016.
$ python EntryPoint.py -y 2016 -m 1 -d 1
Running the command above obtains data for January 1st, 2016. To obtain today's data, pass no parameters to the script, as in the example below:
$ python EntryPoint.py
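The four invocations above can be read as progressively narrowing a date range. The sketch below illustrates that reading only; `resolve_range` is a hypothetical helper, not a function from this project, and its handling of a month or day given without a year (raising an error) is an assumption.

```python
import calendar
from datetime import date


def resolve_range(year=None, month=None, day=None, today=None):
    """Hypothetical helper: map -y/-m/-d values to an inclusive (start, end) range."""
    today = today or date.today()
    if year is None and month is None and day is None:
        # No parameters: today's data only.
        return today, today
    if year is None or (month is None and day is not None):
        # Assumption: this sketch rejects combinations the examples do not cover.
        raise ValueError("unsupported combination of year, month, and day")
    if month is None:
        # Year only: the whole year.
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # Year and month: the whole month; monthrange gives the last day.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    # Year, month, and day: a single date.
    single = date(year, month, day)
    return single, single
```

For example, `resolve_range(2016, 1)` covers January 1st through January 31st, 2016, matching the second command above.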
After execution, the program outputs three CSV files:
adjudicaciones.csv, compradores.csv, and proveedores.csv. The column delimiter is '|' and the encoding is UTF-8.
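Because the delimiter is not a comma, most tools need to be told about it explicitly. As a minimal sketch, the files can be loaded with Python's standard csv module as shown below. The helper name `read_rows` is hypothetical, and the sketch makes no assumption about whether the first row is a header.

```python
import csv


def read_rows(path):
    """Read a '|'-delimited, UTF-8 encoded CSV file into a list of rows."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="|")
        return list(reader)
```

Calling `read_rows("adjudicaciones.csv")` after a crawl returns each line of the file as a list of column values.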