Python library for importing OCR data from various formats into the canonical JSON format defined by the Impresso project.