Multi-modal document, image, and text datasets and models for document understanding, OCR, and VQA tasks.
GitHub repos:
chug
pixparse