Multi-modal document, image, and text datasets and models for document understanding, OCR, and VQA tasks.

GitHub repos: