GROTOAP2

Dominika Tkaczyk, Paweł Szostek & Łukasz Bolikowski
GROTOAP2 (GROund Truth for Open Access Publications) is a dataset for training and evaluating document content analysis tasks, such as document zone classification. GROTOAP2 is the successor to the GROTOAP dataset.

GROTOAP2 was built automatically from the PubMed Central Open Access Subset. It contains 13,210 ground truth files that store the geometrical and logical structure of the articles' content. The corresponding PDF files can be downloaded from the [PMC repository](http://europepmc.org/) using the provided script.

This repository contains a sample of a...