Fasta files for InpactorDB: A Plant classified lineage-level LTR retrotransposon reference library for free-alignment methods based on Machine Learning

Romain Guyot, Simon Orozco-Arias & Gustavo Isaza
Here, we present InpactorDB, a semi-curated dataset composed of 130,511 LTR retrotransposon elements from 195 plant genomes belonging to 108 plant species, classified down to the lineage level. This dataset has been used to train two deep neural networks (one fully connected and one convolutional) for fast classification of elements. In lineage-level classification approaches, we obtain F1-score, precision, and recall above 98%. In order to classify elements of the ‘LTR_STRUC’ and ‘EDTA’ datasets,...