DogmatiX Tracks down Duplicates in XML

Melanie Weis & Felix Naumann
Duplicate detection is the problem of identifying different entries in a data source that represent the same real-world entity. While research abounds on duplicate detection in relational data, there is as yet little work on duplicates in other, more complex data models, such as XML. In this paper, we present a generalized framework for duplicate detection, dividing the problem into three components: candidate definition defining which objects are to be compared, duplicate definition defining...