Jennifer Sleeman,
Online Unsupervised Coreference Resolution for Semi-Structured Heterogeneous Data,
in Proceedings of the 11th International Semantic Web Conference (ISWC 2012),
Boston, US,
November,
2012.
Abstract:
Two RDF instances are said to corefer when they are intended to denote the same thing in the world, for example, when two nodes of type foaf:Person describe the same individual. Resolving coreference is central to integrating and interlinking semi-structured datasets. We are developing an online, unsupervised coreference resolution framework for heterogeneous, semi-structured data. The online aspect requires us to process new instances as they appear rather than as a batch. The instances are heterogeneous in that they may contain terms from different ontologies whose alignments are not known in advance. Our framework encompasses a two-phase clustering algorithm that is both flexible and distributable, a probabilistic multidimensional attribute model that will support robust schema mappings, and a consolidation algorithm that merges coreferent instances to improve accuracy over time by addressing data sparseness.
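To make the "online" requirement concrete, the following is a minimal sketch of incremental clustering in which each instance is assigned to a cluster as soon as it arrives. It is not the paper's two-phase algorithm or its attribute model: the Jaccard similarity over (property, value) pairs, the 0.5 threshold, and the names `jaccard` and `OnlineCoreferenceClusters` are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: an instance is a set of (property, value) pairs.
Attribute = Tuple[str, str]


def jaccard(a: Set[Attribute], b: Set[Attribute]) -> float:
    """Jaccard overlap of two attribute sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class OnlineCoreferenceClusters:
    """Assigns each instance to a cluster on arrival, without batch reprocessing."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold                    # assumed cut-off
        self.clusters: List[Set[Attribute]] = []      # merged attributes per cluster
        self.members: List[List[str]] = []            # instance ids per cluster

    def add(self, instance_id: str, attributes: Iterable[Attribute]) -> int:
        """Place one new instance and return the index of its cluster."""
        attrs = set(attributes)
        best_index, best_score = -1, 0.0
        for i, cluster_attrs in enumerate(self.clusters):
            score = jaccard(attrs, cluster_attrs)
            if score > best_score:
                best_index, best_score = i, score
        if best_index >= 0 and best_score >= self.threshold:
            # Merging attributes loosely mirrors consolidation: the cluster
            # accumulates evidence, which helps when later instances are sparse.
            self.clusters[best_index] |= attrs
            self.members[best_index].append(instance_id)
            return best_index
        # No sufficiently similar cluster exists, so start a new one.
        self.clusters.append(attrs)
        self.members.append([instance_id])
        return len(self.clusters) - 1


resolver = OnlineCoreferenceClusters(threshold=0.5)
first = resolver.add("a", {("foaf:name", "Jane Doe"), ("foaf:mbox", "jane@example.org")})
second = resolver.add("b", {("foaf:name", "Jane Doe"), ("foaf:mbox", "jane@example.org"),
                            ("foaf:homepage", "http://example.org/jane")})
third = resolver.add("c", {("foaf:name", "John Roe")})
```

In this toy run the first two instances share two of three distinct attribute pairs and land in the same cluster, while the third shares none and starts a new one. A real system would also need to handle properties drawn from unaligned ontologies, which exact matching on property names does not address.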
BibTeX:
@InProceedings { iswc2012paper-doctor-consortium-14,
author = { Jennifer Sleeman },
title = { Online Unsupervised Coreference Resolution for Semi-Structured Heterogeneous Data },
booktitle = { Proceedings of the 11th International Semantic Web Conference (ISWC 2012) },
address = { Boston, US },
month = { November },
year = { 2012 },
}