Jennifer Sleeman, Online Unsupervised Coreference Resolution for Semi-Structured Heterogeneous Data, in Proceedings of the 11th International Semantic Web Conference (ISWC 2012), Boston, US, November 2012.


Abstract:
A pair of RDF instances are said to corefer when they are intended to denote the same thing in the world, for example, when two nodes of type foaf:Person describe the same individual. This problem is central to integrating and inter-linking semi-structured datasets. We are developing an online, unsupervised coreference resolution framework for heterogeneous, semi-structured data. The online aspect requires us to process new instances as they appear and not as a batch. The instances are heterogeneous in that they may contain terms from different ontologies whose alignments are not known in advance. Our framework encompasses a two-phased clustering algorithm that is both flexible and distributable, a probabilistic multidimensional attribute model that will support robust schema mappings, and a consolidation algorithm that will be used to perform instance consolidation in order to improve accuracy rates over time by addressing data sparseness.

BibTeX:
@InProceedings { iswc2012paper-doctor-consortium-14,
  author = { Jennifer Sleeman },
  title = { Online Unsupervised Coreference Resolution for Semi-Structured Heterogeneous Data },
  booktitle = { Proceedings of the 11th International Semantic Web Conference (ISWC 2012) },
  address = { Boston, US },
  month = { November },
  year = { 2012 },
}