Where's my semantics?


Possible supervisor: Heiner Stuckenschmidt

Project description

On the Semantic Web, content-related metadata encoded in RDF promises to improve classical information retrieval methods by making the properties of resources and the relations between them explicit. With an increasing number of RDF files available on the web, however, we seem to be running into another, possibly harder retrieval problem: before we can use an RDF description to get more information about a resource, we first have to find that description. The problem is that RDF files refer to resources and not the other way around. This means that we can find a resource if we have the RDF file, but we cannot find the RDF data if we only know the resource. While some work on retrieving XML documents exists (the HySpirit system developed at Queen Mary College in London is one example), the retrieval of RDF has not been addressed so far.
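
To make the direction problem concrete, the following minimal sketch builds the missing reverse mapping, from a resource URI to the RDF files that mention it. It assumes the files are serialized as RDF/XML and are already collected locally; the function name, the file names and the example.org URIs are hypothetical and chosen only for illustration. It is not how HySpirit works, only a plain inverted index.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_resource_index(documents):
    """Map each resource URI to the set of document names that mention it.

    `documents` is a dict {document_name: RDF/XML string} (an assumption of
    this sketch; real files would first have to be found and fetched).
    """
    index = defaultdict(set)
    for name, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for element in root.iter():
            # rdf:about names a described subject, rdf:resource names an object.
            for attr in ("about", "resource"):
                uri = element.get(f"{{{RDF_NS}}}{attr}")
                if uri is not None:
                    index[uri].add(name)
    return index


# Hypothetical sample data: two RDF/XML files that refer to the same page.
DOCS = {
    "a.rdf": """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                         xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.org/page1">
        <dc:title>Page one</dc:title>
      </rdf:Description>
    </rdf:RDF>""",
    "b.rdf": """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                         xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.org/page2">
        <dc:relation rdf:resource="http://example.org/page1"/>
      </rdf:Description>
    </rdf:RDF>""",
}

index = build_resource_index(DOCS)
print(sorted(index["http://example.org/page1"]))
```

Such an index only answers exact URI lookups. Retrieving descriptions of "a certain aspect" of a resource, as the project intends, needs ranking and structure-aware matching on top of it.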

The task in this project is to apply and extend existing methods for XML retrieval so that they retrieve RDF documents that describe a certain aspect of a given resource. As RDF models serialized in RDF/XML are also well-formed XML documents, a first approach can be to use the HySpirit system as it is and evaluate its performance on RDF data. In a second step, ways to improve the retrieval performance should be explored. These may include:

  • Adapting the retrieval mechanism to the RDF data model
  • Using the semantics encoded in schema files in the retrieval process
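
One possible reading of the second point, sketched under stated assumptions: if a schema declares subclass relations (as rdfs:subClassOf does in RDF Schema), a query for descriptions of one class could also accept descriptions typed with any of its subclasses. The class names, file names and the tuple-based input format below are hypothetical, and the schema and type statements are assumed to have been extracted already.

```python
from collections import defaultdict


def subclass_closure(subclass_pairs, target):
    """Return `target` plus all of its direct and indirect subclasses.

    `subclass_pairs` is an iterable of (subclass, superclass) tuples.
    """
    children = defaultdict(set)
    for sub, sup in subclass_pairs:
        children[sup].add(sub)
    result, stack = {target}, [target]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in result:  # also guards against cycles in the schema
                result.add(child)
                stack.append(child)
    return result


def matching_documents(typed_descriptions, subclass_pairs, wanted_class):
    """Return documents describing a resource typed with `wanted_class` or a subclass.

    `typed_descriptions` is an iterable of (document, resource, class) tuples.
    """
    accepted = subclass_closure(subclass_pairs, wanted_class)
    return {doc for doc, _resource, cls in typed_descriptions if cls in accepted}


# Hypothetical schema and type statements.
SUBCLASS_PAIRS = [
    ("ex:PhDThesis", "ex:Thesis"),
    ("ex:Thesis", "ex:Publication"),
]
TYPED = [
    ("a.rdf", "http://example.org/t1", "ex:PhDThesis"),
    ("b.rdf", "http://example.org/p1", "ex:Publication"),
    ("c.rdf", "http://example.org/h1", "ex:Person"),
]

print(sorted(matching_documents(TYPED, SUBCLASS_PAIRS, "ex:Publication")))
```

Whether such query expansion actually improves retrieval performance is exactly the kind of question the experiments in this project would have to answer.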

Finally, based on the experience gained in the experiments and in extending the retrieval mechanisms, ways of extending the methods further to the more complex semantic models encoded in the Web Ontology Language (OWL) should be discussed.

The candidate should have some knowledge of semantic web languages, i.e. XML, RDF and RDF Schema. Basic knowledge of information retrieval methods is useful. As parts of the project will require the extension of existing software, programming skills are advisable.