David R Newman

PhD Work: Mini-Thesis Report

A copy of my Mini-Thesis Report can be downloaded here. Below is the abstract for this report.

How Semantic Web Technologies Might Improve Natural Language Querying Systems

Natural Language (NL) querying has an extensive history; the earliest systems were developed back in the 1960s. This report reviews a number of the systems developed between then and the present, to gain an understanding of how they work and what their strengths and weaknesses are. By comparison, Semantic Web (SW) technologies are a much more recent invention. These technologies provide mechanisms for storing and representing data in an innovative way, which may be of benefit to NL querying systems.

This report looks at the CombeChem project, both as an example of how SW technologies can be used and because it is a domain suitable for a NL querying system. CombeChem is, however, a complex domain, so an analogous domain based on music albums has been developed. Using these two domains as examples, this report defines a conceptual framework for a NL querying system that uses SW technologies. It then describes the parts of this framework that have already been implemented, using screenshots to illustrate where applicable.

The report concludes by discussing the benefits of using SW technologies that have already been found, such as the ability to structure the data the NL querying system requires more logically and to pull in data from disparate sources. It considers how SW technologies and NL querying share similar goals, i.e. both use machine-readable knowledge to produce a human-readable output. However, a NL querying system goes further by mirroring this process: it takes a NL (human-understandable) input and represents it in a machine-readable format. Finally, this report sets out the work that is still to be done, using a rough timeline to illustrate the key milestones of this work.

Page written by David R Newman (drn[at]ecs.soton.ac.uk). Last updated October 13 2009 18:09:38.