Natural language query translation and expansion in information retrieval

Ibrahim, Duraid M
Query formulation and expansion have long been explored as ways to improve query effectiveness and to address the word mismatch problem in information retrieval systems. Most existing approaches are statistical in nature: they are based on the occurrence frequency of words. In this thesis, we present a new approach based on natural language processing. Given a natural language query, our approach translates it into a Boolean query that is better, in terms of retrieval effectiveness, than the original query. The terms in the Boolean query are assigned weights based on their contribution to the semantics of the query; each term's contribution is determined by its occurrence frequency and its syntactic dependency within the query. Furthermore, the resulting weighted Boolean query can be improved further by expanding the query terms with synonyms in a very restrictive fashion. The process is fully automated and does not require human intervention. Experiments run on TREC-4 queries showed consistent improvement over standard information retrieval ranking methods.
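
To make the pipeline described above more concrete, the following is a minimal Python sketch of turning a natural language query into a weighted Boolean query with restricted synonym expansion. It is not the thesis's implementation. The stopword list, the synonym table, the `term^weight` notation, the `synonym_discount` parameter, and both function names are assumptions made for illustration. The weights here use only occurrence frequency within the query; the syntactic-dependency component mentioned in the abstract is not modelled.

```python
import re
from collections import Counter

# Hypothetical stopword list and synonym table, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to",
             "what", "are", "is"}
SYNONYMS = {"car": ["automobile"], "pollution": ["contamination"]}


def weighted_terms(query):
    """Return {term: weight}, with weights normalised to sum to 1.

    Assumption: weight is the term's occurrence frequency within the
    query. Syntactic dependency is not taken into account in this sketch.
    """
    tokens = [t for t in re.findall(r"[a-z]+", query.lower())
              if t not in STOPWORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def to_boolean_query(query, synonym_discount=0.5):
    """Build a weighted Boolean query string.

    Each content term forms a group: the term OR its listed synonyms,
    with synonyms carrying a discounted weight (an assumed way of keeping
    the expansion restrictive). Groups are joined with AND.
    """
    groups = []
    for term, weight in weighted_terms(query).items():
        alternatives = [f"{term}^{weight:.2f}"]
        for synonym in SYNONYMS.get(term, []):
            alternatives.append(f"{synonym}^{weight * synonym_discount:.2f}")
        groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    print(to_boolean_query("car pollution in the car"))
```

Restricting expansion to a small, hand-checked synonym table and down-weighting the added terms is one simple way to limit query drift; the thesis may apply different restrictions.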