Information Retrieval Model

Our information retrieval system is based on the Boolean model. To rank the retrieved documents, we have applied the Jaccard similarity function to the query and to every document of the corpus. Beforehand, each document was preprocessed and its index terms were selected (the preprocessing phase is described in Section 3.1). As we will see in Section 3.3, we have represented each text using the selection formula given in Equation (1). Additionally, after the reduction step, we have carried out an enrichment process (see Equation (2)) that identifies terms related to those resulting from the selection.
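As an illustration of the ranking step only, the following minimal sketch computes the Jaccard similarity between the set of query terms and the set of index terms of each document, and sorts the documents by that score. The function names (`jaccard`, `rank`), the toy corpus, and the rule of discarding documents that share no term with the query are assumptions made for this example; the preprocessing, the selection of Equation (1), and the enrichment of Equation (2) are not reproduced here.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of terms."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets empty: define the similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def rank(query_terms, docs):
    """Return (doc_id, score) pairs sorted by decreasing Jaccard similarity.

    `docs` maps a document identifier to its index terms. Documents sharing
    no term with the query are dropped (an assumption of this sketch).
    """
    scored = ((doc_id, jaccard(query_terms, terms)) for doc_id, terms in docs.items())
    matched = [(doc_id, score) for doc_id, score in scored if score > 0]
    # Sort by score (descending), breaking ties by document identifier.
    return sorted(matched, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy corpus: each document is already reduced to its index terms.
docs = {
    "d1": ["boolean", "model", "retrieval"],
    "d2": ["jaccard", "similarity"],
    "d3": ["retrieval", "jaccard", "ranking", "model"],
    "d4": ["clustering"],
}
query = ["jaccard", "retrieval", "model"]

for doc_id, score in rank(query, docs):
    print(doc_id, round(score, 2))
```

Because both the query and the documents are treated as sets of terms, term frequency plays no role in the score; only the overlap relative to the combined vocabulary matters.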

David Pinto 2007-05-08