
Term Enrichment

TP reduction may increase precision, but it also decreases recall. To compensate, we enriched the selected terms with new terms that share similar characteristics with the initial ones. Specifically, given a text $T$ with selected terms $TP_{SET}$, $y$ is a new term if it co-occurs in $T$ with some $x \in TP_{SET}$, i.e.,
\begin{displaymath}
TP_{SET}' = TP_{SET} \cup \{ y \mid x \in TP_{SET} \wedge (fr(xy) > 1 \vee fr(yx) > 1) \}
\end{displaymath} (2)

Here $fr(xy)$ is the frequency of the bigram $xy$ in $T$. Considering the text length, we used a window of size 1 around each term of $TP_{SET}$, and a new term was included only if its bigram with a selected term occurred at least twice.
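The enrichment rule in Eq. (2) can be sketched in a few lines of Python. This is only an illustration, not the implementation used in the experiments: the function name `enrich_terms`, the parameter `min_freq`, and the whitespace tokenization in the usage example are all assumptions introduced here.

```python
from collections import Counter


def enrich_terms(tokens, tp_set, min_freq=2):
    """Return tp_set plus each term that forms a frequent bigram with a selected term.

    tokens   -- the text T as a list of terms (tokenization is assumed done upstream)
    tp_set   -- the terms selected by the transition point technique (TP_SET)
    min_freq -- minimum bigram frequency; 2 corresponds to fr(.) > 1 in Eq. (2)
    """
    # Count adjacent pairs, i.e. a window of size 1 around every term.
    bigrams = Counter(zip(tokens, tokens[1:]))

    enriched = set(tp_set)
    for (left, right), freq in bigrams.items():
        if freq < min_freq:
            continue
        if left in tp_set:   # bigram xy with x selected: y is the right neighbour
            enriched.add(right)
        if right in tp_set:  # bigram yx with x selected: y is the left neighbour
            enriched.add(left)
    return enriched


# Usage on a toy text (hypothetical data, not from the evaluated corpus).
text = "information retrieval model for information retrieval systems and query model"
print(sorted(enrich_terms(text.split(), {"retrieval"})))
# prints ['information', 'retrieval']
```

In the toy text only the bigram "information retrieval" occurs twice, so "information" is the single term added; "model" and "systems" each follow "retrieval" once and are left out.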

David Pinto 2007-05-08