This representation combines the benefits of both approaches, TP and entropy.
TP represents each text independently, whereas entropy obtains more discriminant
terms; we therefore selected the terms that satisfy either of these two conditions.
The representation of a document is then given by:
In this approach, two weighting criteria were adopted for the representation schema.
Terms provided by (Equation (5)) and (Equation (6))
are weighted by Equations (1) and (7) (a modified version of (1)),
respectively. The procedure for determining was to add all terms that satisfy
to the set . Thereafter, terms
are weighted by Equation (7).
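
The combination described above can be illustrated with a minimal sketch. It assumes the two term selections (TP and entropy) have already been computed and are passed in as sets, and that the two weighting schemes are supplied as functions; the names `represent`, `weight_tp`, and `weight_entropy` are hypothetical, and the toy weights in the usage lines stand in for Equations (1) and (7), which are not reproduced here. The sketch also assumes that a term selected by both criteria keeps the first weighting, which the text does not state.

```python
from typing import Callable, Dict, Set

# A weighting scheme maps a term to its weight in the document.
Weight = Callable[[str], float]


def represent(
    tp_terms: Set[str],
    entropy_terms: Set[str],
    weight_tp: Weight,
    weight_entropy: Weight,
) -> Dict[str, float]:
    """Represent a document by the union of two term selections.

    Terms selected by TP use `weight_tp`; the remaining terms, selected
    only by entropy, use `weight_entropy`. The precedence given to TP for
    terms in both sets is an assumption of this sketch.
    """
    weights: Dict[str, float] = {}
    for term in tp_terms | entropy_terms:  # terms satisfying either condition
        if term in tp_terms:
            weights[term] = weight_tp(term)
        else:
            weights[term] = weight_entropy(term)
    return weights


# Toy usage with placeholder weights (not the paper's equations).
doc = represent(
    tp_terms={"retrieval", "index"},
    entropy_terms={"index", "entropy"},
    weight_tp=lambda t: 1.0,
    weight_entropy=lambda t: 2.0,
)
```

Passing the weighting schemes as functions keeps the selection step separate from the weighting step, so either one can be replaced without touching the other.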