In our experiments we used three corpora that differ in size and balance. We consider all three suitable for our experiments because of their very narrow domain and the average size of their abstracts. The following subsections describe each corpus in detail.
Description of the corpora