Some Text Data
These datasets relate to the results in the paper 'Text classification using support vector machines with dimension reduction' by H. Kim, P. Howland, and H. Park, Proceedings of the Text Mining Workshop of the 3rd SIAM International Conference on Data Mining, San Francisco, CA, May 2003.
- MEDLINE dataset: 1250x22095 sparse matrices with 5 clusters, stored in a sparse representation (column index, row index, value). The five clusters start at rows 1, 251, 501, 751, and 1001. Files: training data, test data.
- REUTERS dataset: sparse matrices stored in a sparse representation (column index, row index, value). Files: training data, test data. The matrix size and the starting position of each cluster are summarized in the category files: training data category, test data category.
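
The following is a minimal sketch of how data in this representation might be read and how cluster labels could be recovered from the cluster start rows. It rests on assumptions not stated above: that each line of a data file holds one entry as whitespace-separated "column row value", and that indices are 1-based. The function names `parse_triplets` and `cluster_of` and the sample lines are hypothetical and used only for illustration; the actual file layout should be checked before relying on this.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_triplets(lines):
    """Build {row: {col: value}} from 'col row value' text lines.

    Assumption: one entry per line, whitespace-separated, 1-based indices.
    """
    rows = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        col, row, value = int(parts[0]), int(parts[1]), float(parts[2])
        rows[row][col] = value
    return dict(rows)


def cluster_of(row, starts):
    """Return the 1-based cluster number of a 1-based row.

    `starts` is the sorted list of rows at which each cluster begins.
    """
    return bisect_right(starts, row)


# Cluster start rows listed for the MEDLINE dataset.
MEDLINE_STARTS = [1, 251, 501, 751, 1001]

# Hypothetical sample entries, not taken from the real files.
sample = ["3 1 0.5", "7 1 1.25", "2 251 2.0", "", "9 1250 0.75"]

matrix = parse_triplets(sample)
labels = {row: cluster_of(row, MEDLINE_STARTS) for row in matrix}
print(labels)  # {1: 1, 251: 2, 1250: 5}
```

For the REUTERS dataset, the same lookup would apply once the start rows are taken from the corresponding category file. A dictionary of rows is used here only to keep the sketch free of dependencies; a sparse matrix library would be the more practical choice for the full data.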