Unified Framework of Dimensionality Reduction and Text Categorisation

K. M.M Rajashekharaiah; Sunil S Chikkalli; Prateek K Kumbar; Dr. P. Suryanarayana Babu

doi:10.14419/ijet.v7i3.29.21397

About this article

DOI: https://doi.org/10.14419/ijet.v7i3.29.21397
Received: 09-10-2018
Revised: 09-10-2018
Accepted: 09-10-2018
Published: 20-04-2026

Keywords: Classification accuracy, Classifier, Dimension Reduction, Framework, Supervised learning, Support Vector Machine (SVM), Text Classification/Categorisation (TC)

Abstract

Text classification (categorisation) is a supervised learning task that assigns text documents to pre-defined classes. It is used to organise and manage collections of text documents available in digital form. For this task, the support vector machine (SVM) is regarded as a suitable classifier for a wide range of applications. Although the computational complexity of an SVM is independent of the number of dimensions, high dimensionality still gives rise to the 'curse of dimensionality', which can be addressed effectively through Dimension Reduction (DR). This work develops a framework for dimensionality reduction and text classification. The paper compares the classification accuracies of two approaches: text classification with dimensionality reduction and text classification without it. It also evaluates the efficiency of various dimensionality reduction techniques so that one of the most suitable methods can be included in the framework.
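The comparison described in the abstract can be sketched in a few lines. The abstract does not name the dimensionality reduction technique or the toolkit used, so this sketch assumes scikit-learn and swaps in latent semantic analysis via `TruncatedSVD` as one common choice. The toy corpus, the helper names `build_classifier` and `mean_accuracy`, and all parameter values are hypothetical and chosen only for illustration.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus invented purely for illustration (not data from the paper).
docs = [
    "the striker scored a late goal in the football match",
    "the team won the league after a tense final match",
    "the goalkeeper saved a penalty in the cup final",
    "the coach praised the team after the football game",
    "the new processor doubles the speed of the laptop",
    "the software update fixes a bug in the operating system",
    "the laptop ships with a faster processor and more memory",
    "developers released a software patch for the system bug",
]
labels = ["sport"] * 4 + ["tech"] * 4


def build_classifier(n_components=None):
    """TF-IDF features, an optional DR step, then a linear SVM.

    Assumption: truncated SVD (LSA) stands in for whichever DR method
    the framework finally adopts.
    """
    steps = [TfidfVectorizer()]
    if n_components is not None:
        steps.append(TruncatedSVD(n_components=n_components, random_state=0))
    steps.append(LinearSVC())
    return make_pipeline(*steps)


def mean_accuracy(model, folds=4):
    """Mean cross-validated classification accuracy on the toy corpus."""
    scores = cross_val_score(model, docs, labels, cv=folds, scoring="accuracy")
    return scores.mean()


baseline = mean_accuracy(build_classifier())               # without DR
reduced = mean_accuracy(build_classifier(n_components=3))  # with DR
print(f"SVM without DR: {baseline:.2f}")
print(f"SVM with DR:    {reduced:.2f}")
```

Placing the reduction step inside the pipeline means it is refitted on each training fold, so the two accuracy figures are measured under the same protocol. On a corpus this small the numbers carry no weight; the sketch only shows the shape of the comparison.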

How to Cite

Rajashekharaiah, K. M. M., Chikkalli, S. S., Kumbar, P. K., & Suryanarayana Babu, P. (2026). Unified Framework of Dimensionality Reduction and Text Categorisation. International Journal of Engineering and Technology, 7(3.29), 648-654. https://doi.org/10.14419/ijet.v7i3.29.21397
