Unified Framework of Dimensionality Reduction and Text Categorisation
https://doi.org/10.14419/ijet.v7i3.29.21397
Received date: October 9, 2018
Accepted date: October 9, 2018
Published date: April 20, 2026
Classification accuracy, Classifier, Dimension Reduction, Framework, Supervised learning, Support Vector Machine (SVM), Text Classification/Categorisation (TC)
Abstract
Text classification (categorization) is a supervised learning task that assigns text documents to predefined classes. It is used to organize and manage collections of text documents available in digital form. For this task, the support vector machine (SVM) is regarded as a suitable classifier for any kind of application. Although the computational complexity of SVM is independent of the number of dimensions, high dimensionality still poses the problem of the ‘curse of dimensionality’, which can be addressed effectively by Dimension Reduction (DR). This work develops a framework for dimensionality reduction and text classification. It compares the classification accuracies of two approaches, namely text classification with dimensionality reduction and text classification without dimensionality reduction. It also evaluates the efficiency of various dimensionality reduction techniques so that one of the most coherent methods can be included in the framework.
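The comparison described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses scikit-learn, TF-IDF features, a linear SVM, and latent semantic analysis (truncated SVD) as the dimensionality reduction step. The abstract does not name the datasets or the DR technique finally adopted, so the toy corpus, the component count, and the helper names `build_classifier` and `evaluate` are hypothetical.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus (not from the paper): two classes, "sport" and "tech".
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the players after the game",
    "fans cheered as the team lifted the trophy",
    "the new processor speeds up the laptop",
    "the software update fixes a security bug",
    "engineers released a faster graphics chip",
    "the phone ships with a new operating system",
]
train_labels = ["sport"] * 4 + ["tech"] * 4
test_docs = [
    "the players celebrated the goal",
    "the laptop received a software update",
]
test_labels = ["sport", "tech"]


def build_classifier(n_components=None):
    """TF-IDF -> (optional truncated SVD) -> linear SVM."""
    steps = [TfidfVectorizer()]
    if n_components is not None:
        # Assumed DR step: project TF-IDF vectors onto n_components latent axes.
        steps.append(TruncatedSVD(n_components=n_components, random_state=0))
    steps.append(LinearSVC(random_state=0))
    return make_pipeline(*steps)


def evaluate(n_components=None):
    """Fit on the training documents and return accuracy on the test documents."""
    model = build_classifier(n_components)
    model.fit(train_docs, train_labels)
    return model.score(test_docs, test_labels)


acc_full = evaluate()                   # classification without DR
acc_reduced = evaluate(n_components=2)  # classification with DR
print(f"accuracy without DR: {acc_full:.2f}")
print(f"accuracy with DR:    {acc_reduced:.2f}")
```

On a corpus this small the two accuracies say nothing about the paper's findings. The sketch only shows how both approaches can share one pipeline so that the DR step is the single variable being compared.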
How to Cite
M.M Rajashekharaiah, K., S Chikkalli, S., K Kumbar, P., & P. Suryanarayana Babu, D. (2026). Unified Framework of Dimensionality Reduction and Text Categorisation. International Journal of Engineering and Technology, 7(3.29), 648-654. https://doi.org/10.14419/ijet.v7i3.29.21397
