Since its inception, natural language processing (NLP) has changed substantially, moving from rule-based systems to machine learning techniques capable of comprehensive text classification. This study investigates how effectively different machine learning and deep learning models classify text data, with a particular emphasis on document classification across five categories: business, politics, sport, technology, and entertainment. Using a publicly accessible dataset of 2,225 data points, we refined the input text with several preprocessing techniques. We evaluated models such as Support Vector Machines (SVM), Multilayer Perceptron (MLP), and K-Nearest Neighbors (KNN), as well as deep learning architectures such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. Our findings demonstrate the strong predictive power of ensemble approaches, including SVM and MLP, which achieved 97% accuracy with precision and recall above 0.96 in the majority of categories. This work emphasizes the need for model optimization tailored to the particular classification task, and it offers insights for future research in automatic text classification and methods for overcoming data noise and complexity.
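To make the kind of pipeline summarized above concrete, the following is a minimal sketch of one of the listed model families (KNN) applied to TF-IDF features over the five categories. It is not the study's implementation: the ten training sentences, the tokenizer, the smoothed IDF formula, and the choice of k = 3 are all assumptions made for illustration, and the code uses only the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus (NOT the 2,225-item dataset used in the study).
TRAIN = [
    ("Shares rose as the bank reported strong quarterly profits", "business"),
    ("The company announced a merger and profits for investors", "business"),
    ("The minister defended the election policy in parliament", "politics"),
    ("Voters back the party before the election in parliament", "politics"),
    ("The striker scored twice as the team won the match", "sport"),
    ("The coach praised the team after the cup match", "sport"),
    ("The new phone software update improves broadband security", "technology"),
    ("Developers released software for mobile phone users", "technology"),
    ("The film won best picture at the awards ceremony", "entertainment"),
    ("The singer released an album ahead of the film awards", "entertainment"),
]


def tokenize(text):
    """Lowercase and keep alphabetic runs (a stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs_tokens):
    """Smoothed inverse document frequency for each term in the training set."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def knn_predict(text, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k most similar training documents."""
    query = tfidf(tokenize(text), idf)
    scored = sorted(
        ((cosine(query, vec), label) for vec, label in zip(train_vecs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Fit on the toy corpus.
train_tokens = [tokenize(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_tokens)
train_vecs = [tfidf(tokens, idf) for tokens in train_tokens]

print(knn_predict("The team lost the match", train_vecs, train_labels, idf))
```

A real evaluation would hold out a test split and report per-category precision and recall, which this toy example does not attempt.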
Elsinary, H. (2025). Decoding Text Complexity: Evaluating Machine and Deep Learning Models for Document Classification across different Categories. Sohag Journal of Sciences, 10(4), 574-587. doi: 10.21608/sjsci.2025.391898.1302