Analyzing Twitter Data for Depression Signs Based on Machine Learning Techniques

Authors

  • Eman Khattak, School of Computer Science, Iqra National University Peshawar, Peshawar 25124, Pakistan.
  • Haleem Ullah, School of Computer Science, Iqra National University Peshawar, Peshawar 25124, Pakistan.
  • Irfanullah Khan, School of Computer Science, Iqra National University Peshawar, Peshawar 25124, Pakistan.
  • Malik Taimoor Ali, School of Computer Science, Iqra National University Peshawar, Peshawar 25124, Pakistan.
  • Dr. Latif Jan, School of Computer Science, Iqra National University Peshawar, Peshawar 25124, Pakistan.

Keywords:

Natural Language Processing, Machine Learning, K-Nearest Neighbors, Naïve Bayes, Decision Tree

Abstract

This research applies Natural Language Processing (NLP) and Machine Learning to tweets to detect patterns associated with depression. The aim is to build a dependable system for the early detection of depression. To this end, seven classification algorithms are studied: Decision Tree, K-Nearest Neighbors (KNN), Random Forest, Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, and Convolutional Neural Networks (CNN). Each model is evaluated to determine which is the most dependable and precise for depression detection. Ongoing monitoring of users’ text data allows participants to visualize the progression of depression over time and to recognize shifts in emotional states through graphs. The approach consists of three phases: text cleaning, model building, and intensive testing on a given dataset. The analysis shows that the most accurate algorithms were Logistic Regression and SVM, with accuracies of 91.8% and 91.9%, respectively. Logistic Regression performed better on the precision and recall metrics, which highlights its effectiveness in detecting depression symptoms.
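
As an illustration of the text-cleaning and testing phases named in the abstract, the following is a minimal Python sketch. The abstract does not list the exact cleaning rules or the evaluation code, so everything here is an assumption for illustration only: the regular expressions, the function names `clean_tweet` and `binary_metrics`, and the sample tweet and labels are all hypothetical.

```python
import re

# Hypothetical cleaning rules; the paper does not list its exact steps.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, and non-letter characters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)  # also drops '#', digits, punctuation
    return " ".join(text.split())       # collapse repeated whitespace


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


if __name__ == "__main__":
    # Made-up example tweet and labels, not data from the study.
    print(clean_tweet("@bob I feel SO empty today... https://t.co/xyz #sad"))
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Here label 1 stands for a tweet flagged as depressive. Reporting precision and recall alongside accuracy matters when two models are nearly tied on accuracy, as Logistic Regression and SVM are in the reported results.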

Published

2026-01-04