Vol. 26 Issue 1 2023
  1. Nageswararao1, C. Shoba Bindu2, P. Dileep Kumar Reddy3

1Department of CSE, JNTUA College of Engineering, Ananthapuramu, Andhra Pradesh, India

2Department of CSE, JNTUA College of Engineering, Ananthapuramu, Andhra Pradesh, India

3Department of CSE, Narsimha Reddy Engineering College (Autonomous), Secunderabad, Telangana, India


Newspapers play a crucial role in society because they inform people about current events and how those events may affect their day-to-day lives. In public health emergencies such as the recent COVID-19 outbreak, their role becomes even more critical. Since the beginning of the pandemic, newspapers have been a valuable source of information for the general public on a wide range of topics, including the identification of a novel coronavirus strain, lockdowns and other restrictions, government regulations, and the development of a coronavirus vaccine. Analysing newly emerging and extensively reported topics, themes, and concerns, together with the accompanying sentiments in different nations, can therefore aid our understanding of the COVID-19 pandemic. In this paper, we investigated more than 100,000 COVID-19 news headlines and articles using BERTopic and Top2Vec for topic modelling and XLNet for sentiment analysis and classification. Our topic modelling findings showed that the most prevalent and widely covered topics in India, Japan, South Korea, and the UK were education, sports, vaccination, and the economy. Additionally, our sentiment classification model achieved 92% validation accuracy, 96% testing accuracy, and a 97% F1-score. The study also revealed that the UK, the most severely affected nation in our dataset, has the highest prevalence of negative sentiment.

INDEX TERMS: Natural Language Processing, Machine Learning, Topic Modeling, Topic Labeling, BERTopic, Top2Vec, XLNet, Sentiment Analysis, Newspaper, COVID-19.