How COVID-19 information spread in US? The Role of Twitter as Early Indicator of Epidemics

Comito C
2021

Abstract

This paper uses Twitter to improve the understanding of how COVID-19 spread and how people reacted to it. Its objectives are to identify the key terms and features used in tweets and to gauge interest in COVID-19 topics. To this end, the paper proposes an approach that combines peak detection and clustering techniques. Space-time features are extracted from the tweets and modeled as time series; peaks are then detected, and the textual features are clustered according to their co-occurrence in the tweets. Each resulting cluster is then associated with a topic. Experiments on a real-world dataset of COVID-19-related tweets from the US show that the approach detects several relevant topics of varying importance and character. A case study examines the correlation between Twitter data and confirmed COVID-19 cases and evaluates the feasibility of using Twitter to predict the spread of the outbreak. The results show a high correlation between tweets and real COVID-19 data, indicating that Twitter can be considered a reliable indicator of epidemic spread and that data generated by user activity on social media is becoming an invaluable source for capturing and understanding epidemic outbreaks.
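
The abstract does not say which peak detector or correlation measure the paper uses. As an illustration only, the sketch below assumes a rolling z-score peak detector over a daily tweet-count series and a lagged Pearson correlation against confirmed-case counts. All function names, parameters (`window`, `threshold`, `lag`), and numbers are hypothetical and synthetic; none of them are taken from the paper.

```python
from statistics import mean, pstdev


def detect_peaks(series, window=3, threshold=2.0):
    """Return indices whose value exceeds the mean of the preceding
    `window` points by more than `threshold` standard deviations.
    (Assumed detector for illustration, not the paper's method.)"""
    peaks = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = pstdev(history) or 1.0  # fall back to 1.0 on a flat history
        if (series[i] - mu) / sigma > threshold:
            peaks.append(i)
    return peaks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(tweets, cases, lag):
    """Correlate tweet counts with case counts observed `lag` days later."""
    if lag == 0:
        return pearson(tweets, cases)
    return pearson(tweets[:-lag], cases[lag:])


# Synthetic daily tweet counts with two bursts of activity.
daily_tweets = [10, 12, 11, 10, 40, 12, 11, 13, 12, 55, 14, 12]
peak_days = detect_peaks(daily_tweets)

# Synthetic case counts that follow a tweet signal two days later.
signal = [1, 2, 3, 5, 8, 13, 21, 34]
cases = [0, 0] + [2 * t for t in signal[:-2]]
r = lagged_correlation(signal, cases, lag=2)
```

Scanning several values of `lag` and keeping the one with the highest correlation is one simple way to check whether tweet activity leads case counts, which is the kind of early-indicator question the case study raises.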
Keywords

COVID-19
Social Media Data
Topic Modeling
Peak Detection
Forecasting Models

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/448511
Citations
  • Scopus 29