How COVID-19 Information Spread in US? The Role of Twitter as Early Indicator of Epidemics

Comito C
First author
Conceptualization
2022

Abstract

The aim of the paper is to exploit Twitter to improve the understanding of COVID-19 diffusion and of people's reactions to it. To this end, the objectives are to identify the key terms and features used in the tweets and to measure the interest in COVID-19 topics. To address these goals, the paper proposes an approach that combines peak detection and clustering techniques. Space-time features are extracted from the tweets and modeled as time series. Peaks are then detected, and the textual features are clustered based on their co-occurrence in the tweets. Each resulting cluster is associated with a topic. Experiments on a real-world dataset of tweets related to COVID-19 in the US show that the approach is able to detect several relevant topics of varying importance and character. A case study on the correlation between Twitter data and COVID-19 confirmed cases is also presented, evaluating the feasibility of exploiting Twitter to predict the diffusion of the outbreak. The results highlight a high correlation between tweets and real COVID-19 data, indicating that Twitter can be considered a reliable indicator of the spread of the epidemic and that data generated by user activity on social media is becoming an invaluable source for capturing and understanding epidemic outbreaks.
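
The pipeline outlined in the abstract (time series, peak detection, co-occurrence clustering, correlation with confirmed cases) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the algorithms, so the threshold rule in `detect_peaks`, the connected-components grouping in `cluster_terms`, all function and parameter names, and the toy data are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def detect_peaks(series, window=3, k=2.0):
    """Indices whose value exceeds mean + k * std of the previous `window` points.

    Assumed rule for illustration; the paper may use a different detector.
    """
    peaks = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        threshold = mean(history) + k * pstdev(history)
        if series[i] > threshold:
            peaks.append(i)
    return peaks


def cluster_terms(tweets, min_cooccurrence=2):
    """Group terms that co-occur in at least `min_cooccurrence` tweets.

    Clusters are the connected components of the thresholded co-occurrence
    graph (an assumed stand-in for the paper's clustering step).
    """
    pair_counts = Counter()
    for terms in tweets:
        pair_counts.update(combinations(sorted(set(terms)), 2))

    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            parent[find(a)] = find(b)

    clusters = {}
    for term in list(parent):
        clusters.setdefault(find(term), set()).add(term)
    return sorted(sorted(c) for c in clusters.values())


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Toy data, invented for this sketch only.
daily_tweet_counts = [10, 12, 11, 10, 40, 13, 12, 11, 12, 45]
tweets_on_peak_days = [
    ["lockdown", "school", "closure"],
    ["lockdown", "school"],
    ["vaccine", "trial"],
    ["vaccine", "trial", "school"],
    ["mask", "lockdown"],
]

peak_days = detect_peaks(daily_tweet_counts)
topics = cluster_terms(tweets_on_peak_days)
correlation = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

In this sketch each cluster of co-occurring terms plays the role of a candidate topic, and `pearson` would be applied to a tweet-volume series and a confirmed-case series aligned on the same dates.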
2022
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Social Media Data
Topic Modeling
Peak Detection
Forecasting models
COVID-19; Artificial intelligence; Machine learning; Deep learning; Forecasting; Diagnosing
Files in this item:
File: TSC2022.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access
Size: 1.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/448511
Citations
  • PMC: N/A
  • Scopus: 36
  • ISI: 39