How COVID-19 information spread in US: The Role of Twitter as Early Indicator of Epidemics
Comito C
First author
Conceptualization
2022
Abstract
The aim of the paper is to exploit Twitter to improve the understanding of COVID-19 diffusion and of people's reactions to it. To this end, the objectives are to identify the key terms and features used in the tweets and to measure the interest in COVID-19 topics. To address these goals, the paper proposes an approach that combines peak detection and clustering techniques. Space-time features are extracted from the tweets and modeled as time series. Peaks are then detected, and the textual features are clustered based on their co-occurrence in the tweets. Each resulting cluster is then associated with a topic. Experiments on a real-world dataset of tweets related to COVID-19 in the US show that the approach is able to detect several relevant topics, of varying importance and character. A case study on the correlation between Twitter data and confirmed COVID-19 cases is also presented, evaluating the feasibility of exploiting Twitter to predict the diffusion of the outbreak. Results highlight a high correlation between tweets and real COVID-19 data, indicating that Twitter can be considered a reliable indicator of the epidemic's spread and that data generated by user activity on social media is becoming an invaluable source for capturing and understanding epidemic outbreaks.