
Analyzing social media data to discover mobility patterns at EXPO 2015: Methodology and results

Eugenio Cesario
2016

Abstract

Social media posts are often tagged with geographical coordinates or other information that identifies user positions, which enables mobility pattern analysis using trajectory mining techniques. This paper presents a methodology and discusses the results of a study aimed at discovering the behavior and mobility patterns of Instagram users who visited EXPO 2015, the Universal Exposition hosted in Milan, Italy, from May to October 2015. We collected and analyzed geotagged posts published by about 238,000 Instagram users who visited EXPO 2015, including more than 570,000 posts published during the visits and 2.63 million posts published by the same users from one month before to one month after their visit to EXPO. To cope with this large amount of data, the whole process, from data collection to data mining, was implemented on a high-performance cloud platform that provided the necessary storage and compute resources. The analysis revealed how the number of visitors changed over time, which sets of pavilions were most frequently visited, which countries the visitors came from, and the main flows of visitors towards Italian cities and regions in the days after their visit to EXPO. A strong correlation (Pearson coefficient 0.7) was measured between official visitor numbers and the visit trends produced by our analysis, which supports the effectiveness of the proposed methodology and the reliability of the results.
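As an illustration of the validation step mentioned in the abstract, the following minimal sketch shows how a Pearson coefficient between two daily series could be computed. It is not the authors' implementation: the function name `pearson` and the two sample series are hypothetical placeholders, and the numbers are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT figures from the paper:
# official daily visitor counts vs. daily counts of users seen posting.
official_visitors = [100_000, 120_000, 90_000, 150_000, 200_000]
estimated_visitors = [1_000, 1_300, 800, 1_400, 2_200]

r = pearson(official_visitors, estimated_visitors)
print(f"Pearson coefficient: {r:.2f}")
```

A coefficient close to 1 indicates that the two series rise and fall together, which is the sense in which the reported value of 0.7 is read as a strong correlation.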
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Trajectory Mining

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/322317