Using Retrieval Augmented Generation to Build the Context for Data-Driven Stories

Angelica Lo Duca
First author
Writing – Original Draft Preparation
2024

Abstract

Data Storytelling (DS) is the practice of building data-driven stories to communicate the results of a data analysis process effectively. However, data storytellers may lack the skills to write compelling text for their data-driven stories. In this paper, we propose a novel strategy to enhance DS by automatically generating context for data-driven stories, leveraging the capabilities of Generative AI (GenAI). This contextual information provides the background knowledge the audience needs to fully understand the story's message. Our approach uses Retrieval Augmented Generation (RAG), which adapts large language models (LLMs), the core concept behind GenAI, to the specific domain required by a data-driven story. We demonstrate the effectiveness of our method through a practical case study on salmon aquaculture, showcasing the ability of GenAI to enrich DS with relevant context. We also describe some possible strategies to evaluate the generated context, as well as the ethical issues that may arise when using GenAI in DS.
2024
Istituto di informatica e telematica - IIT
978-989-758-679-8
Data Storytelling
Retrieval Augmented Generation
Data Visualization
Generative AI
Files in this record:
RAG DS CR.pdf: pre-print document; license: not public, private/restricted access (authorized users only); Adobe PDF, 314.26 kB
124197.pdf: open access; license: Creative Commons; Adobe PDF, 565.51 kB

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/515637