Creativity is a fundamental skill of human cognition. We use textual forma mentis networks (TFMN) to extract network (semantic/syntactic associations) and emotional features from approximately one thousand human- and GPT3.5-generated stories. Using Explainable Artificial Intelligence (XAI), we test whether features relative to Mednick's associative theory of creativity can explain creativity ratings assigned by humans and GPT-3.5. Using XGBoost, we examine three scenarios: (i) human ratings of human stories, (ii) GPT-3.5 ratings of human stories, and (iii) GPT-3.5 ratings of GPT-generated stories. Our findings reveal that GPT-3.5 ratings differ significantly from human ratings not only in terms of correlations but also because of feature patterns identified with XAI methods. GPT-3.5 favours 'its own' stories and rates human stories differently from humans. Feature importance analysis with SHAP scores shows that: (i) network features are more predictive for human creativity ratings but also for GPT-3.5's ratings of human stories; (ii) emotional features played a greater role than semantic/syntactic network structure in GPT-3.5 rating its own stories. These quantitative results underscore key limitations in GPT-3.5's ability to align with human assessments of creativity. We emphasise the need for caution when using GPT-3.5 to assess and generate creative content, as it does not yet capture the nuanced complexity that characterises human creativity.

Forma mentis networks predict creativity ratings of short texts via interpretable artificial intelligence in human and GPT-simulated raters

Citraro S.;Rossetti G.;
2024

Abstract

Creativity is a fundamental skill of human cognition. We use textual forma mentis networks (TFMN) to extract network (semantic/syntactic associations) and emotional features from approximately one thousand human- and GPT3.5-generated stories. Using Explainable Artificial Intelligence (XAI), we test whether features relative to Mednick's associative theory of creativity can explain creativity ratings assigned by humans and GPT-3.5. Using XGBoost, we examine three scenarios: (i) human ratings of human stories, (ii) GPT-3.5 ratings of human stories, and (iii) GPT-3.5 ratings of GPT-generated stories. Our findings reveal that GPT-3.5 ratings differ significantly from human ratings not only in terms of correlations but also because of feature patterns identified with XAI methods. GPT-3.5 favours 'its own' stories and rates human stories differently from humans. Feature importance analysis with SHAP scores shows that: (i) network features are more predictive for human creativity ratings but also for GPT-3.5's ratings of human stories; (ii) emotional features played a greater role than semantic/syntactic network structure in GPT-3.5 rating its own stories. These quantitative results underscore key limitations in GPT-3.5's ability to align with human assessments of creativity. We emphasise the need for caution when using GPT-3.5 to assess and generate creative content, as it does not yet capture the nuanced complexity that characterises human creativity.
2024
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
LLM
Creativity
File in questo prodotto:
File Dimensione Formato  
2412.00530v1.pdf

accesso aperto

Descrizione: 1 Forma mentis networks predict creativity ratings of short texts via interpretable artificial intelligence in human and GPT-simulated raters
Tipologia: Documento in Pre-print
Licenza: Creative commons
Dimensione 1.65 MB
Formato Adobe PDF
1.65 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/563112
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact