LLM and Talmud: an evaluation of early results of assisted translation and question answering

Papini, Mafalda; Marchi, Simone; Giovannetti, Emiliano

In this presentation, we discuss two experiments applying Large Language Models (LLMs) within the Project for the Translation of the Babylonian Talmud into Italian. We identified two distinct tasks to which LLMs can contribute: translation support through suggestions, and question answering (QA). For the first task, we applied a methodology inspired by Retrieval-Augmented Generation (RAG) and few-shot prompting. By drawing on the translation memory of the “Traduco” CAT system, it enables the LLMs to generate translation suggestions that are linguistically correct and stylistically consistent with what the translators have produced so far. For the second task, we developed two prototypes, also based on RAG, in which the contexts provided to the model come from the already published text alone and from a combination of the published text and its accompanying glossaries, respectively. In both prototypes, the prompts instruct the system to cite the sources used in its responses. Early results are promising. In the first task, our methodology produced a better translation than direct translation by the generative model in 79% of cases. The remaining 21% highlighted the LLM's inability to produce the explanatory expansions that are essential for understanding the Talmud and characteristic of the edition. The QA task also gave promising feedback, particularly in the indication of sources supporting the answers, although answer quality did not improve substantially over direct queries to the models. These results underscore the need for further investigation to overcome the limitations of LLMs in processing highly complex texts.
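The first task can be illustrated with a minimal sketch of retrieval followed by few-shot prompting. Everything here is an assumption made for illustration: the abstract does not say how segments are retrieved from the translation memory, so string similarity from Python's standard `difflib` stands in for that step; the names `TRANSLATION_MEMORY`, `retrieve_examples`, and `build_prompt` are hypothetical; and the memory entries are placeholder strings, not Talmud text or Traduco data. The sketch only builds the prompt and does not call any model.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: (source segment, approved translation).
# Placeholder strings only; not real Talmud text or data from Traduco.
TRANSLATION_MEMORY = [
    ("alpha beta gamma", "uno due tre"),
    ("alpha beta delta", "uno due quattro"),
    ("omega sigma", "nove otto"),
]


def retrieve_examples(segment, memory, k=2):
    """Return the k memory pairs whose source text is most similar to `segment`.

    Similarity is plain character-level matching (an assumption of this
    sketch; a real system might use fuzzy matching or embeddings instead).
    """
    ranked = sorted(
        memory,
        key=lambda pair: SequenceMatcher(None, segment, pair[0]).ratio(),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(segment, memory, k=2):
    """Build a few-shot prompt from the retrieved translation pairs."""
    parts = [
        "Translate the source segment into Italian, "
        "following the style of the examples."
    ]
    for source, translation in retrieve_examples(segment, memory, k):
        parts.append(f"Source: {source}\nTranslation: {translation}")
    # The new segment goes last, with the translation left open for the model.
    parts.append(f"Source: {segment}\nTranslation:")
    return "\n\n".join(parts)


prompt = build_prompt("alpha beta epsilon", TRANSLATION_MEMORY)
print(prompt)
```

The retrieved pairs act as in-context examples, which is how previously approved translations could steer the model toward the style already established by the translators.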
