Lexical Conditioning of Model's Distribution through Uncertainty-gated Soft-Mixing of Probabilities

Michele Papucci; Giulia Venturi; Felice Dell'Orletta
2026

Abstract

We present Uncertainty-Gated Lexical Decoding (UGLD), a decoding-time framework for fine-grained lexical control in Large Language Models (LLMs) that explicitly addresses the trade-off between controllability and fluency. UGLD adaptively scales intervention through an entropy-based gating mechanism derived from the model’s predictive distribution, activating control when uncertainty is high and limiting interference when predictions are confident. The method supports both promotion toward and against predefined vocabularies. We evaluate UGLD in Italian on two open-weight LLMs (ANITA 8B and Qwen 3 4B) across paraphrasing and free-text generation settings, considering Simple Vocabulary Conditioning and Jargon Reduction scenarios. Automatic evaluation shows consistent improvements in lexical coverage over standard decoding strategies, while human evaluation confirms that fluency is preserved under controlled intervention.
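
The abstract does not give the gating formula, so the following is only a minimal sketch of what an entropy-gated soft mix of probabilities could look like, not the authors' implementation. It assumes the gate is the normalized Shannon entropy of the next-token distribution scaled by a hypothetical `max_strength` parameter, and that the steering target is the model's own distribution restricted to (promotion) or excluding (demotion) the predefined vocabulary. The function names `normalized_entropy` and `ugld_mix` are illustrative.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of probs divided by log(len(probs)); lies in [0, 1]."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def ugld_mix(probs, vocab_ids, max_strength=0.8, promote=True):
    """Hypothetical entropy-gated soft mix of a next-token distribution.

    probs        -- next-token probabilities, assumed to sum to 1
    vocab_ids    -- set of token ids belonging to the predefined vocabulary
    max_strength -- assumed upper bound on the mixing weight (not from the paper)
    promote      -- True steers toward the vocabulary, False steers away from it
    """
    # Mark the tokens that the steering target is allowed to keep.
    if promote:
        keep = [i in vocab_ids for i in range(len(probs))]
    else:
        keep = [i not in vocab_ids for i in range(len(probs))]

    mass = sum(p for p, k in zip(probs, keep) if k)
    if mass == 0.0:
        # No probability mass on the kept tokens: leave the distribution untouched.
        return list(probs)

    # Target: the original distribution restricted to kept tokens, renormalized.
    target = [p / mass if k else 0.0 for p, k in zip(probs, keep)]

    # Gate: close to 0 for confident predictions, up to max_strength for uniform ones.
    gate = max_strength * normalized_entropy(probs)

    # Convex combination of two distributions, so the result still sums to 1.
    return [(1.0 - gate) * p + gate * t for p, t in zip(probs, target)]


if __name__ == "__main__":
    uncertain = [0.25, 0.25, 0.25, 0.25]
    confident = [0.97, 0.01, 0.01, 0.01]
    print(ugld_mix(uncertain, {1}))  # strong shift toward token 1
    print(ugld_mix(confident, {1}))  # distribution stays close to the original
```

Under these assumptions the intervention grows with uncertainty: a uniform distribution is mixed at full `max_strength`, while a peaked one is left almost unchanged, which mirrors the behaviour described in the abstract.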
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Michele Papucci en
dc.authority.people Giulia Venturi en
dc.authority.people Felice Dell'Orletta en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contribution in conference proceedings *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Not assigned *
dc.contributor.area Not assigned *
dc.contributor.area Not assigned *
dc.date.firstsubmission 2026/06/09 17:32:09 *
dc.date.issued 2026 -
dc.date.submission 2026/06/09 17:32:09 *
dc.description.allpeople Papucci, Michele; Venturi, Giulia; Dell'Orletta, Felice -
dc.description.allpeopleoriginal Michele Papucci, Giulia Venturi, Felice Dell'Orletta en
dc.description.fulltext none en
dc.description.numberofauthors 3 -
dc.identifier.isbn 978-2-493814-91-3 en
dc.identifier.source manual *
dc.identifier.uri https://hdl.handle.net/20.500.14243/586463 -
dc.identifier.url http://www.italianlp.it/wp-content/uploads/2026/05/papuccietalreadixtsar2026.pdf en
dc.language.iso eng en
dc.relation.allauthors Michele Papucci, Giulia Venturi, Felice Dell’Orletta en
dc.relation.conferencedate 11 May 2026 en
dc.relation.conferencename Joint Workshop on Readability and Text Simplification (READIxTSAR) en
dc.relation.conferenceplace Palma de Mallorca en
dc.relation.firstpage 89 en
dc.relation.ispartofbook Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026 en
dc.relation.lastpage 100 en
dc.relation.numberofpages 12 en
dc.subject.keywordseng Controlled Text Generation, Lexically Constrained Decoding, Entropy-Gated Decoding -
dc.subject.singlekeyword Controlled Text Generation *
dc.subject.singlekeyword Lexically Constrained Decoding *
dc.subject.singlekeyword Entropy-Gated Decoding *
dc.title Lexical Conditioning of Model's Distribution through Uncertainty-gated Soft-Mixing of Probabilities en
dc.type.circulation International en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
iris.orcid.lastModifiedDate 2026/06/09 17:32:09 *
iris.orcid.lastModifiedMillisecond 1781019129525 *
iris.sitodocente.maxattempts 1 -
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/586463