Lexical Conditioning of Model's Distribution through Uncertainty-gated Soft-Mixing of Probabilities
Michele Papucci;Giulia Venturi;Felice Dell'Orletta
2026
Abstract
We present Uncertainty-Gated Lexical Decoding (UGLD), a decoding-time framework for fine-grained lexical control in Large Language Models (LLMs) that explicitly addresses the trade-off between controllability and fluency. UGLD adaptively scales intervention through an entropy-based gating mechanism derived from the model’s predictive distribution, activating control when uncertainty is high and limiting interference when predictions are confident. The method supports both promotion toward and against predefined vocabularies. We evaluate UGLD in Italian on two open-weight LLMs (ANITA 8B and Qwen 3 4B) across paraphrasing and free-text generation settings, considering Simple Vocabulary Conditioning and Jargon Reduction scenarios. Automatic evaluation shows consistent improvements in lexical coverage over standard decoding strategies, while human evaluation confirms that fluency is preserved under controlled intervention.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Michele Papucci | en |
| dc.authority.people | Giulia Venturi | en |
| dc.authority.people | Felice Dell'Orletta | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.date.firstsubmission | 2026/06/09 17:32:09 | * |
| dc.date.issued | 2026 | - |
| dc.date.submission | 2026/06/09 17:32:09 | * |
| dc.description.abstracteng | We present Uncertainty-Gated Lexical Decoding (UGLD), a decoding-time framework for fine-grained lexical control in Large Language Models (LLMs) that explicitly addresses the trade-off between controllability and fluency. UGLD adaptively scales intervention through an entropy-based gating mechanism derived from the model’s predictive distribution, activating control when uncertainty is high and limiting interference when predictions are confident. The method supports both promotion toward and against predefined vocabularies. We evaluate UGLD in Italian on two open-weight LLMs (ANITA 8B and Qwen 3 4B) across paraphrasing and free-text generation settings, considering Simple Vocabulary Conditioning and Jargon Reduction scenarios. Automatic evaluation shows consistent improvements in lexical coverage over standard decoding strategies, while human evaluation confirms that fluency is preserved under controlled intervention. | - |
| dc.description.allpeople | Papucci, Michele; Venturi, Giulia; Dell'Orletta, Felice | - |
| dc.description.allpeopleoriginal | Michele Papucci, Giulia Venturi, Felice Dell'Orletta | en |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 3 | - |
| dc.identifier.isbn | 978-2-493814-91-3 | en |
| dc.identifier.source | manual | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/586463 | - |
| dc.identifier.url | http://www.italianlp.it/wp-content/uploads/2026/05/papuccietalreadixtsar2026.pdf | en |
| dc.language.iso | eng | en |
| dc.relation.allauthors | Michele Papucci, Giulia Venturi, Felice Dell’Orletta | en |
| dc.relation.conferencedate | 11 May 2026 | en |
| dc.relation.conferencename | Joint Workshop on Readability and Text Simplification (READIxTSAR) | en |
| dc.relation.conferenceplace | Palma de Mallorca | en |
| dc.relation.firstpage | 89 | en |
| dc.relation.ispartofbook | Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026 | en |
| dc.relation.lastpage | 100 | en |
| dc.relation.numberofpages | 12 | en |
| dc.subject.keywordseng | Controlled Text Generation, Lexically Constrained Decoding, Entropy-Gated Decoding | - |
| dc.subject.singlekeyword | Controlled Text Generation | * |
| dc.subject.singlekeyword | Lexically Constrained Decoding | * |
| dc.subject.singlekeyword | Entropy-Gated Decoding | * |
| dc.title | Lexical Conditioning of Model's Distribution through Uncertainty-gated Soft-Mixing of Probabilities | en |
| dc.type.circulation | International | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| iris.orcid.lastModifiedDate | 2026/06/09 17:32:09 | * |
| iris.orcid.lastModifiedMillisecond | 1781019129525 | * |
| iris.sitodocente.maxattempts | 1 | - |
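
The abstract describes an entropy-based gate that scales a lexical intervention on the model's next-token distribution. The record does not give the formulation, so the following is only a minimal sketch of how such a gate could work, not the authors' method. Every name here (`normalized_entropy`, `gated_mix`, `max_strength`, `promote`) is hypothetical, and both the linear gate and the renormalized target distribution are assumptions made for illustration.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of `probs` divided by log(len(probs)), so it lies in [0, 1]."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def gated_mix(probs, vocab_ids, max_strength=1.0, promote=True):
    """Hypothetical entropy-gated soft mixing of next-token probabilities.

    probs:        next-token probabilities (assumed to sum to 1)
    vocab_ids:    set of token ids belonging to the predefined vocabulary
    max_strength: assumed upper bound on the mixing weight
    promote:      True steers toward the vocabulary, False steers away from it
    """
    # Tokens that should receive mass in the target distribution.
    keep = [(i in vocab_ids) == promote for i in range(len(probs))]
    mass = sum(p for p, k in zip(probs, keep) if k)
    if mass == 0.0:
        # Nothing to redistribute toward: leave the distribution untouched.
        return list(probs)

    # Target: the model's own distribution restricted to the kept tokens.
    target = [p / mass if k else 0.0 for p, k in zip(probs, keep)]

    # Gate grows with uncertainty; clamp to [0, 1] to absorb rounding error.
    gate = min(1.0, max(0.0, max_strength * normalized_entropy(probs)))

    # Convex combination, so the result is still a probability distribution.
    return [(1.0 - gate) * p + gate * q for p, q in zip(probs, target)]


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]
    uncertain = [0.25, 0.25, 0.25, 0.25]
    print(gated_mix(confident, {1}))
    print(gated_mix(uncertain, {1}))
```

Under these assumptions, a confident distribution is barely altered and its top token stays on top, while a near-uniform distribution has almost all of its mass moved onto the vocabulary tokens. Passing `promote=False` applies the same mixing toward the complement of the vocabulary.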


