ViSketch-GPT: collaborative multi-scale feature extraction for hand-drawn sketch retrieval

Federico G.; Carrara F.; Gennaro C.; Di Benedetto M.
2026

Abstract

Understanding hand-drawn sketches is challenging because of the wide variation in how they are drawn. Federico et al. [10] demonstrated that recognizing complex structural patterns improves both sketch recognition and generation. Building on this foundation, we explore how the extracted features can also be leveraged for hand-drawn sketch retrieval. In this work, we extend ViSketch-GPT, a multi-scale context extraction model originally designed for classification and generation, to the task of retrieval. The model's ability to capture intricate details at multiple scales allows it to learn highly discriminative representations, making it well suited to retrieval applications. Through extensive experiments on the QuickDraw and TU-Berlin datasets, we show that ViSketch-GPT surpasses state-of-the-art methods in sketch retrieval, achieving substantial improvements across multiple evaluation metrics. These results indicate that feature representations originally designed for classification and generation are also highly effective for retrieval. This highlights ViSketch-GPT as a versatile and powerful framework for various applications in computer vision and sketch analysis.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
9783032060686
9783032060693
AI
DDPM
Generative AI
LLM
Machine Learning
Retrieval
Signed Distance Function
Sketch
Files in this record:

SISAP_2025.pdf

open access

Description: ViSketch-GPT Collaborative Multi-scale Feature Extraction For Hand-Drawn Sketch Retrieval
Type: Pre-print
License: Creative Commons
Size: 2.77 MB
Format: Adobe PDF

978-3-032-06069-3_1.pdf

not available

Description: ViSketch-GPT Collaborative Multi-scale Feature Extraction For Hand-Drawn Sketch Retrieval
Type: Publisher's version (PDF)
License: Not public - private/restricted access
Size: 1.96 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/559864