ViSketch-GPT: collaborative multi-scale feature extraction for hand-drawn sketch retrieval
Federico G.; Carrara F.; Gennaro C.; Di Benedetto M.
2026
Abstract
Understanding the nature of hand-drawn sketches is challenging due to the wide variation in their creation. Federico et al. [10] demonstrated that recognizing complex structural patterns enhances both sketch recognition and generation. Building on this foundation, we explore how the extracted features can also be leveraged for hand-drawn sketch retrieval. In this work, we extend ViSketch-GPT, a multi-scale context extraction model originally designed for classification and generation, to the task of retrieval. The model’s ability to capture intricate details at multiple scales allows it to learn highly discriminative representations, making it well-suited for retrieval applications. Through extensive experiments on the QuickDraw and TU-Berlin datasets, we show that ViSketch-GPT surpasses state-of-the-art methods in sketch retrieval, achieving substantial improvements across multiple evaluation metrics. Our results show that the extracted feature representations, originally designed for classification and generation, are also highly effective for retrieval tasks. This highlights ViSketch-GPT as a versatile and powerful framework for various applications in computer vision and sketch analysis.

| File | Access | Description | Type | License | Size | Format |
|---|---|---|---|---|---|---|
| SISAP_2025.pdf | Open access | ViSketch-GPT Collaborative Multi-scale Feature Extraction For Hand-Drawn Sketch Retrieval | Pre-print | Creative Commons | 2.77 MB | Adobe PDF |
| 978-3-032-06069-3_1.pdf | Not available | ViSketch-GPT Collaborative Multi-scale Feature Extraction For Hand-Drawn Sketch Retrieval | Publisher's version (PDF) | Not public (private/restricted access) | 1.96 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


