Interactive multimodal video search: an extended post-evaluation for the VBS 2022 competition
Carrara F.; Vadicamo L.; Vairo C.
2024
Abstract
CLIP-based text-to-image retrieval proved very effective at the interactive video retrieval competition Video Browser Showdown 2022, where all three top-scoring teams had implemented a variant of a CLIP model in their system. Since the performance of these three systems was quite close, this post-evaluation was designed to gain better insight into the differences between the systems and to compare the CLIP-based text-query retrieval engines by introducing slight modifications to the original competition settings. An extended analysis of the overall results and of the retrieval performance of all the systems' functionalities shows that a strong text retrieval model certainly helps, but it has to be coupled with extensive browsing capabilities and other query modalities to consistently solve known-item search tasks in a large-scale video database.

File | Size | Format | |
---|---|---|---|
VBS22_Post_Evaluation.pdf (open access). Description: Interactive Multimodal Video Search: An Extended Post-Evaluation for the VBS 2022 Competition. Type: Post-print document. License: Creative Commons | 1.05 MB | Adobe PDF | View/Open |
2024-VBSSE-post evaluation_s13735-024-00325-9.pdf (open access). Description: Interactive multimodal video search: an extended post-evaluation for the VBS 2022 competition. Type: Publisher's version (PDF). License: Creative Commons | 1.5 MB | Adobe PDF | View/Open |