CNR Institutional Research Information System

Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS

Lokoc J; Andreadis S; Bailer W; Duane A; Gurrin C; Ma Z; Messina N; Nguyen T N; Peska L; Rossetto L; Sauter L; Schall K; Schoeffmann K; Khan OS; Spiess F; Vadicamo L; Vrochidis S

2023

Abstract

This paper presents the findings of the eleventh Video Browser Showdown (VBS) competition, in which sixteen teams competed in known-item and ad-hoc search tasks. Many teams used state-of-the-art video retrieval approaches that proved highly effective in challenging search scenarios. The paper gives a broad survey of all approaches used, together with an analysis of the participating teams' performance. Specifically, it reports high-level performance indicators with overall statistics, as well as an in-depth analysis of the performance of selected tools that implement result set logging. The analysis provides evidence that the CLIP model is a versatile tool for cross-modal video retrieval when combined with interactive search capabilities. It also investigates how different users and text query properties affect performance in search tasks. Finally, lessons learned from search task preparation are presented, and a new direction for tasks based on ad-hoc search at the Video Browser Showdown is introduced.

Year: 2023

Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI

Keywords: Content-based retrieval; Evaluations; Interactive video retrieval; Video browsing; Video content analysis

Appears in types: 01.01 Journal article

Files in this record:

File	Access	Description	Type	License	Size	Format
prod_486052-doc_201562.pdf	Open access	Preprint - Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS	Post-print document	Creative commons	785.56 kB	Adobe PDF
prod_486052-doc_201623.pdf	Authorized users only	Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS	Publisher's version (PDF)	Not public - private/restricted access	1.88 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/463462
