CoPhIR: a test collection for content-based image retrieval

Bolettieri P; Esuli A; Falchi F; Lucchese C; Perego R
2009

Abstract

The scalability, as well as the effectiveness, of the different Content-based Image Retrieval (CBIR) approaches proposed in the literature is today an important research issue. Given the wealth of images on the Web, CBIR systems must in fact leap towards Web-scale datasets. In this paper, we report on our experience in building a test collection of 100 million images, with the corresponding descriptive features, to be used in experimenting with new scalable techniques for similarity searching and in comparing their results. In the context of the SAPIR (Search on Audio-visual content using Peer-to-peer Information Retrieval) European project, we had to test our distributed similarity searching technology on a realistic data set. Therefore, since no large-scale collection was available for research purposes, we had to tackle the non-trivial process of image crawling and descriptive feature extraction (we used five MPEG-7 features) using the European EGEE computer GRID. The result of this effort is CoPhIR, the first CBIR test collection of such scale. CoPhIR is now open to the research community for experiments and comparisons, and access to the collection has already been granted to more than 50 research groups worldwide.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Information Search and Retrieval
CBIR
Collection
Image
Digital Libraries
Files in this record:
prod_172793-doc_114188.pdf (open access)
Description: CoPhIR: a test collection for content-based image retrieval
Size: 2.16 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/151338