High-performance annotation tagging over Solr full-text indexes

Manghi P; Artini M; Bardi A; Atzori C; La Bruzzo S; Mikulicic M
2014

Abstract

In this work, we focus on the problem of annotation tagging over information spaces of objects stored in a full-text index. In such a scenario, data curators assign tags to objects for classification purposes, while generic end users perceive tags as searchable and browsable object properties. To carry out their activities, data curators need annotation tagging tools that let them bulk tag or untag large sets of objects in temporary work sessions, where they can experiment virtually and in real time with the effects of their actions before making the changes visible to end users. Implementing such tools over full-text indexes is challenging because bulk object updates in this context are far from real-time and, in critical cases, may slow down index performance. We devised TagTick, a tool that offers data curators a fully functional annotation tagging environment over the full-text index Apache Solr, regarded as a de facto standard in this area. TagTick consists of a TagTick Virtualizer module, which extends the Solr API to support real-time, virtual, bulk-tagging operations, and a TagTick User Interface module, which offers end-user functionalities for annotation tagging. The tool scales optimally with the number and size of bulk tag operations without compromising index performance.
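The idea of a temporary work session, in which bulk tag and untag operations are visible to the curator but not yet to end users, can be illustrated with a minimal in-memory sketch. This is not TagTick's implementation and does not use the Solr API: the names `Doc`, `VirtualTagSession`, `bulk_tag`, `bulk_untag`, `effective_tags` and `commit` are all hypothetical, and a plain Python predicate stands in for an index query. The sketch assumes queries depend only on object text, not on tags.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for an indexed object."""
    doc_id: str
    text: str
    tags: Set[str] = field(default_factory=set)  # tags visible to end users


# Hypothetical: a predicate over a document stands in for an index query.
Query = Callable[[Doc], bool]


class VirtualTagSession:
    """Records bulk operations without touching the committed tags."""

    def __init__(self, index: Dict[str, Doc]) -> None:
        self.index = index
        self.pending: List[Tuple[str, str, Query]] = []  # (action, tag, query)

    def bulk_tag(self, tag: str, query: Query) -> None:
        # Constant-time: only the operation is stored, no object is updated.
        self.pending.append(("add", tag, query))

    def bulk_untag(self, tag: str, query: Query) -> None:
        self.pending.append(("remove", tag, query))

    def effective_tags(self, doc_id: str) -> Set[str]:
        # Replay pending operations in order over a copy of the committed tags.
        doc = self.index[doc_id]
        tags = set(doc.tags)
        for action, tag, query in self.pending:
            if query(doc):
                if action == "add":
                    tags.add(tag)
                else:
                    tags.discard(tag)
        return tags

    def commit(self) -> None:
        # Make the session's changes visible by rewriting the committed tags.
        for doc in self.index.values():
            doc.tags = self.effective_tags(doc.doc_id)
        self.pending.clear()


index = {
    "d1": Doc("d1", "solr full-text index"),
    "d2": Doc("d2", "annotation tagging tool"),
}
session = VirtualTagSession(index)
session.bulk_tag("search", lambda d: "index" in d.text)

print(session.effective_tags("d1"))  # curator's virtual view of d1
print(index["d1"].tags)              # committed view, unchanged before commit
session.commit()
print(index["d1"].tags)              # committed view after commit
```

In this sketch, recording an operation costs the same regardless of how many objects it matches; the per-object work is deferred to read time and to `commit`. Whether TagTick makes the same trade-off is not stated in the abstract.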
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Annotation Tagging
Virtual tagging
TagTick

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/222884