Mendeley Recommender System: Evaluation Measures

Dazzi, P.; Mordacchini, M.
2010

Abstract

This document reports on a set of widely used evaluation measures for recommender systems. The measures described here are well known within the recommender systems scientific community. In particular, the first four measures reported (Precision, Recall, MAE and ROC) assume prior knowledge of which results are "good" and which are "bad", without looking at the actual content of the objects involved. These measures are useful (and widely used) for evaluating recommender systems that try to mimic user behavior by estimating or predicting users' future choices. However, they can only give a binary evaluation of recommendations (good/not good), without capturing the real quality of either recommended or non-recommended results. To overcome this limitation, a series of other approaches have been proposed. Some of them derive a quality measure from the content of the objects, comparing the content of recommended items against a representative of the desired content (e.g. a query, the keywords of a paper, etc.). Another approach uses the context of objects, assuming that similar objects are related to other similar objects, and exploits object-to-object relationships to evaluate their similarity. Finally, a direct study of user reactions can be carried out: recommendations are presented to a set of selected users, who evaluate them directly, e.g. by rating them, clicking the related links, or opening the proposed paper. In the following, all these types of metrics are briefly introduced, and their respective strengths and weaknesses are discussed.
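To make the first group of measures concrete, below is a minimal sketch (not taken from the report) of how Precision, Recall and MAE can be computed once binary "good"/"bad" labels and true ratings are assumed to be known. The item identifiers and ratings in the usage example are purely illustrative.

```python
# Illustrative sketch: Precision, Recall and MAE under the assumption
# (made by these measures) that relevant ("good") items are known in advance.

def precision_recall(recommended, relevant):
    """Binary evaluation: fraction of recommended items that are known-good
    (precision) and fraction of known-good items that were recommended (recall)."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual user ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy usage (hypothetical item ids and ratings):
p, r = precision_recall(recommended=[1, 2, 3, 4], relevant=[2, 4, 5])
print(f"precision={p:.2f} recall={r:.2f}")            # precision=0.50 recall=0.67
print(f"MAE={mae([4.5, 3.0, 2.0], [5, 3, 1]):.2f}")   # MAE=0.50
```

Note how nothing in this computation inspects the content of the items themselves; it only compares identifiers and rating values against a predefined ground truth, which is exactly the limitation the abstract's content-based, context-based and user-study approaches aim to overcome.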
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Metrics; Recommender; Performance Metrics
File in this record: prod_161226-doc_132557.pdf (Mendeley Recommender System: Evaluation Measures), Adobe PDF, 196.09 kB. Access restricted to authorized users.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/155910