Concept similarity by evaluating Information Contents and Feature Vectors: a combined approach
Formica A
2009
Abstract
Evaluating the semantic similarity of concepts is a problem that has been extensively investigated in several areas, including Artificial Intelligence, Cognitive Science, Databases, and Software Engineering. It is growing in importance in settings such as digital libraries, heterogeneous databases, and, in particular, the Semantic Web. In these contexts, concepts are very often organized according to a taxonomy (or hierarchy) and are also associated with structures (also referred to as feature vectors). In general, the concept similarity measures proposed in the literature have not been conceived to address both of these levels of information (taxonomy and structure): some contributions focus on the similarity of hierarchically related concepts, while others compare concept feature vectors. This article presents a method for evaluating the similarity of concepts that considers both the concept taxonomy and the concept structures. The method is defined by combining and revisiting (i) the information content approach, for comparing concepts within the taxonomy, and (ii) a method inspired by the maximum weighted matching problem in bipartite graphs, for comparing feature vectors. The proposed approach is then compared with two of the most representative similarity measures defined in the literature, and an experiment on a small data set shows that the proposed measure reduces the gap between them.
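The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the article's actual formulation: the abstract does not give the formulas, so the sketch assumes a Lin-style information content ratio for the taxonomy part and a brute-force maximum weighted bipartite matching for the feature vectors. The function names, the probability inputs, and the normalisation by the larger feature vector are all hypothetical choices made for illustration, and the way the article combines the two scores is deliberately left out.

```python
import math
from itertools import permutations


def ic_similarity(p1, p2, p_lcs):
    """Lin-style information content similarity (an assumed formulation).

    p1, p2: probabilities of the two concepts; p_lcs: probability of their
    least common subsumer in the taxonomy. All must lie in (0, 1].
    """
    ic1, ic2, ic_lcs = -math.log(p1), -math.log(p2), -math.log(p_lcs)
    if ic1 + ic2 == 0:
        # Both concepts carry no information (probability 1): treat as identical.
        return 1.0
    return 2 * ic_lcs / (ic1 + ic2)


def matching_similarity(weights):
    """Score two feature vectors by a maximum weighted bipartite matching.

    weights[i][j] is an assumed pairwise similarity in [0, 1] between
    feature i of the first concept and feature j of the second.
    Brute force, so only suitable for small feature vectors.
    """
    if not weights or not weights[0]:
        return 0.0
    rows, cols = len(weights), len(weights[0])
    if rows > cols:
        # Transpose so that every row can be assigned a distinct column.
        weights = [list(col) for col in zip(*weights)]
        rows, cols = cols, rows
    best = max(
        sum(weights[i][perm[i]] for i in range(rows))
        for perm in permutations(range(cols), rows)
    )
    # Hypothetical normalisation: divide by the size of the larger vector.
    return best / cols


taxonomy_score = ic_similarity(0.25, 0.25, 0.5)
structure_score = matching_similarity([[0.9, 0.8], [0.8, 0.1]])
```

In the second call a greedy pairing would take the 0.9 entry first and be left with 0.1, whereas the optimal matching pairs the two 0.8 entries, which is why a matching formulation is preferable to a greedy one. For larger feature vectors, a polynomial-time assignment solver would replace the enumeration of permutations.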


