Multiple Instance Learning by Polyhedral Approaches
Annabella Astorino
2021
Abstract
Multiple Instance Learning (MIL) is a variant of traditional supervised learning that has received considerable attention due to its applicability to real-world problems such as drug activity prediction and image classification. In particular, we are interested in the binary classification case, where the objective is to construct a classifier on the basis of positive and negative training examples. The main difference from the traditional supervised learning scenario lies in the nature of the learning examples: each example is represented not by a fixed-length feature vector but by a bag of feature vectors, referred to as instances. Classification labels are provided only for entire training bags, whereas the labels of the instances inside them are unknown. The task is to learn a model that predicts the labels of new incoming bags together with the labels of the instances inside them. In this work we tackle the MIL problem by polyhedral approaches. The idea is to generate a polyhedral separation surface, characterized by a finite number of hyperplanes, such that for each positive bag at least one of its instances is inside the polyhedron, while all the instances of each negative bag are outside. This leads to nonlinear, nonconvex, nonsmooth optimization problems of DC (Difference of Convex) type, which we solve by adapting the DC Algorithm (DCA). The results of our implementation on a number of benchmark classification datasets are presented.
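The separation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it covers only the prediction step for a polyhedron that is already given, and does not attempt the DC optimization that finds the hyperplanes. The sign convention (a point is inside when w·x + b <= 0 for every hyperplane), the function names, and the unit-square example data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# Assumed convention: a pair (w, b) encodes the half-space w.x + b <= 0.
Hyperplane = Tuple[Vector, float]


def inside_polyhedron(x: Vector, hyperplanes: List[Hyperplane]) -> bool:
    """True if x lies in the intersection of all the half-spaces."""
    return all(
        sum(wi * xi for wi, xi in zip(w, x)) + b <= 0.0
        for w, b in hyperplanes
    )


def predict_instances(bag: List[Vector], hyperplanes: List[Hyperplane]) -> List[bool]:
    """Label each instance: True (positive) if it is inside the polyhedron."""
    return [inside_polyhedron(x, hyperplanes) for x in bag]


def predict_bag(bag: List[Vector], hyperplanes: List[Hyperplane]) -> bool:
    """A bag is positive if at least one instance is inside the polyhedron."""
    return any(predict_instances(bag, hyperplanes))


# Hypothetical polyhedron: the unit square [0, 1] x [0, 1] in the plane.
UNIT_SQUARE: List[Hyperplane] = [
    ((-1.0, 0.0), 0.0),   # -x1 <= 0
    ((1.0, 0.0), -1.0),   # x1 - 1 <= 0
    ((0.0, -1.0), 0.0),   # -x2 <= 0
    ((0.0, 1.0), -1.0),   # x2 - 1 <= 0
]

# Hypothetical bags: one instance of the first bag falls inside the square,
# while every instance of the second bag falls outside.
positive_bag = [(2.0, 2.0), (0.5, 0.5)]
negative_bag = [(2.0, 2.0), (-1.0, 0.3)]

print(predict_bag(positive_bag, UNIT_SQUARE))
print(predict_bag(negative_bag, UNIT_SQUARE))
```

Because the bag label is derived from the instance labels, the same polyhedron yields both levels of prediction mentioned in the abstract.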