Polyhedral separation in Multiple Instance Learning problems

Astorino, Annabella; Avolio, Matteo; Fuduli, Antonio

Multiple Instance Learning (MIL) is a variant of traditional supervised learning where the main difference is in the nature of the learning examples. In fact, each example is not represented by a single vector of features but by a set (called bag) of feature vectors (called instances). The classification labels of the training bags are known whereas the labels of the instances inside them are unknown. The task of MIL is to learn a model that predicts the labels of the new incoming bags together the labels of the instances inside them. In this work we tackle the MIL problem for the binary case by constructing a polyhedral classifier on the basis of positive and negative training examples. In particular, the idea is to generate a polyhedral separation surface characterized by a finite number of hyperplanes such that, for each positive bag, at least one of its instances is inside the polyhedron and all the instances of each negative bag are outside. We come out with nonlinear nonconvex nonsmooth optimization problems of DC (Difference of Convex) type that we solve by adapting the DCA algorithm. The results of our implementation on a number of benchmark classification datasets are presented.