Protection of one's intellectual property is a topic with important technological and legal facets. We provide mechanisms for establishing the ownership of a dataset consisting of multiple objects. The algorithms also preserve important properties of the dataset, which are important for mining operations, and so guarantee both right protection and utility preservation. We consider a right-protection scheme based on watermarking. Watermarking may distort the original distance graph. Our watermarking methodology preserves important distance relationships, such as: the Nearest Neighbors (NN) of each object and the Minimum Spanning Tree (MST) of the original dataset. This leads to preservation of any mining operation that depends on the ordering of distances between objects, such as NN-search and classification, as well as many visualization techniques. We prove fundamental lower and upper bounds on the distance between objects post-watermarking. In particular, we establish a restricted isometry property, i.e., tight bounds on the contraction/expansion of the original distances. We use this analysis to design fast algorithms for NN-preserving and MST-preserving watermarking that drastically prune the vast search space. We observe two orders of magnitude speedup over the exhaustive schemes, without any sacrifice in NN or MST preservation.

Right-protected data publishing with provable distance-based mining

Lucchese C;
2014

Abstract

Protection of one's intellectual property is a topic with important technological and legal facets. We provide mechanisms for establishing the ownership of a dataset consisting of multiple objects. The algorithms also preserve important properties of the dataset, which are important for mining operations, and so guarantee both right protection and utility preservation. We consider a right-protection scheme based on watermarking. Watermarking may distort the original distance graph. Our watermarking methodology preserves important distance relationships, such as: the Nearest Neighbors (NN) of each object and the Minimum Spanning Tree (MST) of the original dataset. This leads to preservation of any mining operation that depends on the ordering of distances between objects, such as NN-search and classification, as well as many visualization techniques. We prove fundamental lower and upper bounds on the distance between objects post-watermarking. In particular, we establish a restricted isometry property, i.e., tight bounds on the contraction/expansion of the original distances. We use this analysis to design fast algorithms for NN-preserving and MST-preserving watermarking that drastically prune the vast search space. We observe two orders of magnitude speedup over the exhaustive schemes, without any sacrifice in NN or MST preservation.
2014
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Watermarking
Nearest neighbors (NN)
Minimum spanning tree (MST)
Restricted isometry property (RIP)
File in questo prodotto:
File Dimensione Formato  
prod_295716-doc_84958.pdf

solo utenti autorizzati

Descrizione: Right-protected data publishing with provable distance-based mining
Tipologia: Versione Editoriale (PDF)
Dimensione 2.16 MB
Formato Adobe PDF
2.16 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/265646
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 8
  • ???jsp.display-item.citation.isi??? 6
social impact