Towards a language for representing and managing the semantics of big data

Oro Ermelinda; Ruffolo Massimo
2014

Abstract

The amount of data in our world has been exploding. Integrating, managing and analyzing large amounts of data (i.e. Big Data) will become a key issue for businesses that want to operate and compete better in today's markets. Data are only useful if used in a smart way. We introduce the concept of Smart Data: structured and unstructured big data from the web and the enterprise, with explicit and implicit semantics, that leverages context to understand intent, in order to better drive business processes and support better-informed decision making. This paper proposes a language for representing Big Data based on ontologies, and a system implementing an approach capable of satisfying the increasing need for efficiency and scalability in semantic data management. The proposed MANTRA Language allows for: (i) representing the semantics of data by knowledge representation constructs; (ii) acquiring data from disparate heterogeneous sources (e.g. databases, documents); (iii) integrating and managing data; (iv) reasoning and querying with Big Data. The syntax of the proposed language is partially derived from logic programming, but the semantics is completely revised. The novelty of the language is that a class can be thought of as a flexible collection of structurally heterogeneous individuals that have different properties (schema-less). The language also supports efficient querying and reasoning to reveal implicit knowledge. These features are achieved by using a triple-based data persistence model and a scalable NoSQL storage system.
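The abstract does not show MANTRA syntax, so the following is only a rough Python sketch of the general idea it describes: schema-less individuals stored as triples, with a simple inference step that reveals implicit class membership. Everything here is an assumption made for illustration, including the `TripleStore` class, the `type` and `subClassOf` predicate names, and the in-memory set that stands in for a scalable NoSQL store. It is not the authors' implementation.

```python
class TripleStore:
    """Toy in-memory triple store (hypothetical; stands in for a NoSQL backend)."""

    def __init__(self):
        self._triples = set()    # all (subject, predicate, object) triples
        self._by_subject = {}    # subject -> set of its triples

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject.setdefault(subject, set()).add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return a list of triples matching the pattern; None is a wildcard."""
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return [
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]

    def individuals_of(self, cls):
        """Sorted names of all individuals asserted or inferred to be in cls."""
        return sorted(s for s, _, _ in self.match(predicate="type", obj=cls))

    def properties_of(self, subject):
        """Properties of one individual; there is no fixed schema per class.

        For brevity this keeps a single value per property.
        """
        return {p: o for _, p, o in self.match(subject=subject) if p != "type"}

    def infer_types(self):
        """Forward-chain 'X type C' and 'C subClassOf D' into 'X type D'
        until no new triple is produced."""
        changed = True
        while changed:
            changed = False
            for c, _, d in self.match(predicate="subClassOf"):
                for x, _, _ in self.match(predicate="type", obj=c):
                    if (x, "type", d) not in self._triples:
                        self.add(x, "type", d)
                        changed = True


store = TripleStore()
store.add("Employee", "subClassOf", "Person")
# Two individuals of related classes with entirely different properties.
store.add("alice", "type", "Employee")
store.add("alice", "email", "alice@example.org")
store.add("bob", "type", "Person")
store.add("bob", "age", 42)

store.infer_types()
print(store.individuals_of("Person"))
print(store.properties_of("alice"))
print(store.properties_of("bob"))
```

In this sketch `alice` is never declared a `Person` directly; the membership is derived from the subclass triple, which is one simple form of the implicit knowledge the abstract mentions. Because each individual is just a set of triples, `alice` and `bob` can belong to the same class while carrying different properties.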
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
ISBN: 9789897580154
Keywords: Big Data; Data integration; Database; Extract Transform and Load (ETL); Knowledge representation and reasoning; NoSQL; Smart Data; Smart ETL; Unstructured data
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/281433