(Re)Using OpeNER and PANACEA Web Services in the CLARIN Research Infrastructure

Del Gratta Riccardo
First author
Writing – Original Draft Preparation
2017

Abstract

We describe the implications of (re)using the OpeNER and PANACEA Web Services in the CLARIN Research Infrastructure. The analyzed tools are of great interest to specific communities, such as academics and small businesses focused on sentiment/opinion analysis and on Machine Translation and related technologies, but their outcomes may be of great importance to the wider CLARIN audience as well. The Virtual Language Observatory lists many lexical resources for sentiment analysis but only a few tools, whereas many lexical resources and tools are available for Machine Translation. This suggests that the latter community is already present in CLARIN, while the former still needs to be actively engaged. While the community-related challenges are political, the interoperability issues are clearly technical. The initiative is carried out at the ILC4CLARIN center in Pisa, the leading center of the CLARIN-IT national consortium.

The common ground between the two projects is not limited to tools and Web Services, to the creation of annotated corpora and lexicons, or to their focus on specific communities. Both projects are also based on, and strongly promote, the concept of interoperability. This is clear from the use of the Kyoto Annotation Format (KAF) in OpeNER, of the Graph Annotation Format (GrAF) in PANACEA, and of the Lexical Markup Framework (LMF) in both. Data and tool interoperability is also a key asset in both CLARIN (https://www.clarin.eu/event/2017/clarin-workshop-towards-interoperability-lexico-semantic-resources) and EUDAT (https://eudat.eu/communities/an-eudat-based-fair-data-approach-for-data-interoperability). Within CLARIN, initiatives such as the Language Resource Switchboard (LRS) and WebLicht openly work towards methodologies and "systems" that address interoperability issues.

From a technical point of view, the main issues are briefly reported below:

1. Many tools in OpeNER and PANACEA are command-line tools;
2. OpeNER offers both POST and GET APIs;
3. PANACEA built its Web Services using Soaplab and offers SOAP Web Services;
4. KAF, LMF and GrAF guarantee interoperability among data and services;
5. Simple pipelines are available in OpeNER, while a workflow engine has been used in PANACEA.

Tools are already wrapped, but to fully meet the requirements of both LRS and WebLicht we have to build a new shell around the command-line tools so that the REST APIs can accept both POST and GET requests and accept/produce different formats. Indeed, while the Language Resource Switchboard accepts tools with their own output format but requires them to read plain-text data from a URL, WebLicht accepts tools that read and write the TCF format. OpeNER requires that the core (the command-line tool) be wrapped in a REST shell, whereas the PANACEA Web Services need REST APIs around a SOAP core. In the final paper, we will finalize the technical aspects and describe how the User Involvement group can play an important role in engaging the sentiment/opinion community in CLARIN.
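
The idea of a REST shell around a command-line core can be illustrated with a minimal sketch. It is not the implementation used at ILC4CLARIN; it only assumes a tool that reads text on standard input and writes its result to standard output. The names `TOOL_CMD`, `run_tool`, `fetch_plain_text`, `ToolHandler`, `make_server` and the `input` query parameter are hypothetical, and the wrapped "tool" is a stand-in that merely upper-cases its input. GET requests pass a URL pointing to plain text, while POST requests carry the text in the request body.

```python
import subprocess
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen

# Hypothetical stand-in for a real command-line tool: it upper-cases stdin.
TOOL_CMD = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def run_tool(text, cmd=None):
    """Pipe text into the command-line core and return its standard output."""
    result = subprocess.run(cmd or TOOL_CMD, input=text,
                            capture_output=True, text=True, check=True)
    return result.stdout


def fetch_plain_text(url):
    """Download the plain-text input that a GET request points to."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


class ToolHandler(BaseHTTPRequestHandler):
    """REST shell: both GET and POST end up calling the same core."""

    def _reply(self, status, body):
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        # GET: the input is referenced by URL (assumed parameter name: input).
        params = parse_qs(urlparse(self.path).query)
        if "input" not in params:
            self._reply(400, "missing 'input' parameter\n")
            return
        self._reply(200, run_tool(fetch_plain_text(params["input"][0])))

    def do_POST(self):
        # POST: the input text is sent directly in the request body.
        length = int(self.headers.get("Content-Length", 0))
        self._reply(200, run_tool(self.rfile.read(length).decode("utf-8")))


def make_server(port=8080):
    """Build (but do not start) an HTTP server exposing the wrapped tool."""
    return HTTPServer(("127.0.0.1", port), ToolHandler)
```

Calling `make_server().serve_forever()` would start the service locally. Format conversion (for example to and from TCF) is deliberately left out; in this sketch it would sit between the request handling and `run_tool`.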
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Del Gratta Riccardo en
dc.collection.id.s 33fc2b58-b895-438b-9d2a-2c5bc86a83a6 *
dc.collection.name 04.04 Unpublished presentation/communication (conference, event, webinar...) *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/20 23:44:06 -
dc.date.available 2024/02/20 23:44:06 -
dc.date.firstsubmission 2024/10/03 10:34:50 *
dc.date.issued 2017 -
dc.date.submission 2024/12/18 18:40:23 *
dc.description.affiliations CNR-ILC -
dc.description.allpeople DEL GRATTA, Riccardo -
dc.description.allpeopleoriginal Del Gratta Riccardo en
dc.description.fulltext open en
dc.description.numberofauthors 1 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/336808 -
dc.identifier.url https://office.clarin.eu/v/CE-2019-1512_CLARIN2019_ConferenceProceedings.pdf en
dc.language.iso eng en
dc.miur.last.status.update 2024-10-03T08:32:34Z *
dc.relation.conferencedate 30/11/2017, 1/12/2017 en
dc.relation.conferencename Digital Infrastructures for Research 2017 en
dc.relation.conferenceplace Brussels, The Square Meeting Centre en
dc.relation.ispartofbook Digital Infrastructures for Research 2017 en
dc.subject.keywords Web Services -
dc.subject.keywords Clarin -
dc.subject.keywords Research Infrastructures -
dc.subject.singlekeyword Web Services *
dc.subject.singlekeyword Clarin *
dc.subject.singlekeyword Research Infrastructures *
dc.title (Re)Using OpeNER and PANACEA Web Services in the CLARIN Research Infrastructure en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.04 Unpublished presentation/communication (conference, event, webinar...) it
dc.type.miur -2 -
dc.type.referee Yes, but type not specified en
dc.ugov.descaux1 382031 -
iris.mediafilter.data 2025/04/15 04:38:01 *
iris.orcid.lastModifiedDate 2024/12/19 16:57:29 *
iris.orcid.lastModifiedMillisecond 1734623849284 *
iris.sitodocente.maxattempts 1 -
Appears in types: 04.04 Unpublished presentation/communication (conference, event, webinar...)
Files in this record:
prod_382031-doc_129684.pdf (open access)
Description: (Re)Using OpeNER and PANACEA Web Services in the CLARIN Research Infrastructure
Type: Pre-print document
License: Creative Commons
Size: 401.68 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/336808