Technology identification from patent texts: a novel named entity recognition method

Puccetti G. et al.
2022

Abstract

Identifying technologies is a key step in mapping a domain and its evolution. It allows managers and decision makers to anticipate trends, supporting accurate forecasting and effective foresight. Researchers and practitioners are taking advantage of the rapid growth of publicly accessible sources to map technological domains. Among these sources, patents are the broadest open-access technical database used in the literature and in practice. Natural Language Processing (NLP) techniques now enable new methods for analysing patent texts. In this paper we explore one such technique, Named Entity Recognition (NER), to identify the technologies mentioned in patent texts. We compare three NER methods, gazetteer-based, rule-based and deep learning-based (e.g. BERT), measuring their performance in terms of precision, recall and computational time. We test the approaches on 1600 patents from four assorted IPC classes as case studies. Our NER systems collected over 4500 fine-grained technologies, with the best results achieved by combining the three methodologies. The proposed method improves on the existing literature thanks to its ability to filter out generic technological terms. Our study delineates a valid technology identification tool that can be integrated into any text analysis pipeline to support academics and companies in investigating a technological domain.
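To make the comparison in the abstract concrete, the following is a minimal sketch of how a gazetteer-based and a rule-based extractor might be combined and scored with precision and recall. It is not the authors' implementation: the gazetteer entries, the head-noun rule, the list of generic modifiers, the sample sentence and the gold annotations are all hypothetical, and the deep learning-based (BERT) component is left out entirely.

```python
import re

# Hypothetical gazetteer: an illustrative term list, not the paper's lexicon.
GAZETTEER = {"neural network", "lithium-ion battery", "touch screen"}

# Hypothetical rule: one modifier word followed by a technology head noun.
RULE = re.compile(r"\b([a-z]+) (sensor|actuator|antenna)\b")

# Hypothetical filter for generic mentions such as "the sensor".
GENERIC_MODIFIERS = {"a", "an", "the", "said"}


def gazetteer_ner(text):
    """Return the gazetteer terms that occur in the text as whole words."""
    lowered = text.lower()
    return {
        term
        for term in GAZETTEER
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def rule_ner(text):
    """Return rule matches whose modifier is not a generic word."""
    found = set()
    for match in RULE.finditer(text.lower()):
        if match.group(1) not in GENERIC_MODIFIERS:
            found.add(match.group(0))
    return found


def combined_ner(text):
    """Union of the gazetteer-based and rule-based extractions."""
    return gazetteer_ner(text) | rule_ner(text)


def precision_recall(predicted, gold):
    """Precision and recall of a predicted term set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up sentence and gold annotations, for illustration only.
TEXT = (
    "The sensor unit includes a lithium-ion battery, a pressure sensor "
    "and an optical encoder, all monitored by a neural network."
)
GOLD = {
    "lithium-ion battery",
    "pressure sensor",
    "optical encoder",
    "neural network",
}

predicted = combined_ner(TEXT)
print(sorted(predicted))
print(precision_recall(predicted, GOLD))
```

In this toy setting the generic mention "the sensor" is discarded by the modifier filter, while "optical encoder" is missed by both extractors and therefore lowers recall; a learned model is the kind of component that could plausibly recover such unseen terms.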
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Information retrieval
Named entity recognition
Natural language processing
Patents
Technology analysis
Files in this record:

1-s2.0-S0040162522006813-main.pdf (authorized users only)
Description: Technology identification from patent texts: A novel named entity recognition method
Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access
Size: 2.23 MB
Format: Adobe PDF

techNER__REV_1_.pdf (open access)
Description: Technology Identification from Patent Texts: A Novel Named Entity Recognition Method
Type: Pre-print
License: Other license type
Size: 668.04 kB
Format: Adobe PDF

techNER__REV_1_-1.pdf (open access)
Description: This is the Author Accepted Manuscript (postprint) of the following paper: Puccetti G. et al. “Technology identification from patent texts: A novel named entity recognition method”, published in “Technological Forecasting and Social Change” Vol. 186, Part B, art. num. 122160, 2023. DOI: 10.1016/j.techfore.2022.122160.
Type: Post-print
License: No license declared (not assignable to items later than 2023)
Size: 675.69 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/521510
Citations
  • PMC: not available
  • Scopus: 34
  • ISI: 21