
Scale-invariant semantic segmentation of natural RGB-D images combining decision tree and deep learning models

L. Romeo; R. P. Devanna; R. Marani; G. Matranga; M. Biddoccu; A. Milella
2023

Abstract

In-field sensing systems for automatic yield monitoring are gaining increasing importance as they promise to considerably boost production. The development of artificial intelligence and sensing technologies to assist the human workforce also meets sustainability needs, supporting the ecological goals of current and future agricultural processes. In this context, image acquisition and processing systems are widely adopted to extract useful information for farmers. Although RGB-D cameras have been used in many applications for ground-based proximal sensing, relatively few works include depth information in the image analysis itself. In this work, both semantic and depth information from RGB-D vineyard images is used in a processing pipeline composed of a decision tree algorithm and a deep learning model. The goal is to achieve coherent semantic segmentation of a set of natural images acquired at both long and short distances, using a low-cost RGB-D camera in an experimental vineyard. The depth information of each image is fed into a decision tree to predict the distance of the acquired vines from the camera. Before being fed to the deep learning model, the images to be segmented are adapted according to the predicted distance. The results of semantic segmentation with and without the decision tree are compared, showing that depth information appears to be highly relevant in enhancing the accuracy and precision of the predicted semantic maps.
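As a rough illustration of the pipeline outlined in the abstract, the sketch below trains a decision tree on simple per-image depth statistics to predict the acquisition distance and then adapts each RGB frame before it would reach the segmentation network. This is a minimal sketch only: the choice of a scikit-learn DecisionTreeClassifier, the depth features, the two distance classes, the synthetic training data, and the centre-crop rule are assumptions made for illustration, not the configuration reported in the paper.

# Illustrative sketch of distance-aware preprocessing driven by a decision tree.
# The features, class labels, and crop rule are assumptions, not the authors' setup.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def depth_features(depth_map: np.ndarray) -> np.ndarray:
    """Summarise a per-pixel depth map (metres) into a small feature vector."""
    valid = depth_map[np.isfinite(depth_map) & (depth_map > 0)]
    return np.array([valid.mean(), np.median(valid), valid.std(), valid.min()])


# Train the decision tree on depth features with known acquisition distances
# (0 = short range, 1 = long range). The training data here are synthetic.
rng = np.random.default_rng(0)
short_maps = [rng.uniform(0.4, 1.2, (60, 80)) for _ in range(20)]
long_maps = [rng.uniform(2.0, 5.0, (60, 80)) for _ in range(20)]
X = np.stack([depth_features(d) for d in short_maps + long_maps])
y = np.array([0] * 20 + [1] * 20)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)


def preprocess_for_segmentation(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Adapt the RGB frame to the predicted acquisition distance before it is
    passed to the segmentation network (the crop factor is a placeholder)."""
    distance_class = tree.predict(depth_features(depth)[None, :])[0]
    if distance_class == 1:      # long range: keep the full frame
        return rgb
    h, w = rgb.shape[:2]         # short range: centre-crop to emulate a common scale
    return rgb[h // 4: 3 * h // 4, w // 4: 3 * w // 4]

Reducing the depth map to a handful of statistics keeps the distance classifier small and fast, which is consistent with the abstract's idea of a lightweight decision step placed ahead of the deep learning model.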
Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato - STIIMA (ex ITIA)
Istituto di Scienze e Tecnologie per l'Energia e la Mobilità Sostenibili - STEMS
Keywords: Semantic mapping; RGB-D sensing; Vineyard image segmentation; Deep learning


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/459621