Biomass characterization with semantic segmentation models and point cloud analysis for precision viticulture
Marani R.; Guaragnella C.; D'Orazio T.
2024
Abstract
Scientific progress in artificial intelligence and robotics has enabled precision viticulture to pursue sustainability and improve the final yield. For instance, monitoring the canopy volume of each plant helps ensure the correct ripening of the bunches. In this context, this paper proposes a novel approach for characterizing biomass volume using images acquired in a vineyard with the low-cost Azure Kinect RGB-D camera. Semantic image segmentation is implemented using three encoder–decoder deep architectures (U-Net, DeepLabV3+, and MANet) to produce accurate masks of the vine leaf structure. In a transfer learning approach, a public dataset acquired with the Intel RealSense D435 depth camera is used to train the segmentation networks. Then, a complete pipeline to estimate possible changes in biomass volume is presented. Experiments are run to analyze the biomass removed during the trimming process of grapevine plants. The best segmentation result is obtained by the U-Net architecture with a ResNet50 backbone, reaching an accuracy of 92.10%, even though the training and test sets consist of images acquired by different cameras. However, the DeepLabV3+ network with a ResNeXt50 backbone, which scores an accuracy of 90.25% on the test set, gives the best estimate of the removed biomass while requiring the shortest training time. These outcomes demonstrate the potential of this automatic approach for controlling leaf growth and ensuring sustainable viticulture practices.

File | Size | Format | License
---|---|---|---
2024_COMPAG.pdf (Editorial Version, PDF; authorized users only) | 5.68 MB | Adobe PDF | Creative Commons
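The abstract describes a pipeline that segments vine leaves in RGB-D images and then estimates biomass volume changes before and after trimming. The paper's exact volume-estimation method is not detailed on this page; the sketch below only illustrates the general idea under stated assumptions: segmentation-masked depth pixels are back-projected to a point cloud with a pinhole camera model, and canopy volume is approximated by counting occupied voxels. The intrinsics (`fx`, `fy`, `cx`, `cy`) and the voxel size are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def mask_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project depth pixels selected by a binary leaf mask to a 3-D
    point cloud, using a standard pinhole camera model (assumed, not the
    paper's calibration)."""
    v, u = np.nonzero(mask & (depth > 0))   # pixel coordinates of valid leaf points
    z = depth[v, u]                         # depth in metres
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.column_stack([x, y, z])

def voxel_volume(points, voxel=0.05):
    """Approximate canopy volume as (number of occupied voxels) * voxel^3.
    The 5 cm voxel size is an illustrative assumption."""
    if len(points) == 0:
        return 0.0
    idx = np.floor(points / voxel).astype(np.int64)  # voxel index of each point
    occupied = len(np.unique(idx, axis=0))           # distinct occupied voxels
    return occupied * voxel ** 3

# Toy example: a flat "canopy" at 1 m depth, before and after trimming
# (the "after" mask keeps only the left half of the leaves).
depth = np.full((60, 80), 1.0)
mask_before = np.zeros((60, 80), dtype=bool); mask_before[10:50, 10:70] = True
mask_after = mask_before.copy(); mask_after[:, 40:] = False

fx = fy = 500.0; cx, cy = 40.0, 30.0     # hypothetical intrinsics
vol_before = voxel_volume(mask_to_points(depth, mask_before, fx, fy, cx, cy))
vol_after = voxel_volume(mask_to_points(depth, mask_after, fx, fy, cx, cy))
removed = vol_before - vol_after          # estimated trimmed biomass volume
```

Differencing the two voxelized volumes mirrors the before/after comparison the abstract describes for quantifying biomass removed by trimming.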
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.