
Boosting grape bunch detection in RGB-D images using zero-shot annotation with Segment Anything and GroundingDINO

Rosa Pia Devanna; Annalisa Milella
2025

Abstract

Recent advances in artificial intelligence, particularly in object recognition and segmentation, provide unprecedented opportunities for precision agriculture. This work investigates the use of state-of-the-art AI models, namely Meta's Segment Anything (SAM) and GroundingDINO, for the task of grape cluster detection in vineyards. Three different methods aimed at enhancing the instance segmentation process are proposed: (i) SAM-Refine (SAM-R), which refines a previously proposed depth-based clustering approach, referred to as DepthSeg, using SAM; (ii) SAM-Segmentation (SAM-S), which integrates SAM with a pre-trained semantic segmentation model to improve cluster separation; and (iii) AutoSAM-Dino (ASD), which eliminates the need for manual labeling and transfer learning through the combined use of GroundingDINO and SAM. Performance is analyzed in terms of both object counting and pixel-level segmentation accuracy against a manually labeled ground truth. Metrics such as mean Average Precision (mAP), Intersection over Union (IoU), precision, and recall are computed to assess system performance. Compared to the original DepthSeg algorithm, SAM-R slightly improves object counting (mAP: +0.5%) and markedly improves pixel-level segmentation (IoU: +17.0%). SAM-S, despite a decrease in mAP, improves segmentation accuracy (IoU: +13.9%, Precision: +9.2%, Recall: +11.7%). Similarly, ASD, although with a lower mAP, shows significant accuracy enhancement (IoU: +7.8%, Precision: +4.2%, Recall: +4.9%). Additionally, in terms of labor effort, the proposed instance segmentation techniques require much less time for training than manual labeling does.
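To illustrate the kind of zero-shot, box-prompted pipeline the abstract describes for ASD, the following is a minimal sketch, not the implementation used in the paper: an open-vocabulary detector (e.g. GroundingDINO) proposes grape-bunch boxes from a text prompt, and SAM turns each box into a pixel mask. The `detect_boxes` helper is a hypothetical stand-in for the detector call; the SAM calls use Meta's `segment_anything` package and assume a locally downloaded checkpoint.

```python
# Illustrative sketch of zero-shot box-prompted segmentation (ASD-style):
# detector boxes -> SAM box prompts -> per-instance masks.
# NOTE: not the authors' implementation; detect_boxes is a placeholder.

import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor


def detect_boxes(image_rgb: np.ndarray, text_prompt: str) -> np.ndarray:
    """Hypothetical stand-in for a GroundingDINO (or similar) detector.

    Expected to return an (N, 4) array of boxes in pixel XYXY format for
    objects matching `text_prompt`, e.g. "grape bunch".
    """
    raise NotImplementedError("plug in a text-prompted detector here")


def segment_bunches(image_path: str, sam_checkpoint: str) -> list[np.ndarray]:
    # SamPredictor expects an RGB uint8 image (H, W, 3).
    image_bgr = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

    # Boxes proposed from a text prompt: no manual labels or fine-tuning.
    boxes = detect_boxes(image_rgb, text_prompt="grape bunch")

    # Standard SAM predictor from the segment-anything package.
    sam = sam_model_registry["vit_h"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)

    masks = []
    for box in boxes:
        # Each detector box prompts SAM; multimask_output=False returns one mask.
        mask, _, _ = predictor.predict(box=box.astype(np.float32),
                                       multimask_output=False)
        masks.append(mask[0])  # (H, W) boolean mask for this grape bunch
    return masks
```

In practice, a threshold on the detector's confidence scores and a non-maximum-suppression step would typically be applied before prompting SAM, so that overlapping box proposals do not yield duplicate masks.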
Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato - STIIMA (ex ITIA) Sede Secondaria Bari
Grape bunch detection, Instance segmentation, Zero-shot networks, Precision agriculture, Agriculture robotics
Files in this record:
1-s2.0-S0168169924010020-main.pdf
Open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 6.25 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/514297