Parameter-efficient vision transformer adaptation for stem quality classification from smartphone forest images
Nocetti, Michela; Aminti, Giovanni; Brunetti, Michele
2026
Abstract
Assessing wood quality at early stages of the forest-wood value chain remains a significant challenge, as qualitative evaluations are typically performed after harvesting. This study presents the application of Vision Transformers (ViTs) for the qualitative classification of standing trees at the forest-compartment level, based on images captured using a standard smartphone camera.

A dataset of 460 terrestrial forest images was collected in 31 Douglas fir compartments managed for timber production, with images manually assigned to three stem-quality classes (117 class 1, 243 class 2, 100 class 3). ViT models pre-trained on ImageNet were employed to classify stem quality at both the image and forest-compartment levels. Several adaptation strategies were evaluated, including Full Fine-Tuning and parameter-efficient fine-tuning based on Low-Rank Adaptation (LoRA). Model performance was assessed using stratified 10-fold cross-validation, with data splitting performed at the compartment level to ensure spatial independence between training and testing data.

The results demonstrate that ViT-based models can effectively classify stem quality despite limited and imbalanced training data. Among the evaluated strategies, LoRA achieved the highest performance, reaching approximately 0.69 accuracy at the image level and 0.78 accuracy at the compartment level, consistently outperforming both Full Fine-Tuning and baseline approaches. Aggregation of predictions at the compartment level via majority voting further improved robustness and reduced misclassifications compared to image-level predictions.

The proposed approach enables the extraction of qualitative information from low-cost image data and can be readily integrated into digital forest inventory workflows.
This development supports more informed forest management and timber commercialization strategies by complementing traditional quantitative metrics with assessments of wood quality.

| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| 2026_SAT.pdf (open access) | Parameter-efficient vision transformer adaptation for stem quality classification from smartphone forest images | Publisher's version (PDF) | Creative Commons | 9.15 MB | Adobe PDF |
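
The abstract mentions two steps that are easy to misread: splitting data at the compartment level, and aggregating image-level predictions per compartment by majority voting. The sketch below illustrates both under stated assumptions and is not the authors' implementation. The function names, the toy data, the round-robin fold assignment (which omits the stratification described in the abstract), and the tie-breaking rule are all hypothetical.

```python
from collections import Counter, defaultdict


def assign_folds(compartment_ids, n_folds):
    """Map each compartment to one fold so no compartment spans two folds.

    Simplification (assumption): round-robin over sorted compartment IDs,
    without the class stratification mentioned in the abstract.
    """
    unique_ids = sorted(set(compartment_ids))
    return {cid: i % n_folds for i, cid in enumerate(unique_ids)}


def majority_vote(image_predictions):
    """Aggregate image-level predictions into one label per compartment.

    image_predictions: iterable of (compartment_id, predicted_class) pairs.
    Ties go to the smallest class label (an assumption; the abstract does
    not say how ties are resolved).
    """
    votes = defaultdict(Counter)
    for compartment_id, predicted_class in image_predictions:
        votes[compartment_id][predicted_class] += 1

    labels = {}
    for compartment_id, counter in votes.items():
        top_count = max(counter.values())
        labels[compartment_id] = min(
            cls for cls, count in counter.items() if count == top_count
        )
    return labels


# Hypothetical image-level predictions for three compartments.
predictions = [
    ("A", 2), ("A", 2), ("A", 1),
    ("B", 3), ("B", 1), ("B", 3),
    ("C", 1), ("C", 2),
]

compartment_labels = majority_vote(predictions)
folds = assign_folds([cid for cid, _ in predictions], n_folds=2)

print(compartment_labels)
print(folds)
```

Because every image of a compartment shares that compartment's fold, no compartment contributes images to both the training and the test side of a split, which is the spatial independence the abstract refers to.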


