Terrestrial laser scanning campaigns provide an important means to document the 3D structure of historical sites. Unfortunately, the process of converting the 3D point clouds acquired by the laser scanner into a coherent and accurate 3D model has many stages and is not generally automated. In particular, the initial cleaning stage of the pipeline--in which undesired scene points are deleted--remains largely manual and is usually labour intensive. In this article, we introduce a semi-automated cleaning approach that incrementally trains a random forest (RF) classifier on an initial keep/discard point labelling generated by the user when cleaning the first scan(s). The classifier is then used to predict the labelling of the next scan in the sequence. Before this classification is presented to the user, a denoising post-process, based on the 2D range map representation of the laser scan, is applied. This significantly reduces small isolated point clusters that the user would otherwise have to fix. The user then selects the remaining incorrectly labelled points and these are weighted, based on a confidence estimate, and fed back into the classifier to retrain it for the next scan. Our experiments, across 8 scanning campaigns, show that when the scan campaign is coherent, i.e., it does not contain widely disparate or contradictory data, the classifier yields a keep/discard labelling that typically ranges between 95% and 99%. This is somewhat surprising, given that the data in each class can represent many object types, such as a tree, person, wall, and so on, and that no further effort beyond the point labeling of keep/discard is required of the user. We conducted an informal timing experiment over a 15-scan campaign, which compared the processing time required by our software, without user interaction (point label correction) time, against the time taken by an expert user to completely clean all scans. The expert user required 95mins to complete all cleaning. The average time required by the expert to clean a single scan was 6.3mins. Even with current unoptimized code, our system was able to generate keep/discard labels for all scans, with 98% (average) accuracy, in 75mins. This leaves as much as 20mins for the user input required to relabel the 2% of mispredicted points across the set of scans before the full system time would match the expert's cleaning time.

Semi-automated cleaning of laser scanning campaigns with machine learning

Dellepiane M;Cignoni P;Scopigno R
2019

Abstract

Terrestrial laser scanning campaigns provide an important means to document the 3D structure of historical sites. Unfortunately, the process of converting the 3D point clouds acquired by the laser scanner into a coherent and accurate 3D model has many stages and is not generally automated. In particular, the initial cleaning stage of the pipeline--in which undesired scene points are deleted--remains largely manual and is usually labour intensive. In this article, we introduce a semi-automated cleaning approach that incrementally trains a random forest (RF) classifier on an initial keep/discard point labelling generated by the user when cleaning the first scan(s). The classifier is then used to predict the labelling of the next scan in the sequence. Before this classification is presented to the user, a denoising post-process, based on the 2D range map representation of the laser scan, is applied. This significantly reduces small isolated point clusters that the user would otherwise have to fix. The user then selects the remaining incorrectly labelled points and these are weighted, based on a confidence estimate, and fed back into the classifier to retrain it for the next scan. Our experiments, across 8 scanning campaigns, show that when the scan campaign is coherent, i.e., it does not contain widely disparate or contradictory data, the classifier yields a keep/discard labelling that typically ranges between 95% and 99%. This is somewhat surprising, given that the data in each class can represent many object types, such as a tree, person, wall, and so on, and that no further effort beyond the point labeling of keep/discard is required of the user. We conducted an informal timing experiment over a 15-scan campaign, which compared the processing time required by our software, without user interaction (point label correction) time, against the time taken by an expert user to completely clean all scans. The expert user required 95mins to complete all cleaning. The average time required by the expert to clean a single scan was 6.3mins. Even with current unoptimized code, our system was able to generate keep/discard labels for all scans, with 98% (average) accuracy, in 75mins. This leaves as much as 20mins for the user input required to relabel the 2% of mispredicted points across the set of scans before the full system time would match the expert's cleaning time.
2019
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Heritage data
Laser scan
Point clouds
Machine learning
File in questo prodotto:
File Dimensione Formato  
prod_403932-doc_140673.pdf

solo utenti autorizzati

Descrizione: Semi-automated Cleaning of Laser Scanning Campaigns with Machine Learning
Tipologia: Versione Editoriale (PDF)
Dimensione 14.21 MB
Formato Adobe PDF
14.21 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/388373
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 1
social impact