Main Content

Aerial Lidar Semantic Segmentation Using RandLANet Deep Learning

This example shows how to train a RandLANet deep learning network to perform semantic segmentation on aerial lidar data.

You can use lidar data acquired from airborne laser scanning systems in applications such as topographic mapping, city modeling, biomass measurement, and disaster management. Extracting meaningful information from this data requires semantic segmentation, a process in which you assign each point in the point cloud a unique class label.

In this example, you train a RandLANet [1] network to perform semantic segmentation by using the Dayton Annotated Lidar Earth Scan (DALES) data set [2]. The data set contains scenes of dense, labeled aerial lidar data from urban, suburban, rural, and commercial settings. The data set provides semantic segmentation labels for eight classes: buildings, cars, trucks, poles, power lines, fences, ground, and vegetation.

Load DALES Data

The DALES data set contains 40 scenes of aerial lidar data. Out of the 40 scenes, 29 scenes are used for training and the remaining 11 scenes are used for testing. Follow the instructions on the DALES website to download the data set to the folder specified by the dataFolder variable. Create folders to store the training and test data.

dataFolder = fullfile(tempdir,"DALES");
trainDataFolder = fullfile(dataFolder,"dales_las","train");
testDataFolder = fullfile(dataFolder,"dales_las","test");

Preview a point cloud from the training data.

lasReader = lasFileReader(fullfile(trainDataFolder,"5080_54435.las"));
[pc,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

% Select only labeled data.
pc = select(pc,labels~=0);
labels = labels(labels~=0);
classNames = ["ground"; ...
    "vegetation"; ...
    "cars"; ...
    "trucks"; ...
    "powerlines"; ...
    "fences"; ...
    "poles"; ...
ax = pcshow(pc.Location,labels);
title("Point Cloud with Overlaid Semantic Labels")

Figure contains an axes object. The axes object with title Point Cloud with Overlaid Semantic Labels contains an object of type scatter.

Preprocess Training Data

To preprocess the training data, first create a fileDatastore object to collect all the point cloud files in the training data.

fds = fileDatastore(trainDataFolder,"ReadFcn",@lasFileReader);

Each point cloud in the DALES data set covers an area of 500-by-500 meters containing around 10 millions points, processing this many points might slow down network processing. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters by using the helperTransformData helper function, defined at the end of this example.

Define the block dimensions.

blockSize = [50 50];

Apply the transformation to the training datastore.

ldsTransformed = transform(fds,@(x) helperTransformData(x,blockSize));

Use the helperDatastoreWriteFcn helper function, attached to this example as a supporting file, to save the preprocessed point clouds and semantic labels as PCD and PNG files, respectively. You can set writeFiles to false if your preprocessed training data is already present in the outputFolder.

writeFiles = true;
outputFolder = fullfile(dataFolder,"PreprocessedTrainData");
if writeFiles
    writeall(ldsTransformed,outputFolder,WriteFcn = @helperDatastoreWriteFcn,FolderLayout = "flatten");
Processing done for file: 5190_54400.las

Note: Processing can take some time. The code suspends MATLAB® execution until processing is complete.

If you want to rerun the code, you must first use the "clear helperDatastoreWriteFcn" command.

Create Datastore Object for Training

Create a fileDatastore object that reads PCD files by using the pcread function.

pcCropTrainPath = fullfile(outputFolder,"PointCloud");
ldsTrain = fileDatastore(pcCropTrainPath,"ReadFcn",@(x) pcread(x));

Use a pixelLabelDatastore object to store pixel-wise labels from the pixel label images. The object maps each pixel label to a class name and assigns a unique label ID to each class.

% Specify label IDs from 1 to the number of classes.
numClasses = numel(classNames);
labelIDs = 1 : numClasses;
labelCropTrainPath = fullfile(outputFolder,"Labels");
pxdsTrain = pixelLabelDatastore(labelCropTrainPath,classNames,labelIDs);

Use the combine function to combine the point clouds and labels into a single datastore for training.

cds = combine(ldsTrain,pxdsTrain);

Create RandLANet Object

RandLANet is a popular neural network used for semantic segmentation of lidar point clouds. This network uses an encoder-decoder architecture with skip connections. You feed the input to a shared multilayer perceptron layer, followed by four encoding and decoding layers, to learn features for each point. At end, three fully connected layers and a dropout layer associate each point in a 3-D point cloud with a class label, such as car, truck, ground, or vegetation.

This figure shows the architecture of a RandLANet, where:

  • FC — Fully Connected Layer

  • LFA — Local Feature Aggregation

  • RS — Random Sampling

  • MLP — Shared Multilayer Perceptron

  • US — Upsampling

  • DP — Dropout


Each encoding layer randomly downsamples the point cloud, which significantly decreases the point density. To retain the key features of the downsampled point cloud, the network uses a local feature aggregation module on each point.

Each decoding layer upsamples the point feature set through nearest-neighbor interpolation. The network concatenates these upsampled maps with feature maps produced by the encoding layers through skip connections, and applies a shared MLP to the concatenated feature maps.

Use the randlanet function to create a RandLANet segmentation network.

segmenter = randlanet("none",classNames);

Specify Training Options

Use the Adam optimization algorithm to train the network. Use the trainingOptions (Deep Learning Toolbox) function to specify the hyperparameters.

learningRate = 0.01;
numEpochs = 200;
miniBatchSize = 8;
learnRateDropFactor = 0.9886;
executionEnvironment = "auto";

if canUseParallelPool
    dispatchInBackground = true;
    dispatchInBackground = false;
options = trainingOptions("adam", ...
    InitialLearnRate = learningRate, ...
    MaxEpochs = numEpochs, ...
    MiniBatchSize = miniBatchSize, ...
    LearnRateSchedule = "piecewise", ...
    LearnRateDropPeriod = 1,...
    LearnRateDropFactor = learnRateDropFactor, ...
    Plots = "training-progress", ...
    ExecutionEnvironment = executionEnvironment, ...
    dispatchInBackground = dispatchInBackground, ...
    ResetInputNormalization = false, ...
    CheckpointFrequencyUnit = "epoch", ...
    CheckpointFrequency = 10, ...
    CheckpointPath = tempdir, ...

Note: Reduce or increase the miniBatchSize value to control memory usage when training.

Train Model

Use the trainRandlanet function to train the RandLANet segmentation network, set the doTraining argument to true. Otherwise, load a pretrained segmentation network. To train the network, you can use a CPU or GPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox).

doTraining = false;
if doTraining
    segmenter = trainRandlanet(cds,segmenter,options);
    segmenter = randlanet("dales");

Segment Aerial Point Cloud

Create a fileDatastore object to read multiple test point clouds. For demonstration purpose, we are reading a single point cloud.

fdsTest = fileDatastore(fullfile(testDataFolder,"5080_54470.las"),"ReadFcn",@lasFileReader);

Apply the similar transformation used on training data, to the test data:

  • Extract the point cloud and the respective labels.

  • Crop the point clouds into non-overlapping blocks of size 50-by-50 meters.

ldsTestTransformed = transform(fdsTest,@(x) helperTransformData(x,blockSize));

Read all blocks of the test point cloud. Notice X contain pointCloud objects and corresponding labels for all the blocks of the test point cloud.

X = read(ldsTestTransformed);

Create pointCloud object containing array of block of pointClouds of size Mx1. Notice segmentObjects only take pointCloud array of size Mx1 as input.

ptCloudArray = vertcat(X{:,1});

Using segmentObjects function, perform inference to compute prediction labels.

[labels,scores] = segmentObjects(segmenter,ptCloudArray);

Display the predictions on the point cloud.

% Concatenate blocks of point cloud.
pc = pccat(ptCloudArray);

% Concatenate corresponding labels of blocks of point cloud.
labelsDensePred = vertcat(labels{:});

% Convert labels from categorical to numeric for display.
labelsDensePred = single(categorical(labelsDensePred,classNames,cellstr(string(1:numClasses))));
ax = pcshow(pc.Location,labelsDensePred);
axis off
title("Point Cloud Overlaid with Detected Semantic Labels")

Evaluate Network

To perform evaluation on the test data, get the labels from the test point cloud. Use the predicted labels you computed and are stored in labelsDensePred.

Extract the ground truth labels.

labelsDenseTarget = vertcat(X{:,2});

Use the evaluateSemanticSegmentation function to compute the semantic segmentation metrics from the test set results.

confusionMatrix = segmentationConfusionMatrix(double(labelsDensePred), ...
    double(labelsDenseTarget),Classes = 1:numClasses);
metrics = evaluateSemanticSegmentation({confusionMatrix},classNames,Verbose = false);

You can measure the amount of overlap per class using the intersection-over-union (IoU) metric.

The evaluateSemanticSegmentation function returns metrics for the entire data set, for individual classes, and for each test image. To see the metrics at the data set level, inspect the metrics.DataSetMetrics property.

ans=1×4 table
    GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU
    ______________    ____________    _______    ___________

       0.95783          0.85172       0.70197      0.92355  

The data set metrics provide a high-level overview of network performance. To see the impact each class has on the overall performance, view the metrics for each class by inspecting the metrics.ClassMetrics property.

ans=8×2 table
                  Accuracy      IoU  
                  ________    _______

    ground        0.97692     0.95547
    vegetation    0.90972     0.89308
    cars          0.95455     0.72914
    trucks        0.34194     0.21973
    powerlines    0.90057     0.86226
    fences        0.89055     0.48276
    poles          0.8624     0.55633
    buildings     0.97709     0.91698

Supporting Functions

The helperTransformData function applies these transformations to the data:

  • Extract the point cloud and the respective labels.

  • Each point cloud in the DALES data set covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial rotating lidar point clouds. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters.

function out = helperTransformData(lasReader,blocksize)

[ptCloud,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

% Select only labeled data.
ptCloud = select(ptCloud,labels~=0);
labels = labels(labels~=0);

% Block the input point cloud.
numGridsX = round((diff(ptCloud.XLimits)+eps)/blocksize(1));
numGridsY = round((diff(ptCloud.YLimits)+eps)/blocksize(2));
[~,~,~,indx,indy] = histcounts2(ptCloud.Location(:,1),ptCloud.Location(:,2), ...
    [numGridsX, numGridsY],XBinLimits = ptCloud.XLimits, YBinLimits = ptCloud.YLimits);
ind = sub2ind([numGridsX, numGridsY],indx,indy);

out = cell(numGridsX*numGridsY,2);

    for num = 1:numGridsX*numGridsY
        idx = ind == num;

            out{num,1} = select(ptCloud,idx);
            out{num,2} = labels(idx);



[1] Hu, Qingyong, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham. “RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11105–14. Seattle, WA, USA: IEEE, 2020.

[2] Varney, Nina, Vijayan K. Asari, and Quinn Graehling. “DALES: A Large-Scale Aerial LiDAR Data Set for Semantic Segmentation.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 717–26. Seattle, WA, USA: IEEE, 2020.

See Also




Related Topics