
Stereo Visual SLAM: Warning: Maximum number of trials reached. Consider increasing the maximum distance or decreasing the desired confidence.

I am currently following the documentation example on Stereo Visual SLAM:
I used the dataset provided in the documentation and it worked fine; I got no warnings at all.
YouTube link showing the warning: https://youtu.be/DPU2UNQVpz0
While waiting for my stereo camera to be delivered, I wanted to test the example on a simulated race track environment I made in Unity. Inside the simulated environment, I created a virtual stereo camera, which I calibrated using the Stereo Camera Calibrator app in MATLAB. This gave me the stereoParams needed for stereo vSLAM. The results from the Stereo Camera Calibrator app are shown below.
stereoParameters with properties:
Parameters of Two Cameras
CameraParameters1: [1×1 cameraParameters]
CameraParameters2: [1×1 cameraParameters]
Inter-camera Geometry
PoseCamera2: [1×1 rigidtform3d]
FundamentalMatrix: [3×3 double]
EssentialMatrix: [3×3 double]
Accuracy of Estimation
MeanReprojectionError: 0.0592
Calibration Settings
NumPatterns: 6
WorldPoints: [42×2 double]
WorldUnits: 'millimeters'
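For reference, here is a minimal sketch of how the intrinsics and baseline used later in the script could be derived from this calibration result rather than typed in by hand (this assumes the stereoParams object shown above, and that its ImageSize property was populated by the calibrator):
% Minimal sketch (assuming the stereoParams object shown above)
camParams1 = stereoParams.CameraParameters1;
intrinsics = cameraIntrinsics(camParams1.FocalLength, ...   % [fx fy] in pixels
    camParams1.PrincipalPoint, camParams1.ImageSize);       % [cx cy], [mrows ncols]
% Baseline = distance from camera 1 to camera 2. WorldUnits is
% 'millimeters' here, so divide by 1000 to get meters.
baseline = norm(stereoParams.PoseCamera2.Translation) / 1000;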
Information about my dataset that might be helpful:
Image resolution: 1280 x 960
Image file format: .jpg
The picture below is a screenshot from the recorded video, which has a resolution of 2560 x 960 (left and right views side by side), so I needed to extract and crop each frame from the video to build my dataset.
I extracted and cropped the frames from the recorded video (23.97 fps) using MATLAB and saved the halves into "left" and "right" folders.
Note: I have also tried exporting the frames as .png, hoping to get better results, but I still got the same warning.
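For context, the frame extraction was along these lines (a simplified sketch; the video file name and output folder names are placeholders, not the exact ones I used):
% Simplified sketch of the frame extraction; file and folder names are
% placeholders, and the left/right folders are assumed to exist already.
v = VideoReader("stereoRun.mp4");       % 2560 x 960 side-by-side video
frameIdx = 0;
while hasFrame(v)
    frame = readFrame(v);
    frameIdx = frameIdx + 1;
    leftHalf  = frame(:, 1:1280, :);    % left view  (1280 x 960)
    rightHalf = frame(:, 1281:2560, :); % right view (1280 x 960)
    imwrite(leftHalf,  fullfile("images", "left",  sprintf("%06d.jpg", frameIdx)));
    imwrite(rightHalf, fullfile("images", "right", sprintf("%06d.jpg", frameIdx)));
end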
I got this warning when I ran the code:
Warning: Maximum number of trials reached. Consider increasing the maximum distance or decreasing the desired confidence.
> In coder.internal.warning (line 8)
In vision.internal.ransac.msac (line 129)
In vision.internal.calibration.estWorldPoseImpl (line 37)
In estworldpose (line 11)
In helperTrackLastKeyFrame (line 44)
In unityStereoVisualSLAM (line 141)
Operation terminated by user during disparitySGM
In unityStereoVisualSLAM>helperReconstructFromStereo (line 386)
disparityMap = disparitySGM(I1, I2, DisparityRange=disparityRange);
In unityStereoVisualSLAM (line 159)
[xyzPoints, matchedPairs] = helperReconstructFromStereo(currILeftGray, currIRightGray, currFeaturesLeft, ...
This warning popped up several times. What I observed is that whenever the warning appears, the display jumps back to the first few frames for about a second and then resumes at the current frame. I tried displaying currFrameIdx inside the main loop to observe it, but there seems to be no problem there: currFrameIdx never actually goes back to the first few frames. That's why I'm really confused about what's happening.
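From the stack trace, the warning originates in the RANSAC pose estimation (estworldpose) called inside helperTrackLastKeyFrame. One experiment I am considering (just a sketch; the input variable names are illustrative and may differ from the ones used inside that helper) is to relax the RANSAC settings where estworldpose is called:
% Sketch only: where helperTrackLastKeyFrame calls estworldpose, the RANSAC
% settings could be relaxed along these lines (values are illustrative):
[worldPose, inlierIdx] = estworldpose( ...
    matchedImagePoints, matchedWorldPoints, intrinsics, ...
    MaxReprojectionError=3, ...   % default is 1 pixel; larger is more tolerant
    Confidence=95, ...            % default is 99; lower needs fewer trials
    MaxNumTrials=5000);           % default is 1000; more trials before giving up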
Another issue I have observed:
  • After map initialization, several frames right after the origin point are skipped.
My guess is that the camera is not able to localize itself properly, which is why these warnings are thrown. I think it might be because the road texture I used for the simulated environment is basically a single tile that repeats itself.
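To check whether the repeating texture causes ambiguous stereo matches, one quick sanity check could be to visualize the left/right matches for a single frame (a rough sketch; it reuses helperDetectAndExtractFeatures and the scaleFactor/numLevels values from the code below, and the frame index is arbitrary):
% Sketch: visualize left/right ORB matches for one frame to see whether the
% repeating road texture produces ambiguous or clustered matches.
I1 = im2gray(readimage(imdsLeft, 100));   % frame index 100 is arbitrary
I2 = im2gray(readimage(imdsRight, 100));
[f1, p1] = helperDetectAndExtractFeatures(I1, scaleFactor, numLevels);
[f2, p2] = helperDetectAndExtractFeatures(I2, scaleFactor, numLevels);
pairs = matchFeatures(f1, f2, Unique=true, MaxRatio=0.7);
figure
showMatchedFeatures(I1, I2, p1(pairs(:,1)), p2(pairs(:,2)), "montage");
title(sprintf("%d putative stereo matches", size(pairs,1)));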
CODE:
dataFolder = [fullfile(pwd), filesep, 'virStereoImageData'];
% zipFileName = [dataFolder, filesep, 'run_000005.zip'];
%mkdir(dataFolder);
imgFolderLeft = [dataFolder,'/images/left/'];
imgFolderRight = [dataFolder,'/images/right/'];
imdsLeft = imageDatastore(imgFolderLeft);
imdsRight = imageDatastore(imgFolderRight);
% Inspect the first pair of images
currFrameIdx = 1;
currILeft = readimage(imdsLeft, currFrameIdx);
currIRight = readimage(imdsRight, currFrameIdx);
imshowpair(currILeft, currIRight, 'montage');
% Set random seed for reproducibility
rng(0);
% Load the initial camera pose. The initial camera pose is derived based
% on the transformation between the camera and the vehicle:
% http://asrl.utias.utoronto.ca/datasets/2020-vtr-dataset/text_files/transform_camera_vehicle.tx
% initialPose = rigid3d(eye(3), [0 0 0]);
% save initialPose;
% zipFileName;
initialPoseData = load("initialPose.mat");
initialPose = initialPoseData.initialPose;
% Construct the reprojection matrix for 3-D reconstruction.
% The intrinsics for the dataset can be found at the following page:
% http://asrl.utias.utoronto.ca/datasets/2020-vtr-dataset/text_files/camera_parameters.txt
focalLength = [358.288 358.198]; % specified in pixels
principalPoint = [640.510 480.563]; % specified in pixels [x, y]
baseline = 0.24; % specified in meters
imageSize = size(currILeft,[1,2]); % in pixels [mrows, ncols]
intrinsics = cameraIntrinsics(focalLength, principalPoint, imageSize);
reprojectionMatrix = [1, 0, 0, -principalPoint(1);
    0, 1, 0, -principalPoint(2);
    0, 0, 0, focalLength(1);
    0, 0, 1/baseline, 0];
% In this example, the images are already undistorted and rectified. In a general workflow,
% uncomment the following code to undistort and rectify the images.
% currILeft = undistortImage(currILeft, intrinsics);
% currIRight = undistortImage(currIRight, intrinsics);
% stereoParams = stereoParameters(intrinsics, intrinsics, eye(3), [-baseline, 0 0]);
% [currILeft, currIRight] = rectifyStereoImages(currILeft, currIRight, stereoParams, OutputView="full");
% Detect and extract ORB features from the rectified stereo images
scaleFactor = 1.2;
numLevels = 8;
[currFeaturesLeft, currPointsLeft] = helperDetectAndExtractFeatures(im2gray(currILeft), scaleFactor, numLevels);
[currFeaturesRight, currPointsRight] = helperDetectAndExtractFeatures(im2gray(currIRight), scaleFactor, numLevels);
% Match feature points between the stereo images and get the 3-D world positions
disparityRange = [0 64]; % specified in pixels (the difference between max and min must be divisible by 8)
[xyzPoints, matchedPairs] = helperReconstructFromStereo(im2gray(currILeft), im2gray(currIRight), ...
    currFeaturesLeft, currFeaturesRight, currPointsLeft, currPointsRight, reprojectionMatrix, initialPose, disparityRange);
% Create an empty imageviewset object to store key frames
vSetKeyFrames = imageviewset;
% Create an empty worldpointset object to store 3-D map points
mapPointSet = worldpointset;
% Add the first key frame
currKeyFrameId = 1;
vSetKeyFrames = addView(vSetKeyFrames, currKeyFrameId, initialPose, Points=currPointsLeft,...
    Features=currFeaturesLeft.Features);
% Add 3-D map points
[mapPointSet, stereoMapPointsIdx] = addWorldPoints(mapPointSet, xyzPoints);
% Add observations of the map points
mapPointSet = addCorrespondences(mapPointSet, currKeyFrameId, stereoMapPointsIdx, matchedPairs(:, 1));
% Update view direction and depth
mapPointSet = updateLimitsAndDirection(mapPointSet, stereoMapPointsIdx, vSetKeyFrames.Views);
% Update representative view
mapPointSet = updateRepresentativeView(mapPointSet, stereoMapPointsIdx, vSetKeyFrames.Views);
% Visualize matched features in the first key frame
featurePlot = helperVisualizeMatchedFeaturesStereo(currILeft, currIRight, currPointsLeft, ...
    currPointsRight, matchedPairs);
% Visualize initial map points and camera trajectory
mapPlot = helperVisualizeMotionAndStructureStereo(vSetKeyFrames, mapPointSet);
% Show legend
showLegend(mapPlot);
% Load the bag of features data created offline
bofData = load("bagOfFeaturesDataSLAM.mat");
% Initialize the place recognition database
loopDatabase = invertedImageIndex(bofData.bof,SaveFeatureLocations=false);
% Add features of the first key frame to the database
addImageFeatures(loopDatabase, currFeaturesLeft, currKeyFrameId);
% ViewId of the last key frame
lastKeyFrameId = currKeyFrameId;
% Index of the last key frame in the input image sequence
lastKeyFrameIdx = currFrameIdx;
% Indices of all the key frames in the input image sequence
addedFramesIdx = lastKeyFrameIdx;
currFrameIdx = 2;
isLoopClosed = false;
% Main loop
isLastFrameKeyFrame = true;
while ~isLoopClosed && currFrameIdx <= numel(imdsLeft.Files)
    currILeft = readimage(imdsLeft, currFrameIdx);
    currIRight = readimage(imdsRight, currFrameIdx);
    currILeft = imgaussfilt3(currILeft);
    currIRight = imgaussfilt3(currIRight);
    currILeftGray = im2gray(currILeft);
    currIRightGray = im2gray(currIRight);
    [currFeaturesLeft, currPointsLeft] = helperDetectAndExtractFeatures(currILeftGray, scaleFactor, numLevels);
    [currFeaturesRight, currPointsRight] = helperDetectAndExtractFeatures(currIRightGray, scaleFactor, numLevels);
    % Track the last key frame
    % trackedMapPointsIdx: Indices of the map points observed in the current left frame
    % trackedFeatureIdx: Indices of the corresponding feature points in the current left frame
    [currPose, trackedMapPointsIdx, trackedFeatureIdx] = helperTrackLastKeyFrame(mapPointSet, ...
        vSetKeyFrames.Views, currFeaturesLeft, currPointsLeft, lastKeyFrameId, intrinsics, scaleFactor);
    if isempty(currPose) || numel(trackedMapPointsIdx) < 30
        currFrameIdx = currFrameIdx + 1;
        continue
    end
    % Track the local map and check if the current frame is a key frame.
    % localKeyFrameIds: ViewId of the connected key frames of the current frame
    numSkipFrames = 5;
    numPointsKeyFrame = 80;
    [localKeyFrameIds, currPose, trackedMapPointsIdx, trackedFeatureIdx, isKeyFrame] = ...
        helperTrackLocalMap(mapPointSet, vSetKeyFrames, trackedMapPointsIdx, ...
        trackedFeatureIdx, currPose, currFeaturesLeft, currPointsLeft, intrinsics, scaleFactor, numLevels, ...
        isLastFrameKeyFrame, lastKeyFrameIdx, currFrameIdx, numSkipFrames, numPointsKeyFrame);
    % Match feature points between the stereo images and get the 3-D world positions
    [xyzPoints, matchedPairs] = helperReconstructFromStereo(currILeftGray, currIRightGray, currFeaturesLeft, ...
        currFeaturesRight, currPointsLeft, currPointsRight, reprojectionMatrix, currPose, disparityRange);
    % Visualize matched features in the stereo image
    updatePlot(featurePlot, currILeft, currIRight, currPointsLeft, currPointsRight, trackedFeatureIdx, matchedPairs);
    if ~isKeyFrame && currFrameIdx < numel(imdsLeft.Files)
        currFrameIdx = currFrameIdx + 1;
        isLastFrameKeyFrame = false;
        continue
    else
        [untrackedFeatureIdx, ia] = setdiff(matchedPairs(:, 1), trackedFeatureIdx);
        xyzPoints = xyzPoints(ia, :);
        isLastFrameKeyFrame = true;
    end
    % Update current key frame ID
    currKeyFrameId = currKeyFrameId + 1;
    % Add the new key frame
    [mapPointSet, vSetKeyFrames] = helperAddNewKeyFrame(mapPointSet, vSetKeyFrames, ...
        currPose, currFeaturesLeft, currPointsLeft, trackedMapPointsIdx, trackedFeatureIdx, localKeyFrameIds);
    % Remove outlier map points that are observed in fewer than 3 key frames
    if currKeyFrameId == 2
        triangulatedMapPointsIdx = [];
    end
    mapPointSet = helperCullRecentMapPoints(mapPointSet, ...
        trackedMapPointsIdx, triangulatedMapPointsIdx, stereoMapPointsIdx);
    % Add new map points computed from disparity
    [mapPointSet, stereoMapPointsIdx] = addWorldPoints(mapPointSet, xyzPoints);
    mapPointSet = addCorrespondences(mapPointSet, currKeyFrameId, stereoMapPointsIdx, ...
        untrackedFeatureIdx);
    % Create new map points by triangulation
    minNumMatches = 20;
    minParallax = 0.35;
    [mapPointSet, vSetKeyFrames, triangulatedMapPointsIdx, stereoMapPointsIdx] = helperCreateNewMapPointsStereo( ...
        mapPointSet, vSetKeyFrames, currKeyFrameId, intrinsics, scaleFactor, minNumMatches, minParallax, ...
        untrackedFeatureIdx, stereoMapPointsIdx);
    % Local bundle adjustment
    [refinedViews, dist] = connectedViews(vSetKeyFrames, currKeyFrameId, MaxDistance=2);
    refinedKeyFrameIds = refinedViews.ViewId;
    % Always fix the first two key frames
    fixedViewIds = refinedKeyFrameIds(dist==2);
    fixedViewIds = fixedViewIds(1:min(10, numel(fixedViewIds)));
    % Refine local key frames and map points
    [mapPointSet, vSetKeyFrames, mapPointIdx] = bundleAdjustment(...
        mapPointSet, vSetKeyFrames, [refinedKeyFrameIds; currKeyFrameId], intrinsics, ...
        FixedViewIDs=fixedViewIds, PointsUndistorted=true, AbsoluteTolerance=1e-7,...
        RelativeTolerance=1e-16, Solver='preconditioned-conjugate-gradient', MaxIteration=10);
    % Update view direction and depth
    mapPointSet = updateLimitsAndDirection(mapPointSet, mapPointIdx, ...
        vSetKeyFrames.Views);
    % Update representative view
    mapPointSet = updateRepresentativeView(mapPointSet, mapPointIdx, ...
        vSetKeyFrames.Views);
    % Visualize 3-D world points and camera trajectory
    updatePlot(mapPlot, vSetKeyFrames, mapPointSet);
    % Check loop closure after some key frames have been created
    if currKeyFrameId > 50
        % Minimum number of feature matches of loop edges
        loopEdgeNumMatches = 50;
        % Detect possible loop closure key frame candidates
        [isDetected, validLoopCandidates] = helperCheckLoopClosure(vSetKeyFrames, currKeyFrameId, ...
            loopDatabase, currILeft, loopEdgeNumMatches);
        isTooCloseView = currKeyFrameId - max(validLoopCandidates) < 100;
        if isDetected && ~isTooCloseView
            % Add loop closure connections
            [isLoopClosed, mapPointSet, vSetKeyFrames] = helperAddLoopConnectionsStereo(...
                mapPointSet, vSetKeyFrames, validLoopCandidates, currKeyFrameId, ...
                currFeaturesLeft, currPointsLeft, loopEdgeNumMatches);
        end
    end
    % If no loop closure is detected, add current features into the database
    if ~isLoopClosed
        addImageFeatures(loopDatabase, currFeaturesLeft, currKeyFrameId);
    end
    % Update IDs and indices
    lastKeyFrameId = currKeyFrameId;
    lastKeyFrameIdx = currFrameIdx;
    addedFramesIdx = [addedFramesIdx; currFrameIdx];
    currFrameIdx = currFrameIdx + 1;
end % End of main loop
% Optimize the poses
vSetKeyFramesOptim = optimizePoses(vSetKeyFrames, minNumMatches, Tolerance=1e-16);
% Update map points after optimizing the poses
mapPointSet = helperUpdateGlobalMap(mapPointSet, vSetKeyFrames, vSetKeyFramesOptim);
updatePlot(mapPlot, vSetKeyFrames, mapPointSet);
% Plot the optimized camera trajectory
optimizedPoses = poses(vSetKeyFramesOptim);
plotOptimizedTrajectory(mapPlot, optimizedPoses)
% Update legend
showLegend(mapPlot);
% Load GPS data
gpsData = load("gpsLocation.mat");
gpsLocation = gpsData.gpsLocation;
% Transform GPS locations to the reference coordinate system
gTruth = helperTransformGPSLocations(gpsLocation, optimizedPoses);
% Plot the GPS locations
plotActualTrajectory(mapPlot, gTruth(addedFramesIdx, :));
% Show legend
showLegend(mapPlot);
% Create an array of pointCloud objects to store the 3-D world points
% associated with the key frames
ptClouds = repmat(pointCloud(zeros(1, 3)), numel(addedFramesIdx), 1);
for i = 1: numel(addedFramesIdx)
    ILeft = readimage(imdsLeft, addedFramesIdx(i));
    IRight = readimage(imdsRight, addedFramesIdx(i));
    % Reconstruct scene from disparity
    disparityMap = disparitySGM(im2gray(ILeft), im2gray(IRight), DisparityRange=disparityRange);
    xyzPoints = reconstructScene(disparityMap, reprojectionMatrix);
    % Ignore the upper half of the images which mainly show the sky
    xyzPoints(1:floor(imageSize(1)/2), :, :) = NaN;
    % Ignore the lower part of the images which show the vehicle
    xyzPoints(imageSize(1)-50:end, :, :) = NaN;
    xyzPoints = reshape(xyzPoints, [imageSize(1)*imageSize(2), 3]);
    % Get color from the color image
    colors = reshape(ILeft, [imageSize(1)*imageSize(2), 3]);
    % Remove world points that are too far away from the camera
    validIndex = xyzPoints(:, 3) > 0 & xyzPoints(:, 3) < 100/reprojectionMatrix(4, 3);
    xyzPoints = xyzPoints(validIndex, :);
    colors = colors(validIndex, :);
    % Transform world points to the world coordinates
    currPose = optimizedPoses.AbsolutePose(i);
    xyzPoints = transformPointsForward(currPose, xyzPoints);
    ptCloud = pointCloud(xyzPoints, Color=colors);
    % Downsample the point cloud
    ptClouds(i) = pcdownsample(ptCloud, random=0.5);
end
% Concatenate the point clouds
pointCloudsAll = pccat(ptClouds);
% Visualize the point cloud
figure
ax = pcshow(pointCloudsAll,VerticalAxis="y", VerticalAxisDir="down");
xlabel('X')
ylabel('Y')
zlabel('Z')
title('Dense Reconstruction')
% Display the bird's eye view of the scene
view(ax, [0 0 1]);
camroll(ax, 90);
I want to know your insights about this and any suggestions I can apply to make it work. I just couldn't wait for my stereo camera to be delivered, so I tried this simulated setup in the meantime.
The YouTube link above shows a run of the code with the warning.
I would really appreciate your insights. Thank you!
  1 Comment
Qu Cao on 24 Apr 2023 (edited 24 Apr 2023)
Could you check the number of features extracted from each image? From the video it seems that not many features can be extracted. In my experience, synthetic images, especially those that do not contain much texture, are usually feature-sparse. To alleviate the problem, you could add more actors to the scene so that you get more features to track.
In the Develop Visual SLAM Algorithm Using Unreal Engine Simulation example, you can see that a lot of parked cars are added to the parking lot so that the images captured by the camera provide more features.
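A quick way to do that check could look like this (a sketch using detectORBFeatures with the same pyramid settings as the example; the threshold of what counts as "enough" features is scene-dependent):
% Sketch: count ORB features per left frame. Consistently low counts
% suggest the scene is too texture-poor to track reliably.
numFeatures = zeros(numel(imdsLeft.Files), 1);
for k = 1:numel(imdsLeft.Files)
    I = im2gray(readimage(imdsLeft, k));
    pts = detectORBFeatures(I, ScaleFactor=1.2, NumLevels=8);
    numFeatures(k) = pts.Count;
end
figure, plot(numFeatures), xlabel("Frame"), ylabel("Number of ORB features")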


Answers (0)
