
Stereo Visual SLAM: Warning: Maximum number of trials reached. Consider increasing the maximum distance or decreasing the desired confidence.

I am currently following the documentation example on Stereo Visual SLAM:
I used the dataset provided in the documentation and it worked fine; I got no warnings at all.
YouTube link showing the warning: https://youtu.be/DPU2UNQVpz0
While waiting for my stereo camera to be delivered, I wanted to test the example on a simulated race track environment I made in Unity. Inside the simulated environment, I created a virtual stereo camera, which I calibrated using the Stereo Camera Calibrator app in MATLAB. This gave me the stereoParams needed for stereo vSLAM. The results from the Stereo Camera Calibrator app are shown below.
stereoParameters with properties:
Parameters of Two Cameras
CameraParameters1: [1×1 cameraParameters]
CameraParameters2: [1×1 cameraParameters]
Inter-camera Geometry
PoseCamera2: [1×1 rigidtform3d]
FundamentalMatrix: [3×3 double]
EssentialMatrix: [3×3 double]
Accuracy of Estimation
MeanReprojectionError: 0.0592
Calibration Settings
NumPatterns: 6
WorldPoints: [42×2 double]
WorldUnits: 'millimeters'
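For reference, here is a minimal sketch of how the intrinsics and baseline used later in the script could be derived from this calibration result rather than typed in by hand (this assumes the stereoParams object shown above, and that its ImageSize property was populated by the calibrator):
% Minimal sketch (assuming the stereoParams object shown above)
camParams1 = stereoParams.CameraParameters1;
intrinsics = cameraIntrinsics(camParams1.FocalLength, ...   % [fx fy] in pixels
    camParams1.PrincipalPoint, camParams1.ImageSize);       % [cx cy], [mrows ncols]
% Baseline = distance from camera 1 to camera 2. WorldUnits is
% 'millimeters' here, so divide by 1000 to get meters.
baseline = norm(stereoParams.PoseCamera2.Translation) / 1000;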
Information about my dataset that might be helpful:
Image resolution: 1280 x 960
Image file format: .jpg
The picture below is a screenshot from the recorded video, which has a resolution of 2560 x 960 (left and right views side by side), so I needed to extract and crop each frame from the video to build my dataset.
I extracted and cropped the frames from the recorded video (23.97 fps) using MATLAB and saved the halves into "left" and "right" folders.
Note: I have also tried exporting the frames as .png, hoping to get better results, but I still got the same warning.
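For context, the frame extraction was along these lines (a simplified sketch; the video file name and output folder names are placeholders, not the exact ones I used):
% Simplified sketch of the frame extraction; file and folder names are
% placeholders, and the left/right folders are assumed to exist already.
v = VideoReader("stereoRun.mp4");       % 2560 x 960 side-by-side video
frameIdx = 0;
while hasFrame(v)
    frame = readFrame(v);
    frameIdx = frameIdx + 1;
    leftHalf  = frame(:, 1:1280, :);    % left view  (1280 x 960)
    rightHalf = frame(:, 1281:2560, :); % right view (1280 x 960)
    imwrite(leftHalf,  fullfile("images", "left",  sprintf("%06d.jpg", frameIdx)));
    imwrite(rightHalf, fullfile("images", "right", sprintf("%06d.jpg", frameIdx)));
end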
I got this warning when I ran the code:
Warning: Maximum number of trials reached. Consider increasing the maximum distance or decreasing the desired confidence.
> In coder.internal.warning (line 8)
In vision.internal.ransac.msac (line 129)
In vision.internal.calibration.estWorldPoseImpl (line 37)
In estworldpose (line 11)
In helperTrackLastKeyFrame (line 44)
In unityStereoVisualSLAM (line 141)
Operation terminated by user during disparitySGM
In unityStereoVisualSLAM>helperReconstructFromStereo (line 386)
disparityMap = disparitySGM(I1, I2, DisparityRange=disparityRange);
In unityStereoVisualSLAM (line 159)
[xyzPoints, matchedPairs] = helperReconstructFromStereo(currILeftGray, currIRightGray, currFeaturesLeft, ...
This warning popped up several times. What I observed is that whenever the warning appears, the display jumps back to the first few frames for about a second and then resumes at the current frame. I tried displaying currFrameIdx inside the main loop to observe it, but there seems to be no problem there: currFrameIdx never actually goes back to the first few frames. That's why I'm really confused about what's happening.
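From the stack trace, the warning originates in the RANSAC pose estimation (estworldpose) called inside helperTrackLastKeyFrame. One experiment I am considering (just a sketch; the input variable names are illustrative and may differ from the ones used inside that helper) is to relax the RANSAC settings where estworldpose is called:
% Sketch only: where helperTrackLastKeyFrame calls estworldpose, the RANSAC
% settings could be relaxed along these lines (values are illustrative):
[worldPose, inlierIdx] = estworldpose( ...
    matchedImagePoints, matchedWorldPoints, intrinsics, ...
    MaxReprojectionError=3, ...   % default is 1 pixel; larger is more tolerant
    Confidence=95, ...            % default is 99; lower needs fewer trials
    MaxNumTrials=5000);           % default is 1000; more trials before giving up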
Another issue I have observed:
  • After map initialization, several frames right after the origin point are skipped.
My guess is that the camera is not able to localize itself properly, which is why these warnings are thrown. I think it might be because the road texture I used for the simulated environment is basically a single tile that repeats itself.
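To check whether the repeating texture causes ambiguous stereo matches, one quick sanity check could be to visualize the left/right matches for a single frame (a rough sketch; it reuses helperDetectAndExtractFeatures and the scaleFactor/numLevels values from the code below, and the frame index is arbitrary):
% Sketch: visualize left/right ORB matches for one frame to see whether the
% repeating road texture produces ambiguous or clustered matches.
I1 = im2gray(readimage(imdsLeft, 100));   % frame index 100 is arbitrary
I2 = im2gray(readimage(imdsRight, 100));
[f1, p1] = helperDetectAndExtractFeatures(I1, scaleFactor, numLevels);
[f2, p2] = helperDetectAndExtractFeatures(I2, scaleFactor, numLevels);
pairs = matchFeatures(f1, f2, Unique=true, MaxRatio=0.7);
figure
showMatchedFeatures(I1, I2, p1(pairs(:,1)), p2(pairs(:,2)), "montage");
title(sprintf("%d putative stereo matches", size(pairs,1)));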
CODE:
dataFolder = [fullfile(pwd), filesep, 'virStereoImageData'];
% zipFileName = [dataFolder, filesep, 'run_000005.zip'];
%mkdir(dataFolder);
imgFolderLeft = [dataFolder,'/images/left/'];
imgFolderRight = [dataFolder,'/images/right/'];
imdsLeft = imageDatastore(imgFolderLeft);
imdsRight = imageDatastore(imgFolderRight);
% Inspect the first pair of images
currFrameIdx = 1;
currILeft = readimage(imdsLeft, currFrameIdx);
currIRight = readimage(imdsRight, currFrameIdx);
imshowpair(currILeft, currIRight, 'montage');
% Set random seed for reproducibility
rng(0);
% Load the initial camera pose. The initial camera pose is derived based
% on the transformation between the camera and the vehicle:
% http://asrl.utias.utoronto.ca/datasets/2020-vtr-dataset/text_files/transform_camera_vehicle.tx
% initialPose = rigid3d(eye(3), [0 0 0]);
% save initialPose;
% zipFileName;
initialPoseData = load("initialPose.mat");
initialPose = initialPoseData.initialPose;
% Construct the reprojection matrix for 3-D reconstruction.
% The intrinsics for the dataset can be found at the following page:
% http://asrl.utias.utoronto.ca/datasets/2020-vtr-dataset/text_files/camera_parameters.txt
focalLength = [358.288 358.198]; % specified in pixels
principalPoint = [640.510 480.563]; % specified in pixels [x, y]
baseline = 0.24; % specified in meters
imageSize = size(currILeft,[1,2]); % in pixels [mrows, ncols]
intrinsics = cameraIntrinsics(focalLength, principalPoint, imageSize);
reprojectionMatrix = [1, 0, 0, -principalPoint(1);
    0, 1, 0, -principalPoint(2);
    0, 0, 0, focalLength(1);
    0, 0, 1/baseline, 0];
% In this example, the images are already undistorted and rectified. In a general workflow,
% uncomment the following code to undistort and rectify the images.
% currILeft = undistortImage(currILeft, intrinsics);
% currIRight = undistortImage(currIRight, intrinsics);
% stereoParams = stereoParameters(intrinsics, intrinsics, eye(3), [-baseline, 0 0]);
% [currILeft, currIRight] = rectifyStereoImages(currILeft, currIRight, stereoParams, OutputView="full");
% Detect and extract ORB features from the rectified stereo images
scaleFactor = 1.2;
numLevels = 8;
[currFeaturesLeft, currPointsLeft] = helperDetectAndExtractFeatures(im2gray(currILeft), scaleFactor, numLevels);
[currFeaturesRight, currPointsRight] = helperDetectAndExtractFeatures(im2gray(currIRight), scaleFactor, numLevels);
% Match feature points between the stereo images and get the 3-D world positions
disparityRange = [0 64]; % specified in pixels (the difference between max and min must be divisible by 8)
[xyzPoints, matchedPairs] = helperReconstructFromStereo(im2gray(currILeft), im2gray(currIRight), ...
    currFeaturesLeft, currFeaturesRight, currPointsLeft, currPointsRight, reprojectionMatrix, initialPose, disparityRange);
% Create an empty imageviewset object to store key frames
vSetKeyFrames = imageviewset;
% Create an empty worldpointset object to store 3-D map points
mapPointSet = worldpointset;
% Add the first key frame
currKeyFrameId = 1;
vSetKeyFrames = addView(vSetKeyFrames, currKeyFrameId, initialPose, Points=currPointsLeft,...
    Features=currFeaturesLeft.Features);
% Add 3-D map points
[mapPointSet, stereoMapPointsIdx] = addWorldPoints(mapPointSet, xyzPoints);
% Add observations of the map points
mapPointSet = addCorrespondences(mapPointSet, currKeyFrameId, stereoMapPointsIdx, matchedPairs(:, 1));
% Update view direction and depth
mapPointSet = updateLimitsAndDirection(mapPointSet, stereoMapPointsIdx, vSetKeyFrames.Views);
% Update representative view
mapPointSet = updateRepresentativeView(mapPointSet, stereoMapPointsIdx, vSetKeyFrames.Views);
% Visualize matched features in the first key frame
featurePlot = helperVisualizeMatchedFeaturesStereo(currILeft, currIRight, currPointsLeft, ...
    currPointsRight, matchedPairs);
% Visualize initial map points and camera trajectory
mapPlot = helperVisualizeMotionAndStructureStereo(vSetKeyFrames, mapPointSet);
% Show legend
showLegend(mapPlot);
% Load the bag of features data created offline
bofData = load("bagOfFeaturesDataSLAM.mat");
% Initialize the place recognition database
loopDatabase = invertedImageIndex(bofData.bof,SaveFeatureLocations=false);
% Add features of the first key frame to the database
addImageFeatures(loopDatabase, currFeaturesLeft, currKeyFrameId);
% ViewId of the last key frame
lastKeyFrameId = currKeyFrameId;
% Index of the last key frame in the input image sequence
lastKeyFrameIdx = currFrameIdx;
% Indices of all the key frames in the input image sequence
addedFramesIdx = lastKeyFrameIdx;
currFrameIdx = 2;
isLoopClosed = false;
% Main loop
isLastFrameKeyFrame = true;
while ~isLoopClosed && currFrameIdx <= numel(imdsLeft.Files)
    currILeft = readimage(imdsLeft, currFrameIdx);
    currIRight = readimage(imdsRight, currFrameIdx);
    currILeft = imgaussfilt3(currILeft);
    currIRight = imgaussfilt3(currIRight);
    currILeftGray = im2gray(currILeft);
    currIRightGray = im2gray(currIRight);
    [currFeaturesLeft, currPointsLeft] = helperDetectAndExtractFeatures(currILeftGray, scaleFactor, numLevels);
    [currFeaturesRight, currPointsRight] = helperDetectAndExtractFeatures(currIRightGray, scaleFactor, numLevels);
    % Track the last key frame
    % trackedMapPointsIdx: Indices of the map points observed in the current left frame
    % trackedFeatureIdx: Indices of the corresponding feature points in the current left frame
    [currPose, trackedMapPointsIdx, trackedFeatureIdx] = helperTrackLastKeyFrame(mapPointSet, ...
        vSetKeyFrames.Views, currFeaturesLeft, currPointsLeft, lastKeyFrameId, intrinsics, scaleFactor);
    if isempty(currPose) || numel(trackedMapPointsIdx) < 30
        currFrameIdx = currFrameIdx + 1;
        continue
    end
    % Track the local map and check if the current frame is a key frame.
    % localKeyFrameIds: ViewId of the connected key frames of the current frame
    numSkipFrames = 5;
    numPointsKeyFrame = 80;
    [localKeyFrameIds, currPose, trackedMapPointsIdx, trackedFeatureIdx, isKeyFrame] = ...
        helperTrackLocalMap(mapPointSet, vSetKeyFrames, trackedMapPointsIdx, ...
        trackedFeatureIdx, currPose, currFeaturesLeft, currPointsLeft, intrinsics, scaleFactor, numLevels, ...
        isLastFrameKeyFrame, lastKeyFrameIdx, currFrameIdx, numSkipFrames, numPointsKeyFrame);
    % Match feature points between the stereo images and get the 3-D world positions
    [xyzPoints, matchedPairs] = helperReconstructFromStereo(currILeftGray, currIRightGray, currFeaturesLeft, ...
        currFeaturesRight, currPointsLeft, currPointsRight, reprojectionMatrix, currPose, disparityRange);
    % Visualize matched features in the stereo image
    updatePlot(featurePlot, currILeft, currIRight, currPointsLeft, currPointsRight, trackedFeatureIdx, matchedPairs);
    if ~isKeyFrame && currFrameIdx < numel(imdsLeft.Files)
        currFrameIdx = currFrameIdx + 1;
        isLastFrameKeyFrame = false;
        continue
    else
        [untrackedFeatureIdx, ia] = setdiff(matchedPairs(:, 1), trackedFeatureIdx);
        xyzPoints = xyzPoints(ia, :);
        isLastFrameKeyFrame = true;
    end
    % Update current key frame ID
    currKeyFrameId = currKeyFrameId + 1;
    % Add the new key frame
    [mapPointSet, vSetKeyFrames] = helperAddNewKeyFrame(mapPointSet, vSetKeyFrames, ...
        currPose, currFeaturesLeft, currPointsLeft, trackedMapPointsIdx, trackedFeatureIdx, localKeyFrameIds);
    % Remove outlier map points that are observed in fewer than 3 key frames
    if currKeyFrameId == 2
        triangulatedMapPointsIdx = [];
    end
    mapPointSet = helperCullRecentMapPoints(mapPointSet, ...
        trackedMapPointsIdx, triangulatedMapPointsIdx, stereoMapPointsIdx);
    % Add new map points computed from disparity
    [mapPointSet, stereoMapPointsIdx] = addWorldPoints(mapPointSet, xyzPoints);
    mapPointSet = addCorrespondences(mapPointSet, currKeyFrameId, stereoMapPointsIdx, ...
        untrackedFeatureIdx);
    % Create new map points by triangulation
    minNumMatches = 20;
    minParallax = 0.35;
    [mapPointSet, vSetKeyFrames, triangulatedMapPointsIdx, stereoMapPointsIdx] = helperCreateNewMapPointsStereo( ...
        mapPointSet, vSetKeyFrames, currKeyFrameId, intrinsics, scaleFactor, minNumMatches, minParallax, ...
        untrackedFeatureIdx, stereoMapPointsIdx);
    % Local bundle adjustment
    [refinedViews, dist] = connectedViews(vSetKeyFrames, currKeyFrameId, MaxDistance=2);
    refinedKeyFrameIds = refinedViews.ViewId;
    % Always fix the first two key frames
    fixedViewIds = refinedKeyFrameIds(dist==2);
    fixedViewIds = fixedViewIds(1:min(10, numel(fixedViewIds)));
    % Refine local key frames and map points
    [mapPointSet, vSetKeyFrames, mapPointIdx] = bundleAdjustment(...
        mapPointSet, vSetKeyFrames, [refinedKeyFrameIds; currKeyFrameId], intrinsics, ...
        FixedViewIDs=fixedViewIds, PointsUndistorted=true, AbsoluteTolerance=1e-7,...
        RelativeTolerance=1e-16, Solver='preconditioned-conjugate-gradient', MaxIteration=10);
    % Update view direction and depth
    mapPointSet = updateLimitsAndDirection(mapPointSet, mapPointIdx, ...
        vSetKeyFrames.Views);
    % Update representative view
    mapPointSet = updateRepresentativeView(mapPointSet, mapPointIdx, ...
        vSetKeyFrames.Views);
    % Visualize 3-D world points and camera trajectory
    updatePlot(mapPlot, vSetKeyFrames, mapPointSet);
    % Check loop closure after some key frames have been created
    if currKeyFrameId > 50
        % Minimum number of feature matches of loop edges
        loopEdgeNumMatches = 50;
        % Detect possible loop closure key frame candidates
        [isDetected, validLoopCandidates] = helperCheckLoopClosure(vSetKeyFrames, currKeyFrameId, ...
            loopDatabase, currILeft, loopEdgeNumMatches);
        isTooCloseView = currKeyFrameId - max(validLoopCandidates) < 100;
        if isDetected && ~isTooCloseView
            % Add loop closure connections
            [isLoopClosed, mapPointSet, vSetKeyFrames] = helperAddLoopConnectionsStereo(...
                mapPointSet, vSetKeyFrames, validLoopCandidates, currKeyFrameId, ...
                currFeaturesLeft, currPointsLeft, loopEdgeNumMatches);
        end
    end
    % If no loop closure is detected, add current features into the database
    if ~isLoopClosed
        addImageFeatures(loopDatabase, currFeaturesLeft, currKeyFrameId);
    end
    % Update IDs and indices
    lastKeyFrameId = currKeyFrameId;
    lastKeyFrameIdx = currFrameIdx;
    addedFramesIdx = [addedFramesIdx; currFrameIdx];
    currFrameIdx = currFrameIdx + 1;
end % End of main loop
% Optimize the poses
vSetKeyFramesOptim = optimizePoses(vSetKeyFrames, minNumMatches, Tolerance=1e-16);
% Update map points after optimizing the poses
mapPointSet = helperUpdateGlobalMap(mapPointSet, vSetKeyFrames, vSetKeyFramesOptim);
updatePlot(mapPlot, vSetKeyFrames, mapPointSet);
% Plot the optimized camera trajectory
optimizedPoses = poses(vSetKeyFramesOptim);
plotOptimizedTrajectory(mapPlot, optimizedPoses)
% Update legend
showLegend(mapPlot);
% Load GPS data
gpsData = load("gpsLocation.mat");
gpsLocation = gpsData.gpsLocation;
% Transform GPS locations to the reference coordinate system
gTruth = helperTransformGPSLocations(gpsLocation, optimizedPoses);
% Plot the GPS locations
plotActualTrajectory(mapPlot, gTruth(addedFramesIdx, :));
% Show legend
showLegend(mapPlot);
% Create an array of pointCloud objects to store the 3-D world points
% associated with the key frames
ptClouds = repmat(pointCloud(zeros(1, 3)), numel(addedFramesIdx), 1);
for i = 1: numel(addedFramesIdx)
    ILeft = readimage(imdsLeft, addedFramesIdx(i));
    IRight = readimage(imdsRight, addedFramesIdx(i));
    % Reconstruct scene from disparity
    disparityMap = disparitySGM(im2gray(ILeft), im2gray(IRight), DisparityRange=disparityRange);
    xyzPoints = reconstructScene(disparityMap, reprojectionMatrix);
    % Ignore the upper half of the images which mainly show the sky
    xyzPoints(1:floor(imageSize(1)/2), :, :) = NaN;
    % Ignore the lower part of the images which show the vehicle
    xyzPoints(imageSize(1)-50:end, :, :) = NaN;
    xyzPoints = reshape(xyzPoints, [imageSize(1)*imageSize(2), 3]);
    % Get color from the color image
    colors = reshape(ILeft, [imageSize(1)*imageSize(2), 3]);
    % Remove world points that are too far away from the camera
    validIndex = xyzPoints(:, 3) > 0 & xyzPoints(:, 3) < 100/reprojectionMatrix(4, 3);
    xyzPoints = xyzPoints(validIndex, :);
    colors = colors(validIndex, :);
    % Transform world points to the world coordinates
    currPose = optimizedPoses.AbsolutePose(i);
    xyzPoints = transformPointsForward(currPose, xyzPoints);
    ptCloud = pointCloud(xyzPoints, Color=colors);
    % Downsample the point cloud
    ptClouds(i) = pcdownsample(ptCloud, random=0.5);
end
% Concatenate the point clouds
pointCloudsAll = pccat(ptClouds);
% Visualize the point cloud
figure
ax = pcshow(pointCloudsAll,VerticalAxis="y", VerticalAxisDir="down");
xlabel('X')
ylabel('Y')
zlabel('Z')
title('Dense Reconstruction')
% Display the bird's eye view of the scene
view(ax, [0 0 1]);
camroll(ax, 90);
I want to know your insights about this and any suggestions I can apply to make it work. I just couldn't wait for my stereo camera to be delivered, so I tried this simulated setup in the meantime.
The YouTube link above shows a run of the code with the warning.
I would really appreciate your insights. Thank you!
  1 Comment
Qu Cao on 24 Apr 2023 (edited 24 Apr 2023)
Could you check the number of features extracted from each image? From the video it seems that not many features can be extracted. In my experience, synthetic images, especially those that do not contain much texture, are usually feature-sparse. To alleviate the problem, you could add more actors to the scene so that you get more features to track.
In the Develop Visual SLAM Algorithm Using Unreal Engine Simulation example, you can see that a lot of parked cars are added to the parking lot so that the images captured by the camera provide more features.
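A quick way to do that check could look like this (a sketch using detectORBFeatures with the same pyramid settings as the example; the threshold of what counts as "enough" features is scene-dependent):
% Sketch: count ORB features per left frame. Consistently low counts
% suggest the scene is too texture-poor to track reliably.
numFeatures = zeros(numel(imdsLeft.Files), 1);
for k = 1:numel(imdsLeft.Files)
    I = im2gray(readimage(imdsLeft, k));
    pts = detectORBFeatures(I, ScaleFactor=1.2, NumLevels=8);
    numFeatures(k) = pts.Count;
end
figure, plot(numFeatures), xlabel("Frame"), ylabel("Number of ORB features")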


Answers (0)
