optimizing: fastest way to detect if ANY points are within distance N of points?

22 views (last 30 days)
I'm creating a volume from a very large number of points, created by accumulating a large number of point cloud objects in a box without overlapping any existing points. Everything works so far. But testing if the new cloud is touching anything already accumulated is very slow - each cloud has >1e5 points and the whole thing can accumulate a vast number of points, creating tons of data to sift through.
I'm using a KDTreesearcher object to search the existing points with rangesearch(), after pre-pruning to points within the maximal distance that a new cloud could have points within. It's dramatically faster than pdist2 or looping through the points but still the vast majority of runtime. Making a new KDT when needed is the slower part, though faster than the exhaustive search method.
Is there something faster? Some hidden tool on the FEX that works like a short-circuit rangesearch rather than exhaustively searching each point? I can't find anything even close other than older implementations of KDT from before it was implemented in matlab.
  4 Comments
James Tursa
James Tursa on 16 May 2023
Edited: James Tursa on 16 May 2023
Seems like a fairly trivial mex routine for short-circuiting. How are the data represented? Double arrays, or ...? Is there any ordering (e.g., sorting) to the data?
Carson Purnell
Carson Purnell on 16 May 2023
Data are nx3 double arrays, no ordering of any kind.
Here's my current testing subfunction. c is the large initial set of points, pts is the small set being tested.
function err = proxtest(c,pts,tol)
l = min(pts,[],1)-tol; h = max(pts,[],1)+tol; %low and high bounds per dimension
ix = c>l & c<h;
ix = find(sum(ix,2)>2); %prefiltering bottleneck
err=0;
if ~isempty(ix) %skip KDT when no overlap is possible
buck = round( size(c,1)/1650 );
modeltree = KDTreeSearcher(c(ix,:),'Bucketsize',buck);
[~,d] = rangesearch(modeltree,pts,tol,'SortIndices',0); %
d = [d{:}]; if any(d<tol), err=1; end %check output distances
end
end

Sign in to comment.

Answers (2)

Kartik
Kartik on 15 May 2023
Hi,
One potential approach to make the search faster is to use a bounding box to filter out points that are definitely outside the search radius before performing a more precise search with rangesearch. This can reduce the number of points that are included in the search, potentially improving run time. Another approach is to divide the existing points into smaller sub-sections and perform rangesearch for each sub-section. This can help reduce the search time while still being more precise than using pdist2.
MathWorks provides useful functions for point cloud processing and spatial indexing, which can be useful for performing efficient operations on large point sets. Specifically, the `pointCloud` and `pointCloudPartition` functions can be used for operations on point clouds. The `KDTreeSearcher` and `rangesearch` functions can be used for efficient searching of points. Additionally, the `BoundingBox` function can be used to define a bounding box region of interest.
Here are some documentation links that are relevant to your task:
  1. `KDTreeSearcher`: https://www.mathworks.com/help/stats/kdtreesearcher.html?s_tid=doc_ta
  2. `rangesearch`: https://www.mathworks.com/help/stats/exhaustivesearcher.rangesearch.html?searchHighlight=rangesearch&s_tid=srchtitle_rangesearch_2
  3. `BoundingBox`: https://www.mathworks.com/help/matlab/ref/polyshape.boundingbox.html?searchHighlight=BoundingBox&s_tid=srchtitle_BoundingBox_1
  4. `pointCloud`: https://www.mathworks.com/help/vision/ref/pointcloud.html
  2 Comments
Carson Purnell
Carson Purnell on 15 May 2023
I'm already pruning the points outside the search area before using rangesearch. I need something better - short circuit logic, a non-KDT option that can mutate points quickly or search faster, or to MEX my own prefilter function.
So unless matlab has a cumulative octree that I missed it appears that I need some wacky FEX assistance.
Huiyan Liang
Huiyan Liang on 9 Dec 2023
Hello! Could you please clarify the mean of filtering out points and how to realize it specifically? Does it mean to filter out points outside the search radius for each query point?

Sign in to comment.


James Tursa
James Tursa on 25 May 2023
Edited: James Tursa on 25 May 2023
Here is a naive brute force mex routine you can try out. It makes no attempts at cache optimization or multi-threading, and is probably sensitive to the relative sizes of the matrices involved. But it does short circuit. Maybe for your particular use case it will help with timing. Or if your data were pre-sorted in some way then maybe we could take advantage of that with a different approach.
// proxtest_mex.c tests if any 3D points of two matrices are withing tol of each other
//
// FLAG = proxtest_mex(A,B,tol)
//
// A = Nx3 full real double matrix
// B = Mx3 full real double matrix
// tol = positive tolerance scalar
// FLAG = 0 (nothing close)
// 1 (at least one point in B close to a point in A)
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize N, M, i, j;
double tol2;
double *Ax, *Ay, *Az, *Bx, *By, *Bz, *BX, *BY, *BZ;
// Argument checks
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs.");
}
if( nrhs != 3 ) {
mexErrMsgTxt("Need exactly three inputs.");
}
if( !mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) ||
mxIsSparse(prhs[0]) || mxIsSparse(prhs[1]) ||
mxIsComplex(prhs[0]) || mxIsComplex(prhs[1]) ||
mxGetNumberOfDimensions(prhs[0]) != 2 ||
mxGetNumberOfDimensions(prhs[1]) != 2 ||
mxGetN(prhs[0]) != 3 || mxGetN(prhs[1]) != 3 ) {
mexErrMsgTxt("First two inputs must be Nx3 and Mx3 full real double 2D matrices.");
}
if( !mxIsNumeric(prhs[2]) || mxGetNumberOfElements(prhs[2]) != 1 ) {
mexErrMsgTxt("Third input must be numeric scalar.");
}
// Get sizes and data pointers
N = mxGetM(prhs[0]);
M = mxGetM(prhs[1]);
Ax = mxGetData(prhs[0]); Ay = Ax + N; Az = Ay + N;
BX = mxGetData(prhs[1]); BY = BX + M; BZ = BY + M;
tol2 = mxGetScalar(prhs[2]);
if( tol2 <= 0.0 ) {
mexErrMsgTxt("Third input tolerance must be positive.");
}
tol2 = tol2 * tol2;
// Brute force checking
for( i=0; i<N; i++ ) {
Bx = BX; By = BY; Bz = BZ;
for( j=0; j<M; j++ ) {
if( (*Ax - *Bx)*(*Ax - *Bx) + (*Ay - *By)*(*Ay - *By) + (*Az - *Bz)*(*Az - *Bz) < tol2 ) {
plhs[0] = mxCreateDoubleScalar(1.0);
return; // short-circuit return
}
Bx++; By++; Bz++;
}
Ax++; Ay++; Az++;
}
plhs[0] = mxCreateDoubleScalar(0.0);
}

Categories

Find more on Introduction to Installation and Licensing in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!