Performance of selecting a subset of a GPUArray

Hi everyone,
I am currently working on a triangulation algorithm that lends itself well to GPU computing, since it exclusively involves linear algebra on a relatively large data set. To keep things simple, I won't go into all the details but will present a somewhat simplified version of my problem.
What I do is the following: I need to cross-check the distances between many lines in a set A and another set B in 3D space. This yields a matrix Dist of size numel(A) x numel(B) containing all the distances, and it runs super fast on the GPU, even though it is only a midrange device.
Now I want to select those combinations that fulfill a certain criterion (distance smaller than a certain value). Logical indexing of the type A(Dist < criterion) - where A has already been expanded to match the size of Dist - takes much longer than the actual calculation. Since I subsequently perform further cross-checks with another set C (and D, and so on), it is necessary to do this selection on the fly. Otherwise, I'd run into severe memory problems (Dist would grow to a size of numel(A) x numel(B) x numel(C) x numel(D) x ...).
Is there a faster way to do this kind of reduction on the GPU? I've already tried gathering the involved sets back to the CPU workspace, doing the selection part of the algorithm on the CPU, and uploading the result back to the GPU. As expected, this also performs very poorly.
All help is appreciated. Thank you
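For concreteness, here is a minimal sketch of the pattern I mean (all names, sizes, and the distance formula are illustrative, not my actual code):

```matlab
% Illustrative sketch only: stand-in distance computation and selection
A    = gpuArray.rand(1, 2000);        % one parameter per line in set A
B    = gpuArray.rand(3000, 1);        % one parameter per line in set B
Dist = abs(bsxfun(@minus, B, A));     % placeholder for the real line-distance matrix
Aexp = repmat(A, numel(B), 1);        % A expanded to the size of Dist
selected = Aexp(Dist < 0.01);         % logical indexing: the slow step
```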

Answers (1)

Edric Ellis
Edric Ellis on 16 Nov 2014
Logical indexing on the GPU is slower than indexing with doubles. So, if you reuse a logical indexing vector, you may well find it's faster to call FIND on it. Could you post a sketch of your "slow" code? It sounds like there might be a chance BSXFUN could be used...
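For example (an illustrative sketch only, assuming a mask that gets reused and expanded arrays Aexp/Bexp that match its size):

```matlab
mask = Dist < criterion;   % logical gpuArray
idx  = find(mask);         % convert to numeric indices once
% Reuse idx instead of mask for every subsequent indexing operation:
selA = Aexp(idx);
selB = Bexp(idx);
```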
  1 Comment
Alexander
Alexander on 17 Nov 2014
Hi Edric, thank you for your reply. I am already making heavy use of bsxfun, which has somewhat improved performance. Let me try to explain what happens in a bit more detail. The problem at hand is, more or less, to check whether line segments in 3D intersect. Since those line segments are calculated from measured data, I allow a small error to compensate for measurement noise (meaning the lines do not exactly intersect; they only get really close to each other). Here's an example.
Subset A includes 2 lines. Subset B has 3. It is really fast to calculate a table that represents the distances between each possible pair of lines:
%          A1    A2
Dist = [  .3    .2;   % B1
         1.2   1.6;   % B2
         1.5    .7 ]  % B3
Now I want to select only those entries smaller than 1 and create a table listing the corresponding pairs.
In this example, this would be:
[A1 B1;
A2 B1;
A2 B3]
where all AX and BX are just integer IDs. I do this by invoking the following functions:
SelectedColumnA = bsxfun(@times, AllIDsA', match_table);  % write A IDs where matches occur
SelectedColumnA = SelectedColumnA(match_table);           % keep only the matching entries
SelectedColumnA = SelectedColumnA(:);
SelectedColumnB = bsxfun(@times, AllIDsB, match_table);   % same for the B IDs
SelectedColumnB = SelectedColumnB(match_table);
SelectedColumnB = SelectedColumnB(:);
In many cases, this takes several times longer than the actual calculation of the distances, negating any advantage over the multithreaded CPU implementation.
If there's any way to avoid this overhead, I'd appreciate any hints. Thank you
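One alternative I could try (a sketch only, assuming AllIDsA and AllIDsB are vectors of integer IDs laid out as in the example above) would be to get the matching subscripts once with find and index both ID vectors directly, skipping the bsxfun/@times step entirely:

```matlab
match_table = Dist < 1;            % logical match matrix (B along rows, A along columns)
[rB, cA]    = find(match_table);   % subscripts of all matches, computed once
pairs       = [AllIDsA(cA(:)), AllIDsB(rB(:))];  % one [A-ID, B-ID] row per match
```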

