# Minimum Spanning Tree - Path or Start-End

18 views (last 30 days)
Gern on 17 Jun 2021
Commented: Gern on 18 Jun 2021
Hello,
I computed the minimum spanning tree of 3D points and now want to extract the path (i.e. node per node) of the whole MST.
[EDIT]
I did this with the following code:
% Delaunay Triangulation
DT = delaunayTriangulation(pts); % pts: [nx3] Input points
e = DT.edges;
% Distances of the edges for weights
plist1 = pts(e(:,1),:);
plist2 = pts(e(:,2),:);
diffs = plist2-plist1;
dists = vecnorm(diffs');
% create Graph
G = graph(e(:,1), e(:,2), dists);
% Create Minimum spanning tree
[mst, pred] = minspantree(G);
I totally forgot to describe my very special input data. It is data sampled from a rail-bound measurement system (3D Positions), so the MST is almost a perfect path with few exceptions.
The predecessor nodes vector doesnt seem to fit my needs. I would expect the points to be topologically sorted, but this is not the case if I sort the input points using this vector.
Basically, it would be enough for me to just determine the "start" and "end"-nodes of the tree, just like the built-in plot-function does. Then, I could compute the shortest path, which would give me a node list. In other words, I want to extract [19578, 2207] from the mst.
Here is a picture of the plot-function result. If the plot function can do this almost instantly, there must be a way I can do it.
Thank you!
[EDIT]
I came up with this solution for now. It does not take too long, but longer than the plot function, which finds the same node-pair as "start" and "end", so I thought there might be a "trick".
function idx = sortMST(pts)
% function to build the mst like shown above
[mst,~] = buildMST(pts);
% possible endpoints
d1nodes = find(degree(mst)==1);
% the node-pair with the maximum distance is the one I am looking for
dists = distances(mst, d1nodes, d1nodes);
maximum = max(max(dists));
[i,~]=find(dists==maximum);
% Breadth-first graph search, to connect all nodes and find the path
idx = bfsearch(mst,d1nodes(i));
end

Christine Tobler on 17 Jun 2021
In general, you can't rely on the fact that a minimum spanning tree will just be one path. However, if you have a graph for which that's the case, you can find the two leaf nodes between which the path lies as follows:
rng default;
G = graph(sprandn(10, 10, 0.6), 'upper')
G =
graph with properties: Edges: [22×2 table] Nodes: [10×0 table]
T = minspantree(G);
% Plot G, highlight nodes and edges that are part of T
p = plot(G);
highlight(p, T);
% Compute leaf nodes of T (nodes that only have one edge)
leafNodes = find(degree(T) == 1)
leafNodes = 4×1
1 2 6 8
% If T were just one long line, leafNodes would have two elements,
% and you could use
% shortestpath(t, leafNodes(1), leafNodes(2));
% to find that path
Gern on 18 Jun 2021
Thank you for your detailed answer! I will mark this as solved for now but I will keep your idea of the iterative algorithm in mind, in case the distance computation gets to costly. Right now, at 50000 input points, there are only 200 leaf nodes, so its runs quite fast.

Steven Lord on 17 Jun 2021
If the plot function can do this almost instantly
The plot function plays "connect the dots". The start and end of the line to be drawn are the first and last points in the vector you pass into it, so there's no "determining" involved. [Well, if you're plotting a graph or digraph that does try to determine where the nodes should be placed using the various layout functions described in the documentation for the 'Layout' input to the graph and digraph plot functions, but I think that's not really what you meant.]
How are you computing the minimum spanning tree? Did you create a graph or digraph object and call minspantree on it?
What would you consider the "start" and "end" of the minimum spanning tree for the Bucky graph?
g = graph(bucky);
m = minspantree(g);
h = plot(g);
hold on
plot(m, 'XData', h.XData, 'YData', h.YData, 'EdgeColor', 'r', 'LineWidth', 4)
From visual inspection nodes 42, 58, 30, 47, 59, or 57 are all potential candidates as are many other nodes. Where would you say this tree starts and ends?
Gern on 17 Jun 2021
Thank you for your answer! I updated my question with details. The plot function (or the layout manager) seems to sort the points somehow, because the order, or the start and end nodes are always exactly the nodes I am looking for. Unfortunately, these points do not simply correspond to the first and last line of the MST output. I have found a solution to my problem, but it is not really optimal. Maybe it is also not optimally possible, but it confuses me that the plot-function somehow manages it.