Group data in one table to analyze data by set intervals in another table

Question

0 votes

Hello all,

I would like to create a script that would perform different statistical calculation on one table based on ranges given in another table.

Table A has two columns that include the intervals ("Start" in the first column and "Finish" in the second column). The number of rows of Start-Finish intervals can vary, based on data to be analyzed in the second table). Example of the first table (Table A):

Start Finish

80.6 82

106.5 108.9

134.2 136.5

157.2 159.6

201.3 205.2

...

I would like to create a script that would perform different statistical calculation on the second table (Table B) based on ranges given in Table A, e.g. analyze "Data1", "Data2", "Data3", "Data4" by looking up the ranges within "length" column. The example of the second table (Table B) is as follows :

Length,Data1,Data2,Data3,Data4

...

144.135,0.00,103.00,100.00,68.00,0.00

144.137,0.00,104.00,100.00,67.00,0.00

144.140,0.00,105.00,100.00,67.00,0.00

144.142,0.00,107.00,100.00,67.00,0.00

144.145,0.00,108.00,100.00,66.00,0.00

144.202,0.00,165.00,69.00,52.00,0.00

144.205,0.00,176.00,83.00,51.00,0.00

...

Table B can include as much as few millions of rows with data.

So, in the end, I would like to be able to make a table that would look like Table C below (example) with added columns of calculated statistics for given intervals:

Start Finish Data1SNR Data2SNR Data3Mean Data4Mean Data1STD ...ect

80.6 82

106.5 108.9

134.2 136.5

157.2 159.6

201.3 205.2

...

This seems to be a simple task, and I am beginning to learn Matlab, so I would greatly appreciate your help. Thanks.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Peter Perkins on 29 Jun 2017

Open in MATLAB Online

1 vote

The fact that you have discontiguous length bins makes this more complicated than it otherwise would be. There are several ways to handle this, here's one:

Create a new variable in B, maybe called LengthBin, by calling discretize on B.Length, using a merged combination of A.Start and A.Finish as bin edges. Perhaps

edges = sort([A.Start; A.Finish]);
B.LengthBin = discretize(B.Length,edges);

You probably want to do something to account for the fact that discretize expects contiguous bins, and thus takes a list of "edges", whereas you have discontiguous bins and two lists of "lower edges" and "upper edges", but that's a minor detail. Maybe just throw out the rows in B where B.LengthBin is not an odd number. Perhaps

B(mod(B.BinLength,2)==0,:) = []

Then call varfun on B, using LengthBin as your grouping variable.

BStats = varfun(@mean,B,'GroupingVariable','LengthBin','InputVariables',{'Data1' 'Data2' 'Data3' 'Data4'})

Now add the column vector (1:n)' to A as a new variable called A.LengthBin, and join A and B, throwing away the LengthBin variables you no longer need.

BStats = join(A,BStats,'Key,'LengthBin','LeftVariables',{'Start' 'Finish'},'RightVariables',{<your stats variables>})

You'll probably call varfun more than once because you want to compute multiple stats on each variable. Just horzcat all the results. Hope this helps.

1 Comment
Show -1 older comments Hide -1 older comments

Miklsom on 29 Jun 2017

This worked great, thanks Peter.

Sign in to comment.

Group data in one table to analyze data by set intervals in another table

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

1 Comment
Show -1 older comments Hide -1 older comments

More Answers (0)

Categories

Tags

Community Treasure Hunt

Group data in one table to analyze data by set intervals in another table

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

1 Comment Show -1 older comments Hide -1 older comments

More Answers (0)

Categories

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

1 Comment
Show -1 older comments Hide -1 older comments