Sort cell strings according to specific subsets of those cell strings

Question

0 votes

Let's say I have a cell string with values:

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
              '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
              '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
              '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
              '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
              '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
              '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
              '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC';... 
              '2009.272.17.57.29.3695.AZ.SMER..BHZ.R.SAC';...
              '2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC';...
              '2009.272.17.57.30.0000.AZ.RDM..BHN.R.SAC';...
              '2009.272.17.57.30.8000.AZ.RDM..BHZ.R.SAC';...
              '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC';...
              '2009.272.17.57.31.8500.AZ.LVA2..BHE.R.SAC';...
              '2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC';... 
              '2009.272.17.57.32.0000.AZ.WMC..BHZ.R.SAC';...   
              '2009.272.17.57.32.6750.AZ.WMC..BHN.R.SAC';...   
              '2009.272.17.57.33.3195.AZ.KNW..BHZ.R.SAC';...   
              '2009.272.17.57.33.4750.AZ.TRO..BHN.R.SAC';...   
              '2009.272.17.57.33.7750.AZ.PFO..BHN.R.SAC';...   
              '2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC';...   
              '2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC';...  
              '2009.272.17.57.34.8000.AZ.TRO..BHZ.R.SAC';...   
              '2009.272.17.57.35.0000.AZ.WMC..BHE.R.SAC';...   
              '2009.272.17.57.35.0750.AZ.RDM..BHE.R.SAC';...   
              '2009.272.17.57.35.8945.AZ.KNW..BHE.R.SAC';...   
              '2009.272.17.57.36.0250.AZ.FRD..BHE.R.SAC';...   
              '2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC';...  
              '2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC';...   
              '2009.272.17.57.36.4500.AZ.SND..BHE.R.SAC';...   
              '2009.272.17.57.36.5000.AZ.TRO..BHE.R.SAC';...   
              '2009.272.17.57.36.5195.AZ.KNW..BHN.R.SAC';...   
              '2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC'};

Assume I do not know at which character positions the station name (e.g., CRY) or component name (e.g., BHE) start and end. The number of periods (".") is consistent across all names, however.

I have something fairly clunky to do this, but I am wondering if anyone can suggest a quick one/two-liner that would assume a string format of the general form:

YYYY.DDD.HH.MM.SS.ssss.$1.$2..$3.R.SAC

where:

$1 = Array name
$2 = Station name
$3 = Component name

And then sort the list with the primary and secondary sort order according to $2 and $3, respectively, so that the first 6 rows in the cell string would be:

2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC
2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC
2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC
2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC
2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC
2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC
...

4 Comments

Jan on 22 Jan 2012

It looks like the parts do *not* have the same length:

'2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC'

'2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC'

Dr. Seis on 22 Jan 2012

Oh, his question was about the "component" names, which all have the same number of characters (i.e., 3). The "station" names do not: they range from 3 to 4 characters.

Answer 1

Oleg Komarov on 22 Jan 2012

2 votes

% Split each name using '.' as the delimiter (the empty field between ".." counts as one field)
splt = regexpi(filename,'\.','split');
% Sort according to the 8th (station) and 10th (component) columns
[sorted,idx] = sortrows(cat(1,splt{:}),[8,10])

Now you can use the sorted split array, or apply idx to filename, i.e. filename(idx), to reorder the original strings.
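
A minimal sketch of the second option, applying the sort index back to the unsplit names. It uses only four of the names from the question to stay short, and the variable name sortedNames is made up for illustration.

```matlab
% Small subset of the file names from the question (for illustration only)
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC'; ...
            '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC'; ...
            '2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC'; ...
            '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC'};

% Split on '.'; each name yields 12 fields (field 9 is the empty one between "..")
splt = regexp(filename, '\.', 'split');

% Sort rows by station (column 8), then by component (column 10)
[~, idx] = sortrows(cat(1, splt{:}), [8, 10]);

% Reorder the original, unsplit names (sortedNames is a hypothetical name)
sortedNames = filename(idx);
```

Because the split is done on the delimiter rather than on fixed character positions, the 4-character station names (LVA2, SMER) need no special handling.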

2 Comments

Dr. Seis on 22 Jan 2012

Just what I was looking for. Thanks, Oleg!

Jan on 23 Jan 2012

+1 for the compact REGEXP call.

Answer 2

Jan on 22 Jan 2012

1 vote

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
            '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
            '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
            '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
            '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
            '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
            '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
            '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC'};
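n = numel(filename);
C2 = cell(1, n);   % station names ($2)
C3 = cell(1, n);   % component names ($3)
for iC = 1:n
  % Character 27 is where the station name starts in these names
  % (fixed-width date/time fields plus a 2-character array name).
  D      = textscan(filename{iC}(27:end), '%s', 'Delimiter', '.');
  C2{iC} = D{1}{1};   % first field: station
  C3{iC} = D{1}{3};   % third field: component (the second is the empty field between "..")
end
% A kind of SORTROWS: sort by the secondary key first, then by the primary key.
% SORT keeps tied elements in their existing order, so the secondary order survives.
[dummy, ind3] = sort(C3);
[dummy, ind2] = sort(C2(ind3));
index         = ind3(ind2);
filename      = filename(index);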

3 Comments

Dr. Seis on 22 Jan 2012

Thanks for the updated code... +1!

Jan on 23 Jan 2012

While Oleg's REGEXP is much nicer than calling TEXTSCAN in a loop, SORTROWS does exactly the same as my sorting method, but with a lot of overhead.
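
To illustrate the equivalence claimed above, here is a small sketch comparing the two-pass sort with a single SORTROWS call. The station and component lists are made-up sample data, and the names idxTwoPass, idxSortrows and sameOrder are hypothetical. It assumes that SORT preserves the order of tied elements, which the two-pass approach depends on. It does not measure the overhead.

```matlab
% Hypothetical station/component lists, chosen only for illustration
station   = {'SMER'; 'FRD'; 'SMER'; 'BZN'; 'FRD'};
component = {'BHE';  'BHN'; 'BHN';  'BHE'; 'BHZ'};

% Two-pass approach: sort by the secondary key first, then by the primary key.
% This relies on SORT keeping tied elements in their existing order.
[~, ind3]  = sort(component);
[~, ind2]  = sort(station(ind3));
idxTwoPass = ind3(ind2);

% Single call: SORTROWS with the primary key in column 1, the secondary in column 2
[~, idxSortrows] = sortrows([station, component], [1, 2]);

% True if both approaches produce the same permutation
sameOrder = isequal(idxTwoPass, idxSortrows);
```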

Answer 3

Dr. Seis on 22 Jan 2012

0 votes

Here is the clunky version I have been using:

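     numFiles = numel(filename);
     sortcell = cell(numFiles,2);   % column 1: station, column 2: component
     sortind = zeros(numFiles,4);   % [station start, station end, component start, component end]
     for i = 1 : numFiles
         % The station name ends just before the double period
         sortind(i,2)=strfind(filename{i},'..')-1;
         % Walk backwards to the preceding period to find where the station starts
         for j = sortind(i,2):-1:1
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,1)=j;
         end
         % The component name starts right after the double period
         sortind(i,3)=sortind(i,2)+3;
         % Walk forwards to the next period to find where the component ends
         for j = sortind(i,3):length(filename{i})
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,4)=j;
         end
         sortcell(i,1)=cellstr(filename{i}(sortind(i,1):sortind(i,2)));
         sortcell(i,2)=cellstr(filename{i}(sortind(i,3):sortind(i,4)));
     end
     % Sort by component first, then by station, so components stay ordered within each station
     [tempcell,tempind1]=sort(sortcell(:,2));
     [tempcell,tempind2]=sort(sortcell(tempind1,1));
     filename = filename(tempind1(tempind2));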
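
For comparison, the same idea (locate the ".." and take the fields on either side of it) can probably be written more compactly with REGEXP tokens. This is a sketch using a swapped-in technique, not the code above. It assumes each name contains exactly one ".." and uses four names from the question. The names tok, keys and sortedNames are hypothetical.

```matlab
% Small subset of the file names from the question (for illustration only)
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC'; ...
            '2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC'; ...
            '2009.272.17.57.31.8500.AZ.LVA2..BHE.R.SAC'; ...
            '2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC'};

% Capture the field just before ".." (station) and the field just after it (component)
tok = regexp(filename, '([^.]+)\.\.([^.]+)', 'tokens', 'once');

% Stack the tokens into an n-by-2 cell array: {station, component}
keys = cat(1, tok{:});

% Primary sort on station, secondary sort on component
[~, idx] = sortrows(keys, [1, 2]);
sortedNames = filename(idx);
```

Unlike splitting on every period, this does not depend on how many fields come before the station name, only on the ".." being present.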
