Import data with textscan that is non-periodic
Hi,
I need to import some data from a file that is structured as follows:
Time1 (X11 Y11 Z11) (X12 Y12 Z12) ... (X1n Y1n Z1n)
Time2 (X21 Y21 Z21) (X22 Y22 Z22) ... (X2n Y2n Z2n)
...
Time_m (Xm1 Ym1 Zm1) (Xm2 Ym2 Zm2) ... (Xmn Ymn Zmn)
I would like to obtain a Time vector and three matrices X, Y and Z.
I've tried textscan, but the Time at the beginning of each line gives me problems: if I use '%f (%f %f %f)' as formatSpec, it only reads the first 4 numbers; if I use '(%f %f %f)', it does not read anything.
I've managed to solve this in a horrible way:
formatSpec = '%f';                                   % leading time column
for i = 1:nprobes
    formatSpec = strcat(formatSpec, ' (%f %f %f)');  % one (X Y Z) group per probe
end
data = textscan(FileID, formatSpec, 'HeaderLines', 4);
This gives me a 1-by-(3*N+1) cell array, which I need to merge as:
X=[data{2},data{5},data{8}...data{3N-1}];
Y=[data{3},data{6},data{9}...data{3N}]
Z=[data{4},data{7},data{10}...data{3N+1}]
but I don't know how to do this, since N is very large.
Can you please help?
Thanks
4 Comments
Luca Amerio
on 21 Mar 2013
One way to get n is to extract it from the header or from the first line of data. Ideally, your solution should not depend on knowing m in advance: read lines until the end of the file rather than writing a loop for line = 1:m. You can define m after reading all the lines, as the number of rows in your data array, or increment a counter as you read lines if you don't store the data.
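The read-until-end-of-file idea could be sketched as follows. This is only a minimal sketch: the temporary sample file, its contents, and the names fname, tline and m are assumptions made for illustration, not part of the original file or code.

```matlab
% Write a small hypothetical sample file so the sketch is self-contained.
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '0.1 (1 2 3) (4 5 6)\n');
fprintf(fid, '0.2 (7 8 9) (10 11 12)\n');
fprintf(fid, '0.3 (13 14 15) (16 17 18)\n');
fclose(fid);

% Read line by line until end of file; m is counted, not known in advance.
fid = fopen(fname, 'r');
m = 0;
tline = fgetl(fid);
while ischar(tline)          % fgetl returns -1 (not a char) at end of file
    m = m + 1;               % a real reader would parse tline here
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);               % clean up the temporary file
```

After the loop, m holds the number of lines that were actually present in the file.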
Here is an example of how to extract n after you have read the first line. Assume
>> s = '4.28    (3.2 1.8 7.0) (3.2 19.8 7.0) (0.2 1.8 7.0) (327.1 1.8 7.0)' ;
If the number format is not completely fixed, or if the amount of white space can vary, you can use REGEXP pattern matching to count the number of coordinate groups in parentheses:
>> regexp(s, '\(.+?\)')
ans =
9 23 38 52
REGEXP returns the starting position in the string of each block in parentheses. Counting them provides you with n:
>> n = numel(regexp(s, '\(.+?\)'))
n =
4
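Once n is known, it can be used to build the format string without a loop. The following is a sketch under assumptions: s2 is a made-up line with irregular spacing, added only to show that the count does not depend on white space, and building formatSpec with repmat is one possible approach rather than the one used in the thread.

```matlab
% Sample line with 4 spaces after the time, and a hypothetical variant
% with irregular white space.
s1 = '4.28    (3.2 1.8 7.0) (3.2 19.8 7.0) (0.2 1.8 7.0) (327.1 1.8 7.0)';
s2 = '4.28 (3.2  1.8 7.0)   (3.2 19.8 7.0) (0.2 1.8   7.0)(327.1 1.8 7.0)';

% Count the parenthesized groups; the lazy quantifier keeps each match
% inside a single pair of parentheses.
n1 = numel(regexp(s1, '\(.+?\)'));
n2 = numel(regexp(s2, '\(.+?\)'));

% Build the format: one %f for the time, then n groups of three numbers.
formatSpec = ['%f', repmat(' (%f %f %f)', 1, n1)];
```

Both n1 and n2 count four groups here, and formatSpec contains one '(%f %f %f)' block per group.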
This is probably the most flexible solution, since regular expressions are a very powerful tool for pattern matching. It is not the fastest, though (even if it is quite efficient), and you could use something simpler if the amount of white space separating the time from the groups, the groups from each other, and the coordinates within a group is always the same. In the following, we count white spaces:
>> sum(s == ' ')
ans =
15
Now, if you know that there are always 4 spaces between the time and the first group, and one space between all other elements, you can compute n from this sum:
>> n = sum(s == ' ')/3 - 1 % (15-(4-1)) / (1+2)
n =
4
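The arithmetic behind this count: with k spaces after the time, 2 spaces inside each group and n-1 spaces between groups, the total is k + 2n + (n-1), so n = (total - (k-1)) / 3. The sketch below checks this on synthetic lines; the variable names and the generated lines are assumptions for illustration only.

```matlab
k = 4;                        % assumed number of spaces after the time
nFound = zeros(1, 5);
for n = 1:5
    % Build a synthetic line with n identical groups, single-spaced.
    groups = repmat({'(1.0 2.0 3.0)'}, 1, n);
    s = ['4.28', repmat(' ', 1, k), strjoin(groups, ' ')];
    nSpaces = sum(s == ' ');                 % k + 2n + (n-1)
    nFound(n) = (nSpaces - (k - 1)) / 3;     % invert the count for n
end
```

With k = 4 this reduces to sum(s == ' ')/3 - 1, the expression used above.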
Cedric
on 21 Mar 2013
If we answered your question, please choose one answer as the accepted answer (and/or vote for the people who contributed).
Luca Amerio
on 22 Mar 2013
Accepted Answer
More Answers (1)
Just for the merge, without discussing how to improve the overall approach, you can do:
X = [data{2:3:3*N-1}] ;
Y = [data{3:3:3*N}] ;
Z = [data{4:3:3*N+1}] ;
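This works because data{2:3:3*N-1} expands to a comma-separated list of columns, which the square brackets concatenate horizontally. The sketch below illustrates it on a small stand-in for the textscan output; the mock data cell array, the values in it, and the sizes N = 3 and m = 2 are assumptions for illustration.

```matlab
N = 3;                                   % number of (X Y Z) groups per line
m = 2;                                   % number of lines (time steps)

% Mock of what textscan would return: a 1-by-(3N+1) cell array in which
% each cell holds one m-by-1 column.
raw  = reshape(1:m*(3*N+1), 3*N+1, m).'; % each row: [t x1 y1 z1 x2 y2 z2 ...]
data = num2cell(raw, 1);

Time = data{1};                          % m-by-1 time vector
X = [data{2:3:3*N-1}];                   % m-by-N, columns 2, 5, 8, ...
Y = [data{3:3:3*N}];                     % m-by-N, columns 3, 6, 9, ...
Z = [data{4:3:3*N+1}];                   % m-by-N, columns 4, 7, 10, ...
```

Each of X, Y and Z ends up with one row per time step and one column per group.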