Import txt file and pick the values after the selected key words using Regular Expression
Hello everybody,
I have a large, non-homogeneous text file. I need to find certain keywords in the text and then pick out the values that follow them. Here is one part of the text file:
sfhafjlhakjfhahfaoh(some text before)
LAW NUMBER
8907 0 1 0
-1876.98 11440 1
8 2 2 2 7 8
LAW TYPE:
152 0 1
7 8
163 154 155 156
Geomaterial_2 - solid PHASE
ELASTIC CONSTITUTIVE LAW FOR SOLID ELEMENTS
AT CONSTANT TEMPERATURE
USE OF EFFECTIVE STRESSES. ISOL = 1
NUMBER OF SUBINTERVALS.... NINTV= 1
YOUNG'S MODULUS .......... = 0.200000E+08
POISSON'S RATIO .......... = 0.300000
SPECIFIC MASS AS A MATERIAL LAW,
RHO ...................... = 2670.00
LAW NUMBER
288 0 1 0
-13.45 110 1
8 2 2 2 9 8
LAW TYPE:
171 0 1
5 6
173 174 175 179
Geomaterial_2 - liquid PHASE
WATER-AIR SEEPAGE- VAPOR -THERMAL COUPLED
CONSTITUTIVE LAW FOR SOLID ELEMENTS
ISOTROPIC CASE IANI = 0
FORMULATION INDEX FOR krw IKW = 0
FORMULATION INDEX FOR kra IKA = 0
I need the first value after each keyword and the values in the third line after the keywords 'LAW NUMBER' and 'LAW TYPE'. So in this case two vectors would be created, Lawnumber=[8907 288] and Lawtype=[152 171], along with two matrices built from the third lines: [8 2 2 2 7 8; 8 2 2 2 9 8] for LAW NUMBER and [163 154 155 156; 173 174 175 179] for LAW TYPE.
Mr. Oleg Komarov suggested that I use regexp. His code is simple and powerful; here is the link: http://www.mathworks.com/matlabcentral/answers/13585-find-the-key-word-in-the-text-file-then-pick-the-value-next-to-it
Here is the code:
% Import the whole file at once
fid = fopen('test.txt','r');
text = textscan(fid,'%s','Delimiter','','endofline','');
text = text{1}{1};
fid = fclose(fid);
% Parse with regexp
tk = regexp(text,'LAW NUMBER[\s\.=]+(\d+)|LAW TYPE[:\s]+(\d+)','tokens');
% tk = regexp(text,'LAW TYPE\s+(\d+ ){2}(?:[^\n]+\n){2}(\d+ )+','tokens'); Optional code
% textscan([tk{1}{:}],'%f') Optional code
% Convert to double
tk = reshape(str2double([tk{:}]),2,[])
It works very well for getting the first value after the keywords, but the optional code doesn't work well, and so far I have not managed to get the third line. Could someone improve it and help me out?
Thank you very much.
Gringoire
Accepted Answer
Oleg Komarov
on 13 Aug 2011
This time I propose a slightly different solution:
fid = fopen('test.txt','r');
text = textscan(fid,'%s','Delimiter','');
text = text{1};
fid = fclose(fid);
% Parse LAW NUMBER
idx = find(~cellfun('isempty',strfind(text,'LAW NUMBER'))) + 1;
LW = cellfun(@(x) textscan(x,'%f%*[^\n]'),text(idx),'un',0);
LW = cell2mat(cat(1,LW{:}));
LWm = cellfun(@(x) textscan(x,'%f'),text(idx+2),'un',0);
LWm = cell2mat([LWm{:}]).';
% Parse LAW TYPE
idx = find(~cellfun('isempty',strfind(text,'LAW TYPE'))) + 1;
LT = cellfun(@(x) textscan(x,'%f%*[^\n]'),text(idx),'un',0);
LT = cell2mat(cat(1,LT{:}));
LTm = cellfun(@(x) textscan(x,'%f'),text(idx+2),'un',0);
LTm = cell2mat([LTm{:}]).';
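The two blocks above repeat the same steps for each keyword, so they can be folded into a pair of small helpers. The following is only a sketch under stated assumptions: `txtLines`, `findKey` and `parseRows` are hypothetical names, `txtLines` stands in for the cell array of lines read from test.txt, and every selected line for a given keyword is assumed to hold the same number of values (otherwise `cell2mat` cannot stack the rows).

```matlab
% Hypothetical stand-in for the cell array of lines read from test.txt
txtLines = {'some text before'; ...
    'LAW NUMBER'; '8907 0 1 0'; '-1876.98 11440 1'; '8 2 2 2 7 8'; ...
    'LAW TYPE:';  '152 0 1';    '7 8';              '163 154 155 156'; ...
    'LAW NUMBER'; '288 0 1 0';  '-13.45 110 1';     '8 2 2 2 9 8'; ...
    'LAW TYPE:';  '171 0 1';    '5 6';              '173 174 175 179'};

% Indices of the lines that contain a given keyword
findKey = @(key) find(~cellfun('isempty', strfind(txtLines, key)));

% Parse each selected line into a numeric row and stack the rows
parseRows = @(idx) cell2mat(cellfun(@(x) sscanf(x, '%f').', ...
    txtLines(idx), 'UniformOutput', false));

% LAW NUMBER: first value of the next line, plus the whole third line
idxN = findKey('LAW NUMBER');
firstN = parseRows(idxN + 1);
Lawnumber = firstN(:, 1).';
LawnumberMatrix = parseRows(idxN + 3);

% LAW TYPE: same steps with a different keyword
idxT = findKey('LAW TYPE');
firstT = parseRows(idxT + 1);
Lawtype = firstT(:, 1).';
LawtypeMatrix = parseRows(idxT + 3);
```

Adding another keyword then only needs one more call to `findKey` and the matching line offsets.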
More Answers (1)
Fangjun Jiang
on 13 Aug 2011
For a file like this, I prefer using fgetl().
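NumbCount=0;
TypeCount=0;
LawNumber=[]; LawNumberMatrix=zeros(0,6);
LawType=[]; LawTypeMatrix=zeros(0,4);
fid=fopen('test.txt');
fline=fgetl(fid);
while ischar(fline) % fgetl returns -1 at the end of the file
    if ~isempty(strfind(fline,'LAW NUMBER'))
        NumbCount=NumbCount+1;
        fline=fgetl(fid); % first line after the keyword
        Temp=sscanf(fline,'%d');
        LawNumber(NumbCount)=Temp(1);
        fline=fgetl(fid); % skip the second line
        fline=fgetl(fid); % third line after the keyword
        Temp=sscanf(fline,'%d');
        LawNumberMatrix(NumbCount,1:6)=Temp(1:6);
    elseif ~isempty(strfind(fline,'LAW TYPE'))
        TypeCount=TypeCount+1;
        fline=fgetl(fid); % first line after the keyword
        Temp=sscanf(fline,'%d');
        LawType(TypeCount)=Temp(1);
        fline=fgetl(fid); % skip the second line
        fline=fgetl(fid); % third line after the keyword
        Temp=sscanf(fline,'%d');
        LawTypeMatrix(TypeCount,1:4)=Temp(1:4);
    end
    fline=fgetl(fid);
end
fclose(fid);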
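For comparison, the regexp route from the question can also reach the third line if the pattern consumes exactly two line breaks after the first value. This is only a sketch under stated assumptions: `txt` is a hypothetical stand-in for the file contents held as a single string (for example from `fileread('test.txt')`), `pat` and `toRows` are illustrative names, and the 'dotexceptnewline' option is used so that `.*` stops at the end of each line.

```matlab
% Hypothetical stand-in for the file contents as one string
txt = sprintf('%s\n', 'some text before', ...
    'LAW NUMBER', '8907 0 1 0', '-1876.98 11440 1', '8 2 2 2 7 8', ...
    'LAW TYPE:',  '152 0 1',    '7 8',              '163 154 155 156', ...
    'LAW NUMBER', '288 0 1 0',  '-13.45 110 1',     '8 2 2 2 9 8', ...
    'LAW TYPE:',  '171 0 1',    '5 6',              '173 174 175 179');

% Keyword, optional colon, first integer, rest of that line,
% one skipped line, then the whole third line as the second token
pat = @(key) [key ':?\s+(\d+).*\n.*\n(.*)'];

% Turn the captured third-line strings into a numeric matrix
toRows = @(c) cell2mat(cellfun(@(s) sscanf(s, '%f').', c, ...
    'UniformOutput', false));

tk = regexp(txt, pat('LAW NUMBER'), 'tokens', 'dotexceptnewline');
tk = vertcat(tk{:});                 % N-by-2 cell of strings
Lawnumber = str2double(tk(:, 1)).';
LawnumberMatrix = toRows(tk(:, 2));

tk = regexp(txt, pat('LAW TYPE'), 'tokens', 'dotexceptnewline');
tk = vertcat(tk{:});
Lawtype = str2double(tk(:, 1)).';
LawtypeMatrix = toRows(tk(:, 2));
```

As with the line-based versions, this assumes the third line holds the same number of values at every occurrence of a keyword.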