Extracting parts of a string

I have a text file with information like this:
FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime
C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right; 0.00; -10000; 40147.491374
I need to extract the SampleFreq (22000) and the Position (-10000). I tried to use regular expressions, but I cannot find a specific delimiter for these data.

 Accepted Answer

Paolo on 15 Jun 2018
Edited: Paolo on 15 Jun 2018
The following code uses regexp to extract the data you want.
data = fileread('00000090Head.txt');
expression = '(?<=WAV;\s*)(\d*)(?:;\s*\d*;\d;\d;(.*?(?=;));\s*\d*\.\d*;\s*)(-?\d*)';
[tokens,match] = regexp(data,expression,'tokens','match');
sampleFrequency = cellfun(@(x) x(1,1),tokens);
position = cellfun(@(x) x(1,2),tokens);
position and sampleFrequency are both 1x183 cell arrays of character vectors and contain the data you are interested in.
position = {'-10000' '-9000' '-8000' '-7000' '-6000' '-5000' '-4500' '-4000' '-3500' '-3000' ................}
sampleFrequency = {'22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' .................}
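The tokens returned by regexp are character vectors, so they usually need converting before any arithmetic. A minimal sketch, assuming the cell arrays look like the truncated output above (only the first three values are reproduced here, and the names positionNum and sampleFrequencyNum are made up for this illustration):

```matlab
% Cell arrays of character vectors, shaped like the regexp output above.
% Only the first three entries from the thread are used here.
position        = {'-10000' '-9000' '-8000'};
sampleFrequency = {'22000' '22000' '22000'};

% str2double accepts a cell array of character vectors and returns a
% numeric array of the same size (NaN where the text is not a number).
positionNum        = str2double(position);
sampleFrequencyNum = str2double(sampleFrequency);
```

Both results are ordinary 1x3 double arrays and can be used directly in calculations or plots.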

More Answers (3)

per isakson on 15 Jun 2018
Edited: per isakson on 15 Jun 2018
Is this what you are looking for?
fid = fopen( '00000090Head.txt', 'r' );
cac = textscan( fid, '%*s%f%*f%*f%*f%*s%*f%f%*f', 'Headerlines',1,'Delimiter',';' );
fclose( fid );
and inspect the result
>> cac
cac =
1×2 cell array
{183×1 double} {183×1 double}
>> cac{2}(1:3)
ans =
-10000
-9000
-8000
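To see how the format string lines up with the columns, it can help to write it out one specifier per column. The sketch below is only an illustration: it runs textscan on an in-memory copy of the header and first data row from the question instead of the real file, and the variable names txt, sampleFreq and position are assumptions made for the example.

```matlab
% Header and first data row copied from the question.
header = 'FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime';
row    = 'C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right; 0.00; -10000; 40147.491374';
txt    = [header char(10) row];   % in-memory stand-in for the file (assumption)

% One conversion specifier per column; the * means "read but discard".
fmt = ['%*s' ... % 1 FileName       (skipped, text)
       '%f'  ... % 2 SampleFreq     (kept)
       '%*f' ... % 3 Test           (skipped)
       '%*f' ... % 4 Modality       (skipped)
       '%*f' ... % 5 Channel        (skipped)
       '%*s' ... % 6 Description    (skipped, text)
       '%*f' ... % 7 StimIntensity  (skipped)
       '%f'  ... % 8 Position       (kept)
       '%*f'];   % 9 RecordingTime  (skipped)

% textscan also accepts a character vector in place of a file identifier.
cac = textscan(txt, fmt, 'HeaderLines', 1, 'Delimiter', ';');

sampleFreq = cac{1};   % values from column 2
position   = cac{2};   % values from column 8
```

Only the two specifiers without * produce output, so cac has two cells regardless of how many columns are skipped.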
Ana Maria Alzate on 15 Jun 2018
Yes, but it is not giving me the position; it is giving me the last column, the recording time.

5 Comments

Did I make a mistake when counting columns? This format string works here with the sample file. (Do all files have the same format?)
'%*s%f%*f%*f%*f%*s%*f%f%*f'
However, remove one %*f and try
'%*s%f%*f%*f%*f%*s%f%*f'
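One way to double-check the column count is to split the header line and compare it with the number of specifiers in each format string. This is just a quick sanity check, assuming the header shown in the question is representative of the files; the variable names are made up for the example.

```matlab
% Header line copied from the question.
header = 'FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime';

% Split on the delimiter and trim the inconsistent spacing.
names = strtrim(strsplit(header, ';'));
nCols = numel(names);                          % number of columns in the file

% Positions of the two columns of interest.
idxFreq = find(strcmp(names, 'SampleFreq'));
idxPos  = find(strcmp(names, 'Position'));

% Count the conversion specifiers in the two candidate format strings.
fmtA = '%*s%f%*f%*f%*f%*s%*f%f%*f';
fmtB = '%*s%f%*f%*f%*f%*s%f%*f';
nA = numel(strfind(fmtA, '%'));
nB = numel(strfind(fmtB, '%'));

fprintf('columns: %d, fmtA fields: %d, fmtB fields: %d\n', nCols, nA, nB);
fprintf('SampleFreq is column %d, Position is column %d\n', idxFreq, idxPos);
```

With this header the first format string has one specifier per column, while the shorter one has one fewer, so its second %f lands on StimIntensity rather than Position.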
I think the problem is that the columns kind of shift as you go down in the text.
per isakson on 15 Jun 2018
Edited: per isakson on 15 Jun 2018
Shifts like this one should not be a problem. (This is the only "shift" I find in the sample file.)
  • Do you have problems reading the uploaded sample file?
  • Why not upload a file that causes problems?
  • Do you get any error or warning messages?
@Ana Maria Alzate: Please do not post comments in the section for answers in the future. There is a section for comments for this job. Thanks.
Thank you for the advice!

Stephen23 on 18 Jun 2018
Edited: Stephen23 on 18 Jun 2018
Importing the data as strings and then using regular expressions to parse them is inefficient and unnecessary, because that file is very nicely formatted in delimited columns and the required data can easily and efficiently be read directly as numeric (or char). The command textscan makes it easy to specify how to read those columns, and the format string is much simpler and more intuitive than those regular expressions:
>> fmt = '%*s%f%*d%*d%*d%*s%*f%f%*f';
>> opt = {'HeaderLines',1,'Delimiter',';'};
>> [fid,msg] = fopen('00000090Head.txt','rt');
>> assert(fid>=3,msg)
>> C = textscan(fid,fmt,opt{:});
>> fclose(fid);
>> [C{:}]
ans =
22000 -10000
22000 -9000
22000 -8000
22000 -7000
22000 -6000
22000 -5000
22000 -4500
22000 -4000
22000 -3500
22000 -3000
22000 -3000
22000 -2500
22000 -2000
22000 -1500
22000 -1000
22000 -500
22000 0
22000 500
22000 1000
22000 1500
... lots of lines here
22000 -3000
22000 -2500
22000 -2000
22000 -1500
22000 -1000
22000 -500
22000 0
22000 500
22000 1000
22000 1500
22000 2000
22000 2500
22000 3000
22000 3500
22000 4000
22000 4000
22000 5000
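For anyone who wants to try this without the original file, here is a self-contained sketch of the same steps. It is only an illustration: it writes the header and first data row from the question to a temporary file (tmpFile is a made-up name), reads it back with the same format string and options, and splits the result into named vectors.

```matlab
% Header and first data row copied from the question.
header = 'FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime';
row    = 'C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right; 0.00; -10000; 40147.491374';

% Write a small stand-in file (assumption: the real file has this layout).
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'wt');
assert(fid >= 3, 'Could not create the temporary file.')
fprintf(fid, '%s\n', header, row);   % the format is reused for each argument
fclose(fid);

% Same format string and options as in the answer above.
fmt = '%*s%f%*d%*d%*d%*s%*f%f%*f';
opt = {'HeaderLines',1,'Delimiter',';'};
[fid, msg] = fopen(tmpFile, 'rt');
assert(fid >= 3, msg)
C = textscan(fid, fmt, opt{:});
fclose(fid);
delete(tmpFile)                      % remove the stand-in file

% Both cells are numeric columns, so they concatenate into one matrix.
M          = [C{:}];
sampleFreq = M(:,1);
position   = M(:,2);
```

With the real file, M would have one row per data line, and the two columns can be indexed exactly as shown.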

Asked: 15 Jun 2018

Edited: 18 Jun 2018
