Extract numbers from lines in a txt file. Onset times from experimental data.

Hi,
This problem should be familiar to people working with behavioral data, but I was unable to find a previous answer that would work for me.
I am trying to extract the onset times of certain conditions during an experiment.
The data is organised as in the enclosed file. I want to extract the lines starting with:
'WaitForScanner.OnsetTime'
'Tune.OnsetTime'
and
'ImagineMelody.OnsetTime'
and then extract the numbers at the end of these lines (5-digit onset times in ms) as variables,
work with them a bit in MATLAB: subtract the scanner onset from the tune and imagine onsets, and convert them to seconds,
and output them in two separate txt files (tuneonset.txt and imagine.txt).
When that works for one subject, I plan to loop over many subjects and also different sessions per subject (different txt file, same principle).
Best,
Andreas
  3 Comments
Andreas Voldstad on 23 Feb 2020
Edited: dpb on 23 Feb 2020
As a start, I tried this, but I'm not getting any matches with the strncmp command:
filename = 'Imagine_simple-1-1.txt';
Scanneronset = 'WaitForScanner.OnsetTime';
Tuneonset = 'Tune.OnsetTime';
Imagineonset = 'ImagineMelody.OnsetTime';
imaginefile = 'ImagineMelody.OnsetTime.txt';
tunefile = 'Tune.OnsetTime.txt';
Str = fileread(filename);
CStr = strsplit(Str, '\n');
Scanner = strncmp(CStr, Scanneronset, length(Scanneronset));
CStrScanner = CStr(Scanner);
Imagine = strncmp(CStr, Imagineonset, length(Imagineonset)); % logical mask of the matching lines
CStrImagine = CStr(Imagine);
Tune = strncmp(CStr,Tuneonset, length(Tuneonset));
CStrTune = CStr(Tune);
Andreas Voldstad on 23 Feb 2020
Edited: Andreas Voldstad on 23 Feb 2020
Yes, WaitForScanner.OnsetTime is a session-global value, whereas the other two variables of interest repeat 20 times. I only need the WaitForScanner.OnsetTime in order to make the other onsets relative to this one.
It is always at the end before *** LogFrame End ***.


Accepted Answer

dpb on 23 Feb 2020
Edited: dpb on 24 Feb 2020
Carrying on from my other answer, which just diddled with the file format to be able to get at the data, the following seems to work:
filename = 'Imagine_simple-1-1.txt';
onsets={'Tune.OnsetTime';'ImagineMelody.OnsetTime';'WaitForScanner.OnsetTime'};
imaginefile = 'ImagineMelody.OnsetTime.txt';
tunefile = 'Tune.OnsetTime.txt';
txt=importdata(filename); % read data file to cellstr array
txt=strrep(txt,char(0),''); % strip the null bytes left over from the 2-byte encoding
txt=strrep(txt,char(255:-1:254),''); % finally, clean up BOM mess...
fnGrabNum=@(t) (str2double(extractAfter(txt(contains(txt,t)),':'))/1000); % return numeric data
times=cellfun(fnGrabNum,onsets,'UniformOutput',0); % get the wanted sections
elapsedtimes=seconds(cell2mat(cellfun(@(t) t-times{3},times(1:2),'UniformOutput',0).')); % times from reference time
Results in:
>> elapsedtimes
elapsedtimes =
20×2 duration array
3.233 sec 16.232 sec
55.402 sec 42.23 sec
81.644 sec 94.642 sec
133.88 sec 120.64 sec
160.07 sec 173.07 sec
212.27 sec 199.07 sec
238.45 sec 251.46 sec
290.68 sec 277.46 sec
316.91 sec 329.91 sec
369.07 sec 355.9 sec
395.25 sec 408.25 sec
447.46 sec 434.25 sec
473.69 sec 486.69 sec
525.86 sec 512.69 sec
552.03 sec 565.04 sec
604.22 sec 591.03 sec
630.4 sec 643.41 sec
682.6 sec 669.41 sec
708.79 sec 721.79 sec
761.16 sec 747.79 sec
>>
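The question also asked for the two sets of onsets in separate text files, which the code above does not do yet (imaginefile and tunefile are defined but never used). A minimal sketch of that last step follows, assuming one value per line in seconds is an acceptable format. The three-row elapsedtimes here is only a stand-in built from the first rows of the output above so the snippet runs on its own.

```matlab
% Stand-in for the 20x2 duration array computed above
% (column 1 = Tune, column 2 = ImagineMelody); values copied from the display.
elapsedtimes = seconds([3.233 16.232; 55.402 42.23; 81.644 94.642]);

tunefile    = 'Tune.OnsetTime.txt';
imaginefile = 'ImagineMelody.OnsetTime.txt';
outfiles    = {tunefile, imaginefile};   % same column order as elapsedtimes

for k = 1:numel(outfiles)
    fid = fopen(outfiles{k}, 'w');
    assert(fid ~= -1, 'Could not open %s for writing', outfiles{k});
    % seconds() on a duration returns plain doubles; write one value per line
    fprintf(fid, '%.3f\n', seconds(elapsedtimes(:, k)));
    fclose(fid);
end
```

The output file names are taken from the variables already in the answer; swap in tuneonset.txt and imagine.txt if those are the names you need.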
  2 Comments
Andreas Voldstad on 24 Feb 2020
Thank you, that's really helpful!
I'd like to understand the code better so I can use it more flexibly.
I checked the help for the functions I did not know, but would you mind unpacking what the @(t) does in the lines below, as well as the 'UniformOutput'?
fnGrabNum=@(t) (str2double(extractAfter(txt(contains(txt,t)),':'))/1000); % return numeric data
times=cellfun(fnGrabNum,onsets,'UniformOutput',0); % get the wanted sections
elapsedtimes=seconds(cell2mat(cellfun(@(t) t-times{3},times(1:2),'UniformOutput',0).')); % times from reference time
dpb on 24 Feb 2020
Edited: dpb on 25 Feb 2020
"@()" is MATLAB syntax for an anonymous function...it defines a one-line function that can be called using the variable name as the function name/handle with the given argument(s). Here it just shortens the cellfun line by separating out the operations to be done for each lookup string into a function w/o having to write another m-file or have a full function header for a one-liner.
NB: Any variable used in an anonymous function that is not in its argument list (here, txt) takes the value it has in the enclosing workspace AT THE TIME THE ANONYMOUS FUNCTION IS CREATED; later changes to that variable do not affect the function.
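A small illustration of that capture behavior, using made-up variable names (offset, addOffset) that are not part of the code above:

```matlab
offset = 10;
addOffset = @(x) x + offset;   % offset is captured here, with the value 10
offset = 100;                  % changing the workspace variable afterwards ...
result = addOffset(1);         % ... does not affect the handle: result is 11
```

The practical consequence for looping over subjects is that fnGrabNum has to be re-created after each new file is read into txt; otherwise it keeps searching the text of the file that was loaded when it was defined.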
The 'UniformOutput' named parameter to cellfun is either true (1) or false (0), depending upon whether the function (here the anonymous function above) returns a single scalar value for each cell or possibly something else. This is a case where a call can return more than one value (20 onset times for two of the lookups), so it is false, which tells cellfun to package the results in a cell array. Otherwise, it could return an ordinary array without needing the cell array to hold the outputs.
I subsequently hid that from you by wrapping the second cellfun call in cell2mat and then casting the resulting double array to a duration array with seconds. Thus the code ends up with a 2D array holding the two lookups by column rather than leaving them in a cell array.
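To see the difference in isolation, here is a toy example on a made-up cell array c (not the experiment data):

```matlab
c = {[1 2 3], [4 5], 6};

% Each call returns a scalar, so the default 'UniformOutput',true works
% and the result is an ordinary 1x3 double array: [3 2 1]
n = cellfun(@numel, c);

% Each call returns an array of a different size, so the results can only
% be collected in a cell array; 'UniformOutput',false requests that
d = cellfun(@(v) v*2, c, 'UniformOutput', false);

% cellfun(@(v) v*2, c) without the option would raise an error here,
% because v*2 is not a scalar for every cell
```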
ADDENDUM:
"All" cellfun does is save writing an explicit for...end loop to process the list of lookup codes/strings; and casting the code in this manner lets you add as many others to the mix as you wish simply by putting them into the onsets array. It simply passes the (dereferenced) cell content of each element of the input cell array to the function. So, each call is the same as would be
times{i}=fnGrabNum(onsets{i});
inside a loop over i=1:numel(onsets)
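Spelled out in full, the loop version would look something like the sketch below. The four-line txt is a made-up stand-in for the cleaned-up file contents (the numbers are invented), there only so the snippet runs on its own.

```matlab
% Hypothetical, already cleaned-up file contents; values are invented
txt = {'Tune.OnsetTime: 25000'
       'ImagineMelody.OnsetTime: 38000'
       'Tune.OnsetTime: 77000'
       'WaitForScanner.OnsetTime: 21767'};

onsets = {'Tune.OnsetTime';'ImagineMelody.OnsetTime';'WaitForScanner.OnsetTime'};
fnGrabNum = @(t) str2double(extractAfter(txt(contains(txt,t)),':'))/1000;

% Explicit loop doing what cellfun(fnGrabNum,onsets,'UniformOutput',0) does
times = cell(size(onsets));
for i = 1:numel(onsets)
    times{i} = fnGrabNum(onsets{i});   % numeric vector of matches, in seconds
end
```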


More Answers (1)

dpb on 23 Feb 2020
Edited: dpb on 23 Feb 2020
The problem is that the file contains double-byte characters and MATLAB isn't interpreting them as such ...
>> txt=importdata('Imagine_simple-1-1.txt');
>> txt(1:10)
ans =
10×1 cell array
{'ÿþ* * * H e a d e r S t a r t * * * '}
{' ' }
{' V e r s i o n P e r s i s t : 1 ' }
{' ' }
{' L e v e l N a m e : S e s s i o n ' }
{' ' }
{' L e v e l N a m e : B l o c k ' }
{' ' }
{' L e v e l N a m e : T r i a l ' }
{' ' }
This shows that every character is stored as two bytes instead of one, but the file was read as single-byte text, so each real character is followed by an extra char(0). The string comparison functions see those extra characters; hence, all those comparisons fail.
If you can't force the application that creates the file to write only ASCII, it's not too hard to clean it up empirically: every other (even-numbered) row is simply an extra null character, and the extra byte of each character pair is also char(0), except for the two leading non-ASCII characters, which are FFh and FEh, respectively (the UTF-16 little-endian byte-order mark).
>> uint8(txt{1}(1:6))
ans =
1×6 uint8 row vector
255 254 42 0 42 0
>>
My attempts with various encodings inside MATLAB weren't successful in converting to single-byte, though I don't claim to understand encodings all that well. It's not that difficult to just fix up the content after reading it, however:
>> txt=txt(1:2:end);
>> txt(1:10)
ans =
10×1 cell array
{'ÿþ* * * H e a d e r S t a r t * * * '}
{' V e r s i o n P e r s i s t : 1 ' }
{' L e v e l N a m e : S e s s i o n ' }
{' L e v e l N a m e : B l o c k ' }
{' L e v e l N a m e : T r i a l ' }
{' L e v e l N a m e : S u b T r i a l ' }
{' L e v e l N a m e : L o g L e v e l 5 ' }
{' L e v e l N a m e : L o g L e v e l 6 ' }
{' L e v e l N a m e : L o g L e v e l 7 ' }
{' L e v e l N a m e : L o g L e v e l 8 ' }
>> txt=strrep(txt,char(0),'');
>> txt(1:10)
ans =
10×1 cell array
{'ÿþ*** Header Start ***'}
{'VersionPersist: 1' }
{'LevelName: Session' }
{'LevelName: Block' }
{'LevelName: Trial' }
{'LevelName: SubTrial' }
{'LevelName: LogLevel5' }
{'LevelName: LogLevel6' }
{'LevelName: LogLevel7' }
{'LevelName: LogLevel8' }
>>
The last version will then let you do your string searches and should work fine...you can also delete the first two characters of the first row if desired, but that won't affect what you're trying to do so I didn't bother...
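An alternative to the empirical cleanup is to read the raw bytes and decode them explicitly. This is only a sketch, and it assumes the file really is UTF-16 little-endian with a byte-order mark, which is what the FF FE lead-in and the interleaved zero bytes suggest. The tiny test file and its name (utf16_sample.txt) are made up so the snippet runs on its own.

```matlab
% Build a tiny UTF-16LE file with a BOM as a stand-in for the real log
sample = sprintf('*** Header Start ***\r\nTune.OnsetTime: 25000\r\n');
bytes  = [uint8([255 254]), unicode2native(sample, 'UTF-16LE')];
fname  = fullfile(tempdir, 'utf16_sample.txt');
fid = fopen(fname, 'w'); fwrite(fid, bytes, 'uint8'); fclose(fid);

% Read the raw bytes back and decode them explicitly
fid = fopen(fname, 'r');
raw = fread(fid, inf, 'uint8=>uint8').';
fclose(fid);
if numel(raw) >= 2 && isequal(raw(1:2), uint8([255 254]))
    raw = raw(3:end);                            % drop the FF FE byte-order mark
end
str = native2unicode(raw, 'UTF-16LE');           % bytes -> ordinary char vector
txt = strtrim(strsplit(str, sprintf('\n')).');   % one line per cell, CRs trimmed
txt = txt(~cellfun(@isempty, txt));              % remove empty lines
```

After this, txt is a plain cellstr with no null characters or BOM left in it, so contains, strncmp and friends behave as expected.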
