extract data from EEG text file
I need help writing a script that extracts the MCAP samples, together with the time at which each occurred, into a separate file and plots them, so that I can use these samples in a signal processing application in MATLAB. This is only part of the data; the file contains tens of CAP samples, so I need general code to extract them.
      Time       Date     Sample #  Type  Sub Chan  Num	Aux
[22:16:05.000 01/01/2007]        0     "    0    0    0	## time resolution: 256
[22:16:05.000 01/01/2007]        0          0    0    0
[22:34:35.000 01/01/2007]   284160     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:35:05.000 01/01/2007]   291840     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:35:35.000 01/01/2007]   299520     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:36:05.000 01/01/2007]   307200     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:36:35.000 01/01/2007]   314880     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:37:05.000 01/01/2007]   322560     "    0    0    0	SLEEP-S1 30 S1 ROC-LOC
[22:37:35.000 01/01/2007]   330240     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:38:05.000 01/01/2007]   337920     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:38:35.000 01/01/2007]   345600     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:39:05.000 01/01/2007]   353280     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:39:35.000 01/01/2007]   360960     "    0    0    0	SLEEP-S1 30 S1 ROC-LOC
[22:40:05.000 01/01/2007]   368640     "    0    0    0	SLEEP-S1 30 S1 ROC-LOC
[22:40:35.000 01/01/2007]   376320     "    0    0    0	SLEEP-S0 30 W ROC-LOC
[22:41:05.000 01/01/2007]   384000     "    0    0    0	SLEEP-S1 30 S1 ROC-LOC
[22:41:35.000 01/01/2007]   391680     "    0    0    0	SLEEP-S1 30 S1 ROC-LOC
[22:41:37.000 01/01/2007]   392192     "    0    0    0	MCAP-A3 17 S1 EEG-F4-C4
[22:41:57.000 01/01/2007]   397312     "    0    0    0	MCAP-A3 9 S1 EEG-F4-C4
[22:42:05.000 01/01/2007]   399360     "    0    0    0	SLEEP-S2 30 S2 ROC-LOC
[22:42:13.000 01/01/2007]   401408     "    0    0    0	MCAP-A3 11 S2 EEG-F4-C4
[22:42:28.000 01/01/2007]   405248     "    0    0    0	MCAP-A3 23 S2 EEG-F4-C4
[22:42:35.000 01/01/2007]   407040     "    0    0    0	SLEEP-S2 30 S2 ROC-LOC
[22:42:57.000 01/01/2007]   412672     "    0    0    0	MCAP-A3 10 S2 EEG-F4-C4
[22:43:05.000 01/01/2007]   414720     "    0    0    0	SLEEP-S2 30 S2 ROC-LOC
[22:43:11.000 01/01/2007]   416256     "    0    0    0	MCAP-A2 6 S2 EEG-F4-C
1 Comment
per isakson on 28 Apr 2019
See readtable (create table from file) and fixedWidthImportOptions (import options object for fixed-width text files, introduced in R2017a).
Accepted Answer
Cedric on 27 Apr 2019
Edited: Cedric on 28 Apr 2019
      Using the data text file that you provided elsewhere (renamed and attached to this answer), here is a short example of one way to parse it. Note that it is not the best way, but it is good enough for starting the discussion:
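% Read the whole file into one character vector.
buffer = fileread( 'data01.txt' ) ;
% Tokens: time/date, sample #, sub, chan, num, then the four parts of the MCAP aux field.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
data = regexp( buffer, pattern, 'tokens' ) ;
% Stack the per-match token cells into one N-by-9 cell array.
data = vertcat( data{:} ) ;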
EDIT 04/28/2019@1:59pm UTC: I updated the pattern so REGEXP extracts all other "numeric" columns.
Running it outputs a cell array of 830 rows associated with MCAP entries, as follows:
>> data
data =
  830×9 cell array
    {'22:41:37.000 01…'}    {'392192' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'17'}    {'S1'}    {'EEG-F4-C4' }
    {'22:41:57.000 01…'}    {'397312' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'9' }    {'S1'}    {'EEG-F4-C4' }
    {'22:42:13.000 01…'}    {'401408' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'11'}    {'S2'}    {'EEG-F4-C4' }
    {'22:42:28.000 01…'}    {'405248' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'23'}    {'S2'}    {'EEG-F4-C4' }
    ...
    {'07:08:22.000 02…'}    {'8175872'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-F4-C4' }
    {'07:11:27.000 02…'}    {'8223232'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:12:08.000 02…'}    {'8233728'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:18:31.000 02…'}    {'8331776'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-F4-C4' }
    {'07:18:53.000 02…'}    {'8337408'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'7' }    {'S4'}    {'EEG-F4-C4' }
    {'07:19:27.000 02…'}    {'8346112'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'15'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:29.000 02…'}    {'8361984'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'11'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:48.000 02…'}    {'8366848'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'12'}    {'S4'}    {'EEG-F4-C4' }
Now, depending on what you want to accomplish, you may prefer using a TIMETABLE or a TIMESERIES object, or just some conversion of these columns.
So now you should define which part of the data you are interested in, and how you are planning to process it.
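As a minimal sketch of one such conversion (not the only way to do it), the cell array can be turned into a typed table, then a timetable, and written to a separate file. The three rows below are copied from the sample in the question so that the snippet runs on its own. The day/month order of the date, the reading of the first number after MCAP- as a duration in seconds, and the output file name `mcap_events.csv` are assumptions, not facts taken from the file.

```matlab
% Rows shaped like the REGEXP output above; values copied from the question's sample lines.
data = { '22:41:37.000 01/01/2007', '392192', '0', '0', '0', 'A3', '17', 'S1', 'EEG-F4-C4' ; ...
         '22:41:57.000 01/01/2007', '397312', '0', '0', '0', 'A3', '9',  'S1', 'EEG-F4-C4' ; ...
         '22:42:13.000 01/01/2007', '401408', '0', '0', '0', 'A3', '11', 'S2', 'EEG-F4-C4' } ;

% ASSUMPTION: dates are day/month/year; use 'MM/dd/yyyy' if the file is month first.
Time     = datetime( data(:,1), 'InputFormat', 'HH:mm:ss.SSS dd/MM/yyyy' ) ;
Sample   = str2double( data(:,2) ) ;   % sample # column, as numbers
Subtype  = categorical( data(:,6) ) ;  % A1, A2, A3
Duration = str2double( data(:,7) ) ;   % ASSUMPTION: event duration in seconds
Stage    = categorical( data(:,8) ) ;  % sleep stage label, e.g. S1, S2
Channel  = categorical( data(:,9) ) ;  % e.g. EEG-F4-C4

% Variable names of the table are taken from the workspace variable names.
T  = table( Time, Sample, Subtype, Duration, Stage, Channel ) ;

% The first datetime variable (Time) becomes the row times of the timetable.
TT = table2timetable( T ) ;

% Hypothetical output file name; one row per MCAP event.
outFile = 'mcap_events.csv' ;
writetable( T, outFile ) ;
```

From there, something like `stem( T.Time, T.Duration )` would give a first plot of the events over time, and `T( T.Subtype == 'A3', : )` would select one subtype.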
Let me know if you have any questions.
21 Comments
Cedric on 2 May 2019
Edited: Cedric on 2 May 2019
			No problem!
Next issue though: rdmat outputs arrays that suggest that there are 1e6 samples:
>> [tm,signal,Fs,siginfo]=rdmat('sdb4_edfm');
>> whos
  Name               Size                     Bytes  Class     Attributes
  Fs                 1x1                          8  double              
  siginfo            1x18                     11040  struct              
  signal       1000000x18                 144000000  double              
  tm                 1x1000000              8000000  double     
Here you see tm, the vector of times I suppose, that has 1 million elements and the array of signals has 1 million rows (I guess each corresponding to a sample).
Now, after converting the sample # from your annotation file to numeric:
buffer = fileread( 'annotations sdb4.txt' ) ;
% Same pattern as before: one token per column, MCAP rows only.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
annotations = regexp( buffer, pattern, 'tokens' ) ;
annotations = vertcat( annotations{:} ) ;
% Column 2 holds the sample # as text; convert it to double.
sampleId = str2double( annotations(:,2) ) ;
I see sample numbers (IDs) up to 8,366,848, which is way above 1 million. So most of the sample IDs correspond to regions that are outside of the plot (?)
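A rough way to check this, as a sketch under stated assumptions: the annotation header says "time resolution: 256", and the sample numbers are consistent with that (392192/256 = 1532 s after the 22:16:05 start gives 22:41:37, which matches the time stamp on that line). The snippet below converts a few sample IDs from this thread to seconds and to row indices, and flags those that fall beyond the 1,000,000 rows reported by whos. The value of `Fs` here is only a placeholder and should be replaced by the one that rdmat returns; the `+ 1` assumes that sample # 0 corresponds to the first row of `signal`.

```matlab
timeResolution = 256 ;                           % from the "## time resolution: 256" header line
startTime      = datetime( 2007, 1, 1, 22, 16, 5 ) ;  % time stamp of sample # 0 in the annotation file
nLoaded        = 1000000 ;                       % number of rows of signal reported by whos
Fs             = 256 ;                           % PLACEHOLDER: use the Fs returned by rdmat

% A few sample IDs quoted in this thread (first two MCAP rows and the last one).
sampleId = [392192 ; 397312 ; 8366848] ;

% Offset of each annotation from the start of the recording, in seconds.
offsetSec = sampleId / timeResolution ;

% Clock time of each annotation, to cross-check against the time stamps in the file.
eventTime = startTime + seconds( offsetSec ) ;

% ASSUMPTION: sample # 0 is row 1 of signal, hence the + 1.
row = round( offsetSec * Fs ) + 1 ;

% Annotations that can actually be looked up in the loaded signal.
inRange = row <= nLoaded ;
fprintf( '%d of %d annotations fall inside the loaded signal.\n', nnz( inRange ), numel( inRange ) ) ;
```

With the real `sampleId` vector, `annotations( inRange, : )` would keep only the events that are covered by the loaded data.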

