Split Data in Character Array into Matrix

I currently have a character array (~50000 x 1) with data such as "DD-MM-YY 1000 NaN NaN NaN 0.200 0.300", all in one cell, and I want to split the characters into a matrix with one cell for each of the columns (i.e. col1 = "DD-MM-YY", col2 = 1000, etc.). However, the data in some rows are not aligned with the other rows, because a value can be longer than "NaN" (e.g. "DD-MM-YY 1000 0.111 NaN 0.2002 0.200 0.300 "), so I cannot split the rows by character position. Each row is padded with trailing spaces to a fixed length of 80 to account for this shifting. Any ideas on how I could split the data, or even how to align all the data columns?
Thanks so much!
  2 Comments
Ameer Hamza on 24 May 2020
Can you attach a sample dataset?
Vanessa Yau on 25 May 2020
Yes, the text file looks like this:
sample=
' ''May-24-2020 00:00:00" 100.000 NaN NaN NaN 0.3030 '
' ''May-24-2020 00:00:20" 100.0233 NaN 1.4 NaN NaN '
' ''May-24-2020 00:00:40" 100.33155 NaN NaN NaN 0.402 '
' ''May-24-2020 00:01:00" 100.507 NaN NaN NaN 0.7433 '
' ''May-24-2020 00:01:20" 100.900001 NaN NaN NaN 0.224 '


Accepted Answer

Walter Roberson on 25 May 2020
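% Capture the quoted timestamp and the five whitespace-separated numeric fields of each row.
% The character class [''"] accepts either quote character, since the posted sample mixes them.
temp = regexp( cellstr(sample), '[''"]+(?<date>[^''"]+)[''"]+\s+(?<col2>\S+)\s+(?<col3>\S+)\s+(?<col4>\S+)\s+(?<col5>\S+)\s+(?<col6>\S+)', 'names', 'once');
parts = vertcat(temp{:});   % struct array, one element per row
% 'InputFormat' tells datetime how to parse the text; 'Format' alone only sets the display format.
dates = datetime({parts.date}, 'InputFormat', 'MMM-dd-yyyy HH:mm:ss');
col2s = str2double({parts.col2});   % the text 'NaN' converts to numeric NaN
col3s = str2double({parts.col3});
col4s = str2double({parts.col4});
col5s = str2double({parts.col5});
col6s = str2double({parts.col6});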
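
If named tokens feel heavy, the same split could probably also be done by tokenizing each row on whitespace. The following is only a sketch under assumptions: the two sample rows are copied from the question, every row is assumed to hold a date, a time, and exactly five values, and the variable names (rows, tokens, values) are made up for illustration.

```matlab
% Two rows copied from the posted sample; char() pads them to equal length.
sample = char( ...
    ' ''May-24-2020 00:00:00" 100.000 NaN NaN NaN 0.3030 ', ...
    ' ''May-24-2020 00:00:20" 100.0233 NaN 1.4 NaN NaN ');

rows   = regexprep(cellstr(sample), '[''"]', '');   % drop the stray quote characters
rows   = strtrim(rows);                             % remove leading/trailing padding
tokens = regexp(rows, '\s+', 'split');              % split each row on runs of whitespace
tokens = vertcat(tokens{:});                        % N-by-7 cell: date, time, 5 values
                                                    % (errors if a row has a different token count)

% Rejoin the date and time tokens, then parse them in one call.
dates  = datetime(strcat(tokens(:,1), {' '}, tokens(:,2)), ...
                  'InputFormat', 'MMM-dd-yyyy HH:mm:ss');
values = str2double(tokens(:,3:end));               % N-by-5 numeric matrix, 'NaN' text becomes NaN
```

Because the split is on runs of whitespace, the misaligned columns and the trailing padding should not matter. The vertcat step is also a cheap sanity check, since it fails as soon as any row has more or fewer than seven tokens.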
