Importing data from a text file (with ".txt" extension), data processing and export it to excel

There is a text file "example.txt" (file attached here) containing the following data:
Date & Time Latitude Longitude Depth Mag Region name
2022-02-13 23:35:11.2 41.15 N 43.97 E 2 3.0 ARMENIA
2022-02-13 20:56:02.4 41.28 N 43.96 E 2 2.0 GEORGIA (SAK'ART'VELO)
2022-02-13 20:26:08.5 41.15 N 44.02 E 2 2.1 ARMENIA
2022-02-13 19:11:45.8 38.66 N 44.84 E 10 2.4 TURKEY-IRAN BORDER REGION
2022-02-13 18:50:38.7 41.31 N 43.99 E 2 2.1 GEORGIA (SAK'ART'VELO)
30
IV
2022-02-13 18:34:59.5 41.23 N 44.12 E 2 3.1 GEORGIA (SAK'ART'VELO)
29
V
2022-02-13 18:28:46.0 41.17 N 43.97 E 2 4.2 GEORGIA (SAK'ART'VELO)
303 6
V
2022-02-13 18:25:56.2 41.14 N 43.99 E 10 5.4 ARMENIA
2022-02-13 12:09:38.9 38.90 N 43.51 E 1 3.0 EASTERN TURKEY
2022-02-13 05:31:04.1 42.58 N 45.39 E 5 2.8 CAUCASUS REGION, RUSSIA
I need to import this file into matlab, Delete all columns consisting of only one character - "N" or "E", remove the all spaces at the beginning and end of each row, and inside the row from several subsequent spaces, leave only one. I also want to remove all the empty rows and also the rows which have a different format from the general one, create correct table and export it to excel. As a result, the following table should appear in the excel sheet:

 Accepted Answer

I have export your text data into excel. then i work with this excel in matlab.
[number,str,all]=xlsread('example.xlsx'); % export data from excel file
all(:,[3 5])=[]; % deleting unnecessay column 3 and 5
A=all;
B=string(A); % convert into string array
D=rmmissing(B); % delete the missing entries
Name=D(1,:); % variable name
datetime=D(2:end,1);
D1=split(datetime," "); % split the date and time
variablename=['Date' 'Time' Name(2:end)]; % concatenate variable name
matrix=[D1 D(2:end,2:end)]; % conctenation
output=[variablename;matrix]; % concatenate the expected output
out1=cellstr(output);
writematrix(output, 'MyData.xlsx') % string array
writecell(out1,'MyData2.xlsx') % cell array

More Answers (1)

Thanks for the answer, but you manually exported the data from a text file to an Excel file(example.txt to example.xlsx) and manually gave the table such a correct form when all columns have separate headers. this is my main problem, so that matlab does this programmatically. I want matlab to read data directly from a text file, determine that the last column should be a character vector and not divide by spaces, and then matlab acted as described in the task. This last phase of the task you carried out beautifully, for which many thanks.

6 Comments

This file is kind of a mess. You do not say if this is exactly like all the files you need to read, or if they have different problems, or what. I would suggest you use something like
readcell('example.txt',delimiter='\t')
(not xlsread), and create some rules for what rows and columns of that cell array to delete. Once you have it cleaned up, use cell2table, then writetable.
Peter, thank you very much for your tips. is there any equivalent function of "xlsread" to find all output arguments(number,str,all)? as i know readtable,readmatrix does not return these output.
I had not realized that there was both a .txt and a .xlsx file attached to this post. I think the .xlsx is not what you are starting from, and in fact would be a big step towards what you want to end up with. But you need to write code that can make sense of understand the .txt automatically.
xlsread is for spreadsheets, and is not recommended any more. I was thinking readcell, but actually, readtable does a decent job:
>> readtable('example.txt','delimiter','\t')
ans =
16×8 table
Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8
_______________________ _____ __________ _____ __________ ____ ____ _____________________________
2022-02-13 23:35:11.200 41.15 {'N' } 43.97 {'E' } 2 3 {'ARMENIA' }
2022-02-13 20:56:02.400 41.28 {'N' } 43.96 {'E' } 2 2 {'GEORGIA (SAK'ART'VELO)' }
2022-02-13 20:26:08.500 41.15 {'N' } 44.02 {'E' } 2 2.1 {'ARMENIA' }
2022-02-13 19:11:45.800 38.66 {'N' } 44.84 {'E' } 10 2.4 {'TURKEY-IRAN BORDER REGION'}
2022-02-13 18:50:38.700 41.31 {'N' } 43.99 {'E' } 2 2.1 {'GEORGIA (SAK'ART'VELO)' }
NaT NaN {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
NaT NaN {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
2022-02-13 18:34:59.500 41.23 {'N' } 44.12 {'E' } 2 3.1 {'GEORGIA (SAK'ART'VELO)' }
NaT NaN {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
NaT NaN {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
2022-02-13 18:28:46.000 41.17 {'N' } 43.97 {'E' } 2 4.2 {'GEORGIA (SAK'ART'VELO)' }
NaT 6 {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
NaT NaN {0×0 char} NaN {0×0 char} NaN NaN {0×0 char }
2022-02-13 18:25:56.200 41.14 {'N' } 43.99 {'E' } 10 5.4 {'ARMENIA' }
2022-02-13 12:09:38.900 38.9 {'N' } 43.51 {'E' } 1 3 {'EASTERN TURKEY' }
2022-02-13 05:31:04.100 42.58 {'N' } 45.39 {'E' } 5 2.8 {'CAUCASUS REGION, RUSSIA' }
Now just clean things up:
  1. delete rows with NaT times
  2. delete Var3 and Var5 (I think)
  3. probably turn Var8 into categorical
That pretty much gets you there, I think.
Thank you very much Peter. But still i am quite poor to get my answer. I am attaching the spreadsheet again. i know xlsread is not recommended. But which function returns these output(number,str,all) instead of xlsread?
I don't understand. If you have this as a spreadsheet, just read that into a table and proceed as I descibed with the table you get back. Getting number,str,all isn't going to help you.
i was wrong on my step. I followed what you said. it works. thanks a lot Peter.

Sign in to comment.

Asked:

on 23 Feb 2022

Commented:

on 4 Mar 2022

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!