Extract Data from text File

Jason on 7 Feb 2017
Commented: Jason on 9 Feb 2017
Hi, can someone give me an indication of how to get the 2nd and 3rd columns from the data at the end of this text file? The header lines will generally differ, but the data headings ('Data', 'Position' and 'Value') will always be present just before the numeric data I wish to extract and plot.
Thanks
Listing of Huygens PSF Cross Section Data
File : C:\zemax\test
Title: Q654645645tretertertertretrtret
Date : 07/02/2017
Configuration 1 of 3
Huygens PSF Cross Section X
0.5820 to 0.5820 µm at 0.0000 mm.
Data spacing is 0.290 µm.
Strehl ratio: 0.991
Pupil grid size: 64 by 64
Image grid size: 64 by 64
Center coordinates : 0.00000000E+00, 0.00000000E+00 Millimeters
Values are relative irradiance normalized to a peak of 1.0.
X Section, Center Row
Data Position Value
0 -9.28143 0.000865
1 -8.99138 0.001357
2 -8.70134 0.001554
3 -8.41129 0.001315
4 -8.12125 0.000745
5 -7.83120 0.000189
6 -7.54116 0.000080
7 -7.25111 0.000691
8 -6.96107 0.001924
9 -6.67102 0.003279
10 -6.38098 0.004047
11 -6.09094 0.003706
12 -5.80089 0.002314
13 -5.51085 0.000677
14 -5.22080 0.000122
15 -4.93076 0.001872
16 -4.64071 0.006265
  2 Comments
Walter Roberson on 7 Feb 2017
Will the 'File' header also always be at the beginning? That would make things easier.
Jason on 7 Feb 2017
Yes it will.

Accepted Answer

Walter Roberson on 7 Feb 2017
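fid = fopen('YourFile.txt', 'rt');
% Skip everything between 'File' and the 'Data Position Value' header as a comment block
data_cell = textscan(fid, '%f%f%f', 'CommentStyle', {'File', 'Data Position Value'}, 'CollectOutput', 1);
fclose(fid);
data = data_cell{1};   % N-by-3 matrix with columns Data, Position, Value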
Note: make sure you get the spacing right on the Data header. If it turns out to have variable spacing then I would have to re-think how to do it.
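If the header spacing does turn out to vary, one option (a sketch only, not the method from the answer above) is to locate the header with a regular expression that tolerates any run of whitespace and then parse the rows after it with sscanf. The temporary sample file and the variable names here are assumptions for illustration, and the file is assumed to be plain single-byte text.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical content)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'File : C:\\zemax\\test\nData    Position  Value\n0 -9.28143 0.000865\n1 -8.99138 0.001357\n');
fclose(fid);

% Split the file into lines, accepting either Unix or Windows line endings
lines = regexp(fileread(fname), '\r?\n', 'split');

% Find the header line, allowing any amount of whitespace between the words
isHeader = ~cellfun(@isempty, regexp(lines, '^\s*Data\s+Position\s+Value\s*$', 'once'));
headerIdx = find(isHeader, 1, 'last');

% Parse the numeric rows that follow the header into an N-by-3 matrix
data = sscanf(strjoin(lines(headerIdx+1:end), ' '), '%f', [3 Inf]).';
position = data(:, 2);   % 2nd column
value    = data(:, 3);   % 3rd column
delete(fname);
```

With the columns separated this way, plot(position, value) should give the cross-section plot.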
  5 Comments
Walter Roberson on 9 Feb 2017
uchar is unsigned character. The same as bytes. You could use uint8 instead there.
UTF16-LE is the "Little Endian" (having to do with byte order) version of UTF-16, a system for encoding Unicode characters that uses one 2-byte unit for most characters but switches to a pair of 2-byte units (4 bytes) when code points past U+FFFF are required (for example, you might need to encode Linear B or Ancient Greek Numbers). The range that can be encoded directly in 2 bytes in UTF-16 includes most characters in the common Asian languages. More common than UTF-16 is UTF-8, which uses a single byte for the 128 ASCII characters but then requires multiple bytes: 2 bytes gets you another 1920 code points (up to U+07FF), and after that you need to go to 3 bytes. So if you have a lot of characters outside the first 2048 code points, it can be more efficient to use the 2-byte encoding than to continually generate 3 bytes for the sake of allowing some characters to be represented in 1 byte.
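As a rough illustration of the size difference, the following sketch counts the bytes that unicode2native produces for three sample characters in each encoding. The choice of characters is an assumption made for the example.

```matlab
% Sample characters: U+0041 (A), U+00E9 (e with acute accent), U+20AC (euro sign)
chars = {'A', char(233), char(8364)};

% Number of bytes each character needs in UTF-8 and in UTF-16LE
nUtf8  = cellfun(@(c) numel(unicode2native(c, 'UTF-8')), chars);
nUtf16 = cellfun(@(c) numel(unicode2native(c, 'UTF-16LE')), chars);

disp([nUtf8; nUtf16]);   % first row UTF-8, second row UTF-16LE
```

The UTF-8 counts should grow with the code point, while the UTF-16LE count stays at 2 bytes for all three characters.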
Anyhow, the point at hand is that your file does not use the typical "one byte per character" text encoding: it has been encoded as two bytes per character, and MATLAB needs to be told to expect that.
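One way to tell MATLAB about the encoding, sketched here under assumptions rather than taken from the thread, is to read the raw bytes as uint8 and convert them with native2unicode. The temporary sample file, its contents, and the variable names are made up for the example.

```matlab
% Write a tiny UTF-16LE sample file with a byte-order mark (hypothetical content)
fname = [tempname '.txt'];
sample = sprintf('Data Position Value\r\n0 -9.28143 0.000865\r\n');
fid = fopen(fname, 'w');
fwrite(fid, [255 254 unicode2native(sample, 'UTF-16LE')], 'uint8');
fclose(fid);

% Read the file back as raw bytes
fid = fopen(fname, 'r');
bytes = fread(fid, [1 Inf], '*uint8');
fclose(fid);
delete(fname);

% Decode two bytes per character, then drop the byte-order mark if present
txt = native2unicode(bytes, 'UTF-16LE');
if ~isempty(txt) && txt(1) == char(65279)   % U+FEFF
    txt(1) = [];
end
```

The decoded txt is an ordinary character vector, so it can be passed to textscan in place of a file identifier.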
Jason on 9 Feb 2017
Thank you.
