Read mixed numbers in Matlab

Question

0 votes

data.txt

Dear Sir/Madam,

I have a data file containing text and mixed numbers like this (see the attached file data.txt):

AA,  BB, 28 21/64,  28 45/64,
AA,  BB, 1/64,  11/64,

The mixed numbers have the format:

integer space numerator/denominator

I would like to read the data file in MATLAB as

AA, BB, 28.328125, 28.703125,
AA, BB, 0.015625,  0.171875,

i.e. read the mixed numbers and convert them into decimal numbers.

Which MATLAB command should I use? I would greatly appreciate it if you could post your code and its output.

I am using MATLAB R2014a.

Thank you

Answer 1

Stephen23 on 23 Dec 2018

Edited: Stephen23 on 23 Dec 2018

0 votes

data.txt

opt = {'CollectOutput',true,'Delimiter',','};
fmt = repmat('%s',1,4);
[fid,msg] = fopen('data.txt','rt');
assert(fid>=3,msg);
C = textscan(fid,fmt,opt{:})
fclose(fid);
C = C{1}; % raw character data in a cell array.
% Convert fractions to numeric:
foo = @(v) sum([v(1:end-2),v(end-1)/v(end)]);
baz = @(s) foo(sscanf(s,'%d%*[ /]'));
M = cellfun(baz,C(:,3:4))

Giving:

M =
   28.328125   28.703125
    0.031250    0.046875

and the string data is in C. If you want the numeric data reinserted back into C then use num2cell:

    >> C(:,3:4) = num2cell(M)
C =
   'AA'    'CC'    28.328125   28.703125
   'AA'    'CC'     0.031250    0.046875

Stephen23 on 23 Dec 2018

Edited: Stephen23 on 23 Dec 2018

@John Smith: the square brackets are just an artifact of how the numeric data is displayed, it does not mean that the numeric data somehow magically has square brackets around it. How data is displayed is not the same thing as what data is actually stored in memory: in this case, C is a cell array and so numeric scalars in any cell are helpfully displayed with square brackets around them to indicate that they are numeric values (and not strings or char arrays or something else).

Try it yourself:

D = {1}

There are NO square brackets in the data, it is just how it is displayed. You do NOT need to remove them because there is NOTHING THERE to remove. Exactly the same thing applies to the character vectors, which have single quotes to indicate that they are character vectors:

'this is a char vector'

But those single quotes are also just an artifact of how the character vector is displayed, and they cannot be removed because they are NOT actually in the character vector itself.

Learn about cell arrays:

https://www.mathworks.com/help/matlab/cell-arrays.html

https://www.mathworks.com/help/matlab/matlab_prog/access-data-in-a-cell-array.html
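A minimal sketch of the point above, using a made-up cell array D (the variable names are only for illustration): extracting the content with curly braces gives an ordinary numeric value or character vector, with no brackets or quotes stored in it.

```matlab
% Hypothetical 1x2 cell array: one numeric scalar, one character vector.
D = {1, 'AA'};

x = D{1};          % curly braces return the content of the cell
s = D{2};

disp(class(x))     % the content is an ordinary double
disp(class(s))     % the content is an ordinary char vector
disp(numel(s))     % two characters: the quotes are not part of the data

y = x + 0.5;       % normal arithmetic works on the extracted number
disp(y)
```

The brackets and quotes only appear when MATLAB displays the cell array itself, e.g. by typing D at the prompt.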

John Smith on 23 Dec 2018

Thank you very much Stephen,

I have adapted your code to my real data, which I am attaching as ZN_CALL.txt.

For the data file ZN_CALL.txt, I need to skip the first two rows (lines) and then extract the 4th, 8th and 9th columns. All other columns are not important.

The following is the modified code:

opt = {'CollectOutput',true,'Delimiter',','}
fmt = repmat('%s',1,10)  %%  'FORMAT'
%  %s is used to print output that formated as string.  %s represents character vector(containing letters) 
%%  repmat Replicate and tile an array.   1, 4   mean 1 row 4 column,    %s%s%s%s
                       
[fid,msg] = fopen('ZN_CALL.txt','rt')  %%% r  Open file for reading (default). To open in text mode, add "t" to the permission string, for example 'rt' and 'wt+'.
assert(fid>=3,msg);   %%% make sure fid >=3
C = textscan(fid,fmt,opt{:},'headerlines',2)  
disp(C)
fclose(fid);
C = C{1}     % show raw character data in a cell array.
% Convert fractions to numeric:
foo = @(v) sum([v(1:end-2),v(end-1)/v(end)])
baz = @(s) foo(sscanf(s,'%d%*[ /]'))
M = cellfun(baz,C(:,4,8:9))
disp(M)
%%% and the string data is in C. If you want the numeric data reinserted back into C then use num2cell:
C(:,3:4) = num2cell(M)
C(2,2)
C = C(2,3)     % show raw character data in a cell array.

I receive the following error:

foo = 
    @(v)sum([v(1:end-2),v(end-1)/v(end)])
baz = 
    @(s)foo(sscanf(s,'%d%*[ /]'))
Index exceeds matrix dimensions.
Error in StephenCobeldick_ZN_data (line 22)
M = cellfun(baz,C(:,4,8:9))

Thank you for your time and knowledge.

Stephen23 on 24 Dec 2018

Edited: Stephen23 on 24 Dec 2018

ZN_CALL.txt

You will need to use indexing to select only the relevant cells of the imported data:

opt = {'CollectOutput',true,'Delimiter',',','HeaderLines',2};
fmt = repmat('%s',1,10);
[fid,msg] = fopen('ZN_CALL.txt','rt');
assert(fid>=3,msg);
C = textscan(fid,fmt,opt{:});
fclose(fid);
C = C{1};
% eighth & ninth columns:
tmp = C(:,8:9);
out = str2double(tmp);
idx = cellfun('length',tmp)>2 & isnan(out); % non-blank text that str2double could not convert
foo = @(v) sum([v(1:end-2),v(end-1)/v(end)]);
baz = @(s) foo(sscanf(s,'%d%*[ /]'));
out(idx) = cellfun(baz,tmp(idx));
% fourth column:
out = [str2double(C(:,4)),out];

This gives the three columns that you requested in one numeric array out:

>> out
out =
50000    28.32812    28.70312
00000    28.04688    28.42188
50000    27.32812    27.70312
00000    26.82812    27.20312
50000    26.54688    26.92188
00000    26.04688    26.42188
50000    25.32812    25.70312
00000    25.04688    25.42188
50000    24.32812    24.70312
00000    24.04688    24.42188
50000    23.54688    23.92188
00000    23.04688    23.42188
50000    22.54688    22.92188
00000    22.04688    22.42188
50000    21.32812    21.70312
00000    21.04688    21.42188
50000    20.54688    20.92188
00000    20.04688    20.42188
50000    19.54688    19.92188
00000    19.04688    19.42188
50000    18.32812    18.70312
00000    18.04688    18.42188
50000    17.54688    17.92188
00000    17.04688    17.42188
50000    16.54688    16.92188
00000    15.82812    16.20312
50000    15.32812    15.70312
00000    14.82812    15.20312
50000    14.32812    14.70312
00000    13.82812    14.20312
50000    13.54688    13.92188
00000    12.82812    13.20312
50000    12.54688    12.70312
00000    11.82812    12.20312
50000    11.32812    11.70312
00000    11.04688    11.42188
50000    10.54688    10.92188
00000    10.04688    10.42188
50000     9.54688     9.92188
00000     9.04688     9.42188
50000     8.54688     8.92188
00000     8.04688     8.42188
50000     7.54688     7.92188
00000     7.04688     7.42188
50000     6.54688     6.92188
00000     6.04688     6.42188
50000     5.54688     5.92188
00000     5.04688     5.21875
50000     4.54688     4.71875
00000     4.06250     4.23438
50000     3.56250     3.73438
00000     3.06250     3.25000
50000     2.64062     2.68750
00000     2.17188     2.21875
50000     1.71875     1.76562
00000     1.29688     1.34375
50000     0.95312     0.98438
00000     0.67188     0.70312
50000     0.46875     0.48438
00000     0.31250     0.34375
50000     0.21875     0.25000
00000     0.17188     0.18750
50000     0.12500     0.15625
00000     0.10938     0.12500
50000     0.07812     0.10938
00000     0.06250     0.09375
50000     0.06250     0.07812
00000     0.04688     0.06250
50000     0.03125     0.06250
00000     0.03125     0.04688
50000     0.01562     0.04688
00000     0.01562     0.03125
50000     0.01562     0.03125
00000         NaN     0.03125
50000         NaN     0.03125
00000         NaN     0.01562
50000         NaN     0.01562
00000         NaN         NaN
50000         NaN         NaN
00000         NaN         NaN
50000         NaN         NaN
00000         NaN         NaN
50000         NaN         NaN
00000         NaN         NaN
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
50000         NaN     0.00000
00000         NaN     0.00000
   

Note that blank cells are converted to NaN.

PS: Note that your indexing is incorrect:

C(:,4,8:9)
  ^          All of first dimension (rows)
    ^        4th of second dimension (columns)
      ^^^    8th & 9th of third dimension (pages)

To get the 4th, 8th and 9th columns then you would need to create a vector of those indices:

C(:,[4,8:9])
  ^           All of first dimension (rows)
    ^^^^^^^   4th, 8th & 9th of second dimension (columns)

https://www.mathworks.com/help/matlab/math/array-indexing.html

https://www.mathworks.com/help/matlab/math/multidimensional-arrays.html
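As a small illustration of the difference, here is a sketch with a made-up 3x10 cell array (the contents are placeholders, not the real file data):

```matlab
% Hypothetical 3x10 cell array filled with a placeholder string.
C = cell(3,10);
C(:) = {'x'};

% One index vector in the second dimension: columns 4, 8 and 9.
sub = C(:,[4,8:9]);
disp(size(sub))            % three rows, three columns

% Three subscripts: column 4 of pages 8 and 9, which do not exist here.
failed = false;
try
    bad = C(:,4,8:9);
catch err
    failed = true;
    disp(err.message)      % exact wording depends on the MATLAB version
end
```

The second form fails because C has only one page in the third dimension.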

Stephen23 on 24 Dec 2018

Edited: Stephen23 on 30 Dec 2018

idx = cellfun('length',tmp)>2 & isnan(out);

Checks if the length of each character vector in the cell array tmp is greater than two AND where the elements of out are NaN (i.e. the elements of tmp do not encode scalar numbers). So this will only be TRUE for values that are non-empty and that are not scalar numeric. It will therefore be FALSE where the cell is empty (or has length <=2) OR it has already been converted to a scalar numeric by str2double.

foo = @(v) sum([v(1:end-2),v(end-1)/v(end)]);

Accepts a numeric vector v of two or three elements, and converts this into a scalar numeric. The final two elements of the vector form a fraction, which is then summed with any preceding elements of the vector.
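To make this concrete, here is foo applied by hand to two column vectors of the kind that sscanf returns (the input values are just examples taken from the question):

```matlab
foo = @(v) sum([v(1:end-2),v(end-1)/v(end)]);

% Three elements: integer part, numerator, denominator.
a = foo([28;21;64]);   % 28 + 21/64
% Two elements: numerator and denominator only, v(1:end-2) is empty.
b = foo([1;64]);       % 1/64

fprintf('%.6f\n',a,b)
```

In the two-element case the empty v(1:end-2) contributes nothing to the sum, so only the fraction remains.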

baz = @(s) foo(sscanf(s,'%d%*[ /]'));

Converts a string/character vector representing numbers into a numeric vector, e.g.

>> sscanf('7 2/3','%d%*[ /]')
ans =
   7
   2
   3
>> sscanf('5/6','%d%*[ /]')
ans =
   5
   6

and then applies the function foo to this numeric vector.

out(idx) = cellfun(baz,tmp(idx));

Applies the function baz to the cells of the cell array tmp that were selected by the index idx. This selects the character vectors that could not be converted by str2double and are not blank, converts each one to a numeric vector, calculates the fraction, sums, and assigns the resulting numeric values into the numeric matrix out.
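Putting the four lines together on a small made-up cell array (the strings below are hypothetical test values, not taken from ZN_CALL.txt) gives a quick way to check the behaviour for mixed numbers, plain fractions, decimals, integers and blanks:

```matlab
% Hypothetical 3x2 cell array of text, as textscan would return with %s.
tmp = {'28 21/64','28.5'; '1/64',''; '7','3 1/2'};

out = str2double(tmp);                        % decimals/integers convert, the rest become NaN
idx = cellfun('length',tmp)>2 & isnan(out);   % non-blank text that is still unconverted
foo = @(v) sum([v(1:end-2),v(end-1)/v(end)]);
baz = @(s) foo(sscanf(s,'%d%*[ /]'));
out(idx) = cellfun(baz,tmp(idx));             % fill in the mixed numbers and fractions

disp(out)
```

The blank cell stays NaN, while '28.5' and '7' are handled by str2double and never reach baz.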

"what do you do with baz = @(s) foo(sscanf(s,'%d%*[ /]')) "

Define a function that uses sscanf to convert a character vector into a numeric vector, and then applies foo to that numeric vector.

"what is %*[ /]' ?"

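Tells sscanf to read any run of space characters and forward slash characters and then discard it: [ /] is the set of characters to match, and the * means that the matched text is not returned.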
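One way to see what the skipped field does is to compare the same input with and without it (a small sketch, the variable names are only for illustration):

```matlab
% With the skipped field, the '/' is consumed and reading continues.
withSkip = sscanf('7 2/3','%d%*[ /]');

% With %d alone, sscanf stops at the first character it cannot read as an integer.
noSkip = sscanf('7 2/3','%d');

disp(withSkip.')   % all three integers
disp(noSkip.')     % only the integers before the '/'
```

Whitespace before a number is skipped by %d anyway, so it is really the forward slash that needs the extra field.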

"what do you mean with @(v) and @(s) ?"

Define anonymous functions:

https://www.mathworks.com/help/matlab/matlab_prog/anonymous-functions.html

Stephen23 on 25 Dec 2018

Edited: Stephen23 on 26 Dec 2018

"Are the answer from MATLAB correct?"

Yes.

>> D1 = {}; % an empty cell array, so there are no arrays inside of it.        
>> cellfun('length',D1) % no arrays -> no lengths -> empty output.
ans = []
>> D2 = {23.5}; % a 1x1 cell array containing a scalar numeric.
>> cellfun('length',D2) % a scalar numeric has length 1.
ans =  1
>> D3 = {5/64}; % a 1x1 cell array containing a scalar numeric.
>> cellfun('length',D3) % a scalar numeric has length 1.
ans =  1
>> D4 = {28,1/64}; % a 1x2 cell array, each cell contains a scalar numeric.
>> cellfun('length',D4) % two scalar numerics, each has length 1.
ans =
   1   1

"no answer is great than 2 ?"

Correct, none of those cell arrays contain arrays with length greater than two. In fact all of the lengths are one because you defined all of the arrays in every cell to be scalar. Scalars have length one:

>> length(23.5)
ans =  1

A cell array is just a container for holding other arrays. This is useful because sometimes we need to hold together other arrays of different sizes, types, or dimensions. In your examples, you have cell arrays of different sizes (with zero, one, or two cells), and you placed a scalar numeric in every cell. The length of a scalar is one, so cellfun('length',...) carefully measures the length of the array in each cell and returns that answer for every cell of the cell arrays you defined.
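For comparison, here is the same call on a made-up cell array of character vectors, which is what the imported file data actually contains (the strings are hypothetical examples):

```matlab
% Hypothetical text cells like those returned by textscan with %s.
tmp = {'28 21/64','1/64','7','','28.5'};

% The length of a character vector is its number of characters.
len = cellfun('length',tmp);
disp(len)

% Only some of the strings are longer than two characters.
disp(len>2)
```

Here the lengths differ from cell to cell, which is why the test >2 is useful on text but always false on scalar numerics.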

https://www.mathworks.com/help/matlab/matlab_prog/access-data-in-a-cell-array.html

http://matlab.wikia.com/wiki/FAQ#What_is_a_cell_array.3F

John Smith on 30 Dec 2018

Dear Stephen,

I would like to ask you the following question:

I have a data file like this

tmp =
121 12 6914 0.5625
122 -48 6853 0.29688
119 48 6914 0.17188
125 -12 6853 0.078125
125 4 6853 0.4375
119 5 6832 0.20313
119 4 6832 0.039063
119 -4 6832 0.023438

I would like to regroup (or reduce) it under the following condition:

For any row, if column 1 AND column 3 are identical to column 1 AND column 3 of any other row, then those rows are reduced to one new row. The new value of column 2 is the sum of the original values of column 2. Columns 1 and 3 are kept the same; column 4 is not important.

So for above data, I expect to have the answer:

119 5 6832 0.20313 % 5+4-4=5
122 -48 6853 0.29688
125 -8 6853 0.4375 % -12+4=-8
121 12 6914 0.5625
119 48 6914 0.17188

Thank you very much

Stephen23 on 30 Dec 2018

@John Smith: that is quite a different topic. Please ask a new question about that.

Answer 2

Brian Hart on 22 Dec 2018

0 votes

Here's some code to read the data in as a table, then update the mixed number values in each row to the decimal...

T = readtable('C:\Users\MATLAB\Desktop\data.txt','Delimiter',',','ReadVariableNames',false) % single quotes: double-quoted strings need R2017a or later
numRows = size(T,1);
for i = 1:numRows
    for j=3:4
        tmpstr = T{i,j};
        tmpstr=tmpstr{:};
        tmpstr=strrep(tmpstr, '/',' ');
        mixedVal = sscanf(tmpstr,'%d');
        if size(mixedVal,1) == 2
            decVal = mixedVal(1)/mixedVal(2);
        else
            decVal = mixedVal(1) + mixedVal(2)/mixedVal(3);
        end
        T(i,j) = {num2str(decVal)};
    end
end
disp(T)

John Smith on 23 Dec 2018

Edited: per isakson on 23 Dec 2018

data.txt

Thank you Brian,

The code gives me errors (I am using MATLAB R2014a).

Answer 3

Guillaume on 22 Dec 2018

Edited: Guillaume on 24 Dec 2018

0 votes

I would recommend that you modify whatever is creating these files so that it creates files that don't have such an unusual format.

As it is, the following should work but is not particularly good code:

data = readtable('data.txt', 'Delimiter', ',', 'ReadVariableNames', false);
data = [data(:, [1 2]), ...  %leave text variables unchanged
        varfun(@(var) cellfun(@(var) str2num(strrep(var, ' ', '+')), var), data, 'InputVariables', [3 4])];  %convert 'numeric' variables

It simply replaces the space with + and uses str2num to parse the expression, which is simple but dangerous because str2num evaluates whatever text it is given.

edited to add missing input to cellfun and wrong function use
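If evaluating arbitrary text is a concern, one possible alternative is to parse the numbers with sscanf and str2double instead. The following is only a sketch and not part of the answer above: the input strings and variable names are hypothetical, and it assumes each field is a mixed number, a plain fraction, a plain number, or blank.

```matlab
% Hypothetical input fields covering the cases discussed in this thread.
strs = {'28 21/64','21/64','28','28.5',''};
vals = nan(size(strs));                 % blank or unparsable fields stay NaN

for k = 1:numel(strs)
    s = strtrim(strs{k});
    if any(s=='/')
        % Fraction present: read 2 or 3 numbers, skipping spaces and slashes.
        v = sscanf(s,'%f%*[ /]');
        if numel(v)>=2
            vals(k) = sum(v(1:end-2)) + v(end-1)/v(end);
        end
    elseif ~isempty(s)
        % No fraction: a plain integer or decimal number.
        vals(k) = str2double(s);
    end
end

disp(vals)
```

Neither sscanf nor str2double executes the text, so unexpected file content gives NaN or a shorter vector rather than running code.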

Guillaume on 23 Dec 2018

Edited: Guillaume on 23 Dec 2018

Sorry, I forgot the input to cellfun. Fixed now.

In the future, please copy/paste the error as text rather than a screenshot.

Note: the advantage of my answer over the other proposed ones is that it will correctly parse all of the following:

28 21/64
21/64
28

Guillaume on 24 Dec 2018

John wrote in a comment now deleted: "Could you please make your code also apply to decimal numbers (as well as mixed number). The reason is that some data has mixed numbers, some data use decimal numbers. I would like have one code fits all. "

As I pointed out, my answer already does that. In a much simpler way. Also note that the numeric columns are stored as vectors which is more memory efficient and easier to use than a cell array of scalars.
