Excel sheet data upload "Invalid 'DataRange'. The column size must match the number of variables".
You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Show older comments
Error:
"Invalid 'DataRange'. The column size must match the number of variables".
I imported xlsx (excel) data into matlab then generated a script after which I copied into the beginning of a different script i wrote but I get the above error. Can you help me fix this? Why do I get this error?
2 Comments
Mathieu NOE
on 26 Jan 2023
you didn't tell us what the error is
Chanille
on 26 Jan 2023
Sorry all:
I put it in the subject, this was the error. "Invalid 'DataRange'. The column size must match the number of variables".
Accepted Answer
Walter Roberson
on 26 Jan 2023
opts.DataRange = "A2:AT78";
That requests that column "A" to "AT" be read in. However, M1_New_Macro1.xlsx only has data up to column "AS"
The headers in the xlsx file do not align with the content of the data. Observe that column A is titled "cut" and contains a file name, and column B is titled "Count" and contains integer numbers (plausibly counts of something.) Then look at column L "Slice" which is empty, then column M which named "Count" but contains file names. The Total Area in column C contains fractions but the Total Area in column N contains only integers (looking like a count)
It looks to me as if there are blank columns in the data but that the headers in row 1 do not take that into account.
When you look at the variable named in the code, it is not obvious to me that the variable names match up to the data in the file.
Starting in row 34 the file contains additional tables that are not in the same format as the data above that.
You should probably be using readcell() instead of readtable(), and extracting the data you need from the resulting cells.
12 Comments
Chanille
on 26 Jan 2023
Can you attach a corrected set up of how the excel sheet to be formatted? I'm sorry I just want to make sure I am imaging your instructions in the correct way.
how would i extract the data from only the resulting cells?
For example,
format long g
T = readcell('M1_New_Macro1.xlsx');
main_rows = [2:13, 19:30];
second_rows = 35:46;
third_rows = 51:62;
fourth_rows = 67:78;
Counts = cell2mat(T(main_rows, [2:12:end]));
ferret_ratios = cell2mat(T(fourth_rows, 37:41));
%cross-check
Counts(1:3,:)
ans = 3×4
1.0e+00 *
1156 1368 1466 0.38288
1157 889 839 0.38288
728 889 528 0.38288
ferret_ratios(1:3,:)
ans = 3×5
1.629101 1.638418 1.633194 1.630503 1.609898
1.629101 1.638418 1.633194 1.630503 1.609898
1.629101 1.638418 1.633194 1.630503 1.609898
This is Insightful,
But its still not working for me. I am trying to extraploate the PCA components from this data (per the code). With your suggestions I still get an error. What am I missing?
Thank you!
I cannot tell from your code which values you want to calculate the PCA of. Your code was obvious written for a different format of xlsx file.
Chanille
on 27 Jan 2023
Yes, it was written for a different xlsx file that's why I regenerated the script and pasted it above.
how do i make it obvious which values I want to calculate the PCA of?
You could give excel cell references. Or you could describe it, something like "the odd-numbered columns of the ferret ratios"
There is also the question of whether you will need to do this for other files as well, and if so then will they have the same number of rows of data and if information is needed from the additional tables, will the tables always start in the same column ?
For example is the rule that you need to search for two empty lines to mark the end of each section, and that for the additional sections you want to skip all leading empty columns ? Will additional sections always be in the same order ?
Chanille
on 27 Jan 2023
I want to give the excel cell references of every column with at least 4 significant figures. How can i fix this code so that it reads all of the exact cell references in the xlsx file?
Chanille
on 28 Jan 2023
Can you write what you mean using my xlsx as reference where it reads all the columns with more than 3 significant figures into a pca generating format as I have tried, for example? Thanks!
in order to figure out whether something has at least 3 significant figures, you need to load the xlsx file as if all of the entries are text, and then you have to examine strings and count positions after the decimal place. But this is risky because excel sometimes stores long approximations, so for example 8.93 (intended to be 2 decimal places) might appear in the excel file itself as 8.9299999999875.
Remember that excel internally uses binary floating point for anything stored in General format (or some time formats), and binary floating point numbers cannot represent exactly 0.1 (for the same mathematical reasons that finite decimal cannot exactly represent 1/3)
If you read in a number that displays as 8.93 (unless you look very closely at it) can you say that it is two decimal places, or is it instead 8.930, three decimal places the last of which happens to be 0?
Chanille
on 29 Jan 2023
Ok. I don’t know how to read in the columns with the data that I want analyzed.
filename = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1274435/M1_New_Macro1.xlsx';
C = readcell(filename);
nmask = cellfun(@isnumeric, C);
spy(nmask)

L = bwlabel(nmask, 4); %4 connected
stats = regionprops(L, 'BoundingBox');
bb = ceil(vertcat(stats.BoundingBox)); %ceil is important here!
blobs = {};
bloblocs = {};
for row = 1 : size(bb,1)
sc = bb(row,1);
sr = bb(row,2);
ec = sc + bb(row,3) - 1;
er = sr + bb(row,4) - 1;
blobcols = sc:ec;
blob = cell2mat(C(sr:er, blobcols));
need4 = blob ~= round(blob,3);
col_needs_4 = any(need4, 1);
extract = blob(:,col_needs_4);
if ~isempty(extract)
blobs{end+1} = extract;
usedcols = blobcols(col_needs_4);
bloblocs{end+1} = [sr er usedcols];
end
end
bloblocs
bloblocs = 1×14 cell array
{[2 13 6 7 11]} {[19 30 6 7 11]} {[2 13 18 19 23]} {[19 30 18 19 23]} {[2 13 30 31 35]} {[19 30 30 31 35]} {[2 13 37 38 39 40 41]} {[19 30 38 39 40 41]} {[67 78 37 38 39 40 41]} {[2 13 44 45]} {[19 30 44 45]} {[35 46 44 45]} {[51 62 44 45]} {[67 78 44 45]}
The code is able to detect 14 different blocks of numeric data, and identify which columns within the blocks need 4 significant digits. The bloblocs output you see above, such as [2 13 6 7 11] means there is a block of numeric data starting from row 2 and ending at row 13, in which columns 6 7 and 11 contained 4+ digit numbers. In terms of the headings of your file those are density, norm/area, and ferret/minferret . You can see from [19 30 6 7 11] that rows 19 to 30 had a similar pattern of numeric values.
When you get to [2 13 37 38 39 40 41] that is again rows 2 to 13, columns AK to AO; the column headers are messed up by that point, but probably those are intended to be density columns.
Then you get the blobs such as [67 78 37 38 39 40 41] which is rows 67 to 78, AK to A0, and probably intended to be ferret/minferret values. You will also see that the end of those rows have blobs for Avg, std, N, where the std and N columns need at least 4 digits.
In each case, the contents of the columns that need 4 digits (only those columns) from each blob are stored into the cell array blobs -- blobs{K} is a numeric array with multiple columns, and bloblocks{K} tells you where that data was originally from inside the spreadsheet.
So we can now automatically identify where in the file there are blocks containing data that needs at least 4 digits, and we have extracted those sections into a cell array, and we have recorded the locations we took them from.
Now what? Should we assume that each blob will have exactly the same number of rows, and just horizontally concatenate them all together and do pca on that??
allblocks = horzcat(blobs{:});
size(allblocks)
ans = 1×2
12 42
Chanille
on 31 Jan 2023
Thank you @Walter Roberson
More Answers (0)
Categories
Find more on Spreadsheets in Help Center and File Exchange
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)