how to organise data in an Excel spreadsheet

Hi there,
I want to analyse the data from the excel sheet shown in the screenshot.
I'm anticipating filters that work like structural code, allowing me to group the data and do further arithmetic operations on a new tab sheet with a header already present.
I need help with matlab guidance code.
Many thanks

 Accepted Answer

Look into -
This documentation page has a good amount of examples related to what you want to do -

11 Comments

Thank you for the pointers
Can you kindly explain how to use the groupsummary when analysing categorical and numerical variables?
Say, Could you kindly assist and prepare groupsummary for Filename and Type and Parameters 1 or 2 from the example screen shot above?
If you're not sure how to use groupsummary and would like some guidance, consider using the Compute by Group Live Editor task and using the interactive controls to experiment with the various options. Once you've got the results looking the way you want with the interactive controls you could view and edit the generated code that you could use to reproduce those results or as the starting point for operating on different but related data.
@Life is Wonderful did you see the last link I attached? It has examples on how to do what you want to do.
If you want help, you will need to show what you have attempted yet, ask where you having a problem and attach the data you are working with.
In the moment, I have following code as of now.
Thanks
function sampledatav = importfile(workbookFile, sheetName, dataLines)
%% Input handling
% If no sheet is specified, read first sheet
if nargin == 1 || isempty(sheetName)
sheetName = 1;
end
% If row start and end points are not specified, define defaults
if nargin <= 2
dataLines = [1, Inf];
end
%% Set up the Import Options and import the data
opts = spreadsheetImportOptions("NumVariables", 5);
% Specify sheet and range
opts.Sheet = sheetName;
opts.DataRange = dataLines(1, :);
% Specify column names and types
opts.VariableNames = ["Var1", "Type", "Name", "param1", "param2"];
opts.SelectedVariableNames = ["Type", "Name", "param1", "param2"];
opts.VariableTypes = ["char", "double", "string", "double", "double"];
% Specify variable properties
opts = setvaropts(opts, ["Var1", "Name"], "WhitespaceRule", "preserve");
opts = setvaropts(opts, ["Var1", "Name"], "EmptyFieldRule", "auto");
% Import the data
sampledatav = readtable(workbookFile, opts, "UseExcel", false);
for idx = 2:size(dataLines, 1)
opts.DataRange = dataLines(idx, :);
tb = readtable(workbookFile, opts, "UseExcel", false);
sampledatav = [sampledatav; tb]; %#ok<AGROW>
end
end
Can you attach the excel file you are working with?
I have include the excel file.Thank you !!
I have include the excel file.Thank you !!
You are free to adjust as you see fit; however, the fileName and corresponding data selection is where I am having difficulty creating the groupsummary.
Idk why you are using such a complicated method just to read the data.
%Read data via readtable
tb = readtable('sample_data_v1.xlsx','VariableNamingRule','preserve')
tb = 11×10 table
Type Name param1 param2 file1-max-param1 file2-avg-param1 file3-std-param1 file1-max-param2 file2-avg-param2 file3-std-param3 ____ _________ ______ ______ ________________ ________________ ________________ ________________ ________________ ________________ 0 {'file1'} 10 100 12 15.25 6.7175 100 275 7.7782 1 {'file1'} 12 100 NaN NaN NaN NaN NaN NaN 1 {'file1'} 11 100 NaN NaN NaN NaN NaN NaN 2 {'file1'} 12 100 NaN NaN NaN NaN NaN NaN 2 {'file2'} 15 200 NaN NaN NaN NaN NaN NaN 1 {'file2'} 15 200 NaN NaN NaN NaN NaN NaN 6 {'file2'} 15 300 NaN NaN NaN NaN NaN NaN 8 {'file2'} 16 400 NaN NaN NaN NaN NaN NaN 10 {'file1'} 10 50 NaN NaN NaN NaN NaN NaN 1 {'file3'} 10 89 NaN NaN NaN NaN NaN NaN 2 {'file3'} 0.5 100 NaN NaN NaN NaN NaN NaN
You can import the data of interest from the columns B to E directly by mentioning the Range arguement, but I have chosen a general approach.
%Get indices of columns which have missing values
idx = any(ismissing(tb),1);
%Remove the data that is not required
tb(:,idx)=[];
%Get the variable names
%str = tb.Properties.VariableNames
%Convert the file names to categorical array
tb.Name = categorical(tb.(2))
tb = 11×4 table
Type Name param1 param2 ____ _____ ______ ______ 0 file1 10 100 1 file1 12 100 1 file1 11 100 2 file1 12 100 2 file2 15 200 1 file2 15 200 6 file2 15 300 8 file2 16 400 10 file1 10 50 1 file3 10 89 2 file3 0.5 100
%Get the max for the 3rd and 4th variables, grouping according to the 2nd variable
out = groupsummary(tb,2,"max",[3 4])
out = 3×4 table
Name GroupCount max_param1 max_param2 _____ __________ __________ __________ file1 5 12 100 file2 4 16 400 file3 2 10 100
Similarly, do for mean and standard deviation.
Thank you@Dyuman Joshi, this takes me as first step done.
At the next level, I want to examine data to find change points, outliers, comprehend the trend, and locate detrends.

Sign in to comment.

More Answers (0)

Categories

Products

Release

R2023a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!