How can I analyse particular portions of an array to use for plotting? Considering a dataset of ~1.5 million rows for each variable and ~150,000 locations with 10 values (for each variable) per location

I have a dataset of ~1.5 million rows and 6 different variables, defined below:
  1. Variable 1 (Column 1) - Location
  2. Var 2 (Column 2) - Temperature
  3. Var 3 - Rainfall
  4. Var 4 - Number of people in the location (Variable 1)
  5. Var 5 - Sensor value A
  6. Var 6 - Sensor value B
The dataset contains 10 values per location, meaning there are ~150,000 locations.
The questions I'm trying to answer are:
  1. a) Determine the average number of people and plot the average for the top 100 locations with the most people on average
  1. b) Determine the minimum and maximum temperature and plot both values for the 100 locations with the highest maximum temperature
The structure given to approach this question was to find the Minimum, Maximum and Average from the 10 values in each location for each Variable (Temperature, Rainfall, Sensor A and B readings). They then suggested creating a matrix with all ~150,000 rows that includes the min, max and average for each required Variable, then plotting the graphs with the newly created matrix.
What process would I follow to find the Minimum, Maximum and Average of the 10 values in each location for each Variable?
I'm currently unsure how to:
  • Group the 10 values from each location together to find the min, max and mean for starters
  • Make those into a matrix
  • Plot the top 100 rows of my matrix (to represent a plot of the top 100 locations); I think I know the basics of how to plot a graph, but not for select data, like the top 100 rows of a matrix/array.
Any guidance would be much appreciated, thank you in advance for any assistance!
Adam Cook on 4 Jun 2020
Ahh, I forgot to include what I already had in the original post. So far I've imported the data as a table, converted each variable into its own array, and sorted everything by location, using the following:
location = table2array(data(:,1));
temp = table2array(data(:,2));
rainfall = table2array(data(:,3));
num_people = table2array(data(:,4));
sensorA = table2array(data(:,5));
sensorB = table2array(data(:,6));
[location, i_location] = sort(location);
temp = temp(i_location);
rainfall = rainfall(i_location);
num_people = num_people(i_location);
sensorA = sensorA(i_location);
sensorB = sensorB(i_location);
From here I'm not sure how to get the min, max and average for each location and put them into a matrix.


Answers (1)

dpb on 4 Jun 2020
Edited: dpb on 4 Jun 2020
"I’ve imported the data from a table and converted each variable into their own individual arrays,..."
That's exactly the wrong approach -- use grouping variables on the desired variables as suggested and illustrated in the doc for groupsummary, findgroups and/or splitapply. You'll also find groupsummary already does much if not all of what you're asking for automagically.
Assuming you already have the table, I'll name it tData for "table Data"
tData.Properties.VariableNames={'Location','Temperature','Rainfall','Population','SensorA','SensorB'}; % define meaningful variable names
tGData=groupsummary(tData,'Location',{'min','max','mean'}); % compute wanted statistics by location
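Then use maxk on the desired statistic in the output table to find the top 100. groupsummary names each output variable as the method, an underscore, then the input variable name (e.g. mean_Population, max_Temperature). The optional second output of maxk gives the row indices into tGData, which you use to extract those locations for plotting or whatever else is to be done.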
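A minimal sketch of that maxk step followed by the two plots from the question. The table below is synthetic stand-in data (500 made-up locations with 10 readings each, and only two of the six variables), so the sizes, value ranges and plot styling are assumptions for illustration, not the asker's real data:

```matlab
% Synthetic stand-in for the real table (assumed sizes and values).
rng(0);
nLoc = 500;  nRep = 10;
Location    = repelem((1:nLoc)', nRep);        % 10 rows per location
Temperature = 15 + 10*randn(nLoc*nRep, 1);
Population  = randi(1000, nLoc*nRep, 1);
tData = table(Location, Temperature, Population);

% One row per location with min/max/mean of every data variable.
tGData = groupsummary(tData, 'Location', {'min','max','mean'});

% 1a) the 100 locations with the highest mean population.
[~, iPop] = maxk(tGData.mean_Population, 100);  % second output = row indices
tTopPop   = tGData(iPop, :);

% 1b) the 100 locations with the highest maximum temperature.
[~, iTemp] = maxk(tGData.max_Temperature, 100);
tTopTemp   = tGData(iTemp, :);

figure
bar(tTopPop.mean_Population)
xlabel('Rank (by mean population)'); ylabel('Mean number of people')

figure
plot(1:100, tTopTemp.max_Temperature, 'r-', ...
     1:100, tTopTemp.min_Temperature, 'b-')
legend('Max temperature', 'Min temperature')
xlabel('Rank (by max temperature)'); ylabel('Temperature')
```

maxk returns its results in descending order, so the extracted rows are already ranked; tTopPop.Location and tTopTemp.Location hold the matching location identifiers if they are wanted as axis labels.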
It's really all there; just use the tools TMW has provided...
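For completeness, if a plain numeric matrix is really wanted (as the suggested structure in the question describes), the findgroups/splitapply route mentioned above can be sketched as follows. The data here is again synthetic, and the variable names simply mirror the asker's arrays; this is one possible arrangement, not the only one:

```matlab
% Synthetic stand-in arrays (assumed sizes and values).
rng(1);
nLoc = 200;  nRep = 10;
location = repelem((1:nLoc)', nRep);
location = location(randperm(numel(location)));  % rows need not be sorted
temp     = 15 + 10*randn(nLoc*nRep, 1);

% g is the group number of each row; locID lists the unique locations,
% in the same order as the group numbers.
[g, locID] = findgroups(location);

% One statistic per group, then assemble [min max mean], one row per location.
minT  = splitapply(@min,  temp, g);
maxT  = splitapply(@max,  temp, g);
meanT = splitapply(@mean, temp, g);
tempStats = [minT, maxT, meanT];

% Top 100 rows by maximum temperature (column 2), with their location IDs.
[~, idx] = maxk(tempStats(:,2), 100);
top100   = [locID(idx), tempStats(idx, :)];
```

No prior sort by location is needed here, since findgroups assigns the group numbers regardless of row order.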

Release

R2020a
