Plot exceeding time limit due to large dataset
I have a dataset in which I need to categorize the changes in UV levels measured by a satellite.
As the satellite orbits the Earth, there are periods during which it cannot measure the UV levels.
The raw data that I am getting is close to a square wave.
I need to split this data into three categories: rising, where the satellite comes out of the influence of the Earth;
OnDuty, where the satellite is able to record data properly; and falling, where it is going behind the Earth.
The data is sampled at 1 Hz, i.e. one value is recorded every second.
For now, I want to plot these three categories in different colors: red, green, and blue.
The dataset contains the timestamps and the corresponding UV values.
I am currently using a threshold to categorize the data. This works for a smaller dataset, but with a dataset of 80000 data points my code does not run efficiently.
Here is the code, where states(i) stores the category of the i-th difference. The problem lies in the for loop that does the plotting.
function categorize_square_wave()
y_foo = readtable("80000entries.csv", Range='uv_values');
y_values = table2array(y_foo);
k = 1; % Sensitivity parameter
differences = diff(y_values);
% comments have been put to remove statistical computation
mu = mean(differences);
sigma = std(differences);
% Thresholds
T_high = mu + k * sigma;
T_low = mu - k * sigma;
% coded directly to reduce computation
% T_high = 500;
% T_low = -100;
% Categorizing based on threshold values
% Initialize states with zeros (default to dutiful)
states = zeros(1, length(differences));
% Assign states using vectorization
states(differences > T_high) = 1; % Rising
states(differences < T_low) = -1; % Falling
% Collect dutiful values
% dutiful_values = y_values(states == 0);
figure;
hold on;
for i = 1:length(differences)
if states(i) == 1
plot([i, i+1], [y_values(i), y_values(i+1)], 'g', 'LineWidth', 2); % Green for Rising
elseif states(i) == -1
plot([i, i+1], [y_values(i), y_values(i+1)], 'r', 'LineWidth', 2); % Red for Falling
else
plot([i, i+1], [y_values(i), y_values(i+1)], 'b', 'LineWidth', 2); % Blue for Dutiful
% dutiful_values = [dutiful_values, y_values(i)]; % Collect Dutiful values
end
end
% Plot original values in dashed lines
plot(y_values, 'k--', 'LineWidth', 1); % Black dashed line for original values
yline(T_high, 'r--', 'T_{high}', 'LabelVerticalAlignment', 'bottom', 'LabelHorizontalAlignment', 'right'); % Dashed red line for T_high
xlabel('Sample Number');
ylabel('y(t)');
title('Classification of Changes in Square Wave');
hold off;
end
categorize_square_wave();
% we have a good estimation of what values are
% is there a way to use flags on each state so that we can just plot
% without checking each value?
% implement flags
Accepted Answer
Voss
on 4 Oct 2024
You can replace those ~80000 plotted red, green, and blue lines with 3 lines: one red, one green, and one blue.
Use NaNs in the plotted lines where the data is not pertinent to that line, e.g., the green "rising" line will have NaNs wherever the data is not "rising". NaNs don't render on a plotted line, so they can be used to create gaps in a line.
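As a minimal illustration of that NaN behavior (the five-point vector below is made up for this sketch and is not part of the original data), a single plot call draws two separate pieces:

```matlab
% Made-up data: the NaN at sample 3 is not rendered, which leaves a gap
y = [1 2 NaN 4 5];
figure
h = plot(y, 'g', 'LineWidth', 2);   % one line object, drawn as two pieces (1-2 and 4-5)
```

Only one graphics object is created here, whereas the loop in the question creates a separate object for every segment.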
Here's an example with made-up data consisting of 80000 datapoints:
y_values = min(0.5,max(-0.5,sin(linspace(0,4*pi,80000))));
k = 1; % Sensitivity parameter
differences = diff(y_values);
% comments have been put to remove statistical computation
mu = mean(differences);
sigma = std(differences);
% Thresholds
T_high = mu + k * sigma;
T_low = mu - k * sigma;
% Categorizing based on threshold values
% Initialize states with zeros (default to dutiful)
states = zeros(1, length(differences));
% Assign states using vectorization
states(differences > T_high) = 1; % Rising
states(differences < T_low) = -1; % Falling
% Initialize three vectors of NaNs
N = numel(y_values);
y_rising = NaN(1,N);
y_falling = NaN(1,N);
y_on = NaN(1,N);
% populate the rising data line with y_values where the data is rising
idx = find(states == 1);
y_rising(idx) = y_values(idx);
y_rising(idx+1) = y_values(idx+1);
% populate the falling data line with y_values where the data is falling
idx = find(states == -1);
y_falling(idx) = y_values(idx);
y_falling(idx+1) = y_values(idx+1);
% populate the on-duty data line with y_values where the data is on-duty
idx = find(states == 0);
y_on(idx) = y_values(idx);
y_on(idx+1) = y_values(idx+1);
% plot the three lines
figure
hold on
plot(y_rising,'g','LineWidth',2)
plot(y_falling,'r','LineWidth',2)
plot(y_on,'b','LineWidth',2)
% Plot original values in dashed lines
plot(y_values, 'k--', 'LineWidth', 1); % Black dashed line for original values
yline(T_high, 'r--', 'T_{high}', 'LabelVerticalAlignment', 'bottom', 'LabelHorizontalAlignment', 'right'); % Dashed red line for T_high
xlabel('Sample Number');
ylabel('y(t)');
title('Classification of Changes in Square Wave');
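The same idea can probably be written more compactly with logical masks instead of find. The following is only a sketch under the same made-up data; the names state_values, colors, keep, y_line and h are introduced here for illustration, and it assumes states is a row vector, as in the code above.

```matlab
% Same made-up data and thresholds as in the example above
y_values = min(0.5, max(-0.5, sin(linspace(0, 4*pi, 80000))));
k = 1;
differences = diff(y_values);
T_high = mean(differences) + k * std(differences);
T_low  = mean(differences) - k * std(differences);
states = zeros(1, length(differences));
states(differences > T_high) = 1;    % Rising
states(differences < T_low)  = -1;   % Falling

% Illustrative names (not from the original code)
state_values = [1, -1, 0];           % rising, falling, on-duty
colors = {'g', 'r', 'b'};
N = numel(y_values);
h = gobjects(1, 3);                  % handles of the three lines

figure
hold on
for s = 1:3
    mask = (states == state_values(s));      % segment i runs from sample i to i+1
    keep = [mask, false] | [false, mask];    % keep both endpoints of each segment
    y_line = NaN(1, N);                      % NaN everywhere else, so gaps appear
    y_line(keep) = y_values(keep);
    h(s) = plot(y_line, colors{s}, 'LineWidth', 2);
end
hold off
```

The loop runs three times regardless of the dataset size, so the figure still contains only three colored line objects.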
4 Comments
Voss
on 5 Oct 2024
Whatever method you decide to use to categorize the data into the three groups, you can use the approach in my answer to plot three lines.
You might want to post a new question about selecting the best method for categorizing that data.