Finding area under peaks for drifting signal

Question

0 votes

EDITED, see bottom, (regarding numerical integration)

Dear all,

How can I find the area under the "larger" peaks, as seen below?

My idea is to:

1) locate where the peaks are (I have tried using the findpeaks() function)

2) add or subtract the minimum value of the particular peak to get a baseline for each peak at y = 0.

3) use some sort of trapezoidal integration (I saw that MATLAB has a built-in function that does this).

When using the findpeaks() function, I'm having a really hard time locating "interesting" peaks. For instance, the first "big" peak has some noise in it and then findpeaks() give me a lot of peaks on it - and then there are other small peaks I'm not interested in.

I hope someone can help!

(If you need it, my data is found in the attached CSV file - I'm only using the first two columns, "Time" and "Channel1V". I have 100 CSV files like the uploaded one, so I really need a smart code ;) )

EDIT:

Q: How to find area under peak using trapz()?

So, using the arguments 'MinPeakDistance' and 'MinPeakProminence' I have been able to figure out which peaks to use. I still need to find the area under the graphs; so far, I've been using the very rough estimate Prominence times Width (where width is the width of the peak taken at the half of the prominence). It should for most purposes be alright, but I'd really appreaciate if anyone has got a good idea of how to numerically integrate this.

11 Comments
Show 9 older comments Hide 9 older comments

Christian Hjuler Christensen on 23 Jan 2018

My apologies. I thought I was clear and that you were being polemic, but from your questions in the recent post I can see you are willing to help. Sorry. I haven't analyzed these types of things before but I think I can now make it more clear:

Intuitively, everything that looks like a peak is what I want to somehow define as a peak and then find the area under it.

To be precise, I have used the arguments "MinPeakDistance" and "MinPeakProminence". The first argument is to ensure I don't have multiple "peaks" on what I consider to be a peak, e.g. the first big peak needs to be considered as one peak. The second argument has been used so that I only get the peaks of a certain size.

The actual problem arises because I would like to "stretch" or "smoothen" the data so that 1) I do not have negative values (for numerical integration), and 2) so that the "beginning" of the peaks start at zero - and I simply do not know how to that in a smart way, maybe through some sort of filtering?

The second problem is more of a coding nature: How to keep track of when a peak starts and ends. If I could figure that out, I could use the trapz() function for each of my peaks.

Thank you for taking your time and sorry for the misunderstanding.

John D'Errico on 24 Jan 2018

If you have ONE peak, so perhaps you have simply looped over the list of peaks, then just call traps on the vector from the start to end of that defined peak. You should be able to decide on the index that defines where the peak starts and where it ends.

If you have multiple peaks, then you could create a list of start points for each peak and the end points. Now, just call cumtrapz on the ENTIRE vector. Then as one operation, index into that cumulative vector with the start and end points. Subtract the results of those two vectors, the result being the area for each peak.

If you have no clue as to how to create a vector of start points, or how to work with vectors in general, or how to use indexing, then you need to read the getting started tutorials for MATLAB. (Sorry, but it is not obvious what your problems are in all this. My gut tells me that you are very new to using MATLAB from the questions asked.)

As far as smoothing a sequence of values, you could start with the function smooth. However, ANY kind of smoothing will destroy those high narrow peaks. That is the nature of smoothing algorithms, dampening high frequency components in the signal. So any smooth will cost you exactly the signals that you are trying to measure. That argues you really do NOT want to do any smoothing.

Finally, to start the sequence at the beginning of the peaks, just choose some criteria that identifies that location. Again, that criteria can only come from your mind. Then delete the part of the signal before that point.

I think your problem here is that you are trying to solve a problem too large for your current computing skills (in MATLAB) to solve. You are faced with a programming elephant. When that happens, eat a programming elephant one byte at a time. Break the problem down into small problems, each of which is solvable by you. Solve each one. Then put it all together.

Don't worry about elegance or about efficiency. Screw elegance, in favor of getting the job done. Loops are fine. When you have something that works, if it is too slow, only then do you worry about efficiency.

Here, if findpeaks does what you want, then use it. Work on each peak in turn, using a loop. Step back from the peak until you find a point that fits your criterion for the end of the peak. Do the same moving forward. That gives you two points. Call trapz. WTP? So you need to define some algorithm to look in the vicinity of a peak, forward and back. Again, a simple loop could suffice, depending on whatever criterion you define.

Christian Hjuler Christensen on 25 Jan 2018

@John D'Errico, @Star Strider: Thanks for your replies, especially John D'Errico for taking your time.

I am indeed not an expert in computing. The dataset described is only one out of many obtained by my girlfriend. She is doing a research thing on something to do with molecules in ants' brains. Due to a software issue, she has transfered her data to excel and is now doing it all manually - looking at the peaks she knows is of interest, then scrolling down the data and applying trapezoidal integration - very tidyous since she has 100 files with 16000 data points in each.

@John D'Errico:

Thank you for your time again - I have finally been able to locate the peaks of interest - done! As you say, I need to find a criteria to determine the "beginning" of each peak. This is very, very difficult I find - the baseline for the peaks is not constant and some peaks "start" at one vertical/y - value but ends at another.

I only just learned about the islocalmin() function and was hoping I could use this to find the relevant intervals - however, due to "noise", it provided me with too many minima. HOWEVER:

Maybe I could somehow, by keeping track of indices, compare the indices of the islocalmin() and the findpeaks() functions' indices. My idea is to "catch" an index from findpeaks() in between two indices of islocalmin(). This is what is currently causing me trouble.

Hopefully you have an idea, maybe not, maybe this is tiresome - in any case THANK YOU for taking your time :)

Christian Hjuler Christensen on 25 Jan 2018

@Star Strider:

Thank you for taking your time. I have no experience in fft, signal processing and filtering, but I will use a few hours tomorrow to look into it and see if I can make anything out of it. Once again, thank you for your time, and if I get smarter reading up on fft, signal processing and the like and find a method, I will let you know with much appreciation!

Star Strider on 25 Jan 2018

My pleasure.

I can help with the signal processing. You posted one of your signals, so I will download it, and experiment with it.

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Star Strider on 25 Jan 2018

Open in MATLAB Online

1 vote

I would use the Signal Processing Toolbox medfilt1() function to estimnate the base line drift, that you can then remove. (None of my other approaches, such as frequency-selective filtering or the Savitzky-Golay filter, gave satisfactory results.)

Try this to remove the baseline drift:

[D,S,R] = xlsread('420B1.1.csv');
T = D(:,1);
Ch1 = D(:,2);
Ch2 = D(:,3);
Ts = mean(diff(T));
L = length(T);
MF = medfilt1(Ch1, 750);                                                        % Median Filter
Ch1F = Ch1-MF;                                                                  % Correct Baseline
figure
plot(T,Ch1F)
grid

I leave the peak detection to you. Here is one possibility:

dCh1F = gradient(Ch1F, Ts);                                                     % Derivative
[pkp,locp] = findpeaks(dCh1F, 'MinPeakHeight',0.04, 'MinPeakDistance',500);     % Positive Derivative Peaks, ‘locp’ Are Indices
[pkn,locn] = findpeaks(-dCh1F, 'MinPeakHeight',0.03, 'MinPeakDistance',500);    % Negative Derivative Peaks, ‘locn’ Are Indices
figure
plot(T,Ch1F)
hold on
plot(T([locp; locn]),Ch1F([locp; locn]), '+r')
hold off
grid

You will have to experiment to get the result you want. This may not be robust to all your data, so you may have to adjust it for each file.

2 Comments
Show None Hide None

Christian Hjuler Christensen on 25 Jan 2018

Amazing! I really appreciate you (everyone here!) taking the time. Sorting out the baseline looks very promising! I shall do some experiments and play around with it, but determining peaks and start/end intervals for them now looks like a walk in the park!

Thanks!

Star Strider on 25 Jan 2018

As always, my pleasure!

(And for the rest of us, as well!)

Sign in to comment.

Answer 2

AJ Geiger on 24 Jan 2018

Open in MATLAB Online

0 votes

1. instead of using the function maxpeak use

[~,I] = max(f);

to obtain the index of the max peak

2. Nice! Now you know where the max peak is. The next step is to set up your boundaries, so you know where to start and stop your integration. There are several algorithms you can implement. the easiest would be simulated annealing. The algorithm goes as follows.

% Note: E is your data vector
% for the alg I ~= 1
if I == 1 % check condition
   I = 2;
end
%
%
% Preform Alg
for t = 0:length(E)-1
  % check if next point is 
  idx = I+t;
  Delta_E = E(I+t) - E(I-1+t);
  if E > 0, continue, end
  %
  % Note T: is your cooling function which
  % Converges to 0 as t increases. now this
  % is basic but T could be as simple as
  % T = 1/t or T = 1/(t^2)
  % now for tuning part to be complete you
  % should say:
  % if T == 0, break, end
  % however only use it if you develop T
  % properly.
  %
  T = 1/(t^2); % Temperature Function
  if exp(Delta_E/T) < Thresh, break, end
end

Now you have found the Right-hand side boundary. For now, you can

3. To be complete, idx_2 = idx;

4. use the function E_New = flip(E);

5. solve for I_New = I - length(E) + 1;

6. then run alg two on E_new and I_New

7. To be complete, idx_1 = idx;

8. get index vector II you wish to integrate over

II = idx_1:idx_2

9. Then solve for the area using II

Area = trapz(E(II));

10. Grab a beer if of legal age and does not effect believes (Optional)

11. Finish beer

12. Explore other integration methods.

2 Comments
Show None Hide None

Christian Hjuler Christensen on 25 Jan 2018

I was in a loop between point 10 and 11 when reading your answer ;-) I shall look into your answer soon and see if I can get it to work - thank you for taking your time!

Christian Hjuler Christensen on 25 Jan 2018

Thank you once again - I looked at it and it really inspired me. Star Strider provided an answer involving a filter which should make the interval-detection much easier, so I have opted to use that.

Sign in to comment.

Finding area under peaks for drifting signal

11 Comments
Show 9 older comments Hide 9 older comments

Accepted Answer

2 Comments
Show None Hide None

More Answers (1)

2 Comments
Show None Hide None

Categories

Tags

Community Treasure Hunt

Finding area under peaks for drifting signal

11 Comments Show 9 older comments Hide 9 older comments

Accepted Answer

2 Comments Show None Hide None

More Answers (1)

2 Comments Show None Hide None

Categories

Tags

See Also

Community Treasure Hunt

11 Comments
Show 9 older comments Hide 9 older comments

2 Comments
Show None Hide None

2 Comments
Show None Hide None