Interpolating Missing Data with Noise like Brownian Motion

Hello folks,
I have a few questions about interpolating a financial price data set.
I have a set of price points, but their frequency is not high, so I would like to interpolate between them.
For instance, consider a 24-by-2 double array of price data points spanning 24 hours.
Between each hour, I would like to interpolate at 15, 30, and 45 minutes, so three interpolated points in total; that way I now have 72 points per day.
However, I would like to insert some noise, possibly following a geometric Brownian motion or a random walk, to give a flavor of randomness to the data so that the points are not all linearly connected.
What might be the best way to do this?
I have the basic code so far for importing the data and converting it to a 24-by-2 double format:
file = "pricedata.xlsx";
Tab = readtable(file);
T = table2array(Tab);
Any suggestions?
Thanks in advance!

 Accepted Answer

"What might be the best way to do this?"
Interpolate, THEN add noise.
There is no reason to mix the noise generation with the interpolation step.
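
As a minimal sketch of that two-step idea (not the answerer's code): the hourly prices below are made up, the noise is plain white noise rather than Brownian noise, and the names tHours, tFine, and sigma are hypothetical. Note that this simple version also perturbs the original hourly points.

```matlab
% Hypothetical hourly prices (24 points); replace with a column of your own T.
rng(0);                                  % reproducible noise
tHours = (0:23)';                        % sample times in hours
price  = 100 + cumsum(randn(24, 1));     % made-up prices for illustration

% Step 1: interpolate onto a 15-minute grid.
tFine     = (0:0.25:23)';                % 93 points from hour 0 to hour 23
priceFine = interp1(tHours, price, tFine, 'linear');

% Step 2: add noise as a separate step (white noise, assumed scale sigma).
sigma      = 0.1;
priceNoisy = priceFine + sigma * randn(size(tFine));
```

Because the two steps are separate, the noise model can be swapped out without touching the interpolation.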

9 Comments

Hi Bruno,
Ok that makes sense.
Would you mind showing some example code for doing this, for example using -spline- or -ppval-?
Thanks!
Why bother using spline interpolation for a function that is not smooth? I believe Brownian motion is not differentiable anywhere.
You will do more harm than good. I would use nearest interpolation then add Brownian noise, which can drift like sqrt(T), IIRC.
Hey Bruno, thanks for the suggestion. I am quite new to this area of MATLAB. Would you mind showing an example with some data points?
This is for "normal" brownian (as opposed to harmonic)
% Generate t and data1 that are fake time/data
t = linspace(0,24*40,24*40+1); % hour
sigmaperhour = 1;
dt=mean(diff(t));
data1 = cumsum(sigmaperhour*sqrt(dt)*randn(size(t)));
n = 3600; % upsampling factor, 3 in your case
dti=dt/n;
ti = linspace(min(t),max(t),n*(length(t)-1)+1);
% unconditioned Brownian noise
noise = cumsum(sigmaperhour*sqrt(dti)*randn(size(ti)));
% condition the noise; the hypothesis is noise = 0 at all values of t
noise_at_t = interp1(ti,noise,t,'linear','extrap');
pp = interp1(t,noise_at_t,'linear','pp');
noise = noise-ppval(pp,ti);
% Upsampling + add noise
datai = interp1(t, data1, ti, 'linear', 'extrap') + noise;
% Graphical output
close all
plot(t,data1,'-r.')
hold on
plot(ti,datai,'color',0.7+[0 0 0])
legend('original data', 'upsampled data + Brownian noise')
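
For readers wondering what the conditioning step in the code above does: subtracting the piecewise-linear interpolant of the noise turns it into a Brownian bridge on each original interval, so the noise is exactly zero at the original sample times. The notation below ($W$ for the unconditioned noise, $B$ for the conditioned noise, $t_k$ for the original sample times, $\sigma$ for sigmaperhour) is my own shorthand, not part of the answer.

```latex
% Conditioned noise on one original interval t_k <= s <= t_{k+1}
B(s) = W(s) - \left[ W(t_k) + \frac{s - t_k}{t_{k+1} - t_k}\,\bigl(W(t_{k+1}) - W(t_k)\bigr) \right]

% Resulting variance: zero at both endpoints, largest at the midpoint
\operatorname{Var}\,B(s) = \sigma^2\,\frac{(s - t_k)\,(t_{k+1} - s)}{t_{k+1} - t_k},
\qquad
\max_s \sqrt{\operatorname{Var}\,B(s)} = \frac{\sigma\sqrt{t_{k+1} - t_k}}{2}
```

With sigmaperhour = 1 and hourly data, the added noise therefore has a standard deviation of at most 0.5 between two consecutive samples.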
Bruno, thanks for providing a cool example; this helps a lot. Your 'datai' has too many points added. For example, if the data frequency is daily, I want to interpolate with the Brownian motion noise at, say, 24 points at most. How would I adjust the number of interpolated points in your example to be only 9, 12, or 24?
This line
n = 3600; % upsampling factor, 3 in your case
A quick question: when you said my upsampling factor is 3, why did you insert this comment? Is it because I want 24 points between each pair of given data points?
Because I read this:
"For instance, consider 24 by 2 double price data points where it spans over 24 hours.
...
that way I now have 72 points per day."
You start with 24 data points per day and want to end up with 72. 72/24 is 3. But I might have misunderstood what you wrote.
Excellent. No, you are correct. I wanted this, but your sample code had 3600, so now it is clear to me you put 3600 for demonstration. Thanks!
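
To make the effect of n concrete, here is a condensed sketch of the same approach with n = 3, assuming 24 made-up hourly prices in place of real data. The number of output points is n*(numel(t)-1)+1, which gives 70 here rather than 72, because the last sample has no interval after it.

```matlab
rng(1);                                   % reproducible example
t     = 0:23;                             % 24 hourly time stamps (hypothetical)
data1 = 100 + cumsum(randn(size(t)));     % made-up prices
n     = 3;                                % upsampling factor
sigmaperhour = 1;

dti = mean(diff(t)) / n;                  % fine time step
ti  = linspace(min(t), max(t), n*(numel(t)-1) + 1);

% Unconditioned Brownian noise, then pin it to zero at the original times t.
noise = cumsum(sigmaperhour * sqrt(dti) * randn(size(ti)));
noise = noise - interp1(t, interp1(ti, noise, t), ti);

% Upsample and add the conditioned noise.
datai = interp1(t, data1, ti) + noise;
```

Setting n to 9, 12, or 24 changes only the number of points per interval. For a geometric flavor, one option (an assumption on my part, not something suggested in the thread) is to run the same steps on log(data1) and apply exp to the result, which keeps the interpolated prices positive.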


More Answers (1)

Try this:
rows = 24;
prices = round(100*rand(rows, 2), 2) % [Beginning of day, end of day]
% Scan down getting prices.
% Allocate 72 points per day, including the start and end.
interpPrices = zeros(rows, 72);
for row = 1 : rows
    % Add up to +/- 2 units of noise.
    noise = 2 * (2 * rand(1, 72) - 1);
    interpPrices(row, :) = round(linspace(prices(row, 1), prices(row, 2), 72) + noise, 2);
end
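
The same idea can also be written without the loop. This is a sketch under a few assumptions: random made-up prices, uniform noise that is symmetric in [-2, 2], and implicit expansion (available from R2016b). The names w, base, and nPoints are hypothetical.

```matlab
rng(2);                                         % reproducible example
rows    = 24;
nPoints = 72;
prices  = round(100 * rand(rows, 2), 2);        % hypothetical [open, close] per day

w     = linspace(0, 1, nPoints);                % interpolation weights, 1-by-72
base  = prices(:, 1) + (prices(:, 2) - prices(:, 1)) .* w;   % rows-by-72 straight lines
noise = 2 * (2 * rand(rows, nPoints) - 1);      % uniform noise in [-2, 2]
interpPrices = round(base + noise, 2);
```

Each row of base is the straight line from the opening to the closing price, so the result has the same layout as in the loop version.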

3 Comments

Hi Image, thanks for the response!
Just wanted to make sure I understand it correctly.
In your array, you start with 24 days and each day has the market open and closing prices, correct?
So each row represents a valid trading day?
Yes. prices holds the opening and closing prices for each day. You said you already have this. Then you said you wanted to go between those two values with 70 interpolated values that had a bit of noise added to them, for a total of 72 prices per day. Each row of interpPrices is a day, and there are 72 prices at various times throughout that day.
Thanks for the response and comments. I will try this method as well!

Release

R2019b
