Maximize/minimize output of weighted inputs

I am trying to find an efficient method to "optimize" a result given a set if inputs and weights e.g. a stock trading strategy. I can use brute force to test every possible combination of weights for the inputs, but the processing is intense and slow. I have heard of genetic algorithms, neural networks and optimisation but really don't know too much more about them. I am not trading the stock market myself, but using this as learning example.
My example "system": I have four numeric inputs (i) from "technical indicators" of the price time series data normalized [0, 1], I weight (w) each input [-1, 1] and then sum the values: Y = (w1*i1) + (w2*i2) + (w3*i3) + (w4*i4);
For "trading": If Y>0 then BUY, otherwise SELL. The result of this trading will yield a return R; the idea for now is to find what weights return the highest R. I also have other metrics which describe the "success" of the system that I would also like to test, such as maxmizing profit/loss, minimizing drawdown etc, but I'll start here for now.
If I set up the problem:
tic
for w1=-1:0.1:1
for w2=-1:0.1:1
for w3=-1:0.1:1
for w4=-1:0.1:1
% evaluate results of trading
% compare to previous results
% if "better" then store w1,w2,w3,w4 else discard result
end
end
end
end
toc
On my computer this processing will take about 8 hours; is there a better way? Of course, if I add more inputs, the scale of the problem and the processing time will become unmanageable.
Optimisation newbie, so please be gentle...

 Accepted Answer

Matt J
Matt J on 9 May 2014
Edited: Matt J on 9 May 2014
Depends a lot on what R as a function of w looks like. LINPROG might be appropriate if the dependence of R on w is linear.

8 Comments

R is the return on the "investment"
tr = ((Y>0)-(Y<0));
tr(isnan(tr))=0;
R = [1; tr(1:end-1).*diff(pr)];
where: Y is the output of i*w tr will be either 0 (no trades), 1 buy, or -1 sell. pr is the column vector of the stock price.
Since R is non-scalar, it's not clear in what sense you want to "maximize" it.
The value to maximize will be the last value of R, R(end)
The value to maximize will be the last value of R, R(end)
That doesn't sound right. If we define
K=diff(pr)
then R(end) is just
R(end) = sign(i(end-1,:)*w))*K(end)
This means that only one row of your i(m,n) matrix data has any impact and only the last two elements of pr have any impact.
If that's truly the problem you face, then it's easy. Just pick a particular non-zero entry E=i(end-1,j). If sign(E)=sign(K(end)) make w(j)>0 and the rest of the weights zero. Otherwise, make w(j)<0 and the rest of the weights zero.
D'oh... sorry my mistake! Bad question, your answer is right.
Of course, I need to accumulate the returns, R, over time not just find the last value, so if I try:
maxR = -inf;
i1 = rand(100,1); % some input (yet to be decided)
i2 = rand(100,1); % some input
i3 = rand(100,1); % some input
i4 = rand(100,1); % some input
pr = rand(100,1); % stock price data
tic
for w1=-1:0.1:1
for w2=-1:0.1:1
for w3=-1:0.1:1
for w4=-1:0.1:1
Y = (w1*i1)+(w2*i2)+(w3*i3)+(w4*i4);
tr = ((Y>0) - (Y<0));
tr(isnan(tr))=0;
R = [1; tr(1:end-1).*diff(pr)];
R(isnan(R))=0;
R = cumsum(R);
if R(end) > maxR
maxR = R(end);
weights = [w1 w2 w3 w4];
end
end
end
end
end
toc
disp(weights);
disp(maxR(end));
Now we can find the w which maximises R(end)?
Well, if your example problem reflects the data size and dimensions of your true problem, the loops can be eliminated in favor of the vectorized approach below. For me, this runs in the blink of an eye.
i1 = rand(100,1); % some input (yet to be decided)
i2 = rand(100,1); % some input
i3 = rand(100,1); % some input
i4 = rand(100,1); % some input
pr = rand(100,1); % stock price data
[w1,w2,w3,w4]=ndgrid(-1:.1:1);
W=[w1(:),w2(:),w3(:),w4(:)].';
I=[i1,i2,i3,i4];
I(end,:)=[];
Rcumulative=diff(pr).'*sign(I*W)+1;
[maxR,indmax]=max(Rcumulative);
weights=W(:,indmax);
number_of_solutions = nnz(maxR==Rcumulative),
Note that your objective function Rcumulative only depends on w through sign(Y). Since sign() is a piece-wise constant function, so is Rcumulative. I.e., there is likely to be an infinite continuum of maximizing weights. This is why number_of_solutions in the code above is sometimes greater than 1.
All this is for brute force search. You could also try ga() if you have the Global Optimization Toolbox. You would use the fitness function
fitnessfunc=@(w) diff(pr)*sign(I*w(:))+1
where the matrix I is as in my code above. I'm not sure how well the genetic algorithm handles piece-wise constant functions, however.
Sorry about the delay in thanking you for the answer (computer died!)
The solution you offered didn't exactly solve my issue, but it gave me a big shove in the right direction! Thanks. I'm still working on figuring why the outputs are different from my code, but that is something I can work on in my own time. I also run into a memory issue if I try to use an input data set longer than 300 points, but again, I can work on these issues in my own time.
Matt J
Matt J on 14 May 2014
Edited: Matt J on 14 May 2014
I'm still working on figuring why the outputs are different from my code
Possibly because there are multiple/infinite solutions, see the 2nd last paragraph in my last comment. Where the maxR's the same? If not, who's was higher?

Sign in to comment.

More Answers (0)

Categories

Products

Asked:

on 9 May 2014

Edited:

on 14 May 2014

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!