If the fit can be done with only 5 data points, I assume the data errors are very small. In that case, you might get a pretty accurate fit just by doing a linear fit to the log of your model:
logY = log(p(1)) - X*p(2)
This is linear in a=log(p(1)) and b=-p(2), so it can be solved with any of MATLAB's linear least-squares tools (polyfit or the backslash operator, for example), and of course it can be further vectorized across all Y(i,:):
Xcell = num2cell(reshape(Xcell, 5, 2, []), [1,2]);
If nothing else, this might give you a better beta0 than [Y(i,1) 0.001] and that, of course, could speed things up.
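Here is a minimal sketch of the vectorized log-linear fit, in case it helps. It assumes every row Y(i,:) is sampled at the same 5 X locations, that all Y values are positive (so the log is real), and that implicit expansion is available (R2016b or later). The data, pTrue, and the variable names A, ab and beta0All are made up for illustration and are not from your code.

```matlab
% Hypothetical test data: N curves of the form p1*exp(-p2*X), 5 samples each
X     = (0:4).';                            % 5x1 sample locations (assumed shared by all rows)
pTrue = [2 0.5; 3 0.1; 1.5 0.02];           % made-up [p1 p2] for each row
Y     = pTrue(:,1) .* exp(-pTrue(:,2) .* X.');   % N x 5, noise-free for this demo

% Linear model: log(Y) = a + b*X, with a = log(p1) and b = -p2
A  = [ones(numel(X),1), X];                 % 5x2 design matrix
ab = A \ log(Y.');                          % 2xN least-squares solution, one column per row of Y

% Map back to the original parameters
p1 = exp(ab(1,:)).';
p2 = -ab(2,:).';
beta0All = [p1, p2];                        % N x 2; row i is a candidate beta0 for Y(i,:)
```

Each row beta0All(i,:) could then be passed as the starting guess for your nonlinear fit of Y(i,:). Keep in mind that fitting in log space weights the errors differently than fitting Y directly, so with noisy data the two fits will not agree exactly.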