You haven't mentioned any constraints, so I'm assuming fminunc would apply to your problem. If you do have constraints, note that they, too, would have to be independent for each sub-model in order for the overall optimization to decouple.
Assuming you have the memory to store the gradient and Hessian in sparse form, it should be more efficient to solve all sub-models simultaneously; however, you would have to set up the options judiciously.
You would need to use the trust-region algorithm, which lets you exploit the block sparsity of your Hessian via the HessPattern, HessianMultiplyFcn, or HessianFcn options. Note that the trust-region method also requires you to supply your own gradient calculation (SpecifyObjectiveGradient=true).
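A minimal sketch of the options setup, assuming N independent sub-models with p parameters each stacked into one vector (the block sizes and the simple separable quadratic objective here are placeholders for your actual problem):

```matlab
% Hypothetical example: N sub-models, p parameters each, stacked into x.
N = 10; p = 3;

% Block-diagonal sparsity pattern of the Hessian (one p-by-p block per sub-model).
blocks = repmat({sparse(ones(p))}, 1, N);
Hpat   = blkdiag(blocks{:});

opts = optimoptions('fminunc', ...
    'Algorithm','trust-region', ...
    'SpecifyObjectiveGradient',true, ...  % trust-region requires a user-supplied gradient
    'HessPattern',Hpat);                  % fminunc finite-differences only the nonzero entries

% Placeholder objective: a separable quadratic returning [f, gradient].
fun = @(x) deal(0.5*(x'*x), x);

x0 = ones(N*p, 1);
[x, fval] = fminunc(fun, x0, opts);
```

If forming the Hessian pattern is itself expensive, the HessianMultiplyFcn route avoids materializing it by supplying Hessian-vector products instead.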