Struggling to distribute a cell array across several parallel workers

I have some code, part of which a structure of the form below.
for a = 1:A
for b = 1:B
for c = 1:C
for d = 1:D
for j = 1:2
cellArray = manipulate(cellArray,a,b,c,d,j,params);
end
end
end
end
end
A,B,C and D are constant.
cellArray is a large cell array, each entry of which is a large, multidimensional matrix.
The function 'manipulate' needs to know the loop parameters, a, b, c, d and j, as well as some other fixed parameters, 'params', and modifies only cellArray, simply by adding numbers to elements of its constituent matrices. The loop parameters, a, b, c, d and j specify which subset of the data is used and which subset is to have numbers added to it, with j having a slightly different status to a, b, c and d in that it partitions the cells into two distinct subsets, one of which is read and the other modified for j=1 and vice versa for j=2.
I've squeezed every last bit of performance that I can out of my code, and now I want to parallelize it. My question is, how can I distribute access to cellArray and the other parameters across parallel workers and run 'manipulate' in parallel? I have a mental picture of cellArray as a large pool of data, with the parallel workers as little robots buzzing around the pool, reading what they need and then going in and adding something to the appropriate bit of the pool. I feel like this ought to be possible...or am I just crazy??

7 Comments

As a start you would need your expression to be accessing specific parts of cellArray that are independent rather than the entire cell array.
As it is right now you would have to pass the whole cell array to each worker, but even then they would potentially be writing over the top of each other even if you were allowed to do it.
for example, something like
for a = 1:A
...
parfor j = 1:2
cellArray(j,:) = manipulate( cellArray(j,:), a, b, c, d, j, params);
end
...
end
can work because the parallel workers each access distinct parts of the cell array.
Obviously in this case with only 2 times round the for loop a parfor at that level would not make much sense though.
Unfortunately the structure of the array and the way the parts of it are accessed depends on a,b,c,d and j in a way that makes it impossible to do this, even in principle.
How about if I were to change the function 'manipulate' so that its output was, say, dcellArray, instead, so that the structure became
for a = 1:A
.....
parfor j = 1:2
dcellArray(j) = manipulate(cellarray,a,b,c,d,j)
end
for j = 1:2
cellArray(j) = cellArray(j)+dcellArray(j);
end
......
end
or some variant on this. Is that possible? It's really not what I'm after, but it would be something.
That would at least allow you to call a parfor loop, but whether it is helpful or not I'm not sure.
Have you read the following Matlab Help page?:
It may help you to understand exactly what you can and can't do with a parfor loop to restructure your algorithm and loops. In particular, slicing variables is of importance in your case.
At first glance, your algorithm does not appear to be parallelisable since you're modifying the cell array that is going to be used on the next iteration. Usually, if you write code like this, it means that the order of operations matters.
If it truly does not, then I don't see why you can't split your manipulate function in two. One bit calculate the destination (different for each iteration), one bit calculate the source.
I agree with the others. So far, there's no obvious parallel structure in your code. One remark though, is that there is no apparent logic to having your data split into multi-dimensional arrays in different cells. If all you are doing is incrementing individual elements of the arrays, you could just as well concatenate all the data into one long vector. Furthermore, you could be using linear indexing instead of the complicated multi-dimensional subscripting (a,b,c,d,j). This should make data access faster.
It also seems to me that at least a part of what manipulate() is doing is just an accumarray operation. Even if you can't parallelize the additive incrementing of your array data, you might still be able to use parfor to pre-assemble input data (subs,val) for a call to accumarray. Without a clearer picture of what's going on inside manipulate(), though, this is just speculation.
I'm starting to realize that my code, and I think the problem I'm solving, indeed doesn't have an obvious parallel structure. I have considered using linear indexing, and if you think that this will make my code faster I should give it a try.
I've written a MATLAB code to solve two player, postflop, holdem poker. The indices a,b,c, and d are the flop holdings and turn and river cards. The index j is over the two players. The function 'manipulate' does counterfactual regret minimisation on the players' strategies. It really does flicker about across the data structure in a way that isn't obviously parallelizable......although the literature suggests that it is. Looks like I have some more reading to do. Thanks for your help everyone.
Final update: I've modified the algorithm that I'm using, and now have no problem parallelising it. Thanks again.

Sign in to comment.

Answers (0)

Categories

Find more on Conway's Game of Life in Help Center and File Exchange

Products

Asked:

on 4 Sep 2014

Commented:

on 10 Sep 2014

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!