## Nested `parfor`-Loops and `for`-Loops

### Parallelizing Nested Loops

You cannot use a `parfor`-loop inside another `parfor`-loop. As an example, the following nesting of `parfor`-loops is not allowed:

```
parfor i = 1:10
    parfor j = 1:5
        ...
    end
end
```

### Tip

You cannot nest `parfor` directly within another `parfor`-loop. A `parfor`-loop can call a function that contains a `parfor`-loop, but you do not get any additional parallelism.

Code Analyzer in the MATLAB® Editor flags the use of `parfor` inside another `parfor`-loop.

You cannot nest `parfor`-loops because parallelization can be performed at only one level. Therefore, choose which loop to run in parallel, and convert the other loop to a `for`-loop.
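
As a minimal sketch of that conversion, the disallowed nesting shown earlier can be rewritten with a parallel outer loop and a serial inner loop. The array name `results` and the loop body `i * j` are placeholders chosen for illustration; they are not part of the original example.

```matlab
% Placeholder output array; one row per outer iteration.
results = zeros(10, 5);

parfor i = 1:10          % outer loop runs in parallel
    for j = 1:5          % inner loop runs serially on each worker
        results(i, j) = i * j;   % placeholder computation
    end
end
```

Here `results` is indexed by the `parfor` variable `i` and the plain inner loop variable `j`, so each worker writes only to its own rows.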

Consider the following performance issues when dealing with nested loops:

• Parallel processing incurs overhead. Generally, you should run the outer loop in parallel, because the overhead occurs only once. If you run the inner loop in parallel, then each of the multiple `parfor` executions incurs overhead. See Convert Nested for-Loops to parfor for an example of how to measure parallel overhead.

• Make sure that the number of iterations exceeds the number of workers. Otherwise, you do not use all available workers.

• Try to balance the `parfor`-loop iteration times. `parfor` tries to compensate for some load imbalance.
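
When the outer loop has fewer iterations than there are workers, one possible workaround is to merge both loops into a single `parfor`-loop over a linear index and recover the two subscripts with `ind2sub`. The following is only a sketch: the sizes `n` and `m`, the temporary vector `Xlin`, and the body `a + b` are assumptions made for illustration, and whether the approach pays off depends on how expensive each iteration is.

```matlab
n = 3;                       % assumed: fewer outer iterations than workers
m = 50;                      % assumed inner iteration count
Xlin = zeros(1, n*m);        % results stored by linear index

parfor k = 1:n*m             % n*m iterations to distribute instead of n
    [a, b] = ind2sub([n, m], k);   % recover row and column subscripts
    Xlin(k) = a + b;               % placeholder computation
end

X = reshape(Xlin, n, m);     % column-major reshape matches ind2sub ordering
```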

### Tip

Always run the outermost loop in parallel, because you reduce parallel overhead.

You can also use a function that uses `parfor` and embed it in a `parfor`-loop. Parallelization occurs only at the outer level. In the following example, call a function `MyFun.m` inside the outer `parfor`-loop. The inner `parfor`-loop embedded in `MyFun.m` runs sequentially, not in parallel.

```
parfor i = 1:10
    MyFun(i)
end

function MyFun(i)
    parfor j = 1:5
        ...
    end
end
```

### Tip

Nested `parfor`-loops generally give you no computational benefit.

### Convert Nested for-Loops to parfor

A typical use of nested loops is to step through an array, using one loop variable to index one dimension and a nested loop variable to index another dimension. The basic form is:

```
X = zeros(n,m);
for a = 1:n
    for b = 1:m
        X(a,b) = fun(a,b)
    end
end
```

The following code shows a simple example. Use `tic` and `toc` to measure the computing time needed.

```
A = 100;
tic
for i = 1:100
    for j = 1:100
        a(i,j) = max(abs(eig(rand(A))));
    end
end
toc
```
```
Elapsed time is 49.376732 seconds.
```

You can parallelize either of the nested loops, but you cannot run both in parallel. The reason is that the workers in a parallel pool cannot start or access further parallel pools.

If the loop counted by `i` is converted to a `parfor`-loop, then each worker in the pool executes the nested loops using the `j` loop counter. The `j` loops themselves cannot run as a `parfor` on each worker.

Because parallel processing incurs overhead, you must choose carefully whether you want to convert either the inner or the outer `for`-loop to a `parfor`-loop. The following example shows how to measure the parallel overhead.

First convert only the outer `for`-loop to a `parfor`-loop. Use `tic` and `toc` to measure the computing time needed. Use `ticBytes` and `tocBytes` to measure how much data is transferred to and from the workers in the parallel pool.

Run the new code, and run it again. The first run is slower than subsequent runs, because the parallel pool takes some time to start and make the code available to the workers.

```
A = 100;
tic
ticBytes(gcp);
parfor i = 1:100
    for j = 1:100
        a(i,j) = max(abs(eig(rand(A))));
    end
end
tocBytes(gcp)
toc
```
```
             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

    1            32984                  24512
    2            33784                  25312
    3            33784                  25312
    4            34584                  26112
    Total        1.3514e+05             1.0125e+05

Elapsed time is 14.130674 seconds.
```

Next convert only the inner loop to a `parfor`-loop. Measure the time needed and data transferred as in the previous case.

```
A = 100;
tic
ticBytes(gcp);
for i = 1:100
    parfor j = 1:100
        a(i,j) = max(abs(eig(rand(A))));
    end
end
tocBytes(gcp)
toc
```
```
             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

    1            1.3496e+06             5.487e+05
    2            1.3496e+06             5.4858e+05
    3            1.3677e+06             5.6034e+05
    4            1.3476e+06             5.4717e+05
    Total        5.4144e+06             2.2048e+06

Elapsed time is 48.631737 seconds.
```

If you convert the inner loop to a `parfor`-loop, both the elapsed time and the amount of data transferred are much greater than with the parallel outer loop. In this case, the elapsed time (about 48.6 seconds) is almost the same as in the nested `for`-loop example (about 49.4 seconds), whereas the parallel outer loop takes about 14.1 seconds. The additional data transfer adds parallel overhead that cancels out the speedup. Therefore, if you execute the inner loop in parallel, you get no computational benefit compared to running the serial `for`-loop.

If you want to reduce parallel overhead and speed up your computation, run the outer loop in parallel.

If you convert the inner loop instead, then each iteration of the outer loop initiates a separate `parfor`-loop. That is, the inner loop conversion creates 100 `parfor`-loops. Each of the multiple `parfor` executions incurs overhead. If you want to reduce parallel overhead, you should run the outer loop in parallel instead, because overhead only occurs once.

### Tip

If you want to speed up your code, always run the outer loop in parallel, because you reduce parallel overhead.

### Nested Loops: Requirements and Limitations

If you want to convert a nested `for`-loop to a `parfor`-loop, you must ensure that your loop variables are properly classified. For details, see Troubleshoot Variables in parfor-Loops. For proper variable classification, you must define the range of a `for`-loop nested in a `parfor`-loop by constant numbers or variables. In the following example, the invalid code does not work because the upper limit of the `for`-loop is defined by a function call. The valid code provides a workaround by first defining a broadcast or constant variable outside the `parfor`-loop.

Invalid:

```
A = zeros(100, 200);
parfor i = 1:size(A, 1)
    for j = 1:size(A, 2)
        A(i, j) = i + j;
    end
end
```

Valid:

```
A = zeros(100, 200);
n = size(A, 2);
parfor i = 1:size(A,1)
    for j = 1:n
        A(i, j) = i + j;
    end
end
```

The index variable for the nested `for`-loop must never be explicitly assigned other than in its `for` statement. When using the nested `for`-loop variable for indexing the sliced array, you must use the variable in plain form, not as part of an expression. For example, the following invalid code does not work because it indexes with `j + 1`, but the valid code does.

Invalid:

```
A = zeros(4, 11);
parfor i = 1:4
    for j = 1:10
        A(i, j + 1) = i + j;
    end
end
```

Valid:

```
A = zeros(4, 11);
parfor i = 1:4
    for j = 2:11
        A(i, j) = i + j - 1;
    end
end
```

If you use a nested `for`-loop to index into a sliced array, you cannot use that array elsewhere in the `parfor`-loop. In the following example, the invalid code does not work because `A` is sliced and indexed inside the nested `for`-loop, and then used again outside it. The valid code works because the temporary vector `v` is filled inside the nested loop and assigned to `A` outside of it.

Invalid:

```
A = zeros(4, 10);
parfor i = 1:4
    for j = 1:10
        A(i, j) = i + j;
    end
    disp(A(i, 1))
end
```

Valid:

```
A = zeros(4, 10);
parfor i = 1:4
    v = zeros(1, 10);
    for j = 1:10
        v(j) = i + j;
    end
    disp(v(1))
    A(i, :) = v;
end
```

Suppose that you use multiple `for`-loops (not nested inside each other) inside a `parfor`-loop, to index into a single sliced array. In this case, the `for`-loops must loop over the same range of values. A sliced output variable can be used in only one nested `for`-loop. In the following example, the invalid code does not work because `j` and `k` loop over different values. The valid code uses a single nested loop with a conditional to index different portions of the sliced array `A`.

Invalid:

```
A = zeros(4, 10);
parfor i = 1:4
    for j = 1:5
        A(i, j) = i + j;
    end
    for k = 6:10
        A(i, k) = pi;
    end
end
```

Valid:

```
A = zeros(4, 10);
parfor i = 1:4
    for j = 1:10
        if j < 6
            A(i, j) = i + j;
        else
            A(i, j) = pi;
        end
    end
end
```

### Nested Functions

The body of a `parfor`-loop cannot make reference to a nested function. For details, see Nested Functions (MATLAB). However, it can call a nested function by a function handle. Try the following example. Note that `A(idx) = nfcn(idx)` in the `parfor`-loop does not work. You must use `feval` to invoke the `fcn` handle in the `parfor`-loop body:

```
function A = pfeg
    function out = nfcn(in)
        out = 1 + in;
    end

    fcn = @nfcn;
    parfor idx = 1:10
        A(idx) = feval(fcn, idx);
    end
end
```
```
>> pfeg
Starting parallel pool (parpool) using the 'local' profile ...
connected to 4 workers.

ans =

     2     3     4     5     6     7     8     9    10    11
```

### Tip

If you use function handles that refer to nested functions inside a `parfor`-loop, then the values of externally scoped variables are not synchronized among the workers. For more information on handles, see Copying Objects (MATLAB).

### Nested `spmd` Statements

The body of a `parfor`-loop cannot contain an `spmd` statement, and an `spmd` statement cannot contain a `parfor`-loop.

### Break and Return Statements

The body of a `parfor`-loop cannot contain `break` or `return` statements. Consider `parfeval` or `parfevalOnAll` instead.
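
As a hedged sketch of the `parfeval` alternative, the following code submits each iteration as a separate future, collects results as they finish with `fetchNext`, and cancels the remaining work once a stop condition is met. The squaring function, the task count `numTasks`, and the threshold of 50 are hypothetical stand-ins for real work and a real exit condition.

```matlab
pool = gcp;                            % get (or start) the current parallel pool
numTasks = 10;                         % assumed number of independent tasks
f(1:numTasks) = parallel.FevalFuture;  % preallocate the array of futures

for k = 1:numTasks
    % Request 1 output from the placeholder function for input k
    f(k) = parfeval(pool, @(x) x^2, 1, k);
end

found = [];
for k = 1:numTasks
    [idx, value] = fetchNext(f);   % waits for the next unread future to finish
    if value > 50                  % hypothetical stop condition
        found = value;
        cancel(f);                 % stop futures that are still queued or running
        break                      % allowed: this is an ordinary for-loop
    end
end
```

Because the futures can finish in any order, `found` is whichever qualifying result arrives first, not necessarily the one with the smallest input.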

### P-Code Scripts

You can call P-code script files from within a `parfor`-loop, but a P-code script cannot contain a `parfor`-loop.

However, if a script introduces a variable, you cannot call this script from within a `parfor`-loop or `spmd` statement. The reason is that this script would cause a transparency violation. For more details, see Ensure Transparency in parfor-Loops.
