## Resource Sharing Guidelines for Vector Processing and Matrix Multiplication

*Resource sharing* is an area optimization in which
HDL Coder™ identifies multiple functionally equivalent resources and replaces them
with a single resource. The data is time-multiplexed over the shared resource to perform
the same operations. To learn more about how resource sharing works, see Resource Sharing.

Follow these guidelines to learn how to use resource sharing with streaming when processing 1-D vectors and 2-D matrices. Each guideline has a severity level that indicates the level of compliance requirements. To learn more, see HDL Modeling Guidelines Severity Levels.

### Use StreamingFactor for Resource Sharing of Vector Signals

**Guideline ID**

`3.1.9`

**Severity**

*Informative*

**Description**

To reduce the circuit area of a Subsystem block that performs the same computation on each element of a 1-D vector, use the Subsystem HDL block property **StreamingFactor**. For a vector signal that has N elements, set **StreamingFactor** to `N`. Time-division multiplexing then processes the elements one at a time, so the result is computed with fewer operators. The shared operators run at a clock rate that is N times faster than that of the original model.
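As a minimal sketch of applying this setting from the command line, the snippet below uses `hdlset_param` and `hdlget_param`. The model name `myModel`, the subsystem name `VectorOp`, and the vector length are placeholders chosen for illustration, not part of the shipped example.

```matlab
% Hypothetical names: replace with your own model and subsystem.
mdl    = 'myModel';
subsys = [mdl '/VectorOp'];
N      = 8;   % assumed number of elements in the vector signal

load_system(mdl);

% Stream the N-element vector through a single set of operators.
hdlset_param(subsys, 'StreamingFactor', N);

% Read the property back to confirm the setting.
disp(hdlget_param(subsys, 'StreamingFactor'))
```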

When the subsystem containing resources to be shared uses multiple vector signals with different sizes, the clock frequency is multiplied by the least common multiple of the vector sizes, which can reduce the maximum achievable target frequency. To achieve the desired frequency:

- Add logic for demultiplexing the vector signal before it enters the subsystem and for multiplexing the signal that leaves the subsystem. You can then specify a **SharingFactor** on the subsystem instead of the **StreamingFactor**.
- Pad the different vector signals to make them the same size as the vector signal that has the maximum size, and then specify the **StreamingFactor**.
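The effect of mixed vector sizes can be checked with a little arithmetic. The sketch below assumes two vector signals of 6 and 10 elements (values invented for illustration) and compares the clock-rate multiplier with and without padding. Zero padding is also an assumption here; any filler value works if the results for the padded elements are discarded.

```matlab
% Assumed vector sizes, for illustration only.
sizes = [6 10];

% Streaming both vectors in one subsystem: the rate multiplier is the
% least common multiple of the vector sizes.
rateUnpadded = lcm(sizes(1), sizes(2));

% Padding every vector to the largest size: the multiplier is that size.
maxSize         = max(sizes);
rateWithPadding = maxSize;

% Zero-pad the shorter vector so that it matches the largest one.
a       = 1:sizes(1);
aPadded = [a, zeros(1, maxSize - numel(a))];

fprintf('Rate multiplier without padding: %d\n', rateUnpadded);
fprintf('Rate multiplier with padding:    %d\n', rateWithPadding);
```

With these sizes, padding lowers the required multiplier from 30 to 10, at the cost of four unused slots for the shorter vector in each frame.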

Open the model `hdlcoder_vector_stream_gain`.

```
open_system('hdlcoder_vector_stream_gain')
set_param('hdlcoder_vector_stream_gain', 'SimulationCommand', 'Update')
```

The model accepts a 10-element vector signal as input and multiplies each element by a gain value that is one more than the previous value.

```
open_system('hdlcoder_vector_stream_gain/Gain_Stream')
```

To see the simulation results, simulate the model and open the Scope block.

```
sim('hdlcoder_vector_stream_gain')
open_system('hdlcoder_vector_stream_gain/Show Processing Time')
```

The `Gain_Stream` subsystem has **StreamingFactor** set to `10`. To generate HDL code for this subsystem, run the `makehdl` function:

```
makehdl('hdlcoder_vector_stream_gain/Gain_Stream')
```

After generating HDL code, to see the effect of the streaming optimization, open the generated model and navigate inside the `Gain_Stream` subsystem.

The vector data is serialized on the input side, and the output side parallelizes the serial data. Conversely, when the circuit to be shared is small, this optimization can increase the total circuit size. The Gain block inside the shared subsystem runs at a rate that is 10 times faster than the model base rate. Running at the faster rate avoids an increase in the subsystem latency, and the area savings on the target hardware offset the reduction in maximum achievable frequency.
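The serialize, compute, and deserialize flow can be mimicked in plain MATLAB to see why the streamed result matches the parallel one. This is a behavioral sketch only, not the generated model. The input values and the gain values `1:10` are assumptions made for illustration.

```matlab
% Behavioral sketch; input and gain values are assumed.
N     = 10;
u     = (1:N) * 0.5;   % example 10-element input vector
gains = 1:N;           % assumed gains, each one more than the previous

% Reference: fully parallel, N multipliers used in one base-rate step.
yParallel = u .* gains;

% Streamed: one multiplier reused over N fast-clock cycles.
yStreamed = zeros(1, N);
for k = 1:N                            % one iteration per fast-clock cycle
    serialIn     = u(k);               % serializer selects element k
    serialOut    = serialIn * gains(k);% the single shared multiplier
    yStreamed(k) = serialOut;          % deserializer collects the result
end

disp(isequal(yParallel, yStreamed))
```

The loop uses one multiplication per iteration, which corresponds to a single hardware multiplier clocked N times per base-rate sample.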

### Use **SharingFactor** and HDL Block Properties for Sharing Matrix Multiplication Operations

#### Guideline ID

`3.1.10`

#### Severity

*Informative*

#### Description

The Matrix Multiply block is a Product block that has the **Multiplication** block parameter set to `Matrix(*)`. In the HDL Block Properties dialog box, the HDL architecture is set to `Matrix Multiply` and you can specify the **DotProductStrategy**.

**DotProductStrategy Settings**

| DotProductStrategy | Description |
|---|---|
| `'Fully Parallel'` (default) | Performs multiplication and addition operations in parallel. `[MxN]*[NxM]` matrix multiplication requires `N*M*M` multipliers. |
| `'Parallel Multiply-Accumulate'` | Uses the Parallel architecture of the Multiply-Accumulate block to implement the matrix multiplication. This architecture runs multiple Multiply-Add blocks in parallel with accumulation. |
| `'Serial Multiply-Accumulate'` | Uses the Serial architecture of the Multiply-Accumulate block to implement the matrix multiplication. This mode oversamples by a factor of `N`, and the number of multipliers becomes `M*M`. |

To share resources and reduce the number of multipliers further, when you have multiple Matrix Multiply blocks in the same subsystem, set **DotProductStrategy** to `Fully Parallel` and specify the **SharingFactor** on the upper subsystem.
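A minimal command-line sketch of this configuration follows. It assumes a hypothetical model `myModel` with a subsystem `MatMulSub` that contains two Matrix Multiply blocks, and it assumes that the block property is exposed to `hdlset_param` under the same name as in the dialog box. The sharing factor of 4 is an arbitrary example value.

```matlab
% Hypothetical model layout: adjust the names to match your design.
mdl    = 'myModel';
subsys = [mdl '/MatMulSub'];
blocks = {[subsys '/Matrix Multiply1'], [subsys '/Matrix Multiply2']};

load_system(mdl);

% Keep each Matrix Multiply block fully parallel so that its
% multipliers are available for sharing.
for i = 1:numel(blocks)
    hdlset_param(blocks{i}, 'DotProductStrategy', 'Fully Parallel');
end

% Share the multipliers across the parent subsystem (assumed value).
hdlset_param(subsys, 'SharingFactor', 4);
```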

For multiplications of complex numbers by real numbers, the number of multipliers doubles. For multiplications of complex numbers by complex numbers, it quadruples.

**Number of Multipliers Generated by Multiplication of [MxN]*[NxM]**

| Multiplication Type | Fully Parallel/Parallel Multiply-Accumulate | Serial Multiply-Accumulate |
|---|---|---|
| Real x Real | `N*M*M` | `M*M` |
| Complex x Real | `N*M*M*2` | `M*M*2` |
| Complex x Complex | `N*M*M*4` | `M*M*4` |
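To get a feel for the numbers, the formulas in the table can be evaluated for a concrete size. The sketch below assumes `M = 4` and `N = 3`, values chosen only for illustration.

```matlab
% Multiplier counts for [MxN]*[NxM], following the table formulas.
M = 4;   % assumed example dimension
N = 3;   % assumed example dimension

% Scale factors for Real x Real, Complex x Real, Complex x Complex.
typeFactor = [1 2 4];

parallelCount = N*M*M * typeFactor;  % Fully Parallel / Parallel MAC
serialCount   = M*M   * typeFactor;  % Serial MAC, N times oversampling

types = {'Real x Real', 'Complex x Real', 'Complex x Complex'};
for i = 1:numel(types)
    fprintf('%-18s parallel: %4d   serial: %4d\n', ...
        types{i}, parallelCount(i), serialCount(i));
end
```

For these dimensions, the serial architecture uses one third of the multipliers of the parallel architectures, in exchange for running at three times the rate.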

For floating-point matrix multiplication, select **Use Floating Point**. In this case, you must use the `Fully Parallel` **DotProductStrategy**. Because this mode does not use element-wise operations and performs the multiplication and addition operations in parallel, use the **SharingFactor** instead of the **StreamingFactor** to share resources and save circuit area.

For an example that shows how to perform streaming matrix multiplication using floating-point types, see HDL Code Generation for Streaming Matrix Multiply System Object.