mapreduce
Programming technique for analyzing data sets that do not fit in memory
Syntax
Description
outds = mapreduce(ds,mapfun,reducefun,mr) optionally specifies the run-time configuration settings for mapreduce.
The mr input is the result of a call to the mapreducer function.
Typically, this argument is used with Parallel Computing Toolbox™, MATLAB® Parallel Server™, or MATLAB Compiler™. For more information, see Speed Up and Deploy MapReduce Using Other Products.
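As a minimal sketch of this syntax, the script below passes an mr object created with mapreducer(0), which requests serial execution in the MATLAB client session. The CSV file, the variable name Value, and the functions sumMapper and sumReducer are hypothetical names made up for illustration; they are not part of the mapreduce API.

```matlab
% Build a small CSV file so the sketch is self-contained (hypothetical data).
csvFile = [tempname '.csv'];
writetable(table([1;2;3;4], 'VariableNames', {'Value'}), csvFile);
ds = tabularTextDatastore(csvFile);

% mapreducer(0) requests serial execution in the MATLAB client session.
mr = mapreducer(0);
outds = mapreduce(ds, @sumMapper, @sumReducer, mr);
result = readall(outds);   % table with variables Key and Value

function sumMapper(data, ~, intermKVStore)
    % Emit one partial sum for each block read from the datastore.
    add(intermKVStore, 'Total', sum(data.Value));
end

function sumReducer(intermKey, intermValIter, outKVStore)
    % Combine the partial sums stored under this key into one total.
    total = 0;
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, intermKey, total);
end
```

Because no 'OutputFolder' is given, this sketch writes its result files to the current folder. With the other products listed above, a different mapreducer configuration would presumably take the place of mapreducer(0).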
outds = mapreduce(___,Name,Value) specifies additional options with one or more Name,Value pair arguments using any of the previous syntaxes. For example, you can specify 'OutputFolder' followed by a character vector specifying a path to the output folder.
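The following sketch shows one way the name-value syntax might be used, assuming a temporary folder as the output location. The folder path, the sample data, and the functions maxMapper and maxReducer are hypothetical. The folder is created up front as a precaution, since this page does not say whether mapreduce creates a missing folder.

```matlab
% Build a small CSV file so the sketch is self-contained (hypothetical data).
csvFile = [tempname '.csv'];
writetable(table([3;1;4;1;5], 'VariableNames', {'Value'}), csvFile);
ds = tabularTextDatastore(csvFile);

% Hypothetical output location, created before the call as a precaution.
outFolder = tempname;
mkdir(outFolder);

% 'OutputFolder' redirects the result files; 'Display','off' suppresses
% the command-line progress output.
outds = mapreduce(ds, @maxMapper, @maxReducer, ...
    'OutputFolder', outFolder, 'Display', 'off');
result = readall(outds);

function maxMapper(data, ~, intermKVStore)
    % Emit the largest value found in each block of data.
    add(intermKVStore, 'Max', max(data.Value));
end

function maxReducer(intermKey, intermValIter, outKVStore)
    % Keep the largest of the per-block maxima.
    best = -Inf;
    while hasnext(intermValIter)
        best = max(best, getnext(intermValIter));
    end
    add(outKVStore, intermKey, best);
end
```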
Examples
Input Arguments
Name-Value Arguments
Output Arguments
Tips
Debugging your mapreduce algorithms to examine how key-value pairs move through the different phases is always useful. To examine the movement of data, set breakpoints in your map and reduce functions. The breakpoints stop execution of mapreduce, allowing you to examine the current status of relevant variables, like the KeyValueStore or ValueIterator. For more information, see Debug MapReduce Algorithms.
Some recommendations to optimize mapreduce performance on any platform are:
Minimize the number of calls to the map function. The easiest approach is to increase the value of the ReadSize property of the input datastore. The result is that mapreduce passes larger blocks of data to the map function, and the datastore depletes with fewer reads.
Decrease the amount of intermediate data sent between map and reduce functions. One approach is to use unique inside a map function to combine similar keys. See Compute Mean by Group Using MapReduce for an example of this technique.
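The two performance recommendations above can be sketched together as follows. This is an illustrative example under stated assumptions, not the code from the referenced example page: the sample data, the ReadSize value of 50000, and the functions groupMapper and groupReducer are all hypothetical.

```matlab
% Build a small grouped data set so the sketch is self-contained.
csvFile = [tempname '.csv'];
T = table({'a';'b';'a';'b';'a'}, [1;2;3;4;8], ...
    'VariableNames', {'Group','Value'});
writetable(T, csvFile);
ds = tabularTextDatastore(csvFile);

% A larger ReadSize means larger blocks and fewer map function calls.
% The value 50000 is an arbitrary illustrative choice.
ds.ReadSize = 50000;

outds = mapreduce(ds, @groupMapper, @groupReducer, 'Display', 'off');
result = readall(outds);   % one row per group; Value holds the group mean

function groupMapper(data, ~, intermKVStore)
    % Use unique to collapse the block to one [sum count] pair per group,
    % so far less intermediate data reaches the reduce phase.
    [groups, ~, idx] = unique(data.Group);
    sums = accumarray(idx, data.Value);
    counts = accumarray(idx, 1);
    for k = 1:numel(groups)
        add(intermKVStore, groups{k}, [sums(k) counts(k)]);
    end
end

function groupReducer(intermKey, intermValIter, outKVStore)
    % Add up the partial [sum count] pairs, then compute the group mean.
    total = [0 0];
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, intermKey, total(1) / total(2));
end
```

To follow the debugging tip with this sketch, a breakpoint placed inside groupMapper or groupReducer would let you inspect intermKVStore or intermValIter while mapreduce is paused.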
Extended Capabilities
Version History
Introduced in R2014b