Transfer Data with Job Methods and Properties
To transfer data to a cloud cluster, you can use the AttachedFiles or JobData properties, as you do for other clusters. For example:
Place all required executable and data files in the same folder.
Specify that folder in the AttachedFiles property of the job.
Submitting your job transfers the files to the cloud and makes them available to the workers running on the cloud cluster.
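The steps above can be sketched as follows. This is a minimal illustration, not a definitive workflow: the folder name myJobFiles and the function myTaskFunction are hypothetical placeholders, and the folder is assumed to already exist and to contain myTaskFunction.m along with its data files.

```matlab
% Sketch only: "myJobFiles" and myTaskFunction are hypothetical names.
c = parcluster;                        % cluster object from the default profile
job = createJob(c);                    % independent job on that cluster
job.AttachedFiles = {'myJobFiles'};    % folder holding the code and data files
createTask(job,@myTaskFunction,1,{});  % one task, one output, no inputs
submit(job);                           % submitting transfers the attached files
```

Setting AttachedFiles before calling submit matters here, because the transfer happens at submission time.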
Data stored in job and task properties is available to the client. Therefore, your task or batch function results are accessible through the fetchOutputs function of the finished job or the OutputArguments property of the task. For batch jobs running on the cloud, access the job workspace variables with the load (Parallel Computing Toolbox) function in your client session.
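As a rough illustration of the two access paths, the sketch below runs a small batch job and reads its result from both the job and the task. The built-in function rand stands in for real work and is an assumption made only for this example. The load call is left as a comment because, to my understanding, it applies to batch jobs that run a script or expression rather than a function.

```matlab
% Sketch only: rand is a stand-in for your own batch function.
c = parcluster;                          % cluster object from the default profile
job = batch(c,@rand,1,{3});              % batch job returning one 3-by-3 matrix
wait(job);                               % block until the job finishes
out = fetchOutputs(job);                 % cell array of outputs, read from the job
taskOut = job.Tasks(1).OutputArguments;  % the same outputs, read from the task
% For a batch job that ran a script, copy its workspace variables with:
% load(job);
delete(job);                             % remove the job data when done
```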
In this example, you run a batch job that uses files on your machine and a function, divideData, on clusters in Cloud Center.
Load Data
Copy the data for this example to your current working folder by opening the supporting function prepareSupportingFiles and using the code inside.
openExample("parallel/RunBatchJobAndAccessFilesFromWorkersExample", ...
    supportingFile="prepareSupportingFiles.m")
Your current working folder now contains four files: A.dat, B1.dat, B2.dat, and B3.dat.
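Before submitting the job, you might confirm that the four files are present. The check below is an optional sketch and not part of the original example; the variable names expected and missing are chosen only for illustration.

```matlab
% Optional check: confirm the four data files are in the current folder.
expected = ["A.dat", "B" + string(1:3) + ".dat"];  % A.dat, B1.dat, B2.dat, B3.dat
missing = expected(~isfile(expected));             % names that were not found
if ~isempty(missing)
    error("Missing supporting files: %s", strjoin(missing, ", "));
end
```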
Run Batch Job
Create and discover your Cloud Center profile in MATLAB. Specify this profile as your default cluster profile. For more details, see Create and Discover Clusters.
Create a cluster object using parcluster (Parallel Computing Toolbox).
c = parcluster;
Create a batch job using batch (Parallel Computing Toolbox). Use the AttachedFiles name-value argument to transfer files from your local machine to the workers. For example, use a parallel pool with three workers and offload the computations in the divideData function.
filenames = "B" + string(1:3) + ".dat";
job = batch(c,@divideData,1,{}, ...
    Pool=3, ...
    AttachedFiles=filenames);
To block MATLAB until the job completes, use the wait (Parallel Computing Toolbox) function on the job object.
wait(job);
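After wait returns, it can be useful to confirm that the job finished cleanly before fetching outputs. The following is a sketch that assumes the job variable created earlier in this example; it inspects the job State property and the Error property of the first task, and the variable name taskError is chosen only for illustration.

```matlab
% Sketch only: assumes "job" is the batch job created earlier in this example.
if ~strcmp(job.State,'finished')
    warning("Job ended in state: %s", job.State);
end
taskError = job.Tasks(1).Error;      % empty if the task ran without error
if ~isempty(taskError)
    disp(getReport(taskError));      % show the error raised on the worker
end
```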
Retrieve Results and Clean Up Data
To retrieve the results of a batch job, use the fetchOutputs (Parallel Computing Toolbox) function. fetchOutputs returns a cell array containing the outputs of the function run in the batch job. You can also access the job workspace variables with the load (Parallel Computing Toolbox) function.
X = fetchOutputs(job)
X = 1×1 cell array
    {40×207 double}
When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.
delete(job)
clear job
For more details, see Run Batch Job and Access Files from Workers (Parallel Computing Toolbox).