Copy Data from Amazon S3 Account to Your Cluster

Tip

If your data is stored in an Amazon S3 bucket, you can use datastores in MATLAB to access it directly, without needing any storage on the cluster. For instructions on transferring S3 data from any AWS account, see Transfer Data to Amazon S3 Buckets and Access Data Using MATLAB. The alternative method described below, downloading S3 data using Cloud Center, is simpler, but you can transfer S3 data only from the AWS account authorized in Cloud Center.

To transfer individual files from an Amazon S3 bucket to the cluster machines, on the Create Cluster screen, next to Amazon S3 Data, click Add Files. Specify which files you want to make available to your cluster machines. You can specify S3 files only when creating your cluster and starting it for the first time.

When the cluster starts up, before the mjs process starts, the specified files are copied to /shared/imported on the cluster’s shared file system. See Cluster File System and Storage. Any files in gzip, tar, or zip format are automatically expanded in /shared/imported.
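Because archives are expanded automatically, it can help to check what an archive contains before adding it. The following is a minimal local sketch in Python, not part of Cloud Center: the function name preview_archive is hypothetical, and the exact layout that Cloud Center produces in /shared/imported (for example, whether it creates subfolders) is not specified here, so treat the result only as a list of member names.

```python
import tarfile
import zipfile
from pathlib import Path


def preview_archive(path):
    """Return the member names of a local archive without extracting it.

    Hypothetical helper for illustration; it does not contact Cloud Center
    or Amazon S3.
    """
    path = Path(path)
    if zipfile.is_zipfile(str(path)):
        with zipfile.ZipFile(str(path)) as zf:
            return zf.namelist()
    if tarfile.is_tarfile(str(path)):
        # The default read mode detects gzip-compressed tar files as well.
        with tarfile.open(str(path)) as tf:
            return tf.getnames()
    if path.suffix == ".gz":
        # A plain gzip file holds one compressed file; drop the .gz suffix.
        return [path.stem]
    # Not a recognized archive: assume the file is copied unchanged.
    return [path.name]
```

Run it on the local copy of each file before uploading that file to your S3 bucket.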

Note

Transferring a large amount of data from your Amazon S3 account can cause the cluster to time out during startup. If your data size exceeds approximately 5 GB, start your cluster without the S3 data transfer, then upload the necessary data from a local drive to the /shared/persisted folder on the cluster, as described in Transfer Data with Standard Utilities.
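If you keep a local copy of the data, you can estimate its size before deciding which transfer route to use. The following is a rough Python sketch under stated assumptions: the helper names are hypothetical, and the threshold reads "approximately 5 GB" as 5 * 1024^3 bytes, which is an interpretation rather than a documented limit.

```python
import os

# Assumption: "approximately 5 GB" is interpreted as 5 * 1024**3 bytes.
S3_TRANSFER_LIMIT_BYTES = 5 * 1024**3


def total_size_bytes(folder):
    """Sum the sizes of all files under a local folder, including subfolders."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def exceeds_s3_transfer_limit(size_bytes, limit_bytes=S3_TRANSFER_LIMIT_BYTES):
    """Return True if the data is larger than the assumed startup-transfer limit."""
    return size_bytes > limit_bytes
```

If exceeds_s3_transfer_limit(total_size_bytes(folder)) returns True for your data folder, the note above suggests starting the cluster without the S3 transfer and uploading the data afterward.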
