Tip
If you have data stored in an Amazon S3 bucket, then you can use datastores in MATLAB to directly access the data without needing any storage on the cluster. For details, see Transfer Data To Amazon S3 Buckets. You can also select the following storage options when creating your cluster.
Cluster Shared Storage
Persisted Storage. To request shared disk space that remains after you shut down the cluster, select a disk size. The shared storage is mounted at /shared/persisted. For details, see the table below.
Amazon S3 Data. To transfer individual files from an Amazon S3 bucket to the cluster machines, click Add Files. You can specify S3 files only when creating your cluster and starting it for the first time. When the cluster starts up, files are copied to /shared/imported. See Copy Data from Amazon S3 Account to Your Cluster.
Local Machine Storage
Volume Size: To request an Amazon EBS Volume, enter a number of GB in the box, e.g. 100. This requests a local SSD, created on each worker machine of your cluster. The SSD is mounted at /mnt/localdata. Use this option when read/write performance is critical.
EBS Snapshot ID: If you previously saved an EBS snapshot of your data on Amazon, then enter the ID. The data is copied to the SSD volume attached to each worker machine. If you provide a formatted snapshot, then the file system type must be ext3, ext4, or xfs.
For ext3 and ext4, the full volume size of the file system might not be immediately available when the instance comes online. Growing the file system to full capacity can take up to 30 minutes after the instance is online, depending on the size of the extN volume and the instance type. You can access all data in the original snapshot as soon as the cluster is online.
After selecting your storage options, click Create Cluster. For details on other cluster settings, see Create a Cloud Cluster.
All worker machines have access to local and cluster shared storage. You can use these folders for storing data generated by your jobs, and for data you want to transfer between the cluster and your client location. See Transfer Data to or from a Cloud Center Cluster. The paths are the same for all worker machines of the cluster. Changes to files and folders under /mnt/localdata are not visible to other machines. Files and folders under the /shared mount point are shared by all worker machines of your cluster. Changes made by any machine are visible to all other machines. Each folder has different longevity, as shown in the table.
Location | Size | Usage |
---|---|---|
/mnt/localdata | Specified in cluster configuration | The location of the local machine storage volume. Each worker machine gets its own copy of the data. Temporary and intermediate data can also be written to this location. Deleted when cluster is stopped. The data is not retained between cluster runs. If you have specified an EBS snapshot, then the data is copied again when the cluster is started. |
/shared/persisted | Specified at cluster creation | The location of the cluster shared persisted storage and MATLAB Job Scheduler data. This folder is shared among worker machines and is retained between cluster runs. Save data you want to retrieve on the next start of the cluster in folders and files under /shared/persisted. Deleted when cluster is deleted. |
/shared/tmp | Varies with instance type | This folder is shared among worker machines and is not retained between cluster runs. Use it to store temporary and intermediate data that must be visible or accessible from multiple cluster machines. The available storage space depends on the ephemeral storage available on the selected machine instance type. Deleted when cluster is stopped. |
/shared/imported | Part of allocation for /shared/tmp | The location of the cluster shared Amazon S3 data. Selected Amazon S3 objects are copied to this location as part of the cluster start up and are deleted on shut down. Deleted when cluster is stopped; copied again when cluster is started. |
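
As a rough illustration of how a job might use the locations in the table, the following MATLAB sketch writes a result to worker-local scratch storage first and then moves it to persisted shared storage. The function name saveJobResult, the results subfolder, and the optional root arguments are assumptions made for this example and are not part of Cloud Center. The root arguments exist only so that you can try the function on a machine that does not have the cluster mount points.

```matlab
function resultFile = saveJobResult(result, persistedRoot, scratchRoot)
% Write a result to scratch storage, then move it to persisted storage.
% The default roots are the cluster paths listed in the table above.
    if nargin < 2, persistedRoot = "/shared/persisted"; end
    if nargin < 3, scratchRoot = "/mnt/localdata"; end

    % Hypothetical subfolder name; any folder under the persisted root works.
    outDir = fullfile(persistedRoot, "results");
    if ~isfolder(outDir)
        mkdir(outDir);
    end

    % Write to worker-local scratch first. Files here are not visible to
    % other machines and are deleted when the cluster stops.
    [~, name] = fileparts(tempname);
    scratchFile = fullfile(scratchRoot, name + ".mat");
    save(scratchFile, "result");

    % Move the finished file to shared storage that is retained
    % between cluster runs.
    resultFile = fullfile(outDir, name + ".mat");
    movefile(scratchFile, resultFile);
end
```

One way to call this on the cluster, assuming the function file is available to the workers, is batch(parcluster, @saveJobResult, 1, {rand(3)}). The returned path then points to a file under /shared/persisted that is still available after the cluster restarts.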
Note:
To use /shared/tmp or /shared/imported, you must disable the dedicated headnode mode (see Use a Dedicated Headnode Instance for Management Services), and use an instance with ephemeral storage. Consult the table in Choose Supported EC2 Instance Machine Types to find out which instances have ephemeral storage.
Cloud cluster machines can share these folders only with machines of the same cluster; that is, there is no file sharing between different clusters.
You create, start, stop, and delete your cloud clusters independent of your local MATLAB® session. Deleting an associated cluster object in MATLAB does not affect the cloud cluster or its persistent storage.
When a cluster times out, it shuts down and clears the contents of /shared/tmp, /shared/imported, and /mnt/localdata, but preserves the content of /shared/persisted. If you use an automatic shutdown setting for your cluster, ensure that you have all data you need from /shared/tmp and /mnt/localdata before this timeout occurs.
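
One possible way to keep data from a temporary location before an automatic shutdown is to copy it into persisted storage as the last step of a job. The sketch below is a minimal example under that assumption. The function name preserveFiles and the folder names in the usage note are hypothetical.

```matlab
function n = preserveFiles(sourceDir, targetDir)
% Copy every file and folder in sourceDir into targetDir.
% Returns the number of top-level items copied.
    if ~isfolder(targetDir)
        mkdir(targetDir);
    end

    % List the source folder and skip the "." and ".." entries.
    items = dir(sourceDir);
    items = items(~ismember({items.name}, {'.', '..'}));

    for k = 1:numel(items)
        src = fullfile(items(k).folder, items(k).name);
        dst = fullfile(targetDir, items(k).name);
        copyfile(src, dst);   % copies files, and folders with their contents
    end
    n = numel(items);
end
```

For example, running preserveFiles("/shared/tmp/myjob", "/shared/persisted/myjob") on a worker copies the hypothetical myjob folder to storage that survives the timeout. Data under /mnt/localdata is local to each worker machine, so a copy from that location has to run on the worker that wrote the data.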
The contents of /shared/tmp are built using ephemeral storage.