Submitting batch jobs across multiple nodes using slurm

I have a workstation that I am currently using to run the following code structure:
A MATLAB script manages everything and iteratively calls a second wrapper function. Within this wrapper, I submit multiple jobs (each one a model simulation requiring one core) using the batch command, wait for them all to complete, then return some output to the main script. This works fine on my computer running 12 jobs in parallel, but each model simulation takes 2-3 hours and I am limited by the number of cores on my machine; ideally I would need to run ~50+ jobs in parallel to get reasonable run times.
I would like to get this working on the university cluster, which uses the Slurm workload manager. My problem is that each node on this cluster does not have enough cores to give much of a speedup, so I need the job to run on multiple nodes to take full advantage of the available resources. Of course, I run into a problem: the main script only needs one core, so trying to split it over several nodes makes no sense to Slurm and throws an error.
I am very much a beginner with Slurm, so presumably this is a mistake in how I configure the job submission. The script I am using is as follows:
#!/bin/bash
#SBATCH -J my_script
#SBATCH --output=/scratch/%u/%x-%N-%j.out
#SBATCH --error=/scratch/%u/%x-%N-%j.err
#SBATCH -p 24hour
#SBATCH --cpus-per-task=40
#SBATCH --nodes=2
#SBATCH --tasks=1
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user sebastian.rosier@northumbria.ac.uk
#SBATCH --exclusive
module load MATLAB/R2018a
srun -N 2 -n 1 -c 40 matlab -nosplash -nodesktop -r "my_script; quit;"
The model wrapper that submits multiple batch jobs is something like this:
c = parcluster;
for ii = 1:N
    workerTable{ii} = batch(c,'my_model',1,{my_model_opts});
end
with additional lines to check job status and get results etc.
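For reference, the status-checking and result-gathering step described above might look roughly like this (a sketch only; the variable names follow the snippet above, and the single-output assumption matches the batch call):

```matlab
% Wait for each submitted job, then collect its single output.
% workerTable is the cell array of job objects created by batch above.
for ii = 1:N
    wait(workerTable{ii});                % block until this job finishes
    out = fetchOutputs(workerTable{ii});  % cell array of the job's outputs
    results{ii} = out{1};                 % my_model was submitted with 1 output
    delete(workerTable{ii});              % remove the job's data from the cluster
end
```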
Perhaps what I am trying to do makes no sense and I need to come up with a completely different structure to my MATLAB script. Either way, any help would be much appreciated!
Sebastian

Accepted Answer

Raymond Norris on 3 Mar 2021
Edited: Raymond Norris on 4 Mar 2021
Hi Sebastian,
I'm going to assume that my_script is the code "workerTable{ii} = ..."
There are several ways to approach this, but none require that your Slurm job request >1 node.
OPTION #1
As you've written it, you could request 1 node with 40 cores. Use the local profile to submit single core batch jobs on that one node.
#!/bin/bash
#SBATCH -J my_script
#SBATCH --output=/scratch/%u/%x-%N-%j.out
#SBATCH --error=/scratch/%u/%x-%N-%j.err
#SBATCH -p 24hour
#SBATCH --cpus-per-task=40
#SBATCH --nodes=1
#SBATCH --tasks=1
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user sebastian.rosier@northumbria.ac.uk
#SBATCH --exclusive
module load MATLAB/R2018a
matlab -nodesktop -r "my_script; quit"
OPTION #2
Same Slurm script, but with my_script modified to make it a bit more streamlined (though parfeval isn't much different from your call to batch).
% Start pool
c = parcluster;
sz = str2num(getenv('SLURM_CPUS_PER_TASK'))-1;
if isempty(sz)
    sz = maxNumCompThreads-1;
end
p = c.parpool(sz);
parfor ii = 1:N
    results{ii} = my_model(my_model_opts);
end
or
% Start pool
c = parcluster;
sz = str2num(getenv('SLURM_CPUS_PER_TASK'))-1;
if isempty(sz)
    sz = maxNumCompThreads-1;
end
p = c.parpool(sz);
for ii = 1:N
    f(ii) = p.parfeval(@my_model,1,my_model_opts);
end
% Run other code
...
% Now fetch the results as they complete
for ii = 1:N
    [idx,result] = fetchNext(f);
    results{idx} = result;
end
OPTION #3
Rather than sticking with the local profile, use a Slurm profile and expand Option #2 to use a much larger parallel pool (notice that in this Slurm script we're only requesting a single core, since parpool will request the larger pool of cores). This makes use of MATLAB Parallel Server.
#!/bin/bash
#SBATCH -J my_script
#SBATCH --output=/scratch/%u/%x-%N-%j.out
#SBATCH --error=/scratch/%u/%x-%N-%j.err
#SBATCH -p 24hour
#SBATCH --cpus-per-task=1
#SBATCH --nodes=1
#SBATCH --tasks=1
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user sebastian.rosier@northumbria.ac.uk
module load MATLAB/R2018a
matlab -nodesktop -r "my_script; quit"
We'll use parfor here, but we could have used parfeval as well. This assumes a 'slurm' profile has been created. Contact Technical Support (support@mathworks.com) if you need help.
c = parcluster('slurm');
p = c.parpool(100);
parfor ii = 1:N
    results{ii} = my_model(my_model_opts);
end
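One small addition worth making once the runs complete: shutting down the pool releases the Slurm allocation rather than holding the cores until the wall-clock limit (p is the pool object from the snippet above):

```matlab
% Shut down the parallel pool so the cluster resources are released.
delete(p);
```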
  3 Comments
Raymond Norris on 4 Mar 2021
Hi Sebastian,
A couple of things.
I cleaned up my example a bit so that rather than hardcoding the size of the parpool (e.g. 40), we query Slurm for the appropriate size.
sz = str2num(getenv('SLURM_CPUS_PER_TASK'))-1;
if isempty(sz)
    sz = maxNumCompThreads-1;
end
p = c.parpool(sz);
If for some reason we're not running under Slurm, sz will be empty, so we assign it the number of cores on the machine. I decrement it to account for the MATLAB process that is also running on the machine.
Secondly, I choose 40 because I read
#SBATCH --cpus-per-task=40
#SBATCH --nodes=2
#SBATCH --tasks=1
But as I reread this, I'm guessing there are 20 cores per node, and that you were requesting 40 across 2 nodes? In any event, as I've now written it (querying SLURM_CPUS_PER_TASK), the parallel pool should size itself correctly.
N and the size of the pool don't need to be the same. If N is greater than the size of the pool, then yes, jobs will be queued. That's the advantage of using MATLAB Parallel Server: where the local pool is bound by the size of your machine, a parallel pool running on MATLAB Parallel Server allows you to scale to multiple nodes.
There are advantages and disadvantages to batch vs parpool. With batch, I can submit single-core jobs that will probably have less wait time in the queue (in this case, we're only requesting a single core), but the code must be written slightly differently. parpool requires all the cores to be available before running, but then the code is a bit more elegant.
A hybrid approach to this (submit single core jobs using a parfor syntax) is to use parfor with parforOptions, which might be the best of both worlds.
c = parcluster('slurm');
opts = parforOptions(c);
parfor (ii = 1:N, opts)
    results{ii} = my_model(my_model_opts);
end
Here, we're not starting a parallel pool, but using the parfor syntax to submit single-core batch jobs.
To answer your last question, yes, you can create a Slurm profile on your own. You can either use these instructions or contact Technical Support (support@mathworks.com) for help.
Sebastian Rosier on 4 Mar 2021
Hi Raymond,
Thanks for the detailed answer! I'll have a go implementing this on the cluster and contact support if I run into further problems.
Sebastian
