MATLAB Answers

Why am I unable to use parpool with the local scheduler or validate my local configuration of Parallel Computing Toolbox?

When I attempt to use parpool in Parallel Computing Toolbox, the pool does not open or I receive an error message. I am also unable to validate my local profile.

3 Answers

Answer by MathWorks Support Team on 5 May 2016
Edited by MathWorks Support Team on 5 May 2016
 Accepted Answer

Several issues can prevent parpool from starting. Run the tests below to make sure that your configuration is set up properly. If you receive an error message at any point, you can submit a request to Installation Support using the link at the bottom of the page. When submitting a request, be sure to include the following:
- Your license number
- The release of MATLAB and PCT
- The output of your validation (click details to get the full information)
- The results of the tests below
Also, when submitting a request, please reference Solution 1-C27YO8.
NOTE: If you are using R2008a and you are unable to use Parallel Computing Toolbox, see the bug report here:
1) Make sure your license of Parallel Computing Toolbox works
In MATLAB you can run the following command to check your license:
 
license checkout Distrib_Computing_Toolbox
You will see the output "ans = 1" if the license is working. Otherwise, you will see a license manager error. In that case, searching for the license error message should lead you to a solution.
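The same check can be scripted so that the result and any error text are captured for a support request. This is a minimal sketch that reuses the feature name from the command above; the variable names are illustrative only.

```matlab
% Feature name taken from the checkout command above.
featureName = 'Distrib_Computing_Toolbox';

% license('test', ...) reports whether a license exists without checking it out.
isLicensed = license('test', featureName);

% license('checkout', ...) tries to check out a license.
% status is 1 on success; errmsg holds the license manager message on failure.
[status, errmsg] = license('checkout', featureName);

if status == 1
    fprintf('Checkout of %s succeeded.\n', featureName);
else
    fprintf('Checkout of %s failed (license exists: %d).\n%s\n', ...
        featureName, isLicensed, errmsg);
end
```

If the checkout fails, the text in errmsg is the message to search for or to include in the request.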
2) Make sure your release of MATLAB matches the release of PCT
To check the release of your products, run the "ver" command in MATLAB. Next to each product there should be a release (ex: R2009b). Make sure that all of the releases match. (It is fine if there is a plus sign "+" next to some of the products and not others.)
If the release of Parallel Computing Toolbox does not match the release of MATLAB and your other products, you will not be able to use this configuration until the installations are at the same release.
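The comparison can also be done in code rather than by reading the "ver" listing. The sketch below assumes the products are listed under the names 'MATLAB' and 'Parallel Computing Toolbox'; adjust the names if your listing differs.

```matlab
% ver with an output argument returns a struct array with the fields
% Name, Version, Release and Date, one element per installed product.
v = ver;
names = {v.Name};

% Product names are assumptions; check them against your own ver output.
matlabIdx = find(strcmp(names, 'MATLAB'), 1);
pctIdx = find(strcmp(names, 'Parallel Computing Toolbox'), 1);

if isempty(matlabIdx) || isempty(pctIdx)
    disp('MATLAB or Parallel Computing Toolbox is not listed by ver.');
else
    % Strip any "+" marker before comparing the release strings.
    matlabRelease = strrep(v(matlabIdx).Release, '+', '');
    pctRelease = strrep(v(pctIdx).Release, '+', '');
    if strcmp(matlabRelease, pctRelease)
        fprintf('Releases match: %s\n', matlabRelease);
    else
        fprintf('Release mismatch: MATLAB %s, PCT %s\n', matlabRelease, pctRelease);
    end
end
```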
3) Disable local mpiexec
If you are using R2010a or newer, you may experience issues with the new local mpiexec implementation. In that case, try the following command to disable this feature:
distcomp.feature( 'LocalUseMpiexec', false )
This should allow the local scheduler to create and run parallel jobs and to start parpool.
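To confirm that a pool now starts, a short smoke test can help. This sketch assumes a release in which parpool is available and uses whatever profile is currently the default; the pool size of 2 is arbitrary.

```matlab
% Try to open a small pool, run a trivial parfor loop, and close the pool.
try
    pool = parpool(2);            % 2 workers on the default profile
    squares = zeros(1, 4);
    parfor k = 1:4
        squares(k) = k^2;         % trivial work to confirm the workers respond
    end
    disp(squares);
    delete(pool);                 % shut the pool down again
catch err
    % Print the message so it can be included in a support request.
    fprintf('parpool failed: %s\n', err.message);
end
```

If a pool is already open, parpool raises an error, which this sketch simply reports.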
4) Check your local scheduler configuration
No changes are needed in order to use the local scheduler, but if you have changed the configuration, you may want to reset it by creating a new local scheduler configuration. To do so:
1. Go to the Parallel menu in MATLAB and select "Manage Cluster Profiles..." ("Manage Configurations..." for R2011b or earlier) 
2. Click on Add > Custom > Local (for older releases: From the File menu, select New > Local Configuration)
3. Click the radio option in the default column to set this as the default configuration
Once complete, close the Manage Cluster Profiles (or Manage Configurations) window and try again.
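In releases that use cluster profiles, the current default can also be inspected from the command line. This sketch only lists the profiles and switches the default; it does not recreate a profile as the steps above do. It assumes the built-in profile is named 'local', which may not hold in every release.

```matlab
% List all known cluster profiles and the current default.
[allProfiles, defaultProfile] = parallel.clusterProfiles;
fprintf('Default profile: %s\n', defaultProfile);
disp(allProfiles);

% 'local' is an assumed profile name; check the list printed above.
if any(strcmp(allProfiles, 'local'))
    parallel.defaultClusterProfile('local');   % make it the default
else
    disp('No profile named ''local'' was found.');
end
```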
5) Clear the local scheduler data folder
The error you are seeing might be the result of bad local scheduler data. In that case, the local scheduler data can be removed. To do so:
1. In MATLAB, run the command "prefdir" to find your preferences folder. Ex:
 
>>prefdir
ans =
C:\Users\Administrator\AppData\Roaming\MathWorks\MATLAB\R2009b
The local scheduler data folder is one level up from the preferences folder, in a folder called "local_scheduler_data" or "local_cluster_jobs". For example:
C:\Users\Administrator\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data
2. Close MATLAB
3. Rename or delete the local_scheduler_data or local_cluster_jobs folder
4. Restart MATLAB
You can now try to run parpool to see if the data folder was the problem.
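To locate the folder before closing MATLAB, the path can be derived from prefdir. This is a minimal sketch based on the two folder names mentioned above; it only reports what exists and does not rename or delete anything.

```matlab
% The data folder sits one level above the preferences folder.
parentDir = fileparts(prefdir);

% Folder names as given in the steps above.
candidates = {'local_scheduler_data', 'local_cluster_jobs'};

for k = 1:numel(candidates)
    folder = fullfile(parentDir, candidates{k});
    if exist(folder, 'dir') == 7
        fprintf('Found: %s\n', folder);
    else
        fprintf('Not present: %s\n', folder);
    end
end
```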
6) Ensure that hostname resolution works on your computer
In order to use the local scheduler, your computer's own hostname must be resolvable. To confirm this, run the following command in MATLAB:
 
!hostname
This will give you your computer's hostname. You must be able to resolve this hostname to the computer's IP address. To test this you can run:
 
!ping <hostname>
Where <hostname> is the output of the hostname command above. If the results indicate the wrong IP address or say that your computer is an "unknown host", there is a network issue on your computer that needs to be resolved in order to use the local scheduler. In that case, ask your network administrator for help.
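The two commands can be combined into one check that runs inside MATLAB. This sketch assumes MATLAB was started with its Java virtual machine, since it uses java.net.InetAddress for the lookup instead of ping.

```matlab
% Ask the operating system for this computer's hostname.
[status, out] = system('hostname');
hostName = strtrim(out);

if status ~= 0 || isempty(hostName)
    disp('Could not determine the hostname.');
else
    try
        % Resolve the hostname to an IP address through Java.
        addr = java.net.InetAddress.getByName(hostName);
        fprintf('%s resolves to %s\n', hostName, char(addr.getHostAddress()));
    catch err
        % An unknown host ends up here.
        fprintf('Could not resolve %s: %s\n', hostName, err.message);
    end
end
```

Compare the printed address with the address you expect for this computer.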
7) Check to see if you have a startup.m file on the MATLAB path
A startup.m file may be causing the error, even if it runs fine when executed directly in MATLAB.
To see if you have a startup.m file on the MATLAB path, run the following command in MATLAB:
 
which startup.m
If a file is found, either delete it or move it outside of the MATLAB path.
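Because more than one startup.m can be on the path, it may help to list every copy. This is a small sketch using the -all option of which.

```matlab
% Return every startup.m on the MATLAB path as a cell array of full paths.
hits = which('startup.m', '-all');

if isempty(hits)
    disp('No startup.m found on the MATLAB path.');
else
    fprintf('Found %d startup.m file(s):\n', numel(hits));
    for k = 1:numel(hits)
        fprintf('  %s\n', hits{k});
    end
end
```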
If you are still unable to run parpool, run a validation of your local scheduler configuration and submit this to support. To validate:
1. Go to the Parallel menu in MATLAB and select "Manage Cluster Profiles..." ("Manage Configurations..." for R2011b or earlier)
2. Highlight your local scheduler configuration and click the "Validate" button ("Start Validation" for R2011b or earlier)
3. Once the validation completes, click the "details" link to see the results
You can then forward the validation output, the results of the tests above, and your license number to support here:

  5 Comments

Hello,
If the steps in this support solution have not helped you resolve the problem, please contact MathWorks support directly:
-Shawn
MathWorks Support
On R2015b, for the cluster to work, every node needs to have port 7 (ECHO) open.
On Microsoft Windows, the workaround (on each node and on the head node) is:
  1. Install Microsoft Simple TCP/IP Services (a Windows feature) or an equivalent ECHO daemon on port 7
  2. Configure the firewall on each node to allow TCP traffic on port 7
This allows matlab/distcomp to check cluster connectivity using java.net.InetAddress.isReachable().
Symptoms are:
  • Admin Center "Test Connectivity" fails: the machines are unable to ping each other, but all other connectivity (connecting to ports, creating sockets) works. In other words, all tests pass except for yellow results in "Node can connect to server ports" and "Nodes can connect to client"
  • Cluster profile validation fails (Cannot rerun task because there are no rerun attempts left)
  • The following code returns b = 0:
      address = java.net.InetAddress.getByName('***cluster nodes dns***')
      b = address.isReachable(10000)
  • You can launch c=parcluster('your cluster'); parpool(c,1), but not with more than 1 worker
PS: the error message when initializing jobs or spmd/parfor is "Worker exiting because of an error during PreJobEvaluate", where PreJobEvaluate checks connectivity using java.net.InetAddress.isReachable().


Answer by Tomasz Wyrowinski on 3 Feb 2018
Edited by Tomasz Wyrowinski on 3 Feb 2018

In my case, the issue was caused by OpenJDK, as stated in this answer. This did the trick for me:
Try removing any MATLAB_JAVA environment variable pointing
to the OpenJDK directory or use Oracle's Java instead.
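To see whether this applies to you, you can check from within MATLAB whether MATLAB_JAVA is set and which Java version is in use. This is a minimal sketch; it only reports the values and changes nothing.

```matlab
% An empty result means the environment variable is not set.
javaOverride = getenv('MATLAB_JAVA');

if isempty(javaOverride)
    disp('MATLAB_JAVA is not set.');
else
    fprintf('MATLAB_JAVA points to: %s\n', javaOverride);
end

% Show the Java version MATLAB is currently running with.
javaVersion = version('-java');
disp(javaVersion);
```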



Answer by MuMing Zhao on 2 Jan 2019

Step 7) in the answer above, removing the startup.m file, applied to my case (MATLAB R2017b). However, removing only the startup.m file did not work; I also had to remove the directory containing the startup.m file from the path, and then parpool worked fine.
In detail, the folder that contained the file on my computer was ~/Documents/MATLAB/, and this folder is automatically included in the default path when MATLAB starts. After I removed it from the default path directories, parpool worked without errors.
