MATLAB Answers

Why does validation fail when using the MATLAB Job Scheduler?

Why does validation fail on the "Cluster connection test" when I am using the MATLAB Job Scheduler?
Validation fails with the following error message:
Could not contact an MJS lookup service on host 'ah-smarshal'. Possible reasons for this problem are:
1. MJS has not been started, has crashed, or has been shut down.
2. A firewall is blocking communication between this computer and 'ah-smarshal'.
3. This computer cannot resolve the hostname of 'ah-smarshal', it resolves it to an incorrect IP address.
4. 'ah-smarshal' resolves its own hostname to an incorrect IP address.
5. Network routers are unable to route traffic from this computer to 'ah-smarshal'.
The hostname, ah-smarshal, corresponds to the fully qualified hostname AH-SMARSHAL.dhcp.mathworks.com.
This computer resolves it to the IP address 172.28.137.61.

Accepted Answer

MathWorks Support Team on 6 Dec 2019
Edited: MathWorks Support Team on 6 Dec 2019
The 'Cluster Connection Test' stage verifies that the MATLAB client machine can communicate with the headnode running the MATLAB Job Scheduler. It tests three things: that hostname resolution works in both directions, that the necessary ports are open and accepting traffic, and that the mjs service is running on the headnode.
If validation fails on this stage, here are some things you can troubleshoot:
1) Verify that the headnode, specified in the cluster profile, is up and running MATLAB Job Scheduler (MJS).
Confirm that the mjs service has been started on the headnode and that MJS is running. On the headnode, you can check this in Admin Center or by running $MATLAB/toolbox/parallel/bin/nodestatus. From the client machine, you can run Admin Center and enter the headnode hostname into the 'hosts' field, or run $MATLAB/toolbox/parallel/bin/nodestatus -remotehost headnode_hostname.
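If you want to script the nodestatus check, the Python sketch below assembles and runs the command. It is only an illustration: matlab_root is a hypothetical install location that you supply, and the script path and the -remotehost flag are taken from this answer rather than verified against a particular release.

```python
import posixpath
import subprocess


def nodestatus_command(matlab_root, remote_host=None):
    """Build the nodestatus command line as a list of arguments."""
    # Path layout as given in this answer; adjust for your release.
    script = posixpath.join(matlab_root, "toolbox", "parallel", "bin", "nodestatus")
    command = [script]
    if remote_host is not None:
        # Query a remote headnode instead of the local machine.
        command += ["-remotehost", remote_host]
    return command


def run_nodestatus(matlab_root, remote_host=None, timeout=60):
    """Run nodestatus and return (exit_code, combined_output)."""
    result = subprocess.run(
        nodestatus_command(matlab_root, remote_host),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout
```

Calling run_nodestatus with the headnode hostname from the cluster profile as remote_host mirrors the client-side check described above.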
2) Verify that the necessary ports are open and accepting traffic on the MATLAB client and MATLAB Parallel Server cluster.
More information on the ports required for MATLAB Parallel Server can be found here:
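As a rough way to check port reachability from the client, a short Python sketch such as the following attempts a TCP connection to each port. The function names are hypothetical, and the port numbers must come from the MathWorks port documentation for your release; none are assumed here. It only tests the direction from the machine where it runs to the target host.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hostnames.
        return False


def closed_ports(host, ports, timeout=3.0):
    """Return the subset of ports that could not be reached on host."""
    return [port for port in ports if not port_is_open(host, port, timeout)]
```

A non-empty result from closed_ports for the headnode suggests that a firewall is blocking the port or that nothing is listening on it. Run the same check from the headnode toward the client to cover the other direction.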
3) Verify that your MATLAB client is able to resolve the MJS hostname.
If 'ping' is enabled on your network, you can perform a ping test by pinging the MJS hostname from the MATLAB client. Make sure to ping the MJS headnode by the hostname specified in the cluster profile.
Similarly, you will also want to make sure that your MJS headnode can ping your MATLAB client machine. In MATLAB, if you run:
pctconfig
The output includes the hostname that the MATLAB client advertises to the MJS headnode. If MJS cannot reach the MATLAB client at that hostname, you can change the hostname the client uses by running the following:
pctconfig('hostname', 'desktop24.subnet6.companydomain.com');
The hostname can be set to either a fully qualified domain name or an IP address.
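If ping is disabled on your network, name resolution can still be inspected directly. The Python sketch below (function names are hypothetical) reports the IPv4 addresses that a hostname resolves to on the machine where it runs, and the name each address maps back to. It checks resolution only, not reachability.

```python
import socket


def forward_lookup(hostname):
    """Return the sorted IPv4 addresses this machine resolves hostname to."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name could not be resolved at all
    return sorted({info[4][0] for info in infos})


def reverse_lookup(ip_address):
    """Return the primary hostname for ip_address, or None if lookup fails."""
    try:
        return socket.gethostbyaddr(ip_address)[0]
    except OSError:
        return None


def resolution_report(hostname):
    """Collect forward and reverse lookup results for hostname."""
    addresses = forward_lookup(hostname)
    return {
        "hostname": hostname,
        "addresses": addresses,
        "reverse": {ip: reverse_lookup(ip) for ip in addresses},
    }
```

Run resolution_report on the client with the MJS hostname from the cluster profile, and on the headnode with the hostname reported by pctconfig, so that both directions are covered.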
4) Verify that the hostname of the machine running MJS resolves to the desired IP address.
Verify that the hostname of the machine that MJS is running on resolves to the correct IP address. You can do this by pinging the hostname of the MJS headnode and comparing the address it resolves to with the IP address reported on the headnode by ifconfig (Unix) or ipconfig (Windows).
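The comparison can also be done without ping. As a minimal sketch, assuming you have already read the expected address from ifconfig or ipconfig on the headnode, the hypothetical helper below checks whether a hostname resolves to that address on the machine where it runs.

```python
import ipaddress
import socket


def resolves_to(hostname, expected_ip):
    """Return True if hostname resolves to expected_ip on this machine."""
    expected = ipaddress.ip_address(expected_ip)
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False  # the name does not resolve here
    # Drop any IPv6 scope suffix (for example "%eth0") before parsing.
    resolved = {ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos}
    return expected in resolved
```

With the values from the error message in the question, the call would be resolves_to("ah-smarshal", "172.28.137.61"). Running it on both the client and the headnode covers reasons 3 and 4 in that message.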
5) Verify that your firewall is not blocking communication between the headnode, workers, and clients.
Refer to the MATLAB Parallel Server firewall article:
If you are still unable to validate your cluster please contact MathWorks support:
*NOTE*: Starting in R2019a the following name changes occurred:
  • MATLAB Distributed Computing Server was renamed to MATLAB Parallel Server
  • mdce_def was renamed to mjs_def
  • mdce binary was renamed to mjs
  • mjs scripts are in $MATLAB/R20XXx/toolbox/distcomp/bin for R2019a and earlier
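If you script any of these checks across several releases, the folder change in the note above has to be accounted for. The Python sketch below is one hypothetical way to pick the folder. It assumes that releases after R2019a use toolbox/parallel/bin, which is inferred from the nodestatus path in step 1 and not stated in the note, and it takes matlab_root as the install folder for that specific release.

```python
import posixpath
import re


def mjs_bin_dir(matlab_root, release):
    """Return the folder expected to hold the mjs scripts for a release."""
    if not re.fullmatch(r"R20\d\d[ab]", release):
        raise ValueError("expected a release name such as 'R2019a'")
    # Release names of this form sort chronologically as plain strings.
    # Assumption: releases after R2019a use toolbox/parallel/bin.
    folder = "distcomp" if release <= "R2019a" else "parallel"
    return posixpath.join(matlab_root, "toolbox", folder, "bin")
```

For the release listed on this question, mjs_bin_dir(matlab_root, "R2015a") selects the distcomp folder.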

Release

R2015a