MDCE Admin Center is OK, but parallel computing does not recognize workers

Hi all,
I have a desktop running Windows XP SP3 (x86) and a laptop running Windows 7 (x64). Both have MATLAB Distributed Computing Server installed.
I start the job manager on the XP SP3 x86 machine and a worker on both computers. In Admin Center everything checks out as OK. But validating with the Cluster Profile Manager confuses me: the validation passes, yet it does not recognize all the workers (i.e. 2).
What is the problem? Thanks in advance.

Answers (2)

Jason Ross on 19 Oct 2012
Do you have a 64-bit worker and a 32-bit worker? This is not a recommended configuration: the underlying technology that PCT/MDCS uses requires all workers to have matching word sizes and processor endianness, and a mismatch produces the kind of behavior you are seeing.
From the system requirements page:
Homogeneous cluster configurations are recommended. Parallel processing constructs that work on the infrastructure enabled by matlabpool—parfor, spmd, distributed arrays, and message passing functions—cannot be used on a heterogeneous cluster configuration. The underlying MPI infrastructure requires that all cluster computers have matching word sizes and processor endianness. A limited set of functions in Parallel Computing Toolbox can work in heterogeneous cluster configurations.
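One way to see which architectures the workers actually report is to run a small independent job that returns computer('arch') from each task. This is only a sketch under assumptions: the profile name 'profile1' is a placeholder, the task count is arbitrary, and it assumes independent jobs are among the limited set of functions that run on a heterogeneous cluster. The scheduler decides which worker runs which task, so not every worker is guaranteed to report, and the job will sit in the queue if an open pool is holding all the workers.

```matlab
% Assumption: 'profile1' is the name of your cluster profile; substitute your own.
myCluster = parcluster('profile1');

% Independent job: tasks run separately and do not use the MPI layer.
job = createJob(myCluster);

% Hypothetical task count, chosen larger than the number of workers so
% that each worker is likely (not guaranteed) to run at least one task.
numTasks = 8;
for k = 1:numTasks
    % Each task returns the architecture string of the worker that ran it,
    % for example 'win32' or 'win64'.
    createTask(job, @computer, 1, {'arch'});
end

submit(job);
wait(job);

out = fetchOutputs(job);   % numTasks-by-1 cell array of strings
disp(unique(out));         % more than one entry means mixed architectures

delete(job);               % remove the job from the cluster when done
```

If the list contains both a 32-bit and a 64-bit entry, the cluster is heterogeneous in the sense of the quoted requirement.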

Sina on 21 Oct 2012
Thanks for replying.
After many tries, I found that if the workers have the same CPU architecture (x86 or x64), they see each other in the MATLAB pool. This holds when the grid is as follows:
Physical pc virtual pc on that
*x86 xpsp3*
*x86 xpsp3*
*x86 xpsp3*
x64 win7
*x86 xpsp3*
So I have 4 workers. Now the question: when I start the cluster with 3 workers and submit a job to it, there is a job there that confuses me. Please see below:
>> myCluster.Jobs
ans =
 Job: 2-by-1
 ============
   #   ID    Type          State     FinishTime   Username   #tasks
 -------------------------------------------------------------------------
   1   134   pool          running                hormoz     3
   2   135   independent   queued                 hormoz     20
>> myCluster.Jobs(1).Tasks
ans =
 MJSTask: 3-by-1
 ================
   #   ID   State     FinishTime   Function        Error
 --------------------------------------------------------------
   1   1    running                @distcomp.nop
   2   2    running                @distcomp.nop
   3   3    running                @distcomp.nop
So my job never starts and the program hangs. Please help me.
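A possible reading of the listing above: job 134 looks like an open pool whose 3 tasks occupy the 3 started workers, which would leave no idle worker for the queued independent job 135. The sketch below is one way to check this, not a confirmed diagnosis. It assumes an MJS cluster reached through a profile named 'profile1', and that the pool was opened from the current MATLAB session.

```matlab
% Assumption: 'profile1' is the MJS cluster profile used in this thread.
myCluster = parcluster('profile1');

% How many workers are free to pick up queued tasks?
fprintf('Idle workers: %d, busy workers: %d\n', ...
    myCluster.NumIdleWorkers, myCluster.NumBusyWorkers);

% List each job with its type, state and task count.
jobs = myCluster.Jobs;
for k = 1:numel(jobs)
    fprintf('Job %d (%s): %s, %d tasks\n', ...
        jobs(k).ID, jobs(k).Type, jobs(k).State, numel(jobs(k).Tasks));
end

% If a pool opened from this session is holding the workers, closing it
% releases them so that queued jobs can start.
if matlabpool('size') > 0
    matlabpool close
end
```

If the idle worker count is 0 while a pool job is running, starting an extra worker or closing the pool should let the queued job proceed.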
  11 Comments
Sina on 4 Nov 2012
Edited: Sina on 4 Nov 2012
Dear Jason,
Thanks for all your replies. Another question has come up. The local profile uses all cores of the local CPU: 2 labs.
But when I use matlabpool with profile1 (which knows 2 computers: my computer and a virtual machine with a 2-core processor), I should have 4 labs (two PCs, each with a dual-core processor). Unfortunately, matlabpool only sees 2 labs. Is it possible to use 4 labs, and how?
Jason Ross on 5 Nov 2012
Edited: Jason Ross on 5 Nov 2012
Yes, you can start more workers on the machines and you will be able to access more labs. You can do this using Admin Center or via the "startworker" command in matlabroot\toolbox\distcomp\bin
Be careful with starting more, though -- a good starting point for worker count is one per (compute, not virtual/hyperthreaded) core and 2 GB of RAM per worker. If you start exceeding those, it's possible your performance will actually decrease as you could run out of RAM (and use much slower swap -- especially on a VM), processor capacity, network bandwidth, etc.
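As a rough illustration of that rule of thumb, the worker count for one host can be estimated from its core count and RAM. The numbers below are hypothetical placeholders, not measurements from the machines in this thread.

```matlab
% Hypothetical machine specs; replace with the real values for each host.
physicalCores = 2;   % compute cores, not hyperthreaded or virtual ones
ramGB         = 4;   % installed RAM in GB

% Rule of thumb: one worker per compute core and 2 GB of RAM per worker,
% so take whichever limit is reached first.
workersByCores   = physicalCores;
workersByRam     = floor(ramGB / 2);
suggestedWorkers = min(workersByCores, workersByRam);

fprintf('Suggested workers on this host: %d\n', suggestedWorkers);
```

Once that many workers are running on each host, a pool of the combined size can be requested, for example with matlabpool open profile1 4 for two such hosts.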
