Multiple Slaves in Parallel Computing Toolkit
- To: mathgroup at smc.vnet.net
- Subject: [mg92827] Multiple Slaves in Parallel Computing Toolkit
- From: Jason Sidabras <jason.sidabras at gmail.com>
- Date: Tue, 14 Oct 2008 04:58:28 -0400 (EDT)
Hello,
I've just purchased PCT 2.1 from Wolfram and I'm trying to get it to
work. Two-part question:
PART 1 (local multiple slaves)
My first try is simply a single local slave:
Needs["Parallel`"]
LaunchSlave["localhost"]
Parallel Computing Toolkit 2.1 (April 27, 2007)
Created by Roman E. Maeder
Subscript["slave", 1]["localhost"]
Running:
TableForm[ RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID,
$ProcessID, $Version}], TableHeadings -> {None, {"ID", "host", "OS",
"process", "Mathematica Version"}}]
This yields only one ID, which makes me assume it is working. But where is my
master? Does it not run independently?
Running CloseSlaves[] kills slave 1.
Secondly, I try to create multiple slaves. Viewing $AvailableMachines shows
{"localhost", "localhost"}
and launching them gives me:
During evaluation of In[9]:= LinkObject::linkd: Unable to communicate \
with closed link LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]. >>
During evaluation of In[9]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] appears dead.
During evaluation of In[9]:= LinkObject::linkn: Argument \
LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] in \
LinkClose[LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]] has an invalid LinkObject \
number; the link may be closed. >>
Out[11]= {Subscript["slave", 2]["localhost"], $Failed}
So I created 1 slave out of 2 available. What is going on?
PART 2 (remote slaves)
Using MathLink debugging in PCT I get:
In[7]:= LaunchSlave["<remotemachine>", "ssh -f `1` \
/usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' \
-noinit"]
During evaluation of In[7]:= MathLink: Creating listening link with \
LinkCreate[Sequence[LinkProtocol->TCPIP,LinkHost->,LinkHost->]]
During evaluation of In[7]:= MathLink: Link created as \
LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8]
During evaluation of In[7]:= MathLink: Running command "ssh -f \
<remotemachine> /usr/local/bin/math.exe -mathlink -linkmode \
Connect -linkname '4265@10.99.99.1,4266@10.99.99.1' -noinit"
During evaluation of In[7]:= MathLink: Command returned 0
During evaluation of In[7]:= \
Parallel`RemoteKernels`remoteKernel::time: Timeout waiting for \
LinkWrite (10 seconds).
During evaluation of In[7]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8] appears dead.
Out[7]= $Failed
This happens whether I use `<remotemachine>` as a local ssh connection or
a remote ssh connection, on identical computers running sshd in Cygwin,
with math.exe located at /usr/local/bin/math.exe.
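For reference, the launch template in In[7] uses the placeholders `1` and `2`, which the debug output shows being filled with the host name and the link name. A minimal sketch of that substitution follows. It uses StringForm only because StringForm has the same placeholder syntax; it is an assumption, not a statement of how PCT builds the command internally. The variable names are hypothetical, and the link name is copied from the debug output, written in the usual port@host form.

```mathematica
(* Hypothetical names; only the template text and the link name
   come from the session above. *)
template = "ssh -f `1` /usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' -noinit";
host = "<remotemachine>";  (* placeholder, as in the post *)
linkname = "4265@10.99.99.1,4266@10.99.99.1";

(* StringForm replaces `1` and `2` with its arguments, in order;
   ToString turns the result into a plain string. *)
cmd = ToString[StringForm[template, host, linkname]]
```

The resulting string can be compared with the "Running command" line in the debug output, or pasted into a shell on the master to see whether math.exe starts on the other end and tries to connect back.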
Any suggestions would be welcome.
Jason