Multiple Slaves in Parallel Computing Toolkit
Jason Sidabras
Tue, 14 Oct 2008
Hello,
I've just purchased PCT 2.1 from Wolfram and I'm trying to get it to
work. Two part question:
PART 1 (local multiple slaves)
My first try is simply a single local slave:
Needs["Parallel`"]
LaunchSlave["localhost"]
Parallel Computing Toolkit 2.1 (April 27, 2007)
Created by Roman E. Maeder
Subscript["slave", 1]["localhost"]
Running:
TableForm[ RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID,
$ProcessID, $Version}], TableHeadings -> {None, {"ID", "host", "OS",
"process", "Mathematica Version"}}]
This yields only one ID, which makes me assume it is working. But where is my
master? Does it not run independently?
Running CloseSlaves[] kills slave 1.
Secondly, I try to create multiple slaves. Viewing $AvailableMachines shows
{"localhost", "localhost"}
Which gives me:
During evaluation of In[9]:= LinkObject::linkd: Unable to communicate \
with closed link LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]. >>
During evaluation of In[9]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] appears dead.
During evaluation of In[9]:= LinkObject::linkn: Argument \
LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] in \
LinkClose[LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]] has an invalid LinkObject \
number; the link may be closed. >>
Out[11]= {Subscript["slave", 2]["localhost"], $Failed}
So I created 1 slave out of 2 available. What is going on?
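One check that might separate a MathLink problem from a PCT problem is to launch two local kernels by hand, without PCT. The following is only a sketch: the kernel path is reconstructed from the LinkObject in the messages above, and the name kernelCommand is mine, not part of PCT.

```mathematica
(* Assumption: kernel path copied from the error messages above.
   The inner quotes protect the spaces in the path. *)
kernelCommand =
  "\"C:\\Program Files (x86)\\Wolfram Research\\Mathematica\\6.0\\MathKernel\" -mathlink";

(* Launch two independent kernels over MathLink. *)
links = Table[LinkLaunch[kernelCommand], {2}];

(* A freshly launched kernel normally sends an InputNamePacket first. *)
LinkRead /@ links

(* Ask each kernel for its process ID. Unevaluated keeps $ProcessID
   from being evaluated in the master before it is sent. *)
Scan[LinkWrite[#, Unevaluated[EvaluatePacket[$ProcessID]]] &, links];
LinkRead /@ links

(* Close both links when done. *)
LinkClose /@ links;
```

If the second LinkLaunch or LinkRead fails here as well, the problem is probably not specific to PCT.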
PART 2 (remote slaves)
Using MathLink debugging in PCT, I get:
In[7]:= LaunchSlave["<remotemachine>", "ssh -f `1` \
/usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' \
-noinit"]
During evaluation of In[7]:= MathLink: Creating listening link with \
LinkCreate[Sequence[LinkProtocol->TCPIP,LinkHost->,LinkHost->]]
During evaluation of In[7]:= MathLink: Link created as \
LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8]
During evaluation of In[7]:= MathLink: Running command "ssh -f \
<remotemachine> /usr/local/bin/math.exe -mathlink -linkmode \
Connect -linkname '4265@10.99.99.1,4266@10.99.99.1' -noinit"
During evaluation of In[7]:= MathLink: Command returned 0
During evaluation of In[7]:= \
Parallel`RemoteKernels`remoteKernel::time: Timeout waiting for \
LinkWrite (10 seconds).
During evaluation of In[7]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8] appears dead.
Out[7]= $Failed
This happens whether I use `<remotemachine>` as a local ssh connection or
a remote ssh connection, on identical computers running sshd under Cygwin,
with math.exe located at /usr/local/bin/math.exe
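A manual test of the same TCPIP connection, without PCT or the ssh template, might show whether the remote kernel can reach the master at all. This is a sketch only; linkName is a placeholder for whatever link name the master reports, and the remote half has to be typed into a kernel started by hand on the remote machine.

```mathematica
(* On the master: open a listening TCPIP link and note its name. *)
link = LinkCreate[LinkProtocol -> "TCPIP"];
link[[1]]   (* the link name, of the form "port@host,port@host" *)

(* On the remote machine, in a kernel started by hand over ssh.
   linkName is a placeholder for the string shown on the master. *)
(* remote = LinkConnect[linkName, LinkProtocol -> "TCPIP"]; *)
(* LinkWrite[remote, "hello"];                              *)

(* Back on the master: LinkRead blocks until the remote side has
   written, so it will hang if the connection cannot be made. *)
LinkRead[link]

LinkClose[link];
```

If LinkRead never returns, the host or ports in the link name may not be reachable from the remote machine, which would be consistent with the timeout above.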
Any suggestions would be welcome.
Jason