Multiple Slaves in Parallel Computing Toolkit
- To: mathgroup at smc.vnet.net
- Subject: [mg92827] Multiple Slaves in Parallel Computing Toolkit
- From: Jason Sidabras <jason.sidabras at gmail.com>
- Date: Tue, 14 Oct 2008 04:58:28 -0400 (EDT)
Hello,

I've just purchased PCT 2.1 from Wolfram and I'm trying to get it to work. This is a two-part question.

PART 1 (local multiple slaves)

My first try is simply a single local slave:

Needs["Parallel`"]
LaunchSlave["localhost"]

Parallel Computing Toolkit 2.1 (April 27, 2007)
Created by Roman E. Maeder

Subscript["slave", 1]["localhost"]

Running:

TableForm[
 RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID, $ProcessID, $Version}],
 TableHeadings -> {None, {"ID", "host", "OS", "process", "Mathematica Version"}}]

yields only one ID, which makes me assume it is working. But where is my master? Does it not run independently? Running CloseSlaves[] kills slave 1.

Second, I try to create multiple slaves. $AvailableMachines is

{"localhost", "localhost"}

and the attempt gives me:

During evaluation of In[9]:= LinkObject::linkd: Unable to communicate \
with closed link LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]. >>

During evaluation of In[9]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] appears dead.

During evaluation of In[9]:= LinkObject::linkn: Argument \
LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] in \
LinkClose[LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]] has an invalid LinkObject \
number; the link may be closed. >>

Out[11]= {Subscript["slave", 2]["localhost"], $Failed}

So only one slave was created out of the two available. What is going on?

PART 2 (remote slaves)

Using MathLink debugging in PCT, I get:

In[7]:= LaunchSlave["<remotemachine>", "ssh -f `1` \
/usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' \
-noinit"]

During evaluation of In[7]:= MathLink: Creating listening link with \
LinkCreate[Sequence[LinkProtocol->TCPIP,LinkHost->,LinkHost->]]

During evaluation of In[7]:= MathLink: Link created as \
LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8]

During evaluation of In[7]:= MathLink: Running command "ssh -f \
<remotemachine> /usr/local/bin/math.exe -mathlink -linkmode \
Connect -linkname '4265@10.99.99.1,4266@10.99.99.1' -noinit"

During evaluation of In[7]:= MathLink: Command returned 0

During evaluation of In[7]:= \
Parallel`RemoteKernels`remoteKernel::time: Timeout waiting for \
LinkWrite (10 seconds).

During evaluation of In[7]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8] appears dead.

Out[7]= $Failed

This happens whether I use `<remotemachine>` as a local ssh connection or as a remote ssh connection. The computers are identical and run sshd under Cygwin, with math.exe at /usr/local/bin/math.exe.

Any suggestions would be welcome.

Jason