MathGroup Archive 2008

Multiple Slaves in Parallel Computing Toolkit

  • To: mathgroup at smc.vnet.net
  • Subject: [mg92827] Multiple Slaves in Parallel Computing Toolkit
  • From: Jason Sidabras <jason.sidabras at gmail.com>
  • Date: Tue, 14 Oct 2008 04:58:28 -0400 (EDT)

Hello,

I've just purchased PCT 2.1 from Wolfram and I'm trying to get it to
work. I have a two-part question:

PART 1 (local multiple slaves)
My first try is simply a local slave:

Needs["Parallel`"]
LaunchSlave["localhost"]

Parallel Computing Toolkit 2.1 (April 27, 2007)
Created by Roman E. Maeder
Subscript["slave", 1]["localhost"]

Running:
TableForm[ RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID,
$ProcessID, $Version}],  TableHeadings -> {None, {"ID", "host", "OS",
"process",  "Mathematica Version"}}]

This yields only one ID, which makes me assume it is working. But where is my
master? Does it not run independently?

Running CloseSlaves[] kills slave 1.

Second, I try to create multiple slaves, with $AvailableMachines being
{"localhost", "localhost"}

Which gives me:

During evaluation of In[9]:= LinkObject::linkd: Unable to communicate \
with closed link LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]. >>

During evaluation of In[9]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] appears dead.

During evaluation of In[9]:= LinkObject::linkn: Argument \
LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8] in \
LinkClose[LinkObject["C:\Program Files (x86)\Wolfram \
Research\Mathematica\6.0\MathKernel",11,8]] has an invalid LinkObject \
number; the link may be closed. >>

Out[11]= {Subscript["slave", 2]["localhost"], $Failed}

So I created 1 slave out of 2 available. What is going on?

PART 2 (remote slaves)
Using MathLink debugging in PCT I get:

In[7]:= LaunchSlave["<remotemachine>", "ssh -f `1` \
/usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' \
-noinit"]

During evaluation of In[7]:= MathLink: Creating listening link with \
LinkCreate[Sequence[LinkProtocol->TCPIP,LinkHost->,LinkHost->]]

During evaluation of In[7]:= MathLink: Link created as \
LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8]

During evaluation of In[7]:= MathLink: Running command "ssh -f \
<remotemachine> /usr/local/bin/math.exe -mathlink -linkmode \
Connect -linkname '4265@10.99.99.1,4266@10.99.99.1' -noinit"

During evaluation of In[7]:= MathLink: Command returned 0

During evaluation of In[7]:= \
Parallel`RemoteKernels`remoteKernel::time: Timeout waiting for \
LinkWrite (10 seconds).

During evaluation of In[7]:= \
Parallel`RemoteKernels`LaunchKernel::rdead: Remote kernel connected \
through LinkObject[4265@10.99.99.1,4266@10.99.99.1,9,8] appears dead.

Out[7]= $Failed

This happens whether I use `<remotemachine>` as a local ssh connection or
as a remote ssh connection, on identical computers running sshd in cygwin
with math.exe located at /usr/local/bin/math.exe.
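
For reference, the logged command is consistent with the `1` and `2`
slots of the launch template being filled with the host name and the
link name. A minimal sketch of that substitution, using StringForm; this
is an assumption about how the slots are filled, not PCT's actual code,
and the names template, host, linkName and command are hypothetical:

```mathematica
(* Launch template copied from the LaunchSlave call above. *)
template = "ssh -f `1` /usr/local/bin/math.exe -mathlink -linkmode Connect -linkname '`2`' -noinit";

(* Illustrative values only: the host placeholder from the post and a
   link name of the port@host form reported for the listening link. *)
host = "<remotemachine>";
linkName = "4265@10.99.99.1,4266@10.99.99.1";

(* StringForm fills `1` and `2` with its arguments in order;
   ToString converts the result into an ordinary string. *)
command = ToString[StringForm[template, host, linkName]]
```

Comparing the resulting string with the command shown in the debugging
output is a quick check that the template is expanded as intended.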

Any suggestions would be welcome.

Jason

