Failover and high availability
- To: mathgroup at smc.vnet.net
- Subject: [mg103443] Failover and high availability
- From: "Andreas Agas" <aagas at ix.netcom.com>
- Date: Mon, 21 Sep 2009 19:27:37 -0400 (EDT)
I need to build an application that, simply described: 1) reads input data streams from a data service via J/Link; 2) analyzes the data in Mathematica; 3) sends its calculations via J/Link to downstream systems; and 4) receives messages confirming that the downstream systems have received the output, along with messages on the states of the downstream systems relative to that output.

Both input and output will require defining and running, in Mathematica, timed or conditional "events" and monitoring heartbeats from the upstream and downstream systems, so that the Mathematica program knows whether it can access the services or whether it needs to restart their respective APIs and reconnect. Once up and running, the Mathematica application will process lots of streaming data and hold lots of data in RAM, but will do very little reading from or writing to hard disk. I'm not quite certain how to do all of this yet, but I'll leave that for subsequent questions.

The question of today: the system specifications require high availability and failover. The application will run on one primary Apple Xserve and needs to "mirror" or replicate in real time to two other identical servers, each in a different remote location -- three servers synced as close to real time as possible. These servers will be dedicated to this use; they won't have responsibilities to do anything else, and I'll keep them as simply configured as possible. I can run a heartbeat on the primary Xserve; if it fails, that should trigger an event on one of the others to kick in and take over for it.

A number of solutions approach this at the level of Apple's operating system management (heartbeatd on the Xserve, failoverd, failovernotifyd and failover scripts), but I wondered whether I need to do that at all. Since the servers will be otherwise identical and have almost no required data on disk, could I just sync the Mathematica kernels between the three servers? Does Mathematica have any way to do that? It seems like it would, given how it handles parallel computations with such élan; I'm just not certain how to access or control it.

Does anyone have any ideas about the issues involved in this, how to think about the problem, and/or how to go about it?

Thanks
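
The heartbeat-and-takeover behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not Mathematica code and not how Apple's failoverd works; the class name `HeartbeatMonitor`, the server names, the 3-second timeout and the "promote the next standby in order" rule are all assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat from the active server and decides
    when a standby should take over (hypothetical sketch)."""

    def __init__(self, standbys, timeout=3.0, clock=time.monotonic):
        self.standbys = list(standbys)  # ordered standby names (assumed)
        self.timeout = timeout          # seconds of silence tolerated (assumed)
        self.clock = clock              # injectable clock, useful for testing
        self.last_beat = clock()
        self.active = "primary"

    def beat(self):
        """Record a heartbeat received from the currently active server."""
        self.last_beat = self.clock()

    def check(self):
        """Promote the next standby if the active server has been silent
        for longer than the timeout; return the name of the active server."""
        silent_for = self.clock() - self.last_beat
        if silent_for > self.timeout and self.standbys:
            self.active = self.standbys.pop(0)
            self.last_beat = self.clock()  # give the new server a fresh window
        return self.active
```

A monitoring loop would call `beat()` whenever a heartbeat message arrives and `check()` on a timer. The sketch covers only failure detection and the choice of successor; it says nothing about keeping the in-memory state of the three kernels in sync, which is the open part of the question.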