Make a rough model of the dataflow graph of what you do, and then a rough
estimate of the machine cycles for the "compute-intensive" portions of your
work. Typical I/O-coupled von Neumann machines really stink at scalar matrix
work (e.g., y = mx + b correction of a B&W scientific image) and get really
bad for complex arrays. The dataflow graph of the arithmetic unit(s) is almost
always the bottleneck, since they cannot support pipelined operations with a
result per clock for an array. It might be a few thousand times faster to look
for a DMA-coupled, pipelined arithmetic unit that a single processor can do
"control flow" nursemaiding for. (There is a rough sketch of that kind of
estimate at the end of this message.)

I/O-coupled general-purpose machines are not especially good for problems
dominated by a few distinct dataflow graphs, especially those that might
include complex variables. They are good if the control flow and data graphs
change at very high frequency, so that setting up one or more multi-stage,
complex pipelined processors becomes the time consumer.

You may find some scientific job-control packages for systems that have
special arithmetic units. These may require manual job control for "favorite"
dataflow graphs. Otherwise it may just be a cluster or circle jerk with
trivial differences in intensive I/O churning between CPU, FPU, and other
resources.

I think job control is much less important than support for the various
"classic" dataflow graphs. Some architectural choices can make a 4,000 to
10,000x difference in runtime for matrix-intensive dataflows. If you can't
make a factor-of-ten difference through architectural choices, the "which one"
issue may be scientifically trivial... IMHO. Find a way to assess real "bang
for the buck", and avoid the trivia and speculation.

Chuck

> -----Original Message-----
> From: tclug-list-bounces at mn-linux.org
> [mailto:tclug-list-bounces at mn-linux.org] On Behalf Of Mike Miller
> Sent: Sunday, August 14, 2005 12:23 PM
> To: TCLUG List
> Subject: [tclug-list] Linux clusters (was "Debian v. Gentoo v. Ubuntu")
>
>
> On Sun, 14 Aug 2005, Ken Fuchs wrote:
>
> > I still think a cluster is what you need. There are several Linux
> > cluster distributions that are available on Live CDs that are easy to
> > setup. They should also have hard drive install options.
>
> [snip]
>
> > How well a cluster will work depends on how fast the interconnect is and
> > how much information must flow between nodes. These same issues will
> > need to be addressed in a single system with several multi-core CPUs,
> > although the interconnect is bound to be much faster in a single
> > (non-cluster) system.
>
> Why do I need a cluster? I don't fully understand how clusters can work
> for me, but I am very interested. In fact, I have been thinking that I
> would expand the system in the future to include other machines in a
> cluster configuration, but right now I think one might be enough. If I
> have a single machine with multiple sockets/cores, the OS will
> transparently handle the multiple jobs and level the load across the
> cores. So the single-machine-multiple-core setup is easy to work with.
> I'm not as confident about the cluster setup.
>
> In case someone on this list has some experience or knowledge in this
> area, here are some questions:
>
> (1) How are jobs handled in a Linux cluster? Obviously, someone logs into
> one machine and jobs are sent to that machine or to other machines, but
> how does the system decide where to send a job? To submit a job, does the
> user have to specify any parameters to determine where/how it will run?
>
> (2) If different nodes in the cluster have different amounts of RAM
> available, how does the system decide where the more memory-intensive jobs
> will run? If it does this automatically, ignoring memory requirements, is
> it possible to request that a certain job go to a certain node?
>
> In general, I need to understand job control in the cluster system to
> understand how that system can be used for the kind of work I do.
>
> Thanks!
>
> Mike
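
As a footnote to the cycle-estimate suggestion above, here is a minimal
sketch (plain C) of the kind of inner loop I mean. The 1024x1024 image size
and the gain/offset values are just assumptions for illustration; the point
is that you count operations per frame before you argue about hardware.

/* Minimal sketch: y = m*x + b correction of a B&W scientific image.
 * Assumes a 1024x1024 image of 16-bit pixels and made-up calibration
 * values; real code would get these from your instrument/calibration.
 */
#include <stdio.h>
#include <stdlib.h>

#define WIDTH  1024
#define HEIGHT 1024

int main(void)
{
    size_t n = (size_t)WIDTH * HEIGHT;
    unsigned short *raw = calloc(n, sizeof *raw);
    float *corrected = malloc(n * sizeof *corrected);
    float m = 1.02f;   /* assumed gain   */
    float b = -37.5f;  /* assumed offset */

    if (!raw || !corrected)
        return 1;

    /* ... fill raw[] from the camera or a file ... */

    /* The compute-intensive part: one multiply and one add per pixel,
     * so roughly 2 * 1024 * 1024 = ~2M floating-point operations per
     * frame, plus one load and one store per pixel. On an I/O-coupled
     * scalar CPU the loads/stores and loop overhead usually dominate;
     * a pipelined unit sustaining a result per clock is bounded by
     * about n cycles instead.
     */
    for (size_t i = 0; i < n; i++)
        corrected[i] = m * (float)raw[i] + b;

    printf("~%zu pixels, ~%zu flops per frame\n", n, 2 * n);
    free(raw);
    free(corrected);
    return 0;
}

Multiply the per-frame count by your frame rate, compare it with what a
candidate machine can actually sustain (not its peak number), and you have
the "bang for the buck" figure before buying anything.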
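
On the job-control questions in the quoted message: on most Linux clusters
there are two layers. A batch scheduler (PBS/Torque, Sun Grid Engine, or
similar, depending on the cluster distribution) picks the nodes based on the
resource requests in your submit script, which is also where you would ask
for a node with enough RAM or pin a job to a specific node. The program
itself then spreads its work across the nodes it was given, most commonly
with MPI. Below is a minimal MPI sketch in C; the vector length and the
"work" are placeholders, not a recipe for any particular scheduler.

/* Minimal sketch of how work is typically spread across cluster nodes
 * with MPI. The loop body is placeholder arithmetic; a real job would
 * do your actual per-node computation. Compile with mpicc and launch
 * with mpirun (details depend on the MPI implementation and the batch
 * system on your cluster).
 */
#include <mpi.h>
#include <stdio.h>

#define N_PER_NODE 1000000

int main(int argc, char **argv)
{
    int rank, size;
    double local = 0.0, total = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which node am I?    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many nodes?     */

    /* Each node works on its own slice; no data crosses the
     * interconnect during this loop. */
    for (long i = 0; i < N_PER_NODE; i++)
        local += (double)i;                /* placeholder work    */

    /* The only inter-node traffic: one 8-byte number per node.
     * This is the part whose cost depends on the interconnect. */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("nodes: %d, total: %g\n", size, total);

    MPI_Finalize();
    return 0;
}

The interconnect issue Ken raised shows up at that MPI_Reduce call: if a job
only exchanges a few numbers per step, a cheap Ethernet interconnect is
fine; if whole arrays have to cross the wire every iteration, a single
multi-socket shared-memory box will likely win.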