This is a perfect use of EC2. Memory bandwidth in my tests has been pretty stable most of the time, but you might get unlucky once in a while, so run the test several times and draw your conclusion from the aggregate. If you keep the traffic within EC2, all you're paying for is EC2 compute time (being up even for a minute is billed as a full hour), which means you can get high-CPU-count instances that match your real systems. For disk I/O, the initial reads are slow (about 40 MB/s), but after that my disk I/O benchmark results range between 150 and 220 MB/s.

Wayne Johnson wrote:
> We have a need to stress test our product. We have a few multi-core machines to run as a DB server and app server, but they are pretty heavily used and hiding under a desk.
>
> Now the question: would it be reasonable to try to run stress testing on EC2 (or other) farms, since we will only need them occasionally, but beat them to death when we do? Would it be cost effective to run this on a farm? If there is no control over how much memory bandwidth you get, you may not be able to get a consistent load. Is there a similar issue with disk I/O?
>
> ---
> Wayne Johnson,              | There are two kinds of people: Those
> 3943 Penn Ave. N.           | who say to God, "Thy will be done,"
> Minneapolis, MN 55412-1908  | and those to whom God says, "All right,
> (612) 522-7003              | then, have it your way." --C.S. Lewis
>
> ------------------------------------------------------------------------
> *From:* Elvedin Trnjanin <trnja001 at umn.edu>
> *To:* steve ulrich <sulrich at botwerks.org>
> *Cc:* Mike Miller <mbmiller+l at gmail.com>; TCLUG List <tclug-list at mn-linux.org>
> *Sent:* Tuesday, July 7, 2009 12:15:04 PM
> *Subject:* Re: [tclug-list] cheapest "farm"?
>
> I've spent a lot of time working with EC2, and I would not recommend it for this purpose without putting a lot of effort into planning and considering all the options. First of all, EC2 can be more expensive than purchasing your own hardware unless you do it right. There are two billing types of Amazon Machine Image (AMI) instances: on-demand and reserved. On-demand instances are intended to be up for the short term - from a few hours to a few days - and their pricing per hour reflects this. Reserved instances are cheaper to run per hour (3 cents compared to 10 cents for certain instances) because you pay a chunk of money up front. Throwing your infrastructure in the cloud is not always cost effective unless you plan it correctly. (There are companies that do this - I work for one.) Keep in mind that after a year or two of hardcore EC2 usage, you might have spent enough to have purchased your own cluster; all expenses after that point are wasted money.
>
> The other issue is designing your infrastructure over non-persistent storage. You might need to set up your own AMIs to ease some of the initial configuration (application installation and cluster management software). While you can use the many gigabytes of scratch space that EC2 instances come with, you will need a combination of Simple Storage Service (S3) and Elastic Block Storage (EBS) for persistent storage. Each of these services has its own limitations. S3 can store an unlimited number of files, but the maximum file size is around 5 GB (see the chunking sketch below). An EBS volume can only be mounted by one instance at a time (for now), and an EBS volume is also only available to EC2 instances in the same availability zone.
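For illustration, here's a rough Python sketch of working around that ~5 GB limit by splitting a big file into numbered parts before upload. The chunk size, buffer size, and helper name are arbitrary choices of mine; the actual transfer is left to whatever S3 upload tool you use.

    import os

    CHUNK = 4 * 1024**3   # 4 GB per part, safely under the ~5 GB S3 limit
    BUF = 64 * 1024**2    # copy in 64 MB pieces to keep memory use modest

    def split_for_s3(path, chunk_bytes=CHUNK, buf_bytes=BUF):
        """Write path.part0000, path.part0001, ... and return the part names."""
        parts = []
        with open(path, "rb") as src:
            index = 0
            while True:
                written = 0
                part = f"{path}.part{index:04d}"
                with open(part, "wb") as dst:
                    while written < chunk_bytes:
                        data = src.read(min(buf_bytes, chunk_bytes - written))
                        if not data:
                            break
                        dst.write(data)
                        written += len(data)
                if written == 0:      # ran out of input; discard the empty part
                    os.remove(part)
                    break
                parts.append(part)
                index += 1
        return parts

    # Reassembly after download preserves order thanks to the zero-padded
    # numbering:  cat bigfile.part* > bigfile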
> You can think of availability zones as data centers in the same geographic region (although this isn't necessarily correct).
>
> While data transfers are free between EC2 instances (over local IP addresses), they are not free when you use the public IPs, even between EC2 instances, as I've heard. If you're transferring gigabytes or even terabytes of data to be computed, or resulting from computation, this can be an expensive and slow process. Amazon provides a service (AWS Import/Export) where you can send in storage devices and they'll copy the data over to S3; if you have a lot of devices, it can be very expensive. Amazon provides a nice and simple calculator for this - http://awsimportexport.s3.amazonaws.com/aws-import-export-calculator.html - so that you can pick whichever option works best.
>
> They also have another calculator for their other services like EC2 and S3 - http://calculator.s3.amazonaws.com/calc5.html
>
> The biggest flaw with EC2 is that while you do have guaranteed CPU and memory resources, there is no guarantee of memory bandwidth. This means that if a separate instance from a different AWS account shares the same physical machine as your compute job, the other instance could take up all or most of the memory bandwidth, making your job run slower. Not only does your job take longer to finish, it is actually more expensive.
>
> Since the infrastructure for power, space, and cooling already exists for you, it might be a better route to purchase your own hardware. The biggest issue I see with deciding how many cores to put in a system is the network architecture you choose. If you go with gigabit Ethernet, it doesn't make a huge difference. If you're thinking of using high-speed interconnects like InfiniBand, the number of systems you have is crucial, since the switches and adapters can cost quite a bit of money. While a 24-port switch can be reasonably cheap (around $5,000), a 48-port switch may not be ($20k-$50k - http://www.provantage.com/scripts/search.dll?QUERY=Infiniband+switch ), so you would need to buy multiple smaller switches to get the right number of ports, and then add enough switches beyond that to get good bisection bandwidth.
>
> For the current Intel Xeon (non-Nehalem) processors, you shouldn't really put more than 8 cores in a system; above that count there isn't enough memory bandwidth to keep them all well fed with work. Dell and sometimes Sun offer good deals to academic groups, so you might benefit from that. Both companies also offer free trials of hardware, so you can benchmark your applications on each and pick whichever is best. While you could get several AMD nodes with the same total power for about the price of a single Intel node, keep in mind that the cost of having many less powerful systems, as opposed to a few very powerful ones, can be a financial hit down the road.
>
> steve ulrich wrote:
>> mike -
>>
>> building out your own compute infrastructure is so 2002. ;)
>>
>> i've used amazon EC2 for a very similar application, where i've been running large simulations on their infrastructure with my own VM image that i use for my purposes. you can simply dial up the number of processors that you purchase and use. you're charged by the hour for the number of CPU instances you use.
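To put some arithmetic behind the hourly billing and the on-demand versus reserved trade-off described above: the two hourly rates below are the ones quoted in this thread, but the one-year reserved upfront fee is an assumed figure for illustration only - check Amazon's pricing pages for the real number.

    ON_DEMAND_PER_HR = 0.10   # $/hr, from the thread
    RESERVED_PER_HR = 0.03    # $/hr, from the thread
    RESERVED_UPFRONT = 325.0  # $, assumed for illustration, not from the thread

    # Reserved wins once: upfront + r*h < d*h  =>  h > upfront / (d - r)
    break_even = RESERVED_UPFRONT / (ON_DEMAND_PER_HR - RESERVED_PER_HR)
    print(f"reserved is cheaper after ~{break_even:.0f} instance-hours "
          f"(~{break_even / 24:.0f} days of continuous uptime)")

At these rates the crossover lands around 4,600 instance-hours, roughly six months of an always-on instance - which is why occasional stress-testing bursts favor on-demand, while an always-on cluster quickly argues for reserved instances or your own hardware.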
>> instead of buying hardware yourself that you have to power up, replace HDDs for, and manage connectivity for, you let someone else pay for all that and simply use their resources on demand.
>>
>> On Tue, Jul 7, 2009 at 9:29 AM, Mike Miller <mbmiller+l at gmail.com> wrote:
>>
>>> We want to put together a few computers to make a little "farm" for doing our statistical analyses. It would be good to have 50-100 cores. What is the cheapest way to go? About 4 GB RAM per core should be more than enough. I'm thinking quad-core chips are going to be cheaper. How many sockets per mobo? I guess 1-, 2- and 4-socket mobos are available. We don't need SMP, but we'll take it if it is cheap (which I doubt). We'll use cloned HDDs in these boxes. My first thought is "blade", but maybe blades are more expensive than somewhat less convenient ways of housing the mobos.
>>>
>>> We have people here to house it and manage it and to pay for electricity(!). They also will have ideas about what we should buy.
>>>
>>> Any ideas?
>>>
>>> Which CPU gives the most flops/dollar these days?
>>>
>>> Mike
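To close the loop on Mike's original sizing question, a quick back-of-the-envelope sketch: the 50-100 core target, quad-core chips, 4 GB per core, and the 1-, 2- and 4-socket board options all come from his message, and the arithmetic just shows how many boxes each option implies.

    CORES_TARGET = 100    # upper end of Mike's 50-100 core range
    CORES_PER_CHIP = 4    # quad-core chips, per his message
    GB_PER_CORE = 4       # stated RAM budget

    for sockets in (1, 2, 4):                        # mobo options he mentions
        cores_per_node = sockets * CORES_PER_CHIP
        nodes = -(-CORES_TARGET // cores_per_node)   # ceiling division
        print(f"{sockets}-socket boards: {nodes:2d} nodes, "
              f"{cores_per_node} cores / {cores_per_node * GB_PER_CORE} GB RAM each")

That works out to 25 single-socket, 13 dual-socket, or 7 quad-socket nodes. The 4-socket route means far fewer boxes to image and cable, at the cost of pricier boards and, as noted earlier in the thread, more pressure on per-socket memory bandwidth.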