RE: Re: Datacenter Classification
|> From: Sean Donelan [mailto:[email protected]]
|> Sent: Wednesday, January 09, 2002 5:02 AM
|>
|> On Tue, 8 Jan 2002, Dan Lawrence wrote:
|> > board with Roeland. A Class "A" datacenter, by most
|> > accounts and without going directly to a published
|> > standard as the Holy Grail, is probably best
|> > determined by a Reliability Index (number of those
|> > elusive 9's everybody chases) as a result of meeting
|> > some performance criteria laid down by an Owner
|> > (like Beenu). If the end result can give a consistent
|> > high five-9's, it's usually considered a Class "A"
|> > design.
|>
|> Data center owners are reluctant to talk about problems, so
|> it's relatively difficult to find out the performance of
|> various designs. And if you ask any sales person, they
|> always tell you their data center is designed for x-9's.
Yes, but they don't tell you what level they actually achieve, because
falsehoods there can land you in court. The best I've seen is AboveNet,
which makes stats available as part of its billing system. Accountants
have to be accurate when doing 95th percentile billing.
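As a rough illustration of 95th percentile billing: the provider samples
throughput at fixed intervals over the month, discards the top 5% of samples,
and bills on the highest remaining one. The sketch below assumes 5-minute
samples and the nearest-rank method. Both are assumptions for illustration,
not AboveNet's documented procedure, and the traffic figures are made up.

```python
def percentile_95(samples_mbps):
    """Return the 95th-percentile sample using the nearest-rank method."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # Nearest rank = ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * n + 99) // 100
    return ordered[rank - 1]

# Hypothetical 30-day month of 5-minute samples: 8640 in total.
# A burst that fits inside the top 5% (432 samples) is not billed.
short_burst = [100] * 8240 + [900] * 400
long_burst = [100] * 8140 + [900] * 500

print(percentile_95(short_burst))
print(percentile_95(long_burst))
```

With these made-up numbers, the 400-sample burst falls entirely within the
discarded 5%, while the 500-sample burst does not and sets the billable rate.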
|> Honestly, has anyone ever had a sales person tell them they
|> are selling "Class B" data center space?
I think that the original requestor was considering building their own, or
trying to measure their own space. The problem is that the rulers we do have
carry no standard markings other than performance. But when it comes down to
it, performance is everything, to the exclusion of all else. When considering
an ISP or colo, I measure two gross factors: the ratio of <total WAN backbone
capacity>:<total WAN backbone reserves (dark fibre)>, and <aggregate
percentage uptime (WAN-wide)>. For a datacenter, it gets much more complex
and depends on the architecture. However, I expect total processing power of
at least twice the WAN capacity. I am more comfortable with a factor of 10,
supporting LAN end-nodes at 200 Mbps (100BASE-TX FDX switched).
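One way to read the 2x and 10x rule of thumb is as a node count: how many
200 Mbps end-nodes it takes for their combined link rate to reach the chosen
multiple of WAN capacity. That reading, the function name, and the 622 Mbps
WAN figure below are assumptions made for the sake of a worked example.

```python
import math

NODE_MBPS = 200  # 100BASE-TX full duplex: 100 Mbps in each direction

def nodes_needed(wan_mbps, factor, node_mbps=NODE_MBPS):
    """End-nodes whose combined link rate is at least factor x WAN capacity."""
    return math.ceil(wan_mbps * factor / node_mbps)

WAN_MBPS = 622  # hypothetical WAN capacity, roughly an OC-12

minimum = nodes_needed(WAN_MBPS, 2)        # the "twice, at minimum" case
comfortable = nodes_needed(WAN_MBPS, 10)   # the "factor of 10" case
print(minimum, comfortable)
```

This only counts link capacity; what each node can actually sustain depends on
what it is doing, which is a separate measurement problem.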
BTW, performance metrics on cluster nodes are a bear: how much bandwidth can
a node support, performing which functions, and at what level of
load-sharing? There's a research paper in there, somewhere. That is why the
performance/capacity guy gets the big bucks and has all the grey hair, or
should.
Then (having been a telco guy long enough to know better) I look at the
factors that make up those numbers. I've seen too many "reporting" issues
between telcos and PUCs; don't let *them* define all of the reporting
methods and terms. For example, what's an "outage"? That definition varies,
depending on whether you are discussing telco, CCTV, Internet gaming, Internet
streaming, VoIP, etc. Of course, all of this should show up in your SLA.
Even internal datacenters should have an internal SLA to the rest of the
org. But that's a management issue. Actually, it's all a management issue.
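To see how much the definition of "outage" moves the reported number, here is
a minimal sketch that scores the same incident log under two definitions. The
incident durations and the two thresholds (count anything of 1 second or more
versus only events of 5 minutes or more) are hypothetical, not taken from any
real SLA.

```python
def availability(incident_seconds, period_seconds, min_outage_seconds):
    """Fraction of the period counted as up, when only incidents lasting
    at least min_outage_seconds are classed as outages."""
    downtime = sum(s for s in incident_seconds if s >= min_outage_seconds)
    return 1 - downtime / period_seconds

PERIOD = 30 * 24 * 3600        # a 30-day month, in seconds
incidents = [2, 5, 45, 600]    # hypothetical incident durations, in seconds

strict = availability(incidents, PERIOD, 1)     # every blip counts
lenient = availability(incidents, PERIOD, 300)  # only events of 5+ minutes

print(f"strict:  {strict:.6%}")
print(f"lenient: {lenient:.6%}")
```

The log is identical in both cases; only the reporting rule changes, which is
why the rule belongs in the SLA rather than being left to the reporting party.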
Most of the stuff that we discuss on this list will help one "get there",
but there are no guarantees that any combination of it, or even all of it,
will get you there. There are people components as well. Procedures and
policies can make the best design either succeed or fail. Then, Stef trips
over the power mains (http://ars.userfriendly.org/cartoons/?id=20020109). In
the end, you still have to run an operational staff that is 99.9%+ reliable
(power mains guarded by rabid pit-bulls?).
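A back-of-the-envelope sketch of why the staff figure matters: if the facility
and the operations around it are treated as independent components in series
(a simplifying assumption, as are the function names), their availabilities
multiply, and the weaker one dominates the result.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def downtime_hours_per_year(availability):
    """Expected annual downtime, in hours, for a given availability fraction."""
    return (1 - availability) * HOURS_PER_YEAR

def series_availability(*components):
    """Availability of independent components that must all be up at once."""
    result = 1.0
    for a in components:
        result *= a
    return result

design = 0.99999   # a "five-9's" facility design
staff = 0.999      # operations that get it right 99.9% of the time

combined = series_availability(design, staff)
print(f"design alone: {downtime_hours_per_year(design) * 60:.2f} min/yr")
print(f"with staff:   {downtime_hours_per_year(combined):.2f} h/yr")
```

Under these assumptions the five-9's design on its own allows a little over
five minutes of downtime a year, while the combination allows several hours.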
In addition to all of the above, one asteroid strike can ruin your whole
day.
--
R O E L A N D M J M E Y E R
Managing Director
Morgan Hill Software Company
tel: +1 925 373 3954
cel: +1 925 352 3615
fax: +1 925 373 9781
http://www.mhsc.com