One of the more interesting areas I have been lucky enough to work with is the cloud. We run some banks live in the public cloud. The scale of the cloud offerings is simply staggering. There are three big players: Microsoft, Google and Amazon. Between them, they bought 31% of the world’s CPUs in the last two years, and Microsoft alone is pumping a billion dollars a year into the initiative.

I was surprised to learn that the data centres that make up the cloud are not stocked with the latest generation of machines, because slightly older stock makes much more sense on a dollar-per-CPU-cycle basis. The economics are also quite surprising. CPU power accounts for less than 1% of the overall cost of the data centre. Between 50% and 90% of the power that feeds into the data centre is eaten up by lighting, cooling, transforming etc., and a sizeable proportion is taken up by the storage arrays.

Siting the data centres is extremely important. The first consideration is power. You need a lot of it, and it needs to be reliable and cheap. You also need good internet links. There’s no point siting your fledgling data centre next to a nice shiny power station if people can’t connect into it. Climate is another consideration. Obviously, the hotter the country, the higher your cooling costs will be. The last consideration is security. The more the customer base builds up, the more attractive a cloud data centre becomes as a terrorist target. The ideal site for a data centre is Iceland. The climate is cool, there is plenty of cheap geothermal energy and it sits on transatlantic internet links. The UK is a very expensive place for a data centre due to high land and labour costs.

The data centres are populated using prefabricated containers, each holding 5,000 CPUs in ready-made racks, preconfigured with storage. When a container is delivered, they plug in power, network and water (for cooling) and the container starts initialising. Not all the servers in the container can start up at the same time, as the thing would melt, so they start up in waves, with the servers in each wave equally spaced around the container. A typical data centre will have many of these containers, with two or three typically being added every week.
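To make the "waves" idea concrete, here is a minimal sketch of how a staggered start-up could be scheduled. I don't know how the vendors actually do it; the function name `startup_waves`, the idea of numbering servers by their position in the container, and the wave count are all my own assumptions for illustration.

```python
def startup_waves(server_count: int, wave_count: int) -> list[list[int]]:
    """Split servers into start-up waves (hypothetical scheme, not the vendor's).

    Servers are assumed to be numbered 0..server_count-1 in physical order
    around the container. Wave w takes every wave_count-th server starting
    at position w, so the servers in any one wave are evenly spread out
    rather than clustered in one hot corner.
    """
    if server_count < 0 or wave_count < 1:
        raise ValueError("need server_count >= 0 and wave_count >= 1")
    return [list(range(w, server_count, wave_count)) for w in range(wave_count)]


if __name__ == "__main__":
    # Ten servers started in three waves.
    for number, wave in enumerate(startup_waves(10, 3), start=1):
        print(f"wave {number}: servers {wave}")
```

Striding through the positions like this means neighbouring servers never power on in the same wave, which is the point of spacing them out.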

With this many machines, hardware failure is a fact of life. Anything up to a 3% failure rate is business as usual. If the failure rate hits 10%, the machine vendor dials in and performs remote diagnostics on the container. If the failure rate hits 30%, the machine vendor sends out an engineer to inspect the container and perform any remedial action necessary. If the failure rate hits 50%, the container is powered down and replaced.
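The escalation ladder above boils down to a handful of thresholds, which can be sketched in a few lines. This is only my illustration of the rules as described, not anyone's real monitoring code: the function name `escalation_action` and the returned labels are made up, and the source does not say what happens between 3% and 10%, so the "monitor" band is an assumption.

```python
def escalation_action(failed: int, total: int) -> str:
    """Map a container's failure count to the action described in the text.

    Thresholds (3%, 10%, 30%, 50%) come from the post; everything else,
    including the "monitor" band, is a hypothetical filling-in.
    Integer arithmetic is used so that exact boundaries compare cleanly.
    """
    if total <= 0 or not 0 <= failed <= total:
        raise ValueError("need total > 0 and 0 <= failed <= total")

    scaled = failed * 100  # compare failed/total against N% without floats
    if scaled >= 50 * total:
        return "power down and replace container"
    if scaled >= 30 * total:
        return "send engineer"
    if scaled >= 10 * total:
        return "remote diagnostics"
    if scaled <= 3 * total:
        return "business as usual"
    return "monitor"  # assumption: the 3% to 10% band is not covered in the text


if __name__ == "__main__":
    for failed in (100, 300, 500, 1500, 2500):
        print(failed, "of 5000 failed:", escalation_action(failed, 5000))
```

Checking the most severe threshold first keeps each branch a single comparison and makes the ladder easy to read top to bottom.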

I was surprised by how flexible the Azure cloud is. There are three ways you can use it. The first is what they call the “web role”, which allows any kind of web-based activity (web pages, active server pages, web services etc.). The second is what they call the “worker role”, which allows you to do pretty much anything, provided the software you want to run does not require installation (i.e. it can simply be copied across) and does not need to access the registry.

There are big challenges to overcome, though. You never really know which machine your application is running on or where your data is, which gives regulators a heart attack. There is also the fact that the database is what they call “eventually consistent”. Because your data is spread around and copied onto different machines, there is no real backup as we would normally think of it. You can back up your data to a local backup, but if your database is big, you may want to take advantage of the service they offer whereby they post you a tape once a day. There is no such thing as a printer connected to the cloud, so if you want a hard copy of anything, you have to get creative.

The costs of a cloud-based solution are *very* compelling, especially when you consider that you are effectively getting a fully managed, fault-tolerant data centre. There are a number of pricing options depending on how much you want to commit to, with everything from pay-as-you-go right through to volume licensing. If I had my way, I’d put every server in the cloud.



