Skip to main content

Virtualisation, HPC and Cloud Computing

Virtualisation has obvious benefits for much commercial IT, where existing servers often have utilisation rates of 10% or less. It's less clear whether virtualisation is so useful for High-Performance Computing (HPC), where systems are often kept running at utilisation rates above 90%. Nevertheless, there are potential benefits from adopting some aspects of virtualisation in these environments. The question is, do the benefits outweigh the cost in performance?

This was the topic of yesterday's workshop on System Level Virtualisation for HPC (which was part of EuroSys 2008). The workshop was rather small but I did learn quite a bit.

A couple of talks investigated the performance hits of using virtualised OS's instead of addressing the hardware directly. This varied, depending on the type of application; if IO was minimal, the slowdown was minimal too. An "average" slowdown seemed to be on the order of 8%.

Samuel Thibault of Xensource looked at ways of using the flexibility of a hypervisor to implement just those parts of an operating system that are absolutely necessary - ignoring more general-purpose facilities such as virtual memory, pre-emptive threads and the like. Essentially, this was using Xen as a "micro-kernel done right".

Larry Rudolph, who works for VMWare but was here speaking with his MIT hat, gave a scenario where virtualisation might benefit the HPC user. Some problems require massive machines that researchers only get access to maybe twice a year. The example Larry gave was modelling global ocean currents. If the researchers can save the messages sent between regions of this model, they can later run more detailed analyses on one particular region on a much smaller machine, using the recorded messages to playback the effect of the rest of the simulation. This gets complicated if the new micro-simulation gets out of sync with the orginal and in this case the system may have to switch other regions from record mode to full simulation. The point is that it is easier to handle this mixture of recorded and actual simulations using virtual machines than having to control it all "by hand".

The discussion that ended the workshop brought all these points together. Virtualisation makes grid or cloud computing more feasible, which may make HPC resources available to more people. New CPUs have explicit support for virtualisation that will make it almost cost-free for the data centre, and there is a good chance that these advances will also reduce the overhead for some HPC applications too. The key problem seems to be predicting which applications will run smoothly on a virtualised system and which will suffer noticeable slowdowns. As always, more research is needed!

Comments

Popular posts from this blog

Presentation: Putting IT all together

This is a presentation I gave to an audience of University staff: 

In this seminar, I invite you to consider what the University’s online services would be like, if we worked together to design them from the perspective of the student or member of staff who will use them, instead of designing them around the organisational units that provide them. I’ll start with how the services might appear to that student or member of staff, then work back from there to show what this implies for how we work, how we manage our data, and how we integrate our IT systems. It might even lead to changes in our organisational structure.

Our online services make a vital and valued contribution to the work of our students and staff. I argue that with better integration, more consistent user interfaces, and shared data, this contribution could be significantly enhanced.

This practice is called “Enterprise Architecture”. I’ll describe how it consults multiple organisational units and defines a framework …

Not so simple...

A common approach to explaining the benefits of Enterprise Architecture is to draw two diagrams: one that shows a complicated mess of interconnections, and one that shows a nicely layered set of blocks. Something like this one, which came from some consultants:


I've never felt entirely happy with this approach.  Yes, we do want to remove as much of the needless complexity and ad-hoc design that litters the existing architecture.  Yes, we do want to simplify the architecture and make it more consistent and intelligible.  But the simplicity of the block diagram shown here is unobtainable in the vast majority of real enterprises.  We have a mixture of in-house development and different third-party systems, some hosted in-house, some on cloud infrastructure and some accessed as software-as-a-service.  For all the talk of standards, vendors use different authentication systems, different integration systems, and different user interfaces.

So the simple block diagram is, basically, a l…

2016 has been a good year

So much has happened over the last year with our Enterprise Architecture practice that it's hard to write a succinct summary.  For my day-to-day experience as enterprise architect, the biggest change is that I now have a team to work with.  This time last year, I was in the middle of a 12-month secondment to create the EA practice, working mainly on my own.  Now my post has been made permanent and I have recruited two members of staff to help meet the University's architectural needs.

I have spent a lot of the year meeting people, listening to their concerns and explaining how architecture can help them.  This communication remains vital, the absolute core of what we do and we will continue to meet people in this way.  We also talk to people in other Universities in order to learn from what they are doing and to share our own experience back.  A highlight in this regard was my trip to the USA last January.

Our biggest deliverable for the past year was the design of the data wa…