
Coping in a parallel world

The heat produced by modern CPUs means that chip designers can no longer crank up the clock speed. But they can fit more processing cores on a chip. Dual-core and quad-core x86 chips are already standard; Sun's SPARC chips have 8 or 16 cores, and Intel recently announced an 80-core prototype.

To get the best performance out of multi-core chips, especially once you get beyond 2 or 4 cores, you either have to write applications that exploit those cores explicitly, or rely on compilers and dynamic optimising systems that transform your code automatically. The problem is that most programmers aren't very good at the former, and most of the tools they use aren't brilliant at the latter.
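To make the difficulty concrete, here is a minimal sketch in Java (my illustration; none of the code below comes from the systems or posts mentioned here) of why parallel code is harder to get right than the serial loop it replaces. The programmer takes on partitioning, thread management and result combination, each a fresh source of bugs.

    import java.util.ArrayList;
    import java.util.concurrent.*;

    public class ParallelSum {
        // Serial version: short and obviously correct.
        static long sumSerial(int[] data) {
            long total = 0;
            for (int x : data) total += x;
            return total;
        }

        // Parallel version: the same computation, but now the programmer
        // must split the work, manage a thread pool and combine partial
        // results; the serial loop needed none of this.
        static long sumParallel(int[] data, int nThreads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(nThreads);
            ArrayList<Future<Long>> parts = new ArrayList<>();
            int chunk = (data.length + nThreads - 1) / nThreads;
            for (int t = 0; t < nThreads; t++) {
                final int lo = t * chunk;
                final int hi = Math.min(data.length, lo + chunk);
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = lo; i < hi; i++) s += data[i];
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            pool.shutdown();
            return total;
        }
    }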

I think this has immediate policy implications. CS departments should immediately, if they haven't already done so, make the teaching of parallel and distributed programming a required component of their undergraduate courses. Companies and professional bodies should encourage their staff to retrain. For this retraining, someone has to prepare and give the courses. Some government funding might kickstart the process and so ensure that the UK doesn't fall behind.

Several pundits, including Tony Hey of Microsoft, have suggested that experience from the supercomputer world could help, because supercomputers have been massively parallel for years. There is some truth in this, and organisations such as EPCC are already teaching parallel programming techniques. But there are differences too. The economics of the supercomputer world are rather old-fashioned: there, the computer is still the expensive part of the system, while programmers' time is relatively cheap. So programmers often tune each application for each new machine.

You can see this philosophy in much of the scientific grid world: people submit jobs that require a certain number of processors to run. Think about that. They're not asking for a certain amount of processing power, nor for their jobs to run in a certain time or for a certain cost. Their applications are programmed to run on a specific number of processors. To anyone outside that community, this approach is clearly crazy, and it certainly won't transfer to a world where multi-core processors are plentiful and programmers' time is expensive.
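To illustrate the contrast, here is a small, purely hypothetical Java sketch of the two mindsets: a program hard-wired to a fixed processor count, versus one that asks the runtime what is actually available and scales to it.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ElasticWorkers {
        public static void main(String[] args) {
            // Grid-style rigidity: "this job needs exactly 64 processors".
            // On a 4-core box it oversubscribes; on a 128-core box it
            // leaves half the machine idle.
            // final int workers = 64;

            // Elastic alternative: discover what the machine offers and
            // size the worker pool to match.
            final int workers = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(workers);
            System.out.println("Scaling to " + workers + " hardware threads");
            pool.shutdown();
        }
    }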

So the distributed (grid) world and the multi-core CPU world face similar problems. Both need new programming models and tools. Fortunately there are many computer scientists investigating better ways to program distributed and parallel machines. I can't attempt to survey the field here, but some of my friends and colleagues are working on exactly these problems. For example, Murray Cole at the University of Edinburgh has developed a model, based on algorithmic skeletons, for mapping parallel programs onto different numbers of processors (a toy sketch of the idea appears below). In 2002, a team from Microsoft Research in Cambridge incorporated modern concurrency abstractions into a research version of C# - replacing the usual locks, semaphores and critical regions that Tony Hoare and his contemporaries devised 40 years ago.
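The skeleton idea is that the programmer supplies only sequential pieces of code, while a reusable pattern (map, farm, pipeline) handles the distribution. As a toy illustration (the code and names are mine, not Cole's), a "map" skeleton in Java might look like this; note that the caller never mentions a processor count, so the same program runs unchanged on 2 cores or 80.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;
    import java.util.function.Function;

    // A toy "map" skeleton: the caller supplies only sequential code
    // (the function f); the skeleton decides how to spread the work
    // across however many processors the machine happens to have.
    public class MapSkeleton {
        public static <A, B> List<B> map(Function<A, B> f, List<A> input)
                throws InterruptedException, ExecutionException {
            int procs = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(procs);
            List<Future<B>> pending = new ArrayList<>();
            for (A item : input) {
                pending.add(pool.submit(() -> f.apply(item)));
            }
            List<B> results = new ArrayList<>();
            for (Future<B> future : pending) {
                results.add(future.get());
            }
            pool.shutdown();
            return results;
        }
    }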

Here at the e-Science Institute, we are about to launch a theme - a series of workshops and visitors - on Distributed Programming Abstractions. I certainly hope that this will address some of the questions raised above, and that some of the people reading this will contribute. Another useful resource is this wiki at Berkeley, which also includes an interesting white paper.

Comments

I wholeheartedly agree with the challenge that Dave offers here. We are building a large-scale, distributed computing infrastructure capable of wholesale parallel operation for both transactional and computational computing. It should be a matter of course that significant applications are designed with multi-threaded operation in mind. This is not a new problem in CS; it's just that we've never had much opportunity in the past to do this for real!
Dave Berry said…
Intel have a white paper that makes these points and many more, at ftp://download.intel.com/research/platform/terascale/terascale_overview_paper.pdf.

Thomas Sterling addresses these issues too. His recent keynote points out some of the differences as well as the similarities. A major difference is the bottleneck in memory access on multi-core CPUs. See http://www.cct.lsu.edu/~tron/ICCC06EndNoteFinal.pdf.
