

Showing posts from 2007

Innovation and Knowledge Transfer Networks

I'm very pleased to say that our Knowledge Transfer Network, Grid Computing Now!, will continue to operate for at least another year. We have just received final confirmation from the newly-reconstituted Technology Strategy Board. This is welcome news; it means that we can continue our plans to bring users, vendors and academics together to address real problems in several sectors. So it seems a good moment to reflect on the state of KTNs and how they might develop. Innovate07 was the showcase for all 22 Knowledge Transfer Networks. This was my first time at Innovate and I was impressed by the range of technology areas and delegates. It was also a good opportunity for networking between KTNs, which has led to some joint initiatives. This range of KTNs is in part a branding exercise, as some KTNs had previous existences as Faraday Institutes or other institutes. So we at GCN are in the odd position of being one of the first KTNs to be set up and at the same time among the o

Scotland's first eco-powered data farm

I'm delighted to see that Internet Villages International are building a major data centre in Scotland, to be powered entirely by "green" energy. This is exactly the kind of development that some of us have been arguing for. The centre will be built next to a source of renewable energy and the waste heat is intended to be used for a new village and for local horticulture. This makes sense in many ways. Data centres do not need to be near the businesses that use them (think of how often you use Google and where their data centres are). Placing them near sources of renewable energy saves on transmission losses; indeed Google, Amazon and other firms are already doing this in the USA. Also, data centres produce a lot of heat and it makes sense to use this rather than waste it. Let's hope that the operators minimise the energy they do use as well, for example by running virtualised servers, building modular UPS and cooling systems, and using external air intakes rather tha

High Throughput Computing week

We've just finished a week (well, four days) of talks, tutorials and discussion about High Throughput Computing. The event was opened by Miron Livny, leader of the Condor team, who gave an excellent introduction. The key point is that HTC is about the number of tasks that can be completed in a given time, whereas "traditional" High Performance Computing is about how much computing power can be brought to bear at a given moment. As Miron puts it, Floating Operations per Year is not necessarily 60*60*24*7*52 × Floating Operations per Second (FLOPS). We've hosted events by the Condor team in the past, but for HTC week we extended our range. In particular, John Powers and Dan Ciruli of Digipede flew over from the Bay Area to tell us about their product. A day of hands-on tutorials allowed delegates to compare the strengths of Digipede and Condor, and the evening discussions included ways the systems could be used together. Scheduled discussions looked at requirements for
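To make Miron's arithmetic concrete, here is a minimal sketch of the gap between peak speed and sustained yearly throughput; the peak rating and utilisation figure are illustrative assumptions, not measurements from any real system.

```python
# Miron Livny's point in code: sustained throughput over a year (HTC)
# is not peak speed (HPC) multiplied by the number of seconds.
# All figures below are illustrative assumptions, not measurements.

SECONDS_PER_YEAR = 60 * 60 * 24 * 7 * 52  # 31,449,600

peak_flops = 1e12    # assumed peak rating: 1 TFLOPS
utilisation = 0.35   # assumed fraction of time doing useful work
                     # (downtime, scheduling gaps, failed jobs, ...)

ideal_flopy = peak_flops * SECONDS_PER_YEAR
actual_flopy = ideal_flopy * utilisation

print(f"ideal  FLOPY: {ideal_flopy:.3e}")
print(f"actual FLOPY: {actual_flopy:.3e}")
```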

Web 2.0, e-Science and Innovation

Last week the e-Science Institute organised a "think tank" to review the state of e-Science and suggest opportunities for research. A major emphasis of the debate was the recent trend of using Web 2.0 tools to support scientists. Dave de Roure gave several examples he had seen at recent conferences, including wikis and blogs such as Open Wetware and Useful Chemistry, as well as various data mashups. Tony Hey gave a public lecture on e-Science and Digital Scholarship which presented a similar story, including the use of utility computing (which now seems to be called cloud computing - you've got to love the constantly changing buzzwords in IT). Among the discussions, people mentioned the combination of Web 2.0 tools with semantic web technology, and the combination of structured queries and semi-structured information as in DBpedia. This growth of e-Science 2.0 (to coin a buzzword of my own) seems to have occurred largely in the life sciences, perhaps becaus
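As a flavour of what DBpedia makes possible, here is a minimal sketch of a structured query over its semi-structured data, using only the Python standard library. The endpoint and prefixes are DBpedia's public ones, but treat the exact query and result handling as assumptions.

```python
# A minimal sketch of querying DBpedia's public SPARQL endpoint.
import json
import urllib.parse
import urllib.request

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?person WHERE { ?person dbo:birthPlace dbr:Edinburgh . } LIMIT 10
"""

url = "http://dbpedia.org/sparql?" + urllib.parse.urlencode({
    "query": query,
    "format": "application/sparql-results+json",
})
with urllib.request.urlopen(url) as resp:
    results = json.load(resp)

for binding in results["results"]["bindings"]:
    print(binding["person"]["value"])
```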

Centralisation, security, and 25m personal account details

The media and online world are buzzing with the news that HMRC have lost discs containing financial details of 25m people. My particular interest is the extent to which centralisation of the database contributed to the problem. If we consider the NPfIT programme for storing medical records, would records be safer in distributed data stores? At first glance, one might think that a security breach in one store would at least be limited to the set of data held there. But those distributed stores would have to be networked and to allow remote queries; would this increase security (by making mass requests easier to detect) or decrease it (because people wouldn't know which remote requests to distrust)? In the meantime, here are a few links to comments that I found interesting. Philip Virgo worries (as do I) about those organisations that cover up their breaches rather than report them. David Lacey argues that certification of security practices is required to make sure policies are followed.

The Business Case and Methods for the Green Data Centre

"Green IT" is a hot topic at the moment. In the UK, data centres contribute nearly 2% of the country's CO2 emissions, a figure which is similar to that of the much-vilified airline sector. We recently broadcast a webinar on this subject which went very well - several people have said that it was our best webinar yet. We were fortunate to have two excellent speakers. The first was Zahl Limbuwala of the BCS Specialist Group on Data Centres. This SG has developed an open source model for measuring server room efficiency, which will be published in January; Zahl presented the case for why energy efficiency is relevant and talked a little about this model. The second speaker was Kate Craig-Wood, who runs a carbon-neutral hosting company. Kate gave a lot of low-level practical suggestions for improving the energy efficiency, which her company has used in their new data centre. There are obvious advantages in making such improvements, both economic and environmental. The &

Virtual autopsies

One of the best talks at AHM2007 was on medical imaging, by Prof. Anders Ynnerman of Linköping University in Sweden. Imaging researchers always have an advantage when giving talks because they can show much of their work in pictures; Anders took full advantage. The immediacy of visualisation in communicating information meant that some viewers found his medical pictures a little too gruesome. The computer science aspect of his talk was about techniques for 3-D reconstruction of body images from the 2-D “slices” taken by medical CAT and MRI scanners. As these scanning devices increase in resolution, so the size of the data to be processed increases. In order to process the data, his team uses intelligent compression techniques, based on whether a voxel represents bone, blood or tissue. Aside from the technical content, I was very interested in the uses Anders has found for this work in addition to the usual pre-operative medical briefing. His team are now able to help the police by perfo
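That classification-driven compression idea suggests a simple illustration. The sketch below is not Ynnerman's actual pipeline; it merely labels voxels by intensity so that less interesting material could be compressed more aggressively, and the thresholds are illustrative stand-ins for Hounsfield-style values, not clinical figures.

```python
# A toy classifier labelling CT-style voxels by intensity, so different
# compression could be applied per material. Thresholds are invented.
import numpy as np

def classify(volume):
    """Label each voxel: 0=air, 1=soft tissue, 2=blood-like, 3=bone."""
    labels = np.zeros(volume.shape, dtype=np.uint8)
    labels[volume > -200] = 1   # soft tissue and denser
    labels[volume > 60] = 2     # blood-like / contrast-enhanced range
    labels[volume > 300] = 3    # bone
    return labels

# Toy 3-D volume standing in for a stack of 2-D scanner slices.
volume = np.random.uniform(-1000, 1500, size=(64, 64, 64))
labels = classify(volume)
print(np.bincount(labels.ravel(), minlength=4))  # voxel count per class
```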

e-Science and Industry

AHM2007 saw several examples of the industrial exploitation of e-Science technologies. I organised a session on this theme for Grid Computing Now! (GCN), which was attended by some 300 people. Jim Austin and Yike Guo gave talks about their contrasting approaches to commercialising their research. Jim has built his company, Cybula, from income, opting for the advantage of control and accepting slower growth. Yike has chosen an investment-based approach for Inforsense, achieving the fast growth required by his funders. Both explained how they found markets for their technologies. They also reflected on how to maintain both academic and commercial careers, using each to benefit the other while balancing their sometimes conflicting demands. I followed with a short talk that gave a high-level outline of the state of advanced IT infrastructure in industry, drawing on some of the presentations given in our Grids Mean Business track at OGF20. Of course, IT infrastructure is only one aspec

Grids, data centres and reliability

In my work with the Grid Computing Now! Knowledge Transfer Network, I talk about "virtualisation" and "service-oriented architecture" just as much as "grid" itself. People sometimes ask what the difference is between these concepts. My first answer is perhaps rather glib - I say that I don't care, as long as the technology gets the job done. Although this is not a straight answer, those of us on the GCN! team believe it is important to put the business answers before any notion of technological purity. But if we turn to the question as stated, I think that as long as a solution includes the key concepts of virtualised resources and dynamic allocation of applications across those resources, then that to me is enough to call the system a grid. But, of course, we can go further. A recent conversation reminded me of the important point that distributed systems typically have to manage failure. As systems scale to many machines and many sites, then some
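A minimal sketch of that last point, with entirely hypothetical resources and a toy failure model: when one node fails, the task is simply reallocated to the next available one.

```python
# Distributed systems must manage failure: retry a task elsewhere when
# a node fails. Resource names and failure rates here are made up.
import random

RESOURCES = ["node-a", "node-b", "node-c"]

def run_on(resource, task):
    """Pretend to run a task; fail randomly to simulate node problems."""
    if random.random() < 0.3:
        raise RuntimeError(f"{resource} failed")
    return f"{task} completed on {resource}"

def run_with_failover(task, resources):
    """Dynamically allocate the task, falling back on failure."""
    for resource in resources:
        try:
            return run_on(resource, task)
        except RuntimeError:
            continue  # reschedule on the next available resource
    raise RuntimeError(f"{task}: all resources failed")

print(run_with_failover("render-job-42", RESOURCES))
```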

"That's not Grid!" - A cautionary tale

I don't usually attempt humour here, but see if you like this... Once upon a time, in the far-off world of Computerland, a great guru arose and declared a vision. "I see a future when all computers will be linked together and people will run their programs without knowing which computers are running them. People do not need to know where their jobs run; they just need the results. I call this Grid Computing!" The people of Computerland were excited by the guru's vision. They went away and worked to make it happen. When they were ready, they returned to the guru and said: "Oh great guru! We have implemented your vision. We can run our programs on whichever processor is free at the time, making sure that all programs can run and making best use of all our processors. No longer does one computer sit idle while another one is overloaded. Your vision is a great success!" "No, no, no!", said the guru, "That's not Grid!" The guru exp

Research grids and industrial data

What happens when industry collaborates with academics, using the grid to share data? This was one of the main issues that we discussed today in a meeting of the NanoCMOS project. The industrial partners were clear that they would have to be convinced that their valuable data would be adequately protected before they allow their academic colleagues to use it on the grid. The NanoCMOS project is looking at the impact of variability on the design and production of next-generation microchips. It is funded by the EPSRC and involves several leading electronics companies. The aim is to make circuit designs more resistant to variations in the yield and performance of microchips; such variability is increasing as transistors get smaller and smaller. In a multi-billion dollar industry, it is clear that the companies involved do not want information about the design or performance of their products to go AWOL. In the B.G. world (Before Grids), companies license their data to certain aca
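Here is a minimal sketch of the sort of per-organisation policy check such partners would expect before their data goes onto the grid; the dataset names, organisations and policy structure are all hypothetical.

```python
# A toy access-control check: data owners license datasets to specific
# organisations. Every name below is invented for illustration.
ACCESS_POLICY = {
    "device-variability-runs": {"univ-a", "univ-b"},
    "proprietary-yield-data": {"foundry-partner"},
}

def may_read(user_org: str, dataset: str) -> bool:
    """Allow access only to organisations the data owner has licensed."""
    return user_org in ACCESS_POLICY.get(dataset, set())

assert may_read("univ-a", "device-variability-runs")
assert not may_read("univ-a", "proprietary-yield-data")
```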

Interlude: New opportunities for user interfaces

In my previous entry, I promised some thoughts on presentations at OGF20. Since then you will have seen nothing from me. This is not because I lost interest; unfortunately I've been in hospital and then recuperating. Now I'm well again, back to work and the proud owner of a heart pacemaker. I still plan to blog some thoughts arising from OGF20, but for this entry I will deal with a different topic. I've just read and been inspired by Chris Mairs' 2006 Turing Lecture on Inclusion and Exclusion in the Digital World. Chris takes what could be a dull (albeit important) topic and makes it very interesting by showing the big picture. He's talking about accessibility, which previously I tended to associate with detailed guidelines about which colours to use on web pages and how to tag HTML images with meaningful "Alt" tags. I did consider it important - I'm now old enough to hate small fonts - but I didn't find it inspiring. Chris's lecture look

Webinar: The Semantic Web in Industry

Our next webinar will look at a technology that is only just beginning to be deployed in industrial applications. The Semantic Web allows meanings to be attached to data and text, and lets users look for content by querying these annotations. At its simplest, this should mean no more scrolling through pages of search results. More sophisticated uses include enabling service-oriented markets and automating aspects of data integration. This seminar, which will take place on Thursday May 24th, will describe the principles of the semantic web and show how it can already be applied to real industry use cases. John Davies of BT will give a brief introduction to Semantic Web technology and show how it can be applied in industry, focussing on four application areas: knowledge management, information integration, service-oriented environments and applications in the health sector. Paul Walsh of Segala will show how semantic Content Labels improve trust when browsing, by letting users discover whic
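To give a flavour of the idea, here is a minimal sketch of attaching machine-readable annotations to a document and then querying them, using the rdflib Python library; the vocabulary and resource names are invented for illustration.

```python
# Annotate a document with meaning, then query the annotations rather
# than scrolling through search results. Names here are made up.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")
g = Graph()

doc = URIRef("http://example.org/report-17")
g.add((doc, EX.topic, Literal("knowledge management")))
g.add((doc, EX.sector, Literal("health")))

# Find every document annotated as belonging to the health sector.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?doc WHERE { ?doc ex:sector "health" . }
""")
for row in results:
    print(row.doc)
```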

Initial reflections on OGF20 / EGEE User Forum

So that was OGF20 and the 2nd EGEE User Forum. I was so busy there that I didn't have time or energy to blog. So much happened in such a short time that I'd have been hard put to keep up. This turned out to be the best-attended GGF/OGF meeting ever, narrowly beating GGF5 (which was held in Edinburgh in 2002). We had over 900 people attend during the course of the week and a significant number stayed for all or most of the week. I haven't had feedback from all the sessions but I believe the workshops were well attended and I know the Grids Mean Business and the EGEE-specific sessions went very well. A notable aspect of the week was the good interaction between the different communities. In particular, the colocation of the OGF and EGEE events has helped to show both sides where standards can apply or are needed. The commercial delegates helped to guide the requirements for standardisation work as well as sharing information on best practice. The exhibition space had

Grids Mean Business, Day 1

It's the end of the first day of Grids Mean Business, the industry track that we've organised at the joint meeting of OGF20 and the EGEE User Forum. I'm too tired to write a detailed report but I'm very happy with today's sessions. We've had some great talks and really interesting discussions. Here's hoping that tomorrow is just as successful.

BBC Horizon programme on FireGrid

Next Tuesday, April 24th, the BBC will be showing an episode of Horizon all about the FireGrid project. The vision behind FireGrid is that we can use the grid and HPC to model the spread of fire in buildings, to improve building design, firefighters' training and the actual response in emergencies. This is one of a number of grid projects looking at the use of real-time modelling to improve emergency response. Others are focussing on floods, earthquakes or the release of noxious substances into the atmosphere. In each case, the idea is to use lots of sensors to provide input to complex HPC models, which in turn provide forecasts to an emergency response control room. There are many technical issues that need to be solved before such systems can be deployed live. We don't yet know the best way to guide simulations as new sensor input becomes available. We need better ways of allocating priority jobs to grid resources - it's clear that a traditional queue system wo
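On that last point, here is a minimal sketch of the difference between first-come-first-served queuing and priority scheduling; the job names and priority values are hypothetical.

```python
# Emergency jobs must jump ahead of routine batch work, which a plain
# FIFO queue cannot express. Jobs and priorities here are invented.
import heapq
import itertools

counter = itertools.count()  # tie-breaker keeps heap comparisons valid
queue = []

def submit(priority, job):
    """Lower number = more urgent."""
    heapq.heappush(queue, (priority, next(counter), job))

submit(5, "routine-climate-batch")
submit(5, "protein-folding-batch")
submit(0, "live-building-fire-simulation")  # pre-empts the batch work

while queue:
    _, _, job = heapq.heappop(queue)
    print("dispatching", job)
```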

Webinar: Virtualisation and Service-Oriented Architecture - Building a cutting edge IT Infrastructure

On Thursday (April 19th) I will be hosting another in our series of webinars. This week's seminar will feature case studies of two core technologies for building a cutting-edge infrastructure. Zafar Chaudry will describe how he used virtualisation to consolidate his disparate servers and storage provision into a manageable and scalable infrastructure. Then Mark Simpson will show how he deployed a service-oriented architecture to give a major financial institution far more control over, and scalability in, its business processes. Zafar and Mark come from very different sectors. Zafar is at the Liverpool Women's Hospital while Mark works for the business IT consultants Griffiths-Waite. I find it fascinating to see just how broad the uptake of these new techniques is - they really seem to be applicable across most sectors. This will be one issue that I'll explore in the discussion on Thursday. I'll also ask Mark and Zafar to comment on how virtualisation and SOA interact in m

OGF20 & Grids Mean Business

Much of the reason that I haven't blogged recently is that I've been up to my eyes in organising OGF20 (colocated with the EGEE User Forum) and in particular the Grids Mean Business industry track. With just over a month to go, I'm pleased to report that we have well over 500 delegates registered and the programme of speakers is rather good. In addition, we have a range of workshops which will look at new developments in Grid middleware and operations. Our two keynote speakers for OGF20 will be Tony Hey and Peter Coveney. Tony is VP of technical computing at Microsoft and will be giving us a new talk on the Social Grid. Peter is professor of computational chemistry at University College London and has stretched the use of Grid infrastructures to perform massive simulations of chemical processes. Both are excellent speakers and will give fascinating talks. In addition, Mark Linesch and Bob Jones will introduce the activities of OGF and EGEE respectively, while

Webinar: Distributed Systems in e-Health

Last week I hosted a webinar on distributed systems in e-health. The speakers were Derek Hill, CEO of Ixico, and Michael Rigby, Professor of Health Information Strategy at Keele University. Derek launched Ixico three years ago as a spin-off of the UK e-Science Programme and it is now selling services internationally. Michael's project is still in the research stages but in my opinion it shows a more robust approach to managing health records than the one currently being implemented by the English National Health Service. Take a look and see what you think - follow the links at http://grid.globalwatchonline.com/epicentric_portal/site/GRID/menuitem.c4f9e41660ec9e9b08a38510eb3e8a0c/ .

Coping in a parallel world

The heat produced by modern CPUs means that chip designers can no longer crank up the clock speed. But they can fit more processing cores on a chip. 2-core and 4-core x86 chips are already standard; Sun's SPARC chips have 8 or 16 cores, and Intel recently announced an 80-core prototype. To get the best performance out of multi-core chips, especially once you get beyond 2 or 4 cores, you either have to write applications to make best use of those cores or write compilers and dynamic optimising systems that automatically transform your code to get that performance. The problem is, most programmers aren't very good at this and most of the tools they use aren't brilliant either. I think this has immediate policy implications. CS departments should immediately, if they haven't already done so, make the teaching of parallel and distributed programming a required component of their undergraduate courses. Companies and professional bodies should encourage their staff to re
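As a minimal illustration of writing application code to use multiple cores, here is a sketch in Python, which needs processes rather than threads for CPU-bound work because of its interpreter lock; the workload, chunk sizes and worker count are arbitrary choices.

```python
# Spread a CPU-bound toy workload across cores using processes.
from multiprocessing import Pool

def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately naive)."""
    lo, hi = bounds
    def is_prime(n):
        if n < 2:
            return False
        f = 2
        while f * f <= n:
            if n % f == 0:
                return False
            f += 1
        return True
    return sum(1 for n in range(lo, hi) if is_prime(n))

if __name__ == "__main__":
    # Split the range into chunks, one per worker process/core.
    chunks = [(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    with Pool(processes=4) as pool:
        print(sum(pool.map(count_primes, chunks)))
```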

OGF20 Registration is open

Much of the reason I haven't blogged here more frequently is that I am programme chair of OGF20, which is taking much of my time. So I'm pleased to report that registration for OGF20 is now open. Here is the official announcement; note that early registration rates are available until March 15th. Some aspects of the programme have been confirmed; others are still being finalised. I'll post news here over the next few weeks. Registration is now open for OGF20 and the EGEE 2nd User Forum being held May 7-11 in Manchester, UK. Register on-line by visiting http://www.ogf.org/OGF20/events_regstrtn_ogf20.php This event will feature:
• Keynote and Plenary presentations by leading grid luminaries
• Chartered Group Sessions including Standards Working Group Sessions and BoFs
• Enterprise Track including Requirements Alignment, Best Practices and Adoption Sessions
• e-Science Track featuring Community Workshops
• ‘Grids Mean Business’ Industry Program showcasing business sol

Webinar - Case Studies: IT Infrastructure for inter-enterprise collaboration

Tomorrow (Thursday) I'll be hosting the next web seminar run by Grid Computing Now!, at 2:30pm UK time. We'll be looking at how grid technologies can help businesses collaborate on joint projects. Mike Boniface of IT Innovation will explain how the SIMDAT project has enabled pharmaceutical and automotive companies to collaborate on product design. Tom Jackson of York University will describe how the BROADEN project is enabling Rolls-Royce to monitor after-sales performance. Both projects use the grid to manage distributed data and computing assets belonging to multiple organisations. The presentations will show how this leads to real business benefits. You can join the seminar at http://mediazone.brighttalk.com/event/gridcomputingnow/a4f23670e1-260-intro .

Storage meets the Grid

I'm just back from OGF19, which was very productive in a number of ways. One strand of interest is the continuing dialogue between the storage industry (represented by SNIA) and the grid world. This conversation has been developing slowly over the last year. I was on a panel at MSST 2006 that explored some aspects of this. SNIA were also present at GGF18 to explore where the two concerns meet. Both sides are still learning about each other, as they are both complex and changing technologies. One obvious area of overlap is data replication. Many grid projects maintain replicas of data, to improve access times and/or to guard against loss. The classic example is the LHC Grid, but there are many other examples, particularly in the world of data libraries. Meanwhile, the storage industry supplies replication systems for commercial data, specialising in backups and disaster recovery. The two technologies work at different levels. Storage systems copy data from one di
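Grid-level replication is easiest to see with a small example. Here is a minimal sketch of a replica catalogue mapping one logical file name to physical copies at several sites, with a reader choosing a replica by assumed access cost; the sites, URLs and latency figures are all invented.

```python
# A toy replica catalogue: logical name -> physical copies at sites.
# Everything below (sites, URLs, latencies) is invented.
from urllib.parse import urlparse

REPLICA_CATALOGUE = {
    "lhc/run42/events.dat": [
        "gridftp://site-a.example/store/events.dat",
        "gridftp://site-b.example/replica/events.dat",
        "gridftp://site-c.example/replica/events.dat",
    ],
}

SITE_LATENCY_MS = {"site-a": 40, "site-b": 5, "site-c": 120}  # assumed

def best_replica(logical_name):
    """Choose the replica with the lowest assumed access latency."""
    def latency(url):
        site = urlparse(url).hostname.split(".")[0]
        return SITE_LATENCY_MS[site]
    return min(REPLICA_CATALOGUE[logical_name], key=latency)

print(best_replica("lhc/run42/events.dat"))
```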

ClimatePrediction.net on the BBC

ClimatePrediction.net will feature on the BBC again this Sunday, in a television programme fronted by Sir David Attenborough. "Climate Change: Britain Under Threat" will broadcast the results of a BBC-sponsored experiment using ClimatePrediction.net, with a focus on how climate change is affecting, and will affect, the UK. The results will appear on the BBC Climate Change website after the broadcast (and will also, hopefully, be in a Nature paper in a little while, pending peer review!)

Service Modeling Language

Last night Heather Kreger gave an overview of the Service Modeling Language (SML) to members of the OGF's OGSA working group. The high-level view that I took away is that SML is a modelling effort based purely on XML (particularly XML Schema), rather than an initiative that maps CIM or UML into XML. The advantage of this is a cleaner rendering; for example, it doesn't have to translate the CIM or UML use of inheritance (which XML Schema doesn't directly support). In practice, models can be rather large, so SML allows models to be split across multiple documents while still enabling validation against the full model; it also uses Schematron to express constraints. A related activity, CML, is producing some core models in SML. SML is backed by BEA, BMC, CA, Cisco, Dell, EMC, HP, IBM, Intel, Microsoft, and Sun, so it certainly has industry support. They intend to submit the specification to a standards body in a few months' time, although it will be a while after that before it is pu