Small fire, no-one dead

As we attempt to provide round-the-clock IT services without employing out-of-hours staff, our worst nightmare for an "incident" is one that affects the whole machine room and starts after everyone has left work on a Friday evening, leaving the whole weekend before normal work resumes on Monday morning.  It would be even worse if this were to occur just before the start of a new academic year.

So guess what happened last weekend?

One of the power supplies in our main machine room caught fire as a result of an electrical fault.  Although the fire was quickly contained, the emergency services shut off all power to the machine room as a precaution.  Several of our major services became unavailable and staff had to be called in over the weekend to fix everything.

The good news was that those services that are designed to fail over automatically to the backup machine room did so. Also, the support team had all our top-priority services back up by six o'clock on the Saturday.  Our disaster recovery plan aims to have them restored within 24 hours, so this was a good result.  (Technically, this wasn't a "disaster" in the terms of our DR plan because we were able to use the main site once the power was restored, but it seems to me that the result still stands.)

Even though the overall result was not too shabby, our support teams will learn a lot from this experience.  I'll be interested to see the results of the post-event analysis.
