
The UoE Research Data Initiative

I've been asked about the University's Research Data Initiative and what effect it will have on the work of Applications Division. I'm not actually involved with the initiative but I do know some of the background, so in this post I will attempt to explain briefly what research data is about in general and why it is important, then give a few links about the University's initiative.

The root of all this is that computers and networking are introducing new ways for academics to undertake and share their research.  Sensors, microscopes, telescopes, particle accelerators, satellites, social media, audio and video can all produce vast amounts of data that can be of interest to researchers. We can store, search and analyse petabytes of data, and make data available for others to share.

For example, astronomers are collecting "sky surveys" - images of the entire sky, in various wavelengths (radio, X-ray, visual, infra-red, ...), matching the objects between these surveys, and scanning the data for interesting anomalies. Particle physicists are collecting vast amounts of data from the Large Hadron Collider and associated simulations of particle interactions. Digital microscopes, MRI scanners, and similar devices create detailed (and therefore large) image files. Satellites and ocean sensors collect climate data. Biologists are mapping genes of many species, comparing the genetic codes of individuals to understand genetic variation, and modelling how proteins created from genes map to the phenotype. Actual maps are used in geosciences and in many social sciences. Social sciences also share census information and less formal data, in some cases from Twitter feeds or other social media. The expressive arts are creating digital copies of important artworks. Archaeologists are creating virtual models of interesting sites. The performing arts can store videos of performances. And so on.

This has been described as a "fourth paradigm" for science. First there was experimental science, then theoretical science. These were followed by computational science - the use of complex models running on parallel multiprocessors, as done by EPCC at Edinburgh for example. The fourth paradigm, if you believe the story, is data-intensive science. This was a major aspect of the UK e-Science initiative.

http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_jim_gray_transcript.pdf

http://en.wikipedia.org/wiki/E-Science


The UK research councils, which fund much of the university research in the UK, are now requiring academics to keep and make available their research data. This makes it easier for other academics to verify research and to build on previous work. Arguably, the data will become as important as the published papers.

This requires infrastructure, from the actual disks through to catalogues and indexes for describing and searching the data. Our University is well-placed in this regard; we are one of the co-founders of the Digital Curation Centre, which advises academics on how to manage data, and we have the experience from the Digital Library team.

The University's Research Data Initiative will give every researcher in the University access to half a terabyte of file storage and will advise them on how to structure, describe and index that data. The IS Infrastructure division are taking the lead on the storage aspects and the Library are leading on the metadata side. You can read more information at the following links.

http://www.itc.isg.ed.ac.uk/docs/open/Paper_A_business_case_RDS_RDM_Feb2012_penultimate.pdf

http://datablog.is.ed.ac.uk/2013/05/21/background-to-the-research-data-blog/

As this work is being led by other divisions within IS, I expect that it won't have an immediate effect on Applications Division. At a guess, we might become involved if links to research data become part of the University's REF submissions or of an individual researcher's profile, in which case Pure would need to process and display this information, but this is conjecture. We might also be involved if particular research projects require advice or support for databases. Although much research data is stored in files, the metadata is typically held in databases and we could help researchers set these up.
