
The UoE Research Data Initiative

I've been asked about the University's Research Data Initiative and what effect it will have on the work of Applications Division. I'm not actually involved with the initiative but I do know some of the background, so in this post I will attempt to explain briefly what research data is about in general and why it is important, then give a few links about the University's initiative.

The root of all this is that computers and networking are introducing new ways for academics to undertake and share their research. Sensors, microscopes, telescopes, particle accelerators, satellites, social media, audio and video can all produce vast amounts of data that can be of interest to researchers. We can store, search and analyse petabytes of data, and make data available for others to share.

For example, astronomers are collecting "sky surveys" - images of the entire sky, in various wavelengths (radio, X-ray, visual, infra-red, ...), matching the objects between these surveys, and scanning the data for interesting anomalies. Particle physicists are collecting vast amounts of data from the Large Hadron Collider and associated simulations of particle interactions. Digital microscopes, MRI scanners, and similar devices create detailed (and therefore large) image files. Satellites and ocean sensors collect climate data. Biologists are mapping genes of many species, comparing the genetic codes of individuals to understand genetic variation, and modelling how proteins created from genes map to the phenotype. Actual maps are used in geosciences and in many social sciences. Social sciences also share census information and less formal data, in some cases from Twitter feeds or other social media. The expressive arts are creating digital copies of important artworks. Archaeologists are creating virtual models of interesting sites. The performing arts can store videos of performances. And so on.

This has been described as a "fourth paradigm" for science. First there was experimental science, then theoretical science. These were followed by computational science - the use of complex models running on parallel multiprocessors, as done by EPCC at Edinburgh for example. The fourth paradigm, if you believe the story, is data-intensive science. This was a major aspect of the UK e-Science initiative.

The UK research councils, who fund much of the university research in the UK, are now requiring academics to keep their research data and make it available. This makes it easier for other academics to verify research and to build on previous work. Arguably, the data will become as important as the published papers.

This requires infrastructure, from the actual disks through to catalogues and indexes for describing and searching the data. Our University is well-placed in this regard; we are one of the co-founders of the Digital Curation Centre, who advise academics on how to manage data, and we have the experience from the Digital Library team.

The University's Research Data Initiative will give every researcher in the University access to half a terabyte of file storage and will advise them on how to structure, describe and index that data. The IS Infrastructure division are taking the lead on the storage aspects and the Library are leading on the metadata side. You can read more information at the following links.

As this work is being led by other divisions within IS, I expect that it won't have an immediate effect on Applications Division. At a guess, one area where we might become involved is if links to research data become part of the University's REF submissions or of an individual researcher's profile, in which case Pure would need to process and display this information, but this is conjecture. We might also be involved if particular research projects require advice or support for databases. Although much research data is stored in files, the metadata is typically held in databases and we could help researchers set these up.

