Dev8D: Data.ac.uk panel, Joss Winn
Joss Winn, senior lecturer at the University of Lincoln (lncd.lincoln.ac.uk), chaired a panel session about data.ac.uk, linked data and RDF. Read a short interview with Joss, which introduces the topic, followed by a summary of the panel discussion.
It’s my third time at Dev8D, it’s my favourite conference of the year and I enjoy the practical focus of the event and also the collaborative skill-sharing aspect. And the informality. A lot of conferences are show and tell – you go and deliver a paper about a project and that’s that. You don’t tend to get that at Dev8D, you get people sitting down to solve practical problems.
What was your panel session about?
It was about sector-wide work or initiatives around publishing open data from different universities, mainly with a focus on institutional data rather than research data – data about buildings, about local amenities, about energy use. The stuff that institutions produce as part of their function rather than research activity itself.
Did anything from the session surprise you?
The lack of examples from the audience. Unless people were keeping quiet about their work, we seemed to have a panel that represented much of the activity. I know there were people from York and the OU that could not be here but I was hoping to find more pockets of activity. I think the data.ac.uk discussion we had is key to drawing out the work that is ongoing around producing open data in different institutions. A panel discussion is not representative of all the work that is happening. The best thing that could happen is a follow-on event or day-long meeting around data.ac.uk which will give everyone’s projects a presence.
What do you hope people took away from the session?
I hope they took away examples of what is happening and they would have heard about the technology that is being used and the licenses. But most of all, raising the profile of whether an aggregator discovery hub like data.ac.uk is worthwhile.
What do you think you have taken away from it?
I’m going to try to get this meeting for data.ac.uk set up. Chris Gutteridge has been the voice and done a lot of the groundwork and I think he needs support in terms of setting up an event. I think I will go away and put energy into that.
THE PANEL DISCUSSION
On the panel were:
Alex Dutton, University of Oxford
Chris Gutteridge, University of Southampton
Wilbert Kraan, University of Bolton
Damian Steer, University of Bristol
Adrian Stevenson, University of Manchester
Jo Walsh, University of Edinburgh
Chris: we have a strong commitment to publishing open data about the organisation. Because we’re not a project it doesn’t run out but it makes getting certain costs covered more difficult. There is a clear divide between research data and data about our organisation. Need to be discussed as separate things. Both important but need to be managed separately. Some things we do are not exclusive to universities, they can be picked up elsewhere eg our buildings app is being used by a university in China.
Adrian: we’ve been putting out linked data for archives and two services, Mimas and Copac. The data in there is about what data universities have.
Wilbert: we’ve only just started work on this and we’re looking purely at executional data and it is mainly closed. It’s mostly about data integration within the institution itself, trying to remove silos through linked data. The institution is the primary consumer. It might be useful for business intelligence purposes such as looking at retention, and we would never publish that. Also looking at the various different dimensions of a course – who is teaching it, in which rooms etc.
Alex: we are also using it for data within the university so there is a course data project and data looking at research facilities.
Jo: our effort has stalled after a good start last summer. It was mostly from the outside in, using scraping. We had positive feedback from estates and building who were interested in it for energy data, occupancy etc and keen on open licensing as a way to cut through territorial disputes. But we have had some blockages around the open licenses.
Joss: at Lincoln we have a range of open data. We have hardly touched on producing RDF linked data. We call it open data because it is available through APIs. We are slowly converting it to RDF linked data.
Paul Walk: I was involved in something similar to Wilbert’s experience years ago – it was not open and not RDF – and it reached the point where we did business intelligence-type functions on it and could correlate movements of students through security systems and attendance at lectures and drop out rates. We found we could correlate lack of attendance of lecturers with drop out rates. This kind of work has ethical dimensions.
What about a single data.ac.uk?
Chris: the real question is, if we have a hub site for data projects around the country in academia, how much it should and shouldn’t do. My personal take is that, in the short term, it should be a discovery mechanism for the top level sources of data in an organisation. These should then allow discovery of other sources of information. You don’t want to aggregate every bit of data and metadata. Without a lot of funding I wouldn’t want to take on more than that. Publishing a list of data every day with a pretty interface on top is something we could do now eg of university buildings and transport. Or research data.
Paul Walk: what’s the driver for any of this to happen? To compare it to open government, there were drivers there around transparency etc. The equivalent in HE is probably going to come from key information sets. Will that really drive it?
Chris: it is not about not discovering datasets but more about how to discover them. Initially it’s just a mechanism for discovering.
Wilbert: could we do it iteratively and try with datahub.org – if you want to open it up, that might be a good place to park it anyway. If that doesn’t meet needs could try something else. Other thing to bear in mind with institutional data in particular is that the whole area of the HE information landscape is fraught and there’s a land grab going on at the moment. Have to be sensitive about who wants what data.
Paul Walk: the success as such will come from there being a clear mandate.
Chris: need to look at return on investment. Web pages are used and really useful but not many people are using the data because the number of people who have the skills is small. But if you can create an app that works across different institutions then it’s worth them publishing their data like this because they can then use the app. There are a lot of areas where we benefit from opening our data.
Damian: Wikipedia is going to have data components where institutions can feed in data, there are search engine implications. It’s like back when we started creating webpages for universities and there was a lack of understanding about why it was needed but now they are essential.
Paul Walk: yes, I can see the harried head of a university might see the marketing implications.
Chris: need to look at benefits – to our members without restriction but also may be of benefit to others outside the university.
Paul Walk: we need one or two examples of success that show the return on a few investments (does not need to be all of them). Our institutions are going to be forced to compete more and more and some of the information will be deemed to be commercially sensitive. We might see the closing up of some of this data so we need to be able to make good arguments if we’re to see a flourishing data.ac.uk ecosystem.