Open Data Hackday
DevCSI’s Open Data Hackday brought together web managers, developers and domain experts ahead of the Institutional Web Management Workshop (IWMW). The event was designed to provide an opportunity to discuss the issues and explore how developers and web managers could work together to exploit the potential of open data to support their institutions. Data formed a clear theme throughout IWMW itself, so those who attended the hackday had a head start.
The event began with a series of short presentations to introduce some of the concepts, techniques and problem spaces of open data.
When Not to Link your Linked Data: Open Data at the University of Southampton
Gutteridge kicked off the event with a crash course in the differences and overlaps between open data, RDF and linked data before giving us a whirlwind tour of some of the ways in which they are using open linked data at the University of Southampton.
His main case study focused on their work for the university catering department. Gutteridge and his team were keen that there should be no additional workload for those maintaining the source data by opening it up, so they took existing Excel spreadsheets used to record menus, product prices and stock information, tidied these up and made them available as Google spreadsheets. They created a script to take this data and use it to drive the catering department website at http://catering.southampton.ac.uk, making it the first RDF-powered catering site.
“All this transparency stuff is all very well and good, but all people want to know is where to get coffee”
The team added further value by creating a spreadsheet by hand to link data from a disabled access review to the coffee shop locations, so you can now see access information when you’re searching the catering site for a cafe on campus. This is a good example of when to link data, as it adds real value for relatively low investment by connecting data that is already available.
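The hand-built linking spreadsheet described above amounts to a join between two datasets on a shared identifier. The following is a minimal sketch of that idea only, not the Southampton team's actual script. All identifiers, names and field names here are hypothetical and chosen purely for illustration.

```python
# Hypothetical sample data: identifiers, names and fields are illustrative only.
cafes = {
    "cafe-1": {"name": "Example Cafe"},
    "cafe-2": {"name": "Sample Coffee Bar"},
}

access_review = {
    "B1": {"step_free_entrance": True, "accessible_toilet": True},
    "B2": {"step_free_entrance": False, "accessible_toilet": True},
}

# The hand-made linking table: one row per connection between the two datasets.
links = [
    ("cafe-1", "B1"),
    ("cafe-2", "B2"),
]


def link_access_info(cafes, access_review, links):
    """Return cafe records enriched with access information where a link exists."""
    # Start with a copy of every cafe so unlinked cafes are still listed.
    enriched = {cafe_id: dict(cafe, access=None) for cafe_id, cafe in cafes.items()}
    for cafe_id, building_id in links:
        if cafe_id in enriched and building_id in access_review:
            enriched[cafe_id]["access"] = access_review[building_id]
    return enriched


for cafe_id, record in link_access_info(cafes, access_review, links).items():
    print(cafe_id, record["name"], record["access"])
```

Keeping the links in their own table means neither source dataset has to change, which fits the aim of adding no extra workload for the people who maintain the source data.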
Gutteridge emphasised that you should not link data unless there is a return on the effort invested to create it. If you build a useful tool on top of the linked, open data, people can see the value. In this case, the catering department see the real benefit of a fully and automatically up-to-date website, which can help to drive sales and also reduce some of their administrative workload by enabling them to generate and print out menus in only a few clicks.
“You have to trick people into doing the right thing”
Gutteridge moved on to discuss the wider open data activity at the University of Southampton, including the types of data that they make available (explicitly formal information put out by the university or approved by the university) and the risks to the university’s reputation which had to be considered when judging how best to use the data and what services to offer website visitors.
Finally, Gutteridge cited wiki.openorg.ecs.soton.ac.uk, which he runs with Dave Challis and which provides some recommended patterns to help people get going. He noted the importance of using common patterns to link data so that it can be connected with other data sets in the future.
Gutteridge concluded by emphasising that you should not link data unless you can find a return on investment to justify the effort of doing so.
Defining the Problem Space Around Getting Institutional Estates Systems Talking to Other Campus IT
Rob Bristow, Greening ICT Manager
Bristow outlined the essential need for universities to save energy, money and carbon, and highlighted the potential for using data from the building management systems used by Estates departments to help realise these savings. He observed that Estates managers don’t have the expertise to expose this data and make use of it in other systems, whilst university developers often don’t know this data exists or how to talk effectively to Estates managers.
However, estates people are beginning to look for help with their IT to join up their systems and make use of the data to drive energy and carbon savings. To illustrate, Bristow described a problem at Leeds University’s data centre, where smart doors couldn’t be used because they didn’t connect to the building management system. He also highlighted comments from the Chair of AUDE, who observed that there is a deluge of data coming out of the many new meters being installed, but people don’t know what to do with that data.
Bristow went on to describe some current projects within this space, including the PAWS project (an open source PC power-down solution developed at the University of Aberystwyth), Kit-Catalogue (a swap shop for laboratory equipment developed at Loughborough University), the Heat & Light by Timetable project at Leeds Metropolitan and the CUSTOMER project at the University of Coventry, which is investigating how to reduce energy consumption at halls of residence by examining behaviour and feedback loops to help encourage change.
Bristow is looking to bring in the skills of local developers and open data domain experts to help university Estates managers to make fuller use of the data that’s available. He discussed plans for two future events with DevCSI – suggesting one “interfaces day” to facilitate talks between developers and estates people, followed by a hack day to produce some real solutions.
Tom Kirkham, University of Nottingham
Kirkham provided us with an introduction to the SALAMI project created by the CIEPD group. The aim of the project is to integrate data into an employability ecosystem to help fill the gaps created by recent cuts to careers advice services.
Kirkham outlined some of the data sources that are used in SALAMI and provided a practical demonstration of the interface by searching for a particular occupation against a specific location. The system returned job descriptions, iCould video interviews about what you can achieve through training for that occupation, a map of local colleges providing relevant training, lists of course titles, job trends in the local area (such as vacancies and “job claimants” who are trained to do a job but are claiming benefits), and crime statistics.
Kirkham noted that this has helped to open people’s minds to open data, whilst also showing that where funding has been lost there are options to fill the gaps for students.
“The more open data there is, the more innovation there is”
Looking forward to the future of this project, Kirkham observed that they have a lot of student data in ePortfolios, which they want to use to help students create resumes for employers to search to find placement students. By collecting data about how the employers are searching they hope to recycle this information so students can see how employers search and consider their CVs.
Metrics for the Social Web
Brian Kelly, UKOLN, University of Bath
Kelly looked at how open data can be used to collect data about the university from outside of the institution, including the use of social web tools to support institutional activities. He presented some of the ways you can currently visualise social web data, including tools such as Peerindex, which help you to make comparisons between the topic fingerprints of different institutional Twitter accounts.
Kelly described his efforts to carry out a manual proof-of-concept to collect data about use of Slideshare in order to prove the value of uploading resources, noting this identifies a need for a more efficient, streamlined way of doing such analysis. He asked if there are APIs available to help get this data and therefore enable better understanding and strategy development.
Kelly emphasised that he would like to avoid university marketing departments paying social media companies for social media analytics. He feels there is a role for local developers to add value to their institution by gathering, interpreting and visualising data about personal and institutional use of the social web.
There were several key debates which continued throughout the event, including a discussion about the value of opening up scientific data and rewarding people for producing good, open data upon which research can be based, rather than only rewarding people for producing a paper.
The group also discussed the importance of lobbying within the institution to open up more data, and the need to attach the idea of open data to things people already understand, such as RSS feeds, to help change the hearts and minds of decision makers. Chris Gutteridge argued for collecting quotes from happy managers and shared some of the comments he has already collected towards this aim.
KIS Data Sources
One of the main practical activities of the day was to consider the requirements of the new Key Information Sets (KIS) that the government are requiring universities to provide.
The group examined the requirements in detail and brainstormed potential data sources that could be used to satisfy them. They identified a number of issues and gaps in the data that is currently openly available and discussed mechanisms for encouraging specific organisations to release the necessary data.
Following on from this, one group worked on an interface design which could take KIS data and data from other sources to create a profile of each institution, enabling prospective students to compare universities and select their own weighting for different aspects of university life that interest them to create a more personalised search experience.
They identified eight priority areas which prospective students could rank on sliders to indicate their relative importance to them. These included: gender ratios, accommodation costs, sports facilities, local cultural context, social life, community engagement opportunities, environment and transport. The group identified potential data sources to score universities against these priorities and explored the types of algorithms that might be needed to calculate a single value from
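One simple way the slider weightings described above could be combined into a single value is a normalised weighted average. This is a sketch of one possible approach, not the algorithm the group settled on. It assumes each priority area has already been scored on a 0 to 1 scale, and the function name, area keys and sample numbers are hypothetical.

```python
def weighted_score(scores, weights):
    """Combine per-area scores (0 to 1) into one value using slider weights.

    Areas with a zero weight, or with no score for this university, are
    skipped. The remaining weights are renormalised so the result stays
    between 0 and 1. Returns None if nothing can be scored.
    """
    total = 0.0
    total_weight = 0.0
    for area, weight in weights.items():
        if weight <= 0 or area not in scores:
            continue
        total += weight * scores[area]
        total_weight += weight
    if total_weight == 0:
        return None
    return total / total_weight


# Hypothetical slider positions (0 to 10) chosen by one prospective student.
student_weights = {"accommodation costs": 10, "transport": 5, "social life": 0}

# Hypothetical pre-computed scores for one university (1.0 is best).
university_scores = {"accommodation costs": 0.8, "transport": 0.5, "social life": 0.2}

print(round(weighted_score(university_scores, student_weights), 2))
```

Renormalising by the weights actually used means a university is not penalised simply because data for one priority area is missing, although an interface would probably need to flag such gaps to the user.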
Accommodation Cost Heat Map
The main code produced by the hack day was written by Ben O’Steen (Cottage Labs) and Dave Challis (University of Southampton), who created an accommodation cost heat map using data from Zoopla. They queried Data.gov.uk to get a list of latitudes and longitudes for the UK HE institutions, which they passed to the Zoopla search API to find rental-only properties featuring the keyword “student” within a 15-mile radius of each institution. They calculated the price per person by dividing the rental cost by the number of bedrooms in the property, then converted this data to KML and fed it to Google Maps. Ben describes the project in more detail in his blog post, which includes links to the code.
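The last two steps of that pipeline, calculating the price per person and converting the results to KML, can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the code Ben and Dave wrote: the listing fields (`rent`, `bedrooms`, `lat`, `lon`) and the sample values are hypothetical, and the Zoopla and Data.gov.uk queries are left out entirely.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def price_per_person(rent, bedrooms):
    """Divide the rental cost by the number of bedrooms, as described above."""
    if bedrooms < 1:
        return None  # cannot divide a listing with no bedrooms recorded
    return rent / bedrooms


def listings_to_kml(listings):
    """Build a KML document with one Placemark per usable listing."""
    ET.register_namespace("", KML_NS)  # serialise KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for listing in listings:
        cost = price_per_person(listing["rent"], listing["bedrooms"])
        if cost is None:
            continue
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = f"{cost:.2f} per person"
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        coords = f"{listing['lon']},{listing['lat']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")


# Hypothetical listings standing in for search results near one institution.
sample_listings = [
    {"rent": 400, "bedrooms": 4, "lat": 50.9, "lon": -1.4},
    {"rent": 250, "bedrooms": 0, "lat": 50.9, "lon": -1.4},  # skipped: no bedrooms
]

print(listings_to_kml(sample_listings))
```

A real heat map would also need to style or aggregate the points by cost, which this sketch does not attempt.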
Open, linked data proved to be a key theme during the main IWMW workshop, particularly as web managers focused on how to add value to their services. Mike Nolan from Edge Hill University provided a report to the workshop about the activities of the hack day group, observing that: “Web teams may be able to provide assistance integrating information from estates to make campuses more efficient.” This emphasis on data integration highlighted the value of developers for web managers and the strategic innovation that local developers are able to contribute.