One of the outcomes from the Orbital project that I’m part of is a set of new policies on the subject of research data management. Early on it was decided that these would – in the spirit of open research – be made available under an open licence along with the rest of our resources on the subject (such as training and support materials).
Being the technically minded folk that we are, we wanted to make sure that several of us could work on documentation at the same time without running the risk of overwriting each other’s changes. We also wanted a comprehensive versioning system to be in place from the moment we typed the first words, so that we could see every single change and who made it, something that we think is a big part of making a resource truly open. Finally, we wanted a mechanism which would allow other people indirectly connected to the project to propose changes. Given our history of using similar systems to manage code there was an obvious choice – the Git source control system.
Git is a system which primarily relies on tracking line-by-line changes, meaning that when we wrote stuff we’d want to use a file format which behaved on a line-by-line basis. This made compiled binary formats such as Microsoft Word or even PDF a bit unsuitable, since a small change could result in a huge set of changes spanning hundreds (or more) of lines. We also wanted to use an open standard which didn’t have prohibitive licence restrictions and which was simple enough to be read and understood by anybody with a basic text editor. There are quite a few standards out there which meet this requirement, but again based on past experience we’re using Markdown for our RDM Policy.
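To make the line-by-line point a bit more concrete, here is a minimal sketch using Python's standard difflib module. It is not how Git works internally, just an illustration of why a one-word fix in a plain text file shows up as a single changed line. The function name and the sample policy text are made up for the example.

```python
import difflib


def line_changes(old_text, new_text):
    """Return only the removed (-) and added (+) lines between two versions."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        lineterm="",
        n=0,  # no surrounding context lines, just the changes themselves
    ))
    # The first two entries are the '---' and '+++' file headers; skip them,
    # then keep only real additions and removals (not the '@@' hunk markers).
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical sample text, not taken from the real policy.
OLD = "# RDM Policy\n\nResearch data should be stored securly.\nData must be retained.\n"
NEW = "# RDM Policy\n\nResearch data should be stored securely.\nData must be retained.\n"

for change in line_changes(OLD, NEW):
    print(change)
```

A typo fix in a four-line Markdown file produces one removed line and one added line, which is exactly the sort of change that is easy to review and attribute. The same edit in a binary format gives a line-based tool nothing sensible to compare.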
Finally, inspired in no small way by the efforts of the Bundestag to convert their entire body of law to Git, we wanted to store policy on a platform which not only allowed community involvement but positively encouraged it. GitHub is the world’s largest repository of open development, covering every language under the sun and projects ranging from hardcore low-level programming through writing documentation to communal story writing. Even better, they provide free hosting space for open projects. We already had a University of Lincoln user kicking around from past work, so it was a logical place to stick our Git repository. If you’re interested you can take a look at what we’ve got.
What’s interesting about using open text-based standards to write policy, Git for managing revisions and GitHub as a storage provider is that we’ve inadvertently made it very easy for people to do things that they couldn’t do before.
As part of the Orbital project we’re interacting a lot with our Nucleus data store, and we decided that it would be good fun to build a decent library which we could include in all of our projects which need it. Here’s how we now talk to Nucleus:
We didn’t see much point in keeping this to ourselves though, so if you interact with Nucleus on a regular basis, use PHP, and just can’t be bothered building your own cURL requests and mucking about with complex error handling, you can just use our handy Composer library. The whole thing is open source, so please feel free to have a look and make changes.
Today we shipped the latest version (v3) of our Staff Directory. This is the most heavily used of the LNCD creations, underpinning public profiles for staff members as well as helping to build parts of the corporate website.
Today’s release brings a new look to profile pages, based on feedback from the past few months of operation. It also brings with it a brand new profile editor, replacing the editor on our Blogs platform. The new editor helps staff complete their profiles in a much more structured way, with more robust validation, meaning that when we re-expose the data to the world it’s significantly cleaner. This new data is responsible for the biggest change yet to how Directory works.
Previously the Directory gleaned all of its data from various sources around the University, then did some work to tidy it up as best we could before presenting it. Instead Directory now takes its information exclusively from our Nucleus data platform, and when people edit their profile the changes are written back into Nucleus. Coupled with the newly structured data this allows us to build a very powerful web of knowledge which gives us insights in previously impossible ways. This is our first step to making Nucleus more than just a collection of data gleaned from elsewhere, making it the only source of some information. Over the coming months we plan to move more stuff to having Nucleus as its primary source (specifically information around research projects, and supplementary Estates information used for geolocation).
I’m very pleased to announce that after quite a while being tinkered with, our brand new (and shiny) version of Data.Lincoln is ready to rock and provide you with open (for the most part) institutional data. To start with we’re really pleased to give you some of our estates data, our staff phone book and an institutional profile as fully open, linked data. We’re also making it easier to find the University’s published documents (unfortunately not yet available as raw data) such as our financial returns and not-yet-openly-licenced course data, as well as doing our best to get this data released in better formats.
We’re still working on cracking open more data, and have meetings lined up with various parts of the University to try and make this happen faster. In the meantime we’re working on building even more reliable data pathways from the source of the data through our Nucleus data platform and out onto Data.Lincoln.
For now, why not grab some data and have a play? Just remember to follow the licence restrictions where they exist.
Those with the (mis)fortune to have worked with me in the past will know that I have a fairly low tolerance for what I consider to be stupid decisions. This normally comes across as me being arrogant – something which is entirely understandable – and often causes people to think I’m sticking my nose in to bits of the business which don’t concern me or which I’m not an expert in – also entirely understandable.
However, recently I’ve been exposed to the products of some decisions which fall well within the realms of my expertise, specifically those regarding the procurement of new web-based services for the University. As somebody who works in this field all day every day, I’m not impressed.
At Lincoln we have not one but two teams of people (Online Services in ICT, and LNCD in CERD) who are paid for their expertise in web application development, usability, accessibility, optimisation, deployment and support. Yet there are departments (who will remain nameless) who still insist on buying websites which do very simple things – most likely at a significant expense – without bothering to ask either of these teams. A recent example (which will also remain nameless) completely ignored the Blogs platform, which would have addressed 90% of the site functionality without spending a single penny, and the other 10% with a simple plugin which would have been done in a week. Other instances have totally ignored the fact that we have a unified web design for things, which has been the product of over a year of constant tweaking and improvement. Some things don’t bother to use any of the University’s authentication options, instead choosing to ask people to register again, or worse to email somebody asking for an account.
The fix is simple. All that needs to happen is for somebody wanting a website to call somebody in OST or LNCD and ask advice before paperwork is signed or the process has gone too far down the path.
On Sunday 9th September our monitoring noticed that the Gateway site had become unavailable. Analysis provided by monitoring narrowed the problem down to the DNS not resolving correctly; our entire lncn.eu domain had been rendered inaccessible. The root cause of this was a DDoS attack against our DNS providers for the lncn.eu domain, PointHQ, causing their upstream provider to temporarily remove all their servers from the routing pool to mitigate damage.
What was affected?
PointHQ provides DNS services for the lncn.eu and lncd.org domains, meaning that any services using these domains were potentially unavailable. The most visible of these services are the lncn.eu address shortener, and the University’s Gateway service, although other services were affected.
As DNS caches expired any services using the affected domains were left unable to be resolved by end users, meaning that the services were inaccessible. Since gateway.lincoln.ac.uk implements a redirect to a lncn.eu subdomain, accessing Gateway using this domain was also affected.
What was done to fix the problem?
PointHQ were working to mitigate the problem throughout its duration, and DNS servers were restored later in the day. Essential records from the lncn.eu domain were also duplicated on the Rackspace Cloud DNS service, with the Rackspace DNS servers being added to the lncn.eu domain record to serve as a backup in the event of PointHQ becoming unavailable again later.
What is being done to stop this happening again?
The lncn.eu domain will retain at least one backup DNS server in its record, protecting essential services against a single failure.
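As a rough illustration of why a second DNS provider helps, here is a minimal sketch that simulates a lookup falling back from an unavailable nameserver to a working one. Everything in it is an assumption made for the example: the nameservers are simple stand-in functions rather than real DNS servers, the address is from the reserved documentation range, and real resolvers choose between nameservers in more sophisticated ways than trying them in order.

```python
class NameserverDown(Exception):
    """Raised by a simulated nameserver that cannot be reached."""


def make_nameserver(records, available=True):
    """Build a stand-in nameserver: a function mapping hostname -> address."""
    def lookup(hostname):
        if not available:
            raise NameserverDown(hostname)
        return records[hostname]
    return lookup


def resolve(hostname, nameservers):
    """Try each nameserver in turn and return the first successful answer."""
    for lookup in nameservers:
        try:
            return lookup(hostname)
        except NameserverDown:
            continue  # this provider is unreachable, move on to the next one
    raise NameserverDown(f"no nameserver could resolve {hostname}")


# Hypothetical record; 192.0.2.0/24 is reserved for documentation.
RECORDS = {"lncn.eu": "192.0.2.10"}

primary = make_nameserver(RECORDS, available=False)  # provider under attack
backup = make_nameserver(RECORDS)                    # duplicate records elsewhere

print(resolve("lncn.eu", [primary, backup]))
```

With only the primary listed, the same call would fail outright, which is the situation we found ourselves in on the day.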
If you’re a member of staff at the University you will soon be hearing loads more about the Directory, the planned replacement for the University’s phone search system and staff profiles.
Whilst the Directory itself is rather cool, how it’s been built is of somewhat more interest. First of all, it’s driven entirely by data from other sources. The Directory itself doesn’t store any data at all, save for a search index. This means that unlike the old staff profiles on the corporate website it helps to expose bad data where it exists — since we soft-launched the Directory we’ve been barraged by requests from people to ‘fix their profile’, when in fact the thing that needs ‘fixing’ often lies at a far higher level. In some cases it’s literally been a case of people having misspelt job titles in the University’s HR system for years, data which is now corrected. This whole cycle of exposing bad data and not attempting to automatically or manually patch it up at the Directory end helps lead the University to have better data as a whole, making lives easier and people happier.
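For the curious, the idea of keeping nothing but a search index can be sketched in a few lines. This is not the Directory's actual code. The record fields (id, name, job_title) and the sample people are invented for the example, and a real index would handle stemming, ranking and so on. The point is simply that the index maps words to record identifiers, while the data itself stays in the source systems.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercased word to the ids of the records that contain it."""
    index = defaultdict(set)
    for record in records:
        for field in ("name", "job_title"):
            for word in record.get(field, "").lower().split():
                index[word].add(record["id"])
    return index


def search(index, query):
    """Return the ids of records matching every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Invented sample records standing in for data pulled from source systems.
SOURCE_RECORDS = [
    {"id": 1, "name": "Ada Example", "job_title": "Web Developer"},
    {"id": 2, "name": "Bob Sample", "job_title": "Senior Developer"},
]

INDEX = build_index(SOURCE_RECORDS)
print(search(INDEX, "developer"))
print(search(INDEX, "senior developer"))
```

Because the index only points back at the source records, a misspelt job title can't be quietly patched here. It has to be fixed where it actually lives.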
Secondly, the Directory is a perfect example of why iterative development rocks. The very first version of the Directory arrived over a year ago, and since then has been improved to include semantic markup, a new look, faster searching, staff profiles, more data sources, open data output formats and more. Over the last couple of weeks as it’s started to be integrated with the corporate website it’s been subject to even more refining, fixing formatting, typos, incorrect data and more. These changes happen quickly – a new version is released with minor changes almost daily – and are driven almost exclusively by real users getting in touch and telling us what they think needs doing.
Doing things this way, harnessing data that already exists and letting people feed back as quickly as possible, leads to products and services which reach a usable state far faster, are a closer match to user requirements, and help to improve other systems which are connected or exist in the same data ecosystem.
Today I’ve been kicking around the ICT office with Alex, figuring out how to make Jenkins (our wonderful CI server) build and publish the latest version of the CWD with all the bells and whistles like compilation of CSS using LESS, minification, validation of code and so on. As part of this we managed to fix a couple of bits and pieces which had been bugging me for a while, namely the fact that GitHub commit notifications weren’t working properly (fixed by changing the repository URI in the configuration) and the fact that Campfire integration wasn’t working (fixed by hitting it repeatedly with a hammer).
This brought me to thinking about how our various things tie in together, so I set about charting a few of them up. After a while I realised the chart had basically expanded into a complete flowchart of the various tools and processes that hang together to keep the code flowing in a steady stream from my brain – via my fingers – into an actual deployment on the development server. Since it may be of interest to some of you, here’s a pretty picture:
The beauty of this is that the vast majority of the lines happen completely by themselves — I get to spend my days living in the small bubble of my local development server and dipping in and out of Pivotal Tracker to update stories. The rest is magically happening as I work, and the constant feedback through all our monitoring and planning systems (take a look at SplendidBacon for an epic high-level overview) means that the rest of the project team and any project clients can see what’s going on at any time.
The ever popular JISC sponsored Dev8D event is just 1 month away now. Dev8D is the largest event of its kind in the UK and it is your only opportunity this year to get 3 days of free training in essential skills and emerging technologies. This year the event is packed with great opportunities for learning new tech skills, discovering new technologies and collaborating on technical challenges with experts in the sector. At past events attendees have learnt new programming languages, experimented with new software platforms and established lasting communities. As always tickets for the event are free but availability is limited so book soon to avoid disappointment.
This year more than ever the goal of Dev8D is to tool up attendees with skills in essential and emerging technologies for the coming year in a way that is practical and fun. There will be sessions for:
Programming in Python
Git version control and source code management
Understanding and using Linked Data
Writing effective Moodle plug-ins
Making your web applications accessibility friendly
too many more to name…
Attendees will also get to interact with cool technologies like 3D printers and multi-touch tables and form their own sessions in our larger than ever unconference sessions. Bring your technical problems along and work them through with experts in the sector.
This year’s event runs from the 14th to the 16th of February and will be held at the University of London student union. Remember, tickets are free but limited, so sign up before you miss the opportunity.
You can see more details and register for the event on the Dev8D website at: http://dev8d.org/
Hope to see you there!