Week Beginning 9th November 2015

This week my time was mostly split across three projects. Firstly, I returned to finish off some work for the Thesaurus of Old English. When I redeveloped the website a couple of months ago I had been asked to create a content management system that would allow staff to edit categories and words, but due to other work commitments I hadn’t got round to implementing it. This week I decided that the time had come. I had initially been planning on using the WordPress-based thesaurus management system that I had created for the Scots Thesaurus project, but I realised that this was a bit unnecessary for the task in hand. The WordPress-based system is configured to manage every aspect of a thesaurus website – not just adding and editing categories and words but also the front end, the search facilities, handling users and user-submitted content and more. TOE already has a front end and doesn’t need WordPress to manage all of these aspects. Instead I decided to take the approach I’d previously taken with the Mapping Metaphor project: have a very simple script that displays an edit form for a category and processes any updates (with user authentication, of course). It took about a day to get this set up and tested for the TOE data. The resulting script allows all the thesaurus category information (e.g. the heading, part of speech and number) to be edited and category cross references to be added, edited and deleted. Associated lexemes can also be added, edited and deleted, and all the lexeme data, including the associated search terms, can be updated. I also updated the database so that whenever information is deleted it’s not really deleted but moved to a different ‘deleted’ table.
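To illustrate the 'deleted' table idea, here is a minimal sketch in Python using SQLite. It is not the actual TOE script: the table names (lexeme, lexeme_deleted), the column names and the sample words are all hypothetical, chosen only to show a row being copied to a 'deleted' table and removed from the live one within a single transaction.

```python
import sqlite3


def make_thesaurus_demo_db():
    """Build an in-memory database with hypothetical live and 'deleted' tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE lexeme (id INTEGER PRIMARY KEY, category_id INTEGER, word TEXT);
        CREATE TABLE lexeme_deleted (id INTEGER PRIMARY KEY, category_id INTEGER, word TEXT);
        INSERT INTO lexeme (id, category_id, word) VALUES (1, 10, 'eorþe'), (2, 10, 'folde');
        """
    )
    return conn


def soft_delete_lexeme(conn, lexeme_id):
    """Move a lexeme row to the 'deleted' table instead of destroying it.

    Returns True if a row was moved, False if no such lexeme exists.
    """
    row = conn.execute(
        "SELECT id, category_id, word FROM lexeme WHERE id = ?", (lexeme_id,)
    ).fetchone()
    if row is None:
        return False
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO lexeme_deleted (id, category_id, word) VALUES (?, ?, ?)", row
        )
        conn.execute("DELETE FROM lexeme WHERE id = ?", (lexeme_id,))
    return True
```

Doing both statements in one transaction means a failure part-way through cannot leave a row that is either duplicated in both tables or missing from both.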

My second project of the week was Mapping Metaphor. Last week I had begun to update the advanced search and the quick search to enable searches for the sample lexemes. This week I updated the Old English version of the site to also include these facilities. This wasn’t as straightforward as copying the code across as the OE data has some differences to the main data – for example there are no dates or ‘first lexemes’. This meant updating the code I’d written for the main site to take this into consideration. I also had to ensure that the buttons for adding ashes and thorns worked with the new sample lexeme search box. With all this implemented and then tested by Wendy and Ellen I made the new versions live and they are now available through the Mapping Metaphor Website.

My third major project of the week was the Hansard visualisations for the Samuels project. My first big task was to finish off the ‘limit by member’ feature. Last week I had created the user interface components for this, but the database query running behind it just wasn’t working. A bit of further investigation this week uncovered some problems with the way in which the SQL queries were being dynamically generated and I managed to fix these, and also to add some additional indices to the tables to speed up data retrieval. I also ensured that returned data was cached in another table, which greatly improves the speed of subsequent queries for the same member. The limit by member feature is now working rather well, although there are still some improvements that I need to make to the user interface. We had an XML file containing more information about members from the ‘Digging into Linked Parliamentary Data’ project. This included information on members’ party affiliations and also their gender, both of which will be very useful to limit the display of thematic headings by. I managed to extract party information from the XML file and have uploaded it to our Hansard database now, associating it with members (and through members to speeches and frequencies). Some people have multiple parties and I managed to get them all out too, including dates for these where available. We have 9704 party affiliations for the 9575 members. I’ve also extracted all of the parties – there are 54 of these, which is more than I was expecting. This data will mean that it will eventually be possible to select a party and see the frequency data for that party.
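The caching of member query results in another table can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the code behind the Hansard visualisations: the table names (speech_frequency, member_cache), the columns, the JSON payload format and the sample rows are all assumptions made for the example.

```python
import json
import sqlite3


def make_hansard_demo_db():
    """Build an in-memory database with a hypothetical frequency table and cache table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE speech_frequency (member_id INTEGER, heading TEXT, frequency INTEGER);
        CREATE TABLE member_cache (member_id INTEGER PRIMARY KEY, payload TEXT);
        INSERT INTO speech_frequency VALUES
            (1, 'Farming', 3), (1, 'Farming', 2), (1, 'War', 4), (2, 'War', 7);
        """
    )
    return conn


def frequencies_for_member(conn, member_id):
    """Return [heading, total] pairs for a member, using the cache table when possible."""
    cached = conn.execute(
        "SELECT payload FROM member_cache WHERE member_id = ?", (member_id,)
    ).fetchone()
    if cached is not None:
        # Cache hit: skip the aggregate query entirely.
        return json.loads(cached[0])

    # Cache miss: run the (potentially slow) aggregate query once.
    rows = conn.execute(
        "SELECT heading, SUM(frequency) FROM speech_frequency "
        "WHERE member_id = ? GROUP BY heading ORDER BY heading",
        (member_id,),
    ).fetchall()
    result = [[heading, total] for heading, total in rows]

    with conn:  # store the result so later requests for this member are fast
        conn.execute(
            "INSERT INTO member_cache (member_id, payload) VALUES (?, ?)",
            (member_id, json.dumps(result)),
        )
    return result
```

One consequence of this design is that a cached result goes stale if the underlying frequency data changes, so the cache table would need to be cleared whenever new data is imported.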

I also took the opportunity to add the gender data to our member database as well, as I thought a search for gender might interest people (although we’ll definitely need to normalise this due to the massive gender imbalance and even then it might not be considered advisable to compare thematic heading use by gender – we’ll need to see). I had a bit of trouble with the import of the gender data as there are two ID fields in the people database – ‘ID’ and ‘import_ID’. I initially used the first one but spotted something was wrong when it told me that Paddy Ashdown was a woman! All is fixed now, though, and I’ll try to update the visualisation to include limit options for party and gender next week.
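As a rough idea of what normalising might involve, the sketch below divides each group's raw frequency by the number of members in that group, giving a per-member rate. This is just one possible approach and not a decision about how the visualisation will work; the function name, the group labels and the numbers in the usage note are all made up for illustration.

```python
def per_member_rate(frequency_by_group, members_by_group):
    """Divide each group's raw frequency by its member count.

    Groups with no members (or missing from members_by_group) are skipped
    to avoid dividing by zero.
    """
    return {
        group: frequency / members_by_group[group]
        for group, frequency in frequency_by_group.items()
        if members_by_group.get(group)
    }
```

With hypothetical figures of 50 mentions from 10 members in one group and 900 mentions from 300 members in another, the raw totals differ by a factor of 18 but the per-member rates come out at 5.0 and 3.0 respectively, which is the sort of distortion normalising is meant to remove.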

Also this week I had a catch-up meeting with Marc where we discussed the various projects I’m involved with and where things are headed. As always, it was a very useful meeting. I also had a couple of other university related tasks that I had to take care of this week that I can’t really go into too much detail about here. That’s all for this week.