Week Beginning 3rd August 2015

The ISAS (International Society of Anglo-Saxonists) conference took place this week and two projects I have been working on over the past few weeks were launched at this event. The first was A Thesaurus of Old English (http://oldenglishthesaurus.arts.gla.ac.uk/), which went live on Monday. As is usual with these things, there were some last-minute changes and additions that needed to be made, but overall the launch went very smoothly and I’m particularly pleased with how the ‘search for word in other online resources’ feature works.

The second project that launched was the Old English Metaphor Map (http://mappingmetaphor.arts.gla.ac.uk/old-english/). We were due to launch this on Thursday, but because of illness the launch was brought forward to Tuesday instead. Thankfully I had completed everything that needed sorting out before Tuesday, so making the resource live was a very straightforward process. I think the map is looking pretty good and it complements the main site nicely.

With these two projects out of the way I had to spend about a day this week on AHRC duties, but once all that was done I could breathe a bit of a sigh of relief and get on with some other projects that I haven’t been able to devote much time to recently due to other commitments. The first of these was Gavin Miller’s Science Fiction and the Medical Humanities project. I’m developing a WordPress-based tool for his project to manage a database of sources and this week I continued adding functionality to this tool as follows:

  1. I removed the error messages that were appearing when there weren’t any errors.
  2. I’ve replaced ‘publisher’ with a new entity named ‘organisation’.  This allows the connection the organisation has with the item (e.g. Publisher, Studio) to be selected in the same way as the connections that places and people have with items.
  3. I’ve updated the way in which these connections are pulled out of the database to make it much easier to add new connection types.  After adding a new connection type to the database this then immediately appears as a selectable option in all relevant places in the system.
  4. I’ve updated the underlying database so that data can have an ‘active’ or ‘deleted’ state, which will allow entities like people and places to be ‘deleted’ via WordPress but still retained in the underlying database in case they need to be reinstated.
  5. I’ve begun work on the pages that will allow the management of types and mediums, themes, people, places and organisations.  Currently there are new menu items that provide options to list these data types.  The lists also include counts of the number of bibliographic items each row is connected to.  The next step will be to add in facilities to allow admin users to edit, delete and create types, mediums, themes, people, places and organisations.
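
The ‘active’ / ‘deleted’ state from point 4 and the list counts from point 5 can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the actual WordPress code: the table names (`person`, `item_person`), the `status` column and all function names are hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical 'person' table and its link table to items."""
    conn.executescript("""
        CREATE TABLE person (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'active'  -- 'active' or 'deleted'
        );
        CREATE TABLE item_person (
            item_id   INTEGER NOT NULL,
            person_id INTEGER NOT NULL
        );
    """)


def soft_delete_person(conn, person_id):
    """Hide a person from the lists without removing the row."""
    conn.execute("UPDATE person SET status = 'deleted' WHERE id = ?", (person_id,))


def restore_person(conn, person_id):
    """Reinstate a previously 'deleted' person."""
    conn.execute("UPDATE person SET status = 'active' WHERE id = ?", (person_id,))


def list_people(conn):
    """Return (id, name, item_count) for every active person, ordered by name."""
    return conn.execute("""
        SELECT p.id, p.name, COUNT(ip.item_id)
        FROM person p
        LEFT JOIN item_person ip ON ip.person_id = p.id
        WHERE p.status = 'active'
        GROUP BY p.id, p.name
        ORDER BY p.name
    """).fetchall()
```

The `LEFT JOIN` means that people with no connected items still appear in the list, with a count of zero.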

The next project I worked on was the Scots Thesaurus project. Magda had emailed me stating she was having problems uploading words via CSV files and also assigning category numbers. I met with Magda on Thursday to discuss these issues and to try to figure out what was going wrong. The CSV issue was being caused by the CSV files created by Excel on Magda’s PC being given a rather unexpected MIME type. The upload script was checking the uploaded file for specific CSV MIME types but Excel was giving them a MIME type of ‘application/vnd.ms-excel’. I have no idea why this was happening, and even more strangely, when Magda emailed me one of her files and I uploaded it on my PC (without re-saving the file) it uploaded fine. I didn’t really get to the bottom of this problem, but instead I simply fixed it by allowing files of MIME type ‘application/vnd.ms-excel’ to be accepted.
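
The fix amounts to widening the list of accepted MIME types. A minimal sketch of such a check is given below in Python rather than the language of the real upload script. The function name and the set of types other than ‘application/vnd.ms-excel’ are assumptions, since the post does not say which CSV MIME types the script originally accepted.

```python
# Hypothetical allow-list: only 'application/vnd.ms-excel' is taken from the
# post; the other entries are assumed examples of "specific CSV MIME types".
ALLOWED_CSV_MIME_TYPES = {
    "text/csv",
    "text/plain",
    "application/csv",
    "application/vnd.ms-excel",  # the type reported for the files from Magda's PC
}


def is_acceptable_csv_upload(filename, reported_mime_type):
    """Accept the upload only if it has a .csv extension and an allowed MIME type."""
    has_csv_extension = filename.lower().endswith(".csv")
    mime_allowed = reported_mime_type.strip().lower() in ALLOWED_CSV_MIME_TYPES
    return has_csv_extension and mime_allowed
```

Keeping the extension check alongside the MIME check means that accepting the Excel MIME type does not also let genuine .xls files through. One possible explanation for the differing behaviour between PCs is that the MIME type of an upload is generally reported by the uploading machine rather than read from the file itself, but this was not confirmed here.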

The issue with certain category numbers not saving was being caused by deleted rows in the system. When creating a new category the system checks to see if there is already a row with the supplied number and part of speech in the system. If there is then the upload fails. However, the check wasn’t taking into consideration categories that had been deleted from within WordPress. These rows were being marked as ‘trash’ in WordPress but still existed in our non-WordPress ‘category’ table. I updated the check to join the category table to WordPress’s posts table and check the status of the category there. Now if a category number exists but it’s associated with a WordPress post that is marked as ‘trash’ then the upload of a new row can proceed without any problems.
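
As an illustration, the updated check might look something like the query below, sketched in Python against SQLite. The `wp_posts` table with its `ID` and `post_status` columns follows the standard WordPress schema with the default prefix, but the layout of the ‘category’ table (`post_id`, `number`, `pos`) and the function names are assumptions.

```python
import sqlite3


def create_demo_schema(conn):
    """Create cut-down stand-ins for wp_posts and the custom category table."""
    conn.executescript("""
        CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_status TEXT);
        CREATE TABLE category (
            id      INTEGER PRIMARY KEY,
            post_id INTEGER,   -- assumed link to wp_posts.ID
            number  TEXT,      -- category number
            pos     TEXT       -- part of speech
        );
    """)


def category_number_in_use(conn, number, pos):
    """True if a category with this number and part of speech exists
    and its WordPress post has not been moved to the trash."""
    row = conn.execute("""
        SELECT COUNT(*)
        FROM category c
        JOIN wp_posts p ON p.ID = c.post_id
        WHERE c.number = ?
          AND c.pos = ?
          AND p.post_status <> 'trash'
    """, (number, pos)).fetchone()
    return row[0] > 0
```

With this check a trashed category no longer blocks the upload of a new row that reuses its number and part of speech.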

In addition to fixing these issues I also continued working on the visualisations for the Scots Thesaurus. Magda will be presenting the thesaurus at a conference next week and she was hoping to be able to show some visualisations of the weather data. We had previously agreed at a meeting with Susan that I would continue to work on the static visualisation I had made for the ‘Golf’ data using the d3.js ‘node-link tree’ diagram type (see http://bl.ocks.org/mbostock/4063550). I would make this ‘dynamic’ (i.e. it would work with any data passed to it from the database and it would be possible to update the central node). Eventually we may choose a completely different visualisation approach but this is the one we will focus on for now. I spent some time adapting my ‘Golf’ visualisation to work with any thesaurus data passed to it: simply give it a category ID and a part of speech and the thesaurus structure (including subcategories) from this point downwards gets displayed. There’s still a lot of work to do on this (e.g. integrating it within WordPress) but I’m happy with the progress I’m making with it.
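
The d3.js tree layout works from nested objects with ‘children’ arrays, so the server-side part of this comes down to turning flat category rows into that shape from a chosen starting point. The sketch below shows the general idea in Python; it is not the project’s actual code. The row format (id, parent id, name) and the function name are hypothetical, and for brevity it selects the starting node by ID only, leaving out the part of speech.

```python
import json
from collections import defaultdict


def build_tree(rows, root_id):
    """Build a nested {'name': ..., 'children': [...]} structure from flat
    (category_id, parent_id, name) rows, starting at root_id and working down."""
    names = {}
    children_of = defaultdict(list)
    for category_id, parent_id, name in rows:
        names[category_id] = name
        children_of[parent_id].append(category_id)

    def make_node(category_id):
        node = {"name": names[category_id]}
        child_nodes = [make_node(child) for child in children_of.get(category_id, [])]
        if child_nodes:  # leaf nodes carry no 'children' key
            node["children"] = child_nodes
        return node

    return make_node(root_id)


if __name__ == "__main__":
    # Illustrative rows only, not real thesaurus data.
    demo_rows = [(1, None, "Top"), (2, 1, "Middle"), (3, 2, "Leaf")]
    print(json.dumps(build_tree(demo_rows, 2), indent=2))
```

Changing the central node is then just a matter of calling the same function with a different starting ID and redrawing.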

The last project I worked on this week was the SAMUELS Hansard data, or more specifically trying to get Bookworm set up on the test server I have access to. Previously I had managed to get the underlying database working and the test data (US Congress) installed. I had then installed the Bookworm API but I was having difficulty getting Python scripts to execute. I’m happy to report that I got to the bottom of this. After reading this post (https://www.linux.com/community/blogs/129-servers/757148-configuring-apache2-to-run-python-scripts) I realised that I had not enabled the CGI module of Apache, so even though the cgi-bin directory was now web accessible nothing was getting executed there. The second thing I realised was that I’d installed the API in a subdirectory within cgi-bin and I needed to add privileges in the Apache configuration file for this subdirectory as well as the parent directory. With that out of the way I could query the API from a web browser, which was quite a relief.
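
A quick way to confirm that Apache is really executing Python in a cgi-bin subdirectory, independently of Bookworm, is to drop in a trivial test script such as the hypothetical one below. It is not part of Bookworm, and it assumes the file has been made executable and that a `python3` interpreter is on the server’s path.

```python
#!/usr/bin/env python3
"""Minimal CGI test script: if Apache executes it, the browser shows the message;
if CGI is not enabled, the browser shows or downloads this source instead."""
import sys


def build_response(body):
    """Return a complete CGI response: a header line, a blank line, then the body."""
    return "Content-Type: text/plain\r\n\r\n" + body + "\n"


if __name__ == "__main__":
    sys.stdout.write(build_response("CGI execution is working"))
```

If the script’s source code comes back instead of the message, the CGI module or the directory permissions are the likely culprits.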

After this I installed the Bookworm GUI code, which connects to the API in order to retrieve data from the database. I still haven’t managed to get this working successfully. The page surroundings load but the connection to the data just isn’t working. One reason for this was that I’d installed the API in a subdirectory of the cgi-bin, but even after updating every place in the JavaScript where the API is called I was still getting errors. The AJAX call is definitely connecting to the API as I’m getting a bunch of Python errors returned instead of data. I’ll need to investigate this further next week.

Also this week I had a meeting with Gary Thomas about Jennifer Smith’s Syntactic Atlas of Scots project. Gary is the RA on the project and we met on Thursday to discuss how we should get the technical aspects of the project off the ground. It was a really useful meeting and we already have some ideas about how things will be managed. We’re not going to get started on this until next month, though, due to the availability of the project staff.