Week Beginning 8th March 2021

It was another Data Management Plan heavy week this week.  I created an initial version of a DMP for Kirsteen McCue’s project at the start of the week and then participated in a Zoom call with Kirsteen and other members of the proposed team on Thursday where the plan was discussed.  I also continued to think through the technical aspects of the metaphor-related proposal involving Wendy and colleagues at Duncan Jordanstone College of Art and Design at Dundee and reviewed another DMP that Katherine Forsyth in Celtic had asked me to look at.

Other than issues I arranged for Joanna Kopaczyk’s ‘The Future of Scots’ project website to be moved to its top-level ‘ac.uk’ domain, and it can now be found here: https://scotslanguagepolicy.ac.uk/.  Marc Alexander had also contacted me about a weird bug he’d encountered in the Historical Thesaurus.  One of the category pages was failing to display properly and after investigation I figured out that it was an issue with the timeline data for one of the words on this page, which was causing the JavaScript to break.  I pulled out the JSON embedded in the page and the data for the word seemed to be missing a closing ‘}’, which was causing the error.  It turned out that someone had entered the dates the wrong way round for the word.  It was listed as ‘a1400-c1386’.  My dates system had plucked out the dates and correctly ordered them, but this meant the system was left with ‘1400’ with a joining ‘-‘ and then nothing after it, which resulted in the JSON being malformed.  I swapped the dates around (both in the new dates table and in the display date) and everything started working as it should again.  It was a relief to know that it was an issue with the data rather than my code.

Also this week I spent a bit of time working on the Books and Borrowing project, generating more page image tilesets and their corresponding pages for two more of the Edinburgh ledgers and adding an ‘Events’ page to the project website and giving more members of the project team permission to edit the site.  I also had an email chat with Thomas Clancy about the Iona project and created a ‘Call for Papers’ page including submission form on the project website (it’s not live yet, though).

I spent the rest of my week continuing to work on the Anglo-Norman Dictionary.  We received the excellent news this week that our AHRC application for funding to complete the remaining letters of the dictionary (and carry out more development work) was successful.  This week I mage some further tweaks to the new blog pages, adding in the first image in the blog post to the right of the blog snippet on the blog summary page.  I also made the new blog pages live, and you can now access them here: https://anglo-norman.net/blog/.

I also made some updates to the bibliography system based on requests from the editors to separate out the display of links to the DEAF website from the actual URLs (previously just the URLs were displayed).  I updated the database, the DMS and the new bibliography page to add in a new ‘DEAF link text’ field for both main source text records and items within source text records.  I copied the contents of the DEAF field into this new field for all records, I updated the DMS to add in the new fields when adding / editing sources and I updated the new bibliography page so that the text that gets displayed for the DEAF link uses the new field, whereas the actual link through to the DEAF website uses the original field.

I also continued to work on the facilities to upload batches of new or updated entry XML files to the DMS.  I created a new ‘holding’ table for uploaded entries and created a page that allows the user to drag and drop XML files into the browser, with files then getting uploaded, processed and added into this new table.  This uses a handy JavaScript library called Dropzone (https://www.dropzonejs.com/) that I previously used for the Scots Syntax Atlas CMS.  The initial version of the upload is working well, but I needed to know exactly how uploaded files should be fully processed before I could proceed further, which required some lengthy email exchanges with the editors.

The scripts I written when uploading the new ‘R’ dataset needed to make changes to the data to bring it into line with the data already in the system as the ‘R’ data didn’t include some attributes that were necessary for the system to work with the XML files, namely:

In the <main_entry> tag the attribute ‘lead’, which is used to display the editor’s initials in the front end (e.g. “gdw”) and the ‘id’ attribute, which although not used to uniquely identify the entries in my new system is still used in the XML for things like cross-references and therefore is required and must be unique.  In the <sense> tag the attribute ‘n’, which increments from 1 within each part of speech and is used to identify senses in the front-end.  In the <senseInfo> tag the ID attribute, which is used in the citation and translation searches and the POS attribute which is used to generate the summary information at the top of each entry page.  In the <attestation> tag the ID attribute, which is used in the citation search.

We needed to decide how these will be handled in future – whether they will be manually added to the XML as the editors work on them or whether the upload script needs to add them in at the point of upload.  We also needed to consider updates to existing entries.  If an editor downloads an entry and then works on it (e.g. adding in a new sense or attestation) then the exported file will already include all of the above attributes, except for any new sections that are added.  In such cases should the new sections have the attributes added manually, or do I need to ensure my script checks for the existence of the attributes and only adds the missing ones as required?

We decided that I’d set up the systems to automatically check for the existence of the attributes and add them in if they’re not already present.  It will take more time to develop such a system but it will make it more robust and hopefully will result in fewer errors.  I’ll also add an option to specify the ‘lead’ initials for the batch of files that are being uploaded, but this will not overwrite the ‘lead’ attribute for any XML files in the batch that already have the attribute specified.

I’ll hopefully get a chance to work on this next week.  Thankfully this is the last week of home-schooling for us so I should have a bit more time from next week onwards.