Week Beginning 22nd June 2020

This was week 14 of Lockdown and I spent most of it continuing to work on the Books and Borrowing project. Last week I'd planned to migrate the CMS from my test server at Glasgow to the official project server at Stirling, but during the process some discrepancies between PHP versions on the servers meant that code which worked fine at Glasgow was giving errors at Stirling. As mentioned in last week's post, on the Stirling server calling a function with fewer than the required number of arguments resulted in a fatal error, and database 'warnings' (e.g. an empty string rather than a numeric zero being inserted into an integer field) were being treated as fatal errors too. It took most of Monday to go through my scripts and identify all the places where such issues cropped up, but by the end of the day I had the CMS set up and fully usable at Stirling and had asked the team to start using it.
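
For the record, the fixes mostly took one of two forms, sketched below in simplified form. The function and variable names here are invented for illustration, and I'm assuming the culprits were PHP 7.1's ArgumentCountError and MySQL strict mode, which fits the symptoms:

    // On older PHP versions a missing argument only raised a warning; on
    // PHP 7.1+ it throws a fatal ArgumentCountError. Giving the parameter
    // a default value keeps the older call sites working.
    function listRecords($libraryId, $order = 'date') {
        // ... query and return the records ...
    }

    // With MySQL in strict mode, inserting an empty string into an
    // integer column is an error rather than a warning, so values need
    // casting explicitly before the insert.
    $pageId = ($pageId === '') ? 0 : (int)$pageId;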

I then spent some further time working on the public website for the project, installing a theme, working with fonts and colour schemes, selecting header images, adding logos to the footer and other such matters. I made six different versions of the interface and emailed screenshots to the team for comment. We agreed on a preferred version and I then made some further tweaks to it, during which time team member Kit Baston was adding content to the pages. On Thursday the website went live and you can access it here: https://borrowing.stir.ac.uk/.

I also continued to make improvements to the CMS this week, adding new functionality to the pages for browsing book editions, book works and authors. The table of Book Works now includes a column listing the number of Holdings each Work is associated with, and the listed Works can now be ordered by any of the columns in the table. When a book work row is expanded and its associated editions load in, this table also features the number of holdings each edition is associated with and can likewise be ordered by any of the columns. I then made the number of holdings and records listed for each Work and Edition a link (so long as the number is greater than 0). Pressing on the link brings up a popup that lists the holdings and records. Each item in the list features an 'eye' icon and pressing on this will take you to the record in question (either in the library's list of holdings or the page that the borrowing record appears on), with the page opening at the relevant item.
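
Behind the scenes the counts and ordering are handled at the database level. The real schema differs, but the query is along these lines (the table and column names here are illustrative guesses, and $pdo is an existing database connection):

    // Count the holdings associated with each work, joining through
    // editions, and order by a column checked against a whitelist of
    // sortable columns so user input never reaches the SQL directly.
    $allowed = ['title', 'holding_count'];
    $order = in_array($order, $allowed, true) ? $order : 'title';
    $sql = "SELECT w.id, w.title, COUNT(h.id) AS holding_count
            FROM book_work w
            LEFT JOIN book_edition e ON e.work_id = w.id
            LEFT JOIN book_holding h ON h.edition_id = e.id
            GROUP BY w.id, w.title
            ORDER BY $order";
    $works = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);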

I updated the 'browse authors' page in a similar way, adding in the option of ordering the table by any of the columns and adding counts of associated works, editions, holdings and items; these counts are also links that open a popup containing all related items. Each of these features an 'eye' icon and you can press on one of these to be taken to the record in question. Holdings and Items will open in the corresponding library's list of book holdings, while works and editions will load the 'Browse books' page. Linking to an edition was a bit tricky, as editions are dynamically loaded into the page via JavaScript when a book work row is expanded. I had to pass variables to the page flagging that one work should be open on page load, then trigger the loading of its editions and scroll the page to the correct location once they had loaded. If the edition has no work then the 'no work specified' section needs to open, which currently takes a long time due to there being 1,911 such editions at present. There isn't currently a 'loading' icon or anything, but things do load in the background and the page will eventually jump down to the correct place. I also fixed a bug whereby if you disassociated a book holding from a record the edition and work autocompletes stopped working for that record.

On Friday I had a Zoom call with Project PI Katie Halsey and Co-I Matt Sangster to discuss my work on the project and to decide where I should focus my attention next. We agreed that it would be good to get all of the sample data into the system now, so that the team can see what's already there and begin the process of merging records and rationalising the data. Therefore I'll be spending a lot of next week writing import scripts for the remaining datasets.

I also worked on a number of additional projects this week. On Tuesday I had a Zoom call with Jane Stuart-Smith, Eleanor Lawson of QMU and Joanne Cleland of Strathclyde to discuss a new project that they're putting together. I can't say too much about it at this stage, but I'll probably be doing the technical work for the project if it gets funding. I also spoke with Thomas Clancy about another place-names project that has been funded, for which I'll need to adapt my existing place-names system. This will probably be starting in September and involves a part of East Ayrshire. I also added some forum software to Matthew Creasy's new project website that I recently put together for him. He's hoping to launch the site next week, so I'll probably add in a link to it then.

I also managed to spend some time this week looking into the Historical Thesaurus's new dates system. My scripts to generate the new HT date structure completed over the weekend and I then had to manually fix the 60 or so label errors that Fraser had previously identified in his spreadsheet. I then wrote a further script to check that the original fulldate, the new fulldate and a fulldate generated on the fly from the new date table all matched for each lexeme. This brought up about a thousand lexemes where the match wasn't identical. Most of these were due to 'b' dates not being recorded in a consistent manner in the original data (sometimes with two digits, e.g. 1781/86, and sometimes with one digit, e.g. 1781/6). There were some other issues with dates that had both labels and slashes as connectors, whereby the label ended up associated with both dates rather than just one. There were also some issues with bracketed dates sometimes being recorded with the brackets and sometimes not, plus a few that had a dash before the date instead. I went through the 1,000 or so rows and fixed the ones that actually needed fixing (maybe about 50). I then imported the new lexeme_dates table into the online database; there are 1,381,772 rows in it. I also attempted to import the updated lexeme table (which includes a new fulldate column plus new firstdate and lastdate fields). Unfortunately the file contained too much data and the upload timed out. I contacted Arts IT Support, who managed to increase the execution time on the server, and I was then able to get this second table uploaded too.
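
The expansion of the shortened 'b' dates and the three-way comparison itself boil down to something like the following sketch (the helper and variable names are my own and the real script handles more edge cases):

    // Expand a shortened second date using the first date, so that
    // 1781 + '6' and 1781 + '86' both come out as 1786.
    function expandBDate($first, $suffix) {
        return substr($first, 0, strlen($first) - strlen($suffix)) . $suffix;
    }

    // Three-way check per lexeme: the original fulldate, the new fulldate
    // and a fulldate rebuilt on the fly from the rows in the new
    // lexeme_dates table should all agree.
    $rebuilt = buildFulldate($lexemeDateRows); // hypothetical helper
    if ($origFulldate !== $newFulldate || $newFulldate !== $rebuilt) {
        $mismatches[] = $lexemeId; // flag for manual review
    }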

Fraser had sent around a document listing the next steps in the data update process and I read through this and began to think things through. Fraser noted that the unique date types list didn't appear to include 'a' and 'c' for firstdates. I checked my script that generated the date types (way back in April last year) and spotted an error: the script was looking for a column called 'oefirstdac' where it should have been looking for 'firstdac'. What this means is that any lexeme that has an 'a' or 'c' with its first date has been rolled into the count for regular first dates, but it turns out that this is what Fraser wanted to happen anyway, so no harm was done there.

Before I can make a start on taking all HT lexemes that are XXXX-XXXX, OE-XXXX or XXXX-Current and are matched to an OED lexeme, and grabbing the OED date information for them, I'll need to find a way to actually get the new OED date information. Fraser noted that we can't just use the OED 'sortdate' and 'enddate' fields but instead need to use the first and last citation dates, as these include 'a' and 'c' prefixes. I'm going to need access to the most recent version of all of the OED XML files, and to write a script that goes through all of the quotations data, such as:

    <quotations>
      <q year="1200"><date>?c1200</date></q>
      <q year="1392"><date>a1393</date></q>
      <q year="1450"><date>c1450</date></q>
      <q year="1481"><date>1481</date></q>
      <q year="1520"><date>?1520</date></q>
      <q year="1530"><date>1530</date></q>
      <q year="1556"><date>1556</date></q>
      <q year="1608"><date>1608</date></q>
      <q year="1647"><date>1647</date></q>
      <q year="1690"><date>1690</date></q>
      <q year="1709"><date>1709</date></q>
      <q year="1728"><date>1728</date></q>
      <q year="1755"><date>1755</date></q>
      <q year="1804"><date>1804</date></q>
      <q year="1882"><date>1882</date></q>
      <q year="1967"><date>1967</date></q>
      <q year="2007"><date>2007</date></q>
    </quotations>

The script will then need to pick out the first date and the last date, plus any 'a', 'c' and '?' values. This is going to be another long process, but I can't begin it until I can get my hands on the full OED dataset, which I don't have with me at home.
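
When I do get the data, the parsing itself should be fairly straightforward. Here's a first sketch using PHP's SimpleXML; the regular expression encodes my assumption that each date is a year optionally preceded by '?', 'a' or 'c':

    // Pull the citation dates out of a <quotations> element in order.
    $xml = simplexml_load_string($quotationsXml);
    $dates = [];
    foreach ($xml->q as $q) {
        $dates[] = (string)$q->date;
    }
    $first = $dates[0];   // e.g. '?c1200'
    $last  = end($dates); // e.g. '2007'

    // Separate any '?', 'a' or 'c' prefix from the numeric year.
    function splitDate($date) {
        if (!preg_match('/^(\?)?([ac])?(\d{3,4})$/', $date, $m)) {
            return null; // unexpected format: flag for manual checking
        }
        return [
            'uncertain' => $m[1] !== '', // leading '?'
            'prefix'    => $m[2],        // 'a' (ante) or 'c' (circa)
            'year'      => (int)$m[3],
        ];
    }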