Week Beginning 1st July 2024

I was back at work this week after a lovely week’s holiday in Mexico.  My biggest task of the week was to migrate the ‘map first’ place-names interface I’ve been working on for the Iona place-names project to the Ayr place-names resource.  The version I’ve created for the Iona project has not yet launched and the Ayr project is nearing completion so it looks like this might be the project that gets to launch the new interface first, but we’ll see.

It took a bit of time to migrate the interface as the Iona project is slightly different to the previous place-names projects I’ve worked on (Ayr, Galloway Glens and Berwickshire).  The resource has many bilingual fields to record Gaelic versions, it has a ‘flattened’ altitude range due to Iona being so close to the sea and it doesn’t feature parish boundaries as Iona is all in one parish.  To get the resource to work with the Ayr data I had to address all of these issues, with the parish boundaries being the trickiest to get working.  In order to integrate these I needed to update the map’s ‘Display Options’ to add the option in and then update my code to incorporate a new display option.  As this needed to be represented in the page URL I needed to shift all parts of the URL relating to the search, browse and viewing of records along by one, and my code then needed to ensure this shift was represented throughout.  It was a bit of a pain to sort out but I got there.
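To illustrate why adding a display option shifts everything else in the URL, here is a minimal sketch in Python.  It is not the site's actual code (which runs in the browser), and the segment names and the example path are made up for illustration.  It assumes the display options occupy the first few positions of the path, followed by the search, browse or record parts:

```python
# Hypothetical display-option names; the real URL structure is not shown in the post.
DISPLAY_FIELDS_OLD = ["basemap", "labels"]
DISPLAY_FIELDS_NEW = ["basemap", "labels", "parishes"]  # one extra option


def parse_map_path(path, display_fields):
    """Split a path into named display options and the remaining action segments."""
    segments = [s for s in path.strip("/").split("/") if s]
    count = len(display_fields)
    # The first `count` segments are display options, in a fixed order.
    display = dict(zip(display_fields, segments[:count]))
    # Everything after them (search, browse, record view) moves along by one
    # position whenever a new display option is added.
    action = segments[count:]
    return display, action


old_display, old_action = parse_map_path("/os1881/on/search/burn", DISPLAY_FIELDS_OLD)
new_display, new_action = parse_map_path("/os1881/on/on/search/burn", DISPLAY_FIELDS_NEW)
```

Reading the segments through a single list of field names, rather than by hard-coded index, means a further option only needs to be added in one place.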

I also made a new colour scheme for the Ayr map interface to differentiate it from the Iona one, but this is only temporary and we might change this before we go live.  I also added in a logo for the map (an ‘A’ for ‘Ayr’ taken from the background map of the website) but again this might be changed to something else (e.g. a ‘C’ or the Coalfield Communities icon if this is allowed), or removed entirely before the resource launches.  Below is a screenshot of the new map interface showing the results of a quick search for ‘Burn’ with place-name labels and parish boundaries (the orange lines) turned on using the OS1881 map:

I had two meetings this week, both on Tuesday.  The first was with my line manager Marc Alexander to discuss the logistics of the new project the Books and Borrowing team are putting a proposal together for.  I can’t say too much about this for the moment.  The second was with Deven Parker to discuss her Playbills project with a couple of people from Computing Science who are researching AI.  It was interesting to hear the possibilities that might be offered by AI in terms of extracting data from the Playbill images, although I think that we’ll need to discuss things with them in more detail if we are to ensure they fully understand the data and what we need to get out of it.  More discussions will no doubt follow.

I had an email discussion with Sofia from the Iona project about further updates to the map interface and to explain how to successfully import the data exported from the resource into Excel in a way that would ensure accented characters are not garbled.  I also gave some advice to Pauline Graham of the DSL about creating user accounts and helped Pauline MacKay of Scottish Literature with some issues she’d been having accessing the content management system for the Burns correspondence.  This project also encountered an issue later in the week whereby the scripts weren’t executing but were instead downloading.  This was very concerning and turned out to be a problem with our hosting company that we were thankfully able to fix.

Also this week I added feedback questionnaire popups, pages and menu items to the https://www.seeingspeech.ac.uk/speechstar/ and https://speechstar.ac.uk/ websites, made a few further tweaks to the text of each resource and gave some advice to Eleanor about accessing the Google Analytics stats for each site.

I was also asked to add a further library register to the Books and Borrowing resource.  This consisted of around 190 images, which were supplied as PNG files.  Unfortunately we need the images to be JPEGs to be consistent with all of our other images so I needed to figure out a way to convert the images.  Batch converting images from PNG to JPEG seems like the sort of thing that should be straightforward to do using Photoshop or even just in Windows, but despite trying several methods I didn’t find anything that worked.  Eventually I installed the ImageMagick command-line tool and used a single command:

magick mogrify -format jpg *.png

That converted all of the files in one go, as detailed here: https://imagemagick.org/script/mogrify.php.  Unfortunately I then discovered that my access to the Stirling VPN had been blocked so I was unable to access the project’s server.  Unblocking my access required authorisation from a few people and by the end of the week the process had still not been completed so as of yet I haven’t been able to complete this task.

Finally this week we received some feedback from the testing of the new Wales ‘Speak For Yersel’ resource, which required me to make many changes to the project’s data.  This included replacing existing and adding new sound files, adding new answer options, updating questions and adding new ones.  I also fixed a bug that had been caused by a difference in the way blank fields were stored in the database for new questions that I’d added a while back.  These were classed as ‘empty’ for previously created data but ‘null’ for the new data (two different things in databases) and my code wasn’t dealing with the ‘null’ values properly.  I made the fields blank instead, which has thankfully sorted things.  Note that the data was all recorded successfully for survey responses – the issue was purely with their display on the maps.  I also spotted that some map markers for the ‘Mate’ question were not displaying properly due to the sheer number of possible options being greater than my code was set up to work with.  I added in some new marker colours and this addressed the issue.
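The difference between ‘empty’ and ‘null’ mentioned above can be shown with a small sketch.  This is only an illustration in Python using an in-memory SQLite database; the table and column names are hypothetical and the real project does not necessarily use either technology:

```python
import sqlite3


def fetch_labels(conn):
    """Return (id, label) rows, treating NULL labels as empty strings."""
    rows = conn.execute("SELECT id, label FROM answers ORDER BY id").fetchall()
    return [(row_id, label if label is not None else "") for row_id, label in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical table: row 1 has an empty string, row 2 has NULL.
conn.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO answers (id, label) VALUES (?, ?)",
    [(1, ""), (2, None), (3, "mate")],
)

# A plain comparison with '' does not match NULL, so row 2 is missed.
naive_blank_ids = [r[0] for r in conn.execute("SELECT id FROM answers WHERE label = ''")]

# Making the stored values consistent (NULL becomes '') removes the special case.
conn.execute("UPDATE answers SET label = '' WHERE label IS NULL")
fixed_blank_ids = [
    r[0] for r in conn.execute("SELECT id FROM answers WHERE label = '' ORDER BY id")
]
```

Either approach works: normalise the values when reading them (as `fetch_labels` does) or make the stored data consistent, which is closer to the fix described above.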

Week Beginning 17th June 2024

I had a couple of meetings this week, the first of which was with the Books and Borrowing team to discuss a potential new project.  I had to do a fair bit of preparation for this and had several discussions following the meeting too.  It all looks really great, but I can’t go into any more details at this stage.  My second meeting was with Deven Parker to discuss her Playbills project.  This was also a very productive meeting and plans are beginning to come together.  I’ve been invited to participate in a meeting Deven is having with some AI people (erm, that’s people studying AI rather than ‘AI people’… at least I hope so) in Computing Science next month, which I’m pretty excited about.

I spent quite a lot of time this week continuing with my mockups of a new interface for the Dictionaries of the Scots Language website.  I can’t share any screenshots of my work yet, but this week I added in the ‘Add yogh’ button to the ‘Older Scots’ search bar (both index page and entry page).  This appears as a button to the left of the bar with ‘What is yogh’ added as a link underneath.  Pressing on the ‘Add’ button adds the yogh to the input.  Pressing on the ‘What…’ link opens a modal overlay featuring the explanatory text.  I also created a mockup of the ‘advanced search’ form.  As with the live site, the advanced search features a tab for entries and another for bibliography.  Each has a search form section and a help section and the layout is pretty similar to the live site, but has been modernised and tidied up.  As with the live site, pressing on the ‘In’ buttons in the entry search changes which search options are visible and the layout works a lot better on mobile screens than the live site does.

Finally I completed a mockup of the ‘About’ page.  The page features boxes for the ‘top level’ information, each of which appears as a link (not currently linking to anywhere).  Where an information type has subpages these appear in a list within the box.  Initially I wasn’t going to add a quick search box to the ‘About’ pages, but decided it would be better to do so.  However, as these pages are not within a selected dictionary section the search bar needs to include a dictionary selector.  I’ve added this to the left of the search and hopefully it should be intuitive to use.  At the moment the yogh information only appears when ‘Older Scots’ is selected and is hidden when ‘Modern Scots’ is selected.  When Older Scots is selected the search bar text input does get a bit small on mobile screens, but it’s still perfectly usable.  I’d envisage such a bar appearing on all of the ancillary pages, but we would retain the specific dictionary searches when in a dictionary.

I also continued with updates to the Anglo-Norman Dictionary this week, using my newly created workflow to add a further six texts to the site’s Textbase.  I also rearranged the ‘browse’ page so that the texts are now arranged by genre, and there are buttons to jump straight to a genre you’re interested in.  You can view the updated feature here: https://anglo-norman.net/textbase-browse/ and below is a screenshot:

Also this week I added further content to the Speechstar website (more videos and an additional ‘Phonemic target’ metadata field for all records listed here: https://www.seeingspeech.ac.uk/speechstar/disordered-child-speech-sentences-database/).  I also went through the other site (https://speechstar.ac.uk/) to rename the project from ‘Speech Star’ to ‘SpeechSTAR’ wherever this text appears.  I also had to spend rather a lot of time creating timesheets for every month I’ve worked on the project since July 2021 due to my time having been costed incorrectly.  As you can imagine, this was quite a long and tedious task, but thankfully it was made easier by having this blog to consult.

I’ll be on holiday next week so there will be no more from me until the start of July.

Week Beginning 10th June 2024

My time this week was devoted mostly to dictionaries, with about half my time devoted to the Anglo-Norman Dictionary and the other half working for the Dictionaries of the Scots Language.  For the AND I continued with the creation of a workflow to add new texts to the Textbase (https://anglo-norman.net/textbase-browse/).  Last week the editor Geert had created a complete XML version of a new text to be added using the Oxygen XML editor and a limited set of tags based on the existing Textbase texts.  As I discovered last week, while these texts use some TEI elements they are not actually valid TEI as they don’t link to a DTD and include undeclared non-TEI elements, but we’re going to have to stick with this structure for the new text to ensure compatibility with the older ones.

I spent some time this week working on the scripts that would then import this text, based on the scripts I’d initially created to batch import the older Textbase texts.  It was a bit of a lengthy process as it’s not just a case of importing the text but also generating the search terms including KWIC for the concordance and proximity searches, but I managed to get everything working with the new text, testing things on a version of the site running on my laptop rather than the live site, as adding a text directly to the live site would immediately make it live.

One issue with the text is that it is tagged as one continuous page, with folio breaks used throughout rather than page breaks.  This means that all of the footnotes are added to the very end of the text.  I suggested to Geert that the text could be split using the <pb> tag for folios instead, which would help break the text up and make the footnotes easier to use.  It turned out that Geert had been using the folio break tag for both paragraph numbers and folio breaks, which wasn’t quite right.   Swapping to using pagebreak tags purely for actual folio breaks would work much better as there are only 20 or so folios while there are many more paragraphs.

I decided to work on the XML myself to restructure it before sending it back to Geert for final tweaking.  I replaced the proper folio breaks (as opposed to paragraph IDs) with page breaks and have given them an ID beginning with ‘fol-’, e.g. ‘fol-1ra’.  Geert wanted a different format including spaces, but we can’t include them as the values end up being used as JavaScript IDs and spaces are invalid in IDs.  I think the resulting page markers look ok, though.  I then changed the way the paragraph markers were recorded.  These are now just an ‘n’ attribute of the <p> element, e.g. ‘<p n="1.1">’ which I reckon makes a lot more sense than having them as part of the actual text.  I updated the XSLT so that when a paragraph has an ‘n’ attribute this gets displayed in bold square brackets before the paragraph text.  The only concern here is that the XSLT is applied to all texts in the textbase so if any existing texts have paragraph IDs these are now also going to end up displayed like this.
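The change to the paragraph markers could also be scripted.  The following is a minimal sketch in Python, assuming the numbers always appear in square brackets at the very start of a paragraph’s text (as in ‘[1.1] Frere Ambros…’).  The function name and the sample XML are made up for illustration and this is not the process that was actually used:

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading marker such as "[1.1]" or "[12]" plus any following space.
PARA_NUM_RE = re.compile(r"^\s*\[(\d+(?:\.\d+)*)\]\s*")


def move_numbers_to_attributes(xml_string):
    """Move leading '[1.1]'-style markers from <p> text into an 'n' attribute."""
    root = ET.fromstring(xml_string)
    for p in root.iter("p"):
        match = PARA_NUM_RE.match(p.text or "")
        if match:
            p.set("n", match.group(1))
            # Drop the marker from the paragraph text itself.
            p.text = p.text[match.end():]
    return ET.tostring(root, encoding="unicode")


# Illustrative sample only; the real Textbase files have more structure.
sample = "<div><p>[1.1] Frere Ambros, a moi portaunt</p><p>No number here</p></div>"
converted = move_numbers_to_attributes(sample)
```

Paragraphs without a leading marker are left untouched, so they would not gain an ‘n’ attribute or be displayed with a number.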

I also ensured the import scripts extracted all of the text for the Textbase concordance and proximity search.  This involved many stages, including splitting the text into pages, then splitting the pages into individual words, logging counts of the number of words on each page and generating the ‘Keyword in Context’ for every word in the text.  It took a bit of trial and error to get the scripts working properly with the new text, but I got there in the end.
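As a rough illustration of the stages listed above (pages, then words, then word counts and ‘Keyword in Context’), here is a minimal sketch in Python.  It is not the import script itself: the tokenising rule, the size of the context window and the field names are all assumptions, and here the context does not run across page boundaries, which may differ from the real scripts:

```python
import re

# Assumed tokenising rule: runs of letters, optionally joined by an apostrophe.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def tokenise(page_text):
    """Split the text of one page into individual words."""
    return WORD_RE.findall(page_text)


def build_kwic(pages, window=3):
    """Return per-page word counts and a KWIC record for every word."""
    counts = {}
    kwic = []
    for page_no, page_text in enumerate(pages, start=1):
        words = tokenise(page_text)
        counts[page_no] = len(words)
        for i, word in enumerate(words):
            kwic.append({
                "page": page_no,
                "position": i + 1,
                "word": word.lower(),
                # Up to `window` words either side, from the same page only.
                "left": " ".join(words[max(0, i - window):i]),
                "right": " ".join(words[i + 1:i + 1 + window]),
            })
    return counts, kwic


# Two short 'pages' of sample text, already split at the page breaks.
pages = ["Nous lisoms en anciens estories", "les uns avoir environee les provinces"]
counts, kwic = build_kwic(pages, window=2)
```

Each record ties a word to its page and position, which is the kind of information the concordance and proximity searches need.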

With the scripts in place I was then able to run the new text through the import process on the server, resulting in the text being available through the ‘Browse’ facility and fully incorporated into the search facilities.  You can view the new text here:  https://anglo-norman.net/textbase-browse/sjer.  Soon after completing work on the import of this text the editor Delphine sent me a further text that she had been working on.  This was very handy as I was able to test the import scripts on a new text whilst everything was still fresh in my mind.  Thankfully the scripts worked perfectly and this new text was also successfully incorporated and can be found here:  https://anglo-norman.net/textbase-browse/otinel.  I then suggested to Geert that we should maybe update the ‘Browse’ feature to break up the list by genre as currently the texts are all just listed alphabetically by title, which is possibly not all that helpful.  I’m going to investigate this next week.

I spent most of the remainder of the week working for the DSL.  I’d previously discussed with the team how the existing website design was getting a bit long in the tooth, as it will be 10 years this September since we launched the site.  Users of the site increasingly access it on mobile devices and while the interface was designed to work on such devices it doesn’t make best use of the space available on such screens.  I therefore wanted to experiment with a new interface for the website, which is what I did this week.  I’m not able to share any screenshots of my experiments at this stage as I haven’t even heard back from the DSL people yet, but I’ll give an overview of what I’d been working on so far.

The interface I created uses the Bootstrap front-end toolkit, which is an ideal starting point for creating mobile-friendly sites.  I carried over some design aspects from the live site, and have also borrowed some elements from my work on the Anglo-Norman Dictionary.  I created a working mockup of the interface for the homepage and dictionary entry pages and shared this with the DSL team.

The biggest structural change to my test interface is something that has been discussed before:  Splitting up SND and DOST.  The mockup site only has four top-level menu items:  ‘Modern Scots’, ‘Older Scots’, ‘Advanced Search’ and ‘About’.  I’ve gone with ‘Modern Scots’ and ‘Older Scots’ as these are more understandable than ‘SND’ and ‘DOST’ and there is no space for the full titles or dates.  As it currently stands all four tabs fit on one line on a mobile device (tested on my Android phone and an iPhone 15).

The default landing page for the site is ‘Modern Scots’.  It features a large ‘Search Modern Scots’ input and some brief text about the SND, which I’ve taken from the current homepage and the ‘about SND’ page.  This is followed by three info boxes as found on the right-hand side of the current homepage.  On mobile devices these are stacked.  Note that I haven’t added the ‘Add Yogh’ button to any quick search form yet.  There is also a separate ‘Older Scots’ homepage, which is very similar but with information about DOST.

I then created a mockup of an SND entry page for ‘Dreich’.  The entry page is now two-column.  A ‘Search Modern Scots’ input appears at the top of the main column.  The entry itself has a summary section at the top (influenced partially by the new OED site).  This includes the headword and POS (as already found in the ‘sticky’ header that appears when you scroll down the entry page on the live site), information about the dictionary, the ‘about’ text and options to show/hide things etc.  Dates of attestation and the sparkline are also included here.  I’ve gone with a rounded-end for the sparkline to make it look a bit more swish.  The main entry itself is exactly as it appears on the live site.  Note that a ‘Top’ button appears in the bottom right as you scroll and pressing on this scrolls the page back to the top.  For now I haven’t included a ‘sticky’ header as we have in the live site but I can add this in.  The right-hand column is also not ‘sticky’ as I’m not convinced it’s really necessary and wouldn’t work on mobile devices anyway as the column will appear beneath the entry.

The right-hand column is adapted from the Anglo-Norman site.  It defaults to the ‘Browse’ but has further tabs for ‘Results’ and ‘Log’.  Neither of these currently work, but the former would feature the search results as currently found in the live site’s left-hand column and the latter would contain a list of all entries you’ve looked at during the current session, which is quite a handy feature I added to the AND site.  The ‘Share’ box is also present in the right-hand column.  On mobile screens the right-hand column appears below the entry column.

I also made a similar mockup for DOST, using the DOST entry for ‘Scunner’.  It’s currently identical in structure to the SND entry, but gives you another example of the entry page, with a more interesting sparkline.  We could differentiate SND and DOST by using a different colour scheme for the DOST entry page (e.g. a different shade of blue for the entry header) but for now everything is the same.

I haven’t added in the search results page yet, but it will be fairly similar to the live site.  I’ve also not added in a mockup of the Advanced Search form yet, but this will be the only place to search both dictionaries.  It will be pretty similar to the live site.  The ‘Results’ tab on the entry page will work in a similar way to the left-hand column on the entry page on the live site.  Where an advanced search involving both dictionaries is performed the results will be split like the live site.  We could make the ‘About’ tab a drop-down menu but there are so many ‘About’ pages that I think it might work better if it just led to a page that listed the various sections (‘Our Publications’, ‘About Scots’ etc) and gave links to everything.  Similar to the OED’s ‘Information’ page.

I did consider having both a top section of each page under the selected tab, and the selected tab itself in blue like the entry header, with this then containing the search and the entry header and then the rest of the page below this being white.  This would sort of give a similar approach to the OED’s entry page.  However, I went with a white tab and all white page to make it clearer that all of the content ‘belongs’ to the tab.  In terms of the site font I’ve used an updated version of the one used on the live site, but we can change this, or use multiple fonts for different sections.  I added a subtle gradient to the background of the header and footer (it gets lighter to the right), but this can be changed.

I’ll just have to see what the DSL people make of my experiments and we’ll see if any aspects of them find their way into the live site.

Week Beginning 3rd June 2024

I returned to working on the Iona place-names interactive map this week, which I’d not worked on for a while.  The team had demonstrated the resource at an event last week and had a few suggestions after this.  They’d noticed that the map was rather cluttered on mobile devices with both the side panel and the map legend opened and I therefore added in a check for screen width when the page initially loads.  If the width is less than 500 pixels the legend is hidden by default, which stops multiple panels appearing and overlapping.  I decided against hiding the left panel as this might be more confusing as users will generally need to use this panel and if it’s off by default they might not realise it’s there.

I also made a couple of other tweaks.  Firstly, I’d noticed the ‘Select all’ option in the legend was remaining visible when the legend was hidden and I sorted this.  It was actually rather trickier than I’d expected to get this working but it’s sorted now.  I also replaced the ‘Show / Hide’ text that was the only text in the legend button with ‘Legend’ (along with the up/down icon).  I figured this would make it clearer what the section was when it’s hidden by default.  Secondly, the tabs in the record popup were overlapping on narrow screens when the tabs were split over two lines so I added a vertical margin to the tabs to fix this.  I don’t think there are any other issues with the layout on mobile devices, but I’ll just need to see if anything else gets reported.  Below is a screenshot of the map on a mobile device, showing the legend closed by default and the left-hand panel open:

I was also alerted to the fact that I hadn’t implemented the display of images associated with place-names yet so I implemented this.  I had to update the API to ensure image data was outputted and for places that have images the record popup now features an ‘Images’ tab.  Pressing on this displays the image and its caption, for example:

For the moment if there are multiple images for a place they will each appear in the tab, one after the other.  We might want to consider some sort of slideshow instead.  I also ensured that sounds associated with place-names are now also set to be returned by the API, but as of yet I haven’t added in a tab for them, as we don’t currently have any sounds in the system.

The team also wanted cross references between place-names to appear in the mini-popup in addition to the main one, so I added these in.  However, whilst working on this I realised there was a bit of a problem with how cross references were working:  The links would only work if the cross referenced place-name was also part of the currently visible map data.  For example, a quick search for ‘buaile’ results in only 6 matching place-names on the map.  The cross reference for ‘Buaile Staonaig’ is ‘Loch Staonaig’, which is not part of these results and therefore pressing on the cross reference wasn’t doing anything.  This was a bit of a headache to sort out, resulting in some pretty major changes to the code.

What happens now is if the cross referenced place is also on the map then clicking on the link will move you to it as you’d expect.  But if the place is not on the map then when you press on the link the map resets itself, loading in all place-name data.  Once the data is loaded in then the system identifies the cross referenced place and moves the map to it.  It’s maybe a bit confusing to have all of the data loading in, but I just couldn’t think of another way to do it without messing up the view of the map with place-names that don’t match the criteria.
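The logic described above can be sketched as follows.  This is a simplified illustration in Python rather than the map’s actual JavaScript, and the function name, identifiers and coordinates are all hypothetical:

```python
def resolve_cross_reference(target_id, visible_places, load_all_places):
    """Return (places to show, target place, whether the map was reset)."""
    if target_id in visible_places:
        # The cross-referenced place is already on the map: just move to it.
        return visible_places, visible_places[target_id], False
    # Otherwise reset the map by loading every place-name, then find the target.
    all_places = load_all_places()
    return all_places, all_places.get(target_id), True


# Made-up data: a search result set that lacks the cross-referenced place.
ALL_PLACES = {
    "buaile-staonaig": (56.32, -6.41),  # illustrative coordinates only
    "loch-staonaig": (56.31, -6.42),
}
search_results = {"buaile-staonaig": ALL_PLACES["buaile-staonaig"]}

places, target, was_reset = resolve_cross_reference(
    "loch-staonaig", search_results, lambda: dict(ALL_PLACES)
)
```

The full data set is only loaded when it is needed, so following a cross reference to a place that is already visible leaves the current search results on the map.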

Also this week I spent some time making final alterations to the websites for the three new Speak For Yersel regions.  This included adding in logos, replacing some of the images used in the surveys, adding in new questions and making tweaks to existing questions, adding in the ‘About’ pages and removing all of the test users and their data from the systems ahead of piloting.  The resources are ready to go now, although it’s likely that some further tweaks will be required after the piloting phase.

I also spent about a day further researching a new potential project on 18th and 19th century playbills for researcher Deven Parker.  I can’t really go into too much about this here, but there was a lot to consider and I sent a lengthy email to Deven containing many discussion points.

My other main task of the week was to investigate adding new texts to the ‘Textbase’ of the Anglo-Norman Dictionary (see https://anglo-norman.net/textbase-browse/).  Geert had sent me a new text for me to try and add to the resource and I spent some time working with it and the various import scripts that I’d created for the existing Textbase texts.

Unfortunately the structure of the new file is quite different to the existing XML files.  I was hoping I’d be able to just run the XML through the same scripts but this isn’t going to work without a lot of changes to the scripts.  There are no <pb> tags in the new file.  These are used to denote page breaks and are important for the search facilities, as words extracted for search purposes are associated with a page record in the system.  If having pages really doesn’t matter I can easily just add a <pb> around all of the text, which should sort the issue.  There may also be an issue with the ‘jump to page’ feature, which would of course only then contain one page.  Related to this, as there are no page breaks all notes will end up appearing at the very end of the text as technically there will be no ‘pages’.  The XML (which was generated using a tool called the Classical Text Editor) contains a lot of data relating to the style of elements (within <tagsDecl>) and a lot of the text is contained within <hi> tags, although it’s not clear why.  E.g. we have <hi rend="font-size:12pt;">Nous lisoms en anciens estories les uns avoir environee les provinces, et avoir alee as novels poeples, avoir passee les meers, q’eux verroient devaunt</hi> but in the PDF view of the text there is no discernible reason why ‘devaunt’ is the last word in this tag as the first word in the following tag (ycestes) follows directly on from it.  Also, the numbers at the start of paragraphs are not specifically marked up but are instead just part of the text, (e.g. <hi rend="font-size:12pt;">[1.1] Frere Ambros, a moi portaunt…).  This means we wouldn’t be able to separate out the numbers from the text or apply a different style to them.  There was also a lack of metadata about the text, and what was there was not in the same structure as in the existing textbase texts.

As the Classical Text Editor seems to add an awful lot of unnecessary (from our point of view) tags that result in a very messy XML file, I asked the editor Geert whether the texts could instead just be produced directly in Oxygen.  I spent a bit of time creating a sample XML document in Oxygen, including just the first few paragraphs of the text.

It seemed pretty feasible to me to create the texts in this fashion, but I did spot an issue.  The existing Textbase texts are not actually valid TEI XML.  The texts don’t link to a DTD or the TEI namespace and when I did try doing so with my sample file in Oxygen it produced a lot of errors.  E.g. the ‘fb’ element used for folio breaks in the Textbase texts is not a TEI element.  Unfortunately we’d need to conform to the old Textbase XML as that is what my processing scripts are set up to work with.  However, we’d have to end up with such non-valid (from a TEI point of view) XML in any case if we want to incorporate new texts so perhaps this isn’t an issue, but just something we will have to bear in mind.  I’ll continue with this next week.

I also updated the entry page to make language tags searchable.  You can now click on language tags (for example in this page https://anglo-norman.net/entry/alabrot) and perform a search for the corresponding language.

Week Beginning 27th May 2024

Monday was the late May bank holiday this week, so it was a four-day week for me.  On Tuesday I had an online meeting with Tony Harris, a developer at Cambridge who will be working on a project about Middle English Lexicon.  The project involves Louise Sylvester, who is in charge of the Bilingual Thesaurus of Everyday Life in Medieval England (https://thesaurus.ac.uk/bth/) which I developed, and this new project is going to expand upon the data held in this resource.  We had a good chat about the Bilingual Thesaurus and the technologies I’d used to put it together, and discussed some ways in which the new project might function from a technical perspective.  It’s likely that we’ll meet again in the coming months to expand upon our ideas and I’ll probably be involved with the project in some small capacity.

Also on Tuesday fellow Arts developer Stevie Barrett and I met with the ‘Technical Champion’ for the College of Arts and Humanities, Aris Palyvos who is a technician in Archaeology.  We had a good chat about the role of technicians in the College and how we can improve our visibility.  We now have a Teams group for technicians and hopefully we’ll be able to meet up with some of the others in the College in the coming months.

Last week I’d started work on a content management system for Burns correspondence and I spent a bit of time this week finishing things off.  I’ve given Craig and Pauline in Scottish Literature access to the CMS now and they have someone starting next week who will be using the system so I’ll just need to see how they get on and if they request any changes.

Also this week I made a few further tweaks to the new Speak For Yersel survey regions, fixed a couple of typos on the Speech Star website and helped to resolve a fairly serious issue with the Books and Borrowing website.  The IIIF server that the website uses had gone offline, meaning none of the images of register pages were loading.  Our usual IT guy at Stirling was out of office, but thankfully someone else there was able to get things up and running again.

I also completed my work on the migration of the British Association for Romantic Studies’ journal the BARS Review (https://www.bars.ac.uk/review).  This has taken quite some time over the past few weeks to get sorted but it’s now up and running.  To get it working I needed to switch the PHP version the site was using from a rather ancient version to the current version, and thankfully the other parts of the site that don’t use the OJS system were not adversely affected by this change.  I also took the opportunity to add Google reCAPTCHA to the site for registration and login, which should hopefully stop the spam registrations.  Registration also now requires the user to verify their registration via an email.  I also made a few additional security updates that I’d better not discuss here.

I spent the rest of the week working for the Dictionaries of the Scots Language, making a few tweaks to the advanced search on our test server, investigating some issues and replying to emails.  I also made a fairly major change to the sparkline data for entries so that dates of attestation beyond the period of each dictionary are handled in a different manner.  Previously all such dates were bundled together as the start or end date and then this date was used to generate blocks for the sparkline visualisation.  For example, ‘Abeich’ in SND has a first date of attestation of 1568, a long time before the official start date of the dictionary, which is 1700.  Previously this start date was being converted to 1700.  Our ‘cut off’ point for generating blocks of continuous attestation is now set to 50 years, meaning that if there are two or more attestations 50 years or less from each other this results in a block of continuous usage in the sparkline visualisation.  As the next date of attestation for the entry was 1721 the resulting sparkline therefore gave a continuous block from 1700 to 1721, which did not reflect the underlying data, plus the sparkline text then included ‘1700-1721’ which was not at all accurate.  See the following screenshot to see what I mean:

I updated the code that generates the data for the sparklines so that any dates prior to 1700 result in the text ‘<1700’ appearing and the code no longer uses such dates as a starting point for a ‘block’ in the visualisation.  After the update we’re now presented with the following sparkline, which has a line at the start of the visualisation representing ‘<1700’ and then a gap from this point until 1721, which is the first attestation in the dictionary’s official period:
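
The grouping logic described above can be sketched roughly as follows.  This is only a minimal Python illustration and not the code used for the DSL: the function names, the `Sparkline` structure, the ‘>’ marker for dates after the period and the example end year of 2005 are all assumptions made for the sake of the example.  Only the 50-year cut-off, the ‘<1700’ text and the ‘Abeich’ dates come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Sparkline:
    before_period: bool  # True if any attestation predates the dictionary period
    after_period: bool   # True if any attestation postdates it (assumed handling)
    blocks: list         # (start, end) year pairs of continuous usage


def build_sparkline(dates, period_start, period_end, cutoff=50):
    """Group in-period attestation dates into blocks of continuous usage.

    Dates outside the period are only flagged; they never start or
    extend a block.
    """
    in_period = sorted(d for d in dates if period_start <= d <= period_end)
    blocks = []
    for year in in_period:
        if blocks and year - blocks[-1][1] <= cutoff:
            # Within the cut-off of the previous attestation: extend the block
            blocks[-1] = (blocks[-1][0], year)
        else:
            # Otherwise start a new block (a lone date is a one-year block)
            blocks.append((year, year))
    return Sparkline(
        before_period=any(d < period_start for d in dates),
        after_period=any(d > period_end for d in dates),
        blocks=blocks,
    )


def sparkline_text(sparkline, period_start, period_end):
    """Build the text that accompanies the visualisation (format assumed)."""
    parts = []
    if sparkline.before_period:
        parts.append(f"<{period_start}")
    for start, end in sparkline.blocks:
        parts.append(str(start) if start == end else f"{start}-{end}")
    if sparkline.after_period:
        parts.append(f">{period_end}")
    return ", ".join(parts)


# 'Abeich' in SND: attested in 1568 and then 1721 (end year is hypothetical)
abeich = build_sparkline([1568, 1721], period_start=1700, period_end=2005)
print(sparkline_text(abeich, 1700, 2005))
```

With this approach the 1568 attestation only switches on the ‘<1700’ marker, so the first block begins at 1721 rather than stretching back to the start of the period.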

In order to get this working I needed to regenerate the data for Solr and then update the Solr core with the new data.  For now this is only running on my laptop and I have put in a request with our IT people to update the online Solr cores, as I don’t have access to do this myself.  Once the change has been made our online test site will be updated and hopefully it won’t be too much longer before we can actually update the live site and make this new feature available to everyone.

Week Beginning 20th May 2024

This was a week of working on many different projects for me.  I spent some time making final tweaks to the Speech Star websites (https://speechstar.ac.uk/ and https://www.seeingspeech.ac.uk/speechstar/) as the project officially came to an end this week.  This included adding new videos, replacing existing videos, updating the metadata, updating the site text and adding in a few new images.  Both resources have come together really well and I’m sure they will be hugely useful for speech therapy for a long time to come.

I also made a further update to the Anglo-Norman Dictionary resource related to the new ‘language’ search I added in last week.  After this had gone live I had intended to make the languages in the entries link to the search results and I spent a bit of time getting this working.  This involved updating the XSLT, the CSS and the JavaScript used to display the dictionary entries and the update is not live yet, but it’s in place on a test page and works pretty well.  See the ‘(loanword….’ section in the following screenshot to see how it will work:

I also replied to a number of emails that had been sent to me from Ann Fergusson of the DSL regarding the new date search and sparklines that I developed for the project many months ago but which are still to go live.  Ann gave me some feedback on a number of issues and I spent some time making updates.  The big change will be to the way dates of attestation that are before or after the active period of the dictionary in question are handled.  I’m going to have to update the way the sparkline data is generated, which may take some time.  I’ll hopefully be able to look into this next week.

For the new regions of the Speak For Yersel project I updated the privacy policies and now everything is ready for us to begin sending out the URLs for the new areas for test purposes.  For the Books and Borrowing project I wrote a couple of scripts to find and then generate some missing pages from the registers of Selkirk library.  The first script identified page images that we have on the server for the two registers that do not have corresponding page records in our database.  The script was set to output a 300px wide version of each missing image plus its filename and it was also possible to click on the image to view a full-size version.  It turned out that the majority of the missing images were omitted for a reason – they were either blank or didn’t contain borrowing records.  We decided that 70 images needed to be added in and 117 could be safely ignored.  My second script then added in the missing pages.  This took a bit of time to get working as not only did I need to create the pages, I also needed to ensure they were slotted into the correct place, updating the ‘next’ and ‘previous’ links and the page order as required.  But the pages are now available and someone can begin the process of transcribing the borrowing records found on the pages.
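
As a rough illustration of the two steps described above, here is a minimal Python sketch.  It is not the actual scripts: it assumes page records are simple dictionaries rather than database rows, that image filenames sort into the correct page order, and the function names, field names and filenames are all hypothetical.

```python
def find_missing(image_files, page_records):
    """Return the image filenames that have no corresponding page record."""
    known = {page["image"] for page in page_records}
    return sorted(name for name in image_files if name not in known)


def insert_pages(page_records, new_images):
    """Slot new pages into place, then rebuild the order and next/previous links."""
    pages = page_records + [{"image": name} for name in new_images]
    # Assumption: sorting by filename gives the correct page sequence
    pages.sort(key=lambda page: page["image"])
    for i, page in enumerate(pages):
        page["order"] = i + 1
        page["prev"] = pages[i - 1]["image"] if i > 0 else None
        page["next"] = pages[i + 1]["image"] if i < len(pages) - 1 else None
    return pages


# Hypothetical data: two existing page records and four images on the server
records = [{"image": "reg1_001.jpg"}, {"image": "reg1_003.jpg"}]
images = ["reg1_001.jpg", "reg1_002.jpg", "reg1_003.jpg", "reg1_004.jpg"]

missing = find_missing(images, records)
# After a manual check, suppose only reg1_002.jpg needs to be added
pages = insert_pages(records, ["reg1_002.jpg"])
```

Rebuilding the order and links for the whole register in one pass is simpler than patching individual neighbours, at the cost of touching every page record.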

I also spent about a day or so this week creating a content management system for the Burns Letter Writing Trail, which will be an interactive map of Burns’ correspondence that will eventually be added to https://www.burnsc21-letters-poems.glasgow.ac.uk/.  I can’t really say much more about it at this stage, but the CMS is about 80% complete and I hope to finish the rest next week.

Finally, I spent about a day looking through the images and YAML files of playbills from the 19th century for Deven Parker.  The images are generally a relatively small file size but they are actually pretty high resolution and I think they will work in a ‘zoom and pan’ interface.  As an experiment I set up such an interface for one image, using the same JavaScript library I used for the Books and Borrowing project.  I spent the rest of the day getting to grips with the YAML files, building up my understanding of the textual data, sketching out an initial relational database structure that could be used for the project and compiling a list of questions for Deven.  I’ll probably have to meet with her to go through these and see if or how my involvement with the project will develop.

 

Week Beginning 13th May 2024

I continued with the upgrade of the BARS review this week.  I’d managed to complete the upgrade process last week, but discovered that the links to the actual text of journal articles as HTML and PDFs were not working.  Further investigation this week revealed that this was a major problem.  The way files are stored and referenced in the new version of the Open Journal System is entirely different to how things were in the original version.  Previously there was an ‘article_files’ table where files associated with articles were located but the new version instead features a ‘files’ table that contains paths to files that are entirely different to the earlier version and the actual directory structure of the system.  I realised that it was likely that the upgrade process not only upgraded the database but also moved files around too and as I never got the path to the files right on my Windows PC any upgrades that should have been applied to the files will have failed (although having said that I never saw any errors). I therefore had to begin the upgrade process again so that the files would actually get moved / renamed to match the updates to the database.

After further investigation it appeared that several people have experienced the same issue whilst upgrading, for example https://forum.pkp.sfu.ca/t/none-of-the-pdf-files-can-be-viewed-or-downloaded-after-upgraded-to-ojs-3/27381 and https://forum.pkp.sfu.ca/t/solved-cant-see-pdfs-after-upgrade-to-3-1-1-4/49382 and https://forum.pkp.sfu.ca/t/after-upgrading-to-ojs-3-1-1-4-files-are-not-found/48938/23.

I fired up the 2.4.8.5 version of the site that I had running on my old PC and this time ensured the config file included the correct path to the files.  After doing so the instance managed to find the files, meaning I could restart the upgrade from this version.  The first time I upgraded to 3.2.1 with the correct path the files were successfully indexed and added to the database, but no corresponding files were actually generated.  It turned out that this was because my ‘files’ directory had somehow been set to read only.  I fixed this and re-ran the upgrade to 3.2.1 and thankfully the files were all renamed and moved successfully.  After that the upgrade to 3.4 worked fine.

However, the HTML versions of the articles were still not opening in the browser but were instead getting downloaded.  I found a forum post about this too: https://forum.pkp.sfu.ca/t/ojs-3-1-0-1-html-downloaded-it-automatically-from-articles/36024/8 which suggests that a plugin needs to be activated.  After managing to log in as an admin user I managed to find and activate the required plugin (HTML Article Galley).  I also activated the ‘PDF.js PDF viewer’ plugin.  This now means you can view the HTML and PDF versions of the article in your browser, as with the old site.  I also needed to update the permissions of a couple of directories and now the upgrade process is complete.  For now the new version of the site is still running on a test server, so I’ll still need to replace the live site with the newly upgraded version once Matt is ready.

Also this week I investigated an issue with the Books and Borrowing server, as staff were unable to log into the CMS.  As I suspected, this was because the server had run out of storage, meaning there was no space for a session variable to be stored.  I contacted Stirling IT about this and thankfully they were able to free up some space.  I also investigated some missing images and register pages from Selkirk library.  It turns out that we only included pages that had been transcribed by a previous researcher, which meant that many of the register pages have been missed out.  Thankfully we have the full set of digitised images and I’m therefore going to have to write a script that creates the missing pages, something I’ll try to tackle next week.

This week I finally moved back to my office in 13 University Gardens from the nicer office I’d been squatting in on Professors Square for the past year and a half.  All my stuff had been moved over for me and when I was on campus on Tuesday I therefore had to spend a bit of time getting everything set up and in order.

Also on Tuesday, I had a meeting with Deven Parker, who has a Leverhulme funded position in English Literature and is working with Matt Sangster.  We discussed a prospective project involving the digitisation of playbills from UK theatre from 1750 to 1843.  There are currently around 300,000 digitised images and may be up to 500,000 in the end.  Deven wants an online database to be made for these, featuring searchable text and the image and we discussed some of the possibilities.  I’m going to have a look at some of the data next week and will think about what can be done.

Also this week we went live with the language search for the Anglo-Norman Dictionary.  This is now available as a tab on the advanced search page (https://anglo-norman.net/search/) and allows users to find dictionary entries that have been tagged as loanwords.  It’s great to have this feature live.  I also needed to update the bibliographical links to the DEAF website as they had changed their site, breaking all of the links we previously had.  This required me to update the links in a few places, but all is working again now – for example see the DEAF links on this page: https://anglo-norman.net/bibliography/

I also made a couple of small fixes to the Emblems website (https://www.emblems.arts.gla.ac.uk/french/), ensured project images are not too large on the Glasgow Medical Humanities website (https://glasgowmedhums.ac.uk/projects/) and fixed a typo that had existed for many years on the New Modernist Editing ‘Digital Ode’ site (https://nme-digital-ode.glasgow.ac.uk).  I also had a chat with Craig Lamont about the interactive map of Burns correspondence that I will be developing.  There will soon be an RA who will begin to compile the data and I will need to start creating a content management system for this in the next week or so.

I also continued to make updates to the new Speak For Yersel surveys (that are not yet quite ready to launch).  This included adding in animated ‘bubble’ maps to the homepage and updating the survey tool to incorporate an optional question to the registration page about bilingualism, which I then incorporated into the surveys for Wales and the Republic of Ireland.

I then rounded off the week by making further updates to the Speech Star website, including adding text to the ‘About’ page, adding in some new videos, replacing some existing ones, updating metadata and ensuring all video popups are bookmarkable and have full citation information.  For example, here’s a direct link to the ‘Lip consonants’ video in ‘Sound groups’: https://speechstar.ac.uk/speech-sound-animations/#location=65.

Week Beginning 6th May 2024

It was the May Day bank holiday on Monday this week.  I then spent quite a bit of Tuesday and Wednesday completing several mandatory training courses that all University staff must take every few years.  I had six to complete before the end of July so decided to get them all out of the way this week, each taking between 30 and 90 minutes to finish.

I spent most of the remainder of the week continuing to update the BARS review site for Matt Sangster.  I’d made a bit of progress with this last week (upgrading from 2.4.3 to 2.4.8.5) but this week I needed to perform a major upgrade from 2.4.8.5 to 3.2.1 and then a further upgrade to the current version.  Upon starting the process I quickly realised that version 3.2.1 is still not compatible with PHP 8 so I needed to downgrade PHP on my laptop to PHP7 to get things to work.  It was also not documented that the config script’s database connection needed to be changed from ‘mysql’ to ‘mysqli’, and this took a while to figure out.

I managed to get the upgrade to start executing but the process was failing midway through with a very unhelpful ‘Specified key was too long; max key length is 1000 bytes’ error.  This is a MySQL error but it was in no way clear exactly when and why the error was cropping up.  I spent ages looking through numerous source code files to try and understand what was going on and then tried to modify things.  But each time the error appeared I needed to delete and reinstall the database as it was then left in a half-upgraded state, which was a bit of a pain.  I found this posting https://forum.pkp.sfu.ca/t/install-ojs-3-0-2-on-windows-with-php-7-1-8/32901/7 but it suggested upgrading MySQL would fix the issue and I was already running the most recent version and other information I found (such as https://stackoverflow.com/questions/1814532/mysql-error-1071-specified-key-was-too-long-max-key-length-is-767-bytes) didn’t really help much either.

I noticed that the existing tables were a mixture of MyISAM and InnoDB storage engine tables so then decided to set them all to InnoDB (each needed to be done separately and there are almost 150 tables) but this still didn’t sort things.  The main difficulty was that I didn’t know exactly which of the queries the upgrade script was sending to the database was causing the error, and looking at the code I couldn’t figure out how to add some kind of log to trace things.  Thankfully one of the answers to this question https://stackoverflow.com/questions/4631133/mysql-log-of-invalid-queries noted that you can tell MySQL to log all queries that are passed to it and after doing so I could pinpoint exactly what in the upgrade script was causing the error.
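
For anyone facing the same task, the per-table engine change can be scripted rather than done by hand.  The following is a minimal Python sketch that only builds the SQL statements.  It does not connect to a database, the function name is hypothetical and the engines assigned to the example tables are invented for illustration.  The `ALTER TABLE ... ENGINE=` syntax and the `general_log` system variable are standard MySQL.

```python
def engine_statements(tables, engine="InnoDB"):
    """Build one ALTER TABLE statement for each table not already using `engine`.

    `tables` maps table name -> current storage engine, e.g. as read from
    information_schema.TABLES.
    """
    return [
        f"ALTER TABLE `{name}` ENGINE={engine};"
        for name, current in sorted(tables.items())
        if current.lower() != engine.lower()
    ]


# Standard MySQL statements for switching on the general query log,
# which records every query the server receives
QUERY_LOG_STATEMENTS = [
    "SET GLOBAL log_output = 'FILE';",
    "SET GLOBAL general_log = 'ON';",
]

# Table names are taken from this post; their engines here are hypothetical
example_tables = {
    "email_templates_settings": "MyISAM",
    "submission_files": "InnoDB",
    "article_files": "MyISAM",
}

for statement in engine_statements(example_tables):
    print(statement)
```

The general query log can grow quickly, so it is worth switching it off again (`SET GLOBAL general_log = 'OFF';`) once the failing query has been found.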

The upgrade script created a number of new tables and one query was ‘ALTER TABLE email_templates_settings ADD  UNIQUE INDEX email_settings_pkey  (email_id, locale, setting_name)’ and it was this that caused the error and made the process fail.  The problem was this appeared to be a perfectly valid query.  Thankfully I then noticed that while existing tables were using the collation ‘utf8mb3_general_ci’ the tables created by the upgrade script were set to ‘utf8mb4_0900_ai_ci’ as this is the default collation now used by MySQL.  Unfortunately this collation uses more bytes than the other one and was pushing the key length over the limit.  After changing the database’s default collation and running things again the upgrade process finally completed successfully.
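
The arithmetic behind this can be sketched in a few lines of Python.  The column definitions below are assumptions (a BIGINT `email_id`, a VARCHAR(14) `locale` and a VARCHAR(255) `setting_name`) and should be checked against the real table.  The calculation also ignores the small per-column length overhead MySQL adds, so it is only an approximation.  What is certain is that utf8mb3 reserves up to 3 bytes per character and utf8mb4 up to 4.

```python
MAX_KEY_BYTES = 1000  # the limit quoted in the error message

# Maximum bytes reserved per character by each character set
BYTES_PER_CHAR = {
    "utf8mb3_general_ci": 3,
    "utf8mb4_0900_ai_ci": 4,
}

# Assumed column definitions for the index (email_id, locale, setting_name)
INDEX_COLUMNS = [
    ("email_id", "bigint", None),
    ("locale", "varchar", 14),
    ("setting_name", "varchar", 255),
]


def index_bytes(columns, collation):
    """Approximate the key length in bytes for an index under a collation."""
    total = 0
    for _name, kind, length in columns:
        if kind == "bigint":
            total += 8  # BIGINT is always 8 bytes
        else:
            total += length * BYTES_PER_CHAR[collation]
    return total


for collation in BYTES_PER_CHAR:
    size = index_bytes(INDEX_COLUMNS, collation)
    verdict = "fits" if size <= MAX_KEY_BYTES else "too long"
    print(f"{collation}: {size} bytes ({verdict})")
```

Under these assumptions the index needs 815 bytes with the older collation but 1084 bytes with the newer one, which would explain why a query that looks perfectly valid only fails on the newly created tables.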

However, I was still unable to get the HTML and PDF versions of the articles to appear on my local PC.  When I click on the links I just get a blank page, even though I’ve updated the path to files in the config file.  This will require some further investigation.

I was then able to perform the final upgrade from 3.2.1 to 3.4.  I was hoping this would be straightforward but unfortunately I still ran into a few problems.  The upgrade failed with the error ‘Failed to open the referenced table ‘files’ (SQL: alter table submission_files add constraint submission_files_file_id_foreign foreign key (file_id) references files (file_id))’ and there was a single Google hit referring to this (https://forum.pkp.sfu.ca/t/upgrade-ojs-3-2-1-4-to-3-3-0-5-error/67785) but unfortunately the suggested solution (updating the database user privileges) did not work for me as my database user already had full privileges.  Thankfully I spotted that during the previous upgrade some new tables had been created and these were using the MyISAM storage engine rather than InnoDB.  I managed to update MySQL on my PC to ensure InnoDB was the default engine (see https://stackoverflow.com/questions/4199446/how-to-make-innodb-as-default-engine) and this fixed the issue.  I then encountered some further errors relating to log files that needed to be deleted, but after doing so the upgrade process completed successfully.  Unfortunately the HTML and PDF articles are still not loading, and I was hoping that this was an issue with running the software on my Windows PC.  However, I then uploaded the newly upgraded site to a test server and the issue persisted.  I’ll need to investigate this further next week – I just hope I don’t need to begin the upgrade process all over again to resolve it.

Also this week I made a number of further changes to the Speech Star website, including adding several new video clips and updating video metadata.  There are still a few additional tasks I need to perform for this project and I’ll start to tackle them next week.

Week Beginning 30th April 2024

I worked for several different projects this week.  On Monday I had a meeting with the Books and Borrowing PI and Co-I Katie and Matt and some researchers at UCL who are working on a project that deals with similar sorts of data to Books and Borrowing.  They were interested in how we developed the B&B resource and we had a good discussion about various technical and organisational issues for about an hour.  As a result of this meeting I decided to publish the development documentation for the project, which is now available here: https://borrowing.stir.ac.uk/documents/.  This includes the data description document, the requirements documents for the project’s content management system, front-end and API, plus my ‘developer diary’, consisting of all of the sections relating to B&B taken from this blog.  Hopefully the documents will be of some use to future projects.

Also for the Books and Borrowing project this week, the server that hosts the site at Stirling underwent a major upgrade and I therefore had to check through the site and the CMS to ensure everything still worked properly.  There were a few issues with the CMS that I needed to fix, but other than that all would appear to be fine.

I also began the process of updating the BARS review site for Matt.  This site is powered by the Open Journal System, but the version in use is now extremely out of date and needs an upgrade – a process that is not at all straightforward.  I spent about a day working on the upgrade and the process is not even close to completion yet.  It would appear that no 2.X versions of OJS will run on PHP 7, never mind the current PHP 8 and unfortunately my local server setup doesn’t function with anything before PHP 7.  I managed to get a version of PHP5 running on an old PC I’ve got but after copying the OJS files and database I’d taken from the live hosting onto this PC (and updating the database connection settings) I couldn’t get the site to load – all I got was a strange browser error that I’d never encountered before.

The current live version of the site is running version 2.4.3 of OJS, and the upgrade pathway states that this would need to be upgraded to 2.4.8.5 (which will then need to be upgraded to 3.2.1, which can then be upgraded to the most recent 3.4 release).  I therefore decided to try installing a fresh version of 2.4.8.5 to see if I had any better luck with that.  Initially things didn’t look good as when attempting to access the locally hosted site all I got was a fatal PHP error.  I managed to track this down and hacked about with the code to stop the error cropping up (I figured that since this version is only an interim installation it wouldn’t really matter).  After that the fresh install (with no BARS data) appeared to run successfully.

The upgrade path documentation (https://docs.pkp.sfu.ca/dev/upgrade-guide/en/) states that to upgrade to a newer version you should set up the newer version and then point it at the older version’s database and reinstate the directories holding the old site’s files, so this is what I did.  The upgrade script successfully noted the older database version and the upgrade process seemed to run successfully.  At this point I then had what appeared to be the BARS site running on my old PC.  However, there were some issues.  The colour scheme appears to have been lost and there are some other differences in the layout.  But more importantly, while the journals and articles all seem to be present and correct, pressing on the ‘HTML’ and ‘PDF’ links to access the content doesn’t actually load any content.  Thankfully Matt suggested that my local version of the site may need its config file updated to point to the correct files directory, and this was indeed the case, so progress is being made.

I then migrated everything onto my current PC, which currently has the PHP version set to 7, which OJS 2.4.8.5 should support.  Unfortunately it doesn’t and when I try to access the site I just get fatal PHP errors relating to functions called in the code that have been removed from PHP 7.  This is not necessarily a big issue, though, as now I have the site on my current PC I should be able to upgrade to OJS 3.X, a process I will continue with next week.

Also this week I made a few more tweaks to Matthew Creasy’s conference website and responded to a few emails that came in requesting my help with various things.  I also made several more updates to the Speech Star website, which is now actually live (https://speechstar.ac.uk/).  This included renaming the site and updating the homepage text, updating the favicon on the other Speech Star website to cut down on tab confusion (see https://www.seeingspeech.ac.uk/speechstar/) plus adding in a new menu item and link from this site to the other one.  I also replaced the vocal tract video found here: https://speechstar.ac.uk/speech-sound-animations/ with a newer version.

I also returned to my work on the new language search for the Anglo-Norman Dictionary, completing an initial version of it.  As with the label search, the languages are listed down the left and you can add or remove them from the panel on the right by clicking on them.  Boolean options appear when multiple languages are selected and there’s an option to limit the search to compound words, exclude compound words or search for all.  Below is a screenshot of how the search form currently looks:

The search is fully operational, but is not yet live as I need to hear back from the editor, Geert, about when he’d like the new feature to launch.

Finally this week I made some further updates to the Speak For Yersel follow-on projects.  This included making a number of changes to the Welsh survey, adding in the site logos that Mary had created using ArcGIS and adding in the top-level tabs that will enable users to switch between survey areas.  I also created a new ‘speech bubbles’ animated GIF for Northern Ireland based on images Mary sent me.  The new survey areas are not yet publicly available but below is a screenshot of the Northern Ireland area, showing the logo, the top-level tabs and the animated GIF.  We’re getting there!

Week Beginning 22nd April 2024

The Books and Borrowing project (https://borrowing.stir.ac.uk/) had its official launch on Friday this week, and it was great to celebrate the completion of a project that has been such a major part of my working life for these past four years.  I spent a lot of this week preparing for the launch, at which I gave a talk about the creation of the resource.  This covered the definition of the data structures, the creation of the database, the planning and development of the content management system and then the front-end and API for the project.  There was a lot more I could have said about our use of technologies such as the IIIF server for images and Apache Solr for search facilities, but I had to keep the talk relatively brief and couldn’t include everything.  I think my talk went pretty well, and it was also really great to hear from the other members of the project team, many of whom also gave talks about the research they had undertaken using the resource.

As part of my preparations I also got the site running on my laptop in case there were any issues with the server or general internet connection during the launch.  I also spotted and fixed a small bug with the search results filter options.  The filter by place of publication was working, but when a place was selected it was not getting ‘ticked’ in the left-hand filter options.  This meant it was rather difficult to unselect the filter.  The issue must have been introduced when I updated how publication places were stored in the Solr index a few months ago and thankfully I was able to fix it without having to regenerate the data.

Also this week I made a small tweak to the website for the International James Joyce Symposium (https://ijjf2024.glasgow.ac.uk/), adding a further logo and link to the footer.  I also made a few more minor updates to the Speech Star website.

I also continued to work on the new language search for the Anglo-Norman Dictionary.  I updated my data import scripts so that language data would be extracted during batch import of data and ensured existing language data was deleted during the process of deleting older data.  I then updated the dictionary’s content management system to ensure that language data was properly dealt with when entries were added or edited through the system and updated the ‘view entry’ page in the system so that language data for each entry is now visible on the page, in the same way as parts of speech, labels and other data.

I spent the remainder of my week working on the Speak For Yersel follow-on projects, updating several of the questions for the Republic of Ireland, adding in introductory text for this survey area and updating all three survey areas to add some explanatory text to the start of each survey question.