Week Beginning 17th July 2023

After two weeks out of the office on holiday and at a conference I returned to regular work this week, although it was only a four-day week due to Monday being the Glasgow Fair public holiday.  I spent most of Tuesday dealing with my expenses, figuring out what outstanding tasks I needed to get back to and catching up on emails and issues that had cropped up whilst I’d been away.  I also spent some time writing up my rather lengthy report from the conference, which you can read in last week’s blog post.

One of the issues that cropped up is that the embedded Twitter feeds on sites such as https://anglo-norman.net/ and https://burnsc21.glasgow.ac.uk/ have stopped working (and at the time of writing are still broken) and now only display ‘Nothing to see here – yet’.  It looks like this is yet another occurrence of Twitter being a disaster zone these days: embedded Twitter feeds appear to have been blocked.  Information about the issue can be found here: https://twittercommunity.com/t/again-list-widget-says-nothing-to-see-here-yet-if-logged-out/198782/205 and at the time of writing there would appear to be no workaround and absolutely no official word from Twitter about the issue.  If they have been blocked for good, this may be the end of embedded Twitter feeds and another nail in the coffin for Twitter.

I also fixed an issue with the Place-names of Iona project.  The scripts in the content management system for managing place-name elements hadn’t been working properly since the migration to a new server, and a couple of updates to the code sorted this out.

There was also an issue with the DSL’s advanced search that had been introduced when we migrated to a new Solr instance last month.  After the migration the advanced search snippets were sometimes being joined together and were occasionally far too long, so I introduced a fix for this.  Unfortunately the fix required the full dataset to be queried in Solr, with the results only filtered by source once they had been returned.  This meant that when a source dictionary was specified the displayed total referred to the unfiltered number of results, and for searches exceeding our maximum of 500 returned results the cap was applied before the source was taken into consideration.  As a result the total number of results was incorrect, and the number actually returned was not 500 but however many of the first 500 happened to belong to the source dictionary in question.  Thankfully I managed to sort this out and all should be behaving properly again now.

Also for the DSL I investigated the entry https://dsl.ac.uk/entry/dost/depredatio(u)n, which was giving an error message.  This one took a bit of time to figure out, but in the end it was something simple: this is the only entry that has a closing bracket in its ‘slug’, and my script was stripping closing brackets from slugs before connecting to the API to retrieve data.  No matching entry was therefore found and the XML file for this entry ended up empty.  I’ve fixed this issue now.
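To illustrate the advanced search fix described above, here is a rough sketch of the approach the behaviour suggests: apply the source filter to the full Solr result set before calculating the total and applying the 500-result cap.  The function and field names below are purely illustrative and are not the actual DSL code.

```php
<?php
// Illustrative sketch only: filter the full Solr result set by source
// dictionary *before* counting the total and applying the 500-result cap.
// The helper and the 'source' field are hypothetical, not the real DSL code.

const MAX_RESULTS = 500;

function filterAndLimit(array $solrDocs, ?string $source): array {
    // Keep only entries from the requested source dictionary (if one was given).
    if ($source !== null) {
        $solrDocs = array_values(array_filter($solrDocs, function ($doc) use ($source) {
            return $doc['source'] === $source;
        }));
    }

    // The displayed total now reflects the filtered set...
    $total = count($solrDocs);

    // ...and only then is the 500-result cap applied.
    return [
        'total'   => $total,
        'results' => array_slice($solrDocs, 0, MAX_RESULTS),
    ];
}
```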

An issue had also arisen with the Anglo-Norman Dictionary since a recent PHP upgrade on the server.  In the content management system we have a proofreader feature, which allows an editor to upload a ZIP file containing any number of entry XML files.  These are then extracted and formatted as they would be displayed on the public website, only with all of the entries on one long page.  However, since the PHP upgrade the proofreader produced a blank page whenever a ZIP file was submitted.  It was definitely not an issue with any one specific ZIP file, as I tested the proofreader with other files that should work and the result was the same.  It turned out that the library I use to extract and read ZIP files in the proofreader script is not compatible with the new version of PHP that the server has been upgraded to.  I created a simple test script that reads a ZIP file and it fails on the server but runs with no issues on my desktop PC, which runs an older version of PHP.
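A minimal test script of that kind might look something like the sketch below.  The post doesn’t name the ZIP library the proofreader actually used, so PHP’s built-in ZipArchive class stands in here purely as illustration; the point is simply to read an archive in isolation from the rest of the proofreader code, so that any failure can be pinned on the ZIP handling itself.

```php
<?php
// Minimal, self-contained ZIP-reading test, kept separate from the proofreader
// so that any failure points at the ZIP handling rather than the rest of the
// script. ZipArchive is used here purely as a stand-in; the original library
// isn't named in the post. 'test.zip' is a hypothetical sample archive.

$zip = new ZipArchive();
if ($zip->open('test.zip') !== true) {
    die("Could not open test.zip\n");
}

// Print each archived file's name and its size when extracted.
for ($i = 0; $i < $zip->numFiles; $i++) {
    $name     = $zip->getNameIndex($i);
    $contents = $zip->getFromIndex($i);
    echo $name . ': ' . strlen($contents) . " bytes\n";
}

$zip->close();
```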

I then spent some time getting to grips with the replacement ZIP library, but unfortunately the server did not have this library installed (or indeed any library for processing ZIP files in PHP since the upgrade).  I submitted a helpdesk query asking for the library to be installed and thankfully this was done a few hours later.  I could then replace my old script with a new one and the proofreader appeared to work again.  However, it became apparent that the new ZIP extraction script was cutting larger files off at a certain point, even though I’d set the script to check the size of the archived file and grab all of the file up to that size.  Updating the size to a much larger number had no effect either.  It turns out that when a file is extracted using the new ZIP library what you get back is a ‘stream’, which delivers the data in chunks.  I was placing a single read from this stream into a variable thinking it contained the full file, but that isn’t necessarily the case.  Instead I’ve added an extra bit of code that iterates over all of the chunks of data in the stream and ensures they are all appended to the variable, rather than only one part being added.  With this update in place the proofreader now works, even with very large files.
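As a rough sketch of the chunked-reading fix, assuming the replacement library is PHP’s built-in ZipArchive and its getStream() method (the post doesn’t name it explicitly), the idea is to keep reading from the stream until the end of the file is reached rather than relying on a single read:

```php
<?php
// Sketch of the chunked-stream fix, assuming ZipArchive::getStream() is the
// replacement extraction method (the actual library isn't named in the post).
// A single read from the stream may only return part of a large file, so the
// contents are accumulated chunk by chunk until the end of the stream.

function readZippedEntry(string $zipPath, string $entryName): string {
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException("Could not open $zipPath");
    }

    $stream = $zip->getStream($entryName);
    if ($stream === false) {
        $zip->close();
        throw new RuntimeException("Could not open a stream for $entryName");
    }

    // Iterate over the chunks of data in the stream and append each one,
    // rather than assuming a single read returns the whole file.
    $contents = '';
    while (!feof($stream)) {
        $contents .= fread($stream, 8192);
    }

    fclose($stream);
    $zip->close();
    return $contents;
}
```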

My final task for the week was to make a lot of updates to the Speech Star website (https://www.seeingspeech.ac.uk/speechstar/).  This included changing the title of the website and the ‘site tabs’ found in the top right of all three associated sites, making tweaks to the IPA and ExtIPA charts, updating the site text and adding in two videos that had been missed out.