
Week Beginning 2nd October 2023
This was a week of many different projects. On Monday I completed work on a new project website for Petra Poncarova in Scottish Literature, and it is now publicly accessible (see https://erskine.glasgow.ac.uk/). I also added a blog page to Ophira Gamliel’s project website, created a page for their first blog post (now available here: https://himuje-malabar.glasgow.ac.uk/reconnecting-the-split-moon/) and updated the site to include a link to the blog in the site menu. This required shifting a few things around to make room for the new menu item. I also investigated an issue Luca was having in migrating one of Graeme Cannon’s old websites which was similarly structured to the House of Fraser Archive site and managed to find the section of code that was causing the problem (a flag in a regular expression that has since been deprecated).
On Tuesday I completed my work on the CSV endpoints for the Books and Borrowing project, ensuring all nested arrays are ‘flattened’ when producing the two-dimensional CSV file. This has been a lengthy and tedious task, but it’s good that it’s done, and it should mean that future researchers will be able to extract and reuse the data in a relatively straightforward manner.
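As an illustration of what "flattening" means here, the following is a minimal Python sketch rather than the project's actual endpoint code. The record structure and field names (`title`, `borrowers`, `name`) and the dotted column naming scheme are assumptions made purely for the example.

```python
import csv
import io
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Collapse nested dicts and lists into one level, using dotted keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # A scalar: this is a single cell in the eventual CSV row.
        return {prefix: value}
    flat: dict[str, Any] = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


def to_csv(records: list[dict[str, Any]]) -> str:
    """Write flattened records as CSV, one row per record."""
    rows = [flatten(record) for record in records]
    # Use the union of all column names, in first-seen order.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical record, invented for this example.
example = [{"title": "Example", "borrowers": [{"name": "A"}, {"name": "B"}]}]
print(to_csv(example))
```

Records with fewer nested items than others simply get empty cells in the columns they lack, which keeps the output strictly two-dimensional.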
On Wednesday I met Luca and Stevie, two of my fellow College of Arts developers, to have a catch-up, which was hugely useful as always. We’ll hopefully meet up again in the next couple of months. I also responded to a request from Luca to help get some screenshots ready for print publication. Screenshots are generally 72 DPI but this is too low for print. I’ve previously got around this using Photoshop by loading the image then going to image -> image size. In the options you can then untick ‘Resample Image’ and then update the ‘resolution’ to whatever you want. I’ve never actually printed the resulting images to check any difference, but I’ve never had anyone come back and ask for better versions. I guess another option would be to take the screenshots on something like an iPad that natively runs at a higher DPI.
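With resampling turned off the pixel dimensions stay the same, so raising the resolution only changes how large the image prints. The small Python sketch below shows the arithmetic. The screenshot dimensions are made up for the example.

```python
def print_size_inches(width_px: int, height_px: int, dpi: int) -> tuple[float, float]:
    """Return the printed width and height in inches for a given resolution."""
    return (width_px / dpi, height_px / dpi)


# A hypothetical 1440 x 720 pixel screenshot.
print(print_size_inches(1440, 720, 72))   # printed size at screen resolution
print(print_size_inches(1440, 720, 300))  # same pixels, packed more densely
```

In other words, the same pixels cover a much smaller area at 300 DPI, so a screenshot that needs to fill a large area in print still needs more pixels to begin with.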
Also on Wednesday I spent some time on the DSL, investigating an issue with Google Analytics for Pauline Graham and then investigating a problem with phrase searching and highlighting that Pauline had also noticed on both the live and test sites. When a phrase was searched for, each individual word in the phrase was being highlighted in the entry, and if you then returned to the search results and went to an entry from there, no highlighting worked. Also, some search results were not featuring snippets. This turned out to be three separate issues that needed to be investigated and fixed:
- Separate word highlighting: The default setting in the highlighting library I installed a few months ago highlighted each word in a string. If there were multiple words separated by spaces then all matching words would be highlighted. Thankfully the library (https://markjs.io/) has a setting that only matches the entire string and I’ve activated this now. Now if you perform a search for a phrase such as ‘off or on’ and navigate to a result, only the exact phrase will be highlighted.
- Losing the highlighting when navigating back to the results and then to an entry: This was a problem with spaces getting encoded between pages. They were becoming the URL encoded equivalent ‘%20’ or ‘+’ and after that the string no longer matched. I’ve sorted this.
- Lack of snippets: The issue was down to the length of the entry. In Solr, the snippet generation is a separate process to the search matching. While the search checks the entire entry the snippet generation by default only looks at the first 51,200 characters. An entry such as ‘Mak’ is a long entry and if the search term only matches text quite far down the entry a snippet doesn’t get created. After discovering this I’ve updated the setting so that 100,000 characters are analysed instead and this has fixed the issue. More information about this can be found at https://stackoverflow.com/questions/52511154/solr-empty-highlight-entry-on-match.
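The last two fixes can be sketched in Python as follows. This is an illustration under assumptions, not the DSL site's actual code: the field name `fulltext` and both function names are hypothetical, and phrases are assumed not to contain double quotes. The Solr parameters `hl`, `hl.fl` and `hl.maxAnalyzedChars` are standard highlighting parameters.

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode


def decode_search_term(raw: str) -> str:
    """Turn a URL-encoded search term back into plain text.

    unquote_plus handles both encodings of a space: '%20' and '+'.
    Note that it also turns a literal '+' into a space.
    """
    return unquote_plus(raw)


def solr_highlight_params(phrase: str, field: str = "fulltext",
                          max_chars: int = 100_000) -> dict[str, str]:
    """Build query parameters for a phrase search with snippets.

    The field name is a placeholder. Raising hl.maxAnalyzedChars above
    Solr's default of 51200 lets matches far down long entries produce snippets.
    """
    return {
        "q": f'{field}:"{phrase}"',
        "hl": "true",
        "hl.fl": field,
        "hl.maxAnalyzedChars": str(max_chars),
    }


# The same phrase encoded two ways decodes to one string.
print(quote("off or on"), quote_plus("off or on"))
print(decode_search_term("off%20or%20on") == decode_search_term("off+or+on"))

# The query string that would be sent to Solr's select handler.
print(urlencode(solr_highlight_params("off or on")))
```

Decoding the term once on arrival means the string passed to the highlighter is the same whichever page the user came from.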
This investigation took some of Thursday as well, after which I moved back to the Books and Borrowing project, for which I spent some time generating data relating to the Royal High School for checking purposes. I also received some bid documentation for a proposal Gavin Miller is putting together. Gavin wanted me to read through the documentation and add in some further sections relating to the data. The data will consist of a directory of projects and resources which will be available to search and browse, and will also be visualised on an interactive map. I added in some information and hopefully the proposal will be a success.
On Friday I made some further updates to the Speech Star websites, adding in some new videos to the Edinburgh MRI Modelled Speech Corpus (https://www.seeingspeech.ac.uk/speechstar/edinburgh-mri-modelled-speech-corpus/) and arranging their layout a bit better. I also replied to a request from Rhona Brown, who would like a website to be set up for a new project she’s starting work on soon. I listed a few options we could pursue and I need to wait to hear more from her now.
I also spent quite a bit of time investigating some minor issues Ann Ferguson had spotted with the predictive search on the DSL website, most of which will thankfully be sorted when the new Solr-based headword search goes live.
Finally, I had a meeting with the Placenames of Iona project to discuss the development of a new ‘map first’ interface for the data. I met with Thomas, Sofia and Alasdair and it was really great to actually have an in-person meeting with them, having never done so before. We covered many aspects of the interface and had some really useful discussions. I’ll be starting on the development of the front-end in the coming weeks.