Week Beginning 23rd November 2020

This was a four-day week for me as I had an optician’s appointment on Tuesday, and as my optician is over the other side of the city (handy for work, not so handy for working from home) I decided I’d just take the day off.  I spent most of Monday working on the interactive map of Burns Suppers I’m developing with Paul Malgrati in Scottish Literature.  This week I needed to update my interface to incorporate the large number of filters that Paul wants added to the map.  In doing so I had to change the way the existing ‘period’ filter works to fit in with the new filter options, as previously the filter was being applied as soon as a period button was pressed.  Now you need to press the ‘apply filters’ button to see the results displayed, and this allows you to select multiple options without everything reloading as you work.  There is also a ‘reset filters’ button which turns everything back on.  Currently the ‘period’, ‘type of host’ and ‘toasts’ filters are in place, all as a series of checkboxes that allow you to combine any selections you want.  Within a section (e.g. type of host) the selections are joined with an OR (e.g. ‘Burns Society’ OR ‘Church’).  Separate sections are joined with an AND (e.g. ‘2019-2020’ AND (‘Burns Society’ OR ‘Church’)).  For ‘toasts’ a location may have multiple toasts and the location will be returned if any of the selected toasts are associated with the location.  E.g. if a location has ‘Address to a Haggis’ but not ‘Loyal Toast’ it will still be returned if you’ve selected both of these toasts.  I also updated the introductory panel to make it much wider, so we can accommodate some text and also so there is more room for the filters.  Even with this it is going to be a bit tricky to squeeze all of the filters in, but I’ll see what I can do next week.
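
The filter logic described above (OR within a section, AND between sections) can be illustrated with a minimal Python sketch.  This is not the map's actual code: the Location class, the section keys, the sample suppers and values such as ‘School’ and ‘1990-1999’ are all hypothetical, and treating a section with nothing ticked as ‘no constraint’ is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Location:
    """Hypothetical record: a name plus the values it has in each filter section."""
    name: str
    attributes: Dict[str, Set[str]]


def matches(location: Location, selected: Dict[str, Set[str]]) -> bool:
    """AND across sections; OR (any overlap) within a section."""
    for section, chosen in selected.items():
        if not chosen:
            # Assumption: a section with nothing ticked imposes no constraint.
            continue
        if not (location.attributes.get(section, set()) & chosen):
            return False
    return True


def apply_filters(locations: List[Location], selected: Dict[str, Set[str]]) -> List[str]:
    """Return the names of the locations that pass every section's filter."""
    return [loc.name for loc in locations if matches(loc, selected)]


# Invented sample data, purely for illustration.
LOCATIONS = [
    Location("Supper A", {"period": {"2019-2020"}, "host": {"Burns Society"},
                          "toasts": {"Address to a Haggis"}}),
    Location("Supper B", {"period": {"2019-2020"}, "host": {"School"},
                          "toasts": {"Loyal Toast"}}),
    Location("Supper C", {"period": {"1990-1999"}, "host": {"Church"},
                          "toasts": {"Address to a Haggis", "Loyal Toast"}}),
]

selected = {
    "period": {"2019-2020"},
    "host": {"Burns Society", "Church"},
    "toasts": {"Address to a Haggis", "Loyal Toast"},
}
print(apply_filters(LOCATIONS, selected))
```

With these selections only ‘Supper A’ passes: it has just one of the two selected toasts, which is enough, whereas ‘Supper B’ fails on host and ‘Supper C’ fails on period.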

I then turned my attention to the Dictionary of the Scots Language.  Ann had noticed last week that the full text searches were ‘accent sensitive’ – i.e. entries containing accented characters would only be returned if your search term also contained the accented characters.  This is not what we wanted and I spent a bit of time investigating how to make Solr ignore accents.  Although I found a few articles dealing with this issue they were all for earlier versions of Solr and the process didn’t seem all that easy to set up, especially as I don’t have direct access to the server that Solr resides on so tweaking settings is not an easy process.  I decided instead to just strip out accents from the source data before it was ingested into Solr, which was a much more straightforward process and fitted in with another task I needed to tackle, which was to regenerate the DSL data from the old editing system, delete the duplicate child entries and set up a new version of the API to use this updated dataset.  This process took a while to go through, but I got there in the end and we now have a new Solr collection set up for the new data, which has the accents and the duplicate child entries removed.  I then updated one of our test front-ends to connect to this dataset so people can test it.  However, whilst checking this I realised that stripping out the accents has introduced some unforeseen issues.  The full text search is still ‘accent sensitive’, it’s just that I’ve replaced all accented characters with their non-accented equivalents.  This means an accented search will no longer find anything.  Therefore we’ll need to ensure that any accents in a submitted search string are also stripped out.  In addition, stripping accents out of the Solr text also means that accents no longer appear in the snippets in the search results, meaning the snippets don’t fully reflect the actual content of the entries.  This may be a larger issue and will need further discussion.
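
As a rough illustration of the accent stripping, here is a minimal Python sketch using the standard library's unicodedata module.  It is not the script that was actually used; the function name strip_accents is hypothetical.  The point it shows is that the same function has to be applied both to the text before it is indexed and to any submitted search string, otherwise accented searches stop matching.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining accent marks, e.g. 'café' becomes 'cafe'."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks themselves.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Apply the same normalisation on both sides (illustrative strings only).
indexed_text = strip_accents("a wee café")
query = strip_accents("café")
print(query in indexed_text)
```

Note that this approach only handles letters that decompose into a base letter plus an accent.  Characters such as ‘ø’ or ‘æ’ have no such decomposition and would pass through unchanged.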
I also wrote a little script to generate a list of entry IDs that feature an <su> inside a <cref> excluding those that feature </geo><su> or </geo> <su>.  These entries will need to be manually updated to ensure the <su> tags don’t get displayed.
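
A check along these lines might look like the following Python sketch.  It is not the actual script: the entries dictionary, the IDs and the sample XML are invented, and it takes one particular reading of the rule, namely that a <su> directly following </geo> (with or without a single space) is ignored and any other <su> inside a <cref> causes the entry to be listed.  It also uses regular expressions rather than a proper XML parser, which is only reasonable for a quick one-off check.

```python
import re
from typing import Dict, List

# Content of each <cref>...</cref> element (non-greedy, may span lines).
CREF_RE = re.compile(r"<cref\b[^>]*>(.*?)</cref>", re.DOTALL)
# An opening <su> tag; \b stops this matching other tags such as <sub>.
SU_RE = re.compile(r"<su\b[^>]*>")
# The permitted pattern: <su> straight after </geo>, optionally with one space.
GEO_SU_RE = re.compile(r"</geo> ?<su\b[^>]*>")


def has_unexpected_su(xml: str) -> bool:
    """True if any <cref> contains a <su> that does not directly follow </geo>."""
    for body in CREF_RE.findall(xml):
        remaining = GEO_SU_RE.sub("", body)  # discard the permitted cases
        if SU_RE.search(remaining):
            return True
    return False


def entries_to_fix(entries: Dict[str, str]) -> List[str]:
    """Return the IDs of entries that need a manual update, in sorted order."""
    return sorted(eid for eid, xml in entries.items() if has_unexpected_su(xml))


# Hypothetical sample entries keyed by ID.
entries = {
    "a1": "<entry><cref>see <su>2</su></cref></entry>",
    "a2": "<entry><cref><geo>Abd.</geo><su>1</su></cref></entry>",
    "a3": "<entry><cref><geo>Abd.</geo> <su>1</su></cref></entry>",
    "a4": "<entry><su>3</su><cref>x</cref></entry>",
}
print(entries_to_fix(entries))
```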

I spent the remainder of the week continuing with the redevelopment of the Anglo-Norman Dictionary site.  My 51-item ‘to do’ list that I compiled last week has now shot up to 65 items, but thankfully during this week I managed to tick 20 of the items off.  I have now imported all of the corrected data that the editors were working on, including not just the new ‘R’ records but corrections to other letters too.  I also ensured that the earliest dates for entries now appear in the ‘search results’ and ‘log’ tabs in the left-hand panel, and these dates also appear in the fixed header that gets displayed when you scroll down long entries.  I also added the date to the title in the main page.

I added content to the ‘cite this entry’ popup, which now contains the same citation options as found on the Historical Thesaurus site, and I added help text to the ‘browse’, ‘search results’ and ‘entry log’ pop-ups.  I made the ‘clear search’ button right aligned and removed the part of speech tooltips as parts of speech appear all over the place and I thought it would confuse things if they had tooltips in the search results but not elsewhere.  I also added a ‘Try an advanced search’ button at the top of the quick search results page.

I set up WordPress accounts for the editors and added the IP addresses used by the Aberystwyth VPN to our whitelist to ensure that the editors will be able to access the WordPress part of the site.  I also added redirects that will work from old site URLs once the new site goes live, and put redirects in for all of the static pages that I could find.  We also now have the old site accessible via a subdomain so we should be able to continue to access the old site when we switch the main domain over to the new site.

I figured out why the entry for ‘ipocrite’ was appearing in all bold letters.  This was caused by an empty XML tag and I updated the XSLT to ensure these don’t get displayed.  I updated Variant/Deviant forms in the search results so they just have the title ‘Forms’ now.  I also think I’ve figured out why the sticky side panel wasn’t sticking, and I think I’ve fixed it.

I updated my script that generates citation date text and regenerated all of the earliest date text to include the prefixes and suffixes.  I also investigated the issue of commentary boxes appearing multiple times.  I checked all of the entries and there were about 5 that were like this, so I manually fixed them.  I also ensured that when loading the advanced search after a quick search the ‘headword’ tab is now selected and the quick search text appears in the headword search box, and I updated the label search so that label descriptions appear in a tooltip.

Also this week I participated in the weekly Iona Placenames Zoom call and made some tweaks to the content management systems for both Iona and Mull to add researcher notes fields (English and Gaelic) to the place-name elements.  I also had a chat with Wendy Anderson about making some tweaks to the Mapping Metaphor website for REF and spoke to Carole Hough about her old Cogtop site that is going to expire soon.