Week Beginning 9th January 2023

I attended the workshop ‘The impact of multilingualism on the vocabulary and stylistics of Medieval English’ in Zurich this week.  The workshop ran on Tuesday and Wednesday and I travelled to Zurich with my colleagues Marc Alexander and Fraser Dallachy on Monday.  It was really great to travel to a workshop in a different country again as I’d not been abroad since before Lockdown.  I’d never been to Zurich before and it was a lovely city.  The workshop itself was great, with some very interesting papers and good opportunities to meet other researchers and discuss potential future projects.  I gave a paper on the Historical Thesaurus, its categories and data structures, and how semantic web technologies might be used to more effectively structure, manage and share its semantically arranged dataset.  It was a half-hour paper with 10 minutes for questions afterwards and it went pretty well.  The audience wasn’t especially technical and I’m not sure how interesting the topic was to most people, but it was well received.  I’m glad I had the opportunity both to attend the event and to research the topic, as I have greatly increased my knowledge of semantic web technologies such as RDF, graph databases and SPARQL.  As part of the research I also managed to write a script that generated an RDF version of the complete HT category data, which may come in handy one day.

I got back home just before midnight on the Wednesday and returned to normal work first thing on Thursday.  This included submitting my expenses from the workshop, replying to a few emails that had come in regarding my office (it looks like the dry rot work will take a while to resolve and that I’ll have to share my temporary office) and attempting to set up web hosting for the VARICS project, which Arts IT Support seem reluctant to do.  I also looked into an issue with the DSL that Ann Ferguson had spotted and spoke to the IT people at Stirling about their current progress with setting up a Solr instance for the Books and Borrowing project.  I also replaced a selection of library register images with better versions for that project and arranged a meeting for next Monday with the project’s PI and Co-I to discuss progress with the front-end.

I spent most of Friday writing a Data Management Plan and attending a Zoom call for a new speech therapy project I’m involved with.  It’s an ESRC funding proposal involving Glasgow and Strathclyde and I’ll be managing the technical aspects.  We had a useful call and I managed to complete an initial version of the DMP that the PI is going to adapt if required.

Week Beginning 2nd January 2023

The first week back after the Christmas holidays was supposed to be a three-day week, but unfortunately after returning to work on Wednesday I came down with some sort of winter vomiting virus that affected me throughout Wednesday night, and I was off work on Thursday.  I was still feeling very shaky on Friday but I managed to do a full day’s work nonetheless.

My two days were mostly spent creating my slides for the talk I’m giving at a workshop in Zurich next week and then practising the talk.  I also engaged in an email conversation about the state of Arts IT Support after the database on the server that hosts many of our most important websites went down on the first day of the Christmas holidays and remained offline for the best part of two weeks.  This took down websites such as the Historical Thesaurus, Seeing Speech, The Glasgow Story and the Emblems websites, and I had to spend time over the holidays replying and apologising to people who contacted me about the sites being unavailable.  As I don’t have command-line access to the servers there was nothing I could do to fix the issue, and despite several members of staff contacting Arts IT Support no response was received from them.  The issue was finally resolved on the 3rd of January, but we have still received no communication from Arts IT Support to inform us that the issue has been resolved, to let us know what caused it or to apologise for the incident, which is really not good enough.  Arts IT Support are in a shocking state at the moment due to critical staff leaving and not being replaced, and I’m afraid it looks like the situation may not improve for several months yet, meaning issues with our websites are likely to continue in 2023.

Week Beginning 19th December 2022

This was the last week before the Christmas holidays, and Friday was a holiday.  I spent some time on Monday making further updates to the Speech Star data.  I fixed some errors in the data and made some updates to the error type descriptions.  I also made ‘poster’ images for the latest batch of child speech videos I’d created last week, as this was something I’d forgotten to do at the time.  I also fixed some issues with the non-disordered speech data, including changing a dash to an underscore in the filenames for one speaker, as a mismatch between filenames and metadata meant that none of that speaker’s videos would open in the site.  I also created records for two projects (The Gentle Shepherd and Speak For Yersel) on this very site (see https://digital-humanities.glasgow.ac.uk/projects/last-updated/) as these are the projects I’ve been working on that have actually launched in the past year.  Other major ones such as Books and Borrowing and Speech Star are not yet ready to share.  I also updated all of the WordPress sites I manage to the latest version.

On Tuesday I travelled into the University to locate my new office.  My stuff had been moved across last week after a leak in the building resulted in water pouring through my office.  Plus work is ongoing to fix the dry rot in the building and I would have needed to move out for that anyway.  It took a little time to get the new office in order and to get my computer equipment set up, but once it was all done it was actually a very nice location – much nicer than the horrible little room I’m usually stuck in.

I spent most of Tuesday upgrading Google Analytics for all of the sites I manage that use it.  Google’s current analytics system is being retired in July next year and I decided to use the time in the run-up to Christmas to migrate the sites over to the new Google Analytics 4 platform.  This was a mostly straightforward process, although as usual Google’s systems feel clunky and counterintuitive at times.  It was also a fairly lengthy process as I had to update the tracking code for each site in question.  Nevertheless I managed to get it done and informed all of the staff whose websites would be affected by the change.  I also had a further chat with Geert, the editor of the Anglo-Norman Dictionary, about the new citation edit feature I’m planning at the moment.

On Wednesday I had a meeting with prospective project partners in Strathclyde about a speech therapy proposal we’re putting together.  It was good to meet people and to discuss things.  I’ll be working on the Data Management Plan for the proposal after the holidays.  I spent the rest of the day working on my paper for the workshop I’m attending in Zurich in the second week of January.  I have now finished the paper, which is quite a relief.

On Thursday I spent some time working for the Dictionaries of the Scots Language.  I responded to an email from Ann Fergusson about how we should handle links to ancillary pages in the XML.  There are two issues here that need to be agreed upon.  The first issue is how to represent links to things other than entries in the entry XML.  We currently have the <ref> element that is used to link from one entry to another (e.g. <ref refid="snd00065761">Chowky</ref>).  We could use the HTML element <a> in the XML for links to things other than entries, but in my opinion it’s best not to, as it’s better for XML elements to be meaningful when you look at them and the meaning of <a> isn’t especially clear.  It might be better to use <ref> with a different attribute instead of ‘refid’, for example <ref url="https://dsl.ac.uk/geographical-labels">.  Reusing <ref> means we don’t need to update the DTD (the rules that define which elements can be used where in the XML) to add a new element.

Of course other people may think that inventing our own way of writing HTML links is daft when everyone is already familiar with <a href="https://dsl.ac.uk/geographical-labels"> and we could use the latter if people prefer.  If this is the case we would need to update the DTD to allow such elements to be used.  If we didn’t update the DTD the XML files would fail to validate.

Whichever way is chosen, there is a second issue that will need to be addressed:  I will need to update the XSLT that transforms the XML into HTML to tell the script how to handle either a <ref> with a ‘url’ attribute or an <a> with an ‘href’ attribute.  Without updating the XSLT the links won’t work.  I can add such a rule in when we decide how best to represent links in the XML.

I also made a couple of tweaks to the wildcard search term highlighting feature I was working on last week and then published the update on the live DSL site.  Now when you perform a search for something like ‘chr*mas’ and then select an entry to view, any word that matches the wildcard pattern will be highlighted.  For example, go to this page: https://dsl.ac.uk/results/chr*mas/fulltext/withquotes/both/ and then select one of the entries and you’ll see the term highlighted in the entry page.

That’s all from me for this year.  Merry chr*mas one and all!

Week Beginning 12th December 2022

There was a problem with the server on which a lot of our major sites such as the Historical Thesaurus and Seeing Speech are hosted.  It started on Friday and left all of the sites offline until Monday.  This was a really embarrassing and frustrating situation and I had to deal with lots of emails from users of the sites who were unable to access them.  As I don’t have command-line access to the servers all I could do was report the issue via our IT Helpdesk system.  Thankfully by mid-morning on Monday the sites were all back up again, but the incident raised serious issues about the state of Arts IT Support, who are massively understaffed at the moment.  Arts IT also refused to set up hosting for a project that we’re collaborating with Strathclyde University on, and in fact stated that they would not set up hosting for any further websites.  This will have a massive negative impact on several projects that are still in the pipeline and ultimately means I will not be able to work on any new projects until it is resolved.  The PI for the new project with Strathclyde is Jane Stuart-Smith, and thankfully she was also far from happy with the situation.  We arranged a meeting with Liz Broe, who oversees Arts IT Support, to discuss the issues and had a good discussion about how we ended up in this state and how things will be resolved.  In the short term some additional support is being drafted in from other colleges while new staff will be recruited in the medium term, and Liz has stated that hosting for new websites (including the Strathclyde one) will continue to be offered, which is quite a relief.

I also discovered this week that there has been a leak in 13 University Gardens and water has been pouring through my office.  I was already scheduled to be moved out of the building due to the dry rot that they’ve found all the way up the back wall (which my office is on) but this has made things a little more urgent.  I’m still generally working from home every day except Tuesday and apparently all my stuff has been moved to a different building, so I’ll just need to see how the process has gone when I’m back in the University next week.

In terms of actual work this week, I spent a bit more time writing my paper about the Historical Thesaurus and Semantic Web technologies for the workshop in January.  This is coming together now, although I still need to shape it into a presentation, which will take time.  I also spent some time working on the Speech Star project, updating the speech error database to fix a number of issues with the data that Eleanor had spotted and then adding in new error type descriptions for new error types that had been included.  I also added in some ancillary page content and had a chat with Eleanor about the database system the website uses.

I also spent some time working for the DSL this week.  Rhona had noted that when you perform a full text or quotation search (i.e. a search using Solr) with wildcards (e.g. chr*mas) the search results display entries with snippets that highlight the whole word where the search string occurred (e.g. ‘Christmas’).  However, when clicking through to the entry page such highlighting was not appearing, even though highlighting in the entry page does work when performing a search without wildcards.

Highlighting in the entry page was handled by a jQuery plugin, but this was not written to take wildcards into consideration and only works on full words.  I spent some time trying to figure out how to get wildcard highlighting working myself using regular expressions, which I find pretty awful to work with (an ancient relic left over from earlier eras of computing), and although I managed to get something working it wasn’t ideal.  Thankfully I found an existing JavaScript library called mark.js (https://markjs.io/) that can handle wildcard highlighting, and I was able to replace the existing plugin with this script and update the code to work with it.  I tested this out on our test DSL site and all seems to work well.  I haven’t updated the live site yet, though, as the DSL team need to test the new approach out more fully in case they encounter any problems with it.  I also noticed that there was an issue with the quotation search whereby if you returned to the search results from an entry by clicking on the ‘return to results’ button you got an empty page.  I fixed this in both our live and test sites.
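For reference, the highlighting call itself is very small.  Here’s a minimal sketch of how mark.js can be used for this, assuming a hypothetical container id for the entry text; the real DSL code differs in its details:

```javascript
// Highlight a wildcard search term within the entry text using mark.js.
// The container id 'entryContent' is a placeholder, not the DSL's actual markup.
var entryText = document.querySelector("#entryContent");
var highlighter = new Mark(entryText);

// Clear any previous highlights, then mark the new term.
highlighter.unmark({
  done: function () {
    highlighter.mark("chr*mas", {
      wildcards: "enabled",       // '*' matches any number of non-space characters, '?' matches one
      separateWordSearch: false,  // treat the search string as a single term
      className: "highlight"      // CSS class applied to the generated <mark> elements
    });
  }
});
```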

I also spent some time working for the Anglo-Norman Dictionary this week.  I updated the citation search on the public website.  Previously the citation text was only added into the search results if you also searched for a specific form within a siglum, for example https://anglo-norman.net/search/citation/%22tout%22/null/A-N_Falconry; other citation searches (e.g. just selecting a siglum and/or a siglum date) would only return the entries the siglum appeared in, without the individual citations.  Now the citations appear in these searches too.  For example, all citations from A-N Falconry: https://anglo-norman.net/search/citation/null/null/A-N_Falconry and all citations where the citation date is 1400: https://anglo-norman.net/search/citation/null/1400.  This also means that when you view the citations by pressing the ‘Search AND Citations’ button for a siglum in the bibliography you now see each citation for the listed entries.

I then spent most of a day thinking through all of the issues relating to the new ‘DMS citation search and edit’ feature that the editor wants me to implement and wrote an initial document detailing how the feature will work.  There has been quite a lot to think through and I thought it wise to document the feature rather than just launching into its creation without a clear plan.  I might have some time to start work on this next week as I’m working up to and including Thursday, but it depends how I get on with some other tasks I need to do for other projects.

Also this week I attended the Christmas lunch for the Books and Borrowing project in Edinburgh.  Unfortunately there was a train strike that day so I decided to get the bus through to Edinburgh.  The journey there was fine, taking about an hour and a half, but I got the 4pm bus on the way back and it was a nightmare, taking two hours and forty minutes.  I won’t be getting the bus between Glasgow and Edinburgh anywhere near rush hour ever again.

Week Beginning 5th December 2022

I continued my research into RDF, the semantic web and linked open data and how they could be applied to the data of the Historical Thesaurus this week, in preparation for a paper I’ll be giving at a workshop in January and also to learn more about these technologies and concepts in general.  I followed a few tutorials about RDF, for example here https://cambridgesemantics.com/blog/semantic-university/learn-rdf/ and read up about linked open data, for example here https://www.ontotext.com/knowledgehub/fundamentals/linked-data-linked-open-data/.  I also found a site that visualises linked open data projects here https://lod-cloud.net/.

I then manually created a small sample of the HT’s category structure, featuring multiple hierarchical levels and both main and subcategories, in the RDF/XML format using the Simple Knowledge Organization System (SKOS) model.  This is a W3C standard for representing thesaurus data in RDF.  More information about it can be found on Wikipedia here: https://en.wikipedia.org/wiki/Simple_Knowledge_Organization_System and on the W3C’s website here https://www.w3.org/TR/skos-primer/, here https://www.w3.org/TR/swbp-skos-core-guide/, here https://www.w3.org/TR/swbp-thesaurus-pubguide/ and here https://www.w3.org/2001/sw/wiki/SKOS/Dataset.  I also referenced a guide to SKOS for Information Professionals here https://www.ala.org/alcts/resources/z687/skos.  I then imported this manually created sample into the Apache Jena server I set up last week to test that it would work, which thankfully it did.

After that I wrote a small script to generate a comparable RDF structure for the entire HT category system.  I ran this on an instance of the database on my laptop to avoid overloading the server, and after a few minutes of processing I had an RDF representation of the HT’s hierarchically arranged categories in an XML file that was about 100MB in size.  I fed this into my Apache Jena instance and the import was a success.  I then spent quite a bit of time getting to grips with SPARQL, the query language used to interrogate RDF data, and by the end of the week I had managed to replicate some of the queries we use in the HT to generate the tree browser, for example ‘get all noun main categories at this level’ or ‘get all noun main categories that are direct children of a specified category’.
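To give a flavour of what this looks like, here is a minimal sketch of how one of those hierarchy queries could be sent to the local Fuseki endpoint from JavaScript.  The dataset name, concept URI and the use of skos:broader for the parent/child relationship are placeholder assumptions of mine rather than the actual HT RDF structure:

```javascript
// Fetch the direct children of a specified category from a local Fuseki server.
// The dataset name ('ht') and the concept URI are hypothetical placeholders.
const endpoint = "http://localhost:3030/ht/query";

const query = `
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT ?child ?label WHERE {
    ?child skos:broader <http://example.org/ht/category/01.02> ;
           skos:prefLabel ?label .
  }
  ORDER BY ?label`;

fetch(endpoint, {
  method: "POST",
  headers: {
    "Content-Type": "application/x-www-form-urlencoded",
    "Accept": "application/sparql-results+json"
  },
  body: new URLSearchParams({ query: query })
})
  .then(response => response.json())
  .then(data => {
    // Each binding holds a child concept URI and its preferred label.
    data.results.bindings.forEach(b => console.log(b.child.value, b.label.value));
  });
```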

I then began experimenting with other RDF tools in the hope of being able to generate some nice visualisations of the RDF data, but this is where things came a bit unstuck.  I set up a nice desktop RDF database called GraphDB (https://www.ontotext.com/products/graphdb/) and also experimented with the Neo4J graph database (https://neo4j.com/) as my assumption had been that graph databases (which store data as dots and lines, like RDF triples) would include functionality to visualise these connections.  Unfortunately I have not been able to find any tools that allow you to just plug RDF data in and visualise it.  I found a Stack Overflow page about this (https://stackoverflow.com/questions/66720/are-there-any-tools-to-visualize-a-rdf-graph-please-include-a-screenshot) but none of the suggestions on the page seemed to work.  I tried downloading the desktop visualisation tool Gephi (https://gephi.org/) as apparently it had a plugin that would enable RDF data to be used, but the plugin is no longer available and other visualisation frameworks such as D3 do not work with RDF data but require the data to be migrated to another format first.  It seems strange that data structured in such a way as to make it ideal for network style visualisations should have no tools available to natively visualise the data and I am rather disappointed by the situation.  Of course it could just be that my Google skills have failed me, but I don’t think so.

In addition to the above I spent some time actually writing the paper that all of this will go into.  I also responded to a query from a researcher at Strathclyde who is putting together a speech and language therapy proposal and wondered whether I’d be able to help out, given my involvement in several other such projects.  I also spoke to the IT people at Stirling about the Solr instance for the Books and Borrowing project and made a few tweaks to the Speech Star project’s introductory text.


Week Beginning 28th November 2022

There was another strike day on Wednesday this week so it was a four-day week for me.  On Monday I attended a meeting about the Historical Thesaurus, and afterwards I dealt with some issues that cropped up.  These included getting an up-to-date dump of the HT database to Marc and Fraser, investigating a new subdomain to use for test purposes, looking into adding a new ‘sensitive’ flag to the database for categories that contain potentially offensive content, reminding people where our latest stats page is located and looking into connections between the HT and Mapping Metaphor datasets.  I also spent some more time this week researching semantic web technologies and how these could be used for thesaurus data.  This included setting up an Apache Jena instance on my laptop with a Fuseki server for querying RDF triples using the SPARQL query language.  See https://jena.apache.org/ and https://jena.apache.org/documentation/fuseki2/index.html for more information on these.  I played around with some sample datasets and thought about how our thesaurus data might be structured to use a similar approach.  Hopefully next week I’ll migrate some of the HT data to RDF and experiment with it.

Also this week I spent quite a bit of time speaking to IT Services about the state of the servers that Arts hosts, and migrating the Cullen Project website to a new server, as the server it is currently on badly needs upgrades and there is currently no one to manage this.  Migrating the Cullen Project website took the best part of a day to complete, as all database queries in the code needed to be upgraded.  This took some investigation, as it turns out the ‘mysqli_’ functions require a connection to be passed in many places where the old ‘mysql_’ functions didn’t, and where ‘mysql_’ did take a connection the arguments are in the opposite order (for example, mysql_query($query, $link) becomes mysqli_query($link, $query)).  There were also some character encoding issues cropping up.  It turned out that these were caused by the database not being UTF-8, and the database connection script needed to set the character set to ‘latin1’ for the characters to display properly.  Luca also helped with the migration, dealing with the XML and eXistDB side of things, and by the end of the week we had a fully operational version of the site running at a temporary URL on a new server.  We put in a request to have the DNS for the project’s domain switched to the new server and once this takes effect we’ll be able to switch the old server off.

Also this week I fixed a couple of minor issues with a couple of the place-names resources, participated in an interview panel for a new role at college level, duplicated a section of the Seeing Speech website on the Dynamic Dialects website at the request of Eleanor Lawson and had discussions about moving out of my office due to work being carried out in the building.

Week Beginning 14th November 2022

I spent almost all of this week working with a version of Apache Solr installed on my laptop, experimenting with data from the Books and Borrowing project and getting to grips with setting up a data core and customising a schema for the data, preparing data for ingest into Solr, importing the data and running queries on it, including facetted searching.

I started the week experimenting with our existing database, creating a cache table and writing a script to import a sample of 100 records.  This cache table could hold all of the data that the quick search would need to query and would be very speedy to search, but I realised that other aspects related to the searching would still be slow.  Facetted searching would still require several other database queries to be executed, as would extracting all of the fields that would be necessary to display the search results and it seemed inadvisable to try and create all of this functionality myself when an existing package like Solr could already do it all.

Solr is considerably faster than using the database approach and its querying is much more flexible.  It also offers facetted search options that are returned pretty much instantaneously which would be hopelessly slow if I attempted to create something comparable directly with the database.  For example, I can query the Solr data to find all borrowing records that involve a book holding record with a standardised title that includes the word ‘Roman’, returning 3325 records, but Solr can then also return a breakdown of the number of records by other fields, for example publication place:

"London",2211, "Edinburgh",119, "Dublin",100, "Paris",30, "Edinburgh; London",16, "Cambridge",4, "Eton",3, "Oxford",3, "The Hague",3, "Naples",2, "Rome",2, "Berlin",1, "Glasgow",1, "Lausanne",1, "Venice",1, "York",1

Format:

"8vo",842, "octavo",577, "4to",448, "quarto",433, "4to.",88, "8vo., plates, port., maps.",88, "folio",76, "duodecimo",67, "Folio",33, "12mo",19, "8vo.",17, "8vo., plates: maps.",16

Borrower gender:

"Male",3128, "Unknown",109, "Female",64, "Unclear",2

These would then allow me to build in options to refine the search results further by one (or more) of the above criteria.  Although it would be possible to build such a query mechanism myself using the database, it is likely that such an approach would be much slower and would take considerable time to develop.  It seems much more sensible to use an existing solution if this is going to be possible.
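To give a sense of how little is involved on the query side, here is a minimal sketch of the kind of faceted request Solr answers, assuming a hypothetical core name (‘bnb’) and the field names from my test schema:

```javascript
// Query a local Solr core for borrowing records whose standardised title contains 'Roman',
// asking Solr to facet the results by publication place, format and borrower gender.
// The core name 'bnb' is a placeholder.
const params = new URLSearchParams({
  q: "standardisedtitle:Roman",
  rows: "20",
  facet: "true",
  "facet.mincount": "1"
});
// Multiple facet fields are passed as repeated parameters.
["pubplaces", "formats", "bgenders"].forEach(f => params.append("facet.field", f));

fetch("http://localhost:8983/solr/bnb/select?" + params.toString())
  .then(response => response.json())
  .then(data => {
    console.log("Total matches:", data.response.numFound);
    // facet_fields values come back as a flat [value, count, value, count, ...] array.
    console.log(data.facet_counts.facet_fields.pubplaces);
  });
```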

In my experiments with Solr on my laptop I initially imported 100 borrowing records exported via the API call I created to generate the search results page.  This gave me a good starting point to experiment with Solr’s search capabilities, but the structure of the JSON file returned from the API was rather more complicated than we’d need purely for search purposes and included a lot of data that’s not really needed either, as the returned data contains everything required to display the full borrowing record.  I therefore worked out a simpler JSON structure that would only contain the fields that we would either want to search or could be used in a simplified search results page.  Here’s an example:

{
  "bnid": 1379,
  "lid": 6,
  "slug": "glasgow-university",
  "lname": "Glasgow University Library",
  "rid": 2,
  "rname": "3",
  "syear": 1760,
  "eyear": 1765,
  "rtype": "Student",
  "pid": 107,
  "fnum": "4r",
  "transcription": "Euseb: Eclesiastical History",
  "bday": 17,
  "bmonth": 9,
  "byear": 1760,
  "rday": 1,
  "rmonth": 10,
  "ryear": 1760,
  "borrowed": "1760-09-17",
  "returned": "1760-10-01",
  "bdayofweek": "Wednesday",
  "rdayofweek": "Wednesday",
  "originaltitle": "",
  "standardisedtitle": "Ancient ecclesiasticall histories of the first six hundred years after Christ; written in the Greek tongue by three learned historiographers, Eusebius, Socrates, and Evagrius.",
  "brids": ["1"],
  "bfnames": ["Charles"],
  "bsnames": ["Wilson"],
  "bfullnames": ["Charles Wilson"],
  "boccs": ["University Student", "Education"],
  "bgenders": ["Male"],
  "aids": ["74"],
  "asnames": ["Eusebius of Caesarea"],
  "afullnames": [" Eusebius of Caesarea"],
  "beids": ["88"],
  "edtitles": ["Ancient ecclesiasticall histories of the first six hundred years after Christ; written in the Greek tongue by three learned historiographers, Eusebius, Socrates, and Evagrius."],
  "estcs": ["R21513"],
  "langs": ["English"],
  "pubplaces": ["London"],
  "formats": ["folio"]
}

I wrote a script that would export individual JSON files like the above for each active borrowing record in our system (currently 141,335 records).  I ran this on a version of the database stored on my laptop rather than running it on the server to avoid overloading the server.  I then created a Solr Core for the data and specified an appropriate schema.  This defines each of the above fields and the types of data the fields can hold (e.g. some fields can hold multiple values, such as borrower occupations, some fields are text strings, some are integers, some are dates). I then ran the Solr script that ingests the data.
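For reference, one way the ingest step can be scripted is simply to POST the exported JSON documents to the core’s update handler; a minimal sketch, again with a hypothetical ‘bnb’ core name and a couple of dummy fields standing in for the full records:

```javascript
// Post a batch of borrowing-record documents to Solr's update handler and commit.
// In practice 'docs' would be the JSON records generated by the export script.
const docs = [
  { bnid: 1379, lname: "Glasgow University Library", transcription: "Euseb: Eclesiastical History" }
  // ...the rest of the exported records
];

fetch("http://localhost:8983/solr/bnb/update?commit=true", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(docs)   // JSON.stringify also takes care of escaping quotes in text fields
})
  .then(response => response.json())
  .then(result => console.log("Ingest status:", result.responseHeader.status));
```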

It took a lot of time to get things working as I needed to experiment with the structure of the JSON files that my script generated in order to account for various complexities in the data.  I also encountered some issues with the data that only became apparent at the point of ingest, when records were rejected.  These issues only affected a few records out of nearly 150,000, so I needed to tweak and re-run the data export many times until all issues were ironed out.  As both the data export and the ingest scripts took quite a while to run, the whole process took several days to get right.

Some issues encountered include:

  1. Empty fields in the data resulting in no data for the corresponding JSON field (e.g. "bday": <nothing here> ), which invalidated the JSON file structure. I needed to update the data export script to ensure such empty fields were not included.
  2. Solr’s date fields requiring a full date (e.g. 1792-02-16), meaning partial dates (e.g. 1792) failed. I ended up reverting to an integer field for returned dates, as these are generally much vaguer, and generating placeholder days and months where required for the borrowed date.
  3. Solr’s default (and required) ID field having to be a string rather than an integer, which is what I’d set it to in order to match our BNID field. This was a bit of a strange one as I would have expected an integer ID to be allowed and it took some time to investigate why my nice integer ID was failing.
  4. Realising more fields should be added to the JSON output as I went on and therefore having to regenerate the data each time (e.g. I added in borrower gender and IDs for borrowers, editions, works and authors).
  5. Issues with certain characters appearing in the text fields causing the import to break. For example, double quotes needed to be converted to the entity ‘&quote;’ as their appearance in the JSON caused the structure to be invalid.  I therefore updated the transcription, original title and standardised title fields, but then the import still failed as a few borrowers also have double quotes in their names.

However, once all of these issues were addressed I managed to successfully import all 141,355 borrowing records into the Solr instance running on my laptop and was able to experiment with queries, all of which are running very quickly and will serve our needs very well.  And now that the data export script is properly working I’ll be able to re-run this and ingest new data very easily in future.

The big issue now is whether we will be allowed to install an Apache Solr instance on a server at Stirling.  We would need the latest release of Solr (v9 https://solr.apache.org/downloads.html) to be installed on a server.  This requires Java JRE version 11 or higher (https://solr.apache.org/guide/solr/latest/deployment-guide/system-requirements.html).  Solr uses the Apache Lucene search library and, as far as I know, it fires up a Java-based server called Jetty when it runs.  The deployment guide can be found here: https://solr.apache.org/guide/solr/latest/deployment-guide/solr-control-script-reference.html

When Solr runs, a web-based admin interface is available through which the system can be managed and the data can be queried.  This would need securing, and instructions about doing so can be found here: https://solr.apache.org/guide/solr/latest/deployment-guide/securing-solr.html

I think basic authentication would be sufficient, ideally with access limited to on-campus / VPN users.  Other than for testing purposes there should only be one script that connects to the Solr URL (our API), so we could limit access to the IP address of this server, or if Solr is going to be installed on the same server then limiting access to localhost could work.

In terms of setting up the Solr instance, we would only need a single node installation (not SolrCloud).  Once Solr is running we’d need a Core to be created.  I have the schema file the core would require and can give instructions about setting this up.  I’m assuming that I would not be given command-line access to the server, which would unfortunately mean that someone in Stirling’s IT department would need to execute a few commands for me, including setting up the Core and ingesting the data each time we have a new update.

One downside to using Solr is that it is a separate system from the B&B database and will not reflect changes made to the project’s data until we run a new data export / ingest process.  We won’t want to do this too frequently as exporting the data takes at least an hour, and transferring the files to the server for ingest will take a long time (uploading hundreds of thousands of small files to a server can take hours, and zipping them up, uploading the zip file and extracting it also takes a long time).  Then someone with command-line access to the server will need to run the command to ingest the data.  We’ll need to see if Stirling are prepared to do this for us.

Until we hear more about the chances of using Solr I’ll hold off doing any further work on B&B.  I’ve got quite a lot to do for other projects that I’ve been putting off whilst I focus on this issue so I need to get back into that.

Other than the above B&B work I spent a bit of time on other projects.  I answered a query about a potential training event based on Speak For Yersel that Jennifer Smith emailed me about and I uploaded a video to the Speech Star site.  I deleted a spurious entry from the Anglo-Norman Dictionary and fixed a typo on the ‘Browse Textbase’ page.  I also had a chat with the editor about further developments of the Dictionary Management System that I’m going to start looking into next week.  I also began doing some research into semantic web technologies for structuring thesaurus data in preparation for a paper I’ll be giving in Zurich in January.

Finally, I investigated potential updates to the Dictionaries of the Scots Language quotations search after receiving a series of emails from the team, who had been meeting to discuss how dates will be used in the site.

Currently the quotations are stripped of all tags to generate a single block of text that is then stored in the Solr indexing system and queried against when an advanced ‘quotes only’ search is performed.  So for example in a search for ‘dreich’ (https://dsl.ac.uk/results/dreich/quotes/full/both/) Solr looks for the term in the following block of text for the entry https://dsl.ac.uk/entry/snd/dreich (block snipped to save space):

<field name="searchtext_onlyquotes">I think you will say yourself it is a dreich business.
Sic dreich wark. . . . For lang I tholed an' fendit.
Ay! dreich an' dowie's been oor lot, An' fraught wi' muckle pain.
And he'll no fin his day's dark ae hue the dreigher for wanting his breakfast on account of sic a cause.
It's a dreich job howkin' tatties wi' the caul' win' in yer duds.
Driche and sair yer pain.
And even the ugsome driech o' this Auld clarty Yirth is wi' your kiss Transmogrified.
See a blanket of September sorrows unremitting drich and drizzle permeates our light outerwear.
</field>

The way Solr handles returning snippets is described on this page: https://solr.apache.org/guide/8_7/highlighting.html and the size of the snippet is set by the hl.fragsize parameter, which “Specifies the approximate size, in characters, of fragments to consider for highlighting. The default is 100.”  We don’t currently override this default, so 100 characters is what we use per snippet (roughly – it can extend beyond this to ensure complete words are displayed).

The hl.snippets variable specifies the maximum number of highlighted snippets that are returned per entry and this is currently set to 10.  If you look at the SND result for ‘Dreich adj’ you will see that there are 10 snippets listed and this is because the maximum number of snippets has been reached.  ‘Dreich’ actually occurs many more than 10 times in this entry.  We can change this maximum, but I think 10 gives a good sense that the entry in question is going to be important.
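For reference, these highlighting settings are just request parameters on the Solr query; a minimal sketch of the kind of request involved, with the core name as a placeholder and the field name taken from the block quoted above:

```javascript
// Ask Solr to return highlighted snippets for a quotation search.
// The core name ('dsl') is a placeholder; the field name matches the one quoted above.
const params = new URLSearchParams({
  q: "searchtext_onlyquotes:dreich",
  hl: "true",
  "hl.fl": "searchtext_onlyquotes",  // field(s) to generate snippets from
  "hl.snippets": "10",               // maximum number of snippets returned per entry
  "hl.fragsize": "100"               // approximate snippet size in characters
});

fetch("http://localhost:8983/solr/dsl/select?" + params.toString())
  .then(response => response.json())
  .then(data => {
    // 'highlighting' is keyed by document id, each value holding an array of snippets per field.
    console.log(data.highlighting);
  });
```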

As the quotations are stored as one massive block of text that isn’t split into individual quotations, the snippets don’t respect the boundaries between quotations.  So the first snippet for ‘Dreich adj’ is:

“I think you will say yourself it is a dreich business. Sic dreich wark. . . . For lang I tholed an”

Which actually comprises the text from almost the entire first two quotes, while the next snippet:

“’ fendit. Ay! dreich an’ dowie’s been oor lot, An’ fraught wi’ muckle pain. And he’ll no fin his day’s”


Includes the last word of the second quote, all of the third quote and some of the fourth quote (which doesn’t actually include ‘dreich’ but ‘dreigher’ which is not highlighted).

So essentially, while the snippets may look like they correspond to individual quotes this is absolutely not the case: the highlighted word is generally positioned around the middle of roughly 100 characters of text that can include several quotations.  It also means that it is not currently possible to limit a search to two terms that appear within one single quotation, because we don’t differentiate individual quotations – the search doesn’t know where one quotation ends and the next begins.

I have no idea how Solr works out exactly how to position the highlighted term within the 100 characters, and I don’t think this is something we have any control over.  However, I think we will need to change the way we store and query quotations in order to better handle the snippets, allow Boolean searches to be limited to the text of specific quotes rather than the entire block and to enable quotation results to be refined by a date / date range, which is what the team wants.

We’ll need to store each quotation for an entry individually, each with its own date fields and potentially other fields later on, such as part of speech.  This will ensure snippets will in future only feature text from the quotation in question and that Boolean searches will be limited to text within individual quotations.  However, it is a major change and it will require some time and experimentation to get working correctly, and it may introduce other unforeseen issues.

I will need to change the way the search data is stored in Solr and I will need to change how the data is generated for ingest into Solr.  The display of the search results will need to be reworked as the search will now be based around quotations rather than entries.  I’ll need to group quotations into entries and we’ll need to decide whether to limit the number of quotations that get displayed per entry as for something like ‘dreich adj’ we would end up with many tens of quotations being returned, which would swamp the results page and make it difficult to use.  It is also likely that the current ranking of results will no longer work as individual quotations will be returned rather than entire entries.  The quotations themselves will be ranked, but that’s not going to be very helpful if we still want the results to be grouped by entry.  I’ll need to look at alternatives, such as ranking entries by the number of quotations returned.
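One possible approach to that regrouping, sketched below under the assumption that each Solr result would be a quotation-level document carrying hypothetical entryid and quote fields, is to group the returned quotations client-side and rank entries by how many matching quotations they contain:

```javascript
// Group quotation-level search results by their parent entry and rank entries
// by the number of matching quotations. The field names (entryid, quote) are hypothetical.
function groupByEntry(quotationDocs) {
  const entries = {};
  quotationDocs.forEach(doc => {
    if (!entries[doc.entryid]) {
      entries[doc.entryid] = { entryid: doc.entryid, quotations: [] };
    }
    entries[doc.entryid].quotations.push(doc.quote);
  });
  // Entries with the most matching quotations come first.
  return Object.values(entries).sort((a, b) => b.quotations.length - a.quotations.length);
}

// Example usage with dummy results:
console.log(groupByEntry([
  { entryid: "entryA", quote: "Sic dreich wark." },
  { entryid: "entryA", quote: "Driche and sair yer pain." },
  { entryid: "entryB", quote: "It is a dreich business." }
]));
```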

The DSL team has proposed that a date search could be provided as a filter on the search results page and we would certainly be able to do this, and incorporate other filters such as POS in future.  This is something called ‘facetted searching’ and it’s the kind of thing you see in online shops: you view the search results and then see a list of limiting options, generally to the left of the results, often as a series of checkboxes with a number showing how many of the results each filter applies to.  The good news is that Solr has this kind of faceting built in (in fact it is used to power many online shops).  More good news is that this fits in with the work I’m already doing for the Books and Borrowing project as discussed at the start of this post, so I’ll be able to share my expertise between both projects.

Week Beginning 19th September 2022

It was a four-day week this week due to the Queen’s funeral on Monday.  I divided my time for the remaining four days over several projects.  For Speak For Yersel I finally tackled the issue of the way maps are loaded.  The system had been developed for a map to be loaded afresh every time data is requested, with any existing map destroyed in the process.  This worked fine when the maps didn’t contain demographic filters, as generally each map only needed to be loaded once and then never changed until an entirely new map was needed (e.g. for the next survey question).  However, I was then asked to incorporate demographic filters (age groups, gender, education level), with new data requested based on the option the user selected.  This all went through the same map loading function, which still destroyed and reinitialised the entire map on each request.  This worked, but wasn’t ideal, as it meant the map reset to its default view and zoom level whenever you changed an option, map tiles were reloaded from the server unnecessarily, and if the user was in ‘full screen’ mode they were booted out of it as the full screen map no longer existed.  For some time I’ve been meaning to redevelop this to address these issues, but I’ve held off as there were always other things to tackle and I was worried about essentially ripping apart the code and having to rebuild fundamental aspects of it.  This week I finally plucked up the courage to delve into the code.

I created a test version of the site so as to not risk messing up the live version and managed to develop an updated method of loading the maps.  This method initialises the map only once, when a page is first loaded, rather than destroying and regenerating the map every time a new question is loaded or demographic data is changed.  This means the number of map tile loads is greatly reduced, as the base map doesn’t change until the user zooms or pans.  It also means the location and zoom level a user has left the map on stay the same when the data is changed.  For example, if they’re interested in Glasgow and are zoomed in on it they can quickly flick between different demographic settings and the map will stay zoomed in on Glasgow rather than resetting each time.  Also, if you’re viewing the map in full-screen mode you can now change the demographic settings without the resource exiting out of full screen mode.

All worked very well, with the only issue being that the transitions between survey questions and quiz questions weren’t as smooth as with the older method.  Previously the map scrolled up and was then destroyed, then a new map was created and the data was loaded into the area before it smoothly scrolled down again.  For various technical reasons this no longer worked quite as well.  The map area still scrolls up and down, but the new data only populates the map as the map area scrolls down, meaning for a brief second you can still see the data and legend for the previous question before it switches to the new data.  However, I spent some further time investigating this issue and managed to fix it, with different fixes required for the survey and the quiz.  I also noticed a bug whereby the map would increase in size to fit the available space but the map layers and data were not extending properly into the newly expanded area.  This is a known issue with Leaflet maps that have their size changed dynamically and there’s actually a Leaflet function that sorts it – I just needed to call map.invalidateSize(); and the map worked properly again.  Of course it took a bit of time to figure this simple fix out.
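In outline the new approach looks something like the sketch below: create the map once, swap out a data layer when a new question or filter is selected, and call invalidateSize() when the container’s dimensions change.  The element id, tile URL, centre point and data handling are placeholders rather than the project’s actual code:

```javascript
// Initialise the Leaflet map once, when the page first loads.
const map = L.map("mapArea").setView([57.5, -4.5], 6); // placeholder centre/zoom over Scotland
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png").addTo(map);

let dataLayer = null;

// Called whenever a new question or demographic filter is selected: only the data layer
// is replaced, so the current pan/zoom (and full-screen state) is preserved.
function showData(geojson) {
  if (dataLayer !== null) {
    map.removeLayer(dataLayer);
  }
  dataLayer = L.geoJSON(geojson, {
    pointToLayer: (feature, latlng) => L.circleMarker(latlng, { radius: 5 })
  }).addTo(map);
}

// If the map container is resized (e.g. expanded to fill the page), tell Leaflet to
// recalculate its dimensions so tiles and data fill the new area.
function onMapContainerResized() {
  map.invalidateSize();
}
```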

I also made some further updates to the site.  Based on feedback about the difficulty some people are having in keeping track of which surveys they’ve done, I updated the site to log when a user completes a survey.  Now when the user goes to the survey index page a count of the number of surveys they’ve completed is displayed in the top right and a green tick is added to the button of each survey they have completed.  Also, when they reach the ‘what next’ page for a survey a count of their completed surveys is shown.  This should make it much easier for people to track what they’ve done.  I also made a few small tweaks to the data at the request of Jennifer, and created a new version of the animated GIF that has speech bubbles, as the bubble for Shetland needed its text changed.  As I didn’t have the files available I took the opportunity to regenerate the GIF using a larger map, as the older version looked quite fuzzy on a high definition screen like an iPad.  I kept the region outlines on as well to tie it in better with our interactive maps.  Also, the font used in the new version is now the ‘Baloo’ font we use for the site.  I stored all of the individual frames both as images and as PowerPoint slides so I can change them if required.  For future reference, I created the animated GIF using https://ezgif.com/maker with a 150 second delay between slides, crossfade on and a fader delay of 8.

Also this week I researched an issue with the Scots Thesaurus that was causing the site to fail to load.  The WordPress options table had become corrupted and unreadable and needed to be replaced with a version from the backups, which thankfully fixed things.  I also did my expenses from the DHC in Sheffield, which took longer than I thought it would, and made some further tweaks to the Kozeluch mini-site on the Burns C21 website.  This included regenerating the data from a spreadsheet via a script I’d written and tweaking the introductory text.  I also responded to a request from Fraser Dallachy to regenerate some data that a script I’d previously written had outputted.  I also began writing a requirements document for the redevelopment of the place-names project front-ends to make them more ‘map first’.

I also did a bit more work for Speech Star, making some changes to the database of non-disordered speech and moving the ‘child speech error database’ to a new location.  I also met with Luca to have a chat about the BOSLIT project, its data, the interface and future plans.  We had a great chat and I then spent a lot of Friday thinking about the project and formulating some feedback that I sent in a lengthy email to Luca, Lorna Hughes and Kirsteen McCue on Friday afternoon.

Week Beginning 25th July 2022

I was on holiday for most of the previous two weeks, working two days during this period.  I’ll also be on holiday again next week, so I’ve had quite a busy time getting things done.  Whilst I was away I dealt with some queries from Joanna Kopaczyk about the Future of Scots website.  I also had to investigate a request to fill in timesheets for my work on the Speak For Yersel project, as apparently I’d been assigned to the project as ‘Directly incurred’ when I should have been ‘Directly allocated’.  Hopefully we’ll be able to get me reclassified but this is still in progress.  I also fixed a couple of issues with the facility to export data for publication for the Berwickshire place-name project for Carole Hough, and fixed an issue with an entry in the DSL, which was appearing in the wrong place in the dictionary.  It turned out that the wrong ‘url’ tag had been added to the entry’s XML several years ago and the entry had been wrongly positioned ever since.  I fixed the XML and this sorted things.  I also responded to a query from Geert of the Anglo-Norman Dictionary about Aberystwyth’s new VPN and whether this would affect his access to the AND, and investigated an issue Simon Taylor was having when logging into a couple of our place-names systems.

On the Monday I returned to work I launched two new resources for different projects.  For the Books and Borrowing project I published the Chambers Library Map (https://borrowing.stir.ac.uk/chambers-library-map/) and reorganised the site menu to make space for the new page link.  The resource has been very well received and I’m pretty pleased with how it’s turned out.  For the Seeing Speech project I launched the new Gaelic Tongues resource (https://www.seeingspeech.ac.uk/gaelic-tongues/) which has received a lot of press coverage, which is great for all involved.

I spent the rest of the week dividing my time primarily between three projects:  Speak For Yersel, Books and Borrowing and Speech Star.  For Books and Borrowing I continued processing the backlog of library register image files that has built up.  There were about 15 registers that needed to be processed, and each needed to be handled in a different way.  This included nine registers from the Advocates Library that had been digitised by the NLS, for which I needed to batch process the images to rename them, delete blank pages, create page records in the CMS and then tweak the automatically generated folio numbers to account for discrepancies in the handwritten page numbers in the images.  I also processed a register for the Royal High School, which involved renaming the images so they match up with image numbers already assigned to page records in the CMS, inserting new page records and updating the ‘next’ and ‘previous’ links for pages for which new images had been uncovered, and generating new page records for many tens of new pages that follow on from the ones already created in the CMS.  I also uploaded new images for the Craigston register and created a further register for Aberdeen, including all page records and associated image URLs.  I still have some further RHS registers to do and a few from St Andrews, but these will need to wait until I’m back from my holiday.

For Speech Star I downloaded a ZIP containing 500 new ultrasound MP4 videos.  I then had to process them to generate ‘poster’ images for each video (these are images that get displayed before the user chooses to play the video).  I then had to replace the existing normalised speech database with data from a new spreadsheet that included these new videos plus updates to some of the existing data.  This included adding a few new fields and changing the way the age filter works, as much of the new data is for child speakers who have specific ages in months and years, and these all need to be added to a new ‘under 18’ age group.

For Speak For Yersel I had an awful lot to do.  I started with a further large-scale restructuring of the website following feedback from the rest of the team.  This included changing the site menu order, adding in new final pages to the end of surveys and quizzes and changing the text of buttons that appear when displaying the final question.

I then developed the map filter options for age and education for all of the main maps.  This was a major overhaul of the maps.  I removed the slide up / slide down of the map area when an option is selected as this was a bit long and distracting.  Now the map area just updates (although there is a bit of a flicker as the data gets replaced).  The filter options unfortunately make the options section rather big, which is going to be an issue on a small screen.  On my mobile phone the options section takes up 100% of the width and 80% of the height of the map area unless I press the ‘full screen’ button.  However, I figured out a way to ensure that the filter options section scrolls if the content extends beyond the bottom of the map.

I also realised that if you’re in full screen mode and you select a filter option the map exits full screen as the map section of the page reloads.  This is very annoying, but I may not be able to fix it as it would mean completely changing how the maps are loaded.  This is because such filters and options were never intended to be included in the maps and the system was never developed to allow for this.  I’ve had to somewhat shoehorn in the filter options and it’s not how I would have done things had I known from the beginning that these options were required.  However, the filters work and I’m sure they will be useful.  I’ve added in filters for age, education and gender, as you can see in the following screenshot:

I also updated the ‘Give your word’ activity that asks users to identify younger and older speakers to use the new filters too.  The map defaults to showing ‘all’ and the user then needs to choose an age.  I’m still not sure how useful this activity will be as the total number of dots for each speaker group varies considerably, which can easily give the impression that more of one age group use a form compared to another purely because one age group has more dots overall.  The questions don’t actually ask anything about geographical distribution, so having the map doesn’t really serve much purpose when it comes to answering the question.  I can’t help but think that just presenting people with percentages would work better, or some other sort of visualisation like a bar graph.

I then moved on to working on the quiz for ‘she sounds really clever’ and so far I have completed both the first part of the quiz (questions about ratings in general) and the second part (questions about listeners from a specific region and their ratings of speakers from regions).  It’s taken a lot of brain-power to get this working as I decided to make the system work out the correct answer and to present it as an option alongside randomly selected wrong answers.  This has been pretty tricky to implement (especially as depending on the question the ‘correct’ answer is either the highest or the lowest) but will make the quiz much more flexible – as the data changes so will the quiz.
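The gist of the answer-generation logic is sketched below with made-up data and names: given an average rating per region, the ‘correct’ option is the highest (or lowest) value depending on the question, and the distractors are drawn at random from the remaining regions:

```javascript
// Build the options for a multiple-choice question from ratings data.
// 'ratings' maps region name to average rating; the values below are dummy data.
function buildOptions(ratings, wantHighest, numWrong) {
  const regions = Object.keys(ratings);
  // Work out the correct answer from the data so the quiz stays in step with the ratings.
  const correct = regions.reduce((best, r) =>
    (wantHighest ? ratings[r] > ratings[best] : ratings[r] < ratings[best]) ? r : best
  );
  // Shuffle the remaining regions (a quick-and-dirty shuffle) and take the wrong answers.
  const wrong = regions
    .filter(r => r !== correct)
    .sort(() => Math.random() - 0.5)
    .slice(0, numWrong);
  // Mix the correct answer in with the distractors.
  return { correct: correct, options: [correct, ...wrong].sort(() => Math.random() - 0.5) };
}

// Example usage with made-up ratings:
console.log(buildOptions({ Glasgow: 62, Edinburgh: 71, Aberdeen: 55, Dundee: 48 }, true, 3));
```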

Part one of the quiz page itself is pretty simple.  There is the usual section on the left with the question and the possible answers.  On the right is a section containing a box to select a speaker and the rating sliders (readonly).  When you select a speaker the sliders animate to their appropriate location.  I decided to not include the map or the audio file as these didn’t really seem necessary for answering the questions, they would clutter up the screen and people can access them via the maps page anyway (well, once I move things from the ‘activities’ section).  Note that the user’s answers are stored in the database (the region selected and whether this was the correct answer at the time).  Part two of the quiz features speaker/listener true/false questions and this also automatically works out the correct answer (currently based on the 50% threshold).  Note that where there is no data for a listener rating a speaker from a region the rating defaults to 50.  We should ensure that we have at least one rating for a listener in each region before we let people answer these questions.  Here is a screenshot of part one of the quiz in action, with randomly selected ‘wrong’ answers and a dynamically outputted ‘right’ answer:

I also wrote a little script to identify duplicate lexemes in categories in the Historical Thesaurus as it turns out there are some occasions where a lexeme appears more than once (with different dates) and this shouldn’t happen.  These will need to be investigated and the correct dates will need to be established.

I will be on holiday again next week so there won’t be another post until the week after I’m back.


Week Beginning 20th June 2022

I completed an initial version of the Chambers Library map for the Books and Borrowing project this week.  It took quite a lot of time and effort to implement the subscription period range slider.  Searching for a range when the data also has a range of dates rather than a single date means we needed to make a decision about what data gets returned and what doesn’t.  This is because the two ranges (the one chosen as a filter by the user and the one denoting the start and end periods of subscription for each borrower) can overlap in many different ways.  For example, the period chosen by the user is 05 1828 to 06 1829.  Which of the following borrowers should therefore be returned?

  1. Borrower’s range is 06 1828 to 02 1829: the range is fully within the period so should definitely be included.
  2. Borrower’s range is 01 1828 to 07 1828: the range extends beyond the selected period at the start and ends within the selected period.  Presumably should be included.
  3. Borrower’s range is 01 1828 to 09 1829: the range extends beyond the selected period in both directions.  Presumably should be included.
  4. Borrower’s range is 05 1829 to 09 1829: the range begins during the selected period and ends beyond the selected period.  Presumably should be included.
  5. Borrower’s range is 01 1828 to 04 1828: the range is entirely before the selected period.  Should not be included.
  6. Borrower’s range is 07 1829 to 10 1829: the range is entirely after the selected period.  Should not be included.

Basically, if there is any overlap between the selected period and the borrower’s subscription period the borrower will be returned.  This does mean that most borrowers will be returned a lot of the time.  It’s a very different sort of filter to one that focuses purely on a single date – e.g. filtering the data to only those borrowers whose subscription period *begins* between 05 1828 and 06 1829.

Based on the above assumptions I began to write the logic that would decide which borrowers to include when the range slider is altered.  It was further complicated by having to deal with months as well as years.  Here’s the logic in full if you fancy getting a headache:

if(((mapData[i].sYear>startYear || (mapData[i].sYear==startYear && mapData[i].sMonth>=startMonth)) && ((mapData[i].eYear==endYear && mapData[i].eMonth <=endMonth) || mapData[i].eYear<endYear)) || ((mapData[i].sYear<startYear ||(mapData[i].sYear==startYear && mapData[i].sMonth<=startMonth)) && ((mapData[i].eYear==endYear && mapData[i].eMonth >=endMonth) || mapData[i].eYear>endYear)) || ((mapData[i].sYear==startYear && mapData[i].sMonth<=startMonth || mapData[i].sYear>startYear) && ((mapData[i].eYear==endYear && mapData[i].eMonth <=endMonth) || mapData[i].eYear<endYear) && ((mapData[i].eYear==startYear && mapData[i].eMonth >=startMonth) || mapData[i].eYear>startYear)) || (((mapData[i].sYear==startYear && mapData[i].sMonth>=startMonth) || mapData[i].sYear>startYear) && ((mapData[i].sYear==endYear && mapData[i].sMonth <=endMonth) || mapData[i].sYear<endYear) && ((mapData[i].eYear==endYear && mapData[i].eMonth >=endMonth) || mapData[i].eYear>endYear)) || ((mapData[i].sYear<startYear ||(mapData[i].sYear==startYear && mapData[i].sMonth<=startMonth)) && ((mapData[i].eYear==startYear && mapData[i].eMonth >=startMonth) || mapData[i].eYear>startYear)))
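For what it’s worth, all of those cases collapse into the standard interval-overlap test: two ranges overlap when each one starts on or before the point where the other ends.  Here is a minimal sketch of an equivalent check that converts each year/month pair into a single comparable month index (it follows the variable names in the snippet above, but is an illustration of the idea rather than the code the map actually uses):

// Convert a year/month pair into a single comparable month index.
function monthIndex(year, month) {
    return year * 12 + month;
}

const filterStart = monthIndex(startYear, startMonth);
const filterEnd = monthIndex(endYear, endMonth);
const borrowerStart = monthIndex(mapData[i].sYear, mapData[i].sMonth);
const borrowerEnd = monthIndex(mapData[i].eYear, mapData[i].eMonth);

// Two ranges overlap if each starts on or before the other's end.
if (borrowerStart <= filterEnd && borrowerEnd >= filterStart) {
    // include this borrower
}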

I also added the subscription period to the popups.  The only downside to the range slider is that the occupation marker colours change depending on how many occupations are present during a period, so you can’t always tell an occupation by its colour. I might see if I can fix the colours in place, but it might not be possible.

I also noticed that the jQuery UI sliders weren’t working very well on touchscreens, so I installed the jQuery UI Touch Punch library to fix that (https://github.com/furf/jquery-ui-touch-punch).  I also made the library marker bigger and gave it a white border to more easily differentiate it from the borrower markers.

I then moved on to incorporating page images in the resource too.  Where a borrower has borrowing records, the relevant pages where these records are found now appear as thumbnails in the borrower popup.  These are generated by the IIIF server based on dimensions passed to it in the request URL (there’s a rough sketch of this below), which is much nicer than having to generate and store thumbnails directly.  I also updated the popup to make it wider when required to give more space for the thumbnails.  Here’s a screenshot of the new thumbnails in action:
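For anyone curious about how those thumbnail requests work: the IIIF Image API lets you ask for a resized version of an image just by building a URL of the form {identifier}/{region}/{size}/{rotation}/{quality}.{format}.  A small sketch (the base URL and identifier here are made up for illustration, not the project’s actual ones):

// Build an IIIF Image API URL for a thumbnail of a given width.
function iiifThumbnail(baseUrl, identifier, width) {
    // region 'full', size '{width},' (height scaled to match), rotation 0, default quality, JPEG
    return baseUrl + '/' + identifier + '/full/' + width + ',/0/default.jpg';
}

const thumbUrl = iiifThumbnail('https://example.org/iiif', 'register1-page42', 150);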

Clicking on a thumbnail opens a further popup containing a zoomable / pannable image of the page.  This proved to be rather tricky to implement.  Initially I was going to open a popup in the page (outside of the map container) using a jQuery UI dialog.  However, I realised that this wouldn’t work when the map is being viewed in full-screen mode, as nothing beyond the map container is visible in such circumstances.  I then considered opening the image in the borrower popup, but this wasn’t really big enough.  I then wondered about extending the ‘Map options’ section and replacing its contents with the image, but this caused issues for the contents of the ‘Map options’ section, which didn’t reinitialise properly when they were reinstated.  I then found a plugin for the Leaflet mapping library that provides a popup within the map interface (https://github.com/w8r/Leaflet.Modal) and decided to use this.  However, it’s all a little complex, as the popup then has to include another mapping library, OpenLayers, which handles the zooming and panning of the page image, all within the framework of the overall interactive map (a sketch of the OpenLayers part follows below).  It’s all working and I think it works pretty well, although I guess the map interface is a little cluttered, what with the ‘Map Options’ section, the map legend, the borrower popup and then the page image popup as well.  Here’s a screenshot with the page image open:
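The zoomable page image is essentially OpenLayers’ standard ‘static image’ pattern: a pixel-based projection whose extent matches the image dimensions.  A rough sketch of that part (the container id, image dimensions and image URL are placeholders, and the Leaflet.Modal wiring is omitted):

// Display a single page image as a zoomable/pannable OpenLayers map.
// The extent is the image size in pixels; the view reuses the same projection.
const extent = [0, 0, imageWidth, imageHeight]; // placeholder dimensions
const projection = new ol.proj.Projection({
    code: 'page-image',
    units: 'pixels',
    extent: extent
});

const pageViewer = new ol.Map({
    target: 'page-image-container', // placeholder element inside the modal popup
    layers: [
        new ol.layer.Image({
            source: new ol.source.ImageStatic({
                url: pageImageUrl, // e.g. a full-size request to the IIIF server
                projection: projection,
                imageExtent: extent
            })
        })
    ],
    view: new ol.View({
        projection: projection,
        center: ol.extent.getCenter(extent),
        zoom: 1,
        maxZoom: 6
    })
});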

All that’s left to do now is add the introductory text once Alex has prepared it and then make the map live.  We might need to rearrange the site’s menu to fit in a link to the Chambers Map as it’s already a bit cluttered.

Also for the project I downloaded images for two further library registers for St Andrews that had previously been missed.  However, there are already records for these registers and their pages in the CMS, so we’re going to have to figure out which image corresponds to which page record.  One register has a different number of pages in the CMS compared to the number of image files, so we need to work out how to align the start and end and whether there are any gaps or issues in the middle.  The other register is more complicated because the images are double pages whereas it looks like the page records in the CMS are for individual pages.  I’m not sure how best to handle this: I could either try to batch process the images to chop them up or batch process the page records to join them together.  I’ll need to discuss this further with Gerry, who is dealing with the data for St Andrews.
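If we do go down the image-splitting route, a batch job along these lines might work – a sketch using the Node.js ‘sharp’ library, with placeholder paths, and assuming the gutter sits at the horizontal midpoint of each image (which would need checking):

// Split a double-page image into left and right halves using sharp.
const sharp = require('sharp');

async function splitDoublePage(inputPath, outputDir, baseName) {
    const { width, height } = await sharp(inputPath).metadata();
    const half = Math.floor(width / 2);
    // Left-hand page
    await sharp(inputPath)
        .extract({ left: 0, top: 0, width: half, height: height })
        .toFile(`${outputDir}/${baseName}-left.jpg`);
    // Right-hand page
    await sharp(inputPath)
        .extract({ left: half, top: 0, width: width - half, height: height })
        .toFile(`${outputDir}/${baseName}-right.jpg`);
}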

Also this week I prepared for and gave a talk to a group of students from Michigan State University who were learning about digital humanities.  I talked to them for about an hour about a number of projects, such as the Burns Supper map (https://burnsc21.glasgow.ac.uk/supper-map/), the digital edition I’d created for New Modernist Editing (https://nme-digital-ode.glasgow.ac.uk/), the Historical Thesaurus (https://ht.ac.uk/), Books and Borrowing (https://borrowing.stir.ac.uk/) and TheGlasgowStory (https://theglasgowstory.com/).  It went pretty well and it was nice to be able to talk about some of the projects I’ve been involved with for a change.

I also made some further tweaks to the Gentle Shepherd Performances page, which is now ready to launch, and helped Geert out with a few changes to the WordPress pages of the Anglo-Norman Dictionary.  I also made a few tweaks to the WordPress pages of the DSL website and finally managed to get a hotel room booked for the DHC conference in Sheffield in September.  I also made a couple of changes to the new Gaelic Tongues section of the Seeing Speech website and had a discussion with Eleanor about the filters for Speech Star.  Fraser had also been in touch regarding around 500 Historical Thesaurus categories that had been newly matched to OED categories, so I created a little script to add these connections to the online database.

I also had a Zoom call with the Speak For Yersel team.  They had been testing out the resource at secondary schools in the North East and have come away with lots of suggested changes to the content and structure of the resource.  We discussed all of these and agreed that I would work on implementing the changes the week after next.

Next week I’m going to be on holiday, which I have to say I’m quite looking forward to.