Week Beginning 18th June 2018

This week I continued with the new ‘Search Results’ page for the Historical Thesaurus.  Last week I added in the mini-timelines for search results, but I wanted to bring some of the other updated functionality from the ‘browse’ page to the ‘search results’ page too.

There is now a new section on the search results page where sort options and such things are located.  This is currently always visible, as it didn’t seem necessary to hide it away in a hamburger menu as there’s plenty of space at the top of the page.  Options include the facility to open the full timeline visualisation, turn the mini-timelines on or off and set the sorting options.  These all tie into the options on the ‘browse’ page too, and are therefore ‘remembered’.

It took some time to get this working as the search results page is rather different to the browse page.  I also had to introduce a new sort option (by ‘Thesaurus Category’) as this is how things are laid out by default.  It’s also been a bit of a pain as both word and category results are lumped together, but the category results always need to appear after the ‘word’ results, and the sort options don’t really apply to them.  Also, I had to make sure the ‘recommended’ section got ordered as well as the main results.  This wasn’t so important for searches with few recommended categories, such as ‘sausage’ but for things like ‘strike’ that have lots of ‘recommendeds’ I figured ordering them would be useful.  I also had to factor the pagination of results into the ordering options too.  It’s also now possible to bookmark / share the results page with a specific order set, allowing users to save or share a link to a page with results ordered by length of attestation, for example.  Here’s a screenshot showing the results ordered by length of attestation:

I then set about implementing the full timeline visualisation for the search results.  As with the other updates to the search results page, this proved to be rather tricky to implement as I had to pull apart the timeline visualisation code I’d made for the ‘browse’ page and reformat it so that it would work with results from different categories.  This introduced a number of weird edge cases and bugs that took me a long time to track down.  One example that I’ve only just fixed and has taken about two hours to get to the bottom of:

When ordering by length of attestation all OE dates were appearing at the end, even though many were clearly the longest attested words.  Why was this happening?  It turns out that elsewhere in the code where ‘OE’ appears in the ‘fulldate’ I was replacing this with an HTML ‘span’ to give the letters the smallcaps font.  But having HTML in this field was messing up the ‘order by duration’ code, buried deep within a function called within a function within a function.  Ugh, getting to the bottom of that almost had me at my wit’s end.
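To illustrate the sort of problem this was (the real site code isn’t reproduced here), below is a minimal Python sketch.  The date format, the `OE_YEAR` placeholder value and the function names are all assumptions made up for the example.  The point is simply that the duration needs to be calculated from the plain text of the date, with any display markup stripped out first.

```python
import re

# Matches any HTML tag, e.g. <span class="sc"> or </span>
TAG_RE = re.compile(r"<[^>]+>")

# Arbitrary placeholder year standing in for 'OE' (an assumption for this sketch)
OE_YEAR = 1000


def plain_date(fulldate):
    """Remove display markup so only the date text remains."""
    return TAG_RE.sub("", fulldate)


def duration(fulldate):
    """Length of attestation for a hypothetical 'start-end' date string."""
    start, end = plain_date(fulldate).split("-")
    start_year = OE_YEAR if start == "OE" else int(start)
    return int(end) - start_year


rows = [
    {"word": "a", "fulldate": '<span class="sc">OE</span>-1900'},
    {"word": "b", "fulldate": "1500-1700"},
    {"word": "c", "fulldate": "1300-1900"},
]

# Longest attested first; the OE row sorts correctly despite containing markup
by_length = sorted(rows, key=lambda r: duration(r["fulldate"]), reverse=True)
```

A tidier alternative would be to keep the raw date and the formatted date in separate fields, so the sorting code never sees the HTML at all.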

But I got it all working in the end, and the visualisation popup is now working properly on the search results page, including a new ‘Thesaurus Category’ ordering option.  I’ve also made the row label field wider and have incorporated the heading and PoS as well as the word.  ‘Thesaurus Category’ ordering might seem a little odd as the catnum doesn’t appear on the visualisation, but adding this in would make the row label very long.  Here’s how the timeline visualisation for results looks:

Note that this new search results page isn’t ‘live’ yet.  Fraser also wanted me to update how the search works to enable an ‘exact’ search to be performed, as currently a search for ‘set’ (for example) brings back things like ‘set (of teeth)’, which Fraser didn’t want included.  I did a little further digging into this as I had thought we once allowed exact searches to be performed, and I was right.  Actually, when you search for ‘set’ you are doing an exact search.  If it was a partial match search you’d use wildcards at the beginning and end and you’d end up with more results than the website allows you to see.


Maybe an example with fewer results would work better.  E.g. ‘wolf’.  Using the new results page, here’s an exact search: https://ht.ac.uk/category-selection/index-test.php?qsearch=wolf with 36 results, and here’s a wildcard search: https://ht.ac.uk/category-selection/index-test.php?qsearch=*wolf* with 240 results.

Several years ago (back when Christian was still involved) we set about splitting up results to allow multiple forms to be returned, and I wrote some convoluted code to extract possible permutations so that things like ‘wolf/wolf of hell/devil’s wolf < (deofles) wulf’ would be found when doing an exact search for ‘wolf’ or ‘wolf of hell’ or whatever.

When you do an exact search it searches the ‘searchterms’ table that contains all of these permutations.  One permutation type was to ignore stuff in brackets, which is why things like ‘set (about)’ are returned when you do an exact search for ‘set’, but handily keeps other stuff like ‘Wolffian ridge’ and ‘rauwolfia (serpentina)’ out of exact searches for ‘wolf’.  A search for ‘set’ is an exact match for one permutation of ‘set (about)’ that we have logged, so the result is returned.
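As a rough sketch of the idea (this is not the original permutation code, which handles more cases), the following Python splits a word on ‘/’ and ‘ < ’ and also logs a version of each form with any bracketed text removed.  The function name and the exact set of permutations generated are assumptions for illustration only.

```python
import re

# Bracketed text plus any surrounding whitespace, e.g. " (about)"
BRACKETS_RE = re.compile(r"\s*\([^)]*\)\s*")


def search_terms(word):
    """Return a set of simple search-term permutations for one word entry."""
    terms = set()
    # ' < ' separates the later form(s) from the OE form(s)
    for side in word.split(" < "):
        # '/' separates alternative forms
        for form in side.split("/"):
            form = form.strip()
            if not form:
                continue
            terms.add(form)
            # Also log the form with anything in brackets ignored
            without_brackets = BRACKETS_RE.sub(" ", form).strip()
            if without_brackets:
                terms.add(without_brackets)
    return terms


wolf_terms = search_terms("wolf/wolf of hell/devil’s wolf < (deofles) wulf")
set_terms = search_terms("set (about)")
rauwolfia_terms = search_terms("rauwolfia (serpentina)")
```

With this approach an exact search for ‘set’ matches one of the logged terms for ‘set (about)’, while an exact search for ‘wolf’ doesn’t match anything logged for ‘rauwolfia (serpentina)’.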

To implement a proper ‘exact’ search I decided to allow users to surround a term with double quotes.  I updated my search code so that when double quotes are supplied the search code disregards the ‘search terms’ table and instead only searches for exact matches in the ‘wordoe’ and ‘wordoed’ fields.  This then strips out things like ‘set (about)’ but still ensures that words with OE forms, such as ‘set < (ge)settan’ are returned, as such words have ‘set’ in the ‘wordoed’ field and ‘(ge)settan’ in the ‘wordoe’ field.  This new search seems to be working very well but as with the other updates I haven’t made it live yet.
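The logic can be sketched roughly as follows in Python.  The ‘wordoe’ and ‘wordoed’ field names are the real ones mentioned above, but the function names and the in-memory list of rows are assumptions for the example, as the real search runs against the database.

```python
def parse_query(query):
    """Decide which kind of search a query string is asking for."""
    query = query.strip()
    # Surrounding double quotes request the stricter 'quoted' exact search
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        return ("quoted", query[1:-1])
    # Otherwise fall back to the existing search of the search terms table
    return ("searchterms", query)


def quoted_search(term, rows):
    """Keep only rows where the term exactly equals 'wordoe' or 'wordoed'."""
    return [row for row in rows if term in (row["wordoe"], row["wordoed"])]


# Illustrative rows only, not real data from the database
rows = [
    {"wordoed": "set", "wordoe": "(ge)settan"},
    {"wordoed": "set (about)", "wordoe": ""},
    {"wordoed": "set", "wordoe": ""},
]

mode, term = parse_query('"set"')
matches = quoted_search(term, rows) if mode == "quoted" else rows
```

Here ‘set (about)’ is dropped because neither field is exactly ‘set’, while the row with the OE form ‘(ge)settan’ is kept because its ‘wordoed’ field is ‘set’.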

One thing I noticed with the search results timeline visualisation that might need fixed:  The contents of the visualisation are limited to a single results page, so reordering the visualisation will only reorder a subset of the results.  E.g. if the results go over two pages and are ordered by ‘Thesaurus Category’ and you open the visualisation, if you then reorder the visualisation by ‘length of attestation’ you’re only ordering those results that appeared on the first page of results when ordered by ‘Thesaurus Category’.

So to see the visualisation with the longest period of attestation you first need to reorder the results page by this option and then open the visualisation.  This is possibly a bit clunky and might lead to confusion.  I can change how things work if required.  The simplest way around this might be to just display all results in the visualisation, rather than just one page of results.  That might lead to some rather lengthy visualisations, though.  I’ve asked Marc and Fraser what they think about this.
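The difference between the two approaches is easy to show with a small Python sketch.  The data, the page size and the function name here are made up for illustration and aren’t taken from the site.

```python
def paginate(rows, page, per_page):
    """Return one page of rows (pages are numbered from 1)."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


# (word, length of attestation in years), listed in 'Thesaurus Category' order
results = [("a", 10), ("b", 500), ("c", 50), ("d", 900)]


def by_length(row):
    return row[1]


# Current behaviour: take the page first, then reorder only that page
page_then_sort = sorted(paginate(results, 1, 2), key=by_length, reverse=True)

# Alternative: reorder all of the results, then take the page
sort_then_page = paginate(sorted(results, key=by_length, reverse=True), 1, 2)
```

In the first case the longest attested word (‘d’) never appears, as it wasn’t on the first page when ordered by category.  In the second case it comes first.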

I’m finding the search results visualisation to be quite fun to use.  It’s rather pleasing to search for a word (e.g. the old favourite ‘sausage’) and then have a visualisation showing when this word was used in all its various senses, or which sense has been used for longer, or which sense came first etc.

Also this week the Historical Thesaurus website moved to its new URL.  The site can now be accessed at https://ht.ac.uk/, which is much snappier than the old https://historicalthesaurus.arts.gla.ac.uk URL.  I updated the ‘cite’ options to reflect this change in URL and everything seems to be working very well.  I also discussed some further possible uses for the HT with Fraser and Marc, but I can’t really go into too many details at this point.

Also this week I fixed a minor issue on a page for The People’s Voice project, and a further one for the Woolf short story site, and gave some further feedback about a couple of further updates for the Data Management Plan for Faye Hammill.  I also had some App duties to take care of and gave some feedback on a Data Management Plan that Graeme had written for a project for someone in History.  I also created the interface for the project website for Matthew Creasy’s Decadence and Translation Network project, which I think is looking rather good (but isn’t live yet).  I had a chat with Scott Spurlock about his crowdsourcing project, which it looks like I’m going to start working on later in the summer.  I also spoke to David Wilson, a developer elsewhere in the college, about some WordPress issues, and gave some feedback to Kirsteen McCue on a new timeline feature she is hoping to add to the RNSN website.  I also received an email from the AHRC this week thanking me for acting as a Technical Reviewer for them.  It turns out that I completed 39 reviews during my time as a reviewer, which I think is pretty good going!

Also this week I had a meeting with Megan Coyer where we discussed a project she’s putting together that’s related to Scottish Medicine in history.  We discussed possible visualisation techniques, bibliographical databases and other such things.  I can’t go into any further details just now, but it’s a project that I’ll probably be involved with writing the technical parts for in the coming months.

Eleanor Lawson sent me some feedback about my reworking of the Seeing Speech and Dynamic Dialects websites, so I acted on this feedback.  I updated the map so that the default zoom level is one level further out than before, centred on somewhere in Nigeria.  This means on most widths of screen it is possible to see all of North America and Australia.  New Zealand might still get cut off, though.  Obviously on narrower screens less of the world will be shown.

On the chart page I updated the way the scrollbar works.  It now appears at both the top and the bottom of the table.  This should hopefully make it clearer to people that they need to scroll horizontally to see additional content, and also make it easier to do the scrolling – no need to go all the way to the bottom of the table to scroll horizontally.  I also replaced one of the videos with an updated one that Eleanor sent me.

On Friday I began to think about how to implement a new feature for the SCOSYA atlas.  Gary had previously mentioned that he would like to be able to create groups of locations and for the system to then be able to generate summary statistics about the group for a particular search that a user has selected.  I wrote a mini ‘requirements’ document detailing how this might work, and it took some time to think through all of the various possibilities and to decide how the feature might work best.  By the end of the day I had a plan and had emailed the document to Gary for feedback.  I’m hoping to get started on this new feature next week.