Week Beginning 18th June 2018

This week I continued with the new ‘Search Results’ page for the Historical Thesaurus.  Last week I added in the mini-timelines for search results, but I wanted to bring some of the other updated functionality from the ‘browse’ page to the ‘search results’ page too.

There is now a new section on the search results page where sort options and such things are located.  This is currently always visible, as it didn’t seem necessary to hide it away in a hamburger menu as there’s plenty of space at the top of the page.  Options include the facility to open the full timeline visualisation, turn the mini-timelines on or off and set the sorting options.  These all tie into the options on the ‘browse’ page too, and are therefore ‘remembered’.

It took some time to get this working as the search results page is rather different to the browse page.  I also had to introduce a new sort option (by ‘Thesaurus Category’) as this is how things are laid out by default.  It’s also been a bit of a pain as both word and category results are lumped together, but the category results always need to appear after the ‘word’ results, and the sort options don’t really apply to them.  Also, I had to make sure the ‘recommended’ section got ordered as well as the main results.  This wasn’t so important for searches with few recommended categories, such as ‘sausage’ but for things like ‘strike’ that have lots of ‘recommendeds’ I figured ordering them would be useful.  I also had to factor the pagination of results into the ordering options too.  It’s also now possible to bookmark / share the results page with a specific order set, allowing users to save or share a link to a page with results ordered by length of attestation, for example.  Here’s a screenshot showing the results ordered by length of attestation:

I then set about implementing the full timeline visualisation for the search results.  As with the other updates to the search results page, this proved to be rather tricky to implement as I had to pull apart the timeline visualisation code I’d made for the ‘browse’ page and reformat it so that it would work with results from different categories.  This introduced a number of weird edge cases and bugs that took me a long time to track down.  One example that I’ve only just fixed and has taken about two hours to get to the bottom of:

When ordering by length of attestation all OE dates were appearing at the end, even though many were clearly the longest attested words.  Why was this happening?  It turns out that elsewhere in the code where ‘OE’ appears in the ‘fulldate’ I was replacing this with an HTML ‘span’ to give the letters the smallcaps font.  But having HTML in this field was messing up the ‘order by duration’ code, buried deep within a function called within a function within a function.  Ugh, getting to the bottom of that almost had me at my wit’s end.
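
The failure mode is easy to reproduce in miniature. What follows is only a Python sketch, not the actual HT code: the simplified ‘start-end’ date format, the nominal year used for ‘OE’ and the function names are all assumptions made for illustration. It shows how markup in the date field makes the duration calculation fail, so the affected rows sink to the end, and how stripping the tags before parsing restores the expected order.

```python
import re

OE_START = 1000  # assumption: a nominal year standing in for 'OE'


def strip_tags(text):
    """Remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def parse_year(part):
    """Turn one end of a simplified date range into a year, or None."""
    part = part.strip()
    if part == "OE":
        return OE_START
    return int(part) if part.isdigit() else None


def duration(fulldate):
    """Length of attestation in years, or -1 if the date cannot be parsed."""
    start, _, end = fulldate.partition("-")
    first, last = parse_year(start), parse_year(end)
    if first is None or last is None:
        return -1
    return last - first


def order_by_duration(rows, clean=True):
    """Longest attested first; with clean=False the markup is left in place."""
    def key(row):
        text = strip_tags(row["fulldate"]) if clean else row["fulldate"]
        return duration(text)
    return sorted(rows, key=key, reverse=True)


rows = [
    {"word": "modern", "fulldate": "1600-1700"},
    {"word": "old", "fulldate": '<span class="sc">OE</span>-1500'},
]
```

With `clean=False` the ‘old’ row gets a duration of -1 and is ordered last; with the tags stripped it is ordered first.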

But I got it all working in the end, and the visualisation popup is now working properly on the search results page, including a new ‘Thesaurus Category’ ordering option.  I’ve also made the row label field wider and have incorporated the heading and PoS as well as the word.  ‘Thesaurus Category’ ordering might seem a little odd as the catnum doesn’t appear on the visualisation, but adding this in would make the row label very long.  Here’s how the timeline visualisation for results looks:

Note that this new search results page isn’t ‘live’ yet.  Fraser also wanted me to update how the search works to enable an ‘exact’ search to be performed, as currently a search for ‘set’ (for example) brings back things like ‘set (of teeth)’, which Fraser didn’t want included.  I did a little further digging into this as I had thought we once allowed exact searches to be performed, and I was right.  Actually, when you search for ‘set’ you are doing an exact search.  If it was a partial match search you’d use wildcards at the beginning and end and you’d end up with more results than the website allows you to see.


Maybe an example with fewer results would work better.  E.g. ‘wolf’.  Using the new results page, here’s an exact search: https://ht.ac.uk/category-selection/index-test.php?qsearch=wolf with 36 results, and here’s a wildcard search: https://ht.ac.uk/category-selection/index-test.php?qsearch=*wolf* with 240 results.

Several years ago (back when Christian was still involved) we set about splitting up results to allow multiple forms to be returned, and I wrote some convoluted code to extract possible permutations so that things like ‘wolf/wolf of hell/devil’s wolf < (deofles) wulf’ would be found when doing an exact search for ‘wolf’ or ‘wolf of hell’ or whatever.

When you do an exact search it searches the ‘searchterms’ table that contains all of these permutations.  One permutation type was to ignore stuff in brackets, which is why things like ‘set (about)’ are returned when you do an exact search for ‘set’, but handily keeps other stuff like ‘Wolffian ridge’ and ‘rauwolfia (serpentina)’ out of exact searches for ‘wolf’.  A search for ‘set’ is an exact match for one permutation of ‘set (about)’ that we have logged, so the result is returned.
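
To make the idea of logged permutations concrete, here is a much simplified Python sketch.  It is not the real permutation code (which handles more cases); the function name and the two rules it applies (splitting alternative forms on ‘/’ and adding a version with bracketed text removed) are assumptions for illustration, and it only looks at the part of the field before any ‘<’.

```python
import re


def search_terms(word_field):
    """Return a set of exact-search terms for one word field (illustrative rules only)."""
    terms = set()
    # Only the part before '<' is considered in this sketch.
    modern = word_field.split("<")[0]
    for form in modern.split("/"):
        form = form.strip()
        if not form:
            continue
        terms.add(form)
        # Permutation that ignores anything in brackets.
        without_brackets = re.sub(r"\s*\([^)]*\)", "", form).strip()
        if without_brackets:
            terms.add(without_brackets)
    return terms
```

Under these rules ‘set (about)’ yields both ‘set (about)’ and ‘set’, so an exact search for ‘set’ finds it, while ‘rauwolfia (serpentina)’ never yields ‘wolf’.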

To implement a proper ‘exact’ search I decided to allow users to surround a term with double quotes.  I updated my search code so that when double quotes are supplied the search code disregards the ‘search terms’ table and instead only searches for exact matches in the ‘wordoe’ and ‘wordoed’ fields.  This then strips out things like ‘set (about)’ but still ensures that words with OE forms, such as ‘set < (ge)settan’ are returned, as such words have ‘set’ in the ‘wordoed’ field and ‘(ge)settan’ in the ‘wordoe’ field.  This new search seems to be working very well but as with the other updates I haven’t made it live yet.
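
The three kinds of search described above can be sketched as follows.  This is a Python illustration under stated assumptions rather than the site’s actual search code: rows are plain dictionaries, the ‘terms’ key stands in for the ‘searchterms’ table, and matching wildcards against ‘wordoed’ with `fnmatch` is a stand-in for whatever the real wildcard query does.

```python
import fnmatch


def classify_query(query):
    """Decide which kind of search a query string asks for."""
    query = query.strip()
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        return "quoted", query[1:-1]
    if "*" in query:
        return "wildcard", query
    return "searchterms", query


def matches(query, row):
    """Test one row against the query using the rule for its search type."""
    mode, term = classify_query(query)
    if not term:
        return False
    if mode == "quoted":
        # Quoted search: bypass the permutations, compare the word fields only.
        return term in (row["wordoe"], row["wordoed"])
    if mode == "wildcard":
        return fnmatch.fnmatchcase(row["wordoed"], term)
    return term in row["terms"]


rows = [
    {"wordoed": "set", "wordoe": "(ge)settan", "terms": {"set"}},
    {"wordoed": "set (about)", "wordoe": "", "terms": {"set (about)", "set"}},
]
```

Here an unquoted search for `set` matches both rows, whereas `"set"` in double quotes matches only the first.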

One thing I noticed with the search results timeline visualisation that might need fixed:  The contents of the visualisation are limited to a single results page, so reordering the visualisation will only reorder a subset of the results.  E.g. if the results go over two pages and are ordered by ‘Thesaurus Category’ and you open the visualisation, if you then reorder the visualisation by ‘length of attestation’ you’re only ordering those results that appeared on the first page of results when ordered by ‘Thesaurus Category’.

So to see the visualisation with the longest period of attestation you first need to reorder the results page by this option and then open the visualisation.  This is possibly a bit clunky and might lead to confusion.  I can change how things work if required.  The simplest way around this might be to just display all results in the visualisation, rather than just one page of results.  That might lead to some rather lengthy visualisations, though.  I’ve asked Marc and Fraser what they think about this.
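
A tiny example shows why the order of operations matters.  This is a Python sketch with made-up data (the words, category numbers and year counts are invented, and the page size of two is only for illustration): reordering a page that was cut from category-ordered results gives a different set of rows than paging results that were ordered by length of attestation first.

```python
def page(results, page_number, per_page):
    """Return one page of an already ordered result list (pages start at 1)."""
    start = (page_number - 1) * per_page
    return results[start:start + per_page]


# Invented sample data.
results = [
    {"word": "a", "catnum": "01.01", "years": 50},
    {"word": "b", "catnum": "01.02", "years": 900},
    {"word": "c", "catnum": "02.01", "years": 20},
    {"word": "d", "catnum": "03.05", "years": 1200},
]

# Current behaviour: page first (by category), then reorder that page only.
by_category = sorted(results, key=lambda r: r["catnum"])
reordered_page = sorted(page(by_category, 1, 2), key=lambda r: r["years"], reverse=True)

# Alternative: order everything by length of attestation, then page.
by_length = sorted(results, key=lambda r: r["years"], reverse=True)
first_page_by_length = page(by_length, 1, 2)
```

The first approach never shows ‘d’, the longest attested word, because it was not on the first page of the category ordering.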

I’m finding the search results visualisation to be quite fun to use.  It’s rather pleasing to search for a word (e.g. the old favourite ‘sausage’) and then have a visualisation showing when this word was used in all its various senses, or which sense has been used for longer, or which sense came first etc.

Also this week the Historical Thesaurus website moved to its new URL.  The site can now be accessed at https://ht.ac.uk/, which is much snappier than the old https://historicalthesaurus.arts.gla.ac.uk URL.  I updated the ‘cite’ options to reflect this change in URL and everything seems to be working very well.  I also discussed some further possible uses for the HT with Fraser and Marc, but I can’t really go into too many details at this point.

Also this week I fixed a minor issue on a page for The People’s Voice project, and a further one for the Woolf short story site, and gave some further feedback about a couple of further updates for the Data Management Plan for Faye Hammill.  I also had some App duties to take care of and gave some feedback on a Data Management Plan that Graeme had written for a project for someone in History.  I also created the interface for the project website for Matthew Creasy’s Decadence and Translation Network project, which I think is looking rather good (but isn’t live yet).  I had a chat with Scott Spurlock about his crowdsourcing project, which it looks like I’m going to start working on later in the summer, I spoke to David Wilson, a developer elsewhere in the college, about some WordPress issues, and I gave some feedback to Kirsteen McCue on a new timeline feature she is hoping to add to the RNSN website.  I also received an email from the AHRC this week thanking me for acting as a Technical Reviewer for them.  It turns out that I completed 39 reviews during my time as a reviewer, which I think is pretty good going!

Also this week I had a meeting with Megan Coyer where we discussed a project she’s putting together that’s related to Scottish Medicine in history.  We discussed possible visualisation techniques, bibliographical databases and other such things.  I can’t go into any further details just now, but it’s a project that I’ll probably be involved with writing the technical parts for in the coming months.

Eleanor Lawson sent me some feedback about my reworking of the Seeing Speech and Dynamic Dialects websites, so I acted on this feedback.  I updated the map so that the default zoom level is one level further out than before, centred on somewhere in Nigeria.  This means on most widths of screen it is possible to see all of North America and Australia.  New Zealand might still get cut off, though.  Obviously on narrower screens less of the world will be shown.

On the chart page I updated the way the scrollbar works.  It now appears at both the top and the bottom of the table.  This should hopefully make it clearer to people that they need to scroll horizontally to see additional content, and also make it easier to do the scrolling – no need to go all the way to the bottom of the table to scroll horizontally.  I also replaced one of the videos with an updated one that Eleanor sent me.

On Friday I began to think about how to implement a new feature for the SCOSYA atlas.  Gary had previously mentioned that he would like to be able to create groups of locations and for the system to then be able to generate summary statistics about the group for a particular search that a user has selected.  I wrote a mini ‘requirements’ document detailing how this might work, and it took some time to think through all of the various possibilities and to decide how the feature might work best.  By the end of the day I had a plan and had emailed the document to Gary for feedback.  I’m hoping to get started on this new feature next week.


Week Beginning 23rd April 2018

I worked on a number of different projects this week.  Jane Stuart-Smith has secured some funding for me to redevelop a couple of old websites for her (Seeing Speech and Dynamic Dialects) so I spent some time working on this.  My first task was to add Google Analytics to the old pages so we can more easily track current usage and how things might change when we eventually launch the new sites.  It was a bit of a pain to add the code in as the current sites consist of flat HTML files (as opposed to using one template file), so every page had to be updated separately.  But with that in place I could begin to think about developing the new interface.  I decided that it would be a good opportunity to try out the Bootstrap front-end library (https://getbootstrap.com/).  I’d been meaning to look into this for a while and I spent a bit of time experimenting with it.  I like the way it provides a very straightforward grid-based layout system for responsive design, and some of the components such as tabs and drop-down menus are excellent too.  I do find that the documentation is a little haphazard, though.  I found myself having to find third party tutorials and explanations as a lot of stuff you can do with the library just isn’t very well explained in the documentation section.  The jQuery UI site, with its clear and extensive examples plus API seems a lot better to me.  However, I managed to get my head round how Bootstrap works and managed to create a number of different mock-up interfaces for the new websites.  I can’t really share them here, or even show screenshots yet, but I created 6 mock-ups and sent the URLs off to Jane and Eleanor Lawson at QMU for feedback.  Once I hear back from them I will be able to take things further.

I also continued with the new timeline feature for the Historical Thesaurus.  Last week I pretty much completed a version of the timeline that was ready to integrate with the live site, and this week I set about integrating it with a test version of the live site.  In this version a ‘Timeline’ button appears next to the ‘Cite’ button for each category / subcategory on the category browse page.  Pressing on this button opens up a jQuery UI modal popup containing the timeline, the sort options, the ‘save SVG’ option and the category heading / catnum and part of speech, as you can see in the following screenshot:

It took quite a bit of effort to implement this, as the timeline uses D3.js version 4 and the sparklines that are used elsewhere in the site (see http://historicalthesaurus.arts.gla.ac.uk/sparklines/) used version 3.  Rather than have two different versions I thought it would be better to upgrade the sparklines to version 4.  This took a couple of hours to sort out as the changes between versions are rather extensive, but thankfully I managed to get the sparklines working in the end.  Getting the timeline to work within a jQuery dialog box was also rather tricky.  The timeline just kept giving error messages and failed to work for ages until I worked out that the dialog box needs to be open and visible before the timeline loads, otherwise the timeline code falls over.  Previously all of my timeline code was contained in one PHP file, including database queries, working with the data in PHP, CSS styles, HTML and the actual processing of the final data outputted by PHP in JavaScript.  This all needed to be split up, with the PHP stuff going into the API, the JavaScript stuff getting added to the existing HT tree JavaScript file, CSS into the existing stylesheet and HTML into the existing category HTML page.  However, it’s all in place now. I still need to create mini-inline timelines for the category page, and I’m hoping to get some time to look into this next week.

Also this week I met with Luca to discuss changes to Joanna Kopaczyk’s project, and I had to spend some time deleting lots of old emails as I received an actual legitimate “your mailbox is almost full” message.  I was also sent a bunch of tweaks and some new song materials for ThePeoplesVoice website by Catriona MacDonald.  The project has now ended and these are the final updates.  You can view the website, the songs and the database of poems here: https://thepeoplesvoice.glasgow.ac.uk/.

My final project of the week was the REELS project, for which I added in the final bits of functionality for the front-end.  This included adding a ‘cite’ option to the record page and updating the API so that the formatting of the CSV for a record is more useable.  Rather than all of the data appearing on one very long row the file places each bit of data on a new row, with the item label in the first column and the corresponding data in the second.  Historical forms and sources now have separate rows for each of their bits of data rather than having everything joined in one field.
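
The reshaped CSV layout can be illustrated with a short Python sketch.  This is not the REELS API code: the function name, the field labels and the sample record are invented, and the way repeated items are numbered in the label column is an assumption.

```python
import csv
import io


def record_to_csv(record):
    """Write one record as label/value rows instead of a single long row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for label, value in record.items():
        if isinstance(value, list):
            # Repeating data (e.g. historical forms): one row per bit of data.
            for number, item in enumerate(value, start=1):
                for sub_label, sub_value in item.items():
                    writer.writerow([f"{label} {number}: {sub_label}", sub_value])
        else:
            writer.writerow([label, value])
    return out.getvalue()


# Invented sample record.
sample = {
    "Name": "Examplehill",
    "Parish": "Exampleparish",
    "Historical form": [{"Form": "Exampilhil", "Date": "1300"}],
}
```

Calling `record_to_csv(sample)` gives four rows, each with the item label in the first column and the corresponding data in the second.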

I’d also noticed that the ‘show on map’ link in the text list of results was failing to pan and zoom sometimes.  This was an intermittent fault that never produced any error messages and although I spent some time trying to figure out what caused it I didn’t manage to find an answer.  For this reason I decided to replace the feature that does a nice animation, zooming out from the current map location, panning to the new location and zooming back in.  Instead when you click on the ‘show on map’ link the map loads directly at the point.  It’s not as pleasing to look at as the animated version, but at least it works consistently.  I also updated the layout of the record page, splitting historical forms off from the rest of the record and placing them in their own tab, which I think works pretty well.  Other than some further formatting of pages, changing colour schemes and some other design tweaks I think that’s the front-end for the project pretty much sorted now.

Week Beginning 12th February 2018

I returned to a full week of work this week, after the horribleness of last week’s flu.  I was still feeling pretty exhausted by the end of each working day, but managed to make it through the week.  I’d had several meetings scheduled for last week, and rescheduled them all for this week.  On Monday I met with Kirsteen and Brianna to discuss the website for the Romantic National Song Network.  I’d been working on some interactive stories, one timeline based, the other map based, and we talked about how we were going to proceed with these.  The team seem pretty happy with how things are developing, and the next step will be to take the proof of concept that I created and add in some higher resolution images, more content and try to make the overall interface a bit larger so as to enable embedded images to be viewed more clearly.  On Tuesday I met with Faye Hammill in English Literature to discuss a project she is currently putting together with colleagues at the University of Birmingham.  I have agreed to write a technical plan for this project, although it’s still not clear exactly when the AHRC are going to replace the technical plans with their new data management plans.  We also discussed a couple of older project websites she has that are currently based at Strathclyde University but will need to be migrated to Glasgow.  The sites are currently ASP based and we’d need to migrate them to something else as we don’t support ASP here.

On Wednesday I had three meetings.  The first was with Honor Riley of The People’s Voice project.  This project launched on Thursday so we met to discuss making all of the online resources live.  This included the database of poems I had been developing, which can now be viewed here:  http://thepeoplesvoice.glasgow.ac.uk/poems/.  I spent some time on Wednesday making final tweaks to the site, and attended the project’s launch event, which lasted all day Thursday.  It was an interesting event, held at the Trades Hall in the Merchant City.  It was great to see the online resources being launched at the event, and it was also good to learn more about the subject and hear the various talks that took place.  The event concluded with a performance of some of the political songs by Bill Adair, which was excellent.

The remaining meetings I had on Wednesday were with Matthew Creasy and Megan Coyer.  Matthew has an AHRC leadership fellowship starting up soon and I’m going to help him put a project website together.  This will include an online resource, a sort of mini digital edition of some poems.  Megan wanted to discuss some potential research tools that might be used to help her in her studying of British periodicals, specifically tools that might help with annotation and note taking.  We discussed a few options and considered how a new tool might be developed, if there was a gap in the market.  Developing such a tool would not be something I’d be able to manage myself, though.  My final meeting of the week was with Stuart Gillespie on Friday.  I’d put together a website that will accompany Stuart’s recent publication, and we met to discuss some final tweaks to the website and to discuss how to handle updates to it in future years.  The website is now available here: http://www.nrect.glasgow.ac.uk/

Other than attending these various meetings and the launch event on Thursday, I managed to squeeze in some other work too.  I had an email conversation with Thomas Widmann of the DSL about the API that was developed for the DSL website, and I also helped Ann Ferguson to get access to the WordPress version of the DSL website that I created in November last year.  I also spent a bit of time updating all of the WordPress sites I manage, as yet another new version of WordPress had been released (the third so far this year, which is a bit ridiculous).  I’d had an email from someone at the Mitchell Library to say that some of the images on TheGlasgowStory weren’t working so I spent a small amount of time investigating and fixing this.  It turned out that some images had upper case extensions (JPG instead of jpg) and as Linux is case sensitive the images weren’t getting found.  I also had an email chat with Fraser about some of the outstanding work that needs to be done for the Linguistic DNA project.  We’re going to meet with Marc in the next few weeks to discuss this further.  Fraser also gave me access to the tagged EEBO resource, from which I will need to extract some frequency data.
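
For the image extension problem mentioned above, a quick way to spot this kind of mismatch is to compare the filenames a site refers to with the filenames actually on disk.  The following is a Python sketch only; the function name and the sample filenames are invented, and it assumes both lists have already been gathered.

```python
def case_mismatches(referenced, on_disk):
    """Map each referenced name that only matches a file when case is ignored."""
    exact = set(on_disk)
    by_lower = {name.lower(): name for name in on_disk}
    return {
        name: by_lower[name.lower()]
        for name in referenced
        if name not in exact and name.lower() in by_lower
    }
```

For example, `case_mismatches(["a.jpg", "b.jpg"], ["a.JPG", "b.jpg"])` reports that ‘a.jpg’ exists on disk only as ‘a.JPG’, which a case-sensitive filesystem treats as a different file.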

I spent the remainder of the week working on the front end and API for the REELS project.  I managed to complete several new endpoints for accessing the data in the API.  The most important of these was the advanced search endpoint, which allows any combination of up to 16 different fields to be submitted in order to return results as JSON or CSV data.  I also created other endpoints that will be used for autocomplete features and lists of things, such as sources, parishes, classification codes and elements.  With all of this in place I could start working on the actual advanced search form in the front end, and although I haven’t managed to complete this yet I am making pretty good progress with it.  Hopefully I’ll have this completed before the REELS team meeting on Tuesday next week.
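
An endpoint that accepts any combination of fields usually comes down to building a query from whichever fields were supplied.  Here is a minimal Python sketch of that pattern; it is not the REELS code (which is PHP), and the three field names, the column names and the table name are all hypothetical.  The whitelist means unknown parameters are ignored, and values are kept separate from the SQL as placeholders.

```python
# Hypothetical mapping from request parameters to database columns.
ALLOWED_FIELDS = {
    "name": "place_name",
    "parish": "parish",
    "element": "element",
}


def build_query(params):
    """Build a parameterised query from whichever allowed fields were supplied."""
    clauses = []
    values = []
    for key, column in ALLOWED_FIELDS.items():
        value = params.get(key)
        if value:
            clauses.append(f"{column} = ?")
            values.append(value)
    sql = "SELECT * FROM places"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, values
```

Passing `{"parish": "Exampleparish", "bogus": "1"}` produces a query with a single `parish = ?` condition; the unrecognised parameter is dropped.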

Week Beginning 5th February 2018

There’s not much to report from this week.  I had to head home on Friday afternoon last week as I wasn’t feeling well. Actually, I’d started to feel unwell on Thursday evening, but then foolishly struggled into work and then didn’t make it through the day.  I thought I had a bad cold, but it turned out to be flu, the like of which I have not experienced for decades.  I could barely move all weekend and although I was over the worst of it on Monday it took me many days before I was well enough to do anything at all.  I finally managed to return to work on Friday, although I was still exhausted and if I hadn’t been working from home I don’t think I would have made it through a full day.

I spent some of the day replying to emails that had mounted up whilst I’d been off sick, and rescheduling meetings that I’d had to cancel due to being off.  I also fixed an old website of mine that had stopped working when it was moved to a new server, due to some calls to PHP functions that are not supported in more recent versions of PHP.  I then made some tweaks to the poems section of The People’s Voice project website, fixing an issue whereby poems that don’t have an archive or library specified weren’t displaying publication dates either.

I then spent the rest of the day on the REELS project, adding in an initial version of the text view of the search results.  The results page now features two tabs: ‘Map’ and ‘Text’.  If you click on the ‘Text’ tab you can access an alphabetical list of the results (just the matching results, not the ‘grey’ data).  Currently there’s no pagination of results and the links don’t lead anywhere, but it gives an idea of how the feature will work.  Clicking on the ‘map’ tab takes you back to the map.  I’m also wondering now whether I should add a ‘highlight on map’ option to the text list as already when looking down the list I spot interesting names and wonder where they actually appear on the map.  A ‘highlight’ button that then returns to the map tab and makes the corresponding circle blink or something might be handy.

Hopefully I’ll be back to full health next week and will be able to work a full week again.

Week Beginning 29th January 2018

I spent the majority of this week beginning on the implementation of the front-end for the REELS project.  I’d finished an initial version of my specification document last week and had sent it to the team for comment.  After receiving some feedback and thinking through some issues further myself, I created a slightly revised version and then set to work on the actual development.  As with other recent projects, I decided to create an API for the front-end, so that all querying, whether it comes from server-side PHP scripts or client-side JavaScript files, is handled via the same interface.  It keeps things nice and compartmentalised and ensures the data can easily be returned in different formats and by different systems now and in the future.  I based the API on the one I’d already created for the SCOSYA project, so thankfully I just had to adapt this code for the specific queries required by REELS rather than having to start from scratch.

I began with the implementation of the quick search.  This searches a variety of fields such as place-name and element.  I implemented wildcard searches for single characters using MySQL’s ‘_’ search and beginning, end and middle searches using MySQL’s ‘%’ search, and added in an exact search using double quotes too.  The API can return the results as either JSON or CSV, and with the addition of a few new indexes the queries return data pretty speedily.  I also ensured that data which doesn’t meet the criteria is included in the JSON view of the data, so that I’ll be able to add these as ‘grey spots’ on the map in addition to the actual data.
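
For anyone unfamiliar with the two MySQL wildcards, their behaviour can be imitated in a few lines of Python.  This is only an illustration of the matching rules (‘_’ is exactly one character, ‘%’ is any run of characters, and the whole value must match), not the query code used by the API; matching is made case-insensitive here on the assumption of a case-insensitive collation, and escape characters are not handled.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value, pattern):
    """True if the whole value matches the pattern, as LIKE requires."""
    return like_to_regex(pattern).fullmatch(value) is not None
```

So `like("Hill", "h_ll")` is true, `like("Whitehall", "h_ll")` is false because the whole value has to match, and `like("Whitehall", "%h_ll%")` is true.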

With an initial end point for the API in place I then decided to begin work on the map interface for the front-end.  As with previous projects, I decided to use Leaflet for the map, as it is a simple, lightweight library with no external dependencies that you can easily install on your own server (unlike Google Maps where everything has to get sent to Google’s servers for processing).  I set Leaflet up with an initial MapBox basemap (which I am still going to work with to improve the interface) and managed to connect the map to the API in order to display the search results.  I then split the results up into different map layers based on the place-name type, and assigned each type a different coloured dot.  Eventually I will replace these with icons, but this was a good first step.  With this in place and the legend visible it then became possible to turn on or off a particular type, for example hiding all of the settlements, or hiding all of the grey dots.  Here’s an example of the map, showing a search for ‘h_ll’ (all names with these characters somewhere in them):

I then updated the API to add in a further endpoint for retrieving the data required for the popup and updated the map to add in popups.  These are AJAX powered – none of the map markers actually includes the content of their pop-ups until the user clicks on the map marker.  At that point an AJAX request is sent and the data is retrieved in JSON format, then formatted by the script and the pop-up is displayed.  If the user clicks to open a popup a second time the system can tell that the popup is already populated and therefore a second AJAX request is not made.  Still lots to do for the project, but I feel that I’ve made really good progress this week.
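
The ‘fetch on first click, reuse afterwards’ behaviour is a small caching pattern.  The real thing is client-side JavaScript making AJAX requests; the Python sketch below only mirrors the logic, with an invented class name and a `fetch` function passed in to stand for the request to the API.

```python
class LazyPopups:
    """Load popup content the first time a marker is opened, then reuse it."""

    def __init__(self, fetch):
        self._fetch = fetch   # stands in for the AJAX request to the API
        self._content = {}    # marker id -> content that has already been loaded

    def open(self, marker_id):
        if marker_id not in self._content:
            # First click on this marker: ask the API for its data.
            self._content[marker_id] = self._fetch(marker_id)
        return self._content[marker_id]
```

Opening the same marker twice calls `fetch` only once, which is the point of checking whether the popup is already populated.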

In addition to REELS I worked on some other projects as well this week.  I made some further updates to the NRECT website for Stuart Gillespie, I provided some information to old colleagues in the DCC about the AHRC’s decision to drop Technical Plans, I performed some App account management duties, which involved setting up a new account for a developer who is working with MVLS, and I made a few corrections to some information on the DSL website.  I also tweaked a couple of fields in The People’s Voice database and updated the way poem titles are ordered in the front-end, basically adding a new ‘order’ field that ignores any non-alphanumeric characters.  I also spent a bit of time investigating why an old website had stopped working.  It turned out that the site (and others) had recently been moved to a new server, and the old website (written more than 15 years ago) used a couple of functions that are no longer supported.  A quick find and replace sorted the issue, but I will still need to address the problem in a couple of other old sites next week.  I had to head home at lunchtime on Friday as I was feeling unwell.
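
As an aside on the poem title ordering mentioned above, an ‘order’ value that ignores non-alphanumeric characters can be derived along these lines.  This is a Python sketch with invented titles; whether the real field also lower-cases the title or drops spaces is an assumption.

```python
def order_key(title):
    """Keep only letters and digits, lower-cased, for use as a sort value."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


# Invented titles that begin with punctuation.
titles = ['"Zebra"', "An Address", "[Untitled]", "'Tis"]
ordered = sorted(titles, key=order_key)
```

Sorted by this key, titles that begin with quotation marks or brackets fall into their alphabetical place rather than clustering at the start of the list.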

Week Beginning 15th January 2018

I worked on a number of projects and gave advice to several members of staff this week.  Megan Coyer sent me an example document that she will need to perform OCR on in order to extract the text from the digitised images.  The document is a periodical from Proquest’s British periodicals collection and was a PDF containing digitised images.  I was hoping that the full text would be indexed in the PDF and allow searching using Acrobat’s search facility (as is possible with some supposedly image based PDFs) but unfortunately this was not the case.  Proquest’s website states that ‘All of this material is available in page image format with fully searchable text. Users can filter results by article type and download articles as either PDFs or JPEG page images’ so it would appear that they limit the fully searchable text to their own system, and the only outputs they make available to download are purely image based.  Megan needs access to the full text so we’re going to have to do our own OCR.

I downloaded a free OCR package based on the Tesseract engine used by Google Books (https://github.com/A9T9/Free-Ocr-Windows-Desktop/releases) and experimented with the document.   The software allows PDFs to be OCRed, but when I ran the first page of the PDF through the software the results were terrible, resulting in a text file that was completely unusable.  This didn’t look promising at all, but via a subscription to Proquest from the University library we can access the actual image files.  I downloaded the first page and running this through the OCR software was a huge improvement, with only a few very minor errors cropping up.  I’m guessing this is because the images contained in the PDF are of a much lower resolution than the actual image files that are available, although there may be other factors involved too.  But whatever the reason, it looks like it will be possible to extract the text, which is very promising.

On Monday I met with Honor Riley, the RA on The People’s Voice project to discuss the poems database and how we will add song recordings to the database, and also to the main project website.  It was a useful meeting and we figured out a method that should work.  On Tuesday I implemented the methods we had agreed upon the previous day.  Honor had given me an initial batch of song recordings and I converted these from WAV to MP3 and uploaded them to the project’s WordPress site.  I then updated the poems database front end I’d previously created to include an HTML5 audio player that references the MP3’s URL (if one is included for the poem’s record) and also links through to a page about the song that will be set up via WordPress.  I also updated the poem database front end to include a new facility that will allow poems to be browsed by publication.  That should be most of the technical things completed ahead of the project’s launch next month.

On Tuesday afternoon I attended a meeting for the REELS project.  It’s been a while since I’ve been involved with this project, but we’ve reached the stage where I will need to start working on the front end for the project’s data.  We had a long and very useful meeting where we discussed the requirements for the front end – the sorts of search and browse facilities that we want to include and how the map interface should work.  We also looked at a few existing map-based place-name websites to get some ideas from those.  I’m hoping to be able to start work on the front-end next week.

There were a few other tweaks I needed to make to the REELS content management system.  Simon had encountered an issue with a special character not being saved in the database.  It was the ‘e-caudata’ character (ę) and even though I had the table set up as UTF-8 this was still failing to insert properly.

It turns out that MySQL’s ‘utf8’ character set only supports characters that take up a maximum of 3 bytes, and this character apparently takes up 4 bytes.  What’s needed instead is to set MySQL up to use ‘utf8mb4’ (multi-byte 4).  Setting the collation alone didn’t fix this for me, though; I had to convert the character set to utf8mb4 as well:

ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

There’s a very handy page about this here: https://mathiasbynens.be/notes/mysql-utf8mb4
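A quick way to see which characters in a piece of text fall outside the 3-byte limit is to count the UTF-8 bytes for each code point.  The following is only a sketch: the function names are made up for illustration, and it relies on the standard `TextEncoder` API available in browsers and Node.js.

```javascript
const encoder = new TextEncoder(); // always encodes to UTF-8

// Number of bytes a single character (code point) occupies in UTF-8.
function utf8ByteLength(ch) {
  return encoder.encode(ch).length;
}

// Characters that need more than 3 bytes and so would not fit in a 3-byte-only column.
function charsNeedingUtf8mb4(text) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text).filter(function (ch) {
    return utf8ByteLength(ch) > 3;
  });
}

// Example: "\u20AC" (euro sign) needs 3 bytes, "\u{1F600}" (an emoji) needs 4.
const problemChars = charsNeedingUtf8mb4("a\u20AC\u{1F600}");
```

Running a check like this over a value that fails to insert shows whether the byte limit is really the cause.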

I also added in a new column to the ‘browse place-names’ page in the content management system for displaying former parishes, which makes things easier for the project team.

I bumped into Jane during the week and we talked a bit about updates to a couple of project websites.  I also replied to an email from Helen Kingstone in English Literature who wanted some advice on corpus linguistics tools.  I also did a bit of App admin duties as colleagues in MVLS are in the process of creating a new app and needed my help to set up a new user account.  I also arranged to meet with Anna McFarlane to discuss a website for a research project she’s putting together, and I responded to an email from Joanna Kopaczyk about the proposal she’s currently putting together.

I spent a fun few hours on Friday morning designing a little website for Stuart Gillespie that will accompany his printed volume ‘Newly Recovered English Classical Translations 1600-1800’.  The Annexe to the volume is going to be made available from this website, and Stuart had sent me a copy of the volume’s cover so I could take some design cues from it.  I created a nice little responsive interface that I think complements the printed volume very well.  I can’t link to it yet, though, as the site currently doesn’t feature any content.

I spent the rest of the week on Historical Thesaurus duties.  My first task was to create an updated version of one of my scripts for showing links between HT and OED categories, specifically the one that brings back categories that match and displays lists of words within each that either match or don’t.  Previously the script would just bring back all of the matches, or a subset (e.g. the first 1000), but Fraser wanted a version where you could request a specific category and have both it and its subcats returned.  I managed to create such a script fairly quickly and it seems to fit the bill.

I also tweaked the ‘fixed header’ I created last week.  I’d made it so that if you select a subcat then the ID of that subcat replaces the maincat ID in the address bar.  However, if you deselect the subcat the ID does not revert, when really it makes sense for it to do so.  A swift update to the code and it now does.  Much better.  I also updated the font used in the header at Marc’s suggestion.  All I need now is final approval and this new feature can go live.

I then continued to investigate how to add in arrows and curved edges to the timeline.  By updating the timeline code I figured out how to add in circles and squares to the beginning and end of a timeline shape.  By rotating the square 45 degrees I could make it into a diamond that, when positioned next to the shape, poked out as if it was an arrow.  This rotating took some figuring out as with D3.js you can’t just rotate a shape after it has been added, otherwise all subsequent shapes appear on this rotated line as well.  Instead you need to specify the ‘x’ and ‘y’ coordinates and the rotation at the point of creation in the same call, like so:


var x = getXPos(d,i) - 10;

var y = getStackPosition(d,i) + 10;

return "translate("+x+","+y+") rotate(-45)";
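To show the snippet above in context, here is a minimal, self-contained sketch.  `getXPos` and `getStackPosition` are named in the post but their implementations aren't shown, so the versions below are stand-ins, and `diamondTransform` is a hypothetical wrapper name.

```javascript
// Stand-ins for the helpers named in the post (real implementations not shown there).
function getXPos(d, i) {
  return d.x; // assumed: the row's horizontal pixel position
}
function getStackPosition(d, i) {
  return i * 20; // assumed: rows stacked 20px apart
}

// Builds one SVG transform string holding both the position and the rotation.
function diamondTransform(d, i) {
  var x = getXPos(d, i) - 10;
  var y = getStackPosition(d, i) + 10;
  // SVG applies the list so that the shape is moved to (x, y) and rotated
  // about that point, rather than about the origin of the whole drawing.
  return "translate(" + x + "," + y + ") rotate(-45)";
}

// With D3 this would typically be passed as an accessor, e.g.
//   selection.append("rect").attr("transform", diamondTransform)
```

Because the translate and rotate live in a single attribute on the shape itself, the rotation does not leak into the positioning of any other shapes.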


My initial version with different ends had arrows at the left hand side of all timeline shapes and circles at the right hand side, but with this proof of concept in place I could then add in a filter to only add in the shapes when required.  This meant updating the structure of the data that is fed to the timeline code to add in new fields for whether a timeline shape is ‘ante’ or ‘circa’ (or neither).  It took some time to update the script that generates the data to account for all of the various ‘ac’ fields in the database and figure out whether these apply to the start date or end date, but I got there in the end.  I also had to work with the D3.js ‘filter’ option, which is what needs to be used instead of a regular ‘if’ statement.  E.g. this is how I check to see whether a start date is ‘ante’ or is Old English (both of which need arrows to the left):

.filter(function(d){ var check = false; if(d.starting_ac == "a" || d.starting_time == -27454550400000) check = true; return check;})
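The same check can be pulled out into a named predicate, which makes it easier to test on its own.  This is just a sketch: the sample rows and the `word` field are invented, plain `Array.prototype.filter` stands in for the D3 selection filter, and it assumes `starting_time` is stored as a number.  The constant is the one used above, which is the same value as `Date.UTC(1100, 0, 1)`.

```javascript
// Value taken from the filter above; it equals Date.UTC(1100, 0, 1).
const OLD_ENGLISH_START = -27454550400000;

// True when a timeline row needs an arrow at its left-hand end:
// either the start date is 'ante' or the row starts at the Old English marker.
function needsLeftArrow(d) {
  return d.starting_ac === "a" || d.starting_time === OLD_ENGLISH_START;
}

// Hypothetical rows; only starting_ac and starting_time come from the post.
const rows = [
  { word: "alpha", starting_ac: "a", starting_time: Date.UTC(1450, 0, 1) },
  { word: "beta", starting_ac: "", starting_time: OLD_ENGLISH_START },
  { word: "gamma", starting_ac: "c", starting_time: Date.UTC(1600, 0, 1) }
];

// Keeps "alpha" (ante) and "beta" (Old English), drops "gamma" (circa).
const arrowRows = rows.filter(needsLeftArrow);
```

A D3 selection's `filter` accepts the same kind of function, so `needsLeftArrow` could be passed to it directly.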

With this in place I then had a timeline with shapes that have different ends depending on the types of dates that are present, and I must say I’m pretty pleased with how this is working out, as I was worried I wouldn’t be able to get this feature working.  Here’s a screenshot:

Note that there are further things to do.  For example, some dates have an ‘ante’ end date, but I’m not currently sure how these should be handled.  Also using dots for single years means it’s not possible to differentiate between ‘circa’ single dates and regular single dates.  Marc, Fraser and I will need to meet again to consider how best to deal with these instances.

My final task for the week was to look into sorting the timeline.  Currently the timeline is listed by date, but we want an option to list it alphabetically by word as well.  I managed to get a rudimentary sort function working, but as yet it redraws the whole timeline when the sort option is selected and I’d rather it animated the moving of rows instead, which would look a lot nicer.  This might take some work, though.
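A rudimentary sort of this kind might look something like the sketch below.  The `word` field and the `sortBy` values are assumptions; `starting_time` is the field used in the filter earlier in this post.

```javascript
// Returns a sorted copy of the timeline rows, leaving the original array untouched.
// sortBy: "word" for alphabetical order, anything else for date order.
function sortTimelineRows(rows, sortBy) {
  const sorted = rows.slice();
  if (sortBy === "word") {
    // Case-insensitive alphabetical comparison.
    sorted.sort(function (a, b) {
      return a.word.localeCompare(b.word, "en", { sensitivity: "base" });
    });
  } else {
    // Earliest start date first.
    sorted.sort(function (a, b) {
      return a.starting_time - b.starting_time;
    });
  }
  return sorted;
}
```

Sorting a copy rather than the original array keeps the previous order available, which would be needed to work out where each row has to move from if the rows are animated rather than redrawn.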

Week Beginning 9th October 2017

It was another week of working on fairly small tasks for lots of different projects.  I helped Gerry McKeever to put the finishing touches to his new project website, and this has now gone live and can be accessed here:  http://regionalromanticism.glasgow.ac.uk/.  I also spent some further time making updates to the Burns Paper Database website for Ronnie Young.  This included adding in a site menu to facilitate navigation, adding a subheader to the banner, creating new pages for ‘about’ and ‘contact’, adding some new content, making repositories appear with their full names rather than acronyms, updating the layout of the record page and tweaking how the image pop-up works. It’s all pretty much done and dusted now, although I can’t share the URL as the site is password protected due to the manuscript images being under copyright restrictions.

I spent about a day this week on AHRC review duties and also spent some time working on the new interface for Kirsteen McCue’s ‘Romantic National Song Network’ project website.  This took up a fair amount of time as I had to try out a few different designs, work with lots of potential images, set up a carousel, and experiment with fonts for the site header.  I’m pretty pleased with how things are looking now, although there are four different font styles that we still need to choose one from.

I had a couple of conference calls and a meeting with Marc and Fraser about the Linguistic DNA project.  I met with Marc and Fraser first, in order to go over the work Fraser is currently doing and how my involvement in the project might proceed.  Fraser and I then had a Skype call with Iona and Seth in Sheffield about the work the researchers are currently doing and some of the issues they are encountering when dealing with the massive dataset they’re working with.  After the call Fraser sent me a sample of the data, which really helped me to understand some of the technical issues that are cropping up.  On Friday afternoon the whole project had a Skype call.  This included the DHI people in Sheffield and it was useful to hear something about the technical work they are currently doing.

I had a couple of other meetings this week too.  On Wednesday morning I had a meeting with Jennifer Smith about a new pilot project she’s putting together in order to record Scots usage in schools.  We talked through a variety of technical solutions and I was able to give some advice on how the project might be managed from a technical point of view.  On Wednesday afternoon I had a meeting for The People’s Voice project, at which I met with the new project RA, who has taken over from Michael Shaw as he’s now moved to a different institution.  I helped the new RA get up to speed with the database and how to update the front-end.

Also this week I had an email conversation with the SPADE people about how we will set up a server for the project’s infrastructure at Glasgow.  I’m going to be working on this the week after next.  I also made a few further updates to the DSL website and had a chat with Thomas Widmann about a potential reworking of some of the SLD’s websites.

There’s not a huge amount more to say about the work I did this week.  I was feeling rather unwell all week and it was a bit of a struggle getting through some days during the middle of the week, but I made it through to the end.  I’m on holiday all of next week so there won’t be an update from me until the week after.

Week Beginning 28th August 2017

This week was rather a hectic one as I was contacted by many people who wanted my help and advice with things.  I think it’s the time of year – the lecturers are returning from their holidays but the students aren’t back yet so they start getting on with other things, meaning busy times for me.  I had my PDR session on Monday morning, so I spent a fair amount of time at this and then writing things up afterwards.  All went fine, and it’s good to know that the work I do is appreciated.  After that I had to do a few things for Wendy for Mapping Metaphor.  I’d forgotten to run my ‘remove duplicates’ script after I’d made the final update to the MM data, which meant that many of the sample lexemes were appearing twice.  Thankfully Wendy spotted this and a quick execution of my script removed 14,286 duplicates in a flash.  I also had to update some of the text on the site and update the way search terms are highlighted in the HT to avoid links through from MM highlighting multiple terms.  I also wrote a little script that displays the number of strong and weak metaphorical connections there are for each of the categories, which Wendy wanted.

My big task for the week was to start on the redevelopment of the ARIES app.  I had been expecting to receive the materials for this several weeks earlier as Marc wanted the new app to be ready to launch at the beginning of term.  As I’d heard nothing I assumed that this was no longer going to happen, but on Monday Marc gave me access to the files and said the launch must still go ahead at the start of term.  There is rather a lot to do and very little time to do it in, especially as preparing stuff for the App Store takes so much time once the app is actually developed.  Also, Marc is still revising the materials so even though I’m now creating the new version I’m still going to have to go back and make further updates later on.  It’s not exactly an ideal situation.  However, I did manage to get started on the redevelopment on Tuesday, and spent pretty much all of my time on Tuesday, Wednesday and Thursday on this task.  This involved designing a new interface based on the colours found in the logo file, creating the structure of the app, and migrating the static materials that the team had created in HTML to the JSON file I’m creating for the app contents.  This included creating new styles for the new content where required and testing things out on various devices to make sure everything works ok.  I also implemented two of the new quizzes, which also took quite a bit of time, firstly because I needed to manually migrate the quiz contents to a format that my scripts could work with and secondly because although the quizzes were similar to ones I’ve written before they were not identical in structure, so needed some reworking in order to meet the requirements.  I’m pretty happy with how things are developing, but progress is slow.  I’ve only completed the content for three subsections of the app, and there are a further nine sections remaining.  
Hopefully the pace will quicken as I proceed, but I’m worried that the app is not going to be ready for the start of term, especially as the quizzes should really be tested out by the team and possibly tweaked before launch.

I spent most of Friday this week writing the Technical Plan for Thomas Clancy’s new place-name project.  Last week I’d sent off a long list of questions about the project and Thomas got back to me with some very helpful answers this week, which really helped in writing the plan.  It’s still only a first version and will need further work, but I think the bulk of the technical issues have been addressed now.

Other than these tasks, I responded to a query from Moira Rankin from the Archives about an old project I was involved with, I helped Michael Shaw deal with some more data for The People’s Voice project, I had a chat to Catriona MacDonald about backing up The People’s Voice database, I looked through a database that Ronnie Young had sent me, which I will be turning into an online resource sometime soon (hopefully), I replied to Gerry McKeever about a project he’s running that’s just starting up which I will be involved with, and I replied to John Davies in History about a website query he had sent me.  Unfortunately I didn’t get a chance to continue with the Edinburgh Gazetteer work I’d started last week, but I’ll hopefully get a chance to do some further work on this next week.

Week Beginning 21st August 2017

I worked on quite a number of different projects this week, mostly lots of little bits of work rather than major things.  I set up an initial website for Kirsteen McCue’s Romantic National Song Network project, which involved trying out different themes, preparing background images and the like.  I also upgraded all of the WordPress instances I manage to the latest release and spoke to Chris McGlashan about the possibility of moving all our sites from HTTP to HTTPS.  This would be great from a security point of view and as the majority of our sites are just subdomains of the main University domain I’m hoping we can just use the existing certificate with our sites.

I replied to Gavin Miller, who wanted my input into a new Wellcome Trust bid he is putting together and I continued an email discussion with Alison Wiggins about her new project.  I also updated the Digital Humanities at Glasgow website to add several new projects to the resource and to update the records of some existing projects, such as ‘Basics of English Metre’, which now contains information about the app rather than the ancient web resource.  See all of the projects here: http://digital-humanities.glasgow.ac.uk/projects/.

On Thursday I attended the ‘SICSA Digital Humanities meets Computer Science Workshop’ at the University of Strathclyde.  It was a very interesting event with lots of opportunities to talk to other digital humanities and computing specialists and to learn more about other projects.  Unfortunately I had to leave early due to childcare obligations, but I found the parts I was able to attend to be very useful.

The biggest chunk of work I did this week was to develop a map of reform societies for Rhona Brown’s Edinburgh Gazetteer project.  Rhona had prepared a Word document that listed about 90 reform societies that were mentioned across all of the pages of the Gazetteer and I had to convert this into data that could then be plugged into a map interface.  We had previously arranged with the NLS to use one of their geocoded historical maps as a base map – John Thomson’s map of Scotland from 1815, which is the same base map I’d previously used for the Robert Burns walking tours feature (see http://burnsc21.glasgow.ac.uk/highland-tour-interactive/) so I got to work setting this up.  I decided to structure the data using JSON, as this could very easily be plugged into the map but also then reused for a textual list of the societies.  I had to manually grab the latitude and longitude values for each location using Google Maps, which was a bit of a pain, but thankfully although there were about 90 records many of these were at the same location, which cut down on the required work slightly.  For example, there are 13 reform societies in Edinburgh and 9 in Glasgow.  In the end I had a JSON structure for each record as follows:

{"id":61, "latLng": [55.941855, -3.054019], "toolTip": "Musselburgh", "title": "Friends of Reform, Musselburgh ","people":"Preses: Colen Clerk<br />Secretary: William Wilson","pageID":92,"linkText":"19 February 1793, p.4"}

This provided the information for the location on the map, the tooltip that appears when you hover over a point and the contents of the popup, including a link through to the actual page of the Gazetteer where the society is mentioned.  I spent a bit of time thinking about the best way to represent there being multiple records at a single point.  I considered using circles of different sizes to let people see at a glance where the largest number of societies were, but realised this actually made it look like a larger geographical area was being covered instead.  I then decided to have a number in the marker to show how many societies were there.  I was using Leaflet circlemarkers rather than pins, as I didn’t want to give the impression that the societies were associated with an exact point on the map, but unfortunately adding text to Leaflet circlemarkers isn’t possible.  Instead I switched to using Leaflet’s divicon (see http://leafletjs.com/reference-1.2.0.html#divicon).  This marker type allows you to specify HTML to appear on the map and to then style the marker with regular CSS styling.  It took a bit of experimentation to get the style looking as I wanted – positioning the text was especially tricky – but in the end I had a map featuring circles with numbers in the middle, which I think works rather well.  Another issue is the old map is not completely accurate, meaning the real latitude and longitude values for a place may actually result in a marker some way off on the historical map.  However, I spoke to Rhona about this and she said it didn’t really matter too much.  I also added in a ‘full screen’ option for the map, and for good measure I added the same feature to the Gazetteer page too, for browsing round the large Gazetteer page images.  It all seems to be working pretty well.  The site isn’t live yet so I can’t include the URL, but here’s an image of the map:
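As a rough illustration of the numbered markers described above, records in the JSON shape shown earlier could be grouped by their coordinates and the size of each group used as the marker text.  This is only a sketch, not the project's actual code: the function names and CSS class names are invented, and the Leaflet calls are left as comments so the snippet runs without the library.

```javascript
// Groups records that share the same latLng so each location gets one marker.
function groupByLocation(records) {
  const groups = new Map();
  for (const record of records) {
    const key = record.latLng.join(",");
    if (!groups.has(key)) {
      groups.set(key, { latLng: record.latLng, toolTip: record.toolTip, records: [] });
    }
    groups.get(key).records.push(record);
  }
  return Array.from(groups.values());
}

// HTML for a circular marker showing how many societies are at the location.
// The circle itself would be drawn with CSS (e.g. border-radius: 50%).
function markerHtml(group) {
  return '<div class="society-count">' + group.records.length + "</div>";
}

// With Leaflet loaded, each group could then become a marker along these lines:
//   L.marker(group.latLng, {
//     icon: L.divIcon({ html: markerHtml(group), className: "society-marker" })
//   }).bindTooltip(group.toolTip).addTo(map);
```

Grouping first also means the popup for a location can list every society in `group.records` rather than just one.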

Also this week I helped Michael Shaw of The People’s Voice project with a file upload issue he was experiencing.  I created a CSV upload facility for adding data to the project’s database but his file just wouldn’t upload.  It turned out to be an issue with CSVs created on a Mac, but we implemented a workaround for this.  I also had an email conversation with Joanna Kopaczyk, who will be starting in English Language next month.  She has an idea for a project and wanted to ask for my advice on some technical matters.

Finally this week I started working on the Technical Plan for a project Thomas Clancy is putting together.  It’s another place-name project and it will use a lot of the same technologies as the REELS project so I’m helping out with this.  I should hopefully get a first draft of the Technical Plan together during next week, although this depends on when some of the questions I’ve asked can be answered.

Week Beginning 12th June 2017

I continued to work on the redevelopment of the Historical Thesaurus website this week, which took up the bulk of my time.  Marc, Fraser and I have made some really good progress on this and by the end of the week we had most of the new version complete, apart from adding in new content to some of the ‘About’ pages.  I think it’s looking really great and the tree browse in particular makes accessing the content so much quicker and easier.

The first major item I attempted to implement was the facility to open the tree at a specific category rather than at the top level.  Being able to do this was absolutely vital for the new design as if I couldn’t figure out a way to do it we wouldn’t be able to go from a search result or citation to a specific category and the tree view would therefore be pretty useless.

Initially I attempted to generate the full, open tree structure on the server side and to return this to the ‘fancytree’ plugin, but writing a script that would generate the full, nested JSON structure, where children arrays are children of children of children etc, was taking too long.

Instead I turned my attention to the tree itself and wondered whether instead of generating the entire, potentially horribly complicated structure on the server-side I could instead work out which tree elements should be expanded and simulate the expanding of branches in JavaScript.  This is the solution that is currently live.  For example, if you load a subcategory the ‘browse’ loads with the maincat loaded and the page then scrolls down to the subcategory.  Whilst this is taking place the JavaScript fires off a request to get the hierarchy from the parent category upwards, which returns the catIDs of all of the parents.  The JavaScript then simulates the clicking of these nodes – e.g. first loading ‘The World’ and once it’s loaded it then loads ‘Food and Drink’ etc.  Once the full tree is expanded the relevant maincat is then highlighted in the tree.  This all works rather well.
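The general idea of expanding one branch at a time can be sketched as follows.  This is not the live code: `expandPath` and the `expandNode(catId, callback)` signature are hypothetical, and it assumes the hierarchy request returns the parent catIDs ordered from the immediate parent upwards.

```javascript
// Expands each ancestor in turn, from the top of the tree downwards,
// only moving on once the previous branch has finished loading.
function expandPath(parentIdsBottomUp, expandNode, done) {
  // The IDs arrive nearest-parent first, so reverse them to start at the top level.
  const path = parentIdsBottomUp.slice().reverse();

  function next(index) {
    if (index >= path.length) {
      done(); // all ancestors are open; the maincat can now be highlighted
      return;
    }
    // expandNode is expected to call the callback once the node's children have loaded.
    expandNode(path[index], function () {
      next(index + 1);
    });
  }

  next(0);
}
```

Waiting for each callback before expanding the next node matters because a child node doesn't exist in the tree until its parent's children have been loaded.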

I also made lots of other, smaller updates, namely: I fixed a gap that appeared under the arrow showing selected nav menu item; I ensured that the resizing of the tree / category section on narrow screens worked; I fixed the formatting of the random category ‘reload’ button; I updated the search page to remove the tabs and to add in proper PoS abbreviations to the parts of speech buttons; I reimplemented the ‘Jump to category number’ feature, and gave it the proper PoS abbreviations; I created one specific page for the tree, rather than there being a separate ‘browse’ and ‘search’ page.  The system highlights the appropriate nav menu item based on what the user is looking at.  I fully integrated the search with the tree view – you can go back to search results or clear your search and breadcrumbs now appear on the tree page.  The ‘Select category’ search results now display the proper PoS abbreviations and search word highlighting is now working in the tree.

I also added an ‘autocomplete’ function for the ‘label’ search – so now if you start typing in a label all of the labels in the system that match your text appear in a selectable list.  I also fixed the issue I mentioned last week whereby ‘T7’ categories looked like they had child categories and then gave an error when you tried to expand them.  I added the ‘cite’ popup to the tree view too, appearing beside each maincat and subcat.  I also upgraded the version of jQuery UI that the site uses and implemented a new ‘search’ option beside each word that opens a pop-up that allows the user to search for the word in the HT, the OED and TOE.

I added in a little function to display the HT version and added a call to this wherever the version is displayed, so that in future we will only have to update the version in one place.  I updated the tooltip styles to match the new colours Marc has been working on.  I fixed a bug whereby if a search word had a space in it the category pane failed to load in the tree view.  I slightly altered the category pane so there is no padding between the heading and the border and I tweaked the appearance of the PoS boxes on the search page and have made the ‘clear search boxes’ link into a button.  I also implemented a completely new structure for the ‘About’ pages, which included incorporating my Sparklines and also embedding Flickr photo galleries in the ‘photo gallery’ page rather than just linking out to that site.  I also added in a ‘top’ button that appears on screen when the user is mid-way down a page and updated the way dates are searched for in the advanced search.  There is now a ‘standard’ and ‘advanced’ date search option.  It has certainly been a productive week!

In addition to the above I spent about a day working on the poem database for The People’s Voice project, which I have now completed.  This included making some updates to the search facility and implementing the ‘view record’ page.  You can now click on a search / browse result to view the information about a poem.   I think all of the information about each poem is displayed, other than sound files as there are none in the system yet.  Currently the page is split into two panes.  The left-hand pane includes all of the information about the poem (authors, publication details, library details etc) while the right-hand pane contains the researcher’s comments.  Poem title and franchise appear above these panes.  Information that can be searched for appears as blue links (e.g. author, year of publication).  Clicking on one of these links performs a search for this information.

I also upgraded all of my WordPress sites to the latest version of WordPress that was recently released, fixed an issue Rob Maslen was having with his blog and met with Jennifer Smith, Gary Thoms and Niels Cadee from the library to discuss managing the research data for the SCOSYA project.

Next week I will continue to tweak the new HT website and I might have a go at updating the Thesaurus of Old English site as well, depending on what other things crop up.