Week Beginning 20th March 2017

I managed to make a good deal of progress with a number of different projects this week, which I’m pretty pleased about.  First of all there is the digital edition that I’m putting together for Bryony Randall’s ‘New Modernist Editing’ project.  Last week I completed the initial transcript of the short story and created a zoomable interface for browsing through the facsimiles.  This week I completed the transcription view, which allows the user to view the XML text, converted into HTML and styled using CSS.  It includes the notes and gaps and deletions but doesn’t differentiate between pencil and ink notes as of yet.  It doesn’t include the options to turn on / off features such as line breaks at this stage either, but it’s a start at least.  Below is a screenshot so you can see how things currently look.

The way I’ve transformed and styled the XML for display is perhaps a little unusual.  I wanted the site to be purely JavaScript powered – no server-side scripts or anything like that.  This is because the site will eventually be hosted elsewhere.  My plan was to use jQuery to pull in and process the XML for display, probably by means of an XSLT file.  But as I began to work on this I realised there was an even simpler way to do this.  With jQuery you can traverse an XML file in exactly the same way as an HTML file, so I simply pulled in the XML file, found the content of the relevant page and spat it out on screen.  I was expecting this to result in some horrible errors but… it just worked.  The XML and its tags get loaded into the HTML5 document and I can just style these using my CSS file.

I tested the site out in a variety of browsers and it works fine in everything other than Internet Explorer (Edge works, though).  This is because of the way jQuery loads the XML file and I’m hoping to find a solution to this.  I did have some nagging doubts about displaying the text in this way because I know that even though it all works it’s not valid HTML5.  Sticking a bunch of <lb>, <note> and other XML tags into an HTML page works now but there’s no guarantee this will continue to work and … well, it’s not ‘right’, is it?

I emailed the other Arts Developers to see what they thought of the situation and discussed some possible ways of handling things.  As I saw it there were four options.  I could leave things as they were.  I could use jQuery to transform the XML tags into valid HTML5 tags.  I could run my XML file through an XSLT file to convert it to HTML5 before adding it to the server, so that no transformation needs to be done on the fly.  Or I could see whether it’s possible to call an XSLT file from jQuery to transform the XML on the fly.  Graeme suggested that it would be possible to process an XSLT file using JavaScript (as is described here https://www.w3schools.com/xml/xsl_client.asp) so I started to investigate this.

I managed to get something working, but… I was reminded just how much I really dislike XSLT files.  Apologies to anyone who likes that kind of thing but my brain just finds them practically incomprehensible.  Doing even the simplest of things seems far too convoluted.  So I decided to just transform the XML into HTML5 using jQuery.  There are only a handful of tags that I need to deal with anyway.  All I do is find each occurrence of an XML tag, grab its contents, add a span after the element and then remove the element, e.g.:

$("del").each(function(){
    var content = "<span class=\"del\">"+$(this).html()+"</span>";
    $(this).after(content);
    $(this).remove();
});

I can even create a generic function that takes a tag name and spits out a span with that tag name as its class while removing the tag from the page.  When it comes to modifying the layout based on user preferences I’ll be able to handle that straightforwardly via jQuery too.  E.g. whether line breaks are on or off:

//line breaks
//on:
$(this).after("<br />");
//off:
$(this).after(" ");

For me at least this is a much easier approach than having to pass variables to an XSLT file.
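
As a rough illustration of the generic function idea, here is a minimal sketch that does the same tag-to-span swap on a markup string rather than on the live page, so it can be run outside a browser.  The names tagToSpan, applyLineBreaks and transcriptToHtml and the showLineBreaks flag are hypothetical and not taken from the project's code.  The regular expressions assume simple tags that are not nested inside themselves and that line breaks are empty <lb/> elements.

```javascript
// Hypothetical helper: replace every <tagName>...</tagName> pair in a markup
// string with <span class="tagName">...</span>.
// Assumes the tag is never nested inside another tag of the same name.
function tagToSpan(markup, tagName) {
  var pattern = new RegExp(
    "<" + tagName + "(?:\\s[^>]*)?>([\\s\\S]*?)</" + tagName + ">",
    "g"
  );
  return markup.replace(pattern, '<span class="' + tagName + '">$1</span>');
}

// Hypothetical helper: swap each empty <lb/> element for an HTML line break
// when line breaks are switched on, or for a single space when they are off.
function applyLineBreaks(markup, showLineBreaks) {
  var replacement = showLineBreaks ? "<br />" : " ";
  return markup.replace(/<lb(?:\s[^>]*)?\/>/g, replacement);
}

// Run both steps for a list of tag names (the list is an assumption).
function transcriptToHtml(markup, tagNames, showLineBreaks) {
  var html = applyLineBreaks(markup, showLineBreaks);
  tagNames.forEach(function (tagName) {
    html = tagToSpan(html, tagName);
  });
  return html;
}

var sample = "The <del>quick</del> fox<lb/>jumps <note>pencil</note>";
console.log(transcriptToHtml(sample, ["del", "note"], true));
```

In a browser the resulting string could be handed to jQuery's .html() on whatever element holds the transcription.  Working on the DOM as in the snippets above avoids the limitations of regular expressions, so this is only meant to show the shape of the logic.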

I spent a day or so working on the SCOSYA atlas as well and I have now managed to complete work on an initial version of the ‘my map data’ feature.  This feature lets you upload previously downloaded files to visualise the data on the atlas.

When you download a file now there is a new row at the top that includes the URL of the query that generated the file and some explanatory text.  You can add a title and a description for your data in columns D and E of the first row as well.  You can make changes to the rating data, for example deleting rows or changing ratings and then after you’ve saved your file you can upload it to the system.

You can do this through the ‘My Map Data’ section in the ‘Atlas Display Options’.  You can either drag and drop your file into the area or click to open a file browser.  An ‘Upload log’ displays any issues with your file that the system may encounter.  After upload your file will appear in the ‘previously uploaded files’ section and the atlas will automatically be populated with your data.  You can re-download your file by pressing on the ‘download map data’ button again and you can delete your uploaded file by pressing on the appropriate ‘Remove’ button.  You can switch between viewing different datasets by pressing on the ‘view’ button next to the title.  The following screenshot shows how this works:

I tested the feature out with a few datasets.  For example, I swapped the latitude and longitude columns round and the atlas dutifully displayed all of the data in the sea just north of Madagascar, so things do seem to be working.  There are a couple of things to note, though.  Firstly, the CSV download files currently do not include data that is below the query threshold, so no grey spots appear on the user maps.  We made a conscious decision to exclude this data but we might now want to reinstate it.  Secondly, the display of the map is very much dependent on the URL contained in row 1, column B of the CSV file.  This is how the atlas knows whether to display an ‘or’ map or an ‘and’ map, and what other limits were placed on the data.  If the spreadsheet is altered so that the data it contains does not conform to what is expected by the URL (e.g. different attributes are added or new ratings are given) then things might not display correctly.  Similarly, if anyone removes or alters that URL in the CSV file some unexpected behaviour might be encountered.
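
To make the dependence on the first row a bit more concrete, here is a minimal sketch of how an uploaded file's first row might be read and checked before anything is plotted.  The function names, the returned field names and the example.org URL are all hypothetical.  Only the column positions (URL in B, title in D, description in E) come from the description above, and the real upload code may well work quite differently.

```javascript
// Split one CSV line into cells, honouring double-quoted cells and "" escapes.
function splitCsvLine(line) {
  var cells = [];
  var cell = "";
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'; // escaped quote inside a quoted cell
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      cells.push(cell);
      cell = "";
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

// Hypothetical check of row 1: column B should hold the query URL, columns D
// and E an optional title and description. Assumes row 1 has no embedded
// line breaks inside quoted cells.
function readMapDataHeader(csvText) {
  var firstLine = csvText.split(/\r?\n/)[0];
  var cells = splitCsvLine(firstLine);
  var queryUrl = (cells[1] || "").trim();
  var log = [];
  if (!/^https?:\/\//.test(queryUrl)) {
    log.push("Row 1, column B does not contain a query URL.");
  }
  return {
    queryUrl: queryUrl,
    title: (cells[3] || "").trim(),
    description: (cells[4] || "").trim(),
    log: log
  };
}

// Made-up example file: the URL and cell contents are placeholders.
var header = readMapDataHeader(
  'Query,https://example.org/atlas?demo=1,,"My map","Ratings, edited by hand"\nlocation,lat,lng,rating\n'
);
console.log(header.title, header.log.length);
```

A check like this would only catch a missing URL.  It could not tell whether the edited rows still match what the URL describes, which is the harder problem noted above.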

Note also that ‘my map data’ is private – you can only view your data if you’re logged in.  This means you can’t share a URL with someone.  I still need to add ‘my map data’ to the ‘history’ feature and do a few other tweaks.  I’ve just realised trying to upload ‘questionnaire locations’ data results in an error, but I don’t think we need to include the option to upload this data.

I also started working on the new visualisations for the Historical Thesaurus that will be used for the Linguistic DNA project, based on the spreadsheet data that Marc has been working on.  We have data about how many new words appeared in each thematic heading in every decade since 1000 and we’re going to use this data to visualise changes in the language.  I started by reading through all of the documentation that Marc and Fraser had prepared about the data, and then I wrote some scripts to extract the data from Marc’s spreadsheet and insert it into our online database.  Marc had incorporated some ‘sparklines’ into his spreadsheet and my first task after getting the data available was to figure out a method to replicate these sparklines using the D3.js library.  Thankfully, someone had already done this for stock price data and had created a handy walkthrough of how to do it (see http://www.tnoda.com/blog/2013-12-19).  I followed the tutorial and adapted it for our data, writing a script that created sparklines for each of the almost 4000 thematic headings we have in the system and displayed them all on a page.  It’s a lot of data (stored in a 14MB JSON file) and as yet it’s static, so users can’t tweak the settings to see how this affects things, but it’s a good proof of concept.  You can see a small snippet from the gigantic list below:
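
The core of a sparkline is just a pair of linear scales, one mapping position in the series to x and one mapping the count to y.  The sketch below is a dependency-free approximation of that step, not the D3 code from the tutorial.  The function name sparklinePath, the width and height and the sample counts are all made up for illustration.

```javascript
// Build the "d" attribute for an SVG <path> from a list of per-decade counts.
// Hypothetical helper: scales the counts to fit a width x height box.
function sparklinePath(counts, width, height) {
  if (counts.length === 0) {
    return "";
  }
  var max = Math.max.apply(null, counts);
  var min = Math.min.apply(null, counts);
  var range = max - min || 1; // avoid dividing by zero for a flat series
  var step = counts.length > 1 ? width / (counts.length - 1) : 0;
  return counts
    .map(function (count, i) {
      var x = i * step;
      // SVG y grows downwards, so the largest count sits at y = 0.
      var y = height - ((count - min) / range) * height;
      return (i === 0 ? "M" : "L") + x.toFixed(1) + "," + y.toFixed(1);
    })
    .join(" ");
}

// Made-up counts for three decades, drawn in a 100 x 20 box.
console.log(sparklinePath([0, 5, 10], 100, 20));
```

The returned string can be set as the d attribute of an SVG path element.  D3's scale and line helpers do the same job with more options, such as curve interpolation.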

Other than these tasks I published this week’s new Burns song (see http://burnsc21.glasgow.ac.uk/braw-lads-on-yarrow-braes/) and I had a meeting with The People’s Voice project team where we discussed how the database of poems will function, what we’ll be doing about the transcriptions, and when I will start work on things.  It was a useful meeting and in addition to these points we identified a few enhancements I am going to make to the project’s content management system.  I also answered a query about some App development issues from elsewhere in the University and worked with Chris McGlashan to implement an Apache module that limits access to the pages held on the Historical Thesaurus server so as to prevent people from grabbing too much data.