Week Beginning 13th June 2016

The UCU called for further strike action on Thursday this week, and I participated in this, meaning it was a four-day week for me.  I spent a lot of the week continuing with the redevelopment of the old STELLA resource ‘The Basics of English Metre’.  By the end of last week I had completed 12 of the 13 pages of Unit 1 of the app.  The thirteenth page proved to be rather tricky and led me to change the way I had been handling some types of exercise.  The exercises on this page are three-stage exercises.  For the first stage the user must identify syllable boundaries.  For the second stage the metre needs to be added, while for the third stage the rhythm must be added in.  The second and third stages are very similar in their aims (putting a cross or a dash in a box above or below a syllable) and I wanted to ensure that I used the same bit of code for both stages.  Previously I had been including a lot of the formatting for the questions within the JSON file itself – the ‘div’ for the syllable boundary, the box where the user tapped etc.  I decided that it would make much more sense to only include the actual syllable and to add all of the formatting in programmatically when the question was pulled into the script for processing.  It took a bit of time to rework the JavaScript to add all of this in, but it has reduced the size of the JSON file considerably and makes it easier to read and to update in future.
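
As a rough illustration of this approach, the sketch below keeps only the syllables in the question data and builds the markup when the question is loaded.  The data shape, the class names and the function name `renderSyllables` are all hypothetical and are not taken from the actual app; the sketch also assumes the syllables are plain text that needs no HTML escaping.

```javascript
// Hypothetical question data: only the syllables are stored, no markup.
const question = {
  id: "q1",
  syllables: ["Shall", "I", "com", "pare", "thee"]
};

// Build the markup for one question when it is loaded.
// Each syllable gets an (assumed) tappable box above its text.
function renderSyllables(q) {
  const cells = q.syllables.map((syllable, index) =>
    '<div class="syllable" data-index="' + index + '">' +
      '<span class="box"></span>' +
      '<span class="text">' + syllable + '</span>' +
    '</div>'
  );
  return '<div class="question" id="' + q.id + '">' + cells.join("") + '</div>';
}
```

Because the boxes and wrappers are generated in one place, a change to the layout would only need to be made in the function rather than in every question in the JSON file.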

Previously when a new stage was loaded for a question this loaded in a new block of JavaScript that replaced the previous block – so the functions that handled what to do when the ‘check answer’ button was pressed were replaced dynamically.  However, I realised that this approach was not going to work when there are multiple questions on a page.  If the user moved one question to the next stage then all of the logic associated with the first stage was replaced.  This meant that if the user then tried to check the answers for another question that was still on the previous stage, things broke.  I reworked the functions to allow for the loading in of different stages for a specific question and ensured that the buttons for checking answers etc took the individual question’s current stage into consideration before processing.  I’ve had to rework (and re-test) a lot of the functionality associated with the exercises, but things are a lot more streamlined now and will work better with the multi-stage exercises that are found in Units 2 and 3.  I also started to work on Unit 2, getting as far as the sections on foot boundaries.
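
The idea of tracking the stage per question can be sketched roughly as follows.  This is only a minimal illustration, assuming each question has an id and that answers can be compared as simple strings; the names `stageCheckers`, `currentStage` and `checkAnswer` and the question fields are hypothetical rather than the app's real code.

```javascript
// Hypothetical checkers, one per stage; each returns true for a correct answer.
const stageCheckers = {
  1: (question, answer) => answer.join("|") === question.syllables.join("|"),
  2: (question, answer) => answer === question.metre,
  3: (question, answer) => answer === question.rhythm
};

// Current stage of each question on the page, keyed by question id.
const currentStage = {};

// Check an answer using the logic for that question's own stage.
function checkAnswer(question, answer) {
  const stage = currentStage[question.id] || 1;
  const correct = stageCheckers[stage](question, answer);
  // Only the question that was answered moves on; the others keep their stage.
  if (correct && stage < 3) {
    currentStage[question.id] = stage + 1;
  }
  return correct;
}
```

With this arrangement no stage logic is ever replaced: all three checkers stay loaded and the lookup in `currentStage` decides which one applies to the question whose button was pressed.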

I spent about a day this week working with the Hansard data again.  By Friday morning the frequencies database contained 358,408,449 rows, with just under half of the data processed.  However, I’m going to have to go back to square one again as I’ve noticed an inconsistency with the data.  I had split the base64 encoded data from Lancaster up into about 1200 separate files and I noticed on Friday that up until about midway through the 49th file the metadata has the following structure:

Commons 2000-2005/2004/sep/16/S6CV0424P1_20040916_HOC_99.xml

But then after that the structure changes as follows:

commons 1803-1820/commons/1805/mar/01/S1V0003P0-01431.xml

That extra /commons/ in there messed up the part of my script that split this information up and led to the loss of the actual filename from my processed data.  It meant that I had to re-run everything through the grid again, wipe the database and re-run the insertion jobs again.

I returned to my original shell script that extracted the Base64 data and reworked it to add in some checks for the structure of the data.  I also added in some error checking to ensure that if (for example) the ‘year’ field doesn’t contain a number an error is raised.  I also took the opportunity to update the SQL statements that were generated, firstly to add in the all-important semi-colon delimiting character that I had missed out first time around and secondly to make the insert statements standard SQL rather than the MySQL-specific syntax that I’ve tended to use in the past.  The standard way is ‘insert into table(column1, column2) values(‘value1’, ‘value2’);’ while MySQL also allows ‘insert into table set column1 = ‘value1’, column2 = ‘value2’’.  Having updated and tested out the script I then submitted a new batch of jobs to ScotGrid, and the output files seemed to work well with both possible metadata structures.  I submitted all of the 1200-odd files to run over the weekend.
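
To show the kind of checks involved, here is a minimal sketch written in JavaScript purely for illustration (the actual extraction was done in a shell script, which is not reproduced here).  It assumes the year, month, day and filename are always the last four parts of the path, so the extra /commons/ folder makes no difference; the function names, the table name and the column names are hypothetical.

```javascript
// Split a metadata path into its parts, counting from the end so that
// an extra folder such as /commons/ in the middle is harmless.
function parseMetadata(path) {
  const parts = path.split("/");
  if (parts.length < 5) {
    throw new Error("Unexpected metadata structure: " + path);
  }
  const [year, month, day, filename] = parts.slice(-4);
  // Raise an error if the year or day field does not contain a number.
  if (!/^\d{4}$/.test(year)) {
    throw new Error("Year is not a number: " + path);
  }
  if (!/^\d{1,2}$/.test(day)) {
    throw new Error("Day is not a number: " + path);
  }
  return { year: Number(year), month: month, day: Number(day), filename: filename };
}

// Build a standard SQL insert statement, ending with a semi-colon.
// Single quotes inside string values are doubled to escape them.
function toInsert(table, row) {
  const columns = Object.keys(row);
  const values = columns.map((column) => {
    const value = row[column];
    return typeof value === "number"
      ? String(value)
      : "'" + String(value).replace(/'/g, "''") + "'";
  });
  return "INSERT INTO " + table + " (" + columns.join(", ") + ") VALUES (" + values.join(", ") + ");";
}
```

Passing either of the two example paths above to `parseMetadata` should give back the correct year and filename, and the result can then be handed to `toInsert` to produce one statement per row.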

In addition to the above work I did a few other tasks.  I met with Jane Stuart Smith to discuss a couple of upcoming projects she’s putting together, plus I gave her some further input into the project I advised her on last week.  I also upgraded the WordPress installations for a number of sites that I’ve set up over the years as Chris had pointed out that they were running older versions of the software.  I was also supposed to meet Flora on Friday to discuss the issue relating to the H27 categories for the Old English data for Mapping Metaphor, but unfortunately Flora was ill and we weren’t able to meet.  Hopefully we can fit this in next week.