Week Beginning 17th August 2015

I had taken two days of annual leave this week so it was a three-day week for me. I still managed to pack quite a lot into three days, however. I had a long meeting with Fraser on Monday to discuss future updates to the HTE using new data from the OED. We went through some sample data that the OED people had sent and figured out how we would go about checking which parts of our data would need updating (mostly dates and some new words added to categories). We also discussed the Hansard visualisations and the Highcharts example I thought I would be able to get working with the data.

I spent about a day working on the Highcharts visualisations, which included creating a PHP script that queried my two years of sample data for any thematic code passed to it, bundled up usage of the code by day and then output the data as JSON. This got the information into the format Highcharts requires (date / frequency key / value pairs) but the script itself was horribly slow due to the volume of data being queried: my two-year sample has more than 13 million rows. So rather than connecting the chart directly to my PHP script I decided to cache the output as static JSON files. I think this is the approach we will have to take with the final visualisations too if we want them to be usable. Plugging the data into Highcharts didn’t work initially, and I finally realised that this was because the timestamps Highcharts uses are not standard Unix timestamps (the number of seconds since 1970) but JavaScript timestamps, which use milliseconds instead of seconds. Adding three zeros onto the end of my timestamps (that is, multiplying them by 1,000) did the trick, and after some tweaking of axes and layout options I managed to get a nice time-based graph that plotted the usage of two thematic categories over two years. It worked very well and I’m confident I’ll be able to extend this out both to the full dataset and with limiting options (e.g. speaker).
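To make the bundling and timestamp conversion concrete, here is a minimal sketch in TypeScript. It is not the PHP script itself: the `UsageRow` shape, the function name and the sample rows are all made up for illustration, and it assumes each row carries a Unix timestamp in seconds and that days are bucketed in UTC.

```typescript
// Hypothetical row shape: one record per use of a thematic code.
interface UsageRow {
  unixSeconds: number; // Unix timestamp in seconds
}

const SECONDS_PER_DAY = 86400;

// Bundle rows by UTC day and return [millisecondTimestamp, frequency] pairs
// sorted by date, the kind of pairs a datetime chart axis can plot.
function toDailySeries(rows: UsageRow[]): [number, number][] {
  const counts: { [dayStart: number]: number } = {};
  for (const row of rows) {
    // Round down to midnight UTC of the day the row falls on.
    const dayStart =
      Math.floor(row.unixSeconds / SECONDS_PER_DAY) * SECONDS_PER_DAY;
    counts[dayStart] = (counts[dayStart] || 0) + 1;
  }
  return Object.keys(counts)
    .map(Number)
    .sort((a, b) => a - b)
    // Seconds to milliseconds: the "three extra zeros" step.
    .map((seconds): [number, number] => [seconds * 1000, counts[seconds]]);
}

// Made-up sample rows, not real Hansard data.
const sample: UsageRow[] = [
  { unixSeconds: 1439769600 }, // 2015-08-17 00:00:00 UTC
  { unixSeconds: 1439773200 }, // 2015-08-17 01:00:00 UTC
  { unixSeconds: 1439856000 }, // 2015-08-18 00:00:00 UTC
];

const series = toDailySeries(sample);

// The string that would be written out once as a static JSON file,
// so the chart never has to wait for the slow query.
const cachedJson = JSON.stringify(series);
```

The point of caching is that the expensive grouping runs once per thematic code, and the chart only ever reads the small pre-built file.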

I had to deal with some further Apple Developer Program issues this week, which took up a little time. I also continued to work on the Scots Thesaurus project. First up was investigating why the visualisations weren’t working in IE9, which is what Magda has on her office PC. I had thought that this might be caused by compatibility mode being turned on for University sites, but this wasn’t actually the case. I was rather stumped for a while as to what the problem was but I managed to find a solution. The problem seems to be with how older versions of IE pull in data from a server after a page has loaded. When the visualisation loads, JavaScript connects to the server to pull in data behind the scenes. The method I was using to do this should wait until it receives the data before it processes things, but in older versions of IE it doesn’t wait, meaning that the script attempts to generate the visualisation before it has any data to visualise! I switched to an alternative method that does wait properly in older versions of IE. I’ve tested this out in IE on my PC, which I’ve figured out I can set to emulate IE9. Before the change, with it set to IE9, I was getting the ‘no data’ error. After changing the method the visualisation loads successfully.
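The general shape of the fix can be sketched as follows. This is only an illustration of the "don't draw until the data has arrived" pattern, not the actual method used on the site: the `Loader` type, both function names and the sample category are hypothetical, and the stand-in loader replaces a real server request so the timing can be controlled by hand.

```typescript
type Category = { name: string; words: string[] };

// A loader hands the data to a callback once it has arrived.
type Loader = (onLoaded: (data: Category[]) => void) => void;

// Drawing is started only from inside the callback, so it cannot run
// before the data exists, however long the request takes.
function drawWhenLoaded(
  load: Loader,
  draw: (data: Category[]) => string
): { output: string } {
  const state = { output: "no data" }; // placeholder until data arrives
  load((data) => {
    state.output = draw(data);
  });
  return state;
}

// Stand-in for a slow server (hypothetical): the data is delivered
// only when release() is called.
function makeDeferredLoader(
  data: Category[]
): { load: Loader; release: () => void } {
  let pending: ((data: Category[]) => void) | null = null;
  return {
    load: (onLoaded) => {
      pending = onLoaded;
    },
    release: () => {
      if (pending !== null) {
        pending(data);
      }
    },
  };
}

// Made-up sample category for illustration.
const deferred = makeDeferredLoader([
  { name: "Weather", words: ["dreich", "haar"] },
]);
const chartState = drawWhenLoaded(deferred.load, (data) =>
  data.map((c) => `${c.name}: ${c.words.length}`).join(", ")
);

const outputBeforeResponse = chartState.output; // nothing drawn yet
deferred.release(); // the "response" arrives
const outputAfterResponse = chartState.output; // drawn from real data
```

The failure described above corresponds to calling `draw` straight after `load` rather than inside the callback, which is why the visualisation reported that it had no data.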

After fixing this issue I continued to work with the visualisations. I added in an option to show or hide the words in a category, as the ‘infobox’ was taking up quite a lot of space when viewing a category that contains a lot of words. I also developed a first version of the part of speech selector. This displays the available parts of speech as checkboxes above the visualisation and allows the user to select which parts to view. Ticking or unticking a box automatically updates the visualisation. The feature is still unfinished and there are some aspects that need sorted. For example, the listed parts of speech only show those that are present at the current level in the hierarchy, but as things stand there is sometimes a broader range of parts lower down the hierarchy and these are not available to choose until the user browses down to the lower level. I’m still uncertain as to whether multiple parts of speech in one visualisation is going to work very well and whether a simpler switch from one part to another might work better, but we’ll see how it goes.
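The difference between the parts of speech at the current level and those further down can be sketched like this. The `CategoryNode` shape, the function names, the part of speech codes and the tiny sample tree are all assumptions made for illustration, not the project's real data model.

```typescript
interface CategoryNode {
  name: string;
  pos: string; // part of speech code, e.g. "n" or "v" (illustrative)
  children: CategoryNode[];
}

// Parts of speech found among the direct children only: what a
// selector limited to the current level would list.
function posAtLevel(node: CategoryNode): string[] {
  const seen: string[] = [];
  for (const child of node.children) {
    if (seen.indexOf(child.pos) === -1) seen.push(child.pos);
  }
  return seen.sort();
}

// Parts of speech found anywhere below the node, which can be a
// broader range than the current level offers.
function posInSubtree(node: CategoryNode): string[] {
  const seen: string[] = [];
  const visit = (n: CategoryNode): void => {
    for (const child of n.children) {
      if (seen.indexOf(child.pos) === -1) seen.push(child.pos);
      visit(child);
    }
  };
  visit(node);
  return seen.sort();
}

// Children that stay visible for a given set of ticked checkboxes.
function visibleChildren(
  node: CategoryNode,
  ticked: string[]
): CategoryNode[] {
  return node.children.filter((c) => ticked.indexOf(c.pos) !== -1);
}

// Made-up sample tree: a verb category sits one level below the nouns.
const weatherTree: CategoryNode = {
  name: "Weather",
  pos: "n",
  children: [
    {
      name: "Rain",
      pos: "n",
      children: [{ name: "To rain", pos: "v", children: [] }],
    },
    { name: "Wind", pos: "n", children: [] },
  ],
};

const levelPos = posAtLevel(weatherTree);
const subtreePos = posInSubtree(weatherTree);
const nounsOnly = visibleChildren(weatherTree, ["n"]);
```

In this sample the current level only offers nouns while the subtree also contains a verb, which is the gap described above. Building the checkbox list from the subtree would be one possible way to close it.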

I also spent a bit of time on the Medical Humanities Network website, continuing to add new features to it, and I set up a conference website for Sean Adams in Theology. This is another WordPress-powered site but Sean wanted it to look like the University website. A UoG-esque theme for WordPress had been created a few years ago by Dave Beavan and then subsequently tweaked by Matt Barr, but the theme was rather out of date and didn’t look exactly like the current University website, so I spent some time updating the theme, which will probably see some use on other websites too. This one, for example.