
Month: August 2015
Week Beginning 24th August 2015
It was another short week for me this week, as I was off on Monday, Tuesday and Wednesday. I had to spend a little time whilst I was off fixing a problem with one of our WordPress sites. Daria alerted me to the fact that the ISAS conference website was displaying nothing but a database connection error, which is obviously a fairly major problem. It turned out that one of the underlying database tables had become corrupted, but thankfully WordPress includes some tools to fix such issues and after figuring that out I managed to get the site up and running again. I’m not sure what caused the problem but hopefully it won’t happen again.
I attended to some further WordPress duties later on in the week when I was back at work. I’ve set up a conference website for Sean Adams in Theology and he was wanting a registration page to be set up. I was supposed to meet with his RA on Thursday to discuss the website but unfortunately she was ill, so I just had to get things set up without any formal discussions. I investigated a couple of event management plugins for WordPress, but the ones I tried seemed a bit too big and clunky for what we need. All we need is a single registration page for one event, but the plugins provide facilities to publish multiple events, manage payments, different ticket types and all of this stuff. It was all far too complicated, yet at the same time it seemed rather difficult to customise some fairly obvious things such as which fields are included in the registration form. After trying two plugins and being dissatisfied with both of them I just settled for using a contact form that emails Sean and the RA whenever someone registers. It’s not the ideal setup but for a relatively small event that we have very little time to get things set up for it should work out ok.
I had some further AHRC review duties to take care of this week, which took up the best part of one of my available days. I also had some more iOS developer account management issues to take care of, which also took up some time. Some people elsewhere in the University are wanting to upload a paid app to the App Store, but in order to do this a further contract needs to be signed with Apple, and this needs some kind of legal approval from the University before we agree to it. I had a couple of telephone conversations with a lawyer working on behalf of the University about the contracts for Apple and also for the Google Play store. I also had email conversations with Megan Coyer and Gavin Miller about the development of their respective online resources and spoke to Magda and Susan regarding some Scots Thesaurus issues.
On Friday morning I had a meeting with Gerry Carruthers and Catriona MacDonald to discuss their ‘People’s Voice’ project, which is due to start in January and for which I had been assigned two months of effort. We had a really useful meeting, going over some initial requirements for the database of songs and poems that they want to put together and thinking about how their anthology of songs will be marked up and managed. We also agreed that Mark Herraghty would do the bulk of the development work for the project. Mark knows a lot more about XML markup than I do which makes him a perfect fit for the project so this is really good news.
This was all I managed to squeeze into my two days of work this week. I didn’t get a chance to do any real development work but hopefully next week I’ll be able to get back stuck into it.
Week Beginning 17th August 2015
I had taken two days of annual leave this week so it was a three-day week for me. I still managed to pack quite a lot into three days, however. I had a long meeting with Fraser on Monday to discuss future updates to the HTE using new data from the OED. We went through some sample data that the OED people had sent and figured out how we would go about checking which parts of our data would need updating (mostly dates and some new words added to categories). We also discussed the Hansard visualisations and the highcharts example I thought I would be able to get working with the data. I spent about a day working on the highcharts visualisations, which included creating a PHP script that queried my two years of sample data for any thematic code passed to it, bundling up usage of the code by day and then spitting out the data in the JSON format. This sort of got the information into the format highcharts required (date / frequency key / value pairs) but the script itself was horribly slow due to the volume of data that is being queried. My two year sample has more than 13 million rows. So rather than connecting the chart directly to my PHP script I decided to cache the output as static JSON files. I think this is the approach we will have to take with the final visualisations too if we want them to be usable. Plugging the data into highcharts didn’t work initially, and I finally realised that this was because the timestamps highcharts uses are not just standard unix timestamps (the number of seconds since 1970) but Javascript timestamps, which use milliseconds instead of seconds. Adding three zeros onto the end of my timestamps did the trick and after some tweaking of axes and layout options I managed to get a nice time-based graph that plotted the usage of two thematic categories over two years. It worked very well and I’m confident I’ll be able to extend this out both to the full dataset and with limiting options (e.g. speaker).
I had to deal with some further Apple Developer Program issues this week, which took up a little time. I also continued to work on the Scots Thesaurus project. First up was investigating why the visualisations weren’t working in IE9, which is what Magda has on her office PC. I had thought that this might be caused by compatibility mode being turned on for University sites, but this wasn’t actually the case. I was rather stumped for a while as to what the problem was but I managed to find a solution. The problem seems to be with how older versions of IE pull in data from a server after a page has loaded. When the visualisation loads, Javascript is connecting to the server to pull in data behind the scenes. The method I was using to do this should wait until it receives the data before it processes things, but in older versions of IE it doesn’t wait, meaning that the script attempts to generate the visualisation before it has any data to visualise! I switched to an alternative method that does wait properly in older versions of IE. I’ve tested this out in IE on my PC, which I’ve figured out I can set to emulate IE9. Before the change, with it set to IE9 I was getting the ‘no data’ error. After changing the method the visualisation loads successfully.
After fixing this issue I continued to work with the visualisations. I added in an option to show or hide the words in a category, as the ‘infobox’ was taking up quite a lot of space when viewing a category that contains a lot of words. I also developed a first version of the part of speech selector. This displays the available parts of speech as checkboxes above the visualization and allows the user to select which parts to view. Ticking or unticking a box automatically updates the visualization. The feature is still unfinished and there are some aspects that need sorted, for example the listed parts of speech only show those that are present at the current level in the hierarchy but as things stand there are sometimes a broader range of parts lower down the hierarchy and these are not available to choose until the user browses down to the lower level. I’m still uncertain as to whether multiple parts of speech in one visualisation is going to work very well and whether a simpler switch from one part to another might work better, but we’ll see how it goes.
I also spent a bit of time on the Medical Humanities Network website, continuing to add new features to it and I set up a conference website for Sean Adams in Theology. This is another WordPress powered site but Sean wanted it to look like the University website. A UoG-esque theme for WordPress had been created a few years ago by Dave Beavan and then subsequently tweaked by Matt Barr, but the theme was rather out of date and didn’t look exactly like the current University website so I spent some time updating the theme, which will probably see some use on other websites too. This one, for example.
Week Beginning 10th August 2015
This week I continued to work on the projects I’d started work on again last week after launching the three Old English resources. For the Science Fiction and the Medical Humanities project I completed a first draft of all of the management scripts that are required for managing the bibliographic data that will be published through the website. It is now possible to manage all of the information relating to bibliographical items through the WordPress interface, including adding, editing and deleting mediums, themes, people, places and organisations. The only thing it isn’t possible to do is to update the list of options that appear in the ‘connection’ drop-down lists when associating people, places and organisations. But I can very easily update these lists directly through the database and the new information then appears wherever it is required so this isn’t going to be a problem.
Continuing on the Medical Humanities theme, I spent about a day this week starting work on the new Medical Humanities network website and content management system for Megan Coyer. This system is going to be an adaptation of the existing Digital Humanities network system. Most of my time was spent on ‘back end’ stuff like setting up the underlying database, password protecting the subdomain until we’re ready to ‘go live’ and configuring the template scripts. The homepage is in place (but without any content), it is possible to log into the system and the navigation menu is set up, but no other pages are currently in place. I spent a bit of time tidying up the interface, for example adding in more modern looking up and down arrows to the ‘log in’ box, tweaking the breadcrumb layout and updating the way links are styled to bring things more into line with the main University site.
I also spent a bit of time advising staff and undertaking some administrative work. Rhona Brown asked me for some advice on the project she is putting together and it took a little time to formulate a response to her. I was also asked by Wendy and Nikki to complete a staff time allocation survey for them, which also took a bit of time to go through. I also had an email from Adam Zachary Wyner in Aberdeen about a workshop he is putting together and I gave him a couple of suggestions about possible Glasgow participants. I’m also in the process of setting up a conference website for Sean Adams in Theology and have been liaising with the RA who is working on this with him.
Other than these matters the rest of my week was spent on two projects, the Scots Thesaurus and SAMUELS. For the Scots Thesaurus I continued to work on the visualizations. Last week I adapted an earlier visualization I had created to make it ‘dynamic’ – i.e the contents change depending on variables passed to it by the user. This week I set about integrating this with the WordPress interface. I had initially intended to make the visualisations available as a separate tab within the main page. E.g. the standard ‘browse’ interface would be available and by clicking on the visualization tab this would be replaced in page by the visualization interface. However, I realized that this approach wasn’t really going to work due to the limited screen space that we have available within the WordPress interface. As we are using a side panel the amount of usable space is actually quite limited and for the visualizations we need as much screen width as possible. I decided therefore to place the visualizations within a jQuery modal dialog box which takes up 90% of the screen width and height and have provided a button from the normal browse view to open this. When clicked on the visualization now loads in the dialog box, showing the current category in the centre and the full hierarchy from this point downwards spreading out around it. Previously the contents of a category were displayed in a pop-up when the user clicked on a category in the visualization, but this wasn’t ideal as it obscured the visualization itself. Instead I created an ‘infobox’ that appears to the right of the visualization and I’ve set this up so that it lists the contents of the selected category, including words, sources, links through to the DSL and facilities to centre the visualization on the currently selected category or to browse up the hierarchy if the central node is selected. The final thing I added in was highlighting of the currently selected node in the visualization and facilities to switch back to the textual browse option at the point at which the user is viewing the visualization. There is still some work to be done on the visualizations, for example adding in the part of speech browser, sorting out the layout and ideally providing some sort of animations between views, but things are coming along nicely.
For SAMUELS I continued to work on the visualizations of the Hansard data. Unfortunately it looks like I’m unable to make any further progress with Bookworm. I’ve spent several days trying to get the various parts of the Bookworm system to communicate with each other using the sample ‘congress’ data but the API component is returning errors that I just can’t get to the bottom of. I have BookwormDB (https://github.com/Bookworm-project/BookwormDB) set up and the congress data appears to have been successfully ingested. I have installed the API (https://github.com/Bookworm-project/BookwormAPI) and it is executing and apparently successfully connecting to the database. This page http://bookworm-project.github.io/Docs/API.html says the API should be able to query the database to return the possible fields I can run such a query successfully on my test server. I have installed the BookwormGUI (https://github.com/Bookworm-project/BookwormGUI), but the javascript in the front end just doesn’t seem to be able to pass a valid query to the API. I added in an ‘alert’ that pops up to display the query that gets passed to the API, but running this through the API just gives Python errors. I’ve tried following the API guidelines on query structure (http://bookworm-project.github.io/Docs/query_structure.html) in order to create a simple, valid query but nothing I’ve tried has worked. The Python errors seem to suggest that the API is having some difficulty connecting to the database (there’s an error ‘connect() argument 12 must be string, not None’) but I don’t know enough about Python to debug this problem. Plus I don’t understand how the API can connect to the database to successfully query the possible fields but then fail to connect for other query types. It’s infuriating. Without access to a friendly Python expert I’m afraid it’s looking like we’re stuck.
However, I have figured out that BookwormGUI is based around the Highcharts.js library (see http://www.highcharts.com/demo/line-ajax) and I’m wondering now whether I can just use this library to connect to the Hansard data instead of trying to get Bookworm working, possibly borrowing some of the BookwormGUI code for handling the ‘limit by’ options and the ‘zoom in’ functionality (which I haven’t been able to find in the highcharts examples). I’m going to try this with the two years of Hansard data that I previously managed to extract, specifically this visualisation style: http://www.highcharts.com/stock/demo/compare. If I can get it to work the timeslider along the bottom would work really nicely.
Week Beginning 3rd August 2015
The ISAS (International Society of Anglo-Saxonists) conference took place this week and two projects I have been working on over the past few weeks were launched at this event. The first was A Thesaurus of Old English (http://oldenglishthesaurus.arts.gla.ac.uk/), which went live on Monday. As is usual with these things there were some last minute changes and additions that needed to be made, but overall the launch went very smoothly and I’m particularly pleased with how the ‘search for word in other online resources’ feature works.
The second project that launched was the Old English Metaphor Map (http://mappingmetaphor.arts.gla.ac.uk/old-english/). We were due to launch this on Thursday but due to illness the launch was bumped up to Tuesday instead. Thankfully I had completed everything that needed sorting out before Tuesday so making the resource live was a very straightforward process. I think the map is looking pretty good and it complements the main site nicely.
With these two projects out of the way I had to spend about a day this week on AHRC duties, but once all that was done I could breathe a bit of a sigh of relief and get on with some other projects that I haven’t been able to devote much time to recently due to other commitments. The first of these was Gavin Miller’s Science Fiction and the Medical Humanities project. I’m developing a WordPress based tool for his project to manage a database of sources and this week I continued adding functionality to this tool as follows:
- I removed the error messages that were appearing when there weren’t any errors
- I’ve replaced ‘publisher’ with a new entity named ‘organisation’. This allows the connection the organisation has with the item (e.g. Publisher, Studio) to be selected in the same way as connections to items from places and people are handled.
- I’ve updated the way in which these connections are pulled out of the database to make it much easier to add new connection types. After adding a new connection type to the database this then immediately appears as a selectable option in all relevant places in the system.
- I’ve updated the underlying database so that data can have an ‘active’ or ‘deleted’ state, which will allow entities like people and places to be ‘deleted’ via WordPress but still retained in the underlying database in case they need to be reinstated.
- I’ve begun work on the pages that will allow the management of types and mediums, themes, people, places and organisations. Currently there are new menu items that provide options to list these data types. The lists also include counts of the number of bibliographic items each row is connected to. The next step will be to add in facilities to allow admin users to edit, delete and create types, mediums, themes, people, places and organisations.
The next project I worked on was the Scots Thesaurus project. Magda has emailed me stating she was having problems uploading words via CSV files and also assigning category numbers. I met with Magda on Thursday to discuss these issues and to try and figure out what was going wrong. The CSV issue was being caused by the CSV files created by Excel on Magda’s PC being given a rather unexpected MIME type. The upload script was checking the uploaded file for specific CSV MIME types but Excel was giving them a MIME type of ‘application/vnd.ms-excel’. I have no idea why this was happening, and even more strangely, when Magda emailed me one of her files and I uploaded it on my PC (without re-saving the file) it uploaded fine. I didn’t really get to the bottom of this problem, but instead I simply fixed it by allowing files of MIME type ‘application/vnd.ms-excel’ to be accepted.
The issue with certain category numbers not saving was being caused by deleted rows in the system. When creating a new category the system checks to see if there is already a row with the supplied number and part of speech in the system. If there is then the upload fails. However, the check wasn’t taking into consideration categories that had been deleted from within WordPress. These rows were being marked as ‘trash’ in WordPress but still existed in our non-Wordpress ‘category’ table. I updated the check to link up the category table to WordPress’s posts table to check the status of the category there. Now if a category number exists but it’s associated with a WordPress post that is marked as deleted then the upload of a new row can proceed without any problems.
In addition to fixing these issues I also continued working on the visualisations for the Scots Thesaurus. Magda will be presenting the thesaurus at a conference next week and she was hoping to be able to show some visualisations of the weather data. We had previously agreed at a meeting with Susan that I would continue to work on the static visualisation I had made for the ‘Golf’ data using the d3.js ‘node-link tree’ diagram type (see http://bl.ocks.org/mbostock/4063550). I would make this ‘dynamic’ (i.e. it would work with any data passed to it from the database and it would be possible to update the central node). Eventually we may choose a completely different visualisation approach but this is the one we will focus on for now. I spent some time adapting my ‘Golf’ visualisation to work with any thesaurus data passed to it – simply give it a category ID and a part of speech and the thesaurus structure (including subcategories) from this point downwards gets displayed. There’s still a lot of work to do on this (e.g. integrating it within WordPress) but I happy with the progress I’m making with it.
The last project I worked on this week was the SAMUELS Hansard data, or more specifically trying to get Bookworm set up on the test server I have access to. Previously I had managed to get the underlying database working and the test data (US Congress) installed. I had then installed the Bookworm API but I was having difficulty getting Python scripts to execute. I’m happy to report that I got to the bottom of this. After reading this post (https://www.linux.com/community/blogs/129-servers/757148-configuring-apache2-to-run-python-scripts) I realised that I had not enabled the CGI module of Apache, so even though the cgi-bin directory was now web accessible nothing was getting executed there. The second thing I realised was that I’d installed the API in a subdirectory within cg-bin and I needed to add privileges in the Apache configuration file for this subdirectory as well as the parent directory. With that out of the way I could query the API from a web browser, which was quite a relief.
After this I installed the Bookworm GUI code, which connects to the API in order to retrieve data from the database. I still haven’t managed to get this working successfully. The page surroundings load but the connection to the data just isn’t working. One reason why this is the case is because I’d installed the API in a subdirectory of the cgi-bin, but even after updating every place in the Javascript where the API is called I was still getting errors. The AJAX call is definitely connecting to the API as I’m getting a bunch of Python errors returned instead of data. I’ll need to further investigate this next week.
Also this week I had a meeting with Gary Thomas about Jennifer Smith’s Syntactic Atlas of Scots project. Gary is the RA on the project and we met on Thursday to discuss how we should get the technical aspects of the project off the ground. It was a really useful meeting and we already have some ideas about how things will be managed. We’re not going to get started on this until next month, though, due to the availability of the project staff.
Week Beginning 27th July 2015
It was another four-day week for me this week as I’d taken the Friday off to go away for a long weekend. The timing of this wasn’t all that great as it was another very busy week getting everything ready for the ISAS conference next week. However, I managed to get everything that needed to be done completed by skipping a few lunch breaks and working late a bit.
The first of three Old English related projects that I completed this week was the Essentials of Old English app. I’d actually completed this a couple of weeks ago but it takes some time for apps to be approved by Apple before they get added to the App Store, and thankfully I received the confirmation email this week that the app was up there. The Android version had already activated itself last week, and this is the first of the STELLA apps that is available on both platforms, which is rather nice. You can download the app for free from the Apple App Store and the Google Play store now. Search the stores for ‘Old English’ to find them. The browser based version of the resource that I also created, which also features links to the App / Play store, is available here:
http://www.arts.gla.ac.uk/stella/apps/web/eoe/
I’ve also updated the STELLA homepage (http://www.arts.gla.ac.uk/stella/) to give it the University interface and to divide the four resources that I’ve redeveloped so far from the ‘legacy’ resources that still need updating. I really need to make Android versions of the other three STELLA apps, but I don’t have the time to create these at the moment. Hopefully soon though.
The second Old English project was the redeveloped Thesaurus of Old English website, which I’ve been working on over the past few weeks with Fraser. We met with Marc to discuss the outstanding tasks on Monday and we also agreed on a colour scheme to use (greens). Fraser sent me all of the ancillary page text during the week and I added all of this too. Some of the pages took quite a while to mark up as they were full of footnotes and italic or bold text, but it’s great to have all of the text in place. I also implemented a very handy little pop-up box that allows users to search for a word in another external resource. The main HT site provides links to the OED, but for TOE we now provide a dialog box with several external search options, including the HT and the Bowsorth-Toller Anglo Saxon Dictionary. Clicking on one of the links automatically searches the selected resource for the word in question, which is rather nice. I had some problem linking through to the Dictionary of Old English at the University of Toronto as their search page uses POST not GET, meaning a simple link through will not work. Instead there’s actually a POST form behind the link and a bit of jQuery magic in the background actually submits the form when a user clicks on the link. Marc encountered a problem with it when ‘middle clicking’ to open the link in a new tab, and I realised that this was caused by jQuery’s ‘click’ function only picking up left clicks. ‘middle clicking’ wasn’t executing the jQuery function and just opens the TOE site in a new tab instead. I spent some time trying to get this working and realised that I’d need to use a different function instead – either ‘mouseup’ or ‘mousedown’. Unfortunately Firefox doesn’t allow these functions to open a new tab automatically – the user has to agree to allow the page to open one. There doesn’t seem to be a way around this, but it’s only an issue when a user ‘middle clicks’ – a normal left click opens the link as normal. I’ve also realised that due to a session variable on the Toronto site the search won’t run automatically the first time a user clicks on the button but instead shows the empty search form. Subsequent searches all work fine but the initial one doesn’t. Unfortunately there isn’t much we can do about this unless I get jQuery to sneakily connect to the Toronto site to initialise the session variable before a user clicks the search button, which doesn’t seem like very good behaviour. Anyway, apart from this little blip the site is all pretty much good to go ahead of next Monday’s launch.
The third Old English site I worked on was the Old English Metaphor map. Ellen supplied me with some updated data and I got this in place. I also fixed a highlighting bug in the links to the HT which was previously stopping any word with a less-than sign (actually quite a few Old English ones that show modern English equivalents) highlighting properly. The highlighting is working now, thankfully. I also uploaded the new OE specific site text. Ellen also sent me a list of updates to make and I managed to get these all ticked off as follows:
- ‘Categories complete’ now links to the OE page rather than the main page
- ‘Both’ is now the default strength for the OE map
- Tab and heading text has been updated
- I’ve updated the ‘number of categories’ that gets displayed at the top level when you click on a category. This now queries the database and only counts those categories that actually have OE links in them. It also takes strength into consideration but currently this only works if the page is fully reloaded. The strength selection boxes don’t fully reload the page, they just refresh the visualisation so the counts still refer to ‘both’. I’ve added this to my longer term ‘to do’ list.
- I’ve added buttons to add ashes and thorns to the quick search box in the top right of the page and also the quick search in the ‘search’ tab. I have not added them to the advanced search as the only text box here is explicitly for searching category names and descriptors and the facility to search HT is not part of the advanced search.
- I’ve removed the ‘click on category name…’ text
- I’ve fixed the yellow circle size problem in the aggregate view – I’d forgotten to update the script that generates the circles to use the OE table so it was actually showing the data for the main site.
Ellen also sent me a document containing the OE word counts for each category and I created a new OE specific column in the metaphor category table to hold these. I then updated the site to ensure that these OE word counts are displayed in the metaphor cards rather than showing the total number of HT words in the category.
I also squeezed in a couple of meetings with people about new projects this week too. On Tuesday I met with Rhona Brown to discuss a bid she’s putting together. I can’t really go into any details here but we discussed a few possible avenues and have agreed on a course of action to take for the digital components of her project. On Thursday I met with Megan Coyer and Hannah Tweed, who is the new RA for her project. The project is a Medical Humanities Network, and I will be setting up the website for this, based on the existing Digital Humanities Network resource.
Next week is the ISAS conference and fingers crossed the launch of the Thesaurus of Old English and the Old English Metaphor Map will go smoothly.