This week was a pretty busy one, working on several projects and attending a number of meetings. I spent a bit of time working on Bryony Randall’s New Modernist Editing project. This involved starting to plan the workshop on TEI and XML – sorting out who might be participating, where the workshop might take place, what it might actually involve and things like that. We’re hoping it will be a hands-on session for postgrads with no previous technical experience of transcription, but we’ll need to see if we can get a lab booked that has Oxygen available first. I also worked with the facsimile images of the Woolf short story that we’re going to make a digital edition of. The Woolf estate wants a massive copyright statement to be plastered across the middle of every image, which is a little disappointing as it will definitely affect the usefulness of the images, but we can’t do anything about that. I also started to work with Bryony’s initial Word-based transcription of the short story, thinking how best to represent this in TEI. It’s a good opportunity to build up my experience of Oxygen, TEI and XML.
I also updated the data for the Mapping Metaphor project, which Wendy has continued to work on over the past few months. We now have 13,083 metaphorical connections (down from 13,931), 9,823 ‘first lexemes’ (up from 8,766) and 14,800 other lexemes (up from 13,035). We also now have 300 categories completed, up from 256. I also replaced the old ‘Thomas Crawford’ part of the Corpus of Modern Scottish Writing with my reworked version. The old version was a WordPress site that hadn’t been updated since 2010 and was a security risk. The new version (http://www.scottishcorpus.ac.uk/thomascrawford/) consists of nothing more than three very simple PHP pages and is much easier to navigate and use.
I had a few Burns-related tasks to take care of this week. Firstly there was the usual ‘song of the week’ to upload, which I published on Wednesday as usual (see http://burnsc21.glasgow.ac.uk/ye-jacobites-by-name/). I also had a chat with Craig Lamont about a Burns bibliography that he is compiling. This is currently in a massive Word document but he wants to make it searchable online so we’re discussing the possibilities and also where the resource might be hosted. On Friday I had a meeting with Ronnie Young to discuss a database of Burns paper that he has compiled. The database currently exists as an Access database with a number of related images and he would like this to be published online as a searchable resource. Ronnie is going to check where the resource should reside and what level of access should be given and we’ll take things from there.
I had been speaking to the other developers across the College about the possibility of meeting up semi-regularly to discuss what we’re all up to and where things are headed and we arranged to have a meeting on Tuesday this week. It was a really useful meeting and we all got a chance to talk about our projects, the technologies we use, any cool developments or problems we’d encountered and future plans. Hopefully we’ll have these meetings every couple of months or so.
We had a bit of a situation with the Historical Thesaurus this week relating to someone running a script to grab every page of the website in order to extract the data from it, which is in clear violation of our terms and conditions. I can’t really go into any details here, but I had to spend some of the week identifying when and how this was done and speaking to Chris about ensuring that it can’t happen again.
The rest of my week was spent on the SCOSYA project. Last week I updated the ‘Atlas Display Options’ to include accordion sections for ‘advanced attribute search’ and ‘my map data’. I’m still waiting to hear back from Gary about how he would like the advanced search to work so instead I focussed on the ‘my map data’ section. This section will allow people to upload their own map data using the same CSV format as the atlas download files in order to visualise this data on the map. I managed to make some pretty good progress with this feature. First of all I needed to create new database tables to house the uploaded data. Then I needed to add in a facility to upload files. I decided to use the ‘dropzone.js’ scripts that I had previously used for uploading the questionnaires to the CMS. This allows the user to drag and drop one or more files into a section of the browser and for this data to then be processed in an AJAX kind of way. This approach works very well for the atlas as we don’t want the user to have to navigate away from the atlas in order to upload the data – everything needs to be managed from within the ‘display options’ slideout section.
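As a rough sketch of how a drag-and-drop upload like this gets wired together with Dropzone.js (the endpoint path, element ids and settings below are all hypothetical, not the actual SCOSYA code):

```javascript
// Sketch of a Dropzone.js configuration for a 'my map data' style upload.
// The endpoint path, element ids and limits here are hypothetical.
const uploadConfig = {
  url: "/atlas/upload-map-data.php", // hypothetical upload endpoint
  acceptedFiles: ".csv",             // only CSVs, matching the atlas download format
  maxFiles: 10,                      // allow several files per drop
  init: function () {
    // When the server responds, append its log messages to the 'log'
    // area of the display options slideout, without leaving the atlas.
    this.on("success", function (file, response) {
      document.querySelector("#my-map-data-log").textContent += response;
    });
  }
};

// In the browser this would be attached to the slideout section with:
// new Dropzone("#my-map-data-upload", uploadConfig);
```

The appeal of this pattern is exactly what's described above: the whole upload happens via AJAX inside the slideout, so the user never navigates away from the atlas.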
I contemplated adding the facility to process the uploaded files to the API but decided against it as I wanted to keep the API ‘read only’ rather than also handling data uploads and deletions. So instead I created a stand-alone PHP script that takes the uploaded CSV files and adds them to the database tables I had created. This script then echoes out some log messages that then get pulled into a ‘log’ section of the display in an AJAX manner.
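The actual script is PHP, but the shape of the logic is roughly this (a JavaScript sketch; the column handling and function names are hypothetical, and the real files follow the atlas download CSV format):

```javascript
// Sketch of the upload-processing logic: parse each CSV, insert the
// rows, and build up log messages to echo back to the browser.
// Function and column names are hypothetical.
function processCsv(filename, csvText, insertRow) {
  const log = [];
  const lines = csvText.trim().split("\n");
  const header = lines[0].split(",");
  let inserted = 0;
  for (const line of lines.slice(1)) {
    const cells = line.split(",");
    if (cells.length !== header.length) {
      // Malformed rows are logged and skipped rather than aborting the upload.
      log.push(`Skipped malformed row in ${filename}: ${line}`);
      continue;
    }
    // Build a {column: value} record and hand it to the database layer.
    const row = {};
    header.forEach((col, i) => { row[col] = cells[i]; });
    insertRow(row);
    inserted++;
  }
  log.push(`${filename}: inserted ${inserted} row(s)`);
  return log; // echoed back and shown in the 'log' section via AJAX
}
```

The returned log lines are what the front-end appends to the ‘log’ section after each upload completes.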
I then had to add in a facility to list previously uploaded files. I decided the query for this should be part of the API as it is a ‘GET’ request. However, I needed to ensure that only the currently logged in user was able to access their particular list of files. I didn’t want anyone to be able to pass a username to the API and then get that user’s files – the passed username must also correspond to the currently logged in user. I did some investigation into securing an API, using access tokens and things like that, but in the end I decided that accessing the user’s data would only ever be something that we would want to offer through our website and we could therefore just use session authentication to ensure the correct user was logged in. This doesn’t really fit in with the ethos of a RESTful API, but it suits our purposes ok so it’s not really an issue.
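The check itself is simple, and might look something like this (the real endpoint is PHP using session authentication; function and field names here are hypothetical):

```javascript
// Sketch of the session check guarding the file-listing endpoint.
// Names are hypothetical; the real implementation is server-side PHP.
function canListFiles(session, requestedUsername) {
  // Only allow the request if a user is actually logged in AND the
  // username passed to the API is that same user - nobody should be
  // able to list another user's uploads just by passing their name.
  return Boolean(session && session.username) &&
         session.username === requestedUsername;
}
```

Tying the endpoint to the session like this trades RESTful statelessness for simplicity, which is fine here since the data is only ever accessed through the project website itself.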
With the API updated to be able to accept requests for listing a user’s data uploads I then created a facility in the front-end for listing these files, ensuring that the list automatically gets updated with each new file upload. You can see the work-in-progress ‘my map data’ section in the following screenshot.
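Keeping that list in sync can be as simple as re-rendering it from the API response after each successful upload. A sketch, with hypothetical markup and field names:

```javascript
// Sketch of re-rendering the uploaded-files list after each upload.
// The markup and field names are hypothetical.
function renderFileList(files) {
  if (files.length === 0) {
    return "<p>No files uploaded yet.</p>";
  }
  const items = files
    .map(f => `<li>${f.filename} (uploaded ${f.uploaded})</li>`)
    .join("");
  return `<ul>${items}</ul>`;
}
```

In use, the Dropzone ‘success’ handler would fetch the logged-in user's files from the API and replace the list's innerHTML with the output of this function, so each new upload appears immediately.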