I spent quite a bit of time this week continuing to work on the systems for the Place-names of Mull and Ulva project. The first thing I did was to figure out how WordPress sets the language code. It has a function called ‘get_locale()’ that bring back the code (e.g. ‘En’ or ‘Gd’). Once I knew this I could update the site’s footer to display a different logo and text depending on the language the page is in. So now if the page is in English the regular UoG logo and English text crediting the map and photo are displayed whereas is the page is in Gaelic the Gaelic UoG logo and credit text is displayed. I think this is working rather well.
I managed to get all of the new Gaelic fields added into the CMS and fully tested by Thursday and asked Alasdair to testing things out. I also had a discussion with Rachel Opitz in Archaeology about incorporating LIDAR data into the maps and started to look at how to incorporate data from the GB1900 project for the parishes we are covering. GB1900 (http://www.gb1900.org/) was a crowdsourced project to transcribe every place-name that appears on OS maps from 1888-1914, which resulted in more than 2.5 million transcriptions. The dataset is available to download as a massive CSV file (more than 600Mb). It includes place-names for the three parishes on Mull and Ulva and Alasdair wanted to populate the CMS with this data as a starting point. On Friday I started to investigate how to access the information. Extracting the data manually from such a large CSV file wasn’t feasible so instead I created a MySQL database and wrote a little PHP script that iterated through each line of the CSV and added it to the database. I left this running over the weekend and will continue to work with it next week.
Also this week I continued to add new project records to the new Digital Humanities at Glasgow site. I only have about 30 more sites to add now, and I think it’s shaping up to be a really great resource that we will hopefully be able to launch in the next month or so.
I also spent a bit of further time on the SCOSYA project. I’d asked the university’s research data management people whether they had any advice on how we could share our audio recording data with other researchers around the world. The dataset we have is about 117GB, and originally we’d planned to use the University’s file transfer system to share the files. However, this can only handle files that are up to 20Gb in size, which meant splitting things up. And it turned out to take an awfully long time to upload the files, a process we would have to do each time the data was requested. The RDM people suggested we use the University’s OneDrive system instead. This is part of Office365 and gives each member of staff 1TB of space, and it’s possible to share uploaded files with others. I tried this out and the upload process was very swift. It was also possible to share the files with users based on their email addresses, and to set expiration dates and password for file access. It looks like this new method is going to be much better for the project and for any researchers who want to access our data. We also set up a record about the dataset in the Enlighten Research Data repository: http://researchdata.gla.ac.uk/951/ which should help people find the data.
Also for SCOSYA we ran into some difficulties with Google’s reCAPTCHA service, which we were using to protect the contact forms on our site from spam submissions. There was an issue with version 3 of Google’s reCAPTCHA system when integrated with the contact form plugin. It works fine if Google thinks you’re not a spammer but if you somehow fail its checks it doesn’t give you the option of proving you’re a real person, it just blocks the submission of the form. I haven’t been able to find a solution for this using v3, but thankfully there is a plugin that allows the contact form plugin to revert back to using reCAPTCHA v2 (the ‘I am not a robot’ tickbox). I got this working and have applied it to both the contact form and the spoken corpus form and it works for me as someone Google somehow seems to trust and for me when using IE via remote desktop, where Google makes me select features in images before the form submits.
Also this week I met with Marc and Fraser to discuss further developments for the Historical Thesaurus. We’re going to look at implementing the new way of storing and managing dates that I originally mapped out last summer and so we met on Friday to discuss some of the implications of this. I’m hoping to find some time next week to start looking into this.
We received the reviews for the Iona place-name project this week and I spent some time during the week and over the weekend going through the reviews, responding to any technical matters that were raised and helping Thomas Clancy with the overall response, that needed to be submitted the following Monday. I also spoke to Ronnie Young about the Burns Paper Database, that we may now be able to make publicly available, and made some updates to the NME digital ode site for Bryony Randall.