I returned to work this week after taking most of the past two weeks off, which I spent swimming in the warm seas off Turkey and visiting the wonderful city of Stockholm. I did work on Wednesday last week, which I mostly spent catching up with emails, arranging meetings with people, making critical updates to the variety of WordPress installations I’m responsible for, dealing with some issues with the University’s Apple developer account and reading through some AHRC materials.
I have been really rather busy this week and seem to have worked on a large number of projects and proposals. I’ve had six meetings, which I’ll briefly summarise now. On Monday I attended the first meeting of the ‘Metaphor in the Curriculum’ project, the follow-on project for Mapping Metaphor. It was nice to meet up with members of the team again and hear about how the plans for the project have been progressing. We discussed some of the technical requirements for the project and how and when development will proceed. Some focus groups will take place over the next couple of weeks to gather feedback and requirements and I will probably start on the development work towards the end of the summer. We will have another meeting in July to finalise this. At the meeting Wendy mentioned that another batch of Mapping Metaphor data was ready to be uploaded, which Ellen sent to me after the meeting. I ran the file through my batch upload script and the online database now has 17,171 metaphor connections, down from 17,952 (due to the deletion of ‘noise’ and ‘relevant’), and 5,531 sample lexemes, up from 1,235. It’s looking pretty good, I think. Wendy is intending to launch the Mapping Metaphor website sometime in July, so we’re getting close now.
I had a further Metaphor-related meeting with Ellen on Thursday to discuss the Old English metaphor data. We’d previously agreed that I would create an Old English version of the metaphor map in June, but there have been some delays getting the data together. The data was located in around 400 Excel spreadsheets and Ellen was wondering whether I could create a script that would automatically extract this data, pick out the columns that she needed and create one big file for her to work on. I spent some time on Thursday and Friday creating such a script, using a handy PHP library called PHPExcel (https://phpexcel.codeplex.com/) to automatically read the spreadsheet files and extract the content. This worked pretty well, although it did take a while for the script to run through all 399 files, and for some reason it silently failed on file number 282, which took some investigation. I think I’ve got a version of the data that Ellen will be able to use now, containing 32,421 rows of data.
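For anyone curious about what this kind of script involves, here is a minimal sketch of the "pick out the columns and build one big file" step. It is not the script described above: that was written in PHP with PHPExcel and read Excel files directly, whereas this sketch is in Python and assumes each spreadsheet has already been exported to CSV with a header row. The function name, the `source_file` column and the column names in the usage note are all hypothetical. The one design point it tries to show is recording every file that cannot be processed, so that a problem like the one with file 282 is reported rather than passing silently.

```python
import csv
from pathlib import Path


def merge_columns(paths, wanted, out_path):
    """Copy only the `wanted` columns from each input file into one output file.

    Returns (rows_written, failures), where failures is a list of
    (path, error message) pairs, so that no file can fail silently.
    """
    rows_written = 0
    failures = []
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        # The extra source_file column (an assumption) records where each row came from.
        writer = csv.DictWriter(out, fieldnames=["source_file", *wanted])
        writer.writeheader()
        for path in paths:
            try:
                with open(path, newline="", encoding="utf-8") as handle:
                    reader = csv.DictReader(handle)
                    missing = [c for c in wanted if c not in (reader.fieldnames or [])]
                    if missing:
                        raise ValueError(f"missing columns: {missing}")
                    # Read the whole file before writing anything, so a file
                    # that breaks halfway leaves no partial rows in the output.
                    rows = [
                        {"source_file": Path(path).name, **{c: row[c] for c in wanted}}
                        for row in reader
                    ]
            except (OSError, ValueError, csv.Error) as exc:
                failures.append((str(path), str(exc)))
                continue
            writer.writerows(rows)
            rows_written += len(rows)
    return rows_written, failures
```

Calling something like `merge_columns(sorted(Path("exports").glob("*.csv")), ["word", "sense"], "combined.csv")` would return the number of rows written along with a list of any files that were skipped and why. The directory and column names there are made up for illustration.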
On Monday I had an impromptu meeting with Susan and Magda about the Scots Thesaurus. Magda had spent some time working with the tool that I’d created that connects the Historical Thesaurus of English with the Dictionary of the Scots Language to allow for searching between the sites and the creation of category records for the Scots Thesaurus. She’d come up with a few suggestions for improvements so we discussed these and I made a bit of a ‘to do’ list. I also took Susan and Magda through the WordPress plugin I’d created and showed them how it might be used. They seemed pretty pleased with the way things were working out, although there is still a lot of technical work left to do. Later in the week Magda sent me a CSV file containing a lot more of the content she had created, and I uploaded this to the database for the Tool and the WordPress plugin. I really need to get around to amalgamating these databases and joining the functionality of the tool with that of WordPress. I aim to get this done (together with the updates Magda has requested) in the next couple of weeks.
On Tuesday I had a meeting with Jennifer Smith to discuss a possible dialect resource for high-school children. Jennifer wondered whether creating an app might work out and we discussed some possibilities. As she wants one of her students to work on developing the resource we agreed that for the time being the best approach would be for the student to just create the resource using the form capabilities of Google Docs and we’ll think about how this content can then be reshaped in future. We also briefly discussed her big AHRC project. This is due to start in August and I will be involved quite closely with this once it kicks off. It should be an interesting project.
I also met with Carolyn Jess-Cooke on Tuesday to discuss her ideas for a project. I can’t really go into too much detail here but it will probably take the form of an app. We spent some time discussing the possibilities and I wrote a brief outline of how the technical portion of the project may proceed to help with the bid.
I had a further meeting on Friday with Mary Gibson and her PhD student about a project they are hoping to put together, which will probably involve geographical and temporal data. It seems like quite an exciting project, but I can’t really go into any detail here. I will probably be contributing to the technical side of this bid some time towards the end of the summer.
Other than all of the above, I spent several hours this week on AHRC duties and I will have to spend several more over the next few weeks too. I also helped Gavin Miller out with a website he is setting up for a project that I previously advised him on and which recently received funding from the Wellcome Trust. I heard from Gerry Carruthers that a project he and Catriona Macdonald put together and which I gave technical advice on has received funding from the Carnegie Trust, which is excellent news. I also spoke to Megan Coyer and George Pattison about projects they are working on that will need my input. Christian had also gone through the Essentials of Old English app and had made a list of things for me to change, and although I didn’t have time to make these changes this week I will try to get this done soon as it would be good to launch the app.
I spent some time on Friday making some updates to the development version of the DSL website, including adding in new text for the advanced search and fixing the highlighting of search terms in the entry page when using Boolean terms. No words were being highlighted when Boolean terms such as ‘AND’ were part of the search string but I’ve figured out how to get around this and the terms now get highlighted in their fetching purple.
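The post above does not say how the fix works, so the following is only a sketch of one possible approach, written in Python rather than whatever the DSL site uses: strip the Boolean operators out of the search string before building the list of words to highlight. The set of operators, the function names and the `highlight` CSS class are all assumptions made for illustration.

```python
import html
import re

# Assumed operator set: only upper-case forms are treated as operators.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def search_terms(query):
    """Split a query into terms, dropping Boolean operators and brackets."""
    tokens = re.findall(r"[^\s()]+", query)
    return [t for t in tokens if t not in BOOLEAN_OPERATORS]


def highlight(text, query):
    """Wrap each whole-word occurrence of a search term in a <span>."""
    terms = search_terms(query)
    if not terms:
        return html.escape(text)
    # Longest terms first, so a longer term is tried before a shorter one.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    # With one capturing group, re.split puts the matches at the odd indexes.
    parts = pattern.split(text)
    return "".join(
        f'<span class="highlight">{html.escape(p)}</span>' if i % 2 else html.escape(p)
        for i, p in enumerate(parts)
    )
```

With this sketch, a search for `cat AND dog` highlights "cat" and "dog" in the entry text but leaves an ordinary lower-case "and" alone, because the operator is removed before matching and the remaining terms are matched as whole words.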
Throughout the course of the week I have also been working with the Hansard data for the Samuels project. You may recall from earlier reports that I’d managed to get the script from Lancaster that splits the data up to work using my MacBook but the data was just too massive for the storage capabilities I had at my disposal. Before I went off on my holidays Fraser had given me a 2TB external hard drive, which should be just about big enough for the data. I set about extracting the data on Monday, and it’s a very long process indeed. My poor little laptop is still pegging away at it, having been running constantly day and night for almost 5 days now. I was hoping that the process would have completed by Friday but it’s going to continue into next week. Hopefully the data that is being extracted is going to be usable and complete. I am going to have to ask the Lancaster people to compare the file sizes and counts of our data with theirs as rather strangely the extracted data appears to be smaller than the joined data, which doesn’t seem right to me. Phew, that’s all for this week.
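As a footnote to the Hansard extraction: a comparison of file counts and sizes like the one mentioned above could be done with something along these lines. This is a minimal Python sketch and not part of the Lancaster script; the function name is hypothetical, and it assumes that "counts" means the number of files under a directory.

```python
import os


def summarise_tree(root):
    """Return (file_count, total_bytes) for every regular file under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Running `summarise_tree` on both copies of the data and comparing the two results would show straight away whether the extracted set is short of files or whether the files themselves are smaller than expected.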