Week beginning 11th November 2013

This was an important week, as the new version of the Historical Thesaurus went live!  I spent most of Monday moving the new site across to its proper URL, testing things, updating links, validating the HTML, putting in redirects from the old site and ensuring everything was working smoothly.  The final result was unveiled at the Samuels lecture on Tuesday evening.  You can find the new version here:
The new versions of the SCOTS corpus and CMSW also went live this week, and I spent a lot of time on Tuesday and Wednesday setting them up, undertaking the same tasks as I did for the HT.  The new versions of SCOTS and CMSW can be accessed here:
Following the relaunch of SCOTS I had to spend some extra time updating the ‘Thomas Crawford’s diary’ microsite.  This is a WordPress-powered site, and I updated the header and footer template files to give each page of the microsite the same header and footer as the rest of the CMSW site.  This looks a lot better than the old version, which didn’t have any link back to the main CMSW site.
I had a couple of meetings this week, the first with a PhD student who wants to OCR some 19th-century Scottish court records.  I received some very helpful pointers on this from Jenny Bann, who was previously involved with the massive amounts of OCR work that went on in the CMSW project.  Thanks to Jenny’s input I was able to give the student some good advice on how to improve the success rate of OCR on historical documents.
My second meeting was with a member of staff in English Literature who is putting together an AHRC bid.  Her project will involve the use of Google Books and the Ngram interface, plus developing some visualisations of links between themes in novels.  We had a good meeting and should hopefully proceed further with the writing of the bid in future weeks.  Having not used the Ngram interface much, I spent a bit of time researching it and playing around with it.  It’s a pretty amazing system and has lots of potential for research due to the size of the corpus and also the sophisticated query tools that are on offer.