On Friday last week I submitted a job to ScotGrid to extract all of the data from the Hansard dataset that was supplied by Lancaster. I had to submit this job because I’d noticed that the structure of the metadata had changed midway through the data, which had messed up my extraction script. I submitted the 1252 files and left them running over the weekend, and by Monday morning they had all completed, giving me a set of 1252 SQL files. None of the error checks I’d added to my extraction script last week had tripped, so hopefully the metadata structure doesn’t have any other surprises waiting. On Monday I started running batches of the SQL files into the MySQL database that I have for the data, but these are going to take quite a while to process. I have to send them through to ScotGrid in small batches of around 20, otherwise the poor database ends up with too many connections and returns an error.
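To illustrate the batching described above, here is a minimal Python sketch of splitting a list of SQL files into groups of around 20. It is not the actual submission script: the file names and the `batches` helper are hypothetical, and how each group is submitted and run against the database is left out entirely.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical file names standing in for the 1252 generated SQL files.
sql_files = [f"hansard_{n:04d}.sql" for n in range(1, 1253)]

groups = list(batches(sql_files, size=20))
print(len(groups), "batches; the last one holds", len(groups[-1]), "files")
```

With 1252 files and a batch size of 20 this gives 62 full batches plus a final batch of 12. Waiting for each batch to finish before sending the next is what keeps the number of simultaneous database connections down.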
I spent most of the rest of the week working on the ‘Basics of English Metre’ app and made some good progress with it. I have now completed Unit 2 and have made a start on Unit 3. I did get rather bogged down in Unit 2 for a while, as several of the exercises looked like regular exercises that I had already developed code to process, only to have extra pesky questions added on the end that only appear when the final question on a page is correctly answered. These included selecting the foot type for a set of lines (e.g. iambic pentameter) or identifying a poem based on its metre. However, I managed to find a solution to all of these quirks and added in some new question styles. I’m currently on page 2 of Unit 3, which consists of four questions that each have four stages. The first is syllable boundary identification, the second is metre analysis, the third is putting in the foot boundaries and the fourth is adding the rhythm. I’ve got all of this working, although I have only supplied data for the fourth stage for the last of the lines on the page. There are also some more of the pesky additional questions that need to be integrated and, rather strangely, the existing website doesn’t supply answers for the fourth stage, so I’m going to need to get someone in English Language to supply these.
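The staged questions and the follow-ups that only unlock at the end could be modelled roughly as below. This is only a sketch in Python under my own assumptions, not the app's code: the `StagedQuestion` class, the `follow_up_unlocked` helper and the placeholder answer strings are all hypothetical, and the real answer data would be encoded quite differently.

```python
from dataclasses import dataclass
from typing import List

# The four stages described above, in the order they must be completed.
STAGES = ["syllable boundaries", "metre", "foot boundaries", "rhythm"]


@dataclass
class StagedQuestion:
    line: str
    answers: List[str]  # one expected answer per stage (placeholder encoding)
    stage: int = 0      # index of the stage currently being attempted

    @property
    def complete(self) -> bool:
        return self.stage >= len(self.answers)

    def submit(self, attempt: str) -> bool:
        """Advance to the next stage only if the attempt is correct."""
        if self.complete or attempt != self.answers[self.stage]:
            return False
        self.stage += 1
        return True


def follow_up_unlocked(questions: List[StagedQuestion]) -> bool:
    """The extra question appears once the final question on the page is done."""
    return bool(questions) and questions[-1].complete


# Placeholder data: these are not real scansion answers.
page = [StagedQuestion("first line", ["a1", "a2", "a3", "a4"]),
        StagedQuestion("last line", ["b1", "b2", "b3", "b4"])]

for answer in ["b1", "b2", "b3", "b4"]:
    page[-1].submit(answer)

print(follow_up_unlocked(page))
```

Keeping the follow-up check separate from the question itself means the same staged question can be reused on pages that have no extra question at the end.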
Other than the above, I helped Carolyn Jess-Cooke from English Literature to add a forum to her ‘writing mental health’ website. I also had an email conversation with Rhona Brown about the digitised images and OCR possibilities for her ‘Edinburgh Gazetteer’ project, which is starting soon. I had a chat with Graeme Cannon about an on-screen keyboard I had developed for the Essentials of Old English app, as he is going to need a similar feature for one of his projects. I also spoke with Flora about the dreaded H27 error with the OE data for Mapping Metaphor. A solution to this is still eluding her, but I’m afraid I wasn’t able to offer much advice as I don’t know much about the Access database and the forms that were created for the data. I might see if I can extract the data and do something with it using PHP if she hasn’t found a solution soon. I also spoke to Rob Maslen about a new blog he’s wanting to set up for students of his Fantasy course next year, and talked to Scott Spurlock about a possible crowdsourcing tool for a project he is putting together.
I am going to be on holiday for the next two weeks so there won’t be a further update until after I’m back.