I worked on many different projects this week, with most of my time being split between the Dictionary of the Scots Language, the Anglo-Norman Dictionary, the Books and Borrowing project and the Scots Language Policy project. For the DSL I began investigating adding the bibliographical data to the new API and developing bibliographical search facilities. Ann Ferguson had sent me spreadsheets containing the current bibliographical data for DOST and SND and I migrated this data into a database and began to think about how the data needs to be processed in order to be used on the website. At the moment links to bibliographies from SND entries are not appearing in the new version of the API, while DOST bibliographical links do appear but don’t lead anywhere. Fixing the latter should be fairly straightforward but the former looks to be a bit trickier.
For SND for the live site using the original V1 API it looks like the bibliographical links are stored in a database table and these are then injected into the XML entries whenever an entry is displayed. A column in the table contains the order the citation appears in the entry and this is how the system knows which bibliographical ID to assign to which link in the entry. This raises some questions about what happens when an entry is edited. If the order of the citations in the XML is changed, or a new citation is added then all of the links to the bibliographies will be out of sync. Plus, unless the database table is edited no new bibliographical links will ever display. It is possible that the data in bibliographical links table is already out of date and we are going to need to try and find a way to add these bibliographical links into the actual XML entries rather than retaining the old system of storing them separately and then injected then each time the entry is requested. I emailed Ann for further discussion about these points. Also this week I made a few updates to the live DSL website, changing the logos that are used and making ‘Dictionary’ in the title plural.
For the AND this week I added in the missing academic articles that Geert had managed to track down and then began focusing on updating the source texts and working with the commentaries for the R data. The commentaries were sent to me in two Word files, and although we had hoped to be able to work out a mechanism for automatically extracting these and adding them to their corresponding entries it looks like this will be very difficult to achieve with any accuracy. I concluded that I could split the entries up in Geert’s document based on the ‘**’ characters between commentaries and possibly split Heather’s up based on blank lines. I could possibly retain the formatting (bold, italic, superscript text etc) and convert this to HTML, although even this would be tricky, time consuming and error-prone. The commentaries include links to other entries in bold, and I would possibly be able to automatically add in links to other entries based on entries appearing in bold in the commentaries, but again this would be highly error-prone as bold text is used for things other than entries, and sometimes the entry number follows a hash while at other times it’s superscript. It would also be difficult to automatically ascertain which entry a commentary belongs to as there is some inconsistency here too – e.g. the commentary for ‘remuement’ is listed as ‘[remuement]??’ and there are other occasions where the entry doesn’t appear on its own on a line – e.g. ‘Retaillement xref with recelement’ and ‘Reverdure—Geert says to omit’. Then there are commentaries that are all crossed out, e.g. ‘resteot’. We decided that attempting to automatically process the commentaries would not be feasible and instead the editors would add them to the entry XML files manually, adding the tags for bold, italic, superscript and other formatting as required. Geert added commentaries to two entries to see how this would work and it worked very well.
For the source texts, we had originally discussed the editors editing these via a spreadsheet that I’d generated from the online data last year, but I decided it would be better if I just start work on the new online Dictionary Management System (DMS) and create the means of adding, listing and editing the source texts as the first thing that can be managed via the new DMS. This seemed preferable to establishing a new, temporary workflow that may take some time to set up and may end up not being used for very long. I therefore created the login and initial pages for the DMS (by repurposing earlier content management systems I’d created). I then set up database tables for holding the new source text data, which includes multiple potential items for each source and a range of new fields that the original source text data does not contain. With this in place I created the DMS pages for browsing the source texts and deleting them, and I’m midway through writing the scripts for editing existing and adding new source texts. I aim to have this finished next week.
For the Books and Borrowing project I continued to make refinements to the CMS, namely reducing the number of books and borrowers from 500 to 200 to speed up page loads, adding in the day of the week that books were borrowed and returned, based on the date information already in the system, removing tab characters for edition titles as these were causing some issues for the system, replacing the editor’s notes rich text box with a plain text area to save space on the edit page and adding a new field to the borrowing record that allows the editor to note when certain items appear for display only and should otherwise be overlooked, for example when generating stats. This is to be used for duplicate lines and lines that are crossed out. I also had a look through the new sample data from Craigston that was sent to us this week.
For the Scots Language Policy project I set up the project’s website, including the user interface, adding in fonts, plugins, initial page structure, site graphics, logos etc. Also this week I fixed an issue with song downloads on the Burns website (the plugin the controls the song downloads is very old and had broken. I needed to install a newer version and upgrade the song data for the downloads to work again. I also continued my email conversation with Rachel Fletcher about a project she’s putting together and created a user account to allow Simon Taylor to access the Ayr Placenames CMS.