I spent most of my time this week split between two projects: SCOSYA and REELS. I’ll discuss REELS first. On Wednesday I attended a project meeting for REELS, the first I’ve attended for several months, as I had previously finished work on the project’s content management system and didn’t have anything left to do for a while. The project has recently appointed its PhD student, so it seemed like a good time to have a full project meeting, and it was good to catch up with the project again and meet the new member of staff. A few updates and fixes to the content management system were requested at the meeting, so I spent some time this week working on these, specifically:
- I added a new field to the place-name table for recording whether the place-name is ‘non-core’ or not. This allows names like ‘Edrington’, which appears in names like ‘Edrington Castle’ but doesn’t appear to exist as a name in its own right, to be recorded. It’s a ‘yes/no’ field, as with ‘Obsolete’ and ‘Linear’, and appears on the ‘add’ and ‘edit’ place-name pages underneath the ‘Linear’ option.
- I fixed the issue caused when selecting the same parish as both ‘current’ and ‘former’. The system was giving an error when this situation arose, and I realised this was because the primary key for the table connecting place-name and parish was composed of the IDs for the relevant place-name and parish – i.e. only one join was possible for each place-name / parish pairing. I fixed this by adding the ‘type’ field (current or former) to the primary key, thus allowing one of each type to appear for each pairing.
- I updated the column sorting in the ‘browse place-names’ page so that pressing on a column heading sorts the complete dataset on this column rather than just the 50 rows that are displayed at any one time. Pressing on the column header once orders it ascending and a second time orders it descending. This required a pretty major overhaul of the ‘browse’ page as sorting had to be done on the server side rather than the client side. Still, it works a lot better now.
- I added a rudimentary search facility to the ‘browse place-names’ page, which replaces the ‘select parish’ facility. The search allows you to select a parish and/or a code and/or supply some text that may be found in the place-name field. All three search options may be combined – e.g. list all coastal place-names in EYM that include the text ‘point’. The text search is currently pretty basic: it matches any part of the place-name text and no wildcards can be used, e.g. a search for ‘rock’ finds ‘Brockholes’. Hopefully this will suffice until we’re thinking about the public website.
- I tested adding IPA characters to the ‘pronunciation’ field and this appears to work fine (I’m sure I would have tested this out when I originally created the CMS anyway but just thought I’d check again).
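To illustrate two of the changes above, here’s a minimal sketch using SQLite with made-up table and column names (the real CMS schema will differ): adding ‘type’ to the composite primary key allows one ‘current’ and one ‘former’ row per place-name / parish pairing, and the basic text search is a plain substring match.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Before the fix the primary key was (placename_id, parish_id), so a
# parish could only be joined to a place-name once. Adding 'type' to
# the key allows one 'current' and one 'former' row per pairing.
cur.execute("""
    CREATE TABLE placename_parish (
        placename_id INTEGER,
        parish_id    INTEGER,
        type         TEXT CHECK (type IN ('current', 'former')),
        PRIMARY KEY (placename_id, parish_id, type)
    )
""")

# The same parish as both 'current' and 'former' now inserts cleanly.
cur.execute("INSERT INTO placename_parish VALUES (1, 42, 'current')")
cur.execute("INSERT INTO placename_parish VALUES (1, 42, 'former')")
print(cur.execute("SELECT COUNT(*) FROM placename_parish").fetchone()[0])  # 2

# The text search matches any part of the name: 'rock' finds 'Brockholes'.
cur.execute("CREATE TABLE placename (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO placename VALUES (1, 'Brockholes')")
rows = cur.execute(
    "SELECT name FROM placename WHERE name LIKE '%' || ? || '%'", ("rock",)
).fetchall()
print(rows)  # [('Brockholes',)]
```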
In addition, I met separately with the project’s PhD student to go over the content management system with him. That’s probably all I’ll need to do for the project until we come to develop the front end, which I’ll make a start on sometime next year.
For the SCOSYA project this week I finished work on a table in the CMS that shows consistent / conflicted data. This can be displayed as a table in your browser or saved as a CSV to open in Excel. The structure of the table is as Gary suggested to me last week:
One row per attribute (e.g. ‘A1’) and one column per location (e.g. Airdrie). If all of the ratings for an attribute at a location are 4 or 5 then the cell contains ‘High’. If all of the ratings are 1 or 2 then the cell contains ‘Low’. If the ratings are anything else then the cell contains ‘Mixed’. Note that if the attribute was not recorded for a location the cell is left blank. In the browser-based table I’ve given the ‘Mixed’ cells a yellow border so you can more easily see where these appear.
I have also added in a row at the top of the table that contains the percentage of attributes for each location that are ‘Mixed’. Note that this percentage does not take into consideration any attributes that are not recorded for a location. E.g. if ‘Location A’ has ‘High’ for attribute A1, ‘Mixed’ for A2 and blank for A3 then the percentage mixed will be 50%. I have also added in facilities to limit the data to the young or old age groups. Towards the end of the week I met with Gary again and he suggested some further updates to the table, which I will hopefully implement next week.
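The classification and percentage logic above can be sketched like this (a small illustration in Python rather than the actual CMS code; the function names are my own):

```python
def classify(ratings):
    """Classify one attribute's ratings at a location.

    Returns '' (blank) when unrecorded, 'High' when all ratings are
    4 or 5, 'Low' when all are 1 or 2, and 'Mixed' otherwise.
    """
    if not ratings:
        return ""
    if all(r in (4, 5) for r in ratings):
        return "High"
    if all(r in (1, 2) for r in ratings):
        return "Low"
    return "Mixed"

def percent_mixed(cells):
    """Percentage of recorded (non-blank) cells that are 'Mixed'."""
    recorded = [c for c in cells if c != ""]
    if not recorded:
        return 0.0
    return 100 * sum(1 for c in recorded if c == "Mixed") / len(recorded)

# The example from the post: 'High' for A1, 'Mixed' for A2, blank for A3.
cells = [classify([4, 5]), classify([2, 4]), classify([])]
print(percent_mixed(cells))  # 50.0
```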
I also met with Gary and Jennifer this week to discuss the tricky situation with grey squares vs grey circles on the map as discussed in last week’s post. We decided to include grey circles (i.e. there is data but it doesn’t meet your criteria) for all locations where there is data for the specified attributes, so long as the attribute is not included with a ‘NOT’ joiner. After the meeting I updated the map to include such grey circles and it appears to be working pretty well. I also updated the map pop-ups to include more information about each rating, specifically the text for the attribute (as opposed to just the ID) and the age group for each rating.
The last big thing I did for the project was to add a ‘save image’ facility to the atlas, which allows you to save an image of the map you’re viewing, complete with all markers. This was a pretty tricky thing to implement, as the image needs to be generated in the browser by pulling in and stitching together the visible map tiles, incorporating all of the vector-based map marker data and then converting all of this into a raster image. Thankfully I found a plugin that handled most of this (https://github.com/mapbox/leaflet-image), although it required some tweaking and customisation to get it working. The PNG data is created as Base64 encoded text, which can then be appended to an image tag’s ‘src’ attribute.

What I really wanted was to have the image automatically work as a download rather than get displayed in the browser. Unfortunately I didn’t manage to get this working. I know it’s possible if the Base64 data is posted to a server which then fires it back as a file for download (I did this with Mapping Metaphor), but for some reason the server was refusing to accept the data. In any case, I wanted something that worked on the client side rather than posting data to and then retrieving it from the server, which seems rather wasteful. I managed to get the image to open in a new window, but this meant the full image data appeared in the browser’s address bar, which was horribly messy, and the user still had to manually select ‘save’. So instead I decided to have the image open in the page, in an overlay. The user still has to manually save the image, but it looks neater and it allows information about image attribution to be displayed too. The only further issue was that this didn’t work if the atlas was being viewed in ‘full screen’ mode, so I had to figure out a way of programmatically exiting full screen mode if the user pressed the ‘save image’ button in this view.
Thankfully I found a handy function call that did just this: fullScreenApi.cancelFullScreen();
Fraser contacted me on Monday to say that Lancaster have finished tagging the EEBO dataset for the LinguisticDNA project and were looking to hand this over to us. On Tuesday Lancaster placed the zipped data on a server and I managed to grab it, extracting the 11GB zip file into 25,368 XML files (although on closer inspection the contents don’t really appear to be XML, just tab-delimited text with a couple of wrapper tags). I copied this to the J: drive for Fraser and Marc to look at.
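A small sketch of how one of these files might be read, assuming each file really is just tab-delimited lines inside a single pair of wrapper tags (the tag name and column layout below are invented for illustration, not confirmed against the actual data):

```python
def read_tagged_tsv(text):
    """Parse 'XML-ish' text: tab-delimited rows inside wrapper tags."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip the wrapper tags and blank lines; keep the data lines.
        if not line or (line.startswith("<") and line.endswith(">")):
            continue
        rows.append(line.split("\t"))
    return rows

# A made-up two-row sample in the assumed format.
sample = "<text>\nword\tPOS\ttag\nanother\tPOS\ttag\n</text>"
print(read_tagged_tsv(sample))  # [['word', 'POS', 'tag'], ['another', 'POS', 'tag']]
```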
I also had an email chat with Thomas Widmann at SLD about the font used for the DSL website. Apparently it doesn’t include IPA characters, which is causing the ‘pronunciation’ field to display inconsistently (i.e. the characters the font does include are displayed in it, while the ones it doesn’t fall back to the computer’s default sans serif font). On my PC the difference in character size is minimal, but I think it looks worse on Thomas’s computer. We discussed possible solutions, the easiest of which would be to simply ensure that the ‘pronunciation’ field is displayed entirely in the default sans serif font. He said he’d get back to me about this.

I also gave a little bit of WordPress help to Maria Economou in HATII, who had an issue with a drop-down menu not working in iOS. We upgraded her theme to one that supports responsive menus, which fixed the issue pretty quickly.
I also met with Fraser and Marc on Friday to discuss the new Historical Thesaurus data that we had received from the OED people. We are going to want to incorporate words and dates from this data into our database, which is going to involve several potentially tricky stages. The OED data is in XML and, as I mentioned in a previous week, there is no obvious ID that can be used to link their data to ours. Thankfully, during the SAMUELS project someone figured out how to take data in one of our columns and rework it so it matches up with the OED category IDs. My first step will be to extract the OED data from XML and convert it into a format similar to our structure, and then create a script that will allow categories in the two datasets to be aligned. After that we’ll need to compare the contents of the categories (i.e. the words) and work out which words are new and which dates don’t match up. It’s going to be a fairly tricky process but it should be fun.

On Friday afternoon I decided to tidy up the HT database by removing all of the unnecessary backup tables that I had created over the years. I did a dump of the database beforehand in case I messed up. It turns out I did mess up, as I accidentally deleted the ‘lexeme_search_terms’ table, which broke the HT search facility. I then discovered that my SQL dump was incomplete: the download had quit mid-way through without telling me. Thankfully Chris managed to retrieve the database from a backup file and I’ve reinstated the required table, but it was a rather stressful way to end the week!
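Going back to the OED alignment, the first step might look something like the sketch below. Everything here is hypothetical: the element names, the sample data and the ‘reworked ID’ function are all stand-ins, since the real mapping is the one worked out during the SAMUELS project.

```python
import xml.etree.ElementTree as ET

# Invented sample in the rough shape of a category list.
sample_xml = """
<categories>
  <category id="012345"><heading>Example heading</heading></category>
</categories>
"""

def oed_categories(xml_text):
    """Extract {category id: heading} from the (assumed) OED XML."""
    root = ET.fromstring(xml_text)
    return {c.get("id"): c.findtext("heading") for c in root.iter("category")}

def align(oed_cats, ht_cats, key):
    """Pair each HT category with an OED one via a reworked-ID function."""
    return {ht_id: oed_cats.get(key(ht_id)) for ht_id in ht_cats}

# Pretend the SAMUELS rework is just zero-padding our numeric ID to
# six digits -- the real transformation is certainly more involved.
ht_cats = {12345: "Example heading"}
matches = align(oed_categories(sample_xml), ht_cats, key=lambda i: f"{i:06d}")
print(matches)  # {12345: 'Example heading'}
```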