After meeting with Fraser to discuss his Scots Thesaurus project last Friday I spent some time on Monday this week writing a script that returns random SND or DOST entries that meet certain criteria, so as to allow him to figure out how these might be placed into HT categories. The script brings back main entries (as opposed to supplements) that are nouns, are monosemous (i.e. no other noun entries with the same headword), have only one sense (i.e. not multiple meanings within the entry), have fewer than 5 variant spellings, have single-word headwords and have definitions that are relatively short (100 characters or fewer). Whilst writing the script I realised that database queries are somewhat limited on the server: if I try to extract the full SND or DOST dataset and then select the rows that meet the criteria in my script, these limits are reached and the script just displays a blank page. So what I had to do was set the script up to bring back a random sample of 5000 main entry nouns that don’t have multiple words in their headword in the selected dictionary. I then apply the other checks to this set of 5000 random entries. This can mean that the number of outputted entries ends up being less than the 200 that Fraser was hoping for, but it still provides a good selection of data. The output is currently an HTML table, with IDs linking through to the DSL website, and I’ve given the option of setting the desired number of returned rows (up to 1000) and the number of characters that should be considered a ‘short’ definition (up to 5000). Fraser seemed pretty happy with how the script is working.
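To make the two-stage approach concrete, here is a minimal Python sketch of the logic described above. It is not the actual script: the `Entry` fields, the `select_entries` function and its parameter names are all hypothetical, the first stage would really be a database query rather than a list comprehension, and the monosemy check here simply counts noun entries sharing a headword in whatever data is passed in.

```python
import random
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical shape of a dictionary entry; the real schema is not shown in the post.
    entry_id: str
    headword: str
    pos: str                 # part of speech, e.g. "n" for noun
    is_supplement: bool
    senses: list[str]        # one definition string per sense
    variants: list[str] = field(default_factory=list)


def select_entries(entries, wanted=200, max_def_len=100,
                   sample_size=5000, rng=None):
    rng = rng or random.Random()

    # Stage 1 (a database query in practice): main-entry nouns with
    # single-word headwords, cut down to a random sample.
    candidates = [e for e in entries
                  if e.pos == "n" and not e.is_supplement
                  and " " not in e.headword]
    sample = rng.sample(candidates, min(sample_size, len(candidates)))

    # Assumption: "monosemous" is checked against every noun entry supplied.
    noun_headwords = Counter(e.headword for e in entries if e.pos == "n")

    # Stage 2: apply the remaining checks to the sample only.
    selected = [e for e in sample
                if noun_headwords[e.headword] == 1     # no other noun entry
                and len(e.senses) == 1                 # only one sense
                and len(e.variants) < 5                # fewer than 5 variants
                and len(e.senses[0]) <= max_def_len]   # short definition

    # May be fewer than `wanted` if the sample does not yield enough matches.
    return selected[:wanted]
```

Because the stricter checks only run on the sample, the function can return fewer rows than requested, which matches the behaviour described above.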
Also this week I made some further updates to the new song story for RNSN and I spent a large amount of time on Friday preparing for my upcoming PDR session. On Tuesday I met with Luca to have a bit of a catch-up, which was great. I also fixed a few issues with the Thesaurus of Old English data for Jane Roberts and responded to a request for developer effort from a member of staff who is not in the College of Arts. I also returned to working on the Books and Borrowing pilot system for Matthew Sangster, going through the data I’d uploaded in June, exporting rows with errors and sending these to Matthew for further checking. Although there are still quite a lot of issues with the data, in terms of its structure things are pretty fixed, so I’m going to begin work on the front-end for the data next week, the plan being that I will work with the sample data as it currently stands and then replace it with a cleaner version once Matthew has finished working with it.
I divided the rest of my time this week between DSL and SCOSYA. For the DSL I integrated the new APIs that I was working on last week with the ‘advanced search’ facilities on both the ‘new’ (v2 data) and ‘sienna’ (v3 data) test sites. As previously discussed, the ‘headword match type’ from the live site has been removed in favour of just using wildcard characters (*?”). Full-text searches, quotation searches and snippets should all be working, in addition to headword searches. I’ve increased the maximum number of full-text / quotation results from 400 to 500 and I’ve updated the warning messages so they tell you how many results your query would have returned if the total number is greater than this. I’ve tested both new versions out quite a bit and things are looking good to me, and I’ve contacted Ann and Rhona to let them know about my progress. I think that’s all the DSL work I can do for now, until the bibliography data is made available.
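As a rough illustration of the result cap and warning described above, the following Python sketch truncates a result list at 500 and reports the uncapped total. The function name and the wording of the message are assumptions for illustration only, not what the DSL test sites actually use.

```python
MAX_RESULTS = 500  # the new cap for full-text / quotation results


def cap_results(matches, limit=MAX_RESULTS):
    """Return (results_to_show, warning); warning is None when under the cap."""
    total = len(matches)
    if total > limit:
        # Hypothetical wording; the real warning text is not given in the post.
        warning = (f"Your query would have returned {total} results; "
                   f"only the first {limit} are shown.")
        return matches[:limit], warning
    return matches, None
```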
For SCOSYA I engaged in an email conversation with Jennifer and others about how to cover the costs of MapBox in the event of users getting through the free provision of 200,000 map loads a month after the site launches next month. I also continued to work on the public atlas interface based on discussions we had at a team meeting last Wednesday. The main thing was replacing the ‘Home’ map, which previously just displayed the questionnaire locations, with a new map that highlights certain locations that have sound clips that demonstrate an interesting feature. The plan is that this will then lead users on to finding out more about these features in the stories, whilst also showing people where some of the locations the project visited are. This meant creating facilities in the CMS to manage this data, updating the database, updating the API and updating the front-end, so a fairly major thing.
I updated the CMS to include a page to manage the markers that appear on the new ‘Home’ map. Once logged into the CMS, click on the ‘Browse Home Map Clips’ menu item to load the page. From here staff can see all of the locations and add / edit the information for a location (adding an MP3 file and the text for the popup). I added the data for a couple of sample locations that E had sent me. I then added a new endpoint to the API that brings back the information about the Home clips and updated the public atlas to replace the old ‘Home’ map with the new one. Markers are still the bright blue colour and drop into the map. I haven’t included the markers for locations that don’t have clips. We did talk at the meeting about including these, but I think they might just clutter the map up and confuse people.
I also reordered and relabelled the menu, and have changed things so that you can now click on an open section to close it. Currently doing so still triggers the map reload for certain menu items (e.g. Home). I’ll try to stop it doing so, but I haven’t managed to yet.
I also implemented the ‘Full screen’ slide type, although I think we might need to change the style of this. Currently it takes up about 80% of the map width, pinned to the right-hand edge (which it needs to be for the animated transitions between slides to work). It’s only as tall as the content of the slide needs it to be, though, so the map is not really being obscured, which is what Jennifer was wanting. Although I could set it so that the slide is taller, this would then shift the navigation buttons down to the bottom of the map, and if people haven’t scrolled the map fully into view they might not notice the buttons. I’m not sure what the best approach here might be, and this needs further discussion.
I also changed the way location data is returned from the API this week, to ensure that the GeoJSON area data is only returned from the API when it is specifically asked for, rather than by default. This means such data is only requested and used in the front-end when a user selects the ‘area’ map in the ‘Explore’ menu. The reason for doing this is to make things load quicker and to reduce the amount of data that was being downloaded unnecessarily. The GeoJSON data was rather large (several megabytes) and requesting this each time a map loaded meant the maps took some time to load on slower connections. With the areas removed the stories and ‘explore’ maps that are point based are much quicker to load. I did have to update a lot of code so that things still work without the area data being present, and I also needed to update all API URLs contained in the stories to specifically exclude GeoJSON data, but I think it’s been worth spending the time doing this.
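The opt-in idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `geojson` query parameter, the `area_geojson` field name and both function names are hypothetical, and the real API may express the choice quite differently.

```python
from urllib.parse import urlparse, parse_qs


def wants_geojson(url):
    """True only if the request explicitly asks for area data (hypothetical 'geojson=1')."""
    query = parse_qs(urlparse(url).query)
    return query.get("geojson", ["0"])[0] == "1"


def build_location_response(locations, include_geojson=False):
    """Return location records, leaving out the bulky area data unless requested."""
    response = []
    for loc in locations:
        # Copy everything except the (assumed) 'area_geojson' field.
        item = {key: value for key, value in loc.items() if key != "area_geojson"}
        if include_geojson:
            item["area_geojson"] = loc.get("area_geojson")
        response.append(item)
    return response
```

With something like this in place, point-based maps would receive only the lightweight records, and only the ‘area’ map would pay the cost of downloading the several megabytes of area shapes.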