Week Beginning 26th September 2016

I spent a lot of this week continuing to work on the Atlas and the API for the SCOSYA project, tackling a couple of particularly tricky feature additions, amongst other things.  The first was adding the facility to include all of the selected search options in the page URL, so as to allow people to bookmark and share specific views of the atlas.  People will also be able to cite exact views in papers, and we’ll be able to add a ‘cite this page’ feature to the atlas too.  It will also form the basis for the ‘history’ feature I’m going to develop, which will track all of the views a user has created during a particular session.  There were two things to consider when implementing this feature: firstly, getting all of the search options added to the address bar; and secondly, adding the facilities to get the correct options selected and the correct map data loaded when someone opens a page containing all of the search options.  Both tasks were somewhat tricky.

I already use a Leaflet plugin called leaflet-hash, which adds the zoom level, latitude and longitude to the page URL as a hash (the bit after the ‘#’ in a URL).  This nice little plugin already ensures that a page loads the map at the correct location and zoom level if these variables are present in the URL.  I decided to extend it to add the search criteria as additional hash elements.  This meant reworking the plugin slightly, as it was set to fail if more than the expected variables were passed in the hash.  With that updated, all I had to do was change the JavaScript that runs when a search is submitted to ensure that all of the submitted options are added to the hash.  I had to test this out a bit as an unwanted slash sometimes kept getting added to the address bar, but eventually I sorted that.
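
To give a flavour of the approach, here’s a minimal sketch – not the actual SCOSYA code, and the encoding and function names are invented for illustration:

```javascript
// Illustrative sketch only, not the actual SCOSYA code: append the
// submitted search options to the '#zoom/lat/lng' hash that the
// (modified) leaflet-hash plugin maintains.
function updateHashWithSearch(map, options) {
  var centre = map.getCenter();
  var parts = [
    map.getZoom(),
    centre.lat.toFixed(4),
    centre.lng.toFixed(4)
  ];
  // Each attribute block becomes one extra hash segment (e.g.
  // 'a1=Q23|4-5|2|old' for attribute|ratings|people|age) and each
  // joiner becomes 'j1=AND' etc.; this encoding is made up here.
  options.attributes.forEach(function (attr, i) {
    parts.push('a' + (i + 1) + '=' + encodeURIComponent(attr));
  });
  options.joiners.forEach(function (joiner, i) {
    parts.push('j' + (i + 1) + '=' + joiner);
  });
  window.location.hash = parts.join('/');
}
```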

So far so good, but I still had to process the address bar to extract the search criteria variables and then rebuild the search facilities, including working out how many ‘attribute’ boxes to generate, which attributes should be selected, which limits should be highlighted and which joiners between variables should be selected.  This took some figuring out, but it was hugely satisfying to see it all coming together piece by piece.  With all of the search boxes pre-populated based on the information in the address bar, the only thing left to do was automatically fire the ‘perform search’ code.  With that in place we now have a facility to store and share exact views of the atlas, which I’m rather pleased with.
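
The reverse journey looks something like the following sketch, again with invented names (‘buildAttributeBoxes’ and ‘performSearch’ stand in for the real form-building and search code):

```javascript
// A sketch with invented names: pull our extra segments back out of
// the hash, rebuild the search form, then fire the search.
function restoreSearchFromHash() {
  var segments = window.location.hash.replace(/^#/, '').split('/');
  // The first three segments are leaflet-hash's zoom/lat/lng; the
  // rest (if any) are the search options added above.
  var search = { attributes: [], joiners: [] };
  segments.slice(3).forEach(function (segment) {
    var pair = segment.split('=');
    if (pair[0].charAt(0) === 'a') {
      search.attributes.push(decodeURIComponent(pair[1]));
    } else if (pair[0].charAt(0) === 'j') {
      search.joiners.push(pair[1]);
    }
  });
  if (search.attributes.length > 0) {
    buildAttributeBoxes(search); // hypothetical: generates and pre-selects the form
    performSearch();             // hypothetical: the existing 'perform search' code
  }
}
```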

I also decided to implement a ‘full screen’ view for the map, mainly because I’d tried it out on my mobile phone and the non-map parts of the page really cluttered things up and made the map pretty much impossible to use.  Thankfully there are already a few Leaflet plugins that provide ‘full screen’ functionality and I chose to use this one: https://github.com/brunob/leaflet.fullscreen.  It works very nicely – just like the ‘full screen’ option in YouTube – and with it added in the atlas becomes much more usable on mobile devices, and much prettier on any screen, actually.
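
Wiring it up is minimal.  If I’m remembering the plugin’s API correctly it exposes an ‘L.control.fullscreen’ control, so the setup looks something like this (the option values and event names here are from memory and illustrative):

```javascript
// Add a fullscreen button to the existing Leaflet map object.
L.control.fullscreen({
  position: 'topleft',       // where the button sits
  title: 'View fullscreen',
  titleCancel: 'Exit fullscreen',
  forceSeparateButton: true  // keep it separate from the zoom buttons
}).addTo(map);

// The plugin fires events when fullscreen is toggled, which is handy
// if other parts of the page need to react.
map.on('enterFullscreen', function () { /* e.g. hide the search panel */ });
map.on('exitFullscreen', function () { /* e.g. show it again */ });
```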

The second big feature addition I focussed on was ensuring that ‘absent’ data points also appear on the map when a search is performed.  There are two different types of ‘absent’ data points – locations where no data exists for the chosen attributes (we decided these would be marked with grey squares) and locations where there is data for the chosen attributes but it doesn’t meet the threshold set in the search criteria (e.g. ratings of 4 or 5 only).  These were to be marked with grey circles.  Adding in the first type of ‘absent’ markers was fairly straightforward, but the second type raised some tricky questions.

For one attribute this is relatively straightforward – if there isn’t at least one rating for the attribute at the location within the supplied limits (age, number of people, rating) then see if there are any ratings without these limits applied.  If there are, return them and display the location with a grey circle.
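
As a sketch, the decision for each location looks something like this, where ‘ratingsFor’ is an invented stand-in for whatever queries a location’s ratings, with or without the limits applied:

```javascript
// Per-location marker logic for a single attribute (illustrative only).
function markerTypeFor(location, attribute, limits) {
  if (ratingsFor(location, attribute, limits).length > 0) {
    return 'coloured';   // matches the full criteria: normal marker
  }
  if (ratingsFor(location, attribute, null).length > 0) {
    return 'greyCircle'; // data exists, but not within the limits
  }
  return 'greySquare';   // no data at all for this attribute
}
```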

But what happens if there are multiple attributes?  How should these be handled when different joiners are used between attributes?  If the search is ‘x AND y NOT z’ without any other limits, should a location that has x and y and z be returned as a grey circle?  What about a location that has x but not y?  Or should both of these locations just be returned as grey squares because there is no recorded data that matches the criteria?

One option is for grey circle locations to have to match the attribute criteria (x AND y NOT z) but ignore the other limits – e.g. for the query ‘x in old group rated by 2 people giving it 4-5 AND y in young group rated by 2 people giving it 1-2 NOT z in all ages rated by 1 or more giving it 1-5’, a further query that ignores the limits would run, and any locations that appear in this but are not found in the full query would be displayed as grey circles.  All other locations would be displayed as grey squares.
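
If we go down that route, the classification step might look something like the following sketch (one possible shape only, with invented names, assuming each query returns a list of location IDs):

```javascript
// Classify every location given the results of the full query and of
// a 'relaxed' query that keeps the attribute logic but drops the limits.
function classifyLocations(allLocations, strictIds, relaxedIds) {
  return allLocations.map(function (loc) {
    if (strictIds.indexOf(loc.id) !== -1) {
      return { id: loc.id, marker: 'coloured' };   // full match
    }
    if (relaxedIds.indexOf(loc.id) !== -1) {
      return { id: loc.id, marker: 'greyCircle' }; // matches only without limits
    }
    return { id: loc.id, marker: 'greySquare' };   // no matching data at all
  });
}
```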

Deciding on a course of action for this required consultation with other team members, so Gary, Jennifer and I are going to meet next Monday.  I managed to get the grey circles working for a single attribute as described above.  It’s just the multiple attributes that are causing some uncertainty.

Other than SCOSYA work I did a few other things this week.  Windows 10 decided to upgrade to the ‘Anniversary Update’ without asking me on Tuesday morning when I turned on my PC, which meant I was locked out of it for 1.5 hours while the upgrade took place.  This was hugely frustrating, as all my work was on the computer and there wasn’t much else I could do.  If only it had given me the option of installing the update when I next shut down the PC I wouldn’t have wasted 1.5 hours of work time.  Very annoying.

Anyway, I did some AHRC review work this week.  I also fixed a couple of bugs in the Mapping Metaphor website.  Some categories were appearing out of order when viewing data via the tabular view.  These categories were the ‘E’ ones – e.g. ‘1E15’.  It turns out that PHP was considering these category IDs to be numbers written using ‘E Notation’ (see https://en.wikipedia.org/wiki/Scientific_notation#E_notation).  Even when I explicitly cast the IDs as strings PHP still treated them as numbers, which was rather annoying – PHP compares numeric-looking strings numerically, so the cast made no difference.  I eventually solved the problem by adding a space character before the category ID for table ordering purposes: with the space present PHP treats the value as an actual string rather than a number.  I also took the opportunity to update the staff ‘browse categories’ pages of the main site and the OE site to ensure that the statistics displayed were the same as the ones on the main ‘browse’ pages – i.e. they include duplicate joins as discussed in a post from a few weeks ago.

I also continued my email conversation with Adrian Chapman about a project he is putting together, and I spent about half a day working on the Historical Thesaurus again.  Over the summer the OED people sent us the latest version of their Thesaurus data, which includes a lot of updated information from the OED, such as more accurate dates for when words were first used.  Marc, Fraser and I had arranged to meet on Friday afternoon to think about how we could incorporate this data into the Glasgow Historical Thesaurus website.  Unfortunately Fraser was ill on Friday so the meeting had to be postponed, but I’d spent most of the morning looking at the OED people’s XML data, plus a variety of emails and Word documents about it, and had figured out an initial plan for matching up their data with ours.  It’s complicated somewhat because there is no ‘foreign key’ we can use to link their data to ours.  We have category and lexeme IDs and so do they, but these do not correspond to each other.  They include our category notation information (e.g. 01.01), but we have reordered the HT several times since the OED people got the category notation, so what they consider to be category ’01.01’ and what we consider it to be no longer match up.  We’re going to have to work some magic with some scripts in order to reassemble the old numbering.  This could quite easily turn into a rather tricky task.  Thankfully we should only have to do it once, because once it’s done this time I’m going to add new columns to our tables that contain the OED’s IDs, so in future we can just use these.