Week Beginning 2nd May 2022

Monday was the May Day holiday so it was a four-day week.  I spent three of the available days working on the Speak For Yersel project.  I completed work on the age-based questions for the lexical follow-on section.  We wanted to split responses based on the age of the respondent, but I had a question about this:  Should the age filters be fixed or dynamic?  We say 18 and younger / 60 and older but we don’t register ages for users, we register dates of birth.  I can therefore make the age filters fixed (i.e. birth >=2004 for 18, birth <=1962 for 60) or dynamic (e.g. birth >= currentyear-18 and birth <= currentyear -60).  However, each of these approaches has issues.  With the former, the age boundaries will drift with each passing year.  With the latter we end up losing data with each passing year (if someone is 18 when they submit their data in 2022 then their data will be automatically excluded next year).  I realised that there is a third way:  When a person registers I log the exact time of registration, so I can ascertain their age at the point when they registered and this will never change.  I decided to do this instead, although it does mean that the answers of someone who is 18 today will be lumped in with the answers of someone who is 18 in 10 years’ time, which might cause issues.  However, we can always change how the age boundaries work at a later date.  Below is a screenshot of one of the date questions (more data is obviously still needed):
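The 'age at registration' approach described above can be sketched roughly as follows.  This is only an illustration in Python and not the project's actual code: the function names and the 'other' label are made up for the example, and it assumes a full date of birth and a registration date are both available.

```python
from datetime import date


def age_at(date_of_birth: date, on: date) -> int:
    """Return the number of whole years between date_of_birth and `on`."""
    years = on.year - date_of_birth.year
    # Subtract one if the birthday has not yet been reached in the `on` year.
    if (on.month, on.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_group(date_of_birth: date, registered: date) -> str:
    """Classify a respondent by their age on the day they registered.

    The result never changes over time because both inputs are fixed.
    'other' is a hypothetical label for everyone between the two groups.
    """
    age = age_at(date_of_birth, registered)
    if age <= 18:
        return "younger"
    if age >= 60:
        return "older"
    return "other"
```

Because the group depends only on two stored dates, a respondent who was 18 when they registered in 2022 stays in the 'younger' group whenever the data is queried.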

Whilst working on this I realised there is another problem with this type of question:  Unless we have equal numbers of young and old respondents is it not likely that the data visualised on the map will be misleading?  Say we have 100 ‘older’ respondents but 1000 ‘younger’ ones due to us targeting school children.  If 50% of the older respondents say ‘scunnered’ then there will be 50 ‘older’ markers on the map.  If 10% of the younger respondents say ‘scunnered’ then there will be 100 ‘younger’ markers on the map, meaning our answer ‘older’ (which is marked as ‘correct’) will look wrong even though statistically it is correct.  I’m not sure how we can get around this unless we also plot markers for the people in each age group who don’t use the form, so as to let people see the total number of people in each group.  These could perhaps use a smaller marker and / or a lighter shade.  I raised this issue with the team and this is the approach we will probably take.
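To make the arithmetic concrete, here is a small Python sketch using the hypothetical figures from the paragraph above (100 older and 1000 younger respondents).  The function name and dictionary keys are invented for the illustration.  It shows how the raw marker counts and the per-group shares point in opposite directions, and how counting the non-users as well gives the map the context it needs.

```python
def summarise_usage(groups):
    """For each age group return users, non-users and the share who use the form.

    `groups` maps a group name to a (total_respondents, users_of_form) pair.
    """
    summary = {}
    for name, (total, users) in groups.items():
        summary[name] = {
            "users": users,               # markers currently shown on the map
            "non_users": total - users,   # possible smaller / lighter markers
            "share": users / total,       # proportion of the group using the form
        }
    return summary


# Hypothetical figures from the text: 50% of 100 older, 10% of 1000 younger.
example = summarise_usage({"older": (100, 50), "younger": (1000, 100)})

# More 'younger' markers appear on the map ...
more_younger_markers = example["younger"]["users"] > example["older"]["users"]
# ... even though the form is proportionally far more common among 'older'.
older_more_likely = example["older"]["share"] > example["younger"]["share"]
```

Plotting the `non_users` counts alongside the `users` counts would let a viewer see that the 100 'younger' markers sit among 900 non-users, while the 50 'older' markers sit among only 50.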

I then moved onto the follow-on activities for the ‘Sounds about right’ section.  This involved creating a ‘drag and drop’ feature where possible answers need to be dropped into boxes.  The mockup suggested that the draggable boxes should disappear from the list of options when dropped elsewhere but I’ve changed it so that the choices don’t disappear from the list, but instead the action copies the contents to the dotted area when you drop your selection.  The reason I’ve done it this way is that if the entire contents move over we could end up with someone dropping several into one box, or if they drop an option into the wrong box they would then have to drag it from the wrong box into the right one before they can try another word in the same box and it can all get very messy (e.g. if there are several words dropped into one box then do we consider this ‘correct’ if one of the words is the right one?).  This way keeps things a lot simpler.  However, it does mean the words the user has already successfully dropped still appear as selectable in the list, which might confuse people, so I could disable or remove an option once it’s been correctly placed.  Below is a screenshot of the activity with one of the options dropped:

The next activity asks people to see whether rules apply to all words with the same sounds by selecting ‘yes’ or ‘no’ for each.  I set it up so that the ‘Check answers’ button only appears once the user has selected ‘yes’ or ‘no’ for all of the words, and on checking the answers a tick or a cross is added to the right of the ‘yes’ and ‘no’ options.  The user must correct any wrong answers and select ‘Check answers’ again before the ‘Check answers’ button is replaced with a ‘Next’ button.  See a screenshot below:

With these in place I then moved onto the ‘perception’ activity, which I’d started to look into last week.  I completed stages 1 and 2 of this activity, allowing the user to rate how they think a person from a region sounds using the seven sliding scales as criteria, as you can see below:

And then rating actual sound clips of speakers from certain areas using the same seven criteria, as the screenshot below shows:

Finally, I created the ‘explore more’ option for the perception activity, which consists of two sections.  The first allows the user to select a region and view the average rating given by all respondents for that region, plotted on ‘read only’ versions of the same sliding scales.  The team had requested that the scales animate to their new locations when a new region is selected and although it took me a little bit of time to implement this I got it working in the end and I think it works really well.  The second option is very similar, only it allows the user to select both the speaker and the listener, so you can see (for example) how people from Glasgow rate people from Edinburgh.  At the moment we don’t have information in the system that links up a user and the broader region, so for now this option is using sample data, but the actual system is fully operational.  Below is a screenshot of the first ‘explore’ option:

I feel like I’ve made really good progress with the project this week, but there is still a lot more to implement and I’ll continue with this next week.

I spent Friday working on another project, generating some views of performance data relating to performances of The Gentle Shepherd by Allan Ramsay ahead of a project launch at the end of the month.  I’d been given a spreadsheet of the data so my first step was to write a little script to extract the data, format it (e.g. extracting years from the dates) and save it as JSON, which I would then use to generate a timeline, a table view and a map-based view.  On Friday I completed an initial version of the timeline view and the table view.
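As a rough idea of what such an extraction script involves, here is a minimal Python sketch.  It is not the script I actually wrote: it assumes the spreadsheet has been exported as CSV, the column name ‘date’ and the sample row are made up, and the year is simply taken to be the first four-digit number found in the date text.

```python
import csv
import io
import json
import re

# Assumption: a year is the first four-digit number from 1500 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def extract_year(date_text):
    """Return the first plausible year in a free-text date, or None."""
    match = YEAR_PATTERN.search(date_text)
    return int(match.group(1)) if match else None


def csv_to_json(csv_text, date_column="date"):
    """Convert CSV text to a JSON array, adding a numeric 'year' to each row."""
    reader = csv.DictReader(io.StringIO(csv_text), restval="")
    records = []
    for row in reader:
        # Trim stray whitespace; skip any overflow cells (stored under None).
        record = {key: (value or "").strip()
                  for key, value in row.items() if key is not None}
        record["year"] = extract_year(record.get(date_column, ""))
        records.append(record)
    return json.dumps(records, indent=2)


# Made-up sample rows, purely to show the shape of the output.
sample = "date,venue\n12 March 1780,Example Theatre\nunknown,Another Venue\n"
output = csv_to_json(sample)
```

Rows where no year can be found end up with a `year` of `null` in the JSON, which makes them easy to spot and check by hand.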

I made the timeline vertical rather than horizontal as there are so many years and so much data that a horizontal timeline would be very long, and these days most people use touchscreens and are more used to scrolling down a page than along a page.  I added a ‘jump to year’ feature that lists all of the years as buttons.  Pressing on one of these scrolls to the appropriate year.  There are rather a lot of years so I’ve hidden them in a ‘Jump to Year’ section.  It may be better to have a drop-down list of options instead and I’ll maybe change this.  Each year has a header and a dividing line and a ‘top’ button that allows you to quickly scroll back to the top of the timeline.  Each item in the timeline is listed in a fixed-width box, with multiple boxes per row depending on your screen width and the data available.  Currently all fields are displayed, but this can be changed.

The table view displays all of the data in a table.  You can click on a column heading to sort the data by that heading.  Pressing a heading a second time reverses the order.  I still need to add in the filter options to the table view and then work on the map view once I’m given the latitude and longitude data that is still needed for this view to work.  I’ll continue with this next week.
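The sorting behaviour described above (a first press sorts ascending, a second press on the same heading reverses the order) boils down to a small piece of state.  The sketch below shows the logic in Python purely for illustration; the class and method names are hypothetical and the real table view runs in the browser.

```python
class SortableTable:
    """Keeps track of which column is sorted and in which direction."""

    def __init__(self, rows):
        self.rows = list(rows)      # each row is a dict keyed by column name
        self.sort_column = None
        self.descending = False

    def click_heading(self, column):
        """Sort by `column`; pressing the same heading again reverses the order."""
        if column == self.sort_column:
            self.descending = not self.descending
        else:
            self.sort_column = column
            self.descending = False
        self.rows.sort(key=lambda row: row[column], reverse=self.descending)
        return self.rows


# Made-up rows to demonstrate the toggle.
table = SortableTable([{"year": 1790}, {"year": 1750}, {"year": 1770}])
first_click = [row["year"] for row in table.click_heading("year")]
second_click = [row["year"] for row in table.click_heading("year")]
```

Pressing a different heading resets the direction to ascending, which matches what most people expect from a sortable table.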

Also this week I made a couple of minor tweaks to the DSL website and had some discussions with the DSL people about the SLD data and the fate of the old DSL website.  I also updated some of the data for the Books and Borrowing project and had a chat with Thomas Clancy about hosting an external website that is in danger of disappearing.