I split most of my time this week between the SCOSYA project and the Historical Thesaurus. The launch of the SCOSYA atlases is scheduled to take place in November and I had suggested to Jennifer that it might be good to provide access to the project’s data via tables rather than through the atlas interfaces. This is because although the atlases look great and are a nice interactive way of accessing and visualising the data, some people prefer looking at tables of data instead, and other people may struggle to use the interactive atlases due to accessibility issues, but may still want to be able to view the project’s data. We will of course provide free access to the project’s API, through which all of the data can be accessed as CSV or JSON files, or can even be incorporated into a completely new interface, but I thought it might be useful if we provided text-based access to the data through the project’s front-end as well. Jennifer agreed that this would be useful, so I spent some time writing a specification document for the new features, sending it to the team for feedback and developing the new features.
I created four new features. First was a table of dialect samples, which lists all of the locations that have dialect sample recordings and provides access to these recordings and the text that accompanies them, replicating the data as found on the ‘home’ map of the public atlas. The second feature provides a tabular list of all of the locations that have community voice recordings. Clicking on a location then displays the recordings and the transcriptions of each, as the following screenshot shows:
The third new feature lists all of the examples that can be browsed through the public atlas. You can click on any of these to listen to its sound clips and to view a table of results for all of the questionnaire locations. Users can also click through to view the example on the atlas itself, as I figured that some people might want to view the results as a table and then see how they look on the atlas. The following screenshot shows the ‘explore’ feature for a particular example:
The fourth new feature replicates the full list of examples as found in the linguists’ atlas. There are many examples nested within parent and sub-parent categories and it can be a bit difficult to get a clear picture of what is available through the nested menus in the atlas, so this new feature provides access to a complete list of the examples that is fully expanded and easier to view, as the following screenshot demonstrates:
It’s then possible to click on an example to view the results of this example for every location in a table, again with a link through to the result on the atlas, which then enables the user to customise the display of results further, for example focussing on older or younger speakers or limiting the display to particular rating levels.
Finally for the project this week I met with Jennifer and E to discuss the ancillary pages and text that need to be put in place before the launch, and we discussed the launch itself and what this would involve.
For the HT I generated some datasets that an external researcher had requested from the Thesaurus of Old English data, and I generated some further datasets from the main HT database for another request. I also started to implement a system to generate the new dates table. I created the necessary table and wrote a function that takes a lexeme and goes through all of the 19 date fields to generate the rows that would need to be created for the lexeme. I haven’t yet set this running on the whole dataset, but instead I’ve created a test script that allows you to pass a catid and view all of the date rows that would be created for each lexeme in the category so I (and Marc and Fraser) can test things out. I’ve tested it out with categories that have some complicated date structures and so far I’ve not encountered any unexpected behaviour, apart from one thing: some lexemes have a full date such as ‘1623 Dict. + 1642–1711’. The script doesn’t analyse the ‘fulldate’ field but instead looks at each of the actual date fields. There is only one ‘label’ field, so it’s not possible to ascertain that in this case the label is associated with the first date. Instead the script always associates the label with the last date that a lexeme has. I’m not sure how common it is for a label to appear in the middle of a full date, but it definitely crops up fairly regularly when I load a random category on the HT homepage, always as ‘Dict.’ so far. We’ll need to see what we can do about this if it turns out to be important, which I guess it probably will.
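To make the label issue a bit more concrete, here is a minimal sketch in Python of the row-generation logic described above. It is not the actual script: the field names (`first_date`, `mid_date`, `last_date`, `label`) and the row structure are hypothetical, and only three date fields are used in place of the real 19, which are not listed here. It simply shows how having a single label field leads to the label ending up on the last date row.

```python
# Hypothetical stand-ins for the real date fields (the actual table has 19).
DATE_FIELDS = ["first_date", "mid_date", "last_date"]


def build_date_rows(lexeme: dict) -> list[dict]:
    """Return one row per populated date field for a single lexeme."""
    rows = []
    for field in DATE_FIELDS:
        year = lexeme.get(field)
        if year is None:
            continue  # skip date fields that are empty for this lexeme
        rows.append({
            "lexeme_id": lexeme["id"],
            "source_field": field,
            "year": year,
            "label": None,
        })
    # There is only one label field, so nothing records which date it
    # belongs to. This sketch mirrors the behaviour described above and
    # always attaches it to the last date row.
    if rows and lexeme.get("label"):
        rows[-1]["label"] = lexeme["label"]
    return rows


def preview_category(catid: int, lexemes: list[dict]) -> dict:
    """Test helper (hypothetical): date rows for every lexeme in a category."""
    return {
        lex["id"]: build_date_rows(lex)
        for lex in lexemes
        if lex.get("catid") == catid
    }


# A lexeme whose full date would read '1623 Dict. + 1642–1711'.
example = {
    "id": 1, "catid": 42,
    "first_date": 1623, "mid_date": 1642, "last_date": 1711,
    "label": "Dict.",
}
rows = build_date_rows(example)
```

With this example lexeme the ‘Dict.’ label lands on the 1711 row rather than the 1623 row, which is the mismatch noted above. Fixing it would need some other source of information about label position, such as the ‘fulldate’ field.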
Also this week I performed some App Manager duties, had a conversation with Dauvit Broun, Nigel Leask and the RDM team about the ArtsLab session on data management plans next week, and spoke to Ann Ferguson of the DSL about how changes to the XML structure of entries will be reflected in the front-end of the DSL.