This week the Scots School Dictionary app I developed went live in the Apple App Store and the Android Play store! (click on the links to view the information about the apps). Other than a few further bits of App related business, I spent a lot of this week further developing the Mapping Metaphor website.
I’ve been working on alternative views of the data. It’s now possible to view the data for any ‘drilldown’ visualisation as a table (I’ll be adding this functionality to the aggregate view too but haven’t done so yet). When viewing the visualisation for a category, underneath the ‘info box’ on the left there’s now a ‘view’ section. Clicking ‘Change view’ opens up a list of alternative view options (visualisation, table, card and timeline). Select ‘Tabular view’ and you’re now presented with two tables – one showing the open L3 to L3 connections (e.g. all the Vs and Bs) and one showing the aggregate data (e.g. the number of connections each L3 ‘V’ category has with categories within each of the remaining L2 categories). If you’ve selected an L3 category the rows in the tables that relate to this category are highlighted with the yellow border.
In the ‘info box’ you can still access the category info for the selected category and change the metaphor strength. There is also a ‘download’ option and although this doesn’t work yet it will eventually allow you to download the tables as CSV files for use in Excel. The ‘key’ option will also be updated to show information relating to the tables too. You can order the tables by clicking on their headings (click a second time to reverse the order) and I’ve also ensured that the selected category always appears in the ‘category 1’ column. In the earlier ‘view category’ table view the selected category (e.g. V01) would appear in either the ‘category 1’ or the ‘category 2’ column depending on how the data was originally coded, which made the tables somewhat difficult to use and inconsistent.
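As a rough illustration of the 'selected category always in category 1' behaviour and the click-to-sort headings, here is a minimal Python sketch. It is not the site's actual code: the `Connection` tuple, its field names and the sample rows are all made up for the example.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    # Hypothetical row shape; the real tables hold more columns than this.
    category_1: str
    category_2: str
    strength: str


def orient_rows(rows, selected):
    """Return rows with `selected` always in the category_1 position."""
    oriented = []
    for row in rows:
        if row.category_2 == selected and row.category_1 != selected:
            # Swap the two category columns; everything else is untouched.
            row = Connection(row.category_2, row.category_1, row.strength)
        oriented.append(row)
    return oriented


def sort_rows(rows, column, reverse=False):
    """Order rows by a named column; reverse=True mimics a second click."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=reverse)


# Made-up sample data: V01 was coded on different sides of the two rows.
rows = [Connection("B20", "V01", "strong"), Connection("V01", "D05", "weak")]
oriented = orient_rows(rows, "V01")
```

Swapping at the point where rows are prepared means the sorting and display code never has to care how a connection was originally coded.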
It’s possible to return to the visualisation view from the table view at the click of a button and it is all working rather smoothly (so far as I can tell at least). There are still things I need to tweak (e.g. adding in table headings and colour coding categories) but it’s getting there. I’ll start work on the card and timeline views once this is complete.
The table view linked to directly from all visualisations is going to replace the existing category view that is reached from the ‘browse’ facility. The new tabular view provides a way of viewing the textual data that’s much more integrated with the rest of the site and I think works a lot better than the old view. I’m hoping that the card and timeline views will fall nicely into place once the table view has been fully completed. Extracting and formatting the data for use in the table required quite different techniques to those used for the visualisation so it took quite a lot of time to get the queries and the structure right, but this same structure should be usable for the card view and with a few tweaks for the timeline view too.
Other than working on the table view I made a few further tweaks and fixes to the visualisations, for example to the way in which the number of aggregate links in the hybrid visualisation is calculated. These counts previously defaulted to the total number of metaphorical connections and ignored the selected metaphor strength but now they take the strength into consideration.
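The strength fix boils down to filtering before counting. A minimal Python sketch, assuming connections are simple (source, target, strength) tuples and that strength values look like 'strong' and 'weak' (both are assumptions for the example, not the project's real data model):

```python
from collections import Counter


def aggregate_link_counts(connections, strength=None):
    """Count links per target L2 category, optionally for one strength only.

    `connections` is an iterable of (source_id, target_id, strength) tuples.
    With strength=None every connection is counted (the old behaviour).
    """
    counts = Counter()
    for _source, target, link_strength in connections:
        if strength is not None and link_strength != strength:
            continue  # skip links that don't match the selected strength
        counts[target[0]] += 1  # L2 category is the first character of the ID
    return counts


# Made-up sample connections from V01 to categories in B and D.
sample = [
    ("V01", "B20", "strong"),
    ("V01", "B21", "weak"),
    ("V01", "D05", "strong"),
]
```

Calling `aggregate_link_counts(sample)` counts everything, while passing `strength="strong"` only counts the links that match the selection.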
I also spent some time this week working on the Scots Thesaurus project. I investigated how to query the DSL XML data using XQuery FLWOR expressions in the BaseX front-end. I’ve created a little query that searches through all the ‘senses’ (ignoring the citations) for a supplied search term and then returns those entries that include the term. This will form an important part of the automated workflow that will allow a researcher to search the Historical Thesaurus of English for words and then automatically search the DSL for the words returned. I’ve begun working on a tool that will connect to both the HTE’s MySQL database and the DSL’s BaseX XML database and will provide an interface for researchers to use to do the following:
1. enter a word (e.g. ‘golf’) to find all lexemes and categories in the HTE that include this (e.g. all lexemes within a category that includes the word ‘golf’ plus any other lexemes that include this word)
2. view the returned results and filter out the ones that aren’t relevant (e.g. ‘hagolfaru’)
3. submit this filtered list to BaseX (containing the DSL XML), which will then automatically search the XML for occurrences of each word within entries, excluding citations.
4. view and download all entries that are returned from the BaseX query.
So far I’ve got steps one and two completed but I haven’t got BaseX installed on a server yet so I can’t integrate the DSL side of things. I’ve submitted a request to install BaseX to Arts Support so hopefully I’ll be able to finish work on this little tool next week.
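To make steps two and three a bit more concrete, here is a small Python stand-in for the idea of searching senses while skipping citations. It is not the BaseX query itself, and the element names (`entry`, `sense`, `cit`), the `id` attribute and the sample XML are all hypothetical; the real DSL markup will differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature of the dictionary XML, for illustration only.
SAMPLE = """<dictionary>
  <entry id="e1"><sense>A game resembling golf<cit>He played at the golf</cit></sense></entry>
  <entry id="e2"><sense>A blow<cit>struck like a golf ball</cit></sense></entry>
</dictionary>"""


def text_without_citations(element):
    """Collect the text of an element, skipping anything inside <cit>."""
    parts = []

    def walk(el):
        if el.tag == "cit":
            return  # ignore citation text entirely
        if el.text:
            parts.append(el.text)
        for child in el:
            walk(child)
            if child.tail:
                parts.append(child.tail)  # tail text belongs to the parent

    walk(element)
    return " ".join(parts)


def find_entries(root, words):
    """Map each word to the IDs of entries whose senses mention it."""
    results = {}
    for word in words:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        results[word] = [
            entry.get("id")
            for entry in root.iter("entry")
            if any(pattern.search(text_without_citations(s)) for s in entry.iter("sense"))
        ]
    return results


# Step two: drop the irrelevant words before searching (step three).
candidates = ["golf", "hagolfaru", "blow"]
rejected = {"hagolfaru"}
words = [w for w in candidates if w not in rejected]
matches = find_entries(ET.fromstring(SAMPLE), words)
```

In this sample the second entry only mentions 'golf' inside a citation, so it is not returned for that word.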
This week was almost entirely devoted to completing work on the Scots School Dictionary app. I had previously completed a second version of the app based on feedback received from various people and had then received feedback on this updated version, resulting in a small number of tweaks being required. I had hoped that implementing these changes, wrapping the apps for iOS and Android, testing these wrapped versions and submitting them to the App and Play stores would only take up 2-3 days but in the end these tasks took up 4-5 days. I’d forgotten just how many hoops you need to jump through in order to submit an App to Apple, alas.
The updates resulting from feedback took about half a day to get sorted. These included adding in the proper introduction text and the help text that was supplied by Chris and fixing the ‘full text’ search. Rather foolishly, I’d set the ‘full text’ search to search the description of entries but not the headword itself. This led to no results being found when the headword didn’t actually appear in the description, which was a bit of a confusing situation. The search facility works a lot better now that I’ve fixed this little quirk.
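The fix amounts to checking both fields rather than just one. A minimal Python sketch of the idea, assuming each entry is a dictionary with `headword` and `description` keys (these names and the sample entries are placeholders, not the app's real data structure):

```python
def full_text_search(entries, term):
    """Return entries whose headword OR description contains the term."""
    needle = term.lower()
    return [
        entry
        for entry in entries
        if needle in entry["headword"].lower()
        or needle in entry["description"].lower()
    ]


# Placeholder entries: the first headword never appears in its description.
entries = [
    {"headword": "wey", "description": "placeholder description one"},
    {"headword": "other", "description": "mentions wey in passing"},
]
found = full_text_search(entries, "wey")
```

With a description-only search the first entry would be missed, which is exactly the confusing situation described above.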
1. External links opened up the target website within the app, replacing the app itself and then offering no way for the user to get back to the app. (Well, with Android devices you always have a handy ‘back’ button available but iOS devices have no such button).
2. Sound files wouldn’t play in the Android version of the app. Using the HTML5 audio tag works fine in iOS. The play / pause button looks a bit clunky but the sounds play without any problems. The same cannot be said for Android. The HTML 5 audio ‘play’ button appears but pressing it does nothing. A bit of Googling revealed one possible cause for this – Android devices need a different path to the sound file directory to be specified – ‘/android_asset/www/’ needs to be added before any other directory you have. However, even after adding this I just couldn’t get any sounds to play.
With the app now fully working in iOS and Android emulators it was time to test them out on actual devices. I tried the app on my iPad and my Nexus 4 phone and didn’t run into any difficulties whatsoever. The next step was to actually submit the app to the stores. For Apple devices this is a somewhat laborious process that involves generating icons at a rather ridiculous number of sizes, creating screenshots of the app running on a wide variety of screen sizes and dealing with provisioning profiles and iTunes Connect. I simply couldn’t figure out how to submit the app to iTunes Connect from within Xcode following the instructions given by Apple and this seemed to be a problem with my account already being associated with the University of Glasgow account. In the end I had to submit it via the Application Loader utility instead, using Chris’s account rather than my own. I eventually got the app submitted though.
I’d never submitted an Android app to the Play store before and the first hurdle I encountered was the fee charged to do so. Although Apple charges developers an annual fee of $100 to publish stuff on the App Store I had thought it was free for Android. However, it turns out that you do still have to pay a one-off fee of $25. Chris sorted this out and I began the submission process. For Android you need to get Cordova to build a release version (the default is a debug version) and this can be accomplished by the command ‘cordova build android --release’, which creates an APK file in the ‘platforms/android/ant-build’ directory. I also realised that I needed to replace the default Cordova icons with proper icons. Four sizes were required (36, 48, 72 and 96 pixels), replacing the ‘icon.png’ files within various folders within ‘platforms/res’. A rebuild was required to pull these in.
The Android app file is ‘unsigned’ at this stage and it needs to be signed before the app can be submitted. I achieved this by following the ‘Signing your app manually’ steps detailed on this page: http://developer.android.com/tools/publishing/app-signing.html . After completing this I managed to upload the APK file, supply store icons, screenshots, descriptions etc and submit the app for publication. Apple can take up to two weeks to approve an app while Google is a little more swift (and no doubt less thorough). I submitted the app to Google on Friday and on Saturday the app was available from the Google Play store. If you have an Android device, search for ‘Scots Dictionary’ and you will find it. Phew!
Other than app related stuff I had a meeting with Gerry Carruthers to discuss a project he’s putting together and made a couple of tweaks to the Digital Humanities Network website. Next week I intend to return to Mapping Metaphor work and also to spend some more time on the Scots Thesaurus.
It’s hard to believe we’re into November already – scary stuff. Anyway, on with the weekly report: My time this week was mostly devoted to Mapping Metaphor duties, although I had to spend roughly 1.5 days on AHRC duties reviewing more Technical Plans – they just keep coming in.
On Monday we had a Mapping Metaphor team meeting, which was useful to participate in, as always. Other than that I continued with development duties. I’ve now implemented the bulk of the updates to the visualisations that are on my ‘to do’ list and I’ve started to move on to some of the other items (e.g. metaphor cards). I’ll return to some of the outstanding items in the visualisation part of my list at a later date.
Last week I started to implement little rectangular markers to differentiate L3 categories and L2 categories in the ‘hybrid’ view, and I continued with this work this week. These markers also now appear as a means of identifying L3 categories in other views too (e.g. when viewing all the L3 connections between two L2 categories or when viewing the L3 categories connected to either / both selected L3 categories). I also updated the ‘Download SVG’ option so that the exported SVGs reflect these changes too.
I was originally intending to use arcs around the circle rather than individual rectangles at each label, and this effect is given when viewing a crowded visualisation (e.g. drill down into category B and then select to expand category D), but on less populated visualisations the rectangles are quite distinct (e.g. drill down into category V and expand O). I’m not really sure if I like how this looks or not. The advantage of using individual rectangles as opposed to arcs (other than it being easier to implement) is that it is easy to then highlight these rectangles to make it clearer where the connections are (e.g. drill down into B, expand D then select B20 Plants).
I made some further revisions to the ‘info box’ this week. I’ve reworked the ‘key’ again as it was adding too much clutter to the info box. It is now a pop-up that you can move around the screen and resize. The descriptions in the key will need some work, especially the descriptions for the purple and orange rectangles. Having ‘categories within categories’ is rather confusing.
I’ve realised that I’m going to run into a potentially sizable problem once the L2 categories are reclassified. My code expects the L2 category to be the first character in a category ID and for the L3 category to be the first 3 characters. If ‘D’ is being replaced by ’22’ or whatever two character code it is then everything is going to break. There are currently 26 L2 categories with the letter of the alphabet as ID. But after the reclassification there will be about 38 categories so a single character just isn’t going to work. My code does a lot of processing based on the assumption that L2 category is the first character in any category ID so I’m going to have to rewrite a lot of it. That’s not going to be a fun task.
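One way to soften this sort of rewrite is to keep the ID-splitting logic in a single helper rather than scattering the 'first character' assumption through the code. A minimal Python sketch, assuming (hypothetically) that the new IDs are simply a two-character L2 code followed by the same two-digit L3 number; the real reclassified format may well differ:

```python
def split_category_id(category_id, l2_length=1, l3_digits=2):
    """Return (l2, l3) for a category ID.

    l2_length is the number of characters in the L2 code: 1 for the
    current letter-based scheme, 2 for a hypothetical two-character scheme.
    """
    l2 = category_id[:l2_length]
    l3 = category_id[: l2_length + l3_digits]
    return l2, l3


# Current scheme: 'V01' belongs to L2 category 'V'.
current = split_category_id("V01")

# Hypothetical new scheme: '2201' belongs to L2 category '22'.
reclassified = split_category_id("2201", l2_length=2)
```

If every piece of code asks this helper for the L2 and L3 parts, a change to the ID format only needs to be made in one place.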
Next week I will continue to implement the alternative views, plus I will need to spend some time wrapping the Scots School Dictionary app and also begin working with the HT data for the Scots Thesaurus project.
I spent a couple of days this week reworking the Scots School dictionary app that I’m developing for Scottish Language Dictionaries. Over the past couple of weeks I have received feedback from a number of people about the first version I put together over the summer and last week created a ‘to do’ list of things that needed tweaked. I’m happy to say that I managed to complete all of these items this week and by Friday I had sent out the URL for a new version for people to try out. I can’t really share the URL here, but hopefully it won’t be long now until the app is available through the iOS App Store and the Android Play Store. The main tasks I managed to tick off were:
1. Updating the interface slightly, mainly just tweaking the colours a bit
2. Creating an introduction page with some sample text (to be replaced with real text). This is now accessible by clicking on the ‘Scots Dictionary for Schools’ header at the top of every page.
3. Creating a ‘random word’ feature for the welcome page, with a button allowing users to load a new word.
4. Creating a placeholder ‘help’ page
5. Updating the footer to include the SLD logo on the right and some text on the left. The text changes each time the page fully reloads (only when you navigate between the intro page, browse page, search page and help page, not when you navigate between words on a page).
6. Ensuring the ‘back to search results’ and ‘back to letter’ links when viewing a word now take you to the relevant part of the list rather than dumping you at the top of the page
7. Adding in form numbers, which now appear in grey in the word page (e.g. ‘wey (1)’)
8. Ensuring that if there’s only one search result the navigate between results buttons are now hidden
9. Adding in search term highlighting, with the search term highlighted in yellow in the word page
10. Making all words in ‘related words’ and ‘meaning’ sections clickable. Click on one to bring up a popup allowing you to select ‘Scots’ or ‘English’ then press search to search the full text for this word.
11. Ensuring the ‘search’ button no longer gets hidden by the footer. You still have to scroll down the page to see it, but at least it should be visible on all screens now.
12. Removing the confusing ‘<<‘ and ‘>>’ symbols and instead styling the text within these as dark blue and bold to make them stand out.
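The search term highlighting in item 9 can be sketched roughly as follows. This is a Python illustration of the general technique rather than the app's own code, and the `highlight` class name is an assumption; the yellow colour would come from a stylesheet.

```python
import html
import re


def highlight(text, term):
    """Wrap each case-insensitive match of `term` in a highlighting span."""
    if not term:
        return html.escape(text)
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the text between matches so it is safe to insert as HTML.
        parts.append(html.escape(text[last:match.start()]))
        parts.append('<span class="highlight">' + html.escape(match.group(0)) + "</span>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


marked = highlight("Wey and wey", "wey")
```

Escaping the search term with `re.escape` stops characters such as '(' or '.' in a search from being treated as pattern syntax.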
A further day was mostly taken up with AHRC duties, reviewing yet another technical plan. I’ve got another one to do next week too. I also met with Scott Spurlock from Theology to discuss a project he is putting together. He was interested in the possibilities of performing the equivalent of OCR on handwritten texts, more specifically on cursive, historical texts written by many different hands. I spent a bit of time researching this possibility and speaking to other developers in the College about it but it’s not looking massively promising. There is a Wikipedia page on handwriting recognition (http://en.wikipedia.org/wiki/Handwriting_recognition) but it rather unpromisingly states “There is no OCR/ICR engine that supports handwriting recognition as of today.” The current state of the art seems to be tools that can either:
1. Recognise individual handwritten printed characters in separate boxes (e.g. in forms or post codes)
2. Be trained to recognise the handwriting of one individual, converting this automatically to machine readable text on the fly as a user writes on a touchscreen.
Neither of these approaches would suit the project, which has cursive texts written by numerous hands. There is a concept called ‘Intelligent Word Recognition’ (http://en.wikipedia.org/wiki/Intelligent_word_recognition) and there may be tools in this field that are worth pursuing. These tools aim to extract and pattern match words rather than attempting to split words into individual characters. Unfortunately information about products that claim to be able to achieve this is rather vague. I’ve found this product http://www.a2ia.com/en/handwriting-recognition and their ‘white paper’ (http://www.a2ia.com/sites/default/files/industry_solutions/a2ia-using_iwr_to_cut_labor_costs_without_outsourcing.pdf) provides quite a lot of information about how their product works and it does look sort of promising, but I think it is intended to work on free-text boxes in forms rather than page after page of cursive text.
I also found this blog post: http://blog.parascript.com/icr-software-101-handprint-recognition that discusses ‘handprint recognition’ and links to another ‘white paper’ at the bottom (you need to subscribe to receive it and I haven’t done this but the blurb does state ‘Advanced ICR technology thinks like a human to process documents that include any type of handwriting, including unconstrained handprint, cursive and more’). However I’m a little sceptical as ‘handprint’ means individually printed characters.
Other developers in the College of Arts that I have spoken to thought that current technology would not be able to automatically extract text from the sorts of handwritten historical documents the project will likely be dealing with and the general consensus was that crowdsourcing or outsourcing transcription would be more suitable. However, it’s an emerging technology that is worth keeping track of.
Other than fixing a bug with the Scottish Corpus (a server setting was limiting the number of documents that could be downloaded at once to 1000) and dealing with the DSL website stopping working briefly I spent the rest of the week on Mapping Metaphor duties. I completed the updates to the way the ‘centre on category’ feature works, as discussed last week. I also completed the reformatting of the ‘info box’, adding the ‘key’ information to the bottom of it and reducing the amount of space taken up by the links to other categories, the ‘view info’ button, the ‘download’ button etc.
The big task I worked on for the project this week was to try and get some sort of background colour for the visualisation labels. Why is this required? Mainly to allow L3 categories in the hybrid view to stand out from the L2 categories – so that the L3 categories within the L2 category you’re looking at can have one background colour and the L3 categories that these link to in the L2 category you’ve opened can have another colour.
It is not as straightforward as you might think to get a text background colour as SVG does not allow text elements to have background colour styling. Instead what you need to do is create a new rectangle object that is the right dimensions and position and then place this behind the text item. I found a possible way of doing this here: http://stackoverflow.com/questions/15500894/background-color-of-text-in-svg and after a lot of tweaking I managed to get all of the L3 categories in the selected L2 category with a red background colour (for test purposes). I wasn’t really very satisfied with how this looked though. The labels are positioned round a circle so the background colours jutted out as individual spokes. I decided that having a small rectangle on the side of the label nearest the inner circle looked a lot better and implemented this instead. These small rectangles aren’t ‘text background’ but are just added to the SVG group in the same way as the node circles I added a few weeks ago. I think this looks ok, and it allows the selected or linked to category to be given a differently coloured rectangle – something I will try to implement next week.
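The small-rectangle approach can be sketched like this. It is a Python illustration that just builds the SVG markup as a string, not the site's actual drawing code; the radius, marker size, colour and the labels other than 'B20 Plants' are invented for the example, and it ignores niceties such as flipping labels on the left-hand side of the circle so they aren't upside down.

```python
from xml.sax.saxutils import escape


def label_with_marker(label, index, total, radius=200, marker=(6, 10), colour="purple"):
    """Return an SVG group holding a marker rectangle followed by its label.

    The group is rotated to the label's angle and pushed out to the circle,
    so the rectangle sits on the side of the label nearest the inner circle.
    """
    angle = 360.0 * index / total
    width, height = marker
    return (
        f'<g transform="rotate({angle:.2f}) translate({radius},0)">'
        f'<rect x="0" y="{-height / 2}" width="{width}" height="{height}" fill="{colour}"/>'
        f'<text x="{width + 3}" dy="0.35em">{escape(label)}</text>'
        "</g>"
    )


def build_svg(labels, size=600, radius=200):
    """Lay the labels out evenly round a circle centred in the image."""
    centre = size / 2
    groups = "".join(
        label_with_marker(label, i, len(labels), radius) for i, label in enumerate(labels)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<g transform="translate({centre},{centre})">{groups}</g></svg>'
    )


svg = build_svg(["B20 Plants", "B21", "D05"])
```

Because each rectangle is its own element, giving the selected or linked-to category a different colour is just a matter of passing a different `colour` value for that label.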