This week was predominantly spent working on the Historical Thesaurus redesign, both the database and the page design for the front end. For the database I created a bunch of upload and data processing scripts to get the almost 800,000 rows of data from the Access database into the new MySQL database that will power the website. Despite stating last week that I wouldn’t change the structure of the data, this week I decided to do just that, moving the 13 fields that make up category information to a dedicated category table rather than keeping this information as part of the lexeme table. Splitting the information up reduces the amount of needlessly repeated data. For example, there are up to 50 lexemes in each category, and previously all 13 category fields were being repeated up to 50 times, whereas now the information is stored once and then linked to the lexeme table, which is much neater.
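To illustrate the kind of split described above, here is a minimal sketch in Python. It is not the actual upload script: it uses an in-memory SQLite database as a stand-in for MySQL so that it runs on its own, and the table layout, column names and sample rows are all assumptions made for the example (three category fields stand in for the real 13).

```python
import sqlite3

# Hypothetical flat rows, roughly as they might look in a single combined
# table: (category number, heading, part of speech, word).
# The words are placeholders, not real Historical Thesaurus data.
flat_rows = [
    ("01.01.06.01", "Europe", "n", "placeholder word A"),
    ("01.01.06.01", "Europe", "n", "placeholder word B"),
    ("01.01.06.02", "Another heading", "n", "placeholder word C"),
]


def migrate(rows):
    """Load flat rows into separate category and lexeme tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE category (
            id      INTEGER PRIMARY KEY,
            number  TEXT NOT NULL,
            heading TEXT NOT NULL,
            pos     TEXT NOT NULL,
            UNIQUE (number, heading, pos)
        );
        CREATE TABLE lexeme (
            id          INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL REFERENCES category(id),
            word        TEXT NOT NULL
        );
        -- Index to speed up "all lexemes in this category" lookups.
        CREATE INDEX lexeme_category ON lexeme (category_id);
        """
    )
    for number, heading, pos, word in rows:
        # Store each distinct category once; repeats are ignored.
        conn.execute(
            "INSERT OR IGNORE INTO category (number, heading, pos) "
            "VALUES (?, ?, ?)",
            (number, heading, pos),
        )
        (category_id,) = conn.execute(
            "SELECT id FROM category "
            "WHERE number = ? AND heading = ? AND pos = ?",
            (number, heading, pos),
        ).fetchone()
        # The lexeme row now only carries a link to its category.
        conn.execute(
            "INSERT INTO lexeme (category_id, word) VALUES (?, ?)",
            (category_id, word),
        )
    conn.commit()
    return conn


conn = migrate(flat_rows)
```

With the sample rows, the two lexemes that share a category end up pointing at a single category row, so the category fields are stored once rather than once per lexeme.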
By the end of the week I had all of the data migrated into the new database structure, with a number of indices in place to make data retrieval speedier too. One slight issue with the data was that ‘empty’ categories in the hierarchy (i.e. ones that don’t have any associated lexemes) are not present in the Access database. This makes sense when you’re focussing on lexemes, but in order to develop a browse option or present breadcrumbs the full hierarchy is needed. For example, 01.01.06.01 is ‘Europe’ and its parent category is 01.01.06, ‘regions of the earth’. But as this parent category has no lexemes of its own it isn’t represented in the database. I met with Marc on Thursday and he managed to get a complete list of the categories to me, including the ‘empty’ ones, and I spent some time working on a script that would pull out the empty ones and add them to my new ‘category’ table. While doing this I came across a few errors in the data, where the full combination of headings and part of speech was not unique. I also noticed that I had somehow made an error in my database structure, missing out three part of speech types. Rectifying this will mean re-uploading all the data, which I will do next week.
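The two checks mentioned above can be sketched in Python as follows. This is not the script I actually wrote, which worked from Marc’s complete list; it is a rough illustration that assumes the hierarchy can be read directly from the dotted category numbers, so that dropping the last segment gives the parent. The function names and the (number, heading, part of speech) tuple layout are hypothetical.

```python
from collections import Counter


def parent_of(number):
    """Return the parent category number, or None for a top-level one."""
    head, sep, _ = number.rpartition(".")
    return head if sep else None


def missing_ancestors(present):
    """List ancestor categories that are implied by the numbers in
    `present` but are not themselves in it (the 'empty' categories)."""
    present = set(present)
    missing = set()
    for number in present:
        parent = parent_of(number)
        # Walk up the hierarchy until we reach a known category or the top.
        while parent is not None and parent not in present and parent not in missing:
            missing.add(parent)
            parent = parent_of(parent)
    return sorted(missing)


def duplicate_keys(rows):
    """Return (number, heading, pos) combinations that occur more than once."""
    counts = Counter(rows)
    return sorted(key for key, n in counts.items() if n > 1)
```

For instance, `missing_ancestors(["01.01.06.01", "01.01"])` reports 01 and 01.01.06 as categories that would need to be added before breadcrumbs could be built for ‘Europe’. A list produced this way could also be compared against the complete list as a sanity check.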
In terms of front end work, I made some further possible interface designs, all of which are ‘responsive’ designs (they automatically adapt to the screen size, meaning no separate mobile / tablet interface needs to be developed). It was a good opportunity to learn more about responsive web design. My second possible interface can be found here: http://historicalthesaurus.arts.gla.ac.uk/new-design-2/ though it possibly looks a bit too ‘bloggy’. I further adapted this design to use a horizontal navigation section, which you can view here: http://historicalthesaurus.arts.gla.ac.uk/new-design-3/. At the meeting with Marc on Thursday I received some feedback from him and the other people involved with the project regarding colour schemes and fonts, and as a result I came up with a fourth design, which will probably end up being used and can be viewed here: http://historicalthesaurus.arts.gla.ac.uk/new-design-4/. This combines the horizontal navigation of the third design with the left-hand navigation of the second, and I think it looks quite appropriate.
Also this week I helped to set up the domain for the ICOS2014 conference website and gave Daria some feedback on the site. I also did some Digital Humanities Network related tasks.