Another predominantly Historical Thesaurus-based week, with lots of progress being made on both the front end and the database. After a detailed email discussion with Marc I got a much clearer picture of what is required from the front end in terms of colour schemes, logos and fonts, and I have now completed a fifth (and hopefully, for the most part, final) design. After finalising that I also set up a skeleton structure for the new site, creating a PHP template script that generates all of the interface, page headings, navigation bars and breadcrumbs, together with individual pages where the actual content will reside. This structure will allow the site to be maintained very easily, as all structure and design elements are contained in a single template script. Completely changing the site design, overhauling the navigation options or adding lots of new pages will not be a problem. The empty site, awaiting content, can be found here:
This week Christian and Marc emailed their search and browse requirements document, and I spent quite a bit of time going through it and creating a long list of questions and comments (three pages of them for a two-page document). On Friday we had a two-hour meeting to discuss my questions, which was hugely useful. I think we all now agree on exactly how the search and browse options should operate, and I will be able to begin working on this straight away.
Also this week I noticed some errors in the MySQL database structure that I had set up to hold the data exported from Access. Somehow I had managed to miss out three Parts of Speech, and when these occurred in the data they were being converted to blank fields. I updated the structure and reimported the data, which was actually a very worthwhile process as this time I documented the whole procedure, including which upload scripts to run and in what order. This will be a very useful document to reference in future. Another strange thing with the data is that yogh characters (ȝ) weren’t appearing in the MySQL data but had instead been converted to question marks, even though the database was set to use UTF-8 and other unusual characters such as ashes and thorns had been uploaded fine. Thankfully Flora was able to supply me with a list of word IDs that included yoghs, and I was able to create another little script that fixed these errors.
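One possible explanation for the yogh problem (an assumption, not something confirmed above) is that somewhere in the import chain the text passed through a single-byte Latin-1 style character set. Ash (æ) and thorn (þ) both exist in Latin-1, whereas yogh (ȝ) does not, so only the yogh would be replaced. The short Python sketch below simulates that round trip; the function name is hypothetical and it is not the script used on the project.

```python
def simulate_single_byte_import(text: str, charset: str = "latin-1") -> str:
    """Round-trip text through a single-byte charset.

    Characters the charset cannot represent are replaced with '?',
    which mimics what a lossy import step might do (assumed cause).
    """
    return text.encode(charset, errors="replace").decode(charset)


if __name__ == "__main__":
    # Ash and thorn survive because Latin-1 contains them.
    print(simulate_single_byte_import("æþ"))
    # Yogh (U+021D) is outside Latin-1, so it becomes a question mark.
    print(simulate_single_byte_import("ȝ"))
```

If this is what happened, the table being set to UTF-8 would not help on its own, because the character is lost before the data reaches the table.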
On Friday afternoon I created another new database table that will hold the word forms to be used for search purposes. As was discussed with Marc and Christian, some words have multiple forms separated by a slash, and others mark optional letters with brackets, for example ‘Palæarctic/Palearctic’ and ‘brin(e)y’. A non-wildcard search won’t find such forms, and even with a wildcard it would be difficult to find the latter. Instead we decided that I would create another table holding all the variations of each word, with each row having a foreign key linking to a single word ID. The search will then use this table, allowing a user to (for example) search for ‘briney’ or ‘briny’ and find the same word in each case. I am still working on the script to populate this table, as there are (perhaps inevitably) some inconsistencies in the data. I will continue with this on Tuesday next week, as Monday is a holiday (woo!).
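The expansion of a stored word into its searchable variants can be sketched roughly as follows. This is a minimal Python illustration rather than the actual population script: the function name and the word ID are hypothetical, and it assumes that a slash always separates complete alternatives and that brackets always enclose optional letters and are never nested.

```python
import re
from itertools import product
from typing import List


def expand_forms(word: str) -> List[str]:
    """Return every searchable variant of a stored word form."""
    variants: List[str] = []
    for alternative in word.split("/"):
        # re.split with a capturing group yields alternating pieces:
        # literal text, bracketed text, literal text, ...
        parts = re.split(r"\(([^()]*)\)", alternative)
        choices = []
        for index, part in enumerate(parts):
            if index % 2 == 0:
                choices.append([part])      # literal text: always kept
            else:
                choices.append(["", part])  # bracketed text: omitted or kept
        for combination in product(*choices):
            form = "".join(combination).strip()
            if form and form not in variants:
                variants.append(form)
    return variants


if __name__ == "__main__":
    word_id = 1  # hypothetical ID, for illustration only
    # Each (word_id, form) pair would become one row in the variants table.
    rows = [(word_id, form) for form in expand_forms("brin(e)y")]
    print(rows)
```

Each generated form would be stored in its own row alongside the word ID, so an exact-match search on the variants table finds the word whichever spelling the user types. Entries that break the assumptions above are exactly the kind of inconsistency that would need handling separately.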