Friday, December 11, 2009

BHL interface ideas

I've been buried in programming (and it's exam time at Glasgow) so I've not blogged for a month (gasp). I've been playing with ways to visualise Biodiversity Heritage Library content for a while (click here for a list of previous posts), and have occasionally surfaced to tweet a screenshot via twitpic. The more I play with the BHL content the more I think it's a gold mine, and that many of the ideas I played with for my ill-fated Elsevier Challenge entry (website here, background paper at hdl:10101/npre.2009.3173.1) are taking on a new life with this project.

I'm hoping to release my BHL article finding and visualising web site by the end of the month, but in the meantime I'm gathering the screenshots here.

The first shows a Google map generated from latitudes and longitudes extracted, using some simple regular expressions, from the OCR text of page 7705952 in the BHL. There's quite a bit of latitude and longitude information in BHL, and that's before trying georeferencing tools.

46740423.png
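
To give a flavour of the kind of regular expression involved, here is a minimal Python sketch. It is not the code behind the map above: the pattern, the function names, and the rule of pairing each latitude with the next longitude are all assumptions made for illustration, and real OCR text will need many more variants (seconds, dropped degree signs, misread characters).

```python
import re

# Hypothetical pattern: degrees, optional minutes, then a hemisphere letter,
# e.g. "12°30'S" or "45° 15' W". This is an assumption, not the pattern
# actually used to build the map.
COORD = re.compile(r"(\d{1,3})\s*°\s*(?:(\d{1,2})\s*['′]\s*)?([NSEW])")


def to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and optional minutes to signed decimal degrees."""
    value = int(degrees) + (int(minutes) if minutes else 0) / 60.0
    return -value if hemisphere in "SW" else value


def extract_points(text):
    """Pair each latitude with the longitude that follows it in the text."""
    points = []
    pending_lat = None
    for degrees, minutes, hemisphere in COORD.findall(text):
        value = to_decimal(degrees, minutes, hemisphere)
        if hemisphere in "NS":
            pending_lat = value          # remember until a longitude turns up
        elif pending_lat is not None:
            points.append((pending_lat, value))
            pending_lat = None
    return points
```

Each (latitude, longitude) pair returned by `extract_points` could then be handed to whatever mapping library draws the markers.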


The idea is to display this map next to the article so that the user gets an immediate sense of what region of the world the article covers, such as this article about Riekia wasps:

46744940.png


I'm also interested in useful ways to display search results. Here's an experiment using TileBars to visualise how relevant a search result is. The width of the bar is a function of how many pages are in the article, and the vertical stripes indicate pages that have the search term. The idea is to get a quick visual impression of whether the article mentions the term in passing, or treats it in some detail.

48350737.png
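
The bar in the screenshot is an image, but the underlying idea can be sketched in a few lines of Python as a text-only stand-in: one character per page, with a stripe wherever the page mentions the term. The function name and the characters used are my own choices for illustration, and a plain case-insensitive substring test stands in for whatever matching the real search does.

```python
def tilebar(pages, term, hit="|", miss="."):
    """Return one character per page: `hit` if the page's text contains
    the search term (case-insensitive), otherwise `miss`."""
    term = term.lower()
    return "".join(hit if term in page.lower() else miss for page in pages)
```

A long bar with a single stripe suggests a passing mention, while a bar dense with stripes suggests the article treats the term in some detail.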


TileBars were developed by Marti Hearst, whose web site has some great resources. Partly inspired by her BioText project, as well as the thumbnail page display in JSTOR, I'm now experimenting with showing thumbnails in search results. For example, here's a search for the deep sea octopus Graneledone pacifica, showing two articles:


48832222-196574b7b6d6a2bc5764a5e853cd478b.4b228b85-full.png


I display thumbnails for pages that (a) have the name on the page, and (b) have what look like figure captions on them. The idea is that an article that figures a taxon is likely to be a fairly important article to look at, so displaying thumbnails will highlight those articles. The second article in the search results is the paper that published the name Graneledone pacifica, and the figures illustrate the taxon.
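
Here is a rough Python sketch of that two-part test. The caption pattern is a guess at what "looks like a figure caption" (a line starting with "Fig.", "Figure" or "Plate" followed by a number), and the function name is hypothetical; plates numbered with Roman numerals, for instance, would slip through.

```python
import re

# Assumed heuristic: a line that starts with "Fig.", "Figs.", "Figure" or
# "Plate" followed by a number is treated as a figure caption.
CAPTION = re.compile(
    r"^\s*(?:fig(?:ure|s)?\.?|plate)\s*\d+",
    re.IGNORECASE | re.MULTILINE,
)


def thumbnail_pages(pages, name):
    """Return indexes of pages whose OCR text (a) contains the name and
    (b) has at least one line that looks like a figure caption."""
    name = name.lower()
    return [
        index
        for index, text in enumerate(pages)
        if name in text.lower() and CAPTION.search(text)
    ]
```

Only the pages returned by `thumbnail_pages` would get a thumbnail in the search results.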

These are all pretty rough, but they give some idea of what I've been working on over the last month.