I had to read “Humanities Data: A Necessary Contradiction” by Miriam Posner a couple of times before I really understood her point – which probably makes me a bit thick. The title, of course, says it all – boiling humanities research material down to “data” seems at odds with the origin of that material. The analogy Posner uses is particularly poignant.
“Imagine that someone called your family photograph album a dataset. It’s not inaccurate per se, but it suggests that this person just fundamentally doesn’t understand why you value this artefact.”
Initially I felt that Posner was at odds with my view that preserving the context of data is essential, but in fact this article supports my view.
The article also links to the work of Trevor Munoz and Katie Rawson – Digital Heritage Curation (http://www.dhcuration.org/). This is a fantastic site with numerous links to scholarly articles exploring issues and concepts in curating digital collections. It is well worth a look. You can follow @DHCuration on Twitter.
OpenRefine is a great tool… I am keen to test it on some data exported from EMu – the National Museum of Australia’s (NMA) collection database. Like many institutions, the NMA is keen to publish as many collection records on the web as possible, as soon as practical. The quality of the data can be a barrier to publishing those records.
Till next week.