No, I’m not talking about toxic assets, but rather historical assets. (And it may well be time to merge my history blog with this blog, because lately it’s not always clear where my writing should go.)
There was an article in EDUCAUSE Review last month about audio archives, in particular the difficulty of making the Supreme Court audio collection useful. It discusses how the Oyez project has been building an RDF schema to turn that collection into a usable database. Doing so is part of a larger problem:
The accelerating growth in spoken-word documents will generate demand for efficient archiving and retrieval strategies. But these resources will prove stillborn if we do not identify ways to reveal their contents.
As a historian, I was immediately reminded of a very similar problem we have with medieval manuscripts, and indeed with most written information going back thousands of years. In the case of the Middle Ages, James Burke put it best in his series The Day the Universe Changed (BBC, 1985):
Previously, which historical documents survived was often controlled by chance, physical conditions of storage, and class or wealth. Contribution to a library or museum collection, or passing something down through the family, were pretty good methods of conservation. So was publication, for those who could afford it. We have these same determinants today, but now we also have choice, as groups rather than families create collections, and with them a need to catalog on a scale never seen before.
The stockpiles of raw data do indeed remind me of the piles of moldering books in monastic libraries and the homes of wealthy nobles. As Burke says, not only were things literally lost, but they were lost because they could not be found. And technology, then as now, is being utilized to reduce this problem and make the content deemed important accessible to future generations.
In the realm of digital or digitized content, this is where tagging really, really matters. I get that folksonomies etc. are fun and interesting. But we collectively (people who care about information, knowledge, culture) have to get our act together in terms of spreading best practices with tags, developing standards, etc. Who takes the lead in such an enterprise? Library of Congress and its international counterparts? Unless or until the semantic web emerges and fulfils the hopes of those who are waiting for it, tags are where it’s at.
…and aggregation, syndication and sharing. Personally I rely largely on content being pushed out to me through recommendations, Diigo groups, RSS feeds, Tweets, etcetera. Tags in and of themselves are important, but if we’re still required to actively go out to look for them (or the content they reference) I think the process is slowed considerably. The filtering of networks is a powerful thing, and I think tagging is one element of that.
As far as standards go, I think the main standard that’s important is the one that facilitates aggregation and syndication. Provided that is in place, the networks and the users within them can interpret and describe content in the way that best suits them. I think that’s how stuff is found and shared: by using common language and descriptors.
While I understand the basic principles of tagging and aggregation, I’m not sure they are reliable without standards, what Mike’s calling “common language and descriptors”. We don’t have those. In many ways, we shouldn’t have those — they might defeat the purpose.
To try to make the Program for Online Teaching site more useful, I began putting everything (web pages, podcasts, video clips, etc) into Moodle’s Glossary. The way to find them is to use Search for everything, because the content exists, essentially, within the index. Thus I have to tag everything.
After a while, I had to start listing my tags on paper, because I could not remember what I had done already. Did I use “techniques” or “technology”? “pedagogy” or “teaching”? “discussion” or “communication”? It was a mess. It’s still a mess. (You can see for yourself by going in as a guest at http://miracosta.mrooms.net/course/view.php?id=13.)
I realize that I, working alone, am not a community, and that tagging is based on everyone doing it, not one person. But if I cannot even determine common descriptors in my own mind… Let’s just say that I have more faith in those concertedly trying to index. Perhaps that’s just because I grew up with the old card catalogs, where I found it easier to find things, in many ways, than I do online, even with Google’s complex algorithms.
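One way to picture the “common descriptors” problem above is a small lookup that folds variant tags into a single preferred term, much as a card catalog's “see” references did. The following is only a rough sketch: the `PREFERRED` table, the choice of which term wins in each pair, and the function names are all illustrative assumptions, not anything Moodle's Glossary provides.

```python
# Hypothetical "see" references: each variant points to one preferred tag.
# Which term is preferred in each pair is an arbitrary assumption here.
PREFERRED = {
    "teaching": "pedagogy",
    "pedagogy": "pedagogy",
    "communication": "discussion",
    "discussion": "discussion",
}


def normalize_tag(tag, preferred=PREFERRED):
    """Lower-case and trim a tag, then swap in the preferred term if one is listed."""
    key = tag.strip().lower()
    return preferred.get(key, key)


def normalize_tags(tags):
    """Normalize every tag and drop duplicates, keeping first-seen order."""
    result = []
    for tag in tags:
        canonical = normalize_tag(tag)
        if canonical not in result:
            result.append(canonical)
    return result


if __name__ == "__main__":
    print(normalize_tags(["Teaching", " pedagogy ", "Discussion", "podcasts"]))
```

Tags that are not in the table pass through unchanged, so the list on paper only has to cover the pairs that actually cause confusion.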
Lisa, your comment has reminded me that I’ve had this conversation before locally – and the same sorts of points came up regarding lack of common language, and the need for some degree of consistency of terminology.
I don’t know if this is of any use to your current predicament, but what we were thinking was to implement a two-level tagging/categorising convention where there was a list of predefined terms – one of which had to be used – and after that free-tagging was supported. This would ensure that the item was classified in at least one formal category, while leaving room for organically developing terms and tags.
However, I don’t even know if Moodle supports this – I’m no Moodle expert.
From personal experience I find I have trouble sticking to a firm, established convention and treat categories as tags eventually – I had a few dozen until recently and culled most of them. So I’m not sure how best to ensure the formal categories are adhered to…
Two-level tagging is a great standard, actually. Diigo groups do this, where the group administrator(s) can set standard tags for the group, which then automatically pop up in clickable form when a member wants to mark up and share a page with the group. The member then clicks the relevant tags, adds any additional ones they feel like, and off you go. For this to work well, though, it still requires some careful taxonomical thought up front by the group tag creators.
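For anyone who wants to see the two-level convention discussed in this thread in concrete terms, here is a minimal sketch. It assumes a small predefined list of formal terms (the `FORMAL_TERMS` set is invented for illustration) and is not how Diigo or Moodle actually implement anything: an item must carry at least one formal term, and whatever else the tagger adds is kept as free tags.

```python
# Hypothetical predefined terms chosen up front by whoever curates the collection.
FORMAL_TERMS = {"pedagogy", "technology", "discussion"}


def classify(tags, formal_terms=FORMAL_TERMS):
    """Split tags into formal categories and free tags.

    Raises ValueError if none of the tags is a predefined formal term,
    which is the "one of which had to be used" rule.
    """
    cleaned = {t.strip().lower() for t in tags if t.strip()}
    categories = sorted(cleaned & formal_terms)
    if not categories:
        raise ValueError(
            "at least one predefined term is required: "
            + ", ".join(sorted(formal_terms))
        )
    free_tags = sorted(cleaned - formal_terms)
    return {"categories": categories, "tags": free_tags}


if __name__ == "__main__":
    print(classify(["Pedagogy", "screencasts", "moodle"]))
```

The check only enforces that a formal term is present; deciding what the formal terms should be is still the careful up-front thinking mentioned above.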