Considering open data

Libraries have always been about open data, haven’t they?  Well, yes, in a way.  Aside from the notion of a free lending library, our bibliographic data is freely available and shared, if you know where to find it and how to read it.  We do offer our users a lot of data that may appear to be “open,” i.e. free, but, in reality, we pay a premium to offer it.  Publishers snatch up primary source materials and then sell them back to us, as if they’re doing us a favor.  A recent example is the Readex collection of documents related to slavery in the United States, The American Slavery Collection, 1820-1922.  John E. Drabinski has written a thoughtful essay about the inherent dilemma in charging for access to documents that make up our own cultural heritage.  Even our own faculty members are unable to free their research data due to agreements with publishers and the “publish or perish” nature of tenure.

We have been promised a future of linked data, which will make up the semantic web, where point A will lead to point B in a way that is both novel and accurate.  Serendipitous discovery will live alongside the assurance that the John Smith you are interested in is the John Smith you are following through the tangled Web.  Which is great!  I can’t wait!  But we’re not there yet.  There has been encouraging work, most notably, for librarians, with the Virtual International Authority File, which integrates a number of national library authority files for names and provides a single numeric identifier for each, which forms a URI, or Uniform Resource Identifier, that can then be used across the web.  You can see it at work in Wikipedia by scrolling down to the bottom of biographical entries and checking out the “Authority Control” box.  Still a ways to go (subjects, anyone?), but it’s a start.

So, want to get started freeing some data?  There are lots of ways you can start small.  As library folks, we are used to thinking about ways to make our data useful and transparent.  The rest of the world is really into this now, too, but we’ve been doing it forever.  So, consider contributing your talents!

I’ve often said I wished our library catalogs worked as well as Ravelry, the free database for fiber arts.  It’s interesting that Ravelry views itself as a community rather than a database, making the data it presents personal, and thereby relevant to its users.  Libraries have struggled with how to do this.  It’s something we’d like to do, but, honestly, we’re afraid of what it means for our stated aim of objectivity.  And that’s a serious concern.  Still, Ravelry manages to combine a materials database, a pattern database, and forums along with a personalized user experience.

Are you interested in 3D printing?  There are a lot of amazing repositories of free data files you can print yourself, the highest profile of late being the Smithsonian’s own 3D modeling project, Smithsonian X 3D, which allows you to download and print models of artifacts from the Smithsonian’s collection.  Thingiverse allows you to browse, organize, and customize models contributed by other users and to contribute your own models.  Other museums have made their 3D scan files available to download, including The Met, which encourages creative use of its files to make new art, or mashups.  Want to get started?  You can find free or open source options for all the software you need to start creating your own 3D models (Tinkercad, OpenSCAD, SketchUp, Blender).

There are also many citizen science projects you can contribute data to.  Perhaps one of the longest standing, the annual Great Backyard Bird Count, just happened earlier this month.  Maybe you’d prefer combing through radio signals to help SETI in the search for extraterrestrial life?

On the local front, there is a new group in Richmond formed as part of Code for America, called Code for RVA, which is a “civic hacking brigade” that works to “improve our city through better technology.”  Their next civic hack night, where they work on civic projects and hack open data, is Tuesday, March 25th, at 6pm, and they’ll be working on a project using real-time data to build an app that lets parents and students know exactly where their school bus is.

Know of any other open data projects?  Share them in the comments.