Friday, May 21, 2010

Beyond open data

I'm in the basement of The World Bank building in DC, listening to Beth Noveck and Hans Rosling talk about #opendata and "Mindset Upgrades for a Multipolar World."

One slide from Rosling shows a side view of a garden. Underneath the ground surface are "root balls" of data repositories - from NGOs, the national government, and sources like the World Bank or UN. Shining down on the garden plot is the sun - represented here by the "public" - in other words, one of the sources of life for the data root balls. The data form roots that interconnect and eventually send shoots above ground. A variety of applications, translations, visualizations and other "storytelling interpretations" of the data bring life from these seedlings and eventually sprout "flowers" - useful, beautiful manifestations of that interconnected, unseen, but vital set of root databases.

If ever there were proof that a picture is worth 1,000 words, the preceding paragraph is it. Hans Rosling made the above point with a picture - it took me 10 times longer to write the words than to make sense of the picture.

And that's his point. We need good data to help us tell stories. We need a lot of tools to interpret, visualize, represent, and mash up the data - and the good news is, we have these tools and many are free. Gapminder is one among many tools that let us "see" data.*

Tools aren't enough. Sensemakers and storytellers are key - those who will look at the data and make the pictures, look at the pictures, and ask for more data. Many of the people who will do this, and who can do it best, are the people whose lives are represented in the data - those in the neighborhoods, reporting the corruption, with kids in the schools, looking for work, and seeking healthcare. The Grameen AppLab is one example of working this way.

There are important analogs for this. Meteorologists are leaders in gathering huge, complicated datasets and then standing in front of a map and making the information meaningful to anyone. Music is written in a standardized notation that is read by many - but it is only through the associated efforts of instrument makers, musicians, conductors and audiences that the scribbles on the page become beautiful to most of us.

As once-independent datasets from government, multilaterals, and NGOs get connected through interoperable data standards, several things will happen:
  1. Free, open access to the data becomes a platform for private and community innovation in using it, presenting it, and deriving value from it. (This is what Clayton Christensen refers to as "adjacent profit" - where once there was value in holding the databases close, now there is value in releasing the data and building from it.)
  2. As more and more data becomes connected, those whose data are not included will cease to be found. It will be like writing a book but not letting Amazon find it. Even those of us who shop at indie bookstores or borrow from libraries often search Amazon to find out what exists. From the perspective of the reader, if your book isn't there, it might as well not exist. You can lead or you can follow in this regard, but if you don't connect, your data won't matter.
  3. Data are the beginning, not the end. They are the root balls, not the flowers.
  4. For #opendata sharing efforts, the key issue is interoperability, so that datasets can be mashed together, which allows new questions to be asked by new people. These people - in turn - add and look for other new data. There is a feedback loop that will drive use, improve data, improve its representation, and improve its use.
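
For readers who like to see the mechanics, here is a minimal sketch in Python of what "mashing together" looks like once two datasets share a common identifier. Everything in it is an assumption made for illustration: the dataset names, the field names, the `mash_up` helper, and the numbers are placeholders, not real statistics or anything shown at the event. The only point is that a shared key (here a three-letter country code) is what makes the join possible.

```python
# Hypothetical records from two independent sources. The values are
# placeholders for illustration only, not real statistics.
ngo_clinics = [
    {"country_code": "KEN", "clinics_reported": 12},
    {"country_code": "UGA", "clinics_reported": 7},
]
government_schools = [
    {"country_code": "KEN", "schools_reported": 30},
    {"country_code": "TZA", "schools_reported": 18},
]


def mash_up(*datasets, key="country_code"):
    """Merge records from several datasets that share the same key field."""
    combined = {}
    for dataset in datasets:
        for record in dataset:
            # Records with the same key value are folded into one entry.
            combined.setdefault(record[key], {}).update(record)
    return combined


merged = mash_up(ngo_clinics, government_schools)
for code in sorted(merged):
    print(code, merged[code])
```

Only the country that appears in both sources ends up with both kinds of fields. The others show exactly where data are missing, which is the kind of gap that prompts new people to go looking for new data.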
After the meeting I had a chance to talk with several of the meeting hosts and other attendees about the opportunity to deliberately work in this feedback-driven way. We were talking specifically about Apps and Contests - a topic I'm quite excited about. These are exploding in number because they serve one important purpose in efforts to share data - they create tools to put the data to use. But we need to take this a few steps further down the feedback loop:
  1. Put the built apps into use in communities or organizations and watch how they are used. Give apps built for healthy eating to the participants in a community health program, for example, see what they do with them, improve the apps based on their feedback, and listen carefully to what they say they need in terms of other data or other features. Pushing out data is one step. Building an app is a next step. Using it, improving it, and putting it to work in the context of community improvement efforts is what really matters.
  2. We need to connect those building the connections - what are all the App contests out there, what are they focused on, how do you participate, what is missing, who is partnering with whom, what tools are they using to run their contests, and what do these contests accomplish?
An easy proposal for addressing the issues of number two above:
  1. Begin identifying all contests and share that info on the web (in a wiki).
  2. Build a small cadre of people who will provide data and turn to crowds for more.
  3. Identify public agencies, private players, and communities doing apps and contests in an open way.
  4. Encourage app contests to share info with each other deliberately on the web.
To do: I will reach out to a few of the folks at the World Bank event, plus those who've done research on apps/contests (White House, Case Foundation, McKinsey and Arabella), and compile slide decks, lists, and links to apps. I want to involve as many people as want to be engaged in this conversation and sharing about apps.

Can you help? Want to build the public wiki? Have info on app contests? Research that matters? Comment below or email me at lucy at blueprintrd dot com

* (Sidebar - I'm now at the New America Foundation Retreat learning how "bad" the data we have on our economy are - because they are out of sync with the shape of the global, supply chain economy. So good data that capture the information we need are important and shouldn't just be assumed to exist.)


1 comment:

Jana Byington-Smith said...

Nice post and am looking forward to your compilation. The 'bad' data (not misbehaving, but of bad quality) problem can seem overwhelming, and I'm curious about who will step up on the ROI of unfiltered vs filtered data sets in a large scale way. It's a massive standardization issue, so I'm thinking that some of the approach to sharing will come from the private sector, those who see the financial benefit to, and have the pressing need for, marketable products, who will invest first.

(You did a great job of describing the slide in 113 words -- who needs a thousand when such efficiency is possible!)