With the possible exception of weather data, the most successful open data set out there at the moment is transit data. It remains the data with which developers have experimented and innovated the most. Why is this? Because it’s been standardized. Ever since Google and the City of Portland created the General Transit Feed Specification (GTFS), any developer who builds an application using GTFS transit data can port that application to more than 100 cities around the world, with tens or even hundreds of millions of potential users. Now that’s scale!
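To make the portability point concrete, here is a minimal Python sketch of what "write once, run in any GTFS city" can look like. It assumes a feed is a zip archive containing a `stops.txt` file with the `stop_id`, `stop_name`, `stop_lat` and `stop_lon` columns defined by GTFS. The two sample feeds, their stop names and the helper `make_sample_feed` are made up for illustration; a real application would download each agency's published feed instead.

```python
import csv
import io
import zipfile


def load_stops(feed):
    """Return (stop_id, stop_name, lat, lon) tuples from a GTFS feed zip.

    `feed` is a path or file-like object pointing at the zip archive.
    """
    with zipfile.ZipFile(feed) as archive:
        with archive.open("stops.txt") as raw:
            # utf-8-sig tolerates the byte-order mark some feeds include
            reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8-sig"))
            return [
                (row["stop_id"], row["stop_name"],
                 float(row["stop_lat"]), float(row["stop_lon"]))
                for row in reader
            ]


def make_sample_feed(rows):
    """Build a tiny in-memory feed (hypothetical data, illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        lines = ["stop_id,stop_name,stop_lat,stop_lon"] + rows
        archive.writestr("stops.txt", "\n".join(lines) + "\n")
    buffer.seek(0)
    return buffer


# Two hypothetical cities; the same load_stops call handles both unchanged.
city_a = make_sample_feed(["A1,Main St & 1st Ave,45.5200,-122.6800"])
city_b = make_sample_feed(["B7,Harbour Terminal,49.2800,-123.1100"])

for feed in (city_a, city_b):
    for stop in load_stops(feed):
        print(stop)
```

Nothing in `load_stops` knows which city it is reading. That is the whole benefit of the shared structure: supporting a new city means pointing the same code at a new feed.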
All in all, the benefits of a standard data structure are clear. A public good is more effectively used, citizens enjoy better service, and companies (both Google and the numerous smaller companies that sell transit-related applications) generate revenue, pay salaries, etc…
This is why, with a number of jurisdictions now committed to open data, I believe it is time for advocates to start focusing on the next big issue: how do we get different jurisdictions to align around standard structures so as to increase the number of people to whom an application or analysis will be relevant? Having cities publish open data sets is a great start and has led to real innovation, but the next generation of open data, and the next leaps in innovation, will require more standards.
The key, I think, is to find areas that meet three criteria:
- Government Data: Is there relevant government data about the service or issue that is available?
- Demand: Is this a service for which there is regular demand? (this is why transit is so good, millions of people touch the service on a daily basis)
- Business Model: Is there a business that believes it can use this data to generate revenue (either directly or indirectly)?
Two comments on this.
First, I think we should look at this model because we want to find places where the incentives are right for all the key stakeholders. The wrong way to create a data structure is to get a bunch of governments together to talk about it. That process will take 5 years… if we are lucky. Remember, the GTFS emerged because Google and Portland got together. After that, everybody else bandwagoned because the value proposition was so high. This remains, in my mind, not the perfect model, but the fastest and most efficient one for getting more common data structures. I also accept that it won’t work for everything, but it can give us more successes to point to.
Which leads me to point two. Yes, at the moment, I think the target in the middle of this model is relatively small. But I think we can make it bigger. The GTFS shows cities, citizens and companies that there is value in open data. What we need are more examples so that a) more business models emerge and b) more government data is shared in a structured way across multiple jurisdictions. The bottom and right-hand circles in this diagram can, and if we are successful will, move. In short, I think we can create this dynamic:
So, what does this look like in practice?
I’ve been trying to think of services that fall in various parts of the diagram. A while back I wrote a post about using open restaurant inspection data to drive down health costs, specifically by finding a government to work with Yelp!, Bing or Google Maps, Urban Spoon or another company to integrate the inspection data into their application. That, for me, is an example of something that fits in the middle. Governments have the data, it’s a service citizens could touch on a regular basis if the data appeared in their workflow (e.g. Yelp! or Bing Maps), and for those businesses it either helps drive search revenue or gives their product a competitive advantage. The Open311 standard (sadly missing from my diagram) and the emergence of SeeClickFix strike me as another excellent example, one that is right on the inside edge of the sweet spot.
Here’s a list of what else I’ve come up with at the moment:
You can also now see why I’ve been working on Recollect.net – our garbage pick up reminder service – and helping develop a standard around garbage scheduling data – the Trash & Recycling Object Notation. I think it is a service around which we can help explain the value of common standards to cities.
You’ll notice that I’ve put “democracy data” (e.g. agendas, minutes, legislation, hansards, budgets, etc…) in the area where I don’t think there is a business plan. I’m not fully convinced of this – I could see a business model in the media space for this – but I’m trying to be conservative in my estimate. In either case, that is the type of data the good people at the Sunlight Foundation are trying to get liberated, so there are, at least, non-profit efforts concentrated there in America.
I also put real estate in a category where I don’t think there is real consumer demand. What I mean by this isn’t that people don’t want it. They do, but they are only really interested in it maybe 2-4 times in their life. It doesn’t have the high touch point of transit or garbage schedules, or of traffic and parking. I understand that there are businesses to be built around this data. I love Viewpoint.ca – a site that mashes open data up with real estate data to create a compelling real estate website – but I don’t think it is a service people will get attached to, because they will only use it infrequently.
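For readers who think in code, the three criteria can be sketched as a simple checklist. This is only an illustration of the model, not a scoring method: the `Candidate` class, the function names and the true/false values are my own shorthand. In particular, the post only says that democracy data lacks a business model and real estate lacks regular demand; the other flags are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A data set scored against the three criteria from the post."""
    name: str
    government_data: bool   # relevant government data is available
    regular_demand: bool    # people touch the service regularly
    business_model: bool    # someone believes it can generate revenue


def missing_criteria(candidate):
    """Return the names of the criteria the candidate does not meet."""
    checks = {
        "government data": candidate.government_data,
        "demand": candidate.regular_demand,
        "business model": candidate.business_model,
    }
    return [name for name, met in checks.items() if not met]


def in_sweet_spot(candidate):
    """A candidate is in the sweet spot when no criterion is missing."""
    return not missing_criteria(candidate)


# Illustrative readings of the examples discussed above (partly assumed).
candidates = [
    Candidate("transit schedules", True, True, True),
    Candidate("real estate", True, False, True),      # infrequent use
    Candidate("democracy data", True, True, False),   # demand is assumed
]

for c in candidates:
    gaps = missing_criteria(c)
    print(c.name, "->", "sweet spot" if not gaps else "missing: " + ", ".join(gaps))
```

Listing what is missing, rather than just returning yes or no, matches the argument that the circles can move: a candidate that is one criterion short is the place to look for the next standard.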
Ultimately, I’d love to hear people’s ideas on what might fit in this sweet spot (if you are comfortable sharing the idea, of course). Part of this is because I’d love to test the model more. The other reason is that I’m engaged with some governments interested in getting more strategic about their open data use, so these types of opportunities could become reality.
Finally, I just hope you find this model compelling and helpful.