The next Open Data battle: Advancing Policy & Innovation through Standards

With the possible exception of weather data, the most successful open data set out there at the moment is transit data. It remains the data with which developers have experimented and innovated the most. Why is this? Because it’s been standardized. Ever since Google and the City of Portland created the General Transit Feed Specification (GTFS), any developer who creates an application using GTFS transit data can port that application to more than 100 cities around the world, with tens and even hundreds of millions of potential users. Now that’s scale!
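To make the portability point concrete, here is a minimal sketch in Python of a "nearest stop" lookup that reads the `stops.txt` file from a GTFS feed. It assumes each row carries the `stop_id`, `stop_name`, `stop_lat` and `stop_lon` fields. The two sample feeds, the function names `load_stops` and `nearest_stop`, and the coordinates are all made up for illustration; they are not real agency data or part of any library.

```python
import csv
import io
import math


def load_stops(stops_txt):
    """Parse the text of a GTFS stops.txt file into a list of dicts."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return [
        {
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


def nearest_stop(stops, lat, lon):
    """Return the stop closest to (lat, lon).

    Uses a flat-earth (equirectangular) approximation, which is good
    enough for ranking stops within a single city.
    """
    def squared_distance(stop):
        dlat = stop["lat"] - lat
        dlon = (stop["lon"] - lon) * math.cos(math.radians(lat))
        return dlat * dlat + dlon * dlon

    return min(stops, key=squared_distance)


# Hypothetical sample feeds: invented stops, not real agency data.
CITY_A = """stop_id,stop_name,stop_lat,stop_lon
A1,Main St & 1st Ave,45.5200,-122.6800
A2,Main St & 5th Ave,45.5230,-122.6760
"""
CITY_B = """stop_id,stop_name,stop_lat,stop_lon
B1,Harbour Terminal,49.2860,-123.1120
B2,Central Library,49.2800,-123.1150
"""

if __name__ == "__main__":
    # The same two functions serve both cities; only the feed changes.
    for feed, here in ((CITY_A, (45.5229, -122.6762)), (CITY_B, (49.2801, -123.1149))):
        stop = nearest_stop(load_stops(feed), *here)
        print(stop["id"], stop["name"])
```

Nothing in the lookup code is city-specific, so supporting another city means pointing it at another feed rather than writing a new parser.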

All in all, the benefits of a standard data structure are clear. A public good is more effectively used, citizens enjoy better service, and companies (both Google and the numerous smaller companies that sell transit-related applications) generate revenue, pay salaries, etc.

This is why, with a number of jurisdictions now committed to open data, I believe it is time for advocates to start focusing on the next big issue. How do we get different jurisdictions to align around standard structures so as to increase the number of people to whom an application or analysis will be relevant? Having cities publish open data sets is a great start and has led to real innovation, but next-generation open data and the next leaps in innovation will require more standards.

The key, I think, is to find areas that meet three criteria:

  • Government Data: Is there relevant government data about the service or issue that is available?
  • Demand: Is this a service for which there is regular demand? (This is why transit is so good: millions of people touch the service on a daily basis.)
  • Business Model: Is there a business that believes it can use this data to generate revenue (either directly or indirectly)?

 

 

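[Diagram: the three criteria (government data, demand, business model) drawn as overlapping circles, with the target where all three meet]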

Two comments on this.

First, I think we should look at this model because we want to find places where the incentives are right for all the key stakeholders. The wrong way to create a data structure is to get a bunch of governments together to talk about it. That process will take 5 years… if we are lucky. Remember, the GTFS emerged because Google and Portland got together; after that, everybody else jumped on the bandwagon because the value proposition was so high. This remains, in my mind, not the perfect model, but the fastest and most efficient one for getting more common data structures. I also accept that it won’t work for everything, but it can give us more successes to point to.

Which leads me to point two. Yes, at the moment, I think the target in the middle of this model is relatively small. But I think we can make it bigger. The GTFS shows cities, citizens and companies that there is value in open data. What we need are more examples so that a) more business models emerge and b) more government data is shared in a structured way across multiple jurisdictions. The bottom and right-hand circles in this diagram can, and if we are successful will, move. In short, I think we can create this dynamic:

[Diagram: the same model with the bottom and right-hand circles shifted so that the target in the middle grows]

So, what does this look like in practice?

I’ve been trying to think of services that fall in various parts of the diagram. A while back I wrote a post about using open restaurant inspection data to drive down health costs, specifically by finding a government to work with Yelp!, Bing or Google Maps, Urban Spoon or another company to integrate the inspection data into their application. That, for me, is an example of something that fits in the middle. Governments have the data, it’s a service citizens could touch on a regular basis if the data appeared in their workflow (e.g. Yelp! or Bing Maps), and for those businesses it either helps drive search revenue or gives their product a competitive advantage. The Open311 standard (sadly missing from my diagram) and the emergence of SeeClickFix strike me as another excellent example, one that is right on the inside edge of the sweet spot.

Here’s a list of what else I’ve come up with at the moment:

[Diagram: the services I’ve come up with so far, placed on the model]

You can also now see why I’ve been working on Recollect.net – our garbage pick up reminder service – and helping develop a standard around garbage scheduling data – the Trash & Recycling Object Notation. I think it is a service around which we can help explain the value of common standards to cities.

You’ll notice that I’ve put “democracy data” (e.g. agendas, minutes, legislation, hansards, budgets, etc.) in the area where I don’t think there is a business plan. I’m not fully convinced of this – I could see a business model in the media space for this – but I’m trying to be conservative in my estimate. In either case, that is the type of data the good people at the Sunlight Foundation are trying to get liberated, so there are, at least, non-profit efforts concentrated there in America.

I also put real estate in a category where I don’t think there is real consumer demand. What I mean by this isn’t that people don’t want it; they do, but they are only really interested in it maybe 2-4 times in their life. It doesn’t have the high touch point of transit or garbage schedules, or of traffic and parking. I understand that there are businesses to be built around this data. I love Viewpoint.ca – a site that mashes open data up with real estate data to create a compelling real estate website – but I don’t think it is a service people will get attached to, because they will only use it infrequently.
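For readers who think in code, the three-criteria test can be sketched as a small Python check. This is only an illustration of the model, not a scoring tool. The `Candidate` class and its method names are hypothetical. The "missing" criterion for democracy data and real estate comes from the placements described above; every other flag is assumed to be met simply because the post does not say otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A service or data set judged against the three criteria."""
    name: str
    government_data: bool
    demand: bool
    business_model: bool

    def missing(self):
        """Return the labels of the criteria this candidate does not meet."""
        checks = {
            "government data": self.government_data,
            "demand": self.demand,
            "business model": self.business_model,
        }
        return [label for label, met in checks.items() if not met]

    def in_sweet_spot(self):
        """True only when all three criteria are met."""
        return not self.missing()


# Placements follow the post. Flags that the post does not describe as
# missing are ASSUMED to be True for the sake of the illustration.
CANDIDATES = [
    Candidate("transit", True, True, True),
    Candidate("restaurant inspections", True, True, True),
    Candidate("democracy data", True, True, False),  # no clear business plan
    Candidate("real estate", True, False, True),     # infrequent demand
]

if __name__ == "__main__":
    for candidate in CANDIDATES:
        gaps = candidate.missing()
        verdict = "sweet spot" if not gaps else "missing " + ", ".join(gaps)
        print(f"{candidate.name}: {verdict}")
```

Listing the gaps rather than returning a single yes/no keeps the focus on which circle would need to move for a candidate to reach the middle.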

Ultimately I’d love to hear from people about ideas they have on what might fit in this sweet spot (if you are comfortable sharing the idea, of course). Part of this is because I’d love to test the model more. The other reason is that I’m engaged with some governments interested in getting more strategic about their open data use, so these types of opportunities could become reality.

Finally, I just hope you find this model compelling and helpful.

21 thoughts on “The next Open Data battle: Advancing Policy & Innovation through Standards”

  1. Raphael Sussman

    Actually, it is the individual subject area experts that tend to standardize rather than the Open Data authorities at any given organization.  Geographic information has long had standards whereby data are published as Web Services (see http://www.opengeospatial.org/standards) that can be consumed easily in mashups, and it is no coincidence that a large proportion of the successful uses of data from multiple sources include geographic data.

    Reply
  2. Doconnor

    I think you overestimate the importance of business. Citizens can make a great deal of use of data without needing a business plan. It doesn’t take a lot of money to process data into something useful, just a couple skills and a little time.

    The majority of the people who have made use of TTC transit data haven’t been in it for the money. There has been a trip planner made, Steve Munro analyzed vehicle location data to identify the causes of irregular service, and I made an improved next vehicle arrival site, all without a business plan.

    Reply
    1. David Eaves

      I agree that it isn’t required to have a common standard to use open data (this isn’t my point) – there is lots of innovation taking place on data sets in local communities – but if you do have some common standards it vastly increases the number of people who can utilize a data set, or contribute to a project that makes use of it.

      Again, I’m not saying the developers aren’t going to hack on data whose structure isn’t standardized across cities, or that their products don’t have value. They do. But if we want to create the standard data structures (and I think we do, so that more of that great work can benefit a bigger audience – especially in smaller communities with fewer developers) then I think the role of businesses (or organizations that can scale to create value out of the data across jurisdictions) will be instrumental.

      Reply
    2. Momoko Price

      I think Dave’s emphasis on the integration of business models is in fact key to the creation of sustainable, scalable open-data structure.

      It is true that many transit apps have been created without explicit business models in mind, but many of them suffered needless setbacks and red tape at the start because their pioneering initiatives weren’t recognized by data owners/creators as legitimate businesses with quantifiable stakes or benefits for the community/economy. 

      Business models help make those stakes and benefits explicit and visible, which then (hopefully) gives data owners incentives and vision for why they need to keep upgrading, standardizing and publishing their data. 

      Reply
      1. Doconnor

        “Business models help make those stakes and benefits explicit and visible”

        Only considering benefits created by business is a serious flaw that permeates society.

        Reply
  3. Momoko Price

    Last, I think your emphasis on highlighting frequent-use data is an important insight, Dave. Will be keeping this in mind and follow up soon. Speaking of that, how has the data-integration process been for ReCollect so far? I never got around to talking to you about that. Did you get the June data you needed in time? Was the snag about getting the data ahead of time ever fixed?

    Reply
  4. Alex Sirota

    David, kudos once again. Check out the efforts of CASRAI.org and ORCID.org for research data. This standardization is starting in earnest in kuali-style governance and sustainability models. Very early days, but very promising.

    Reply
  5. mattdance

    I think there is a place for environmental data in this framework.  For instance, the Air Quality Health Index (AQHI) is an open data set from Environment Canada that could be deployed via a location aware mobile application to deliver personalized health notifications. For example, if someone with asthma was concerned about AQ, they could set their ‘tolerance’ in this mobile app and receive personalized messages based on the AQHI (current or forecast) for their location.  The AQHI could also be incorporated with localized crowdsourced health indicators. If someone I follow in this application were to post AQ concerns for a specific location, I could receive these posts.  If there were enough take up of the application these two data streams (‘official’ AQHI and volunteered) could even be combined.

    Reply
  6. Pingback: Moving toward open data standards « ext337

  7. Anonymous

    The best model in Canada on the topic of standards is the Canadian Geospatial Data Infrastructure (CGDI) delivered by the GeoConnections program (http://www.geoconnections.org/en/index.html).  They created multidisciplinary advisory nodes in 2000 or so, such as Standards, Portal, Access, policy, Marine, Technology Advisory Panel, Framework Data, Atlas of Canada, Environment, etc. (see their resource library – http://www.geoconnections.org/en/resourcelibrary, http://www.geoconnections.org/en/resourcelibrary/keyStudiesReports and for developers http://www.geoconnections.org/en/communities/developers/index.html) Members were sometimes at multiple tables to ensure the knowledge and ideas rotated between them all.  Members were from all fed departments dealing with geo, private sector (e.g. ogc) , libraries, provinces and territories.  They devised an open architecture, supported innovative programs like GeoGratis and Geobase, evolved, developed open standards and now support the creation of initiatives that built according to CGDI specs and standards.  So now we see public health, environment etc. applications that are based on these standards.  The archival community has come up with OAIS (http://en.wikipedia.org/wiki/Open_Archival_Information_System) and Library Archives Canada has come up with some preservation file format standards (http://www.collectionscanada.gc.ca/digital-initiatives/012018-2200-e.html).  Also, librarians have developed some of the best catalogs see (http://liswire.com/content/second-phase-natural-resources-canada-libraries-now-live-evergreen) and ODESI (http://search1.odesi.ca/) and have adopted standards such as (http://spotdocs.scholarsportal.info/display/odesi/links).  It would be good to collaborate with these well established communities before re-inventing any wheels.  Librarians especially since they were one of the first to advocate for open data (http://www.statcan.gc.ca/dli-ild/dli-idd-eng.htm).  
    Research libraries in particular are developing trusted digital data repositories and have well developed data sharing protocols between institutions and faculty that are also worth looking into.  It would be great to see what exists and works in other data communities as opposed to just the open data cities.  Cities would do well to look at existing systems, especially since many of these are open source and work well with large sets of data.

    Reply
    1. Michael Richardson

      @ThomKearney, the Canadian Standards Council is still worrying over how to best insert DRM into the PDF files of the specifications that they BUY into.  If the CSC is going to be involved in open data, it needs to first get a business model, one where the public can actually get the specifications.

      Reply
  8. Pingback: The apps vanguard | Words

  9. Pingback: Weekly Link Roundup (weekly) | Jon Stahl's Journal

  10. Michael Richardson

    David, if by real-estate data you mean things like land-transfers and ownership and lot sizes, then I can see a definite use, and even a business plan.  I’d like to know how various changes I might make to my house will affect my property taxes.  In Ontario we have this economic abomination called market-value assessment, which has nothing to do with market, value or assessment. A black chamber organization (MPAC) does this process based upon data that is often hard to get.  If the public had the ability to reproduce *OR NOT* the assessment, I think it would lead to much better land utilization.

    Reply
  11. Pingback: The State of Open Data Licenses in Canada and where to go from here | eaves.ca

  12. Pingback: OpenData als Chance für Verkehrsunternehmen » Open, Data, Verkehrsdaten, Applikationen, Wiener, Services » open3.at

  13. Herb Lainchbury

    I agree with Michael.  Real estate data would be extremely valuable to the people I have talked to about it.  Combined with things like land use rezoning applications, building permits, property taxes and crime rates one could create very interesting apps about the single largest investment that many of us make in our lifetimes.  I think the fact that people are not paying attention to that now is a symptom of the difficulty of gaining access to this sort of data today.

    Reply
