The State of Open Data in Canada: The Year of the License

Open Data is now an established fact in a growing list of Canadian cities. Vancouver, Toronto, Edmonton and Ottawa have established portals; Montreal, Calgary, Hamilton and some other cities are looking into launching their own; and a few provinces are rumored to be exploring open data portals as well.

This is great news and a significant accomplishment. While at the national level Canada is falling further behind leaders such as England, the United States, Australia and New Zealand, at the local and potentially provincial/state level, Canada could position itself as an international leader.

There is, however, one main obstacle: our licenses.

The current challenge:

So far most Open Data portals adopt what has been termed the Vancouver License (it was created by Vancouver for its open data portal and has subsequently been adopted, with occasional minor changes, by virtually every other jurisdiction).

The Vancouver license, however, suffers from a number of significant defects. As someone who was involved in its creation, I can say these “bugs” were a necessary tradeoff. If we had been looking for a perfect license that satisfied all stakeholders, I suspect we’d still be arguing about it and there’d be no open data or data portal with the Vancouver license. Today, thanks in part to the existence of these portals, the understanding of this issue among our politicians, policy makers and government lawyers has expanded. This fact, in combination with a growing number of complaints about the licenses from non-profits and businesses interested in using open data, has fostered growing interest in adjusting them.

This is encouraging. And we must capitalize on the moment. I wish to be clear: until Canadian governments get the licensing issue right, Open Data cannot advance in this country. Open Data released by governments will not enjoy significant reuse, undermining one of the main reasons for doing Open Data.

There are a few things everyone agrees a new license needs to cover. It must establish there is no warranty to the data and that the government cannot be held liable for any reuse. So let’s focus on the parts that governments most often get wrong.

Here, there are 3 things a new license needs to get right.

1. No Attribution

[Image: Nascar Jeff Gordon #24 by Dan Raustadt, licensed CC-NC-ND]

We need a license that does not require attribution. First, attribution gets messy fast – all those city logos crammed in on a map, on a mobile phone? It’s fine when you are using data from one or two cities, but what happens when you start using data from 10 different governments, or 50? Pretty soon you’ll have NASCAR apps that will look ugly and be unusable.

More importantly, the goal of open data isn’t to create free advertising for governments; it’s to support innovation and reuse. These are different goals and I think we agree on which one is more important.

Finally, what government is actually going to police this part of the license? Don’t demand what you aren’t going to enforce – and no government should waste precious resources by paying someone to scour the internet to find websites and apps that don’t attribute.

2. No Share-Alike

One area where the Vancouver license falls down is share-alike, in this clause:

If you distribute or provide access to these datasets to any other person, whether in original or modified form, you agree to include a copy of, or this Uniform Resource Locator (URL) for, these Terms of Use and to ensure they agree to and are bound by them but without introducing any further restrictions of any kind.

The last phrase is particularly problematic as it makes the Vancouver license “viral.” Any new data created through a mash-up that involves data with the Vancouver license must also use the Vancouver license. This will pretty much eliminate any private sector use of the data, since a company will want to be able to license any new data set it creates in a manner that is appropriate to its business model. It also has a chilling effect on those who would like to use the data but would need to keep the resulting work private, or restricted to a limited group of people. Richard Weait has an unfortunately named blog post that provides an excellent example of this problem.

Any new license should not be viral, so as to encourage a variety of reuses of any data.

3. Standardized

The whole point of Open Data is to encourage the reuse of a public asset. So anything a government does that impedes this reuse will hamper innovation and undermine the very purpose of the initiative. Indeed, the open data movement has, in large part, come to life because one traditional impediment to using data has disappeared: data can now usually be downloaded and is available in open formats that anyone can use. The barriers to use have declined, so more and more people are interested.

But the other barrier to re-use is legal. If licenses are not easily understood then individuals and businesses will not reuse data, even when it is easily downloadable from a government’s website. Building a business or a new non-profit activity on a public asset to which your rights are unclear is simply not viable for many organizations. This is why every government should want its license to be easily understood – lowering the barriers to access means making data downloadable and reducing the legal barriers.

Most importantly, it is also why it is ideal if there is a single license in the whole country, as this would significantly reduce transaction and legal costs for all players. This is why I’ve been urging Canada’s leading cities to adopt a single common license.

So, there are two ways of doing this.

The easiest is for Canadian governments to align themselves with several of the international standardized open data licenses that already exist. There are a variety out there. My preference is Open Data Commons’ Public Domain Dedication and License (PDDL), although they also publish the Open Database License (ODC-ODbL) and the Attribution License (ODC-By). There is also the Creative Commons CC0 license, which Creative Commons suggests using for open data (I actually recommend against all of these except the PDDL for governments, but more on that later).

These licenses have several advantages.

First, standardized licenses are generally well understood. This means people don’t have to educate themselves on the specifics of dozens of different licenses.

Second, they are stable. Because these licenses are managed by independent authorities and many people use them, they evolve cautiously, and balance the interests of consumers and sharers of data or information.

Third, these licenses balance interests responsibly. The creators of these licenses have thought through all the issues that pertain to open data and so give both consumers of data and distributors of data comfort in knowing that they have a license that will work.

A second option is for governments in Canada to align around a self-generated common license. Indeed, this is one area where the Federal Government could show (some presently lacking) leadership (although GeoGratis does have a very good license). This, for example, appears to be happening in the UK, where the national government has created an Open Government Licence.

My hope is that, before the year is out, jurisdictions in Canada begin to move towards a common license, or begin adopting some standard licenses.

Specifically, it would be great to see various Canadian jurisdictions either:

a) Adopt the PDDL (like the City of Surrey, BC). There are some references to European Data Rights in the PDDL but these have no meaning in Canada and should not be an obstacle – and may even reassure foreign consumers of Canadian data. The PDDL is the most open and forward-looking license.

b) Adopt the UK government’s Open Government Licence. This license is the best created by any government to date (with the exception of simply making the data public domain, which, of course, is far preferable).

c) Use a modified version of the GeoGratis license that adjusts the “3.0 PROTECTION AND ACKNOWLEDGEMENT OF SOURCE” clause to prevent the NASCAR effect from taking place.

What I hope does not happen is that:

a) More and more jurisdictions continue to use the Vancouver License. There are better options and it is an opportunity to launch an open data policy and leapfrog the current leaders in the space.

b) Jurisdictions adopt a Creative Commons license. Creative Commons was created to help license copyrighted material. Since data cannot be copyrighted, the use of Creative Commons risks confusing the public about the inherent rights they have to data. This is, in part, a philosophical argument, but it matters, especially for governments. We – and our governments especially – cannot allow people to begin to believe that data can be copyrighted.

c) There is no change to the current licenses being used, or a new license that goes against the attributes described above, like the Open Database License (ODC-ODbL), is adopted.

Let’s hope we make progress on this front in 2011.

13 thoughts on “The State of Open Data in Canada: The Year of the License”

  1. Anonymous

    Hey David;

    I think there is great merit in discussing the work the G4 cities are doing on this file. In particular, Ottawa has taken the lead on licencing, which started with work I initiated with Ottawa and the G4 as you know (http://traceyplauriault.ca/2010/07/21/changecamp-ottawa-2010-open-data-terms-of-use-session/). This work has led to a collaboration with David Fewer, oline Twiss and Kent Mewhort at CIPPIC (http://www.cippic.ca/) and the City of Ottawa. CIPPIC undertook to examine the current TOU, which was copied from the City of Vancouver TOU which you were instrumental in creating. Their first report was presented at a G4 meeting by teleconference and in person in Ottawa in the fall of 2010, which you participated in as a consultant to the City of Edmonton. The first report can be found here – (http://www.cippic.ca/uploads/open-licensing/CIPPIC-Ottawa_License_Report-2010-11-15.pdf). Subsequently, upon the request of those in attendance, CIPPIC conducted a risk analysis for the G4 Cities on the adoption of an ODC-By license and included in that analysis a comparison with the ODC-PDDL. That report has been submitted and is under review by the G4 (http://www.cippic.ca/uploads/open-licensing/Open_License_Comparison_Report-v2-10Feb2011.pdf). Recommendations were also made some time ago in a report commissioned by the G4 and authored by Jury Konga, of which I believe you have a copy. The Cities will be meeting in the coming weeks to discuss the results of these reports and the City of Ottawa in particular is examining the report with its Legal team. Also, the issue of license interoperability has been articulated at the Standing Committee on Access to Information, Privacy and Ethics by both David H. Mason from Visible Government and also in my Submission (http://datalibre.ca/2011/02/13/submission-to-standing-committee-on-access-to-information-privacy-and-ethics-study-on-open-government/), which fully references the work of CIPPIC. CIPPIC and the G4 would welcome your comments on that second report.

    I think it is really important to acknowledge the work being done on this file and to attribute those who have endeavoured to follow it through from beginning to end and those who are creating the key documents in the public interest upon which decisions will be made.

    Cheers
    Tracey

  3. Pingback: Tweets that mention The State of Open Data in Canada: The Year of the License | eaves.ca -- Topsy.com

  4. NetScr1be

    You’ve got this exactly right David. I’m hoping you’ll keep pushing by beating the drums for open data standards. Not just in data formats but policies and procedures (P&P).

    All the W’s are open questions right now. Why should the data be open? Get this one right first and the rest will follow naturally. My answer would be (at the risk of sounding like a left-wing firebrand) because the data actually belongs to those who pay the people generating the data. The various orgs definitely have a responsibility to protect themselves from liability but they also have a responsibility to their constituents and stakeholders to be as transparent as possible.
    Who is ultimately responsible in each organisation?

    What data should(n’t) be open?

    How is usage data collected?

    Where should metadata (think catalogues and usage data) reside? In Toronto, I’m advocating that the Toronto Public Library be the central access point for open data. They have all the mechanisms in place for curation and access control (to collect usage data?) AND are just beginning their strategic planning process for the next 3 (?) years. Expect to be invited to show up and comment (lead?).

    I could go on but I won’t (here and now anyway ;> ). My point is without comprehensive standards and P&P regarding the production, release, curation and metadata we will have to spend as much time filtering noise as we do mashing up data.

    This comment will be copied to the DataTO forum at http://groups.google.com/group/datato?lnk=srg&hl=en

    –NetScr1be–

  5. Pingback: Links 17/2/2011: Linux 2.6.38 RC5, SplashTop Makes MeeGo-based Platform | Techrights

  6. Carnets DianeMercier

    I also agree with Patrick (NetScr1be) on the involvement of public libraries in municipal Open Data (collective catalog, metadata, etc.). Not only because of the expertise information professionals can offer citizens, but also because it would allow public libraries to work with municipal employees who are poorly equipped in information support. The information professionals of public libraries, in addition to supporting the development of digital literacy among citizens, could (should) do the same for municipal employees. (Please excuse my English.)

  7. Pingback: The State of Open Data Licenses in Canada and where to go from here | eaves.ca

  8. Pingback: 3 Quick Wins for your Open Gov Initiative | OpenHalton

  9. Pingback: 3 Quick Wins for your Open Gov Initiative | Port 25

  10. Pingback: Burlington Open Data Pilot | OpenHalton

  11. Pingback: Canada: Burlington Open Data Pilot | Government In The Lab

  12. Pingback: The Key to Open Gov Success: Common Standards « Civic Innovations

  13. Pingback: Impose proper licensing and streamline procurement | Stop
