Open Data is now an established fact in a growing list of Canadian cities. Vancouver, Toronto, Edmonton and Ottawa have established portals; Montreal, Calgary, Hamilton and some other cities are looking into launching their own; and a few provinces are rumored to be exploring open data portals as well.
This is great news and a significant accomplishment. While at the national level Canada is falling further behind leaders such as England, the United States, Australia and New Zealand, at the local and potentially provincial/state level, Canada could position itself as an international leader.
There is, however, one main obstacle: our licenses.
The current challenge:
So far most Open Data portals adopt what has been termed the Vancouver License (it was created by Vancouver for its open data portal and has subsequently been adopted, with occasional minor changes, by virtually every other jurisdiction).
The Vancouver license, however, suffers from a number of significant defects. As someone who was involved in its creation, I can say these “bugs” were a necessary tradeoff. If we had been looking for a perfect license that satisfied all stakeholders, I suspect we’d still be arguing about it and there’d be no open data or data portal with the Vancouver license. Today, thanks in part to the existence of these portals, our politicians’, policy makers’ and government lawyers’ understanding of this issue has expanded. This fact, in combination with a growing number of complaints about the licenses from non-profits and businesses interested in using open data, has fostered growing interest in adjusting it.
This is encouraging. And we must capitalize on the moment. I wish to be clear: until Canadian governments get the licensing issue right, Open Data cannot advance in this country. Open Data released by governments will not enjoy significant reuse, undermining one of the main reasons for doing Open Data.
There are a few things everyone agrees a new license needs to cover. It must establish that there is no warranty on the data and that the government cannot be held liable for any reuse. So let’s focus on the parts that governments most often get wrong.
Here, there are 3 things a new license needs to get right.
1. No Attribution
[Image: NASCAR Jeff Gordon #24, by Dan Raustadt, licensed CC-NC-ND]
We need a license that does not require attribution. First, attribution gets messy fast – all those city logos crammed in on a map, on a mobile phone? It’s fine when you are using data from one or two cities, but what happens when you start using data from 10 different governments, or 50? Pretty soon you’ll have NASCAR apps that look ugly and are unusable.
More importantly, the goal of open data isn’t to create free advertising for governments; it’s to support innovation and reuse. These are different goals and I think we agree on which one is more important.
Finally, what government is actually going to police this part of the license? Don’t demand what you aren’t going to enforce – and no government should waste precious resources by paying someone to scour the internet to find websites and apps that don’t attribute.
2. No Share alike
One area where the Vancouver license falls down on share-alike is in this clause:
The last phrase is particularly problematic as it makes the Vancouver license “viral.” Any new data created through a mashup that involves data with the Vancouver license must also use the Vancouver license. This will pretty much eliminate any private sector use of the data, since a company will want to be able to license any new data set it creates in a manner that is appropriate to its business model. It also has a chilling effect on those who would like to use the data but would need to keep the resulting work private, or restricted to a limited group of people. Richard Weait has an unfortunately named blog post that provides an excellent example of this problem.
Any new license should not be viral, so as to encourage a variety of reuses of any data.
3. Standardized
The whole point of Open Data is to encourage the reuse of a public asset. So anything a government does that impedes this reuse will hamper innovation and undermine the very purpose of the initiative. Indeed, the open data movement has, in large part, come to life because one traditional impediment to using data has disappeared: data can now usually be downloaded and is available in open formats that anyone can use. The barriers to use have declined, so more and more people are interested.
But the other barrier to reuse is legal. If licenses are not easily understood then individuals and businesses will not reuse data, even when it is easily downloadable from a government’s website. Building a business or a new non-profit activity on a public asset to which your rights are unclear is simply not viable for many organizations. This is why every government should want its license to be easily understood – lowering the barriers to access means both making data downloadable and reducing the legal barriers.
Most importantly, it is also why it is ideal if there is a single license in the whole country, as this would significantly reduce transaction and legal costs for all players. This is why I’ve been urging Canada’s leading cities to adopt a single common license.
So, there are two ways of doing this.
The easiest is for Canadian governments to align themselves with one of the international standardized open data licenses that already exist. There are a variety out there. My preference is the Open Data Commons Public Domain Dedication and License (PDDL), although Open Data Commons also publishes the Open Database License (ODC-ODbL) and the Attribution License (ODC-By). There is also the Creative Commons CC0 license, which Creative Commons suggests using for open data (I actually recommend against all of these except the PDDL for governments, but more on that later).
These licenses have several advantages.
First, standardized licenses are generally well understood. This means people don’t have to educate themselves on the specifics of dozens of different licenses.
Second, they are stable. Because these licenses are managed by independent authorities and many people use them, they evolve cautiously and balance the interests of consumers and sharers of data or information.
Third, these licenses balance interests responsibly. The creators of these licenses have thought through all the issues that pertain to open data, and so give both consumers and distributors of data comfort in knowing that they have a license that will work.
A second option is for governments in Canada to align around a self-generated common license. Indeed, this is one area where the Federal Government could show (some presently lacking) leadership (although GeoGratis does have a very good license). This, for example, appears to be happening in the UK, where the national government has created an Open Government Licence.
My hope is that, before the year is out, jurisdictions in Canada begin to move towards a common license, or begin adopting some standard licenses.
Specifically, it would be great to see various Canadian jurisdictions either:
a) Adopt the PDDL (like the City of Surrey, BC). There are some references to European Data Rights in the PDDL but these have no meaning in Canada and should not be an obstacle – and may even reassure foreign consumers of Canadian data. The PDDL is the most open and forward looking license.
b) Adopt the UK government’s Open Government Licence. This license is the best created by any government to date (with the exception of simply making the data public domain, which, of course, is far better).
c) Use a modified version of the GeoGratis license that adjusts the “3.0 PROTECTION AND ACKNOWLEDGEMENT OF SOURCE” clause to prevent the NASCAR effect from taking place.
What I hope does not happen is that:
a) More and more jurisdictions continue to use the Vancouver License. There are better options, and launching an open data policy is an opportunity to leapfrog the current leaders in the space.
b) Jurisdictions adopt a Creative Commons license. Creative Commons was created to help license copyrighted material. Since data cannot be copyrighted, the use of Creative Commons risks confusing the public about the inherent rights they have to data. This is, in part, a philosophical argument, but it matters, especially for governments. We – and our governments especially – cannot allow people to begin to believe that data can be copyrighted.
c) There is no change to the current licenses being used, or a new license that goes against the attributes described above, like the Open Database License (ODC-ODbL), is adopted.
Let’s hope we make progress on this front in 2011.