
The Importance of Open Data Critiques – thoughts and context

Over at the Programmable City website Rob Kitchin has a thoughtful blog post on open data critiques. It is very much worth reading and deserves wider discussion. Specifically, there are two things worth noting. First, it is important for the open data community – and advocates in particular – to acknowledge the responsibility we have in debates about open data. Second, I’d like to examine some of the critiques raised and discuss those I think misfire and those that deserve deeper dives.

Open Data as Dominant Discourse

During my 2011 keynote at Open Government Data Camp I talked about how the open data movement was at an inflection point:

For years we have been on the outside, yelling that open data matters. But now we are being invited inside.

Two years later the transition is more than complete. If you have any doubts, consider this picture:

OD as DC

Once you have these people talking about things like a G8 Open Data Charter you are no longer on the fringes. Not even remotely.

It also means understanding the challenges around open data has never been more important. We – open data advocates – are now complicit in what many of the above (mostly) men decide to do around open data. Hence the importance of Rob’s post. Previously those with power were dismissive of open data – you had to scream to get their attention. Today, those same actors want to act now and go far. Point them (or the institutions they represent) in the wrong direction and/or frame an issue incorrectly and you could have a serious problem on your hands. Consequently, the responsibility of advocates has never been greater. This is even more the case as open data has spread. Local variations matter. What works in Vancouver may not always be appropriate in Nairobi or London.

I shouldn’t have to say this but I will, because it matters so much: Read the critiques. They matter. They will make you better, smarter, and above all, more responsible.

The Four Critiques – a breakdown

Reading the critiques and agreeing with them is, of course, not the same thing. Rob cites four critiques of open data: funding and sustainability, politics of the benign and empowering the empowered, utility and usability, and neoliberalisation and marketisation of public services. Some of these I think miss the real concerns and risks around open data, others represent genuine concerns that everyone should have at the forefront of their thinking. Let me briefly touch on each one.

Funding and sustainability

This one strikes me as the least effective criticism. Outside the World Bank I’ve not heard of many examples where governments effectively sell their data to make money. I would be very interested in examples to the contrary – it would make for a great list and would enlighten the discussion – although not, I suspect, in ways that would make either side of the discussion happy.

The little research that has been done into this subject suggests that charging for government data almost never yields much money, and often actually serves as a loss-creating mechanism. Indeed, a 2001 KPMG study of Canadian geospatial data found governments almost never made money from data sales if purchases by other levels of government were not included. Again in Canada, Statistics Canada argued for years that it couldn’t “afford” to make its data open (free) as it needed the revenue. However, it turned out that the annual sum generated by these sales was around $2 million – hardly a major contributor to its bottom line. And of course, this does not count the money that had to go towards salaries and systems for tracking buyers and users, chasing down invoices, etc.

The disappointing line in the critique however was this:

de Vries et al. (2011) reported that the average apps developer made only $3,000 per year from apps sales, with 80 percent of paid Android apps being downloaded fewer than 100 times.  In addition, they noted that even successful apps, such as MyCityWay which had been downloaded 40 million times, were not yet generating profits.

Ugh. First, apps are not what is going to make open data interesting or sexy. I suspect they will make up maybe 5% of the ecosystem. The real value is going to be in analysis and in enhancing other services. It may also be in the costs open data eliminates – and thus the capital and time it frees up – rather than in the companies it creates, something I outlined in Don’t Measure the Growth, Measure the Destruction.

Moreover, this is the internet. The average doesn’t mean anything. The average webpage probably gets 2 page views per day; that hardly means there aren’t lots of very successful webpages. The distribution is not a bell curve, it’s a long tail, so it is hard to see what the average tells us other than that the cost of experimentation is very, very low. It tells us very little about whether there are, or will be, successful uses of open data.
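
To make the statistical point concrete, here is a minimal sketch (in Python, with entirely invented numbers) of why the mean is so uninformative for a long-tailed distribution: a few huge successes drag the average up while the typical outcome stays tiny.

```python
import random

random.seed(42)

# Simulate a long-tailed "downloads per app" distribution (Pareto-like).
# All numbers here are invented purely to illustrate the statistical point.
downloads = [int(100 * random.paretovariate(1.2)) for _ in range(10_000)]

mean = sum(downloads) / len(downloads)
median = sorted(downloads)[len(downloads) // 2]
top_share = sum(sorted(downloads)[-100:]) / sum(downloads)  # top 1% of apps

print(f"mean downloads:   {mean:,.0f}")
print(f"median downloads: {median:,}")
print(f"share of downloads captured by the top 1% of apps: {top_share:.0%}")
# The mean lands several times above the median, and a tiny fraction of apps
# captures a large share of all downloads - so the "average app" says almost
# nothing about whether successful apps exist.
```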

Politics of the benign and empowering the empowered

This is the most important critique and it needs to be engaged. There are definitely cases where data can serve to further marginalize at-risk communities. In addition, there are data sets that, for reasons of security and privacy, should not be made open. I’m not interested in publishing the locations of women’s shelters or, worse, the list of families taking refuge in them. Nor do I believe that open data will always serve to challenge the status quo or create greater equality. Even at its most reductionist – if one believes that information is power, then a greater ability to access and make use of information makes one more powerful – this means that the creation of new information will create winners and losers.

There are, however, two things that give me some hope in this space. The first is that, when it comes to open data, the axis of competition among providers usually centers on accessibility. For example, the Socrata platform (a provider of open data portals to governments) invests heavily in creating tools that make government data accessible and usable to the broadest possible audience. This is not a claim that all communities are being engaged (far from it) or that a great deal more work cannot be done, but there is a desire to show greater use, which drives some data providers to try to find ways to engage new communities.

The second is that if we want to create a data-literate society – and I think we do, for reasons of good citizenship, social justice and economic competitiveness – we need the data first for people to learn and play with. One of my most popular blog posts is Learning from Libraries: The Literacy Challenge of Open Data, in which I point out that one of the best ways to help people become data literate is to give them more interesting data to play with. My point is that we didn’t build libraries after everyone knew how to read; we built them beforehand, with the goal of having them as a place that could facilitate learning and education. Of course, libraries also often have strong teaching components, and we definitely need more of this. Figuring out whom to engage, and how it can be done most effectively, is something I’m deeply interested in.

There are also things that often depress me. I struggle to think of technologies that did not empower the empowered – at least initially. From the cell phone to the car to the printing press to open source software, all these inventions have helped billions of people, but their benefits did not distribute themselves evenly, especially at first. So the question cannot be reduced to whether open data will empower the empowered, but to what degree, where, and with whom. I’ve seen plenty of evidence where data has enabled small groups of people to protect their communities or make more transparent the impact (or lack thereof) of a government regulation. Open data expands the number of people who can use government information for their own ends – this, I believe, is a good thing – but that does not mean we shouldn’t be constantly looking for ways to ensure it does not reinforce structural inequity. Achieving perfect distribution of the benefits of a new technology, or even a public policy, is almost impossible, so we cannot make the perfect the enemy of the good. However, that does not change the fact that there are real risks – and responsibilities as advocates – that need to be considered here. This is an issue that will need to be constantly engaged.

Utility and Usability

Some of the issues around usability I’ve addressed above in the accessibility piece – for some portals (that genuinely want users), the axis of evolution is pointed in the right direction, with governments and companies (like Socrata) trying to embed more tools in the website to make the data more usable.

I also agree with the central concern (not a critique) of this section, which is that rather than creating a virtuous circle, poorly thought out and poorly launched open data portals will create a negative “doom loop” in which poor quality data begets little interest, which begets less data. However, the concern, in my mind, focuses on too narrow a problem.

One of the big reasons I’ve been an advocate of open data is a desire not just to help citizens, non-profits and companies gain access to information that could help them with their missions, but to change the way government deals with its data so that it can share it internally more effectively. I often cite a public servant I know who had a summer intern spend three weeks surfing the national statistical agency’s website to find data they knew existed but could not locate because of terrible design and search. A poor open data site is not just a sign that the public can’t access or effectively use government data; it usually suggests that the government’s own employees can’t access or effectively use their data either. This is often deeply frustrating to many public servants.

Thus, the most important outcome created by the open data movement may have been making governments realize that data represents an asset class of which they have had little understanding (outside, sadly, the intelligence sector, which has been all too aware of this) and for which they have had little policy and governance (outside, say, the GIS space and some personal records categories). Getting governments to think about data as a platform (yes, I’m a fan of government as a platform for external use, but above all for internal use) is, in my mind, one way we can enable public servants to get better access to information while simultaneously attacking the huge vendors (like SAP and Oracle) whose $100 million implementations often silo off data, rarely produce the results promised and are so obnoxiously expensive it boggles the mind (Clay Johnson has some wonderful examples of the roughly 50% of large IT projects that fail).

The key to all this is that open data can’t be something you slap on top of a big IT stack. I try to explain this in It’s the Icing Not the Cake, another popular blog post, about why Washington DC was able to launch an effective open data program so quickly (one which was, apparently, so effective at bringing transparency to procurement data that the subsequent mayor rolled it back). The point is that governments need to start thinking in terms of platforms if – over the long term – open data is going to work. And they need to start thinking of themselves as the primary consumers of the data being served on those platforms. Steve Yegge’s brilliant and sharp-witted rant on how Google doesn’t get platforms is an absolute must-read in this regard for any government official – the good news is you are not alone in not finding this easy. Google struggles with it as well.

My main point: let’s not play at the edges and merely define this challenge as one of usability. It is a much, much bigger problem than that. It is a big, deep, culture-changing BHAG of a problem that needs tackling. If we get it wrong, the big government vendors and the inertia of bureaucracy win. If we get it right, we could potentially save taxpayers millions while enabling a more nimble, effective and responsive government.

Neoliberalisation and Marketisation of Government

If you have not read Jo Bates’ article “Co-optation and contestation in the shaping of the UK’s Open Government Data Initiative,” I highly recommend it. There are a number of arguments in the article I’m not sure I agree with (and which feel softened by her conclusion – so do read it all first). For example, the notion that open data has been co-opted into an “ideologically framed mould that champions the superiority of markets over social provision” strikes me as lacking nuance. One of the things open data can do is create public recognition of a publicly held data set and of the need to protect it against being privatized. Of course, what I suspect is that both things could be true simultaneously – there can be increased recognition of the importance of a public asset while also recognizing the increased social goods and market potential in leveraging said asset.

However, there is one thing Bates is absolutely correct about: open data does not come into an empty playing field. It will be used by actors – on both the left and right – to advance their cause. So I too am uncomfortable with those who believe open data is going to somehow depoliticize government or politics – indeed, I made a similar argument in a piece in Slate on the politics of data. As I try to point out there, you can only create a perverse, gerrymandered electoral district that looks like this…

gerrymandered in chicago

… if you’ve got pretty good demographic data about the target communities you want to engage (or avoid). Data – and even open data – doesn’t magically make things better. There are instances where open data can, I believe, create positive outcomes by shifting incentives in appropriate ways… but similarly, it can help all sorts of actors find ways to satisfy their own goals, which may not be aligned with yours – or even society at large’s.

This makes voices like Bates’s deeply important, since they will challenge those of us interested in open data to constantly evaluate the language we use, the coalitions we form and the priorities that get made, in ways that I think are profoundly important. Indeed, if you get to the end of Bates’ article there is a list of recommendations that I don’t think anyone I work with around open data would find objectionable – quite the opposite, they would agree they are completely critical.

Summary

I’m so grateful to Rob for posting this piece. It has helped me put into words some thoughts I’ve had, both about the open data criticisms and about the important role the critiques play. I try hard to be a critical advocate of open data – one who engages the risks and challenges posed by open data. I’m not perfect, and balancing these two goals – advocacy with a critical view – is not easy, but I hope this shines some light on the ways I’m trying to balance it and possibly helps others do more of it as well.

Government Procurement Reform – It matters

Earlier this week I posted a slidecast on my talk to Canada’s Access to Information Commissioners about how, as they do their work, they need to look deeper into the government “stack.”

My core argument was that decisions about what information gets made accessible are no longer best managed at the end of a policy development or program delivery process, but rather should be embedded in it. This means monkeying around in, and ensuring there is capacity to export government information and data from, the tools (e.g. software) government uses every day. Logically, this means monkeying around in procurement policy (see slide below), since that is where the specs for the tools public servants use get set. Trying to bake “access” into processes after the software has been chosen is, well, often an expensive nightmare.

Gov stack

Privately, one participant from a police force came up to me afterward and said that I was simply guiding people to another problem: procurement. He is right. I am. Almost everyone I talk to in government feels that procurement is broken. I’ve said as much myself in the past. Clay Johnson is someone who has thought about this more than most; here he is below at the Code for America Summit with a great slide (and talk) about how the current government procurement regime rewards all the wrong behaviours and, often, all the wrong players.

Clay Risk profile

So yes, I’m pushing the RTI and open data communities to think about procurement on purpose. Procurement is borked. Badly. Not just from a wasted-tax-dollars perspective, or even just from a service delivery perspective, but also because it doesn’t serve the goals of transparency well. Quite the opposite. More importantly, it isn’t going to get fixed until more people start pointing out that it is broken and start contributing to solving this major bottleneck of a problem.

I highly, highly recommend reading Clay Johnson’s and Harper Reed’s opinion piece in today’s New York Times about procurement titled Why the Government Never Gets Tech Right.

All of this becomes more important if the White House (and other governments at all levels) has any hope of executing on its digital strategy (image below). There is going to be a giant effort to digitize much of what governments do, and a huge number of opportunities for finding efficiencies and improving services are going to come from this. However, if all of this depends on multi-million (or worse, 10 or 100 million) dollar systems and websites, we are, to put it frankly, screwed. The future of government isn’t to be (continue to be?) taken over by some massive SAP implementation that is so rigid and controlled it gives governments almost no opportunity to innovate. And yet this is the future our procurement policies steer us toward: a future with only a tiny handful of possible vendors, a high risk of project failure, and highly rigid and frail systems that are expensive to adapt.

Worse, there is no easy path here. I don’t see anyone doing procurement right. So we are going to have to dive into a thorny, tough problem. However, the more governments try to tackle it in radical ways, the faster we can learn some new and interesting lessons.

Open Data WH

What Traffic Lights Say About the Future of Regulation

I have a piece up on TechPresident about some crazy regulations that took place in Florida that put citizens at greater risk, all so the state and local governments can make more money.

Here’s a chunk:

In effect, what the state of Florida is saying is that a $20 million increase in revenue is worth an increase in risk of property damage, injury and death as a result of increased accidents. Based on national statistics, there are likely about 62 deaths and 5,580 injuries caused by red light running in Florida each year. If shorter yellow lights increased that rate by 10 percent (far less than predicted by the USDOT) that could mean an additional 6 deaths and 560 injuries. Essentially the state will raise a measly extra $35,000 for each injury or death its regulations help to cause, and possibly far less.
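
It is worth making the back-of-envelope arithmetic in that quote explicit; a quick sketch using only the figures quoted above:

```python
# Back-of-envelope reproduction of the arithmetic in the quote above.
revenue_increase = 20_000_000  # projected extra revenue from shorter yellows ($)

deaths_per_year = 62       # red-light-running deaths in Florida (quoted estimate)
injuries_per_year = 5_580  # red-light-running injuries in Florida (quoted estimate)
rate_increase = 0.10       # assumed 10% rise in incidents (below USDOT predictions)

extra_deaths = deaths_per_year * rate_increase      # ~6
extra_injuries = injuries_per_year * rate_increase  # ~558

revenue_per_casualty = revenue_increase / (extra_deaths + extra_injuries)
print(f"~${revenue_per_casualty:,.0f} raised per additional injury or death")
# Prints roughly $35,000 - the "measly" figure in the quote.
```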

Some Nice Journalistic Data Visualization – Global’s Crude Awakening

Over at Global, David Skok and his team have created a very nice visualization of the 28,666 crude oil spills that have happened on Alberta pipelines over the last 37 years (that’s about two a day). Indeed, for good measure they’ve also visualized the additional 31,453 spills of “other” substances carried by Alberta pipelines (saltwater, liquid petroleum, etc.).

They’ve even created a lookup feature so you can tackle the data geographically, by name, or by postal code. It is pretty in-depth.

Of course, I believe all this data should be open. Sadly, they had to get at it through a complicated Access to Information request that appears to have consumed a great deal of time and resources, and that would probably only be possible for a media organization with the dedicated resources (legal and journalistic) and leverage to demand it. Had this data been open there would still have been a great deal of work to parse, understand and visualize it, but it would have helped lower the cost of development.
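
For a sense of what that parsing step looks like once the data is in machine-readable form, here is a minimal sketch; the file name and column names are hypothetical, invented purely for illustration:

```python
import csv
from collections import Counter

# Hypothetical CSV extract of the spill records - the file name and
# column names are invented for illustration.
spills_per_year = Counter()
with open("alberta_crude_spills.csv", newline="") as f:
    for row in csv.DictReader(f):
        year = row["incident_date"][:4]  # e.g. "1998-07-14" -> "1998"
        spills_per_year[year] += 1

for year in sorted(spills_per_year):
    print(year, spills_per_year[year])

# Sanity check on the headline number: 28,666 spills over 37 years...
print(f"average spills per day: {28_666 / (37 * 365):.1f}")  # ~2.1
```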

In fact, if you are curious about how they got the data – and the sad, sad story it involved – take a look at the fantastic story they wrote about the creation of their oil spill website. This line really stood out for me:

An initial Freedom of Information request – filed June 8, 2012, the day after the Sundre spill – asked Alberta Environment and Sustainable Resource Development for information on all reported spills from the oil and gas industry, from 2006 to 2012.

About a month later, Global News was quoted a fee of over $4,000 for this information. In discussions with the department, it turned out this high fee was because the department was unable to provide the information in an electronic format: Although it maintained a database of spills, the departmental process was to print out individual reports on paper, and to charge the requester for every page.

So the relevant government department has the data in a machine-readable form; it just chooses to give it out only on paper. Short of simply not releasing the data at all, it is hard to imagine a more obstructionist approach to preventing the public from accessing environmental data their tax dollars paid to collect and that is supposed to be in the public interest. You essentially have to look at thousands of pieces of paper and re-enter tens, if not hundreds of thousands, of data points into spreadsheets. This is a process designed to prevent you from learning anything and to frustrate potential users.

Let’s hope that when the time comes for the Global team to update this tool and webpage there will be open data they can download and access so the task is a little easier.


Awesome Simple Open Data use case – Welcome Wagon for New Community Businesses

A few weeks ago I was at an event in Victoria, British Columbia, where people were discussing the possibilities, challenges and risks of open data. During the conversation, one of the participants talked about how they wanted an API for business license applications from the city.

This is a pretty unusual request. People have told me about their desire for business license data, especially at the provincial/state and national level, but also at the local level. However, they are usually happy with a data dump once a year or quarter, since they generally want to analyze the data for urban planning or business planning reasons. But an API – which would mean essentially constant access to the data and the opportunity to see changes to the database in real time (e.g. if a business registered or moved) – was different.

The reason? The individual – who was an entrepreneur and part of the local Business Improvement Area – wanted to be able to offer a “welcome wagon” to new businesses in his community. If he knew when a business opened up, he could reach out and help welcome them to the neighborhood. He thought it was always nice when shopkeepers knew one another, but he didn’t always know what was going on even a few blocks away because, well, he was often manning his own shop. I thought it was a deeply fascinating example of how open data could help foster community, and it is something I would never have imagined.
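
For what it’s worth, the “welcome wagon” idea would need very little code. Here is a sketch of the sort of poller that could drive it, where the endpoint URL and field names are entirely hypothetical (no such API exists as far as I know):

```python
import time
import requests

# Hypothetical endpoint and field names - invented purely to illustrate
# the idea; no such API is known to exist.
API = "https://data.city.example.ca/api/business-licences.json"

seen = set()
while True:
    for lic in requests.get(API, timeout=10).json():
        if lic["licence_id"] not in seen:
            seen.add(lic["licence_id"])
            print(f"New business: {lic['trade_name']} at {lic['address']}")
            # ...trigger the welcome wagon: an email, a card, a visit.
    time.sleep(3600)  # hourly polling is "real time" enough for this use
```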

Food for thought and wanted to share.


The Value of Open Data – Don’t Measure Growth, Measure Destruction

Alexander Howard – who, in my mind, is the best guy covering the Gov 2.0 space – pinged me the other night to ask “What’s the best evidence of open data leading to economic outcomes that you’ve seen?”

I’d like to hack the question because – I suspect – many people will be looking to measure “economic outcomes” in ways that I think are too narrow to be helpful. For example, if you are wondering what the big companies are going to be that come out of the open data movement, and/or what big savings governments are going to find by sifting through the data, I think you are probably looking for the wrong indicators.

Why? Part of it is because the number of “big” examples is going to be small.

It’s not that I think there won’t be any. For example, several years ago I blogged about how FOIed (or, in Canada, ATIPed) data that should have been open helped find $3.2B in evaded tax revenues channeled through illegal charities. It’s just that this is probably not where the wins will initially take place.

This is in part because most data for which there was likely to be an obvious and large economic impact (e.g. spawning a big company or saving a government millions) will already have been analyzed or sold by governments before the open data movement came along. On the analysis side of the question: if you were very confident a data set could yield tens or hundreds of millions in savings… well… you were probably willing to pay SAS or some other analytics firm $30-100K to analyze it. And you were probably willing to pay SAP a couple of million (a year?) to set up the infrastructure just to gather the data.

Meanwhile, on the “private sector company” side of the equation: if that data had value, there were probably eager buyers. In Canada, for example, census data – useful for planning where to locate stores or how to engage in marketing and advertising effectively – was sold because the private sector made it clear it was willing to pay to gain access to it. (Sadly, this was bad news for academics, non-profits and everybody else, for whom it should have been free, as it was in the US.)

So my point is that a great deal of the (again) obvious low-hanging fruit had probably been picked long before the open data movement showed up, because governments – or companies – were willing to invest some modest amounts to create the benefits that picking those fruit would yield.

This is not to say I don’t think there are diamonds in the rough out there – data sets that will reveal significant savings – but I doubt they will be obvious or easy finds. Nor do I think billion dollar companies are going to spring up around open data sets overnight, since – by definition – open data has low barriers to entry for any company that adds value to it. One should remember it took Red Hat two decades to become a billion dollar company. Impressive, but still tiny compared to many of its rivals.

And that is my main point.

The real impact of open data will likely not be in the economic wealth it generates, but rather in its destructive power. I think the real impact of open data is going to be in the value it destroys and thus in the capital it frees up to do other things. Much like Red Hat is a fraction of the size of Microsoft, open data is going to enable new players to disrupt established data players.

What do I mean by this?

Take SeeClickFix. Here is a company that, leveraging the Open311 standard, is able to provide many cities with a 311 solution that works pretty much out of the box. Twenty years ago this was a $10 million+ problem for a major city to solve, and it wasn’t even something a small city could consider adopting – it was just prohibitively expensive. Today, SeeClickFix takes what was a 7 or 8 digit problem and makes it a 5 or 6 digit problem. Indeed, I suspect SeeClickFix almost works better in a small to mid-sized government that doesn’t have complex work order software and so can just use SeeClickFix as a general solution. For this part of the market, it has crushed the cost out of implementing a solution.
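
Part of what makes this disruption possible is that Open311’s GeoReport v2 is just a simple, open HTTP specification, so any vendor can implement it and any city can adopt it. A minimal sketch of pulling open service requests from a city’s endpoint (the base URL here is a placeholder; each city publishes its own):

```python
import requests

# Placeholder base URL - each city publishes its own GeoReport v2 endpoint.
BASE = "https://city.example.org/open311/v2"

resp = requests.get(f"{BASE}/requests.json", params={"status": "open"}, timeout=10)
resp.raise_for_status()

# GeoReport v2 returns a list of service requests with standard fields.
for req in resp.json():
    print(req["service_request_id"], req["service_name"], req.get("address"))
```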

Another example, and the one I’m most excited about: look at CKAN and Socrata. Most people believe these are open data portal solutions. That is a mistake. These are data management companies that happen to have made sharing (or “openness”) a core design feature. You know who else does data management? SAP. What Socrata and CKAN offer is a way to store, access, share and engage with data previously gathered and held by companies like SAP at a fraction of the cost. A SAP implementation is a 7 or 8 (or, god forbid, 9) digit problem. And many city IT managers complain that doing anything with data stored in SAP takes time and money. CKAN and Socrata may have only a fraction of the features, but they are dead simple to use, and they make it dead simple to extract and share data. More importantly, they make these costly 7 and 8 digit problems potentially become cheap 5 or 6 digit problems.
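
CKAN makes the “data management, with sharing as a design feature” point tangible: every dataset in a CKAN portal is reachable through the same simple JSON action API that powers the portal itself. A small sketch against the standard package_search action (demo.ckan.org is just an example instance):

```python
import requests

# Any CKAN portal exposes the same action API; demo.ckan.org is an example.
resp = requests.get(
    "https://demo.ckan.org/api/3/action/package_search",
    params={"q": "budget", "rows": 5},
    timeout=10,
)
result = resp.json()["result"]

print(f"{result['count']} matching datasets")
for pkg in result["results"]:
    print("-", pkg["title"])
    for res in pkg["resources"]:  # each dataset lists downloadable resources
        print("   ", res.get("format"), res.get("url"))
```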

On the analysis side, again, I do hope there will be big wins – but what I really think open data is going to do is lower the cost of creating lots of small wins – crazy numbers of tiny efficiencies. If SAP and SAS were about solving the 5 problems that could create tens of millions in operational savings for governments and companies, then Socrata, CKAN and the open data movement are about finding the 1,000 problems for which you can save between $20,000 and $1M. For example, when you look at the work Michael Flowers is doing in NYC, his analytics team is going to transform New York City’s budget. They aren’t finding $30 million in operational savings, but they are generating a steady stream of very solid 6 to low 7 digit savings, project after project. (This is to say nothing of the lives they help save with their work on ambulances and fire safety inspections.) Cumulatively, over time, these savings are going to add up to a lot. But there probably isn’t going to be a big bang. Rather, we are getting into the long tail of savings: lots and lots of small stuff… that is going to add up to a very big number while no one is looking.

So when I look at open data, yes, I think there is economic value. Lots and lots of economic value. Hell, tons of it.

But it isn’t necessarily going to happen in a big bang, and it may take place in the creative destruction it fosters and thus the capital it frees up to spend on other things. That may make it harder to measure (I’m hoping some economist much smarter than me is going to tell me I’m wrong about that) but that’s what I think the change will look like.

Don’t look for the big bang, and don’t measure the growth in spending or new jobs. Rather, let’s try to measure the destruction and the cumulative impact of a thousand tiny wins. Because that is where I think we’ll see it most.

Postscript: Apologies again for any typos – it’s late and I’m just desperate to get this out while it is burning in my brain. And thank you Alex for forcing me to put into words something I’ve been thinking about saying for months.


Canada Post and the War on Open Data, Innovation & Common Sense (continued, sadly)

Almost exactly a year ago I wrote a blog post on Canada Post’s War on the 21st Century, Innovation & Productivity. In it I highlighted how Canada Post launched a lawsuit against a company – Geocoder.ca – that recreates the postal code database via crowdsourcing. Canada Post’s case was never strong, but then, that was not their goal. For a large, taxpayer-backed company the point wasn’t to be right; it was to use the law as a way to financially bankrupt a small innovator.

This case matters – especially to small start-ups and non-profits. Open North – a non-profit on whose board of directors I sit – recently explored what it would cost to use Canada Post’s postal code database on represent.opennorth.ca, a website that helps identify the elected officials who serve a given address. The cost? $9,000 a year, far more than it could afford.

But that’s not all. There are several non-profits that use Represent to help inform donors and other users of their websites about which elected officials represent the geographies where they advocate for change. The licensing cost if you include all of these non-profits and academic groups? $50,000 a year.

This is not a trivial sum, and it is very significant for non-profits and academics. It is also a window into why Canada Post is trying to sue Geocoder.ca – which offers a version of its database for… free. That a private company can offer a similar service at a fraction of the cost (or for nothing) is, of course, a threat.

Sadly, I wish I could report good news on the one-year anniversary of the case. Indeed, I should be able to!

This is because what should have been the most important development was the Federal Court of Appeal making it even clearer that data cannot be copyrighted. This probably made it plain to Canada Post’s lawyers that they were not going to win, and made it even more obvious to those of us in the public that the lawsuit against Geocoder.ca – which has not been dropped – was completely frivolous.

Sadly, Canada Post’s reaction to this erosion of its position was not to back off, but to double down. Recognizing that they likely won’t win a copyright case over postal code data, they have decided:

a) to assert that they hold a trademark on the words ‘postal code’

b) to name Ervin Ruci – the operator of Geocoder.ca – as a defendant in the case, as opposed to just his company.

The second part shows just how vindictive Canada Post’s lawyers are, and reveals the true nature of this lawsuit. This is not about protecting trademark. This is about sending a message about legal costs and fees. This is a predatory lawsuit, funded by you, the taxpayer.

But part (a) is also sad. Having seen the writing on the wall around its capacity to win the case around data, Canada Post has suddenly decided – 88 years after it first started using “Postal Zones” and 43 years after it started using “Postal Codes” – to assert a trademark on the term? (You can read more on the history of postal codes in Canada here.)

Moreover, the legal implications if Canada Post actually won the case would be fascinating. It is unclear whether anyone would be allowed to solicit anybody’s postal code – at least if they mentioned the term “postal code” – on any form or website without Canada Post’s express permission. It leads one to ask: does the federal government have Canada Post’s express permission to solicit postal code information on tax forms? On passport renewal forms? On any form it has ever published? Because if not, then, if I understand Canada Post’s claim correctly, it is in violation of Canada Post’s trademark.

Given the current government’s goal to increase the use of government data and spur innovation, will it finally intervene in what is an absurd case that Canada Post cannot win – one that is using taxpayer dollars to snuff out innovators, that increases the costs for academics doing geospatially oriented social research, and that creates a great deal of uncertainty about how anyone online – be they non-profits, companies, academics or governments – can use postal codes?

I know of no other country in the world that has to deal with this kind of behaviour from its postal service. The United Kingdom compelled its postal service to make postal code information public years ago. In Canada, we handle the same situation by letting a taxpayer-subsidized monopoly hire expensive lawyers to launch frivolous lawsuits against innovators who are not breaking the law.

That is pretty telling.

You can read more about this, and see the legal documents, on Ervin Ruci’s blog; the story has also been well covered at canada.com.

Toronto Star Op-Ed: Muzzled Scientists, Open Government and the Limits of Rules

I’ve a piece in today’s Toronto Star, “Rules are no substitute for cultivating a culture of open government,” about the Information Commissioner’s decision to investigate the muzzling of Canadian scientists.

Some choice paragraphs:

The actions of the information commissioner are to be applauded; what is less encouraging are the limits of her ability to resolve the problem. The truth is that openness, transparency and accountability cannot be created by the adoption of new codes or rules alone.

This is because even more than programs and regulations, an open government is the result of culture, norms and leadership. And here the message — felt as strongly by government scientists as any other public servants — is clear. Public servants are allowed less and less to have a perspective, to say nothing of the ability to share that perspective.

and on ways I think this has consequences that impact the government’s agenda directly:

This breakdown in culture has consequences — some of which may impact the government’s most important priorities. Take, for example, the United States’ preoccupation with Canada’s environmental record in general and its specific concerns about the oilsands in regards to the proposed Keystone XL pipeline. The government has spent the last month trying to burnish its environmental record in anticipation of the decision. And yet, it is amazing how few in Ottawa recognize the direct link between the openness around which government scientists can speak about their work and the degree of trust that Canadians — as well as our allies — have in our capacity to protect the environment.

I hope you’ll give it a read.

Ontario’s Open Data Policy: The Good, The Bad, The Ugly and the (Missed?) Opportunity

Yesterday the province of Ontario launched its Open Data portal. This is great news and is the culmination of a lot of work by a number of good people. The real work behind getting an open data program launched is, by and large, invisible to the public, but it is essential – and so congratulations are in order for those who helped out.

Clearly this open data portal is in its early stages – something the province is upfront about. As a result, I’m less concerned with the number of data sets on the site (which, however, needs to, and should, grow over time). Hopefully the good people in the government of Ontario have some surprises for us around interesting data sets.

Nor am I concerned about the layout of the site (which needs to, and should, improve over time – for example, once you start browsing the data you end up on this URL and there is no obvious path back to the open data landing page, which makes navigating the site hard).

In fact, unlike some, I find the shortcomings in the website downright encouraging. Hopefully it means that speed, iteration and an attitude of shipping early have won out over the media-obsessed, rigid, risk-averse approach governments all too often take. Time will tell if my optimism is warranted.

What I do want to focus on is the license, since this is a core piece of infrastructure for an open data initiative. Indeed, it is the license that determines whether the data is actually open or closed. And I think we should be less forgiving of errors in this regard than in the past. It was one thing if you launched in the early days of open data, two or four years ago. But we aren’t in the early days anymore. There are over 200 government open data portals around the world. We’ve crossed the chasm, people. Not getting the license right is not a “beta” mistake anymore. It’s just a mistake.

So what can we say about the Ontario Open Data license?

First, the Good

There are lots of good things to be said about it. It clearly keys off the UK’s Open Government License, much like BC’s license did, as does the proposed Canadian Open Government License. This means that, above all, it is written in plain English and is easily understood. In addition, the general format is familiar to many people interested in open data.

The other good thing about the license (pointed out to me by the always sharp Jason Birch) is that its attribution clause is softer than the UK’s, BC’s or even the proposed federal government license’s. Ontario uses the term “should” whereas the others use the term “must.”

Sadly, this one improvement pales in comparison to some of the problems and, most importantly, the potentially lost opportunity I urgently highlight at the bottom of this post.

The Bad

While this license does have many of the good qualities of the UK license, it suffers from some major flaws. The most notable comes in this line:

Ontario does not guarantee the continued supply of the Datasets, updates or corrections or that they are or will be accurate, useful, complete, current or free and clear of any possible third party copyright, moral right, other intellectual property right or other claim.

Basically this line kills the possibility that any business, non-profit or charity will ever use this data in any serious way. Hobbyists, geeks and academics will of course use it, but this provision is deeply flawed.

Why?

Well, let me explain what it means. This says that the government cannot be held accountable for only releasing data it has the right to release. For example, say the government has software that tracks road repair data and it starts to release that data and, happily, all sorts of companies and app developers use it to help predict traffic and do other useful things. But then, one day, the vendor that provided the road repair tracking software suddenly discovers in the fine print of the contract that they, not the government, own that data! Well! All those companies, non-profits and app developers are suddenly using proprietary data, not (open) government data. And the vendor would be entirely within its rights to either sue them or demand a license fee in exchange for letting them continue to use the data.

Now, I understand why the government is doing this. It doesn’t want to be liable if such a mistake is made. But, of course, if the government doesn’t want to absorb the risk, that risk doesn’t magically disappear; it transfers to the data user. And the user has no way of managing that risk! Users don’t know what the contracts say and what the obligations are; the party best positioned to figure that out is the government. Essentially this line transfers a risk to the party (in this case the user) that is least able to manage it. You are left asking yourself: what business, charity or non-profit is going to invest hundreds of thousands of dollars (or more) and people’s time to build a product, service or analysis around an asset (government data) that it might suddenly discover it doesn’t have the right to use?

The government is the only organization that can clear the rights. If it is unwilling to do so, then I think we need to question whether this is actually open data.

The Ugly

But of course the really ugly part of the license (which caused me to go on a bit of a Twitter rant) comes early. Here it is:

If you do any of the above you must ensure that the following conditions are met:

  • your use of the Datasets causes no harm to others.

Wowzers.

This clause is so deeply problematic it is hard to know where to begin.

First, what is the definition of harm? If I use open data from the Ontario government to rate hospitals, and some hospitals are sub-standard, am I “harming” the hospital? Its workers? The community? The Ministry of Health?

So then who decides what the definition is? Well, since the Government of Ontario is the licensor of the data… it would seem to suggest that they do. Whatever the standing government wants to decree a “harm” suddenly becomes legitimate. Basically this clause could be used to strip many users – particularly those interested in using the data as a tool for accountability – of their right to use the data, simply because that use makes the licensor (e.g. the government) uncomfortable.

A brief history lesson here for the lawyers who inserted this clause. Back in March of 2011, when the federal government launched data.gc.ca, it had a similar clause in its license. It read as follows:

“You shall not use the data made available through the GC Open Data Portal in any way which, in the opinion of Canada, may bring disrepute to or prejudice the reputation of Canada.”

While the language is a little more blunt, its effect was the same. After the press conference launching the site, I sat down with Stockwell Day (who was the minister responsible at the time) for 45 minutes and walked him through the various problems with the license.

After our conversation, guess how long it took for that clause to be removed from the license? Three hours.

If this license is going to be taken seriously, that clause is going to have to go. Otherwise, it risks becoming a laughing stock and a case study of what not to do in open government workshops around the world.

(An aside: what was particularly nice was that Minister Day personally called my cell phone to let me know he’d removed the clause a few hours after our conversation. I’ve disagreed with Day on many, many, many things, but I was deeply impressed by his knowledge of the open data file and his commitment to its ideals. Certainly his ability to change the license represents one of the fastest changes to policy I’ve ever witnessed.)

The (Missed?) Opportunity

What is ultimately disappointing about the Ontario license, however, is that it was never needed. Why every jurisdiction feels the need to invent its own license is beyond me. What, beyond the softening of the attribution clause, has the Ontario license added to the open data world? Not much that I can see. And, as I’ve noted above, in many ways it is a step back.

You know what data users would really like? A common license. That would make it MUCH easier to use data from the federal government, the government of Ontario and the Toronto city government all at the same time, without worrying about compatibility issues and whether you are telling the end user the right thing or not. In this regard the addition of another license is a major step backwards. Yes, let me repeat that for other jurisdictions thinking about doing open data: the addition of another new license is a major step backwards.

Given that the federal government has proposed a new Open Government License that is virtually identical to this license but has less problematic language, why not simply use it? It would make the lives of the people this license is supposed to enable – the policy wonks, the innovators, the app developers, the data geeks – so much easier.

That opportunity still exists. The Government of Ontario could still elect to work with the feds on a common license. Indeed, given that the Ontario Open Data portal says they are asking for advice on how to improve the program, I implore – indeed beg – them to consider doing that. It would be wonderful if we could move to a single license in this country, and a partnership between the federal government and Ontario might give such an initiative real momentum and weight. If not, into the balkanized abyss of a thousand licenses we will stumble.


The UK’s Digital Government Strategy – Worth a Peek

I’ve got a piece up on TechPresident about the UK Government’s Digital Strategy, which was released today.

The strategy (and my piece!) are worth checking out. They are saying a lot of the right things – useful stuff for anyone in an industry or sector that has been conservative vis-a-vis online services (I’m looking at you, governments and banking).

As I note in the piece… there is reason we should expect better:

The second is that the report is relatively frank, as far as government reports go. The website that introduces the three reports is emblazoned with an enormous title: “Digital services so good that people prefer to use them.” It is a refreshing title that amounts to a confession I’d like to see from more governments: “sorry, we’ve been doing it wrong.” And the report isn’t shy about backing that statement up with facts. It notes that while the proportion of Internet users who shop online grew from 74 percent in 2005 to 86 percent in 2011, only 54 percent of UK adults have used a government service online. Many of those have only used one.

Of course, the real test will come with execution. The BC Government, the White House and others have written good reports on digital government, but rolling them out is the tricky part. The UK Government has pretty good cred as far as I’m concerned, but I’ll be watching.

You can read the piece here – hope you enjoy!