Category Archives: public policy

Ontario's Open Data Policy: The Good, The Bad, The Ugly and the (Missed?) Opportunity

Yesterday the province of Ontario launched its Open Data portal. This is great news and is the culmination of a lot of work by a number of good people. The real work behind getting an open data program launched is, by and large, invisible to the public, but it is essential – so congratulations are in order for those who helped out.

Clearly this open data portal is in its early stages – something the province is upfront about. As a result, I'm less concerned with the number of data sets on the site (though that number needs to, and should, grow over time). Hopefully the good people in the government of Ontario have some surprises for us around interesting data sets.

Nor am I concerned about the layout of the site (which needs to, and should, improve over time – for example, once you start browsing the data you end up on this URL with no obvious path back to the open data landing page, which makes navigating the site hard).

In fact, unlike some, I find any shortcomings in the website downright encouraging. Hopefully it means that speed, iteration and an attitude of shipping early have won out over the media-obsessed, rigid, risk-averse approach governments all too often take. Time will tell if my optimism is warranted.

What I do want to focus on is the license, since this is a core piece of infrastructure for an open data initiative. Indeed, it is the license that determines whether the data is actually open or closed. And I think we should be less forgiving of errors in this regard than in the past. It was one thing if you launched in the early days of open data, two or four years ago. But we aren't in the early days anymore. There are over 200 government open data portals around the world. We've crossed the chasm, people. Not getting the license right is not a "beta" mistake any more. It's just a mistake.

So what can we say about the Ontario Open Data license?

First, the Good

There are lots of good things to be said about it. It clearly keys off the UK's Open Government License, much as BC's license did and as the proposed Canadian Open Government License does. This means that, above all, it is written in plain English and is easily understood. In addition, the general format is familiar to many people interested in open data.

The other good thing about the license (pointed out to me by the always sharp Jason Birch) is that its attribution clause is softer than the UK, BC or even the proposed Federal Government license. Ontario uses the term "should" whereas the others use the term "must."

Sadly, this one improvement pales in comparison to some of the problems and, most importantly the potentially lost opportunity I urgently highlight at the bottom of this post.

The Bad

While this license does have many of the good qualities pioneered by the UK, it suffers from some major flaws. The most notable comes in this line:

Ontario does not guarantee the continued supply of the Datasets, updates or corrections or that they are or will be accurate, useful, complete, current or free and clear of any possible third party copyright, moral right, other intellectual property right or other claim.

Basically this line kills the possibility that any business, non-profit or charity will ever use this data in any serious way. Hobbyists, geeks and academics will of course use it, but this provision is deeply flawed.

Why?

Well, let me explain what it means. This says that the government cannot be held accountable to only release data it has the right to release. For example: say the government has software that tracks road repair data and it starts to release it and, happily, all sorts of companies and app developers use it to help predict traffic and do other useful things. But then, one day, the vendor that provided that road repair tracking software suddenly discovers in the fine print of the contract that they, not the government, own that data! Well! All those companies, non-profits and app developers are suddenly using proprietary data, not (open) government data. And the vendor would be entirely within its rights to either sue them, or demand a license fee in exchange for letting them continue to use the data.

Now, I understand why the government is doing this. It doesn't want to be liable if such a mistake is made. But, of course, if it doesn't want to absorb the risk, that risk doesn't magically disappear – it transfers to the data user. And those users have no way of managing that risk! They don't know what the contracts say or what the obligations are; the party best positioned to figure that out is the government. Essentially this line transfers a risk to the party (in this case the user) least able to manage it. You are left asking yourself: what business, charity or non-profit is going to invest hundreds of thousands of dollars (or more), and significant staff time, to build a product, service or analysis around an asset (government data) that it might suddenly discover it doesn't have the right to use?

The government is the only organization that can clear the rights. If it is unwilling to do so, then I think we need to question whether this is actually open data.

The Ugly

But of course the really ugly part of the license (which caused me to go on a bit of a twitter rant) comes early. Here it is:

If you do any of the above you must ensure that the following conditions are met:

  • your use of the Datasets causes no harm to others.

Wowzers.

This clause is so deeply problematic it is hard to know where to begin.

First, what is the definition of harm? If I use open data from the Ontario government to rate hospitals, and some hospitals are sub-standard, am I "harming" the hospital? Its workers? The community? The Ministry of Health?

So then who decides what the definition is? Well, since the Government of Ontario is the licensor of the data… it would seem that they do. Whatever the government of the day decrees to be a "harm" suddenly becomes legitimate. Basically this clause could be used to strip many users – particularly those interested in using the data as a tool for accountability – of their right to use the data, simply because it makes the licensor (e.g. the government) uncomfortable.

A brief history lesson here for the lawyers who inserted this clause. Back in March of 2011, when the Federal Government launched data.gc.ca, it had a similar clause in its license. It read as follows:

“You shall not use the data made available through the GC Open Data Portal in any way which, in the opinion of Canada, may bring disrepute to or prejudice the reputation of Canada.”

While the language is a little more blunt, its effect was the same. After the press conference launching the site I sat down with Stockwell Day (who was the Minister responsible at the time) for 45 minutes and walked him through the various problems with their license.

After our conversations, guess how long it took for that clause to be removed from the license? 3 hours.

If this license is going to be taken seriously, that clause is going to have to go; otherwise it risks becoming a laughing stock and a case study of what not to do in Open Government workshops around the world.

(An aside: What was particularly nice was that Minister Day personally called my cell phone a few hours after our conversation to let me know that he'd removed the clause. I've disagreed with Day on many, many, many things, but I was deeply impressed by his knowledge of the open data file and his commitment to its ideals. Certainly his ability to change the license represents one of the fastest changes to policy I've ever witnessed.)

The (Missed?) Opportunity

What is ultimately disappointing about the Ontario license, however, is that it was never needed. Why every jurisdiction feels the need to invent its own license is beyond me. What, beyond the softening of the attribution clause, has the Ontario license added to the open data world? Not much that I can see. And, as I've noted above, in many ways it is a step back.

You know what data users would really like? A common license. That would make it MUCH easier to use data from the federal government, the government of Ontario and the Toronto city government all at the same time, without worrying about compatibility issues or whether you are telling the end user the right thing. In this regard the addition of another license is a major step backwards. Yes, let me repeat that for other jurisdictions thinking about doing open data: the addition of another new license is a major step backwards.

Given that the Federal Government has proposed a new Open Government License that is virtually identical to this license but has less problematic language, why not simply use it? It would make the lives of the people this license is supposed to enable – the policy wonks, the innovators, the app developers, the data geeks – so much easier.

That opportunity still exists. The Government of Ontario could still elect to work with the Feds on a common license. Indeed, given that the Ontario Open Data portal says they are asking for advice on how to improve the program, I implore – indeed beg – them to consider doing that. It would be wonderful if we could move to a single license in this country, and a partnership between the Federal Government and Ontario might give such an initiative real momentum and weight. If not, into the balkanized abyss of a thousand licenses we will stumble.

 

Re-Architecting the City by Changing the Timelines and Making it Disappear

A couple of weeks ago I was asked by one of the cities near where I live to sit on an advisory board for the creation of its Digital Government strategy. For me the meeting was good, since I felt that a cohort of us on the advisory board were really pushing the city into a place of discomfort (something you want an advisory board to do, in certain ways). My sense is a big part of that friction came down to a subtle gap between the city staff and some of the participants over what a digital strategy should deal with.

Gord Ross (of Open Roads) – a friend and very smart guy – and I were debriefing afterwards about where and why the friction was arising.

We had been pushing the city hard on its need to iterate more and use data to drive decisions. This was echoed by some of the more internet-oriented members of the board. But at one point I got healthy push-back from one of the city staff. How, they asked, can I iterate when I've got 10-to-60-year timelines that I need to plan around? I simply cannot iterate when some of the investments I'm making are that long term.

Gord raised Stewart Brand's building layers as a metaphor, which I think sums up the differing views nicely.

Brand presents his basic argument in an early chapter, “Shearing Layers,” which argues that any building is actually a hierarchy of pieces, each of which inherently changes at different rates. In his business-consulting manner, he calls these the “Six S’s” (borrowed in part from British architect and historian F. Duffy’s “Four S’s” of capital investment in buildings).

The Site is eternal; the Structure is good for 30 to 300 years (“but few buildings make it past 60, for other reasons”); the Skin now changes every 15 to 20 years due to both weathering and fashion; the Services (wiring, plumbing, kitchen appliances, heating and cooling) change every seven to 15 years, perhaps faster in more technological settings; Space Planning, the interior partitioning and pedestrian flow, changes every two or three years in offices and lasts perhaps 30 years in the most stable homes; and the innermost layers of Stuff (furnishings) change continually.

My sense is the city staff are trying to figure out what the structure, skin and services layers should be for a digital plan, whereas a lot of us in the internet/tech world live occasionally in the services layer but mostly in the space planning and stuff layers, where the time horizons are WAY shorter. It's not that we have to think that way; it is just that we have become accustomed to thinking that way… doubly so since so much of what works on the internet isn't really "planned" – it is emergent. As a result, I found this metaphor useful for trying to understand how we can end up talking past one another.
It also goes to the heart of what I was trying to convey to the staff: that governments make a number of assumptions about what has had a 10- or 50-year lifecycle versus what that lifecycle could be in the future.

In other words, a digital strategy could allow some things to "phase change" from sitting in, say, the skin or services layer to operating on the faster timeline, lower capital cost and increased flexibility of the space planning layer. This could have big implications for how the city works. If you are buying software or hardware on the expectation that you will only have to do it every 15 years, your design parameters and expectations will be very different than if it is designed to last 5 years. It also has big implications for the systems you connect to or build around that software. If you accept that the software will constantly be changing, easy integration becomes a necessary feature. If you think you will have things for decades then, to a certain degree, stability and rigidity are a byproduct.

This is why, if the choice is between trying to better predict how to place a 30-year bet (e.g. architect something to sit in the skin or services layer) or place a 5-year bet (architect it to sit in the space planning or stuff layer), you should put as much of it in the latter as possible. If you re-read my post on the US government's Digital Government strategy, this is functionally what I think they are trying to do. By unbundling the data from the application they are pushing the data up to the services layer of the metaphor, while pushing the applications built upon it down to the space planning and stuff layers.

This is not to say that nothing should be long term, or that everything long term is bad. That is not what I mean to convey. Rather, by being strategic about what we place where, we can foster really effective platforms (services) that can last for decades (think data) while giving ourselves a lot more flexibility around what gets built on top of them (think applications, programs, etc…).
The Goal

The reason you want to do all this is that you actually want to give the city the flexibility to a) compete in a global marketplace and b) make itself invisible to its citizens. I hinted at this goal the other day at the end of my piece in TechPresident on the UK's digital government strategy.

On the competitive front, I suspect that across Asia and Africa about 200 cities, and maybe a lot more, are going to get brand new infrastructure over the coming 100 years. Heck, some of these cities are even being built from scratch. If you want your city to compete in that environment, you'd better be able to offer new and constantly improving services in order to keep up. If not, others may create efficiencies and discover improvements that give them structural advantages in the competition for talent and other resources.

But the other reason is that this kind of flexibility is, I think, critical to making (what Gord now has me referring to as the big "C" City) disappear. I like my government services best when they blend into my environment. If you live a privileged Western World existence… how often do you think about electricity? Only when you flick the switch and it doesn't work. That's how I suspect most people want government to work: seamless, reliable, designed into their lives, but not in the way of their lives. But more importantly, I want the "City" to be invisible so that it doesn't get in the way of my ability to enjoy, contribute to, and be part of the (lower case) city – the city that we all belong to. The "city" as that messy, idea-swapping, cosmopolitan, wealth- and energy-generating, problematic space that is the organism humans create wherever they gather in large numbers. I'd rather be writing this blog post on a WordPress installation that does a lot of things well but invisibly than monkeying around with scripts, plugins or some crazy server language I don't want to know. Likewise, the less time I spend on "the City," and the more seamlessly it works, the more time I can spend focused on "the city," doing the things that make life more interesting and hopefully better for myself and the world.

Sorry for the rambling post – I'm digesting a lot of thoughts. Hope there were some tasty pieces in that for you. Also, opaque blog post title, eh? Okay, bedtime now.

The UK's Digital Government Strategy – Worth a Peek

I’ve got a piece up on TechPresident about the UK Government’s Digital Strategy which was released today.

The strategy (and my piece!) are worth checking out. It says a lot of the right things – useful stuff for anyone in an industry or sector that has been conservative vis-a-vis online services (I'm looking at you, governments and banking).

As I note in the piece… there is reason we should expect better:

The second is that the report is relatively frank, as far as government reports go. The website that introduces the three reports is emblazoned with an enormous title: “Digital services so good that people prefer to use them.” It is a refreshing title that amounts to a confession I’d like to see from more governments: “sorry, we’ve been doing it wrong.” And the report isn’t shy about backing that statement up with facts. It notes that while the proportion of Internet users who shop online grew from 74 percent in 2005 to 86 percent in 2011, only 54 percent of UK adults have used a government service online. Many of those have only used one.

Of course the real test will come with execution. The BC Government, the White House and others have written good reports on digital government, but it is rolling it out that is the tricky part. The UK Government has pretty good cred as far as I’m concerned, but I’ll be watching.

You can read the piece here – hope you enjoy!

Doing Government Websites Right

Today, I have a piece over on Tech President about how the new UK government website – Gov.uk – does a lot of things right.

I'd love to see more governments invest in two of the key ingredients that made the website work – good design and better analytics.

Sadly, on the design front many politicians see design as a luxury and fail to understand that good design doesn't just make things look better – it makes websites (and other things) easier to use and so reduces other costs, like help desk costs. I can personally attest to this. Despite being adept at using the web, I almost always call the help desk for federal government services because I find federal government websites virtually unnavigable. Often these websites transform my personality from happy, affable guy into someone who teeters between grumpy and annoyed on the mild side and raving lunatic on the other as I fail to grasp what I'm supposed to do next.

If I have to choose between wasting 45 minutes on a website getting nowhere and calling a help line, waiting 45 minutes on hold while I do other work and then getting my problem resolved… I go with the latter. It's not a good use of anyone's time, but it is often the best option available.

On the analytics front, many governments simply lack the expertise to use something as simple as Google Analytics, or worse, are hamstrung by privacy and procurement rules that keep them from using services that would let them know how their users are (or are not) using their websites.

Aside from gov.uk, another great example of these two ingredients coming together is Honolulu Answers. Here a Code for America team worked with the city to see what pages (e.g. services) residents were actually visiting and then prioritized those. In addition, they worked with staff and citizens to construct answers to commonly asked questions. I suspect a simple website like this could generate real savings on the city's help desk costs – to say nothing of happier residents and tourists.

At some risk of pressing this point too heavily, I hope that my TechPresident piece (and other articles about gov.uk) gets widely read by public servants, managers and, of course, politicians (hint: the public wants easier access to services, not easier access to photos and press releases about you). I'm especially hoping the good people at the Treasury Board Secretariat in the Canadian Federal government read it, since the old Common Look and Feel standard sadly ensured that Canadian government websites are particularly terrible when it comes to usability.

The UK has shown how national governments can do better. Let’s hope others follow their lead.

 

Playing with Budget Cutbacks: On a Government 2.0 Response, Wikileaks & Analog Denial of Service Attacks

Reflecting on yesterday's case study in broken government, I had a couple of additional thoughts that were fun to explore but simply did not make sense to include in the original post.

A Government 2.0 Response

Yesterday's piece was all about how Treasury Board's new rules are likely to increase the velocity of paperwork at a cost far greater than the savings from eliminating excess travel.

One commenter noted a more Gov 2.0-type solution that I'd been mulling over myself. Why not simply treat the government travel problem as a big data problem? Surely there are tools that would allow you to look at government travel in aggregate, and maybe mash it up against GEDS data (job title and department information), to quickly identify outliers and other high-risk travel worthy of closer inspection. I'm not talking about people who travel a lot (that wouldn't be helpful) but rather people who engage in unusual travel that is hard to reconcile with their role.
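To make the idea concrete, here is a minimal sketch of what such an analysis might look like. Everything here is hypothetical – the record fields, the peer-comparison approach and the threshold are my illustration, not an actual government tool: compare each trip's cost against the spending of peers with the same job title and flag only the statistical outliers.

```python
# Hypothetical sketch: flag trips whose cost is unusual for the traveller's
# role, rather than applying a blanket approval rule to all travel.
from collections import defaultdict
from statistics import mean, stdev

def flag_unusual_travel(trips, threshold=2.0):
    """Return trips costing more than `threshold` standard deviations
    above the average for peers with the same job title (leave-one-out,
    so a trip is judged against everyone else's spending, not its own)."""
    by_role = defaultdict(list)
    for trip in trips:
        by_role[trip["job_title"]].append(trip)

    flagged = []
    for trip in trips:
        peers = [t["cost"] for t in by_role[trip["job_title"]] if t is not trip]
        if len(peers) < 3:  # too few peers to judge fairly
            continue
        mu, sigma = mean(peers), stdev(peers)
        if sigma > 0 and (trip["cost"] - mu) / sigma > threshold:
            flagged.append(trip)
    return flagged

# Toy data: four routine inspector trips and one that stands out.
trips = [
    {"who": "A", "job_title": "inspector", "cost": 900},
    {"who": "B", "job_title": "inspector", "cost": 1100},
    {"who": "C", "job_title": "inspector", "cost": 1000},
    {"who": "D", "job_title": "inspector", "cost": 950},
    {"who": "E", "job_title": "inspector", "cost": 9000},
]
print([t["who"] for t in flag_unusual_travel(trips)])  # → ['E']
```

A real version would need care around small teams, legitimate one-off travel and privacy, but the point stands: aggregate analysis targets scrutiny at the unusual cases instead of taxing every trip with paperwork.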

While I'm confident that many public servants would find such an approach discomforting, it would be entirely within the purview of their employer to engage in such an analysis. It would also be far more effective and targeted – and, I suspect, over time a better deterrent – than the kind of blanket policy I wrote about yesterday, which is just as (if not more) likely to eliminate necessary travel as unnecessary travel. Of course, if you just want to eliminate travel because you think any face-to-face, group or in-person learning is simply not worth the expense, then the blanket approach is probably more effective.

Wikileaks and Treasury Board

Re-reading yesterday's post, I had a faint twinge of familiarity. I suddenly realized that my analysis of the travel restriction policy's impact on government parallels the goal that drove Assange to create Wikileaks. If you've not read the Zunguzungu blog post exploring Assange's writings on the "theory of change" behind Wikileaks, I cannot encourage you enough to go read it. At its core lies a simple assessment: that Wikileaks is trying to shut down the "conspiracy of the state" by making it harder for effective information to be transmitted within the state. Of course, restricting travel is not nearly the same as making it impossible for public servants to communicate, but it does compromise the ability to coordinate and plan effectively – and so the essay is illuminating for thinking about how these types of policies impact not the hierarchy of an organization, but the hidden and open networks (the secret government) that help make the organization function.

Read this extract below for a taste:

This is however, not where Assange’s reasoning leads him. He decides, instead, that the most effective way to attack this kind of organization would be to make “leaks” a fundamental part of the conspiracy’s  information environment. Which is why the point is not that particular leaks are specifically effective. Wikileaks does not leak something like the “Collateral Murder” video as a way of putting an end to that particular military tactic; that would be to target a specific leg of the hydra even as it grows two more. Instead, the idea is that increasing the porousness of the conspiracy’s information system will impede its functioning, that the conspiracy will turn against itself in self-defense, clamping down on its own information flows in ways that will then impede its own cognitive function. You destroy the conspiracy, in other words, by making it so paranoid of itself that it can no longer conspire:

This is obviously a totally different context – but it is interesting to see that one way to alter an organization is to change the way information flows around it. This was not – I suspect – the primary goal of the Treasury Board directive (it was a cost-driven measure), but the above paragraph is an example of the unintended consequences. Less communication means the ability of the organization to function could be compromised.

Bureaucratic Directives as an Analog Denial of Service Attack

There is, of course, another more radical way of thinking about the Treasury Board directive. One of the key points I tried to make yesterday was that the directive is likely to increase the velocity of bureaucratic paperwork and tie up a larger amount of junior and, more precious still, senior staff time, all while actually allowing less work to get done.

Now if a government department were a computer, and I were able to make it send itself more requests that tied up its CPU (decision-making capacity) and thus made other functions harder to perform – and in extreme cases actually prevented any work from happening – that would be something pretty similar to a Denial of Service attack.

Again, I’m not claiming that this was the intent, but it is a fun and interesting lens by which to look at the problem. More to explore here, I’m sure.

Hopefully this has bent a few minds and helped people see the world differently.

Broken Government: A Case Study in Penny Wise but Pound Foolish Management

Often I write about the opportunities of government 2.0, but it is important for readers to be reminded of just how challenging the world of government 1.0 can be, and how far away any uplifting future can feel.

I've stumbled upon a horrifically wonderful example of how taxpayers are about to spend an absolutely ridiculous amount of money so that a ton of paper can be pushed around Ottawa to little or no effect. Ironically, it will all be in the name of savings and efficiency.

And, while you'll never see this reported in a newspaper, it's a perfect case study of the type of small decision that renders government (in this case the Canadian one) both less effective and more inefficient. Governments: take note.

First, the context. Treasury Board (the entity that oversees how money is spent across the Canadian government) recently put out a simple directive. It stipulates that all travel costs exceeding $25,000 must get Ministerial approval and that costs from $5,000 to $25,000 must get Deputy Head approval.

Here are the relevant bits of text, since no sane human should read the entire memo (infer what you wish about me from that):

2.5.1 Ministerial approval is required when total departmental costs associated with the event exceed $25,000.

and

2.5.5 Deputy head approval of an event is required when the event has the following characteristics:

Total departmental costs associated with the event exceed $5,000 but are less than $25,000; or

Total hospitality costs associated with the event exceed $1,500 but are less than $5,000; and

None of the elements listed in 2.5.2 a. to g. are present for which delegated authority has not been provided.
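To see how mechanical the rule is, the thresholds quoted above can be sketched as a few lines of code. This is my illustrative simplification – the function name is mine, and I've collapsed the 2.5.2 exceptions into the fall-through case:

```python
# A sketch of the directive's approval thresholds (sections 2.5.1 and 2.5.5),
# simplified for illustration; real events have more conditions attached.
def required_approval(event_cost, hospitality_cost=0):
    """Map an event's total departmental and hospitality costs to the
    approval level the memo appears to require."""
    if event_cost > 25_000:
        return "Minister"
    if event_cost > 5_000 or hospitality_cost > 1_500:
        return "Deputy Head"
    return "Delegated authority"

# Ten staff travelling at ~$600 each already triggers a Deputy Head sign-off.
print(required_approval(6_000))  # → Deputy Head
```

Even this toy version makes the problem visible: the thresholds key off the event's total cost, so any modestly sized team meeting escalates straight to the department's most senior people.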

This all sounds very prudent. Cut down on expenses! Make everyone justify travel! Right? Except the memo suggests (and, I'm told, is being interpreted to mean) that it should be applied to any event – including external conferences and even internal planning meetings.

To put this in further context for those who work in the private sector: if you worked for a large publicly traded company – say one with 5,000, 10,000 or even more employees – the Minister is basically the equivalent of the chairman of the board. And the Deputy Head? They are like the CEO.

Imagine creating a rule at a company like Ford that required an imaginary "safety engineering team" to get the chairman of the company to sign off on their travel expenses – months in advance – if, say, 10 of them needed to collectively spend $25,000 to meet in person or attend an important safety conference. It gets worse. If the team were smaller, say 3-5 people, and could keep the cost to $5,000, they would still need approval from the CEO. In such a world it would be hard to imagine new products being created or new creative cost-saving ideas getting hammered out. In fact, it would be hard for almost any distributed team to meet without creating a ton of paperwork. Over time, customers would begin to notice as work slowly ground to a halt.

This is why this isn’t making government more efficient. It is going to make it crazier.

It’s also going to make it much, much, more ineffective and inefficient.

For example, this new policy may cause a large number of employees to decide that getting approval for travel is too difficult, and they'll simply give up. Mission accomplished! Money saved! And yes, some of this travel was probably not essential. But much of it – likely a significant amount – was. Are we better governed? Are we safer? Is our government smarter, in a country where, say, inspectors, auditors, policy experts and other important decision makers (especially those in the regions) are no longer learning at conferences, participating in key processes or attending meetings about important projects because travel approval was too difficult to get? Likely not.

But there is a darker conclusion to draw as well. There is probably a significant amount of travel that remains absolutely critical. So now we are going to have public servants writing thousands of briefing notes every year seeking approval from directors, then revising them for approval by directors general (DGs), and again for assistant deputy ministers (ADMs), and again for deputy ministers (DMs), and again, possibly, for Ministerial approval.

That is a truly fantastic waste of the precious time of a lot of very, very senior people. To say nothing of the public servants writing, passing around, revising and generally pushing all these memos.

I'll go further. I have every confidence that for every one dollar in travel this policy deters from being requested, $500 worth of the time of directors, DGs, ADMs, DMs and other senior staff will have been wasted. Given that Canada is a place where the population – and thus a number of public servants – is thinly spread across a stretch more than 4,000 kilometers wide, I suspect there is a fair bit of travel that needs to take place. Using Access to Information requests you might even be able to ballpark how much time was wasted on these requests and memos.

Worse, I'm not even counting the opportunity cost. Rather than tackling the critical problems facing our country, senior people will be swatting away TPS-report-style travel budget requests. The only companies I know that run themselves this way are those that have filed for bankruptcy and are essentially not spending any money while they wait to be restructured or sold. They aren't companies trying to solve new problems, and they are certainly not companies trying to find creative or effective ways to save money.

In the end, this tells us a lot about the limits of hierarchical systems. Edicts are a blunt tool – they seldom (if ever) solve the root of a problem and more often simply cause new, bigger problems, since the underlying issues remain unresolved. There are also some wonderful analogies to Wikileaks and denial of service attacks that I'll save for tomorrow.

On Being Misquoted – Access Info Europe and Freedominfo.org

I’ve just been alerted to a new post out on Freedominfo.org that has quotes of mine used in a way that is deeply disappointing. It’s never fun to see your ideas misused to make it appear that you are against something you deeply support.

The most disappointing misquote comes from Helen Darbishire, a European FOI expert at Access Info Europe. Speaking about the convergence between open data and access to information laws (FOIA) she “lamented that comments like Eaves’ exacerbate divisions at a time when  “synergies” are developing at macro and micro levels.” The comment she is referring to is this one:

“I just think FOIA is broken; the wait time makes it broken….” David Eaves, a Canadian open government “evangelist,” told the October 2011 meeting of International information commissioners. He said “efforts to repair it are at the margins” and governments have little incentive for reform.

I’m not sure if Darbishire was present at the 7th International Conference of Information Commissioners where I made this comment in front of a room of mostly FOI experts, but the comment actually got a very warm reception. Specifically, I was talking about the wait times of access to information requests – not the idea of Access to Information itself. The fact is that for many people, waiting 4-30 weeks for a response from a government for a piece of information makes the process broken. In addition, I often see the conversation among FOIA experts focus on how to reduce that time by a week or a few days. But for most people, that will still leave them feeling like the system is too slow and so, in their mind, broken – particularly in a world where people are increasingly used to getting the information they want in about 0.3 seconds (the length of a Google search).

What I find particularly disappointing about Darbishire’s comments is that I’ve been advocating for the open data and access to information communities to talk more to one another – indeed, since long before any reference I can find of her calling for it. Back in April during the OGP meeting I wrote:

There remain important and interesting gaps, particularly between the more mature “Access to Information” community and the younger, still-coalescing “Gov2.0/OpenGov/Tech/Transparency” community. It often feels like members of the access to information community are dismissive of the technology aspects of the open government movement in general and the OGP in particular. This is disappointing, as technology is likely going to have a significant impact on the future of access to information. As more and more government work gets digitized, the way we access information is going to change, and the opportunities to architect for accessibility (or not) will become more important. These are important conversations, and finding a way to knit these two communities together could help advance everyone’s thinking.

And of course, rather than disparage Access to Information as a concept I frequently praise it, such as during this article about the challenges of convergence between open data and access to information:

Let me pause to stress, I don’t share the above to disparage FOI. Quite the opposite. It is a critical and important tool and I’m not advocating for its end. Nor am I arguing that open data can – in the short or even medium term – solve the problems raised above.

That said, I’m willing to point out the failures of both open data and access to information. But to then cherry-pick my comments about FOIA and paint me as someone who is being unhelpful strikes me as problematic.

I feel doubly that way since, not only have I advocated for efforts to bridge the communities, I’ve tried to make it happen. I was the one who suggested that Warren Krafchik – the civil society co-chair of the Open Government Partnership – be invited to the Open Knowledge Festival to help with a conversation around bringing the two communities together, and I reached out to him with the invitation.

If someone wants to label me as someone who is opinionated in the space, that’s okay – I do have opinions about what works and what doesn’t work and try to share them, sometimes in a constructive way, and sometimes – such as when on a panel – in a way that helps spur discussion. But to lay the charge of being divisive, when I’ve been trying for several years to bridge the conversation and bring the open data perspective into the FOIA community, feels unfair and problematic.

Lying with Maps: How Enbridge is Misleading the Public in its Ads

The Ottawa Citizen has a great story today about an advert by Enbridge (the company proposing to build an oil pipeline across British Columbia) that includes a “broadly representational” map showing prospective supertankers steaming up an unobstructed Douglas Channel on their way to and from Kitimat – the proposed terminus of the pipeline.

Of course there is a small problem with this map. The route to Kitimat by sea looks nothing like this.

Take a look at the Google Map view of the same area (I’ve pasted a screen shot below – and rotated the map so you are looking at it from the same “standing” location). Notice something missing from Enbridge’s maps?

Kitimate-Google2

According to the Ottawa Citizen’s story, an Enbridge spokesperson said their illustration was only meant to be “broadly representational.” Of course, all maps are “representational” – that is what a map is: a representation of reality that purposefully simplifies that reality so as to help the reader draw conclusions (like how to get from A to B). But such a representation can also be used to mislead the reader into drawing the wrong conclusion. In this case, some 1,000 square kilometres of islands that complicate the body of water were removed so the map could show oil tankers steaming relatively unimpeded up Douglas Channel from the ocean.

The folks over at Leadnow.ca have remade the Enbridge map as it should be:

EnbridgeV2

Rubbing out some – quite large – islands that make this passage much more complicated of course fits Enbridge’s narrative. The problem is that, given how much the company is suffering from the perception that it is not being fully upfront about its past record and the level of risk to the public, presenting a rosy-eyed view of the world is likely to diminish the public’s confidence in Enbridge, not increase its confidence in the project.

There is another lesson. This is a great example of how facts, data and visualization matter. They do. A lot. And we are, almost every day, being lied to through visual representations from sources we are told to trust. While I know that no one thinks of maps as open or public data, in many ways they are. And this is a powerful example of how, when data is open and available, it can enable people to challenge the narratives being presented to them, even when those offering them up are powerful companies backed by a national government.

If you are going to create a representation of something, you’d better think through what you are trying to present, and how others are going to see it. In Enbridge’s case this was either an effort at guile gone horribly wrong or a communications strategy hopelessly unaware of the context in which it is operating. Whoever you are, and whatever you are visualizing – don’t be like Enbridge – think through your data visualization before you unleash it into the wild.

How Government should interact with Developers, Data Geeks and Analysts

Below is a screen shot from the OpenDataBC Google Group from about two months ago. I meant to blog about this earlier but life got in the way. For me, this is a perfect example of how many people in the data/developer/policy world would probably like to interact with their local, regional or national government.

A few notes on this interaction:

  • I occasionally hear people try to claim that governments are not responsive to requests for data sets. Some aren’t. Some are. To be fair, this was not a request for the most controversial data set in the province. But it was a request. And it was responded to. So clearly there are some governments that are responsive. The question is figuring out which ones are, why they are, and seeing if we can export that capacity to other jurisdictions.
  • This interaction took place in a Google Group – so the whole context is social and norm-driven. I love that public officials in British Columbia, as well as with the City of Vancouver, check in every once in a while on Google Groups about open data, contributing to conversations and answering questions that citizens have about government, policies and open data. It’s a pretty responsive approach. Moreover, when people are not constructive it is the group that tends to moderate the behaviour, rather than some leviathan.
  • Yes, I’ve blacked out the email/name of the public servant. This is not because I think they’d mind being known, or because they shouldn’t be known, but because I just didn’t have a chance to ask for permission. What’s interesting is that this whole interaction is public, and the official was both doing what their government wanted and compliant with all social media rules. And yet I’m blacking it out, which is a sign of how messed up current rules and norms make citizens’ relationships with the public officials they interact with online – I’m worried about doing something wrong by telling others about a completely public action. (And to be clear, the province of BC has really good and progressive rules around these types of things.)
  • Yes, this is not the be-all and end-all of the world. But it’s a great example of a small thing being done right. It’s nice to be able to show that to other government officials.

Containers, Facebook, Baseball & the Dark Matter around Open Data (#IOGDC keynote)

Below is an extended blog post that summarizes the keynote address I gave at the World Bank/Data.gov International Open Government Data Conference in Washington DC on Wednesday, July 11th. This piece is cross-posted over at the WeGov blog on TechPresident, where I also write on transparency, technology and politics.

Yesterday, after spending the day at the International Open Government Data Conference at the World Bank (and co-hosted by data.gov) I left both upbeat and concerned. Upbeat because of the breadth of countries participating and the progress being made.

I was worried, however, because of the type of conversation we are having and how it might limit the growth of both our community and the impact open data could have. Indeed, as we talk about technology and how to do open data, we risk missing the real point of the whole exercise – which is use and impact.

To drive this point home I want to share three stories that highlight the challenges I believe we should be talking about.

Challenge 1: Scale Open Data

In 1956 Ideal-X, the ship pictured to the left, sailed from Newark to Houston and changed the world.

Confused? Let me explain.

As Marc Levinson chronicles in his excellent book The Box, the world in 1956 was very different from our world today. Global trade was relatively low. China was a long way off from becoming the world’s factory floor. And it was relatively unusual for people to buy goods made elsewhere. Indeed, as Levinson puts it, the cost of shipping goods was “so expensive that it did not pay to ship many things halfway across the country, much less halfway around the world.” I’m a child of the second era of globalization. I grew up in a world of global transport and shipping. The world before all of that, the one Levinson is describing, is actually foreign to me. What is amazing is how much of it has just become a basic assumption of life.

And this is why Ideal-X, the aforementioned ship, is so important. It was the first cargo container ship (in the sense in which we understand containers today). Its trip from Newark to Houston marked the beginning of a revolution, because containers slashed the cost of shipping goods. Before Ideal-X the cost of loading cargo onto a medium-sized cargo ship was $5.83 per ton; with containers, the cost dropped to 15.8 cents. Yes, the word you are looking for is: “wow.”

You have to understand that before containers, loading a ship was a lot more like packing a mini-van for a family vacation to the beach than the orderly process of stacking very large Lego blocks on a boat. Before containers, literally everything had to be hand-packed, stored and tied down in the hull. (See picture to the right.)

This is a little bit what our open data world looks like right now. The people who are consuming open data are like digital longshoremen. They have to look at each open data set differently, unpack it accordingly and figure out where to put it, how to treat it and what to do with it. Worse, when looking at data from across multiple jurisdictions it is often much like cargo going around the world before 1956: a very slow and painful process. (See man on the right.)

Of course, the real revolution in container shipping happened in 1966 when the size of containers was standardized. Within a few years containers could move from pretty much anywhere in the world from truck to train to boat and back again. In the following decades global shipping trade increased by 2.5 times the rate of economic output. In other words… it exploded.

Geek sidebar: For techies, think of shipping containers as the TCP/IP packet of globalization. TCP/IP standardized the packet of information that flowed over the network so that data could move from anywhere to anywhere. Interestingly, like containers, what was in the packet was not actually relevant and didn’t need to be known by the person transporting it. But the fact that it could move anywhere created scale and allowed for exponential growth.

What I’m trying to drive at is that, when it comes to open data, the number of data sets that gets published is no longer the critical metric. Nor is the number of open data portals. We’ve won. There are more and more. The marginal political and/or persuasive benefit of adding another open data portal or data set won’t change the context anymore. I want to be clear – this is not to say that more open data sets and more open data portals are not important or valuable – from a policy and programmatic perspective, more is much, much better. What I am saying is that having more isn’t going to shift the conversation about open data any more. This is especially true if data continues to require large amounts of work and time for people to unpack and understand, over and over again, across every portal.

In other words, what IS going to count is how many standardized open data sets get created. This is what we SHOULD be measuring. The General Transit Feed Specification revolutionized how people engaged with public transit because the standard made it so easy to build applications and do analysis around it. What we need to do is create similar standards for dozens, hundreds, thousands of other data sets so that we can drive new forms of use and engagement. More importantly, we need to figure out how to do this without relying on a standards process that takes 8 to 15 to infinite years to decide on said standard. That model is too slow to serve us, and so re-imagining/reinventing that process is where the innovation is going to shift next.
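To make the GTFS point concrete, here is a minimal sketch of why a common standard is so powerful: because every transit agency publishes, for example, a stops.txt file with the same column names, a single parser works against a feed from any city in the world. The feed data below is invented for illustration, but the column names are the ones the GTFS spec defines.

```python
import csv
import io

# A tiny in-memory sample in GTFS stops.txt format (the stop data is made up;
# the column names come from the GTFS specification).
stops_txt = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St at 1st Ave,49.2827,-123.1207
S2,Main St at 2nd Ave,49.2840,-123.1199
"""

def parse_stops(text):
    # One parser for every agency: no per-city "unpacking" needed,
    # which is exactly what containerization did for cargo.
    return [
        {"id": row["stop_id"],
         "name": row["stop_name"],
         "lat": float(row["stop_lat"]),
         "lon": float(row["stop_lon"])}
        for row in csv.DictReader(io.StringIO(text))
    ]

stops = parse_stops(stops_txt)
print(len(stops), stops[0]["name"])
```

The point is not the parsing itself but that this same dozen lines would work unchanged against feeds from Vancouver, Portland or Paris – that reuse is what a standard buys.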

So let’s stop counting the number of open data portals and data sets, and start counting the number of common standards – because that number is really low. More critically, if we want to experience the kind of explosive growth in use like that experienced by global trade and shipping after the rise of the standardized container then our biggest challenge is clear: We need to containerize open data.

Challenge 2: Learn from Facebook

One of the things I find most interesting about Facebook is that everyone I’ve talked to about it notes how the core technology that made it possible was not particularly new. It wasn’t that Zuckerberg leveraged some new code or invented a new, better coding language. Rather, he accomplished a brilliant social hack.

Part of this was luck: the public had come a long way and was much more willing to do social things online in 2004 than it had been even two years earlier with sites like Friendster. Or, more specifically, young people who’d grown up with internet access were willing to do things and imagine using online tools in ways those who had not grown up with those tools wouldn’t or couldn’t. Zuckerberg, and his users, had grown up digital and so could take the same tools everyone else had and do something others hadn’t imagined, because their assumptions were just totally different.

My point here is that, while it is still early, I’m hoping we’ll soon have the beginnings of a cohort of public servants who’ve “grown up data” – who, despite their short careers in the public service, have matured in a period where open data has been an assumption, not a novelty. My hope and suspicion is that this generation of public servants is going to think about open data very differently than many of us do. Most importantly, I’m hoping they’ll spur a discussion about how to use open data – not just to share information with the public – but to drive policy objectives. The canonical opportunity for me around this remains restaurant inspection data, but I know there are many, many more.

What I’m trying to say is that the conferences we organize have to talk less about how to get data open and start talking more about how we use data to drive public policy objectives. I’m hoping the next International Open Government Data Conference will have an increasing number of presentations on how citizens, non-profits and other outsiders are using open data to drive their agendas, and how public servants are using open data strategically to drive toward an outcome.

I think we have to start fostering that conversation by next year at the latest, and that this conversation about use has to become core to everything we talk about within two years, or we risk losing steam. This is why I think the containerization of open data is so important, and why I think the White House’s digital government strategy is so important, since it makes internal use core to the government’s open data strategy.

Challenge 3: The Culture and Innovation Challenge.

In May 2010 I gave this talk on Open Data, Baseball and Government at the Gov 2.0 Summit in Washington DC. It centered around the story outlined in the fantastic book Moneyball by Michael Lewis, which traces how a baseball team – the Oakland A’s – used a new analysis of player stats to ferret out undervalued players. This enabled them to win a large number of games on a relatively small payroll. Consider the numbers to the right.

I mean, if you are the owner of the Texas Rangers, you should be pissed! You are paying 250% in salary for 25% fewer wins than Oakland. If this were a government chart, where “wins” were potholes found and repaired, and “payroll” was costs… everyone at the World Bank would be freaking out right now.

For those curious, the analytical “hack” was recognizing that the most valuable thing a player can do on offense is get on base. This is because it gives them an opportunity to score (+), but it also means you don’t burn one of your three “outs” that would end the inning and the chance for other players to score. The problem was that, to measure the offensive power of a player, most teams were looking at hitting percentages (along with a lot of other weird, totally non-quantitative stuff), which ignore the possibility of getting walked – which lets you get on base without hitting the ball!
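The arithmetic behind the hack can be sketched in a few lines. The formulas below are the standard batting average and on-base percentage definitions; the two player stat lines are invented for illustration. Both players get identical hits in identical at-bats, but Player B draws far more walks – invisible to batting average, decisive for OBP.

```python
def batting_average(hits, at_bats):
    # Traditional metric: walks don't count, so it misses a big way of getting on base.
    return hits / at_bats

def on_base_percentage(hits, walks, hbp, at_bats, sf=0):
    # Standard OBP formula: counts every time a player reaches base
    # (hits, walks, hit-by-pitch) over plate appearances.
    return (hits + walks + hbp) / (at_bats + walks + hbp + sf)

# Hypothetical players: same hits and at-bats, very different walk totals.
player_a = {"hits": 150, "walks": 20, "hbp": 2, "at_bats": 550}
player_b = {"hits": 150, "walks": 90, "hbp": 2, "at_bats": 550}

# Identical by batting average...
print(round(batting_average(player_a["hits"], player_a["at_bats"]), 3))  # 0.273
print(round(batting_average(player_b["hits"], player_b["at_bats"]), 3))  # 0.273

# ...but Player B is far more valuable by OBP.
print(round(on_base_percentage(**player_a), 3))
print(round(on_base_percentage(**player_b), 3))
```

A team pricing players by batting average would pay these two the same; a team pricing by OBP could buy Player B at a discount – which is essentially the Oakland A’s story.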

What’s interesting, however, is that the original insight – that people were using the wrong metrics to assess baseball players – first surfaced decades before the Oakland A’s started to use it. Indeed, it was a nighttime security guard with a strong mathematics background and an obsession with baseball who first began pointing this stuff out.

The point I’m making is that it took 20 years for a manager in baseball to recognize that there was better evidence and data they could be using to make decisions. TWENTY YEARS. And that manager was hated by all the other managers, who believed he was ruining the game. Today this approach to assessing baseball is commonplace – everyone is doing it – but notice how the problem of using baseball’s “open data” to create better outcomes was never just an accessibility issue. Once that was resolved, the bigger challenge centered around culture and power. Those with the power had created a culture in which new ideas – ideas grounded in evidence but that were disruptive – couldn’t find an audience. Of course there were structural issues as well; many people had jobs that depended on not using the data, on relying instead on their “instincts.” But I think the cultural issue is a significant one.

So we can’t expect that we are going to go from open portal today to better decisions tomorrow. There is a good chance that some of the ideas the data causes us to think will be so radical and challenging that either the ideas, the people who champion them, or both, could get marginalized. On the upside, I feel like I’ve seen some evidence to the contrary in cities like New York and Chicago, but the risk is still there.

So what are we going to do to ensure that the culture of government is one that embraces challenges to our thinking and assumptions, so that it doesn’t take 20 years for us to make progress? This is a critical challenge for us – and it is much, much bigger than open data.

Conclusion: Focus on the Dark Matter

I’m deeply indebted to my friend – the brilliant Gordon Ross – who put me on to this idea the other day over tea.

Macguffin

Do you remember the briefcase in Pulp Fiction? The one that glowed when opened? That the characters were all excited about, but you never knew what was inside? It’s called a MacGuffin. I’m not talking about the briefcase per se; rather, I mean the object in a story that all the characters are obsessed with, but that you – the audience – never find out much about and that, frankly, really isn’t that important to you. In Pulp Fiction, I remember reading that the briefcase is allegedly Marsellus Wallace’s soul. But ultimately it doesn’t matter. What matters is that Vincent Vega, Jules Winnfield and a ton of other characters think it is important, and that drives the action and the plot forward.

Again – let me be clear – open data portals are our MacGuffin device. We seem to care A LOT about them. But trust me, what really matters is everything that happens around them. What makes open data important is not a data portal. A portal is a necessary prerequisite, but it’s not the end; it’s just the means. We’re here because we believe that the things open data can let us and others do matter. The open data portal was only ever a MacGuffin device – something that focused our attention and helped drive action so that we could do the other things – the dark matter that lies all around the MacGuffin device.

And that is what brings me back to our three challenges. Right now, the debate around open data risks becoming too much like a Pulp Fiction conference in which all the panels talk about the briefcase. Instead, we should be talking more and more about all the action – the dark matter – taking place around the briefcase. Because that is what really matters. For me, the three things that matter most are what I’ve mentioned in this talk:

  • standards – which will let us scale; I strongly believe the conversation is going to shift from portals to standards;
  • strategic use – starting us down the path of learning how open data can drive policy outcomes; and
  • culture and power – recognizing that open data is going to surface a lot of reasons why governments don’t want to engage in data-driven decision making.

In other words, I want to be talking about how open data can make the world a better place, not about how we do open data. That conversation still matters, open data portals still matter, but the path forward around them feels straightforward, and if they remain the focus we’ll be obsessing about the wrong thing.

So here’s what I’d like to see in the future from our open data conferences: we have got to stop talking about how to do open data. All of our efforts here, everything we are trying to accomplish – it has nothing to do with the data itself. What we want to be talking about is how open data can be a tool to make the world a better place. So let’s make sure that is the conversation we are having.