
The Importance of Open Data Critiques – thoughts and context

Over at the Programmable City website Rob Kitchin has a thoughtful blog post on open data critiques. It is very much worth reading and wider discussion. Two things in particular are worth noting. First, it is important for the open data community – and advocates in particular – to acknowledge the responsibility we have in debates about open data. Second, I’d like to examine some of the critiques raised and discuss those I think misfire and those that deserve deeper dives.

Open Data as Dominant Discourse

During my 2011 keynote at Open Government Data camp I talked about how the open data movement was at an inflection point:

For years we have been on the outside, yelling that open data matters. But now we are being invited inside.

Two years later the transition is more than complete. If you have any doubts, consider this picture:

[image: OD as DC]

Once you have these people talking about things like a G8 Open Data Charter you are no longer on the fringes. Not even remotely.

It also means understanding the challenges around open data has never been more important. We – open data advocates – are now complicit in what many of the above (mostly) men decide to do around open data. Hence the importance of Rob’s post. Previously those with power were dismissive of open data – you had to scream to get their attention. Today, those same actors want to act now and go far. Point them (or the institutions they represent) in the wrong direction and/or frame an issue incorrectly and you could have a serious problem on your hands. Consequently, the responsibility of advocates has never been greater. This is even more the case as open data has spread. Local variations matter. What works in Vancouver may not always be appropriate in Nairobi or London.

I shouldn’t have to say this but I will, because it matters so much: Read the critiques. They matter. They will make you better, smarter, and above all, more responsible.

The Four Critiques – a breakdown

Reading the critiques and agreeing with them is, of course, not the same thing. Rob cites four critiques of open data: funding and sustainability, politics of the benign and empowering the empowered, utility and usability, and neoliberalisation and marketisation of public services. Some of these I think miss the real concerns and risks around open data, others represent genuine concerns that everyone should have at the forefront of their thinking. Let me briefly touch on each one.

Funding and sustainability

This one strikes me as the least effective criticism. Outside the World Bank I’ve not heard of many examples where governments effectively sell their data to make money. I would be very interested in examples to the contrary – it would make for a great list and would enlighten the discussion – although not, I suspect, in ways that would make either side of the discussion happy.

The little research that has been done into this subject suggests that charging for government data almost never yields much money, and often actually serves as a loss-creating mechanism. Indeed, a 2001 KPMG study of Canadian geospatial data found governments almost never made money from data sales if purchases by other levels of government were not included. Again in Canada, Statistics Canada argued for years that it couldn’t “afford” to make its data open (free) as it needed the revenue. However, it turned out that the annual sum generated by these sales was around $2 million. This is hardly a major contributor to its bottom line. And of course, this does not count the money that had to go towards salaries and systems for tracking buyers and users, chasing down invoices, etc…

The disappointing line in the critique, however, was this:

de Vries et al. (2011) reported that the average apps developer made only $3,000 per year from apps sales, with 80 percent of paid Android apps being downloaded fewer than 100 times.  In addition, they noted that even successful apps, such as MyCityWay which had been downloaded 40 million times, were not yet generating profits.

Ugh. First, apps are not what is going to make open data interesting or sexy. I suspect they will make up maybe 5% of the ecosystem. The real value is going to be in analysis and enhancing other services. It may also be in the costs open data eliminates (and thus the capital and time it frees up), not in the companies it creates – something I outlined in Don’t Measure the Growth, Measure the Destruction.

Moreover, this is the internet. The average doesn’t mean anything. The average webpage probably gets 2 page views per day. That hardly means there aren’t lots of very successful webpages. The distribution is not a bell curve, it’s a long tail, so it is hard to see what the average tells us other than that the cost of experimentation is very, very low. It tells us very little about whether there are, or will be, successful uses of open data.
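To make the point concrete, here is a quick simulation of why the average is so uninformative for long-tail distributions. It is purely illustrative (the distribution and its parameters are made up), but it shows how a handful of hits drag the mean far away from the typical app’s experience.

```python
# Illustrative only: made-up parameters, just to show why averages mislead
# when the underlying distribution is a long tail rather than a bell curve.
import random

random.seed(1)
# Pretend each draw is the download count of one app.
downloads = sorted(int(random.paretovariate(1.2)) for _ in range(10_000))

mean = sum(downloads) / len(downloads)
median = downloads[len(downloads) // 2]
top_share = sum(downloads[-100:]) / sum(downloads)  # share held by the top 1% of "apps"

print(f"mean downloads:   {mean:,.1f}")
print(f"median downloads: {median}")
print(f"share of downloads held by the top 1% of apps: {top_share:.0%}")
```

The mean comes out several times larger than the median, and a tiny fraction of apps accounts for most of the downloads – which is exactly why an “average developer makes $3,000” statistic tells us so little.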

Politics of the benign and empowering the empowered

This is the most important critique and it needs to be engaged. There are definitely cases where data can serve to further marginalize at-risk communities. In addition, there are data sets that, for reasons of security and privacy, should not be made open. I’m not interested in publishing the locations of women’s shelters or, worse, the list of families taking refuge in them. Nor do I believe that open data will always serve to challenge the status quo or create greater equality. Even at its most reductionist – if one believes that information is power, then greater ability to access and make use of information makes one more powerful – this means that winners and losers will be created as new information becomes available.

There are, however, two things that give me some hope in this space. The first is that, when it comes to open data, the axis of competition among providers usually centers around accessibility. For example, the Socrata platform (a provider of open data portals to government) invests heavily in creating tools that make government data accessible and usable to the broadest possible audience. This is not a claim that all communities are being engaged (far from it) or that a great deal more work does not remain to be done, but there is a desire to show greater use, which drives some data providers to try to find ways to engage new communities.

The second is that if we want to create a data-literate society – and I think we do, for reasons of good citizenship, social justice and economic competitiveness – we need the data first for people to learn and play with. One of my most popular blog posts is Learning from Libraries: The Literacy Challenge of Open Data, in which I point out that one of the best ways to help people become data literate is to give them more interesting data to play with. My point is that we didn’t build libraries after everyone knew how to read; we built them beforehand with the goal of having them as a place that could facilitate learning and education. Of course libraries also often have strong teaching components to them, and we definitely need more of this. Figuring out whom to engage, and how it can be done most effectively, is something I’m deeply interested in.

There are also things that often depress me. I struggle to think of technologies that did not empower the empowered – at least initially. From the cell phone to the car to the printing press to open source software, all these inventions have helped billions of people, but they did not distribute themselves evenly, especially at first. So the question cannot be reduced to whether open data will empower the empowered, but to what degree, where, and with whom. I’ve seen plenty of evidence where data has enabled small groups of people to protect their communities or make more transparent the impact (or lack thereof) of a government regulation. Open data expands the number of people who can use government information for their own ends – this, I believe, is a good thing – but that does not mean we shouldn’t be constantly looking for ways to ensure that it does not reinforce structural inequity. Achieving perfect distribution of the benefits of a new technology, or even a public policy, is almost impossible. So we cannot make perfect the enemy of the good. However, that does not hide the fact that there are real risks – and responsibilities as advocates – that need to be considered here. This is an issue that will need to be constantly engaged.

Utility and Usability

Some of the issues around usability I’ve addressed above in the accessibility piece – for some portals (that genuinely want users) the axis of evolution is pointed in the right direction, with governments and companies (like Socrata) trying to embed more tools in their websites to make the data more usable.

I also agree with the central concern (not a critique) of this section, which is that rather than creating a virtuous circle, poorly thought out and launched open data portals will create a negative “doom loop” in which poor quality data begets little interest, which begets less data. However, the concern, in my mind, focuses on too narrow a problem.

One of the big reasons I’ve been an advocate of open data was a desire not just to help citizens, non-profits and companies gain access to information that could help them with their missions, but to change the way government deals with its data so that it can share it internally more effectively. I often cite a public servant I know who had a summer intern spend 3 weeks surfing the national statistical agency website to find data they knew existed but could not find because of terrible design and search. A poor open data site is not just a sign that the public can’t access or effectively use government data; it usually suggests that the government’s employees can’t access or effectively use their own data. This is often deeply frustrating to many public servants.

Thus, the most important outcome created by the open data movement may have been making governments realize that data represents an asset class of which they have had little understanding (outside, sadly, the intelligence sector, which has been all too aware of this) and little policy and governance (outside, say, the GIS space and some personal records categories). Getting governments to think about data as a platform (yes, I’m a fan of government as a platform for external use, but above all for internal use) is, in my mind, one way we can both enable public servants to get better access to information while simultaneously attacking the huge vendors (like SAP and Oracle) whose $100 million implementations often silo off data, rarely produce the results promised and are so obnoxiously expensive it boggles the mind (Clay Johnson has some wonderful examples of the roughly 50% of large IT projects that fail).

The key to all this is that open data can’t be something you slap on top of a big IT stack. I try to explain this in It’s the Icing Not the Cake, another popular blog post about why Washington DC was able to launch an effective open data program so quickly (one which was, apparently, so effective at bringing transparency to procurement data that the subsequent mayor rolled it back). The point is that governments need to start thinking in terms of platforms if – over the long term – open data is going to work. And government needs to start thinking of itself as the primary consumer of the data that is being served on that platform. Steve Yegge’s brilliant and sharp-witted rant on how Google doesn’t get platforms is an absolute must-read in this regard for any government official – the good news is you are not alone in not finding this easy. Google struggles with it as well.

My main point: let’s not play at the edges and merely define this challenge as one of usability. It is a much, much bigger problem than that. It is a big, deep, culture-changing BHAG problem that needs tackling. If we get it wrong, then the big government vendors and the inertia of bureaucracy win. If we get it right, we could save taxpayers millions while enabling a more nimble, effective and responsive government.

Neoliberalisation and Marketisation of Government

If you have not read Jo Bates’ article “Co-optation and contestation in the shaping of the UK’s Open Government Data Initiative” I highly recommend it. There are a number of arguments in the article I’m not sure I agree with (and feel are softened by her conclusion – so do read it all first). For example, the notion that open data has been co-opted into an “ideologically framed mould that champions the superiority of markets over social provision” strikes me as lacking nuance. One of the things open data can do is create a public recognition of a publicly held data set and the need to protect it against being privatized. Of course, what I suspect is that both things could be true simultaneously – there can be increased recognition of the importance of a public asset while also recognizing the increased social goods and market potential in leveraging said asset.

However, there is one thing Bates is absolutely correct about. Open data does not come into an empty playing field. It will be used by actors – on both the left and right – to advance their cause. So I too am uncomfortable with those who believe open data is going to somehow depoliticize government or politics – indeed I made a similar argument in a piece in Slate on the politics of data. As I try to point out there, you can only create a perverse, gerrymandered electoral district that looks like this…

[image: gerrymandered district in Chicago]

… if you’ve got pretty good demographic data about target communities you want to engage (or avoid). Data – and even open data – doesn’t magically make things better. There are instances where open data can, I believe, create positive outcomes by shifting incentives in appropriate ways… but similarly, it can help all sorts of actors find ways to satisfy their own goals, which may not be aligned with your – or even society at large’s – goals.

This makes voices like Bates’ deeply important, since they will challenge those of us interested in open data to be constantly evaluating the language we use, the coalitions we form and the priorities that get made, in ways that I think are profoundly important. Indeed, if you get to the end of Bates’ article there is a list of recommendations that I don’t think anyone I work with around open data would find objectionable; quite the opposite, they would agree they are completely critical.

Summary

I’m so grateful to Rob for posting this piece. It has helped me put into words some thoughts I’ve had, both about the open data criticisms and about the important role the critiques play. I try hard to be a critical advocate of open data – one who engages the risks and challenges posed by open data. I’m not perfect, and balancing these two goals – advocacy with a critical view – is not easy, but I hope this sheds some light on the ways I’m trying to balance it and possibly helps others do more of it as well.

Open Data Movement is a Joke?

Yesterday, Tom Slee wrote a blog post called “Why the ‘Open Data Movement’ is a Joke,” which – and I say this as a Canadian who understands the context in which Slee is writing – is filled with valid complaints about our government, but which I feel paints a flawed picture of the open data movement.

Evgeny Morozov tweeted about the post yesterday, thereby boosting its profile. I’m a fan of Evgeny. He is an exceedingly smart and critical thinker on the intersection of technology and politics. He is exactly what our conversation needs (unlike, say, Andrew Keen). I broadly felt his comments (posted via his Twitter stream) were both on target – we need to think critically about open data – and lacking nuance: it is possible for governments to simultaneously become more open and more closed on different axes. I write all this confident that Evgeny may turn his ample firepower on me, but such is life.

So, a few comments on Slee’s post:

First, the insinuation that the open data movement is irretrievably tainted by corporate interests is so over the top it is hard to know where to begin to respond. I’ve been advocating for open data for several years in Canada. Frankly, it would have been interesting and probably helpful if a large Canadian corporation (or even a medium sized one) took notice. Only now, maybe 4-5 years in, are they even beginning to pay attention. Most companies don’t even know what open data is.

Indeed, the examples of corporate open data “sponsors” that Slee cites are U.S. corporations, sponsoring U.S. events (the Strata conference) and nonprofits (Code for America – with which I have been engaged). Since Slee is concerned primarily with the Canadian context, I’d be interested to hear his thoughts on how these examples compare to Canadian corporate involvement in open data initiatives – or even foreign corporations’ involvement in Canadian open data.

And not to travel too far down the garden path on this, but it’s worth noting that the corporations that have jumped on the open data bandwagon in the US often have two things in common. First, their founders are bona fide geeks, who in my experience are both interested in hard data as an end unto itself (they’re all about numbers and algorithms) and want to see government-citizen interactions – and internal governmental interactions, too – made better and more efficient. Second, of course they are looking after their corporate interests, but they know they are not at the forefront of the open data movement itself. Their sponsorship of various open data projects may well have profit as one motive, but they are also deeply interested in keeping abreast of developments in what looks to be a genuine Next Big Thing. For a post that Evgeny sees as being critical of open data, I find all this deeply uncritical. Slee’s post reads as if anything that is touched by a corporation is tainted. I believe there are both opportunities and risks. Let’s discuss them.

So, who has been advocating for open data in Canada? Who, in other words, comprises the “open data movement” that Slee argues doesn’t really exist – and that “is a phrase dragged out by media-oriented personalities to cloak a private-sector initiative in the mantle of progressive politics”? If you attend one of the hundreds of hackathons that have taken place across Canada over the past couple of years – like those that have happened in Vancouver, Regina, Victoria, Montreal and elsewhere – you’ll find they are generally organized in hackspaces and by techies interested in ways to improve their community. In Ottawa, which I think does the best job, they can attract hundreds of people, many of whom bring spouses and kids as they work on projects they think will be helpful to their community. While some of these developers hope to start businesses, many others try to tackle issues of public good, and/or try to engage non-profits to see if there is a way they can channel their talent and the data. I don’t for a second pretend that these participants are a representative cross-section of Canadians, but by and large the profile has been geeky, technically inclined, leaning left, and socially minded. There are many who don’t fit that profile, but that is probably the average.

Second, I completely agree that this government has been one of the most – if not the most – closed and controlling in Canada’s history. I, like many Canadians, echo Slee’s frustration. What’s worse is that I don’t see things getting better. Canadian governments have been getting more centralized and controlling since at least Trudeau, and possibly earlier (indeed, I believe polling and television have played a critical role in driving this trend). Yes, the government is co-opting the language of open data in an effort to appear more open. All governments co-opt language to appear virtuous. Be it on the environment, social issues or… openness, no government is perfect and indeed, most are driven by multiple, contradictory goals.

As a member of the Federal Government’s Open Government Advisory Panel I wrestle with this challenge constantly. I try hard to embed some openness into the DNA of government. I may fail. I know that I won’t succeed in all ways, but hopefully I can move the rock in the right direction a little bit. It’s not perfect, but then it’s pretty rare that anything involving government is. In my (unpaid, advisory, non-binding) role I’ve voiced that the government should provide the Access to Information Commissioner with a larger budget (they cut it) and that they enable government scientists to speak freely (they have not so far). I’ve also advocated that they should provide more open data. There they have, including some data sets that I think are important – such as aid data (which is always at risk of being badly spent). For some, it isn’t enough. I’d like for there to be more open data sets available, and I appreciate those (like Slee – who I believe is writing from a place of genuine care and concern) who are critical of these efforts.

But, to be clear, I would never claim that open government data is tantamount to solving the problems of a restrictive or closed government (and have argued as much here). Just as an authoritarian regime can run on open-source software, so too might it engage in open data. Open data is not the solution for Open Government (I don’t believe there is a single solution, or that Open Government is an achievable state of being – just a goal to pursue consistently), and I don’t believe anyone has made the case that it is. I know I haven’t. But I do believe open data can help. Like many others, I believe access to government information can lead to better informed public policy debates and hopefully some improved services for citizens (such as access to transit information). I’m not deluded into thinking that open data is going to provide a steady stream of obvious “gotcha moments” where government malfeasance is discovered, but I am hopeful that government data can arm citizens with the information that the government is using to inform its decisions so that they can better challenge, and ultimately help hold accountable, said government.

Here is where I think Evgeny’s comments on the problem with the discourse around “open” are valid. Open Government and Open Data should not be used interchangeably. And this is an issue Open Government and Open Data advocates wrestle with. Indeed, I’ve seen a great deal of discussion and reflection come as a result of papers such as this one.

Third, the arguments around StatsCan all feel deeply problematic. I say this as the person who wrote the first article (that I’m aware of) about the long form census debacle in a major media publication and who has been consistently and continuously critical of it. This government had a dislike for Statistics Canada (and evidence) long before open data was in its vocabulary, to say nothing of being a policy interest. StatsCan was going to be a victim of dramatic cuts regardless of Canada’s open data policy – so it is misleading to claim that one would “much rather have a fully-staffed StatsCan charging for data than a half-staffed StatsCan providing it for free.” (That quote comes from Slee’s follow-up post, here.) That was never the choice on offer. Indeed, even if it had been, it wouldn’t have mattered. The total cost of making StatsCan data open is said to have been $2 million; this is a tiny fraction of the payroll costs of the 2,500 people they are looking to lay off.

I’d actually go further than Slee here, and repeat something I say all the time: data is political. There are those who, naively, believed that making data open would depoliticize policy development. I hope there are situations where this might be true, but I’ve never taken that for granted or assumed as much: quite the opposite. In a world where data increasingly matters, it is increasingly going to become political. Very political. I’ve been saying this to the open data community for several years; indeed it was a warning I made in the closing part of my keynote at the Open Government Data Camp in 2010. All this has, in my mind, little to do with open data. If anything, having data made open might increase the number of people who are aware of what is, and is not, being collected and used to inform public policy debates. Indeed, if StatsCan had made its data open years ago it might have had a larger constituency to fight on its behalf.

Finally, I agree with the Nat Torkington quote in the blog post:

Obama and his staff, coming from the investment mindset, are building a Gov 2.0 infrastructure that creates a space for economic opportunity, informed citizens, and wider involvement in decision making so the government better reflects the community’s will. Cameron and his staff, coming from a cost mindset, are building a Gov 2.0 infrastructure that suggests it will be more about turning government-provided services over to the private sector.

Moreover, it is possible for a policy to have two different drivers. It can even have multiple contradictory drivers simultaneously. In Canada, my assessment is that the government doesn’t have this level of sophistication around its thinking on this file, a conclusion I more or less wrote when assessing their Open Government Partnership commitments. I have no doubt that the Conservatives would like to turn government-provided services over to the private sector, but open data has so far not been part of that strategy. Either way, there is, in my mind, a policy infrastructure that needs to be in place to pursue these goals (such as having a data governance structure in place). But from a more narrow open data perspective, my own feeling is that making the data open has benefits for public policy discourse, public engagement, and the economy. Indeed, making more government data available may enable citizens to fight back against policies they feel are unacceptable. You may not agree with all the goals of the Canadian government – as someone who has written at least 30 opeds in various papers outlining problems with various government policies, neither do I – but I see the benefits of open data as real and worth pursuing, so I advocate for it as best I can.

So in response to the opening arguments about the open data movement…

It’s not a movement, at least in any reasonable political or cultural sense of the word.

We will have to agree to disagree. My experience is quite the opposite. It is a movement. One filled with naive people, with skeptics, with idealists focused on accountability, developers hoping to create apps, conservatives who want to make government smaller and progressives who want to make it more responsive and smarter. There was little in the post that persuaded me there wasn’t a movement. What I did hear is that the author didn’t like some parts of the movement and its goals. Great! Please come join the discussion; we’d love to have you.

It’s doing nothing for transparency and accountability in government,

To say it is doing nothing for transparency seems problematic. I need only cite one data set now open to say that isn’t true. And certainly publication of aid data, procurement data, voting records and the Hansard are examples of places where it may be making government more transparent and accountable. What I think Slee is claiming is that open data isn’t transforming the government into a model of transparency and accountability, and he’s right. It isn’t. I don’t think anyone claimed it would. Nor do I think the public has been persuaded that because it does open data, the government is somehow open and transparent. These are not words the Canadian public associates with this government no matter what it does on this file.

It’s co-opting the language of progressive change in pursuit of what turns out to be a small-government-focused subsidy for industry.

There are a number of sensible, critical questions in Slee’s blog post. But this is a ridiculous charge. Prior to the data being open, you had an asset that was paid for by taxpayer dollars, then charged for at a premium that created a barrier to access. Of course, this barrier was easiest to surmount for large companies and wealthy individuals. If there was a subsidy for industry, it was under the previous model, as it effectively had the most regressive tax for access of any government service.

Indeed, probably the biggest beneficiaries of open data so far have been Canada’s municipalities, which have been able to gain access to much more data than they previously could, and have saved a significant amount of money (Canadian municipalities are chronically underfunded). And of course, when looking at the most downloaded data sets from the site, it would appear that non-profits and citizens are making good use of them. For example, the 6th most downloaded was the Anthropogenic disturbance footprint within boreal caribou ranges across Canada, used by many environmental groups; number 8 was weather data; 9th was Sales of fuel used for road motor vehicles, by province and territory, used most frequently to calculate greenhouse gas emissions; and 10th the Government of Canada Core Subject Thesaurus – used, I suspect, to decode the machinery of government. Most of the other top downloaded data sets related to immigration, used, it appears, to help applicants. It is hard to see the hand of big business in all this, although if open data helped Canada’s private sector become more efficient and productive, I would hardly complain.

If you’re still with me, thank you – I know that was a long slog.

Algorithmic Regulation Spreading Across Government?

I was very, very excited to learn that the City of Vancouver is exploring implementing a program started in San Francisco in which “smart” parking meters adjust their price to reflect supply and demand (story is here in the Vancouver Sun).

For those unfamiliar with the program, here is a breakdown. In San Francisco, the city has the goal of ensuring at least one free parking spot is available on every block in the downtown core. As I learned during San Francisco’s presentation at the Code for America summit, such a goal has several important consequences. Specifically, it reduces the likelihood of people double parking, and it reduces smog and greenhouse gas emissions because people don’t troll for parking as long; and because trolling time is reduced, people searching for parking don’t slow down other traffic and buses by driving around slowly looking for a spot. In short, it has a very helpful impact on traffic more broadly.

So how does it work? The city’s smart parking meters are networked together and constantly assess how many spots on a given block are free. If, at the end of the week, it turns out that all the spaces are frequently in use, the cost of parking on that block is increased by 25 cents. Conversely, if many of the spots were frequently free, the price is reduced by 25 cents. Generally, each block finds an equilibrium point where the cost meets the demand but is also able to adjust in reaction to changing trends.
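For those who like to see the mechanics, here is a minimal sketch of the block-by-block adjustment rule described above. The 25-cent step comes from the description; the occupancy thresholds and price bounds are assumptions for illustration, and the real SFpark rules may differ.

```python
# Illustrative sketch of the weekly, block-by-block price adjustment described above.
# The 25-cent step is from the post; the thresholds and price bounds are assumptions.

def adjust_block_price(current_price, occupancy_rate,
                       high_occupancy=0.80, low_occupancy=0.30,
                       step=0.25, min_price=0.25, max_price=6.00):
    """Return next period's hourly rate for one block of meters.

    occupancy_rate is the fraction of time the block's spots were in use
    over the review period (0.0 to 1.0).
    """
    if occupancy_rate >= high_occupancy:    # spots almost always full: raise the price
        current_price += step
    elif occupancy_rate <= low_occupancy:   # many spots sitting empty: lower the price
        current_price -= step
    # otherwise the block is near its equilibrium point; leave the price alone
    return round(min(max(current_price, min_price), max_price), 2)


# A busy block creeps up; a quiet one drifts down.
print(adjust_block_price(2.00, 0.92))  # 2.25
print(adjust_block_price(2.00, 0.15))  # 1.75
```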

Technologist Tim O’Reilly has referred to these types of automated systems in the government context as “algorithmic regulation” – a phrase I think could become more popular over the coming decade. As software is deployed into more and more systems, the algorithms will be creating marketplaces and resource allocation systems – in effect regulating us. A little over a year ago I said that, contrary to what many open data advocates believe, open data will make data political – that is, open data isn’t going to depoliticize public policy and make it purely evidence-based; quite the opposite, it will make the choices around what data we collect more contested (Canadians, think long form census). The same is also – and already – true of the algorithms, the code, that will increasingly regulate our lives. Code is political.

Personally I think the smart parking meter plan is exciting and hope the city will consider it seriously. But be prepared: I’m confident that, much like with smart electrical meters, an army of naysayers will emerge who simply don’t want a public resource (roads and parking spaces) to be efficiently used.

It’s like the Spirit of the West said: Everything is so political.

The New Government of Canada Open Data License: The OGL by another name

Last week Minister Clement issued a press release announcing some of the progress the government has made on its Open Government Initiatives. Three things caught my eye.

First, it appears the government continues to revise its open data license with things continuing to trend in the right direction.

As some of you will remember, when the government first launched data.gc.ca it had a license that was so onerous that it was laughable. While several provisions were problematic, my favourite was the sweeping “only-make-us-look-good” clause which said, word for word: “You shall not use the data made available through the GC Open Data Portal in any way which, in the opinion of Canada, may bring disrepute to or prejudice the reputation of Canada.”

After I pointed out the problems with this clause to then Minister Day, he managed to have it revoked within hours – very much to his credit. But it is a good reminder of the starting point of the government’s license and of the mindset of Government of Canada lawyers.

With the new license, almost all the clauses that would obstruct commercial and non-profit reuse have effectively been eliminated. It is no longer problematic to identify individual companies and the attribution clauses have been rendered slightly easier. Indeed, I would argue that the new license has virtually the same constraints as the UK Open Government License (OGL) and even the Creative Commons CC-BY license.

All this raises the question… why not simply use the language and structure of the OGL, in much the same manner that the British Columbia Government tried to with its own BC OGL? Such a standardized license across jurisdictions might be helpful; it would certainly simplify life for think tanks, academics, developers and other users of the data. This is something I’m pushing for and hope that we might see progress on.

Second, the idea that the government is going to post completed access to information (ATIP) requests online is also a move in the right direction. I suspect that the most common ATIP request is one that someone else has already made. Being able to search through previous requests would enable you to find what you are looking for without having to wait weeks or make public servants redo the entire search and clearing process. What I don’t understand is why only post the summaries? In a digital world it would be better for citizens, and cheaper for the government, to simply post the entire completed request whenever privacy policies wouldn’t prevent it.

Third, and perhaps most important, were the lines noting that “That number (of data sets) will continue to grow as the project expands and more federal departments and agencies come onboard. During this pilot project, the Government will also continue to monitor and consider national and international best practices, as well as user feedback, in the licensing of federal open data.”

This means that we should expect more data to hit the site. It seems as though more departments are being asked to figure out what data they can share – hopefully this means that real, interesting data sets will be made public. In particular, one hopes that data sets that legislation mandates the government collect will be high on the list of priorities. Also interesting in this statement is the suggestion that the government will consider national and international best practices. I’ve talked to both the Minister and officials about the need to create common standards and structures for open data across jurisdictions. Fostering and pushing these is an area where the government could take a leadership role, and it looks like there may be interest in this.

 

International Open Data Hackathon Updates and Apps

With the International Open Data Hackathon getting closer, I’m getting excited. There’s been a real expansion on the wiki of the number of cities where people are sometimes humbly, sometimes grandly, putting together events. I’m seeing Nairobi, Dublin, Sydney, Warsaw and Madrid as some of the cities with newly added information. Exciting!

I’ve been thinking more and more about applications people can hack on that I think would be fun, engage a broad number of people and that would help foster a community around viable, self-sustaining projects.

I’m, of course, all in favour of people working on whatever piques their interest, but here are a few projects I’m encouraging people to look at:

1. Openspending.org

What I really like about openspending.org is that there are lots of ways non-coders can contribute. Specifically, finding, scraping and categorizing budget data, which (sadly) is often very messy, are things almost anyone with a laptop can do, and they are essential to getting this project off the ground (a minimal sketch of this kind of clean-up follows below). In addition, the reward for this project can be significant: a nice visualization of whatever budget you have data for – a perfect tool for helping people better understand where their money (or taxes) go. Another big factor in its favour… openspending.org – a project of the Open Knowledge Foundation, who’ve been big supporters and sponsors of the international open data hackathon – is also perfect because, if all goes well, it is the type of project that a group can complete in one day.

So I hope that some people try playing with the website using their own local data. It would be wonderful to see the openspending.org community grow.
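To give a sense of the unglamorous but essential work involved, here is a minimal sketch of tidying a scraped budget table into a flat CSV of the kind openspending.org can ingest. The input file and its column names are hypothetical examples, and the exact target columns depend on how you model the dataset on the site; every government publishes its budget a little differently, which is exactly why this step needs volunteers.

```python
# Minimal, hypothetical sketch: tidy a scraped budget table into a flat CSV
# for upload to openspending.org. "scraped_budget.csv" and its columns
# ("Department", "Programme", "2011 Budget") are invented examples.
import csv

def clean_amount(raw):
    """Turn strings like ' $1,234,567 ' or '(2,000)' into floats (parentheses mean negative)."""
    raw = raw.strip().replace("$", "").replace(",", "")
    if raw.startswith("(") and raw.endswith(")"):
        return -float(raw[1:-1])
    return float(raw) if raw else 0.0

with open("scraped_budget.csv", newline="", encoding="utf-8") as src, \
     open("openspending_ready.csv", "w", newline="", encoding="utf-8") as dst:
    writer = csv.writer(dst)
    writer.writerow(["date", "from", "to", "amount"])  # simple spend-style columns
    for row in csv.DictReader(src):
        writer.writerow([
            "2011-01-01",                                 # one budget year
            "City of Example",                            # the spender
            f'{row["Department"]} / {row["Programme"]}',  # the recipient/category
            clean_amount(row["2011 Budget"]),
        ])
```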

2. Adopt a Hydrant

Some of you have already seen me blog about this app – a project that comes out of Code for America. If you know of a government agency, or non-profit, that has lat/long information for a resource that it wants people to help take care of… then Adopt a Hydrant could be for you. Essentially Adopt a Hydrant – which can be changed to adopt an anything – allows people to sign up and “adopt” whatever the application tracks. Could be trees, hydrants, playgrounds… you name it.

Some of you may be wondering… why adopt a hydrant? Well, because in colder places, like Boston, MA, Adopt a Hydrant was created in the hope that citizens would adopt a hydrant and so agree that when it snows they will keep it clear of snow. That way, in case there is a fire, the emergency responders don’t end up wasting valuable minutes locating, and then digging out, the hydrant. Cool eh?

I think Adopt a Hydrant has the potential to become a significant open source project, one widely used by cities and non-profits. It would be great to see some people turned on to it!

3. Mapit

What I love about MapIt is that it is the kind of application that can help foster other open data applications. Created by the wonderful people over at Mysociety.org, this open source software essentially serves as a mapping layer so that you can find out what jurisdictions a given address, postal code or GPS device currently sits in (e.g. what riding, ward, city, province, county, state, etc… am I in?). This is insanely useful for lots of developers trying to build websites and apps that tell their users useful information about a given address or where they are standing. Indeed, I’m told that most of Mysociety.org’s projects use their instance of MapIt to function.
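To make that concrete, here is a minimal sketch of a point lookup against MapIt. It assumes the public mapit.mysociety.org instance and its point-lookup endpoint; a self-hosted instance for another country would expose the same style of URL, and the coordinates below are just an example.

```python
# Minimal sketch: ask a MapIt instance which administrative areas contain a point.
# Assumes the public mapit.mysociety.org instance; the /point/<srid>/<lon>,<lat>
# lookup is the relevant endpoint. Note: longitude comes before latitude.
import json
import urllib.request

def areas_for_point(lon, lat, base_url="https://mapit.mysociety.org"):
    url = f"{base_url}/point/4326/{lon},{lat}"
    with urllib.request.urlopen(url) as response:
        areas = json.load(response)  # {area_id: {"name": ..., "type_name": ...}, ...}
    return [(area.get("type_name"), area.get("name")) for area in areas.values()]

# Example: a point in central London.
for type_name, name in areas_for_point(-0.1276, 51.5072):
    print(f"{type_name}: {name}")
```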

This project is for those seeking a more ambitious challenge, but I love the idea that this service might exist in multiple countries and that a community might emerge around another one of mysociety.org’s projects.

No matter what you intend to work on, drop me a line! Post it to the open data day mailing list and let me know about it. I’d love to share it with the world.

Weaving Foreign Ministries into the Digital Era: Three ideas

Last week I was in Ottawa giving a talk at the Department of Foreign Affairs about how technology, new media and open innovation will impact the department’s work internally, across Ottawa and around the world.

While there is lots to share, here are three ideas I’ve been stewing on:

Keep more citizens safe when abroad – better danger zone notification

Some people believe that open data isn’t relevant to departments like Foreign Affairs or the State Department. Nothing could be further from the truth.

One challenge the department has is getting Canadians to register with it when they visit or live in a country its travel reports (sample here) label as problematic for travel. As you might suspect, few Canadians register with the embassy, as they are likely not aware of the program, or they travel a lot and simply don’t get around to it.

There are other ways of tackling this problem that might yield broader participation.

Why not turn the Travel Report system into open data with an API? I’d tackle this by approaching a company like TripIt. Every time I book an airplane ticket or a hotel I simply forward TripIt the reservation, which they scan and turn into events that then automatically appear in my calendar. Since they scan my travel plans they also know which country, city and hotel I’m staying in… they also know where I live and could easily ask me for my citizenship. Working with companies like TripIt (or Travelocity, Expedia, etc…), DFAIT could co-design an API into the department’s travel report data that would be useful to them. Specifically, I could imagine that if TripIt could query all my trips against those reports, then any time they noticed I was traveling somewhere the Foreign Ministry has labelled “exercise a high degree of caution” or worse, TripIt could ask me if I’d be willing to let them forward my itinerary to the department. That way I could register my travel automatically, making the service more convenient for me, and getting the department more of the information that it believes to be critical as well.
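To make the idea concrete, here is a sketch of how a booking service might use such an API. Everything here is hypothetical: DFAIT does not publish an advisory API today, so the endpoint, field names and advisory levels are invented purely to illustrate the co-designed service described above.

```python
# Hypothetical sketch only: DFAIT does not currently offer this API.
# It illustrates how a service like TripIt might check an itinerary against a
# machine-readable version of the travel reports. The endpoint, field names
# and advisory levels are all invented for illustration.
import json
import urllib.request

ADVISORY_API = "https://api.example.gc.ca/travel-reports/"  # hypothetical endpoint

RISKY_LEVELS = {"high-degree-of-caution", "avoid-non-essential-travel", "avoid-all-travel"}

def advisory_for(country_code):
    """Fetch the (hypothetical) advisory record for one country."""
    with urllib.request.urlopen(ADVISORY_API + country_code) as response:
        return json.load(response)  # e.g. {"country": "XY", "level": "high-degree-of-caution"}

def stops_needing_registration(itinerary):
    """Return the itinerary stops where the traveller should be offered one-click registration."""
    return [stop for stop in itinerary
            if advisory_for(stop["country_code"])["level"] in RISKY_LEVELS]
```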

Of course, it might be wise to work with the State Department so that their travel advisories use a similarly structured API (since I assume TripIt will be more interested in the larger US market than the Canadian market). But facilitating that conversation would be nothing but wins for the department.

More bang for buck in election monitoring

One question that arose during my talk came from an official interested in election monitoring. In my mind, one thing the department should be considering is a fund to help local democracy groups spin up installations of Ushahidi in countries with fragile democracies that are gearing up for elections. For those unfamiliar with Ushahidi, it is a platform developed after the disputed 2007 presidential election in Kenya that plotted eyewitness reports of violence, sent in by email and text message, on a Google map.

Today it is used to track a number of issues – but problems with elections remain one of its core purposes. The department should think about grants that would help spin up a Ushahidi install to enable citizens of the country to register concerns and allegations around fraud, violence, intimidation, etc… It could then verify and inspect issues that are flagged by the country’s citizens. This would allow the department to deploy its resources more effectively and ensure that its work was speaking to concerns raised by citizens.

A Developer version of DART?

One of the most popular programs the Canadian government has around international issues is the Disaster Assistance Response Team (DART). In particular, Canadians have often been big fans of DART’s work in purifying water after the Boxing Day tsunami in Asia, as well as its work in Haiti. Maybe the department could have a digital DART team: a group of developers that, in an emergency, could help spin up Ushahidi, FixMyStreet, or OpenMRS installations to provide some quick but critical shared infrastructure for Canadians, other countries’ response teams and for non-profits. During periods of non-crisis the team could work on these projects or support groups like CrisisCommons or OpenStreetMap, helping contribute to open source projects that can be instrumental in a humanitarian crisis.

 

The State of Open Data 2011

What is the state of the open data movement? Yesterday, during my opening keynote at the Open Government Data Camp (held this year in Warsaw, Poland), I sought to follow up on my talk from last year’s conference. Here’s my take on where we are today (I’ll post/link to a video of the talk as soon as the Open Knowledge Foundation makes it available).

Successes of the Past Year: Crossing the Chasm

1. More Open Data Portals

One of the things that has been amazing to witness in 2011 is the veritable explosion of Open Data portals around the world. Today there are well over 50 government data catalogs with more and more being added. The most notable of these was probably the Kenyan Open Data catalog which shows how far, and wide, the open data movement has grown.

2. Better Understanding and More Demand

The thing about all these portals is that they are the result of a larger shift. Specifically, more and more government officials are curious about what open data is. This is not to say that understanding has radically shifted, but many people in government (and in politics) now know the term, believe there is something interesting going on in this space, and want to learn more. Consequently, in a growing number of places there is less and less headwind against us. Rather than screaming from the rooftops, we are increasingly being invited in the front door.

3. More Experimentation

Finally, what’s also exciting is the increased experimentation in the open data space. The number of companies and organizations trying to engage open data users is growing. ScraperWiki, the DataHub, BuzzData, Socrata and Visual.ly are some of the products and resources that have emerged out of the open data space. And the types of research and projects that are emerging – the tracking of the Icelandic volcano eruptions, the emergence of hacks and hackers, micro projects (like my own Recollect.net) and the research showing that open data could be generating savings of £8.5 million a year to governments in the Greater Manchester area – are deeply encouraging.

The Current State: An Inflection Point

The exciting thing about open data is that increasingly we are helping people – public servants, politicians, business owners and citizens – imagine a different future, one that is more open, efficient and engaging. Our impact is still limited, and the journey is in its early days. More importantly, thanks to our successes (number 2 above) our role is changing. So what does this mean for the movement right now?

Externally to the movement, the work we are doing is only getting more relevant. We are in an era of institutional failure. From the Tea Party to Occupy Wall St. there is a recognition that our institutions no longer sufficiently serve us. Open data can’t solve this problem, but it is part of the solution. The challenge of the old order and the institutions it fostered is that its organizing principle is built around the management (control) of processes; it’s been about the application of the industrial production model to government services. This means it can only move so fast, and because of its strong control orientation, can only allow for so much creativity (and adaptation). Open data is about putting the free flow of information at the heart of government – both internally and externally – with the goal of increasing government’s metabolism and decentralizing society’s capacity to respond to problems. Our role is not obvious to the people in those movements, and we should make it clearer.

Internally to the movement, we have another big challenge. We are at a critical inflection point. For years we have been on the outside, yelling that open data matters. But now we are being invited inside. Some of us want to rush in, keen to make advances; others want to hold back, worried about being co-opted. To succeed, it is essential that we become more skilled at walking this difficult line: engaging with governments and helping them make the right decisions, while not being co-opted or sacrificing our principles. Choosing not to engage would, in my opinion, be to abdicate our responsibility as citizens and open data activists. This is a difficult transition, but it will be made easier if we at least acknowledge it, and support one another in it.

Our Core Challenges: What’s next

Looking across the open data space, my own feeling is that there are three core challenges facing the open data movement that threaten to compromise the successes we’ve enjoyed so far.

1. The Compliance Trap

One key risk for open data is that all our work ends up being framed as a transparency initiative, and thus making data available is reduced to being a compliance issue for government departments. If this is how our universe is framed, I suspect that in 5-10 years governments, eager to save money and cut some services, will choose to cut open data portals as a cost-saving measure.

Our goal is not to become a compliance issue. Our goal is to make governments understand that they are data management organizations and that they need to manage their data assets with the same rigour with which they manage physical assets like roads and bridges. We are as much about data governance as we are about open data. This means we need to have a vision for government, one where data becomes a layer of the government architecture. Our goal is to make the data platform one that not only citizens outside of government can build on, but one on top of which government rebuilds its policy apparatus as well as its IT systems. Achieving this will ensure that open data gets hardwired right into government and so cannot be easily shut down.

2. Data Schemas

This year, in the lead up to the Open Data Camp, the Open Knowledge Foundation created a map of open data portals from around the world. This was fun to look at, and I think should be the last time we do it.

We are getting to a point where the number of data portals is becoming less and less relevant. Getting more portals isn’t going to enable open data to scale more. What is going to allow us to scale is establishing common schemas for data sets that enable them to work across jurisdictions. The single most widely used open government data set is transit data, which, because it has been standardized by GTFS (the General Transit Feed Specification), is available across hundreds of jurisdictions. This standardization has not only put the data into Google Maps (generating millions of uses every day) but has also led to an explosion of transit apps around the world. Common standards will let us scale. We cannot forget this.
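Part of what makes GTFS so effective is how plain it is: a zip of CSV files with agreed column names, so the same code works against any agency’s feed. A minimal sketch using the spec’s required stop fields (the feed file name below is just an example):

```python
# Minimal sketch: read the stops out of any GTFS feed.
# stop_id, stop_name, stop_lat and stop_lon are standard GTFS columns, so this
# same loop works across agencies; "transit_feed.zip" is an example path.
import csv
import io
import zipfile

def read_stops(feed_path):
    with zipfile.ZipFile(feed_path) as feed:
        with feed.open("stops.txt") as raw:
            reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8-sig"))
            for row in reader:
                yield (row["stop_id"], row["stop_name"],
                       float(row["stop_lat"]), float(row["stop_lon"]))

for stop_id, name, lat, lon in read_stops("transit_feed.zip"):
    print(stop_id, name, lat, lon)
```

The same schema-first logic is what would let other data sets scale the way transit data has.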

So let’s stop mapping open data portals, and start mapping datasets that adhere to common schemas. Given that open data is increasingly looked upon favourably by governments, creating these schemas is, I believe, now the central challenge to the open data movement.

3. Broadening the Movement

I’m impressed by the hundreds and hundreds of people here at the Open Data Camp in Warsaw. It is fun to be able to recognize so many of the faces here; the problem is that I can recognize too many of them. We need to grow this movement. There is a risk that we will become complacent, that we’ll enjoy the movement we’ve created and, more importantly, our roles within it. If that happens we are in trouble. Despite our successes we are far from reaching critical mass.

The simple question I have for us is: where are the United Way, Google, Microsoft, the Salvation Army, Oxfam, and Greenpeace? We’ll know we are making progress when companies – large and small – as well as non-profits start understanding how open government data can change their world for the better and so want to help us advance the cause.

Each of us needs to go out and start engaging these types of organizations and helping them see this new world and the potential it creates for them to make money or advance their own issues. The more we can embed ourselves into others’ networks, the more allies we will recruit and the stronger we will be.

 

International Open Data Hackathon 2011: Better Tools, More Data, Bigger Fun

Last year, with only a month of notice, a small group of passionate people announced we’d like to do an international open data hackathon and invited the world to participate.

We were thinking small but fun. Maybe 5 or 6 cities.

We got it wrong.

In the end, people from over 75 cities around the world offered to host an event. Better still, we definitely heard from people in over 40. It was an exciting day.

Last week, after locating a few of the city organizers’ email addresses, I asked them if we should do it again. Every one of them came back and said: yes.

So it is official. This time we have 2 months notice. December 3rd will be Open Data Day.

I want to be clear, our goal isn’t to be bigger this year. That might be nice if it happens. But maybe we’ll only have 6-7 cities. I don’t know. What I do want is for people to have fun, to learn, and to engage those who are still wrestling with the opportunities around open data. There is a world of possibilities out there. Can we seize on some of them?

Why.

Great question.

First off. We’ve got more data. Thanks to more and more enlightened governments in more and more places, there’s a greater amount of data to play with. Whether it is Switzerland, Kenya, or Chicago there’s never been more data available to use.

Second, we’ve got better tools. With a number of governments using Socrata there are more APIs out there for us to leverage. ScraperWiki has gotten better, and new tools like BuzzData, TheDataHub and Google’s Fusion Tables are emerging every day.

And finally, there is growing interest in making “openness” a core part of how we measure governments. Open data has a role to play in driving this debate. Done right, we could make the first Saturday in December “Open Data Day” – a chance to explain to, demo for, and invite to play the policy makers, citizens, businesses and non-profits who don’t yet understand the potential. Let’s raise the world’s data literacy and have some fun. I can’t think of a better way than with another global open data hackathon – a maker-faire-like opportunity for people to celebrate open data by creating visualizations, writing up analyses, building apps or doing whatever they want with data.

Of course, like last time, hopefully we can make the world a little better as well. (more on that coming soon)

How.

The basic premise for the event is simple, relying on 5 basic principles.

1. Together. It can be as big or as small, as long or as short, as you’d like it, but we’ll be doing it together on Saturday, December 3rd, 2011.

2. It should be open. Around the world I’ve seen hackathons filled with different types of people, exchanging ideas, trying out new technologies and starting new projects. Let’s be open to new ideas and new people. Chris Thorpe in the UK has done amazing work getting a young and diverse group hacking. I love Nat Torkington’s words on the subject. Our movement is stronger when it is broader.

3. Anyone can organize a local event. If you are keen to help organize one in your city and/or just participate, add your name to the relevant city on this wiki page. Wherever possible, try to keep it to one per city; let’s build some community and get new people together. Which city or cities you share with is up to you, as is how you do it. But let’s share.

4. You can work on anything that involves open data. That could be a local or global app, a visualization, proposing a standard for common data sets, or scraping data from a government website to make it available for others in BuzzData.

It would be great to have a few projects people can work on around the world – building stuff that is core infrastructure to future projects. That’s why I’m hoping someone in each country will create a local version of MySociety’s Mapit web service for their country. It will give us one common project, and raise the profile of a great organization and a great project.

We also hope to be working with Random Hacks of Kindness, who’ve always been so supportive, ideally supplying data that they will need to run their applications.

5. Let’s share ideas across cities on the day. Each city’s hackathon should do at least one demo, brainstorm, proposal, or anything else that it shares in an interactive way with members of a hackathon in at least one other city. This could be via video stream, Skype, or chat… anything, but let’s get to know one another and share the cool projects or ideas we are hacking on. There are some significant challenges to making this work: timezones, languages, culture, technology… but who cares, we are problem solvers, let’s figure out a way to make it work.

Like last year, let’s not try to boil the ocean. Let’s have a bunch of events, where people care enough to organize them, and try to link them together with a simple, short connection/presentation. Above all, let’s raise some awareness, build something and have some fun.

What next?

1. If you are interested, sign up on the wiki. We’ll move to something more substantive once we have the numbers.

2. Reach out and connect with others in your city on the wiki. Start thinking about the logistics. And be inclusive. If someone new shows up, let them help too.

3. Share with me your thoughts. What’s got you excited about it? If you love this idea, let me know, and blog/tweet/status update about it. Conversely, tell me what’s wrong with any or all of the above. What’s got you worried? I want to feel positive about this, but I also want to know how we can make it better.

4. Localization. If there is bandwidth locally, I’d love for people to translate this blog post and repost it locally. (Let me know, as I’ll try cross-posting it here, or at least linking to it.) It is important that this not be an English-language-only event.

5. If people want a place to chat with other about this, feel free to post comments below. Also the Open Knowledge Foundation’s Open Data Day mailing list will be the place where people can share news and help one another out.

Once again, I hope this will sound like fun to a few committed people. Let me know what you think.

The Science of Community Management: DjangoCon Keynote

At OSCON this year, Jono Bacon argued that we are entering an era of renaissance in open source community management – that we no longer just have to share stories, but that repeatable, scientific approaches are increasingly available to us. In short, the art of community management is shifting to a science.

With an enormous debt to Jono, I contend we are already there. Indeed, the tools to enable a science of community management have existed for at least 5 years. All that is needed is an effort to implement them.

A few weeks ago the organizers of DjangoCon were kind enough to invite me to give the keynote at their conference in Portland and I made these ideas the centerpiece of my talk.

Embedded below is the result: a talk that starts slowly, but grows with passion and engagement as it progresses. I really want to thank the audience for the excellent Q&A and for engaging with me and the ideas as much as they did. As someone from outside their community, I’m grateful.

My hope in the next few weeks is to write this talk up in a series of blog posts or something more significant, and, hopefully, to redo this video on SlideShare (although I’m going to have to get my hands on the audio of this). I’ll also be giving a version of this talk at the Drupal Pacific Northwest Summit in a few weeks. Feedback, as always, is not only welcome, but gratefully received. None of this happens in a vacuum; it is always your insights that help me get better, smarter and more on target.

Big thanks to Diederik van Liere and Lauren Bacon for inspiration and help, as well as Mike Beltzner, Daniel Einspanjer, David Ascher and Dan Mosedale (among many others) at Mozilla who’ve been supportive and a big help.

In the meantime, I hope this is enjoyable, challenging and spurs good thoughts.

Interview with Charles Leadbeater – Monday September 19th

I’m excited to share that I’ll be interviewing British public policy and open innovation expert Charles Leadbeater on September 19th as part of a SIG webinar series. For readers not familiar with Charles Leadbeater, he is the author of We-Think and numerous other chapters, pamphlets and articles, ranging in focus from social innovation, to entrepreneurship, to public sector reform. He served as an adviser to Tony Blair and has a long-standing relationship with the British think tank Demos.

Our conversation will initially focus on open innovation, but I’m sure will range all over, touching on the impact of open source methodologies on the private, non-profit and public sector, the future of government services and, of course, the challenges and opportunities around open data.

If you are interested in participating in the webinar you can register here. There is a small fee I’m told is being charged to recover some of the costs for running the event.

If you are participating and have a question you’d like to see asked, or a theme or topic you’d like to see covered, please feel free to comment below or, if you prefer more discretion, send me an email.