
It's the icing, not the cake: key lesson on open data for governments

At the 2010 GTEC conference I did a panel with David Strigel, the Program Manager of the Citywide Data Warehouse (CityDW) at the District of Columbia Government. During the introductory remarks David recounted the history of Washington DC’s journey to open data.

Interestingly, that journey began not with open data, but with an internal problem. Back around 2003 the city had a hypothesis that towing away abandoned cars would reduce crime rates in the immediate vicinity, thereby saving more money in the long term than the cost of towing. In order to assess the program’s effectiveness, city staff needed to “mash-up” longitudinal crime data against service request data – specifically, requests to remove abandoned cars. Alas, the data sets were managed by different departments, so this was a tricky task. As a result the city’s IT department negotiated bilateral agreements with both departments to host their datasets in a single location. Thus the DC Data Warehouse was born.

Happily, the data demonstrated the program was cost effective. Building on this success the IT department began negotiating more bilateral agreements with different departments to host their data centrally. In return for giving up stewardship of the data the departments retained governance rights but reduced their costs and the IT group provided them with additional, more advanced, analytics. Over time the city’s data warehouse became vast. As a result, when DC decided to open up its data it was, relatively speaking, easy to do. The data was centrally located, was already being shared and used as a platform internally. Extending this platform externally (while not trivial) was a natural step.

In short, the deep problem that needed to be solved wasn’t open data. It was information management. Getting the information management and governance policies right was essential for DC to move quickly. Moreover, this problem strikes at the heart of what it means to be government. Knowing what data you have, where it is, and having a governance structure that allows it to be shared internally (as well as externally) is a problem every government is going to face if it wants to be efficient, relevant and innovative in the 21st century. In other words, information management is the cake. Open data – which I believe is essential – is the sweet icing you smother on top of that dense cake you’ve put in place.

Okay, with that said, two points flow from this.

First: Sometimes, governments that “do” open data start off by focusing on the icing. The emphasis is on getting data out there first and figuring out a governance model that makes sense after the fact. This is a viable strategy, but it carries real risks. When sharing data isn’t a core function but rather a feature tacked on at the end, the policy and technical infrastructure may be pretty creaky. In addition, developers may not want to innovate on top of your data platform because they may (rightly) question the level of commitment. One reason DC’s data catalog works is because it has internal users. This gives the data stability and a sense of permanence. On the upside, the icing is politically sexier, so it may help marshal resources to drive a broader rethink of data governance. Either way, at some point you’ve got to tackle the cake, otherwise things are going to get messy. Remember, it took DC 7 years to develop its cake before it put icing on it. But that was making it from scratch. Today, thanks to new services (there are armies of consultants on this), tools (e.g. Socrata) and models (e.g. Washington, DC) you can make that cake following a recipe and even use cake mix. As David Strigel pointed out, today he could do it in a fraction of the time.

Second: More darkly, one lesson to draw from DC is that the capacity of a government to do open data may be a pretty good proxy for their ability to share information and coordinate across different departments. If your government can’t do open data in a relatively quick time period, it may mean they simply don’t have the infrastructure in place to share data internally all that effectively either. In a world where government productivity needs to rise in order to deal with budget deficits, that could be worrying.

Lots of Open Data Action in Canada

A lot of movement on the open data (and not so open data) front in Canada.

Canadian International Development Agency (CIDA) Open Data Portal Launched

Some readers may remember that last week I wrote a post about the imminent launch of CIDA’s open data portal. The site is now live and has a healthy amount of data on it. It is a solid start to what I hope will become a robust site. I’m a big believer – and a supporter of the excellent advocacy efforts of the good people at Engineers Without Borders – that the open data portal would be greatly enhanced if CIDA started publishing its data in compliance with the emerging international standard of the International Aid Transparency Initiative, as these 20 leading countries and organizations have.

If anyone creates anything using this data, I’d love to see it. One simple start might be to try using the Open Knowledge Foundation’s open source Where Does my Money Go code, to visualize some of the spending data. I’d be happy to chat with anyone interested in doing this, you can also check out the email group to find some people experienced in playing with the code base.

Improved License on the CIDA open data portal and data.gc.ca

One thing I noticed with the launch of the CIDA open data portal was that its license was remarkably better than the license at data.gc.ca – which struck me as odd, since I know the feds like to be consistent about these types of things. It turns out that the data.gc.ca license has been updated as well, and the two are identical. This is good news, as some of the issues that were broken with the previous license have been fixed. But not all. The best “license” out there remains the one at data.gov (a trick answer, because data.gov has no license – it is all public domain! Tricky eh…? Nice!), but if you are going to have a license, the UK Open Government License used at data.gov.uk is more elegant, freer and satisfies a number of the concerns I’ve cited previously and have heard people raise.

So this new data.gc.ca license is a step in the right direction, but still behind the open gov leaders (teaching lawyers new tricks sadly takes a long time, especially in government).

Great site, but not so open data: WellBeing Toronto

Interestingly, the City of Toronto has launched a fabulous new website called WellBeing Toronto. It is definitely worth checking out. The main problem, of course, is that while it is interesting to look at, the underlying data is, sadly, not open. You can’t play with the data, such as mashing it up with your own (or another jurisdiction’s) data. This is disappointing, as I believe a number of non-profits in Toronto would likely find the underlying data quite helpful, even important. I have, however, been told that the underlying data will be made open. It is something I hope to check in on again in a few months; I fear it may never get prioritized, so it may be up to Torontonians to hold the Mayor and council’s feet to the fire to ensure it gets done.

Parliamentary Budget Office (PBO) launches (non-open) data website

It seems the PBO is also getting in on the data action with the launch of a beta site that allows you to “see” budgets from the last few years. I know that the Parliamentary Budget Office has been starved of resources, so they deserve to be congratulated for taking this first, important step. Also interesting is that the data has no license on the website, which could make it the most liberally licensed open data portal in the country. The site does have big downsides, though. The data can only be “looked” at; there is no obvious (simple) way to download it and start playing with it. More oddly still, the PBO requires that users register with their email address to view the data. This seems beyond odd – downright creepy, actually. First, parliament’s budget should be free and open, and one should not need to hand over an email address to access it. Second, the email addresses collected appear to serve no purpose (unless the PBO intends to start spamming us), other than to tempt bad people to hack the site so they can steal a list of email addresses.

Mind. Prepare to be blown away. Big Data, Wikipedia and Government.

Okay, super psyched about this. Back at the Strata Conference in Feb (in San Diego) I introduced my long-time uber-quant friend, and now Wikimedia Foundation data scientist, Diederik Van Liere to fellow Gov2.0 thinker Nicholas Gruen (Chairman) and Anthony Goldbloom (Founder and CEO) of an awesome new company called Kaggle.

As usually happens when awesome people get together… awesomeness ensued. Mind. Be prepared to be blown.

So first, what is Kaggle? They’re a company that helps organizations post their data and run competitions, with the goal of having it scrutinized by the world’s best data scientists toward some specific end. Perhaps the most powerful example of a Kaggle competition to date was their HIV prediction competition, in which they asked contestants to use a data set to find markers in the HIV sequence which predict a change in the severity of the infection (as measured by viral load and CD4 counts).

Until Kaggle showed up, the best science to date had a prediction rate of 70% – a feat that had taken years to achieve. In 90 days contributors to the contest were able to achieve a prediction rate of 77% – a 10% improvement. I’m told that achieving a similar increment had previously taken something close to a decade. (Data geeks can read how the winner did it here and here.)

Diederik and Anthony have created a similar competition, but this time using Wikipedia participation data. As the competition page outlines:

This competition challenges data-mining experts to build a predictive model that predicts the number of edits an editor will make in the five months after the end date of the training dataset. The dataset is randomly sampled from the English Wikipedia dataset from the period January 2001 – August 2010.

The objective of this competition is to quantitatively understand what factors determine editing behavior. We hope to be able to answer questions, using these predictive models, about why people stop editing or increase their pace of editing.

This is, of course, a subject matter that is dear to me, as I’m hoping that we can do similar analysis in open source communities – something Diederik and I have tried to theorize about with Wikipedia and actually do with Bugzilla data.

There is a grand prize of $5000 (along with a few others) and, amazingly, already 15 participants and 7 submissions.
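To make the prediction task concrete, here is a toy baseline of the kind a first submission might beat. It is purely illustrative: the data is synthetic and the real competition dataset and its fields differ.

```python
# Naive baseline for a "predict future edits" task: assume an editor keeps
# editing at their recent average rate. Real entries would, of course, model
# far more than this (tenure, reverts, talk-page activity, etc.).

def predict_future_edits(monthly_edits, horizon=5, window=3):
    """Predict total edits over the next `horizon` months as the editor's
    average monthly rate over the last `window` months, times the horizon."""
    recent = monthly_edits[-window:]
    rate = sum(recent) / len(recent)
    return rate * horizon

# Synthetic editors: lists of monthly edit counts, most recent month last.
editors = {
    "steady":   [40, 42, 38, 41, 40, 39],
    "fading":   [60, 45, 30, 15, 5, 1],
    "newcomer": [0, 0, 0, 3, 12, 25],
}

for name, history in editors.items():
    print(name, round(predict_future_edits(history), 1))
```

A baseline like this is exactly what competitors race to outperform; the gap between it and the leading submission is one way to measure how much signal the historical data actually holds.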

Finally, I hope public policy geeks, government officials and politicians are paying attention. There is power in data, and an opportunity to use it to find efficiencies and opportunities. Most governments probably don’t even know how to approach an organization like Kaggle, or how to run a competition like this, despite (or perhaps because of?) it being so fast, efficient and effective.

It shouldn’t be this way.

If you are in government (or any org), check out Kaggle. Watch. Learn. There is huge opportunity here.

12:10pm PST – UPDATE: More Michael Bay-sized awesomeness. Within 36 hours of the Wikipedia challenge being launched, the leading submission has improved on internal Wikimedia Foundation models by 32.4%.

CIDA announces Open Data portal: What it means to Canadians

For those who missed it, the Canadian International Development Agency (CIDA) has announced it is launching an open data portal.

This is exciting news. On Monday I was interviewed about the initiative by Embassy Magazine which published the resulting article (behind their paywall) here.

As (I hope) the interview conveys, I’m cautiously optimistic about the Minister’s announcement. I’m conservative in my reaction only because we don’t actually know what the Minister has announced. At the moment the CIDA open data page is, quite literally, a blank slate. I feel positive because pretty much anything that makes more information about Canada’s aid budget available online is a step in the right direction. I’m cautious, however, because the text of the Minister’s speech leads me to believe that she is using the term “open data” to describe something that may, in fact, not be open data.

Donors and partner countries must be accountable to their citizens, absolutely, but both must also be accountable to each other.

Transparency underpins these accountabilities.

With this in mind, today I am pleased to announce the Open Data Portal on the CIDA website that will make our searchable database of roughly 3,000 projects quick and simple to access.

The Open Data portal will put our country strategies, evaluations, audits and annual statistical and results reports within easy reach.

One of the core elements of the definition of “open data” is that it be machine readable. I need to actually get the “data” (e.g. an Excel spreadsheet, or a database I can download and/or access) so that I can play with it, mash it up, analyze it, etc. It isn’t clear that this is on offer. The Minister’s announcement talks about a database that allows you to search, and quickly download, reports on the 3,000 projects that CIDA funds or operates. A report, however, is not data. It may cite data, it may (and hopefully does) even contain data in charts or tables, but if what we are getting is access to reports then this is not an open data portal.
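The difference matters in practice. With machine-readable records, a few lines of code can aggregate thousands of projects, something no stack of PDF reports allows. Here is a minimal sketch; the CSV columns below are hypothetical stand-ins, not CIDA’s actual fields.

```python
# Sketch: aggregate a (hypothetical) machine-readable aid dataset by country.
import csv
import io
from collections import defaultdict

# Invented sample rows standing in for a downloadable dataset of 3,000 projects.
raw = """project_id,country,sector,budget_cad
A-001,Ghana,Health,1200000
A-002,Ghana,Education,800000
A-003,Haiti,Health,2500000
"""

totals = defaultdict(float)
for row in csv.DictReader(io.StringIO(raw)):
    totals[row["country"]] += float(row["budget_cad"])

for country, budget in sorted(totals.items()):
    print(f"{country}: ${budget:,.0f}")
```

Swap three sample rows for 3,000 real ones and the same loop still works – that is the whole point of structured data over reports.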

What I hope is happening – and what I advocated for in an op-ed in the Toronto Star – is that the Minister is launching a true open data portal which will share actual data – not analysis – with Canadians. More importantly, I hope this means Canada will be joining the efforts of Publish What You Fund, which pushes donor organizations to share their aid data in a single common structure, so that budgets, contributions, projects, timelines, geography and other information about aid can be compared across countries, agencies and organizations.

Open data, especially in an internationally recognized standardized format, matters because no one is going to read all 10,000 reports about all 3,000 projects CIDA funds. However, if we had access to the data in a structured manner, there are those at non-profits, in universities and colleges, and in the media (among other places) who could map the projects, compare budgets and results more clearly, compare our efforts against those of other countries, and do their own analysis to, say, find duplication and overlap. I don’t, for a second, believe that 99.9% of Canadians will use CIDA’s open data portal, but the 0.1% who do will be able to create products that can inform the rest of us and allow us to better understand Canada’s role in the world. In other words, an open data portal could be empowering and educational for a broad number of people. Access to 10,000 reports, while a good step, simply won’t be able to create a similar outcome on any scale. The difference is, quite frankly, dramatic.

So let’s wait and see. I’m excited that the Minister of International Cooperation is using the language of open data – it means that she and her staff understand it has currency. What I also hope is that they understand its meaning. So far we have no data on whether they do or do not, but I remain cautiously optimistic; they should, after all, realize the significance of the language they are using. Either way, they have set high expectations among those of us who think about, talk about and work in this area. As a Canadian, I’m hoping those expectations get fulfilled.

The next Open Data battle: Advancing Policy & Innovation through Standards

With the possible exception of weather data, the most successful open data set out there at the moment is transit data. It remains the data with which developers have experimented and innovated the most. Why is this? Because it’s been standardized. Ever since Google and the City of Portland created the General Transit Feed Specification (GTFS), any developer that creates an application using GTFS transit data can port their application to over 100 cities around the world, with tens and even hundreds of millions of potential users. Now that’s scale!
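To see why the shared spec matters, consider that every GTFS feed ships the same files with the same columns (`stops.txt`, with `stop_id`, `stop_name`, `stop_lat`, `stop_lon`, among others). That means a snippet like this works unchanged against any city publishing GTFS. The sample rows here are made up.

```python
# Sketch: find the nearest transit stop from a GTFS stops.txt file.
import csv
import io
import math

# Two invented rows in the standard GTFS stops.txt format.
stops_txt = """stop_id,stop_name,stop_lat,stop_lon
1,Main St at 1st Ave,49.2827,-123.1207
2,Main St at 5th Ave,49.2650,-123.1000
"""

def nearest_stop(stops, lat, lon):
    # A rough flat-earth distance is fine at city scale.
    return min(stops, key=lambda s: math.hypot(float(s["stop_lat"]) - lat,
                                               float(s["stop_lon"]) - lon))

stops = list(csv.DictReader(io.StringIO(stops_txt)))
print(nearest_stop(stops, 49.28, -123.12)["stop_name"])
```

Point the same code at Portland’s feed, or Vancouver’s, or Sydney’s, and it still runs. That portability is exactly what made transit apps spread to 100+ cities.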

All in all, the benefits of a standard data structure are clear. A public good is more effectively used, citizens enjoy better service, and companies (both Google and the numerous smaller companies that sell transit-related applications) generate revenue, pay salaries, etc.

This is why, with a number of jurisdictions now committed to open data, I believe it is time for advocates to start focusing on the next big issue: how do we get different jurisdictions to align around standard structures so as to increase the number of people to whom an application or analysis will be relevant? Having cities publish open data sets is a great start and has led to real innovation, but the next generation of open data, and the next leaps in innovation, will require more standards.

The key, I think, is to find areas that meet three criteria:

  • Government Data: Is there relevant government data about the service or issue that is available?
  • Demand: Is this a service for which there is regular demand? (this is why transit is so good, millions of people touch the service on a daily basis)
  • Business Model: Is there a business that believes it can use this data to generate revenue (either directly or indirectly)?

[Diagram: three overlapping circles – Government Data, Demand and Business Model – with the sweet spot where all three meet]

Two comments on this.

First, I think we should look at this model because we want to find places where the incentives are right for all the key stakeholders. The wrong way to create a data structure is to get a bunch of governments together to talk about it. That process will take 5 years… if we are lucky. Remember, the GTFS emerged because Google and Portland got together; after that, everybody else bandwagoned because the value proposition was so high. This remains, in my mind, not the perfect model, but the fastest and most efficient one for getting more common data structures. I also accept it won’t work for everything, but it can give us more successes to point to.

Which leads me to point two. Yes, at the moment, I think the target in the middle of this model is relatively small. But I think we can make it bigger. The GTFS shows cities, citizens and companies that there is value in open data. What we need are more examples, so that a) more business models emerge and b) more government data is shared in a structured way across multiple jurisdictions. The bottom and right-hand circles in this diagram can, and if we are successful will, move. In short, I think we can create this dynamic:

[Diagram: the same model, with the Demand and Business Model circles expanding to enlarge the sweet spot]

So, what does this look like in practice?

I’ve been trying to think of services that fall in various parts of the diagram. A while back I wrote a post about using open restaurant inspection data to drive down health costs – specifically, about finding a government to work with Yelp!, Bing or Google Maps, Urban Spoon or another company to integrate the inspection data into their application. That, for me, is an example of something that fits in the middle. Governments have the data, it’s a service citizens could touch on a regular basis if the data appeared in their workflow (e.g. Yelp! or Bing Maps), and for those businesses it either helps drive search revenue or gives their product a competitive advantage. The Open311 standard (sadly missing from my diagram) and the emergence of SeeClickFix strike me as another excellent example, right on the inside edge of the sweet spot.
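Mechanically, the restaurant inspection idea is just a join: match the city’s inspection scores onto a listings service’s records so the score appears in the user’s existing workflow. A toy version, with entirely invented data and field names:

```python
# Sketch: merge (hypothetical) city inspection scores into listing records.
inspections = {"Joe's Diner": 92, "Noodle Hut": 68}  # city open data

listings = [  # a listings service's own records
    {"name": "Joe's Diner", "rating": 4.5},
    {"name": "Noodle Hut", "rating": 4.0},
]

for place in listings:
    place["inspection_score"] = inspections.get(place["name"])
    print(place["name"], place["rating"], place["inspection_score"])
```

In a real integration the hard part isn’t the join; it’s matching records reliably (addresses and license numbers, not names) and agreeing on a common inspection-data structure across cities, which is precisely the standards argument above.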

Here’s a list of what else I’ve come up with at the moment:

[Diagram: candidate services mapped onto the model]

You can also now see why I’ve been working on Recollect.net – our garbage pick-up reminder service – and helping develop a standard around garbage scheduling data, the Trash & Recycling Object Notation. I think it is a service around which we can help explain the value of common standards to cities.

You’ll notice that I’ve put “democracy data” (e.g. agendas, minutes, legislation, hansards, budgets, etc.) in the area where I don’t think there is a business plan. I’m not fully convinced of this – I could see a business model in the media space – but I’m trying to be conservative in my estimate. In either case, that is the type of data the good people at the Sunlight Foundation are trying to get liberated, so there are, at least, non-profit efforts concentrated there in America.

I also put real estate in a category where I don’t think there is real consumer demand. What I mean isn’t that people don’t want it – they do – but that they are only really interested in it maybe 2-4 times in their life. It doesn’t have the high touch point of transit or garbage schedules, or of traffic and parking. I understand that there are businesses to be built around this data – I love Viewpoint.ca, a site that mashes open data up with real estate data to create a compelling real estate website – but I don’t think it is a service people will get attached to, because they will only use it infrequently.

Ultimately, I’d love to hear from people about ideas of theirs that might fit in this sweet spot (if you are comfortable sharing the idea, of course). Part of this is because I’d love to test the model more. The other reason is that I’m engaged with some governments interested in getting more strategic about their open data use, so these types of opportunities could become reality.

Finally, I just hope you find this model compelling and helpful.

If the Prime Minister Wants Accountable Healthcare, let's make it Transparent too

Over at the Beyond the Commons blog Aaron Wherry has a series of quotes from recent speeches on healthcare by Canadian Prime Minister Stephen Harper in which the one constant keyword is… accountability.

Who can blame him?

Take everyone promising to limit growth to a still unsustainable 6% (gulp) and throw in some dubiously costly projects ($1 billion spent on e-health records in Ontario when an open source solution – VistA – could likely have been implemented at a fraction of the cost) and the obvious question is… what is the country going to do about healthcare costs?

I don’t want to claim that open data can solve the problem. It can’t. There isn’t going to be a single solution. But I think it could help spread best practices, improve patient choice and service, and possibly yield other benefits.

Anyone who’s been around me for the last month knows about my restaurant inspection open data example (which could also yield healthcare savings), but I think we can go bigger. A federal government that is serious about accountability in healthcare needs to build a system where that accountability isn’t just between the provinces and the feds; it needs to be between the healthcare system and its users: us.

Since the feds usually attach several provisions to their healthcare dollars, the one I’d like to see is an open data provision: one where provinces and hospitals are required to track and make open a whole set of performance data, in machine readable formats, in a common national standard, that anyone in Canada (or around the world) can download and access.

Some of the data I’d love to see mandated to be tracked and shared, includes:

  • Emergency Room wait times – in real time.
  • Wait times, by hospital, for a variety of operations
  • All budget data, down to the hospital or even unit level, let’s allow the public to do a cost/patient analysis for every unit in the country
  • Survival rates for various surgeries (obviously controversial since some hospitals that have the lowest rates are actually the best since they get the hardest cases – but let’s trust the public with the data)
  • Inspection data – especially if we launched something akin to the Institute for Healthcare Improvement’s Protecting 5 Million Lives campaign
  • I’m confident there is much, much more…

I can imagine a slew of services and analyses emerging from these – if nothing else, a citizenry that is better informed about the true state of its healthcare system. Even something as simple as being able to check ER wait times at all the hospitals near you, so you can drive to the one where the wait is shortest. That would be nice.
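The ER wait-time idea really is that simple on the consumer side. The sketch below consumes an entirely hypothetical JSON feed – no such national standard exists yet, which is the point of mandating one.

```python
# Sketch: pick the shortest ER wait from a (hypothetical) open wait-time feed.
import json

feed = json.loads("""[
  {"hospital": "St. Paul's", "er_wait_minutes": 140},
  {"hospital": "VGH", "er_wait_minutes": 85},
  {"hospital": "Mount Saint Joseph", "er_wait_minutes": 210}
]""")

best = min(feed, key=lambda h: h["er_wait_minutes"])
print(f"Shortest ER wait: {best['hospital']} ({best['er_wait_minutes']} min)")
```

If every hospital published this one tiny record in a common format, a ten-line client like this – or an app built on it – becomes possible nationwide.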

Of course, if the Prime Minister wants to go beyond accountability and think about how data could directly reduce costs, he might take a look at one initiative launched south of the border.

If he did, he might be persuaded to demand that the provinces share a set of anonymized patient records to see if academics or others in the country might be able to build better models for how we should manage healthcare costs. In January of this year I witnessed the launch of the $3 million Heritage Health Prize at the O’Reilly Strata Conference in San Diego. It is a stunningly ambitious, but realistic, effort. As the press release notes:

Contestants in the challenge will be provided with a data set consisting of the de-identified medical records of 100,000 patients from the 2008 calendar year. Contestants will then be required to create a predictive algorithm to predict who was hospitalized during the 2009 calendar year. HPN will award the $3 million prize (more than twice what is paid for the Nobel Prize in medicine) to the first participant or team that passes the required level of predictive accuracy. In addition, there will be milestone prizes along the way, which will be awarded to teams leading the competition at various points in time.

In essence Heritage Health is doing to patient management what Netflix (through the $1M Netflix prize) did to movie selections. It’s crowdsourcing the problem to get better results.

The problem is, any algorithm developed by the winners of the Heritage Health Prize will belong to… Heritage Health. This means the benefits of this innovation cannot flow to Canadians (nor anyone else). So why not launch a prize of our own? We have more data, I suspect our data is better (not limited to a single state), and we could place the winning algorithm in the public domain so that it can benefit all of humanity. If Canadian data helped find efficiencies that lowered healthcare costs and improved healthcare outcomes for everyone in the world… it could be the biggest contribution to global healthcare by Canada since Frederick Banting discovered insulin and rescued diabetics everywhere.

Of course, open data and sharing (even anonymized) patient data would be a radical experiment for government – something new, bold and different. But 6% growth is unsustainable, and Canadians need to see that their government can do something bold, new and innovative. These initiatives would fit the bill.

Lost Open Data Opportunities

Sometimes even my home town of Vancouver gets it wrong.

Reading Chad Skelton’s blog (which I read regularly and recommend to my fellow Vancouverites), I was reminded of the great work he did creating an interactive visualization of the city’s parking tickets as part of a series on parking in Vancouver. Indeed, it is worth noting that the entire series was powered by data supplied by the city. Sadly, it just wasn’t (and still isn’t) open data. Quite the opposite: it was data that was wrestled out, with enormous difficulty, via an FOI (ATIP) request.

[Chart: interactive visualization of Vancouver parking tickets]

In the same blog post Chad recounts how he struggled to get the parking data from the city:

Indeed, the last major FOI request I made to the city was for its parking-ticket data. I had to fight the city tooth and nail to get them to cough up the information in the format I wanted it in (for months their FOI coordinator claimed, falsely, that she couldn’t provide the records in spreadsheet format). Then, when the parking ticket series finally ran, I got an email from the head of parking enforcement. He was wondering how he could get reprints of the series — he thought it was so good he wanted to hand it out to new parking enforcement officers during their training.

What is really frustrating about this paragraph is the last sentence. Obviously the people who find the most value in this analysis and tool are the city staff who manage parking infractions. So here is someone who, for free(!), provided an analysis and some stories that they now use to train new officers – and he had to fight to get the data. The city would have been poorer without Chad’s story and analysis. And yet it fought him. Worse, an important player in the civic space (and an open data ally) feels frustrated by the city.

There are, of course, other uses I could imagine for this data. I could imagine the data embedded into an application (ideally one like Washington DC’s Park IT DC – which lets you find parking meters on a map, see whether they are available, and check local car crime rates for the area) so that you can assess the risk of getting a ticket if you choose not to pay. This feels like the worst-case scenario for the city and, frankly, it doesn’t feel that bad and would probably not affect people’s behaviour that much. But there may be other important uses of this data – it may correlate in interesting and unpredictable ways with other events, connections that, if made and shared, might actually allow the city to deploy its enforcement officers more efficiently and effectively.

Of course, we won’t know what those could be, since the data isn’t shared, but it is the kind of thing Vancouver should be doing, given the existence of its open data portal. And all governments should take note: there is a cost to not sharing data. Lost opportunities, lost insights and value, lost allies and networks of people interested in contributing to your success. It’s all our loss.

Applications and Hardware Already Running On Open Data

Yesterday, Gerry T shared a photo he snapped at the University of Alberta in Edmonton of a “departure board” in the university’s Student Union building that uses open transportation data from the city’s website.

Essentially the display board is a simple application, displayed on a large flat-screen TV turned vertically.

It’s exactly the kind of thing that I imagine university students in many cities around the world wish they had – especially on a campus that is cold and/or wet. Wouldn’t it be nice to wait inside that warm student union building rather than at the bus stop?

Of course in Boston they’ve gone further than just providing the schedule online. They provide real-time data on bus locations which some students and engineers have used to create $350 LED signs in coffee houses to let users know when the next bus is coming.
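At heart, those $350 LED signs do one thing: poll a real-time arrival feed and display minutes until the next bus. The feed shape below is invented for illustration; real systems like Boston’s expose similar predictions through their own APIs.

```python
# Sketch: turn a (hypothetical) real-time arrival feed into a sign's message.
import json

predictions = json.loads("""{
  "stop": "Mass Ave at Newbury St",
  "arrivals_seconds": [95, 430, 780]
}""")

minutes = min(predictions["arrivals_seconds"]) // 60
print(f'Next bus at {predictions["stop"]}: {minutes} min')
```

Everything else – the LED matrix, the polling loop – is hardware and plumbing. The hard part was already done when the transit agency opened the data.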

It’s the kind of simple innovation you wish you’d see in more places: governments letting people help themselves at making their lives a little easier. Yes, this isn’t changing the world, but it’s a start, and an example of what more could happen.

Mostly, it’s nice to see innovators in Canada playing with the technology. Hopefully governments will catch up and let the even bigger ideas of students around the country be more than just visions in their heads.

Not sure who at the University created this, but nice work.

Back to Reality: The Politics of Government Transparency & Open Data

A number of my friends and advocates in the open government, transparency and open data communities have argued that online government transparency initiatives will be permanent since, the theory goes, no government will ever want to bear the political cost of rolling it back and being perceived as “more opaque.” I myself have, at times, let this argument go unchallenged or even run with it.

This week’s US budget negotiations between Congress and the White House should lay that theory to rest. Permanently.

The budget agreement that has emerged from the most recent round of negotiations – which is likely to be passed by Congress – slashes funding to an array of Obama transparency initiatives, such as USASpending, the IT Dashboard and data.gov, from $34M to $8M. Agree or disagree, Republicans are apparently all too happy to kill initiatives which make the spending and activities of the US government more transparent and which create a number of economic opportunities around open data. Why? Because they believe it has no political consequences.

So, unsurprisingly, it turns out that political transparency initiatives – even when they are online – are as bound to the realities of traditional politics as dot-coms were bound by the realities of traditional economics. It’s not enough to get a policy created or an initiative launched – it needs a community, a group of interested supporters, to nurture and protect it. Otherwise, it will be at risk.

Back in 2009, in the lead up to the drafting and launching of Vancouver’s Open Data motion I talked about creating an open-government bargain. Specifically, I argued that:

…in an open city, a bargain must exist between a government and its citizens. To make open data a success and to engage the community a city must listen, engage, ask for help, and of course, fulfill its promise to open data as quickly as possible. But this bargain runs both ways. The city must do its part, but so too must the local tech community. They must participate, be patient (cities move slower than tech companies), offer help and, most importantly, make the data come alive for each other, policy makers and citizens through applications and shared analysis.

Some friends countered that open data and transparency should simply exist because it is the right thing to do. I don’t disagree – and I wish we lived in a world where the existence of this ideal was sufficient to guarantee these initiatives. But it isn’t. It’s easy to kill something that no one uses (or, in the case of data.gov, that hasn’t been given enough time to generate a vibrant user base). It’s much, much harder to kill something that has a community that uses it, especially if that community and the products it creates are valued by society more generally. This is why open data needs users; it needs developers, think tanks and, above all, the media to take an interest in it and to leverage it to create content. It’s also why I’ve tried to create projects like Emitter.ca, Recollect.net, TaxiCity and others: the more value we create with open data for everyone, the more secure government transparency policies will be.

It’s use it or risk losing it. I wish this weren’t the case, but it’s the best defense I can think of.