
Government Procurement Reform – It matters

Earlier this week I posted a slidecast on my talk to Canada’s Access to Information Commissioners about how, as they do their work, they need to look deeper into the government “stack.”

My core argument was that decisions about what information gets made accessible are no longer best managed at the end of a policy development or program delivery process but rather should be embedded in it. This means ensuring there is capacity to export government information and data from the tools (e.g. software) government uses every day. Logically, it also means monkeying around in procurement policy (see slide below), since that is where the specs for the tools public servants use get set. Trying to bake “access” into processes after the software has been chosen is, well, often an expensive nightmare.

Gov stack

Privately, one participant from a police force came up to me afterward and said that I was simply guiding people to another problem – procurement. He is right. I am. Almost everyone I talk to in government feels like procurement is broken. I’ve said as much myself in the past. Clay Johnson is someone who has thought about this more than most; here he is below at the Code for America Summit with a great slide (and talk) about how the current government procurement regime rewards all the wrong behaviours and, often, all the wrong players.

Clay Risk profile

So yes, I’m pushing the RTI and open data community to think about procurement on purpose. Procurement is borked. Badly. Not just from a wasted-tax-dollars perspective, or even just from a service delivery perspective, but also because it doesn’t serve the goals of transparency well. Quite the opposite. More importantly, it isn’t going to get fixed until more people start pointing out that it is broken and start contributing to solving this major bottleneck of a problem.

I highly, highly recommend reading Clay Johnson’s and Harper Reed’s opinion piece in today’s New York Times about procurement titled Why the Government Never Gets Tech Right.

All of this becomes more important if the White House – and governments at all levels – are to have any hope of executing on their digital strategies (image below). There is going to be a giant effort to digitize much of what governments do, and a huge number of opportunities for finding efficiencies and improving services will come from it. However, if all of this depends on multi-million (or worse, 10 or 100 million) dollar systems and websites we are, to put it frankly, screwed. The future of government shouldn’t be (continue to be?) taken over by some massive SAP implementation that is so rigid and controlled it gives governments almost no opportunity to innovate. And yet this is the future our procurement policies steer us toward: a future with only a tiny handful of possible vendors, a high risk of project failure, and highly rigid and frail systems that are expensive to adapt.

Worse, there is no easy path here. I don’t see anyone doing procurement right. So we are going to have to dive into a thorny, tough problem. However, the more governments that try to tackle it in radical ways, the faster we can learn some new and interesting lessons.

Open Data WH

Access to Information, Technology and Open Data – Keynote for the Commissioners

On October 11th I was invited by Elizabeth Denham, the Access to Information and Privacy Commissioner for British Columbia to give a keynote at the Privacy and Access 20/20 Conference in Vancouver to an audience that included the various provincial and federal Information Commissioners.

Below is my keynote; I’ve tried to sync the slides up as well as possible. For those who want to skip to the juicier parts:

  • 7:08 – thoughts about the technology dependence of RTI legislation
  • 12:16 – the problematic approach to RTI implementation that results from these unstated assumptions
  • 28:25 – the need and opportunity to bring open data and RTI advocates together

Some acronyms used: RTI – Right to Information; ATIP – Access to Information and Privacy.

The 311 Open Data Competition is now Live on Kaggle

As I shared the other week, I’ve been working on a data competition with Kaggle and SeeClickFix involving 311 data from four cities: Chicago, New Haven, Oakland and Richmond.

So first things first – the competition is now live. Indeed, there are already 19 teams and 56 submissions. Fortunately, time is on your side: there are 56 days to go.

As I mentioned in my previous post on the subject, I have real hopes that this competition can help test a hypothesis I have about the possibility of an algorithmic open commons:

There is, however, for me, a potentially bigger goal. To date, as far as I know, predictive algorithms of 311 data have only ever been attempted within a city, not across cities. At a minimum it has not been attempted in a way in which the results are public and become a public asset.

So while the specific problem this contest addresses is relatively humble, I see it as creating a larger opportunity for academics, researchers, data scientists, and curious participants to figure out whether we can develop predictive algorithms that work for multiple cities. Because if we can, then these algorithms could be a shared common asset. Each algorithm would become a tool not just for one housing non-profit or city program, but for all sufficiently similar non-profits or city programs.

Of course I’m also discovering there are other benefits that arise out of these competitions.

This last weekend there was a mini sub-competition/hackathon involving a subset of the data. It was amazing to watch from afar. First, I was floored by how much cooperation there was, even between competitors and especially after the competition closed. Take a look at the forums; they probably make one of the more compelling cases that open data can help foster more people who want to learn how to manipulate and engage with data. Here are contestants sharing their approaches and ideas with one another – just like you’d want them to. I’d known that Kaggle had an interesting community and that learning played an important role in it, but “riding along” in a mini competition has caused me to look again at the competitions through a purely educational lens. It is amazing how much people both wanted to learn and share.

As in the current competition, the team at the hackathon also ran a competition around visualizing the data. Some great visualizations of the data came out of it, as well as more examples of people trying to learn and share. Here are two of my favourites:

map2

I love this visualization by Christoph Molnar because it reveals the difference in request locations in each city. In some they are really dense, whereas in others they are much more evenly distributed. Super interesting to me.
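For readers who want to poke at the competition data themselves, a map like this is only a few lines of code. Here is a minimal sketch, assuming the CSV has “city”, “latitude” and “longitude” columns (the actual field names in the competition files may differ):

```python
# A minimal sketch of a per-city request-location map, assuming the
# competition CSV has "city", "latitude" and "longitude" columns
# (the real field names may differ).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("train.csv")  # hypothetical path to the 311 data

fig, axes = plt.subplots(2, 2, figsize=(10, 10))
for ax, city in zip(axes.flat, df["city"].unique()):
    subset = df[df["city"] == city]
    # Small, translucent points make dense vs. evenly spread areas stand out
    ax.scatter(subset["longitude"], subset["latitude"], s=2, alpha=0.2)
    ax.set_title(city)

plt.tight_layout()
plt.savefig("request_locations_by_city.png")
```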

Most pressing issues in each city

I also love the simplicity of this image created by miswift. There might have been other things I’d have done, like colour coding similar problems to make them easier to compare across cities. But I still love it.

Congratulations to all the winners from this weekend’s event, and I hope readers will consider participating in the current competition.

Announcing the 311 Data Challenge, soon to be launched on Kaggle

The Kaggle – SeeClickFix – Eaves.ca 311 Data Challenge. Coming Soon.

I’m pleased to share that, in conjunction with SeeClickFix and Kaggle, I’ll be sponsoring a predictive data competition using 311 data from four different cities. My hope is that – if we can demonstrate that there are some predictive and socially valuable insights to be gained from this data – we might be able to persuade cities to work together to share data insights and help everyone become more efficient, address social inequities and tackle other city problems 311 data might enable us to explore.

Here’s the backstory and some details in anticipation of the formal launch:

The Story

Several months back Anthony Goldbloom, the founder and CEO of Kaggle – a predictive data competition firm – approached me asking if I could think of something interesting that could be done in the municipal space around open data. Anthony generously offered to waive all of Kaggle’s normal fees if I could come up with a compelling contest.

After playing around with some ideas I reached out to Ben Berkowitz, co-founder of SeeClickFix (one of the world’s largest implementers of the Open311 standard) and asked him if we could persuade some of the cities they work for to share their data for a competition.

Thanks to the hard work of Will Cukierski at Kaggle as well as the team at SeeClickFix we were ultimately able to generate a consistent data set with 300,000 lines of data involving 311 issues spanning 4 cities across the United States.

In addition, while we hoped many of those who might choose to participate in a municipal open data challenge would do so out of curiosity or a desire to better understand how cities work, SeeClickFix and I agreed to collectively put up $5000 in prize money to help raise awareness about the competition and hopefully stoke some media (as well as broader participant) interest.

The Goal

The goal of the competition will be to predict the number of votes, comments and views an issue is likely to generate. To be clear, this is not a prediction that is going to radically alter how cities work, but it could be genuinely useful to communications departments, helping them predict which problems are particularly thorny or worth proactively communicating to residents about. In addition – and this remains unclear – my own hope is that it could help us understand discrepancies in how different socio-economic or other groups use online 311, and so enable city officials to more effectively respond to complaints from marginalized communities.
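To make the prediction task concrete, here is a rough sketch of what a simple baseline entry might look like – assuming the training file has a free-text “summary” column and “num_votes”, “num_comments” and “num_views” targets, which may not match the actual competition schema:

```python
# A rough baseline sketch: predict votes, comments and views from the
# issue's text. Column names ("summary", "num_votes", "num_comments",
# "num_views") are assumptions, not necessarily the competition schema.
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

train = pd.read_csv("train.csv")  # hypothetical path

vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
X = vectorizer.fit_transform(train["summary"].fillna(""))

predictions = {}
for target in ["num_votes", "num_comments", "num_views"]:
    model = Ridge(alpha=1.0)
    # Fit on log counts, a common trick for heavily skewed count data
    model.fit(X, np.log1p(train[target]))
    predictions[target] = np.expm1(model.predict(X))
```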

In addition there will be a smaller competition around visualizing the data.

The Bigger Goal

There is, however, for me, a potentially bigger goal. To date, as far as I know, predictive algorithms of 311 data have only ever been attempted within a city, not across cities. At a minimum it has not been attempted in a way in which the results are public and become a public asset.

So while the specific problem this contest addresses is relatively humble, I see it as creating a larger opportunity for academics, researchers, data scientists, and curious participants to figure out whether we can develop predictive algorithms that work for multiple cities. Because if we can, then these algorithms could be a shared common asset. Each algorithm would become a tool not just for one housing non-profit or city program, but for all sufficiently similar non-profits or city programs. This could be exceptionally promising – as well as potentially reveal new behavioural or incentive risks that would need to be thought about.

Of course, discovering that every city is unique and that work is not easily transferable, or that predictive models cluster by city size, or by weather, or by some other variable is also valuable, as this would help us understand what types of investments can be made in civic analytics and what the limits of a potential commons might be.
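One simple way to probe that question is a leave-one-city-out experiment: train on three cities and evaluate on the fourth. A sketch, again using hypothetical column names and features rather than the real competition schema:

```python
# Leave-one-city-out sketch: does a model trained on three cities
# predict well in the fourth? Column names and features here are
# illustrative assumptions, not the actual competition fields.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

df = pd.read_csv("train.csv")  # hypothetical path
features = ["latitude", "longitude", "hour_of_day"]  # illustrative only

for held_out in df["city"].unique():
    train = df[df["city"] != held_out]
    test = df[df["city"] == held_out]

    model = RandomForestRegressor(n_estimators=100, n_jobs=-1)
    model.fit(train[features], np.log1p(train["num_views"]))

    preds = model.predict(test[features])
    rmse = np.sqrt(mean_squared_error(np.log1p(test["num_views"]), preds))
    print(f"Held out {held_out}: log-scale RMSE = {rmse:.3f}")
```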

So be sure to keep an eye on the Kaggle page (I’ll link to it) as this contest will be launching soon.

The Uncertain Future of Open Data in the Government of Canada

It is fair to say that, at present, open data is at its high-water mark in the Government of Canada. Data.gc.ca has been refreshed; more importantly, the government has signed the Open Data Charter, committing it to making data “open” by default, and a raft of new data sets has been made available.

In other words there is a lot of momentum in the right direction. So what could go wrong?

The answer…? Everything.

The reason is the upcoming cabinet shuffle.

I confess that Minister Clement and I have not agreed on all things. I believe – as the evidence shows – that supervised injection sites such as Insite make communities safer, save lives and make it easier for drug users to get help. As Health Minister, Clement did not. I argued strongly against the dismantling of the mandatory long form census, noting its demise would make our government dumber and, ultimately, more expensive. As Industry Minister, Clement was responsible for the end of a reliable long form census.

However, when it comes to open data, Minister Clement has been a powerful voice in a government that has, on many occasions, looked for ways to make access to information harder, not easier. Indeed, open data advocates have been lucky to have had two deeply supportive ministers, Clement and, prior to him, Stockwell Day (who also felt strongly about this issue and was incredibly responsive to many of my concerns when I shared them). This run, though, may be ending.

With the Government in trouble there is widespread acceptance that a major cabinet re-shuffle is in order. While Minister Clement has been laying a lot of groundwork for the upcoming negotiations with the public sector unions and a rethink of how the public service could be more effective and accountable, he may not be sticking around to see this work (which I’m sure the government sees as essential) through to the end. Nor may he want to. Treasury Board remains a relatively inward-facing ministry and it would not surprise me if both the Minister and the PMO were interested in moving him to a portfolio that was more outward and public facing. Only a notable few politicians dream of wrestling with public servants and figuring out how to reform the public service. (Indeed, Reg Alcock is the only one I can think of.)

If the Minister is moved it will be a real test of the sustainability of open data at the federal level. Between the Open Data Charter, the expertise and team built up within Treasury Board and, hopefully, some educational work Minister Clement has done within his own caucus, ideally there is enough momentum and infrastructure in place that the open data file will carry on. This is very much what I hope to be the case.

But much may depend on who is made President of the Treasury Board. If that role changes, open data advocates may find themselves busy not doing new things, but rather safeguarding gains already made.


Some thoughts on the relaunched data.gc.ca

Yesterday, I talked about what I thought was the real story that got missed in the fanfare surrounding the relaunch of data.gc.ca. Today I’ll talk about the new data.gc.ca itself.

Before I begin, there is an important disclaimer to share (to be open!). Earlier this year Treasury Board asked me to chair five public consultations across Canada to gather feedback on both its open data program and data.gc.ca in particular. As such, I solicited people’s suggestions on how data.gc.ca could be improved – and shared my own – but I was not involved in the creation of data.gc.ca. Indeed, the first time I saw the site was on Tuesday when it launched. My role was merely to gather feedback. For those curious, you can read the report I wrote here.

There is, I’m happy to say, much to commend about the new open data portal. Of course, aesthetically, it is much easier on the eye, but this is really trivial compared to a number of other changes.

The most important shift relates to the site’s desire to foster community. Users can now register with the site as well as rate and comment on data sets. There are also places like the Developers’ Corner, which contains documentation that potential users might find helpful, and a sort of app store where government agencies and citizens can post applications they have created. This shift mirrors the evolution of data.gov, data.gov.uk and DataBC, which started out as data repositories but then sought to foster and nurture a community of data users. The critical piece here is that simply creating the functionality will probably not be sufficient; in the US, UK and BC it has required dedicated community managers/engagers to help foster such a community. At present it is unclear if that exists behind the website at data.gc.ca.

The other two noteworthy improvements to the site are an improved search and the availability of APIs. While not perfect, the improved search is nonetheless helpful, as previously it was basically impossible to find anything on the site. Today a search for “border time” returns a border wait time data set as the top result. However, search for “border wait times” and “Biogeochemical exploration using Douglas-fir tree tops in the Mabel Lake area, southern British Columbia (NTS 82L09 and 10)” becomes the top hit, with the actual border wait time data set pushed down to fifth. That said, the search is still a vast improvement, and this alone could be a boon to policy wonks, researchers and developers who elect to make use of the site.

The introduction of APIs is another interesting development. For the uninitiated, an API (application programming interface) provides continuous access to updated data: rather than downloading a file, it is more like plugging into a socket that delivers data instead of electricity. The aforementioned border wait time data set is a fantastic example. It is less a “data set” than a “data stream,” providing the most recent border wait times, like what you would see on the big signs across the highway as you approach the border. Because it is provided through the open data site, Google Maps could, for example, scan this data set daily, understand how border wait times fluctuate and incorporate these delays into its predicted travel times. Indeed, it could even query the API in real time and tell you how long it will take to drive from Vancouver to Seattle, with border delays taken into account. The opportunity for developers and, equally intriguing, government employees and contractors, to build applications atop these APIs is, in my mind, quite exciting. It is a much, much cheaper and more flexible approach than how a lot of government software is currently built.
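To give a sense of how low the barrier is for developers, here is a minimal sketch of polling such an API with Python – the URL and JSON fields are placeholders, not the actual data.gc.ca endpoint or schema:

```python
# Sketch of polling a border wait time API. The URL and JSON fields
# below are placeholders, not the real data.gc.ca endpoint or schema.
import requests

API_URL = "https://example.gc.ca/api/border-wait-times"  # hypothetical

response = requests.get(API_URL, timeout=10)
response.raise_for_status()

for crossing in response.json():
    # Each record is assumed to carry a crossing name and current delay
    print(f'{crossing["crossing_name"]}: {crossing["delay_minutes"]} min')
```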

I also welcome the addition of the ability to search Access to Information (ATIP) request summaries. That said, I’d like there to be more than just the summaries; the actual responses would be nice, particularly given that ATIP requests likely represent information people have identified as important. In addition, the tool for exploring government expenditures is interesting, but weirdly notable because, as far as I can tell, none of the data displayed in the tool can be downloaded, meaning it is not very open.

Finally, I will briefly note that the license is another welcome change. For more on that I recommend checking out Teresa Scassa’s blog post on it. Contrary to my above disclaimer I have been more active on this side of things, and hope to have more to share on that another time.

I’m sure that, as I and others explore the site in the coming days, we will discover more to like and dislike about it, but it is a helpful step forward and another signal that open data is, slowly, being baked into the public service as a core service.


The Real News Story about the Relaunch of data.gc.ca

As many of my open data friends know, yesterday the government launched its new open data portal to great fanfare. While there is much to talk about there – something I will dive into tomorrow – that was not the only thing that happened yesterday.

Indeed, I did a lot of media yesterday between flights, and only after it was over did I notice that virtually all the questions focused on the relaunch of data.gc.ca. Yet it is increasingly clear to me that the much, much bigger story of the portal relaunch was the Prime Minister announcing that Canada would adopt the Open Data Charter.

In other words, Canada just announced that it is moving towards making all government data open by default. Moreover, it even made commitments to make specific “high value” data sets open in the next couple of years.

As an aside, I don’t think the Prime Minister’s office has ever mentioned open data before – as far as I can remember – so that was interesting in and of itself. But what is still more interesting is what the Prime Minister committed Canada to. The Open Data Charter commits the government to making data open by default as well as to four other principles:

  • Quality and Quantity
  • Useable by All
  • Releasing Data for Improved Governance
  • Releasing Data for Innovation

In some ways Canada has effectively agreed to implement the equivalent of the Presidential Executive Order on Open Data the White House announced last month (and that I analyzed in this blog post). Indeed, the charter is more aggressive than the executive order since it goes on to lay out the need to open up not just future data, but also current “high value” data sets. Included among these are data sets the Open Knowledge Foundation has been seeking to get opened via its open data census, as well as some data sets I and many others have argued should be made open, such as the company/business register. Other suggested high value data sets include data on crime, school performance, energy and environment pollution levels, energy consumption, government contracts, national budgets, health prescription data and many, many others. Also included on the list… postcodes – something we are presently struggling with here in Canada.

But the charter wasn’t all the government committed to. The final G8 communique contained many interesting tidbits that, again, highlighted commitments to open up data and adhere to international data schemas.

Among these were:

  • Corporate Registry Data: There was a very interesting section on “Transparency of companies and legal arrangements,” which is essentially about sharing data on who owns companies. As an advisory board member of OpenCorporates, this was music to my ears. However, the federal government already does this; the much, much bigger problem is with the provinces, like BC and Quebec, that make it difficult or expensive to access this data.
  • Extractive Industries Transparency Initiative: A commitment that “Canada will launch consultations with stakeholders across Canada with a view to developing an equivalent mandatory reporting regime for extractive companies within the next two years.” This is something I fought to get included in our OGP commitment two years ago but did not succeed at. Again, I’m thrilled to see this appear in the communique and look forward to the government’s action.
  • International Aid Transparency Initiative (IATI) and Busan Common Standard on Aid Transparency: A commitment to make aid data more transparent and downloadable by 2015. Indeed, with all the G8 countries agreeing to take this step, it may be possible to get greater transparency around who is spending what money, and where, on aid. This could help identify duplication as well as aid in assessments of effectiveness. Given how precious aid dollars are, this is a very welcome development. (h/t Michael Roberts of Acclar.org)

So lots of commitments, some on the vague side (the Open Data Charter) but some very explicit and precise. And that is the real story of yesterday: not that the country has a new open data portal, but that a lot more data is likely going to get put into that portal over the next 2-5 years. And a tsunami of data could end up in it over the next 10-25 years. Indeed, so much data that I suspect a portal will no longer be a logical way to share it all.

And therein lies the deeper business and government story in all this. As I mentioned in my analysis of the White House Executive Order that made open data the default, the big change here is in procurement. If implemented, this could have a dramatic impact on vendors and suppliers of equipment and computers that collect and store data for the government. Many vendors try to find ways to make their data difficult to export and share so as to lock the government into their solution. Again, if (and this is a big if) the charter is implemented, it will hopefully require a lot of companies to rethink what they offer to government. This is a potentially huge story as it could disrupt incumbents and lead either to big reductions in the costs of procurement (if done right) or to big increases and the establishment of the same, or new, impossible-to-work-with incumbents (if done incorrectly).

There is potentially a tremendous amount at stake in how the government handles the procurement side of all this, because whether it realizes it or not, it may have just completely shaken up the IT industry that serves it.


Postscript: One thing I found interesting about the G8 communique was how many times commitments about open data and open data sets occurred in sections that had nothing to do with open data. It will be interesting to see if that trend continues at the next G8 meeting. Indeed, I wouldn’t be surprised if a specific open data section disappears and instead these references simply become part of various issue-related commitments.


Some Nice Journalistic Data Visualization – Global’s Crude Awakening

Over at Global, David Skok and his team have created a very nice visualization of the 28,666 crude oil spills that have happened on Alberta pipelines over the last 37 years (that’s about two a day). Indeed, for good measure they’ve also visualized the additional 31,453 spills of “other” substances carried by Alberta pipelines (saltwater, liquid petroleum, etc.).

They’ve even created a look-up feature so you can explore the data geographically, by name, or by postal code. It is pretty in-depth.

Of course, I believe all this data should be open. Sadly, they had to get at it through a complicated Access to Information request that appears to have consumed a great deal of time and resources, and that would probably only have been possible for a media organization with the dedicated resources (legal and journalistic) and leverage to demand it. Had this data been open there would have still been a great deal of work to parse, understand and visualize it, but it would have helped lower the cost of development.

In fact, if you are curious about how they got the data – and the sad, sad story it involved – take a look at the fantastic piece they wrote about the creation of their oil spill website. This passage really stood out for me:

An initial Freedom of Information request – filed June 8, 2012, the day after the Sundre spill – asked Alberta Environment and Sustainable Resource Development for information on all reported spills from the oil and gas industry, from 2006 to 2012.

About a month later, Global News was quoted a fee of over $4,000 for this information. In discussions with the department, it turned out this high fee was because the department was unable to provide the information in an electronic format: Although it maintained a database of spills, the departmental process was to print out individual reports on paper, and to charge the requester for every page.

So the relevant government department has the data in a machine-readable form. It just chooses to only give it out on paper. Short of simply not releasing the data at all, it is hard to imagine a more obstructionist approach to preventing the public from accessing environmental data their tax dollars paid to collect and that is supposed to be in the public interest. You essentially have to look at thousands of pieces of paper and re-enter tens, if not hundreds of thousands, of data points into spreadsheets. This is a process designed to prevent you from learning anything and to frustrate potential users.

Let’s hope that when the time comes for the Global team to update this tool and webpage there will be open data they can download and access, so the task is a little easier.


Awesome Simple Open Data use case – Welcome Wagon for New Community Businesses

A few weeks ago I was at an event in Victoria, British Columbia, where people were discussing the possibilities, challenges and risks of open data. During the conversation, one of the participants talked about how they wanted an API for business license applications from the city.

This is a pretty unusual request – people have told me about their desire for business license data, especially at the provincial/state and national level, but also at the local level. However, they are usually happy with a data dump once a year or quarter, since they generally want to analyze the data for urban planning or business planning reasons. But an API – which would mean essentially constant access to the data and the opportunity to see changes to the database in real time (e.g. if a business registered or moved) – was different.

The reason? The individual – who was an entrepreneur and part of the local Business Improvement Area – wanted to be able to offer a “welcome wagon” to other new businesses in his community. If he knew when a business opened up he could reach out and help welcome them to the neighborhood. He thought it was always nice when shopkeepers knew one another, but he didn’t always know what was going on even a few blocks away because, well, he was often manning his own shop. I thought it was a deeply fascinating example of how open data could help foster community and is something I would never have imagined.
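As a thought experiment, once such an API existed the welcome wagon would only take a few lines of code. Everything in this sketch – the endpoint, the parameters, the field names – is hypothetical:

```python
# Hypothetical "welcome wagon" sketch: poll a business licence API once
# a day and flag licences issued since yesterday. The endpoint,
# parameters and field names are all invented for illustration.
import requests
from datetime import date, timedelta

API_URL = "https://example-city.ca/api/business-licences"  # hypothetical
yesterday = (date.today() - timedelta(days=1)).isoformat()

response = requests.get(API_URL, params={"issued_since": yesterday}, timeout=10)
response.raise_for_status()

for licence in response.json():
    print(f'New neighbour: {licence["business_name"]} at {licence["address"]}')
```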

Food for thought that I wanted to share.


The Value of Open Data – Don’t Measure Growth, Measure Destruction

Alexander Howard – who, in my mind, is the best guy covering the Gov 2.0 space – pinged me the other night to ask “What’s the best evidence of open data leading to economic outcomes that you’ve seen?”

I’d like to hack the question because – I suspect – many people will be looking to measure “economic outcomes” in ways that are too narrow to be helpful. For example, if you are wondering what the big companies are going to be that come out of the open data movement, and/or what big savings governments are going to find by sifting through the data, I think you are probably looking for the wrong indicators.

Why? Part of it is because the number of “big” examples is going to be small.

It’s not that I think there won’t be any. For example, several years ago I blogged about how FOIed (or, in Canada, ATIPed) data that should have been open helped find $3.2B in evaded tax revenues channeled through illegal charities. It’s just that this is probably not where the wins will initially take place.

This is in part because most data for which there was likely to be an obvious and large economic impact (e.g. spawning a big company or saving a government millions) will have already been analyzed or sold by governments before the open data movement came along. On the analysis side of the question: if you are very confident a data set could yield tens or hundreds of millions in savings… well… you were probably willing to pay SAS or some other analytics firm $30-100K to analyze it. And you were probably willing to pay SAP a couple of million (a year?) to set up the infrastructure to just gather the data.

Meanwhile, on the “private sector company” side of the equation – if that data had value, there were probably eager buyers. In Canada, for example, census data – which helps with planning where to locate stores or how to engage in marketing and advertising effectively – was sold because the private sector made it clear it was willing to pay to gain access to it. (Sadly, this was bad news for academics, non-profits and everybody else, for whom it should have been free, as it was in the US.)

So my point is that a great deal of the (again) obvious low-hanging fruit has probably been picked long before the open data movement showed up, because governments – or companies – were willing to invest some modest amounts to create the benefits that picking those fruit would yield.

This is not to say I don’t think there are diamonds in the rough out there – data sets that will reveal significant savings – but I doubt they will be obvious or easy finds. Nor do I think that billion-dollar companies are going to spring up around open data sets overnight, since – by definition – open data has low barriers to entry for any company that adds value to it. One should remember it took Red Hat two decades to become a billion-dollar company. Impressive, but it is still tiny compared to many of its rivals.

And that is my main point.

The real impact of open data will likely not be in the economic wealth it generates, but rather in its destructive power. I think the real impact of open data is going to be in the value it destroys and so in the capital it frees up to do other things. Much like Red Hat is a fraction of the size of Microsoft, open data is going to enable new players to disrupt established data players.

What do I mean by this?

Take SeeClickFix. Here is a company that, leveraging the Open311 standard, is able to provide many cities with a 311 solution that works pretty much out of the box. 20 years ago, this was a $10 million+ problem for a major city to solve, and wasn’t even something a small city could consider adopting – it was just prohibitively expensive. Today, SeeClickFix takes what was a 7 or 8 digit problem, and makes it a 5 or 6 digit problem. Indeed, I suspect SeeClickFix almost works better in a small to mid-sized government that doesn’t have complex work order software and so can just use SeeClickFix as a general solution. For this part of the market, it has crushed the cost out of implementing a solution.
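Part of why this works is that Open311’s GeoReport v2 spec exposes service requests as plain JSON over HTTP, so a city or a vendor can plug in with very little custom work. A rough sketch of reading open requests from an Open311 endpoint (the base URL here is a placeholder):

```python
# Sketch of reading open service requests from an Open311 GeoReport v2
# endpoint. The base URL is a placeholder; the fields follow the
# GeoReport v2 spec (service_name, status, requested_datetime).
import requests

BASE_URL = "https://city.example.org/open311/v2"  # hypothetical endpoint

resp = requests.get(f"{BASE_URL}/requests.json", params={"status": "open"}, timeout=10)
resp.raise_for_status()

for req in resp.json():
    print(req["service_name"], req["status"], req["requested_datetime"])
```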

Another example, and the one I’m most excited about: look at CKAN and Socrata. Most people believe these are open data portal solutions. That is a mistake. These are data management companies that happen to have made “sharing” (or “open”) a core design feature. You know who else does data management? SAP. What Socrata and CKAN offer is a way to store, access, share and engage with data previously gathered and held by companies like SAP, at a fraction of the cost. A SAP implementation is a 7 or 8 (or, god forbid, 9) digit problem. And many city IT managers complain that doing anything with data stored in SAP takes time and money. CKAN and Socrata may have only a fraction of the features, but they are dead simple to use, and make it dead simple to extract and share data. More importantly, they make these costly 7 and 8 digit problems potentially cheap 5 or 6 digit problems.
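CKAN is a good illustration of the “dead simple” point: everything in a portal is reachable through a small JSON API. A sketch of searching a portal for datasets using CKAN’s documented package_search action (the portal URL is a placeholder):

```python
# Sketch of querying a CKAN portal's Action API for datasets matching a
# keyword. package_search is part of CKAN's documented API; the portal
# URL is a placeholder.
import requests

PORTAL = "https://open.example.ca"  # hypothetical CKAN portal

resp = requests.get(
    f"{PORTAL}/api/3/action/package_search",
    params={"q": "border wait times", "rows": 5},
    timeout=10,
)
resp.raise_for_status()

for dataset in resp.json()["result"]["results"]:
    print(dataset["title"])
```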

On the analysis side, again, I do hope there will be big wins – but what I really think open data is going to do is lower the cost of creating lots of small wins – crazy numbers of tiny efficiencies. If SAP and SAS were about solving the 5 problems that could create tens of millions in operational savings for governments and companies, then Socrata, CKAN and the open data movement are about finding the 1000 problems for which you can save between $20,000 and $1M. For example, when you look at the work Michael Flowers is doing in NYC, his analytics team is going to transform New York City’s budget. They aren’t finding $30 million in operational savings, but they are generating a steady stream of very solid 6 to low 7 digit savings, project after project. (This is to say nothing of the lives they help save with their work on ambulances and fire safety inspections.) Cumulatively, over time, these savings are going to add up to a lot. But there probably isn’t going to be a big bang. Rather, we are getting into the long tail of savings. Lots and lots of small stuff… that is going to add up to a very big number, while no one is looking.

So when I look at open data, yes, I think there is economic value. Lots and lots of economic value. Hell, tons of it.

But it isn’t necessarily going to happen in a big bang, and it may take place in the creative destruction it fosters and so in the capital it frees up to spend on other things. That may make it harder to measure (I’m hoping some economist much smarter than me is going to tell me I’m wrong about that) but that’s what I think the change will look like.

Don’t look for the big bang, and don’t measure the growth in spending or new jobs. Rather let’s try to measure the destruction and cumulative impact of a thousand tiny wins. Cause that is where I think we’ll see it most.

Postscript: Apologies again for any typos – it’s late and I’m just desperate to get this out while it is burning in my brain. And thank you Alex for forcing me to put into words something I’ve been thinking about saying for months.