Category Archives: public policy

Canada's Action Plan on Open Government: A Review

The other day the Canadian Government published its Action Plan on Open Government, a high level document that both lays out the Government’s goals on this file and fulfills its pledge to create tangible goals as part of its participation in next week’s Open Government Partnership 2012 annual meeting in Brazil.

So what does the document say and what does it mean? Here is my take.

Take Away #1: Not a breakthrough document

There is much that is good in the government’s action plan – some of which I will highlight later. But for those hoping that Canada was going to get the Gov 2.0 bug and try to leapfrog leaders like the United States or the United Kingdom, this document will disappoint. By and large this document is not about transforming government – even at its most ambitious it appears to be much more about engaging in some medium sized experiments.

As a result the document emphasizes a number of things that the UK and US started doing several years ago, such as getting a license that adheres to international norms, posting government resource allocation and performance management information online in machine readable forms, or refining the open data portal.

What you don’t see are explicit references to trying to re-think how government leverages citizens’ experience and knowledge with a site like Challenge.gov, engages experts in innovative ways such as with Peer to Patent, or works with industries or provinces to generate personal open data such as the US has done with the Blue Button (for Healthcare) or the Green Button (for utilities).

Take Away #2: A Solid Foundation

This said, there is much in the document that is good. Specifically, in many areas, it does lay a solid foundation for some future successes. Probably the most important statements are the “foundational commitments” that appear on this page. Here are some key points:

Open Government Directive

In Year 1 of our Action Plan, we will confirm our policy direction for Open Government by issuing a new Directive on Open Government. The Directive will provide guidance to 106 federal departments and agencies on what they must do to maximize the availability of online information and data, identify the nature of information to be published, as well as the timing, formats, and standards that departments will be required to adopt… The clear goal of this Directive is to make Open Government and open information the ‘default’ approach.

This last sentence is nice to read. Of course the devil will be in the detail (and in the execution) but establishing a directive around open information could end up being as important (although admittedly not as powerful – an important point) as the establishment of Access to Information. Done right such a directive could vastly expand the range of documents made available to the public, something that should be very doable as more and more government documentation moves into digital formats.

For those complaining about the lack of ATI reform in the document, this directive and its creation will be worth further exploration. There is an enormous opportunity here to reset how government discloses information – and “the default to open” line creates a public standard that we can try to hold the government to account on.

And of course the real test for all this will come in years 2-3 when it comes time to disclose documents around something sensitive to the government… like, say, around the issue of the Northern Gateway Pipeline (or something akin to the Afghan Prisoner issue). In theory this directive should make all government research and assessments open. When this moment happens we’ll have a real test of the robustness of any such new directive.

Open Government License:

To support the Directive and reduce the administrative burden of managing multiple licensing regimes across the Government of Canada, we will issue a new universal Open Government License in Year 1 of our Action Plan with the goal of removing restrictions on the reuse of published Government of Canada information (data, info, websites, publications) and aligning with international best practices… The purpose of the new Open Government License will be to promote the re-use of federal information as widely as possible...

Full Disclosure: I have been pushing (in an unpaid capacity) for the government to reform its license and helping out in its discussions with other jurisdictions around how it can incorporate the best practices and most permissive language possible.

This is another important foundational piece. To be clear, this is not about an “open data” license. This is about creating a license for all government information and media. I suspect this appeals to this government in part because it ends the craziness of having lawyers across government constantly re-inventing new licenses and creating a complex set of licenses to manage. Let me be clear about what I think this means: This is functionally about neutering crown copyright. It’s about creating a licensing regime that makes very clear what the user’s rights are (which crown copyright does not do) and that is as permissive as possible about re-use (which crown copyright, because of its lack of clarity, is not). Achieving such a license is a critical step to doing many of the more ambitious open government and gov 2.0 activities that many of us would like to see happen.

Take Away #3: The Good and Bad Around Access to Information

For many, I think the biggest disappointment may be that the government has chosen not to try to update the Access to Information Act. It is true that this is what the Access to Information Commissioners from across the country recommended they do in an open letter (recommendation #2 in their letter). Opening up the act likely has a number of political risks – particularly for a government that has not always been forthcoming with documents (the Afghan detainee issue and F-35 contract both come to mind) – however, I again propose that it may be possible to achieve some of the objectives around improved access through the Open Government Directive.

What I think shouldn’t be overlooked, however, is the government’s “experiment” around modernizing the administration of Access to Information:

To improve service quality and ease of access for citizens, and to reduce processing costs for institutions, we will begin modernizing and centralizing the platforms supporting the administration of Access to Information (ATI). In Year 1, we will pilot online request and payment services for a number of departments allowing Canadians for the first time to submit and pay for ATI requests online with the goal of having this capability available to all departments as soon as feasible. In Years 2 and 3, we will make completed ATI request summaries searchable online, and we will focus on the design and implementation of a standardized, modern, ATI solution to be used by all federal departments and

These are welcome improvements. As one colleague – James McKinney – noted, the fact that you have to pay with a check means that only people with Canadian bank accounts can make ATIP requests. This largely means just Canadian citizens. This is ridiculous. Moreover, the process is slow and painful (who uses checks? the Brits are phasing them out by 2018 – good on em!). The use of checks creates a real barrier – particularly, I think, for young people.

Also, being able to search summaries of previous requests is a no-brainer.

Take Away #4: This is a document of experiments

As I mentioned earlier, outside the foundational commitments, the document reads less like a grand experiment and more like a series of small experiments.

Here the Virtual Library is another interesting commitment – certainly during the consultations the number one complaint was that people have a hard time finding what they are looking for on government websites. Sadly, even if you know the name of the document you want, it is still often hard to find. A virtual library is meant to address this concern – obviously it is all going to be in the implementation – but it is a response to a genuine expressed need.

Meanwhile the Advancing Recordkeeping in the Government of Canada and User-Centric Web Services feel like projects that were maybe already in the pipeline before Open Government came on the scene. They certainly do conform with the shared services and IT centralization announced by Treasury Board last year. They could be helpful but honestly, these will all be about execution since these types of projects can harmonize processes and save money, or they can become enormous boondoggles that everyone tries to work around since they don’t meet anyone’s requirements. If they do go the right way, I can definitely imagine how they might help the management of ATI requests (I have to imagine it would make it easier to track down a document).

I am deeply excited about the implementation of the International Aid Transparency Initiative (IATI). This is something I’ve campaigned for and urged the government to adopt, so it is great to see. I think these types of cross jurisdictional standards have a huge role to play in the open government movement, so joining one, figuring out what about the implementation works and doesn’t work, and assessing its impact, is important both for Open Government in general and for Canada, as it will let us learn lessons that, I hope, will become applicable in other areas as more of these types of standards emerge.

Conclusion:

I think it was always going to be a stretch to imagine Canada taking a leadership role in the Open Government space, at least at this point. Frankly, we have a lot of catching up to do, just to draw even with places like the US and the UK which have been working hard to keep experimenting with new ideas in the space. What is promising about the document is that it does present an opportunity for some foundational pieces to be put into play. The bad news is that real efforts to rethink government’s relationship with citizens, or even the role of the public servant within a digital government, have not been taken very far.

So… a C+?

 

Additional disclaimer: As many of my readers know, I sit on the Federal Government’s Open Government Advisory Panel. My role on this panel is to serve as a challenge function to the ideas that are presented to us. In this capacity I share with them the same information I share with you – I try to be candid about what I think works and doesn’t work around ideas they put forward. Interestingly, I did not see even a draft version of the Action Plan until it was posted to the website and was (obviously by inference) not involved in its creation. Just want to share all that to be, well, transparent, about where I’m coming from – which remains as a citizen who cares about these issues and wants to push governments to do more around gov 2.0 and open gov.

Also, sorry for the typos, but I’m sick and it is 1am. So I’m checking out. Will proofread again when I awake.

Here's a prediction: A Canadian F-35 will be shot down by a drone in 2035

One of the problems with living in a country like Canada is that certain people become the default person on certain issues. It’s a small place and the opportunity for specialization (and brand building) is small, so you can expect people to go back to the same well a fair bit on certain issues. I know, when it comes to Open Data, I can often be that well.

Yesterday’s article by Jack Granatstein – one of the country’s favourite commentators on (and cheerleaders of) all things military – is a great case in point. It’s also a wonderful example of an article that is not designed to answer deep questions, but merely to reassure readers not to question anything.

For those not in the know, Canada is in the midst of a scandal around the procurement of new fighter jets which, it turns out, the government not only chose to single source, but has been caught misleading the public about the costs despite repeated attempts by both the opposition and the media to ask for the full cost. Turns out the planes will cost twice as much as previously revealed, maybe more. For those interested in reading a case study in how not to do government procurement Andrew Coyne offers a good review in his two latest columns here and here. (Granatstein, in the past, has followed the government script, using the radically low-ball figure of $16 billion; it is now accepted to be $26 billion).

Here is why Jack Granatstein’s piece is so puzzling. The fact is, there really aren’t that many articles about whether the F-35 is the right plane or not. People are incensed about being radically misled about the cost and the sole source process – not that we chose the F-35. But Granatstein’s piece is all about assuring us that a) a lot of thought has gone into this choice and b) we shouldn’t really blame the military planners (nor apparently, the politicians). It is the public servants’ fault. So, some thoughts.

These are some disturbing and confusing conclusions. I have to say, it is very, very depressing to read someone as seasoned and knowledgeable as Granatstein write:

But the estimates of costs, and the spin that has so exercised the Auditor-General, the media and the Opposition, are shaped and massaged by the deputy minister, in effect DND’s chief financial officer, who advises the minister of national defence.

Errr… Really? I think they are shaped by them at the direction or with the approval of the Minister of Defence. I agree that the Minister and Cabinet probably are not up to speed on the latest in airframe technology and so probably aren’t hand picking the fighter plane. But you know what they are up to speed on? Spinning budgets and political messages to sell to the public. To somehow try to deflect the blame onto the public servants feels, well, like yet another death knell for the notion of ministerial accountability.

But even Granatstein’s love of the F-35 is hard to grasp. Apparently:

“we cannot see into the future, and we do not know what challenges we might face. Who foresaw Canadian fighters participating in Kosovo a dozen years ago? Who anticipated the Libyan campaign?”

I’m not sure I want to live and die on those examples. I mean in Libya alone our CF-18s were joined by F-16s, Rafale fighters, Mirage 2000s and Mirage 2000Ds, Tornados, Eurofighter Typhoons, and JAS 39C Gripens (are you bored yet?). Apparently there were at least 7 other choices that would have worked out okay for the mission. The Kosovo mission had an even wider assortment of planes. Apparently, this isn’t a choice of getting it “just right”, more like, “there are a lot of options that will work.”

But looking into the future there are some solid and strong predictions we can make:

1) Granatstein himself argued in 2010 that performing sovereignty patrols in the Arctic is one of the reasons we need to buy new planes. Here is a known future scenario. So frankly I’m surprised he’s bullish on the F-35s since the F-35s will not be able to operate in the Arctic for at least 5 years and may not for even longer. Given that, in that same article, Granatstein swallowed the now revealed to be bogus total cost of ownership figures provided by the Department of National Defence hook, line and sinker, you’d think he might be more skeptical about other facts. Apparently not.

2) We can’t predict the future. I agree. But I’m going to make a prediction anyway. If Canada fights an enemy with any of the sophistication that would require us to have the F-35 (say, a China in 25 years) I predict that an F-35 will get shot down by a pilotless drone in that conflict.

What makes drones so interesting is that because they don’t have to have pilots they can be smaller, faster and more maneuverable. Indeed in the 1970s UAVs were able to outmaneuver the best US pilots of the day. Moreover, the world of aviation may change very quickly in the coming years. Everyone will tell you a drone can’t beat a piloted plane. This is almost certainly true today (although a pilot-less drone almost shot down a MiG in 2002 in Iraq).

But drones may have two things going for them. First, if drones become cheaper to build and operate, and you don’t have to worry about losing the expensive pilot, you may be able to make up for competency with numbers. Imagining an F-35 defeating a single drone – such as the US Navy’s experimental X-47B – is easy. What about defeating a swarm of 5 of them that are working seamlessly together?

Second, much like in nature, survival frequently favours those who can reproduce frequently. The F-35 is expected to last Canada 30-35 years. Yes there will be upgrades and changes, but that is a slow evolutionary pace. In that time, I suspect we’ll see at least 5 (and likely a lot more) generations of drones. And why not? There are no pilots to retrain, just new lessons from the previous generation of drones to draw from, and new technological and geo-political realities to adapt to.

I’m not even beginning to argue that air-to-air combat capable drones are available today, but it isn’t unlikely that they could be available in 5-10 years. Of course, many air forces hate talking about this because, well, drones mean no more pilots and air forces are composed of… well… pilots. But it does suggest that Canada could buy a fighter that is much cheaper, would still enable us to participate in missions like Kosovo and Libya, without locking us into a 30-35 year commitment at the very moment the military aerospace industry is entering what is possibly the most disruptive period in its history.

It would seem that, at the very least, since we’ve been misled about pretty much everything involved in this project, asking these questions now is fair game.

(Oh, and as an aside, as we decide to pay somewhere between $26-44 Billion for fighter planes, our government cut the entire $5 million a year budget of the National Aboriginal Health Organization, which oversaw research and programs in areas like suicide prevention, tobacco cessation, housing and midwifery. While today Canada ranks 6th in the world in the UN’s Quality of Life index, it was calculated that in 2007 Canada’s First Nations population, had they been ranked as a separate group, would have ranked 63rd. Right above healthy countries like Belarus, Russia and Libya. Well at least now we’ll have less data about the problem, which means we won’t know to worry about it.)

 

Using BHAG's to Change Organizations: A Management, Open Data & Government Mashup

I’m a big believer in the ancillary benefits of a single big goal. Set a goal that has one clear objective, but as a result a bunch of other things have to change as well.

So one of my favourite Big Hairy Audacious Goals (BHAG) for an organization is to go paperless. I like the goal for all sorts of reasons. Much like a true BHAG it is clear, compelling, and has an obvious “finish line.” And while hard, it is achievable.

It has the benefit of potentially making the organization more “green” but, what I really like about it is that it requires a bunch of other steps to take place that should position the organization to become more efficient, effective and faster.

This is because paper is dumb technology. Among many, many other things, information on paper can’t be tracked, changes can’t be noted, pageviews can’t be recorded, data can’t be linked. It is hard to run a lean business when you’re using paper.

Getting rid of it often means you have to get a better handle on workflow and processes so they can be streamlined. It means rethinking the tools you use. It means getting rid of checks and into direct deposit, moving off letters and into email, getting your documents, agendas, meeting minutes, policies and god knows what else out of MS Word and onto wikis, shifting from printed product manuals to PDFs or better still, YouTube videos. These changes in turn require a rethinking of how your employees work together and the skills they require.

So what starts off as a simple goal – getting rid of paper – pretty soon requires some deep organizational change. Of course, the rallying cries of “more efficient processes!” or “better understanding our workflow” have pretty limited appeal and can be hard for everyone to wrap their heads around. However, “getting rid of paper”? It is simple, clear and, frankly, is something that everyone in the organization can probably contribute an idea towards achieving. And, it will achieve many of the less sexy but more important goals.

Turns out some governments may be thinking this way.

The State of Oklahoma has a nice website that talks about all their “green” initiatives. Of course, it just so happens that many of these initiatives – reducing travel, getting rid of paper, etc… – also happen to reduce costs and improve service but are easier to measure. I haven’t spoken with anyone at the State of Oklahoma to see if this is the real goal, but the website seems to acknowledge that it is:

OK.gov was created to improve access to government, reduce service-processing costs and enable state agencies to provide a higher quality of service to their constituents.

So for Oklahoma, going paperless becomes a way to get at some larger transformations. Nice BHAG. Of course, as with any good BHAG, you can track these changes and share them with your shareholders, stakeholders or… citizens.

And behold! The Oklahoma go green website invites different state agencies to report data on how their online services reduce paper consumption and/or carbon emissions. Data that they in turn track and share with the public via the state’s Socrata data portal. This graph shows how much agencies have reduced their paper output over the past four years.

Notice how some departments have no data – if I were an Oklahoma taxpayer, I’m not too sure I’d be thrilled with them. But take a step back. This is a wonderful example of how transparency and open data can help drive a government initiative. Not only can that data make it easier for the public to understand what has happened (and so be more readily engaged) but it can help cultivate a culture of accountability as well as – and perhaps more importantly – promote a culture of metrics that I believe will be critical for the future of government.

I often say to governments “be strategic about how you use some of the data you make open.” Don’t just share a bunch of stuff, use what you share to achieve policy or organizational objectives. This is a great example. It’s also potentially a great example of organizational change in a large and complex environment. Interesting stuff.

 

 

Next Generation Open Data: Personal Data Access

Background

This Monday I had the pleasure of being in Mexico City for the OECD’s High Level Meeting on e-Government. CIOs from a number of countries were present – including Australia, Canada, the UK and Mexico (among others). But what really got me going was a presentation by Chris Vein, the Deputy United States Chief Technology Officer for Government Innovation.

In his presentation he referenced work around the Blue Button and the Green Button – both efforts I was previously familiar with. But my conversation with Chris sparked several new ideas and reminded me of just how revolutionary these initiatives are.

For those unacquainted with them, here’s a brief summary:

The Blue Button Initiative emerged out of the US Department of Veterans Affairs (VA) with a simple goal – create a big blue button on their website that would enable a logged in user to download their health records. That way they can then share those records with whoever they wish, a new doctor, a hospital, an application or even just look at them themselves. The idea has been deemed so good, so important and so popular, that it is now being championed as an industry standard, something that not just the VA but all US health providers should do.

The Green Button Initiative is similar. I first read about it on ReadWriteWeb under the catchy and insightful title “Green Button” Open Data Just Created an App Market for 27M US Homes. Essentially the Green Button would enable users to download their energy consumption data from their utility. In the United States 9 utilities have already launched Green Buttons and an app ecosystem – applications that would enable people to monitor their energy use – is starting to emerge. Indeed Chris Vein talked about one app that enabled a user to see their thermostat in real time and then assess the financial and environmental implications of raising and/or lowering it. I personally see the Green Button evolving into an API that you can give others access to… but that is a detail.

Why it Matters

Colleagues like Nigel Shadbolt in the UK have talked a lot about enabling citizens to get their data out of websites like Facebook. And Google has its own very laudable Data Liberation Front run by great guy and werewolf expert, Brian Fitzpatrick. But what makes the Green Button and Blue Button initiatives unique and important is that they create a common industry standard for sharing consumer data. This creates incentives for third parties to develop applications and websites that can analyze this data because these applications will scale across jurisdictions. Hence the ReadWriteWeb article’s focus on a new market. It also makes the data easy to share. Healthcare records downloaded using the Blue Button are easily passed on to a new doctor or a new hospital since now people can design systems to consume these healthcare records. Most importantly, it gives individuals the option of sharing these records themselves so they don’t have to wait for lumbering bureaucracies.

This is a whole new type of open data. Open not to the public but to the individual to whom the data really belongs.

A Proposal

I would love to see the Blue Button and Green Button initiatives spread to companies and jurisdictions outside the United States. There is no reason why, for example, there cannot be Blue Buttons on provincial health care websites in Canada, or in the UK. Nor is there any reason why provincial energy corporations like BC Hydro or Bullfrog Energy (there’s a progressive company that would get this) couldn’t implement the Green Button. Doing so would enable Canadian software developers to create applications that could use this data, help citizens and tap into the US market. Conversely, Canadian citizens could tap into applications created in the US.

The opportunity here is huge. Not only could this revolutionize citizens’ access to their own health and energy consumption data, it would reduce the costs of sharing health care records, which in turn could potentially create savings for the industry at large.

Action

If you are a consumer, tell your local health agency, insurer and energy utility about this.

If you are an energy utility or Ministry of Health and are interested in this – please contact me.

Either way, I hope this is interesting. I believe there is huge potential in Personal Open Data, particularly around data currently held by crown corporations and in critical industries, like healthcare.

Citizen Surveillance and the Coming Challenge for Public Institutions

The other day I stumbled over this intriguing article which describes how a group of residents in Vancouver have started to surveil the police as they do their work in the Downtown Eastside, one of the poorest and toughest neighborhoods in Canada. The reason is simple. Many people – particularly those who are marginalized and most vulnerable – simply do not trust the police. The interview with the founder of Vancouver Cop Watch probably sums it up best:

“One of the complaints we have about District 2 is about how the Vancouver police were arresting people and taking them off to other areas and beating them up instead of taking them to a jail,” Allan told the Georgia Straight in a phone interview. “So what we do is that, when in the Downtown Eastside, whenever we see the police arresting someone, we follow behind them to make sure that the person makes it to the jail.”

In a world where many feel it is hard to hold government in general and police forces specifically accountable, finding alternative means of creating such accountability will be deeply alluring. And people no longer need the funding and coordination of organizations like Witness (which initially focused on getting video cameras into people’s hands in an effort to prevent human rights abuses). Digital video cameras and smart phones coupled with services like YouTube now provide this infrastructure for virtually nothing.

This is the surveillance society – predicted and written about by authors like David Brin – and it is driven as much by us, the citizens, as it is by government.

Vancouver Cop Watch is not the first example of this type of activity – I’ve read about people doing this across the United States. What is fascinating is watching the state try to resist and fail miserably. In the US the police have lost key battles in the courts. This came after the police arrested people for filming them even while on their own property. And despite the ruling people continue to be arrested for filming the police – a choice I suspect diminishes public confidence in the police and the state.

And it is not just the police getting filmed. Transit workers in Toronto have taken a beating of late as they are filmed asleep on the job. Similarly, a scared passenger filmed an Ottawa bus driver who was aggressive and swearing at an apologizing mentally ill passenger. A few years ago the public in Montreal was outraged as city crews were filmed repairing few potholes and taking long breaks.

The simple fact is, if you are a front line worker – in either the private, but especially, the public sector – there is a good chance that at some point in your career you’re going to be filmed. And even when you are not being filmed, more data is going to be collected about what you do and how you do it.

Part of this reality is that it is going to require a new level of training for front line workers. This will be particularly hard on the police, but they should expect more stories like this one.

I also suspect there will be two reactions to it. Some government services will clam up and try to become more opaque, fearing all public inquiry. Their citizens – armed with cameras – all become potential threats. Over time, it is hard not to imagine their legitimacy becoming further and further eroded (I’m thinking of you RCMP) as a video here, and an audio clip there, shape the public’s image of the organization. Others will realize that anecdotal and chance views of their operations represent a real risk to their image. Consequently they may strive to be more transparent – sharing more data about their operations and their processes – in an effort to provide the public with greater context. The goal here will be to provide a counterpoint to any unfortunate incidents, trying to make a single negative anecdotal data point that happened to be filmed part of a larger, more complex set of data points.

Obviously, I have strong suspicions regarding which strategy will work, and which one won’t, in a democratic society but am confident many will disagree.

Either way, these challenges are going to require adaptation strategies and it won’t be easy for public institutions averse to both negative publicity and transparency.

Data.gc.ca – Data Sets I found that are interesting, and some suggestions

Yesterday was the one year anniversary of the Canadian federal government’s open data portal. Over the past year government officials have been continuously adding to the portal, but as it isn’t particularly easy to browse data sets on the website, I’ve noticed a lot of people aren’t aware of what data is now available (self included!). Consequently, I want to encourage people to scan the available data sets and blog about ones that they think might be interesting to them personally, to others, or to communities of interests they may know.

Such an undertaking has been rendered MUCH easier thanks to the data.gc.ca administrators’ decision to publish a list of all the data sets available on the site. Turns out, there are 11680 data sets listed in this file. Of course, reviewing all this data took me much longer than I thought it would! (and to be clear, I didn’t explore each one in detail), but the process has been deeply interesting. Below are some thoughts, ideas and data sets that have come out of this exploration – I hope you’ll keep reading, and that it will be of interest to ordinary citizens, prospective data users and to managers of open government data portals.

A TagCloud of the Data Sets on data.gc.ca

Some Brief Thoughts on the Portal (and for others thinking about exploring the data)

Trying to review all the data sets on the portal is an enormous task and trying to do it has taught me some lessons about what works and doesn’t. The first is that, while the search function on the website is probably good if you have a keyword or a specific data set you are looking for, it is much easier to browse the data in Excel than on the website. What was particularly nice about this is that, in Excel, the data was often clustered by type. This made it easy to spot related data sets – a great example of this was when I found the data on “Building permits, residential values and number of units, by type of dwelling” and could immediately see there were about 12 other data sets on building permits available.

Another issue that became clear to me is the problem of how a data set is classified. For example, because of the way the data is structured (really as a report) the Canadian Dairy Exports data has a unique data file for every month and year (you can look at May 1988 as an example). That means each month is counted as a unique “data set” in the catalog. Of course, French and English versions are also counted as unique. This means that what I would consider to be a single data set “Canadian Dairy Exports Month Dairy Year from 1988 to present” actually counts as 398 data sets. This has two outcomes. First, it is hard to imagine anyone wants the data for just one month. This means a user looking for longitudinal data on this subject has to download 199 distinct data sets (very annoying). Why not just group it into one? Second, given that governments like to keep score about how many data sets they share – counting each month as a unique data set feels… unsportsmanlike. To be clear, this outcome is an artifact of how Agriculture Canada gathers and exports this data, but it is an example of the types of problems an open data catalog needs to come to grips with.
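To make the counting problem concrete, here is a minimal sketch of how per-month, per-language catalog entries could be collapsed into one logical series. The title format, the sample titles and the helper names (series_key, group_catalog) are all assumptions for illustration; the real catalog file may be laid out quite differently.

```python
import re
from collections import defaultdict

# Assumed title pattern: "<series name> - <Month> <Year> (<Language>)".
MONTH = (r"(January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
PERIOD = re.compile(rf"\b{MONTH}\s+\d{{4}}\b", re.IGNORECASE)
LANGUAGE = re.compile(r"\((English|French)\)", re.IGNORECASE)


def series_key(title):
    """Strip the month/year and language marker from a catalog title."""
    key = PERIOD.sub("", title)
    key = LANGUAGE.sub("", key)
    key = re.sub(r"\s+", " ", key)          # collapse repeated spaces
    return re.sub(r"[\s\-–,]+$", "", key).strip()


def group_catalog(titles):
    """Map each logical series name to the catalog entries behind it."""
    groups = defaultdict(list)
    for title in titles:
        groups[series_key(title)].append(title)
    return dict(groups)


# Hypothetical sample entries, not copied from the real catalog.
sample_titles = [
    "Canadian Dairy Exports - May 1988 (English)",
    "Canadian Dairy Exports - May 1988 (French)",
    "Canadian Dairy Exports - June 1988 (English)",
    "Canadian Dairy Exports - June 1988 (French)",
    "Civil Aircraft Register Database (English)",
]

grouped = group_catalog(sample_titles)
for name, members in grouped.items():
    print(f"{name}: {len(members)} catalog entries")
```

Counting the keys of the grouped dictionary, rather than the raw rows, would give a score that is closer to what most users would call a data set.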

Finally, many users – particularly, but not exclusively, developers – are looking for data that is up to date. Indeed, real time data is particularly sexy since its dynamic nature means you can do interesting things with it. Thus it was frustrating to occasionally find data sets that were no longer being collected. A great example of this was the Provincial allocation of corporate taxable income, by industry. This data set jumped out at me as I thought it could be quite interesting. Sadly, StatsCan stopped collecting data on this in 1987 so any visualization will have limited use today. This is not to say data like this should be pulled from the catalog, but it might be nice to distinguish between datasets that are being collected on an ongoing basis versus those that are no longer being updated.

Data Sets I found Interesting

Just quickly before I begin, some quick thoughts on my very unscientific methodology for identifying interesting data sets.

  • First, browsing the data sets really brought home to me how many will be interesting to different groups – we really are in the world of the long tail of public policy. As a result, there is lots of data that I think will be interesting to many, many people that is not on this list.
  • Second, I tried to not include too much of StatsCan’s data. StatsCan data already has a fairly well developed user base. And while I’m confident that base is going to get bigger still now that its data is free, I figure there are already a number of people who will be sharing/talking about it.
  • Finally, I’ve tried to identify some data sets that I think would make for good mashups or apps. This isn’t easy with federal government data sets since they tend to be more aggregate and high-level than say municipal data sets… but I’ve tried to tease out what I can. That said, I’m sure there is much, much more.

New GeoSpatial API!

So the first data set is a little bit of a cheat since it is not on the open data portal, but I was emailed about it yesterday and it is so damn exciting, I’ve got to share it. It is a recently released public BETA of a new RESTful API from the very cool people at GeoGratis that provides a consolidated access point to several repositories of geospatial data and information products including GeoGratis, GeoPub and Mirage. (huge thank you to the GeoGratis team for sending this to me).

Documentation can be found here (and in French here) and a sample search client that demonstrates some of its functionality and how to interact with the API can be found here. Formats include ATOM, HTML Fragment, CSV, RSS, JSON, and KML. (So you can see results – for example – in Google Earth by using the KML format; example here.)

I’m also told that these fine folks have been working on a geolocation service, so you can do sexy things like search by place name, by NTS map or by the first three characters of a postal code. Documentation will be posted here in English and French. Super geeks may notice that there is a field in the JSON called CGNDBkey. I’m also told you can use this key to select an individual place name according to the Geographical Names Board of Canada. Finally, you can also search all their metadata through search engines like Google (here is a sample search for gold they sent me).

All data is currently licensed under GeoGratis.

The National Pollutant Release Inventory

Description: The National Pollutant Release Inventory (NPRI) is Canada’s public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling.

Notes: This is the same data set (but updated) that we used to create emitter.ca. I frankly feel like the opportunities around this data set, for environmentalists, investors (concerned about regulatory and lawsuit risks), the real estate industry, and others, are enormous. The public could be very interested in this.

Greenhouse Gas Emissions Reporting Program

Description: The Greenhouse Gas Emissions Reporting Program (GHGRP) is Canada’s legislated, publicly-accessible inventory of facility-reported greenhouse gas (GHG) data and information.

Notes: What is interesting here is that while it doesn’t have lat/longs, it does have facility names and addresses. That means you should be able to cross reference it with the NPRI (which does have lat/longs) to be able to plot where the big greenhouse gas emitters are on a map. I think the same people as the NPRI might be interested in this data.
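As a rough illustration of that cross-referencing idea, here is a minimal sketch that borrows coordinates from NPRI-style records by matching on a normalized facility name and address. The field names (facility, address, lat, lon) and the sample rows are assumptions made up for the example; the real files will use their own column names and will need fuzzier matching than this.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def attach_coordinates(ghg_rows, npri_rows):
    """Copy lat/lon onto GHG rows whose name and address match an NPRI row."""
    lookup = {
        (normalize(r["facility"]), normalize(r["address"])): (r["lat"], r["lon"])
        for r in npri_rows
    }
    matched, unmatched = [], []
    for row in ghg_rows:
        key = (normalize(row["facility"]), normalize(row["address"]))
        if key in lookup:
            lat, lon = lookup[key]
            matched.append({**row, "lat": lat, "lon": lon})
        else:
            unmatched.append(row)  # left for manual or fuzzy matching
    return matched, unmatched


# Hypothetical records for illustration only.
npri_rows = [
    {"facility": "EXAMPLE SMELTER LTD", "address": "100 Main St, Anytown",
     "lat": 50.0, "lon": -100.0},
]
ghg_rows = [
    {"facility": "Example Smelter Ltd.", "address": "100 Main St., Anytown"},
    {"facility": "Other Plant Inc.", "address": "5 Side Rd, Elsewhere"},
]

matched, unmatched = attach_coordinates(ghg_rows, npri_rows)
print(len(matched), "matched;", len(unmatched), "left to resolve by hand")
```

Exact matching after normalization will miss facilities whose names or addresses are recorded differently in the two inventories, so the unmatched list is as important as the matched one.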

The Canadian Ice Thickness Program

Description: The Ice Thickness program dataset documents the thickness of ice on the ocean. Measurements begin when the ice is safe to walk on and continue until it is no longer safe to do so. This data can help gauge the impact of global warming and is relevant to shipping data in the north of Canada.

Notes: Students interested in global warming… this could make for some fun visualization.

Argo: Canadian Tracked Data

Description: Argo Data documents some of the approximately 3000 profiling floats that were deployed around the world. Once at sea, the float sinks to a preprogrammed target depth of 2000 meters for a preprogrammed period of time. It then floats to the surface, taking temperature and salinity values during its ascent at set depths. The Canadian Tracked Argo Data describes the Argo programme in Canada and provides data and information about Canadian floats.

Notes: Okay, so I can think of no use for this data, but I just thought it was so awesome that people are doing this that I totally geeked out.

Civil Aircraft Register Database

Description: Civil Aircraft Register Database – this file contains the current mark, aircraft and owner information of all Canadian civil registered aircraft.

Notes: Here I really think there could be a geeky app. Just a simple app that you can type an aircraft’s number into and it will tell you the owner and details about the plane. I actually think the government could do a lot of work with this data. If regulatory and maintenance data were made available as well – then you’d have a powerful app that would tell you a lot about the planes you fly in. At a minimum it would be of interest to flight enthusiasts.

Real Time Hydrometric Data Tool

Description: Real Time Hydrometric Data Tool – this site provides public access to real-time hydrometric (water level and streamflow) data collected at over 1700 locations in Canada. These data are collected under a national program jointly administered under federal-provincial and federal-territorial cost-sharing agreements. It is through partnerships that the Water Survey of Canada program has built a standardized and credible environmental information base for Canada. This dataset contains both current and historical datasets. The current month can be viewed in an HTML table, and historical data can be downloaded in CSV format.

Notes: So ripe for an API! What is cool is that the people at Environment Canada have integrated it into Google Maps. I could imagine fly fishermen and communities at risk of flooding being interested in this data set.

Access to information data sets

Description: 2006-2010 Access to Information and Privacy Statistics (With the previous years here, here and here.) is a compilation of statistical information about access to information and privacy submitted by government institutions subject to the Access to Information Act and the Privacy Act for 2006-2010.

Notes: I’d love to crunch this stuff again and see who’s naughty and nice in the ATIP world…

Poultry and Forestry data

No links, BECAUSE THERE IS SO MUCH OF IT. Anyone interested in the Poultry or Forestry industry will find lots of data… obviously this stuff is useful to people who analyze these industries but I suspect there are a couple of “A” university level papers hidden in that data set as well.

Building Permits

There is tons of data on building permits and construction. This is actually one of the benefits of looking at the data in a spreadsheet: it is easy to see other related data sets.

StatsCan

It really is amazing how much Statistics Canada data there is. Even reviewing something like the supply and demand of natural gas liquids got me thinking about the wealth of information trapped in there. One thing I do hope StatsCan starts to do is geolocate its data whenever possible.

Crime Data

As this has been in the news I couldn’t help but include it. It’s nice that any citizen can look at the crime data direct from StatsCan to see how our crime rate is falling (which is why we should build more expensive prisons): Crime statistics, by detailed offences. Of course unreported crime, which we all know is climbing at 3000% a year, is not included in these stats.

Legal Aid Applications

Legal aid applications, by status and type of matter. This was interesting to me since, here in BC there is much talk about funding for the Justice system and yet, the number of legal aid applications has remained more or less flat over the past 5 years.

National Broadband Coverage data

Description: The National Broadband Coverage Data represents broadband coverage information, by technology, for existing broadband service providers as of January 2012. Coverage information for Broadband Canada Program projects is included for all completed projects. Coverage information is aggregated over a grid of hexagons, which are each 6 km across. The estimated range of unserved / underserved population within each hexagon location is included.

Notes: What’s nice is that there is lat/long data attached to all this, so mapping it, and potentially creating a heat map is possible. I’m certain the people at OpenMedia might appreciate such a map.

Census Consolidated Subdivision

Description: Census Consolidated Subdivision Cartographic Boundary Files portrays the geographic limits used for the 2006 census dissemination. The Census Consolidated Subdivision Boundary Files contain the boundaries of all 2,341 census consolidated subdivisions.

Notes: Obviously this one is on every data geek’s radar, but just in case you’ve been asleep for the past 5 months, I wanted to highlight it.

Non-Emergency Surgeries, distribution of waiting times

Description: Non-emergency surgeries, distribution of waiting times, household population aged 15 and over, Canada, provinces and territories

Notes: Would love to see this at the hospital and clinic level!

Border Wait Times

Description: Estimates Border Wait Times (commercial and travellers flow) for the top 22 Canada Border Services Agency land border crossings.

Notes: Here I really think there is an app that could be made. At the very least there is something that could tell you historical averages and ideally, could be integrated into Google and Bing maps when calculating trip times… I can also imagine a lot of companies that export goods to the US are concerned about this issue and would be interested in better data to predict the costs and times of shipping goods. Big potential here.
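The historical-averages idea can be sketched in a few lines. This is only an illustration under assumptions: the observation format (crossing name, hour of day, wait in minutes) and the sample values are invented, and the published wait-time feed may be structured differently.

```python
from collections import defaultdict
from statistics import mean


def historical_averages(observations):
    """Average wait in minutes for each (crossing, hour of day) pair."""
    buckets = defaultdict(list)
    for crossing, hour, minutes in observations:
        buckets[(crossing, hour)].append(minutes)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical observations: (crossing, hour of day, wait in minutes).
observations = [
    ("Crossing A", 8, 20),
    ("Crossing A", 8, 40),
    ("Crossing A", 17, 10),
    ("Crossing B", 8, 5),
]

averages = historical_averages(observations)
for (crossing, hour), minutes in sorted(averages.items()):
    print(f"{crossing} at {hour:02d}:00 averages {minutes} minutes")
```

A trip planner or a shipper could look up the average for the crossing and hour they care about and add it to an estimated travel time.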

Okay, that’s my list. Hope it inspires you to take a look yourself, or play with some of the data listed above!

Access to Information, Open Data and the Problem with Convergence

In response to my post yesterday one reader sent me a very thoughtful commentary that included this line at the end:

“Rather than compare [Freedom of Information] FOI legislation and Open Gov Data as if it’s “one or the other”, do you think there’s a way of talking about how the two might converge?”

One small detail:

So before diving in to the meat let me start by saying I don’t believe anything in yesterday’s post claimed open data was better or worse than Freedom of Information (FOI, often referred to in Canada as Access to Information or ATI). Seeing FOI and open data as competing suggests they are similar tools. While they have similar goals – improving access – and there may be some overlap, I increasingly see them as fundamentally different tools. This is also why I don’t see an opportunity for convergence in the short term (more on that below). I do, however, believe open data and FOI processes can be complementary. Indeed, I’m hopeful open data can alleviate some of the burden placed on FOI systems, which are often slow. Indeed, in Canada, government departments regularly violate rules around disclosure deadlines. If anything, this complementary nature was the implicit point in yesterday’s post (which I could have made more explicit).

The Problem with Convergence:

As mentioned above, the overarching goals of open data and FOI systems are similar – to enable citizens to access government information – but the two initiatives are grounded in fundamentally different approaches to dealing with government information. From my view FOI has become a system of case by case review while open data is seeking to engage in an approach of “pre-clearance.”

Part of this has to do with what each system is reacting to. FOI was born, in part, out of a reaction to scandals in the mid 20th century which fostered public support for a right to access government information.

FOI has become a powerful tool for accessing government information. But the infrastructure created to manage it has also had some perverse effects. In some ways FOI has, paradoxically, made it harder to gain access to government information. I remember talking to a group of retired reporters who talked about how it was easier to gain access to documents in a pre-FOI era since there were no guidelines and many public servants saw most documents as “public” anyways. The rules around disclosure today – thanks in part to FOI regimes – mean that governments can make closed the “default” setting for government information. In the United States the Ashcroft Memo serves as an excellent example of this problem. In this case the FOI legislation actually becomes a tool that helps governments withhold documents, rather than enable citizens to gain legitimate access.

But the bigger problem is that the process by which access to information requests are fulfilled is itself burdensome. While relevant and necessary for some types of information it is often overkill for others. And this is the niche that open data seeks to fill.

Let me pause to stress, I don’t share the above to disparage FOI. Quite the opposite. It is a critical and important tool and I’m not advocating for its end. Nor am I arguing that open data can – in the short or even medium term – solve the problems raised above.

This is why, over the short term, open data will remain a niche solution – a fact linked to its origins. Like FOI, open data has its roots in government transparency. However, it also evolved out of efforts to tear down antiquated intellectual property regimes to facilitate the sharing of data/information (particularly between organizations and governments). Thus the emphasis was not on case by case review of documents, but rather on clearing rights to categories of information, both created and to be created in the future. In other words, this is about granting access to the outputs of a system, not access to individual documents.

Another way of thinking about this is that open data initiatives seek to leverage the benefits of FOI while jettisoning its burdensome process. If a category of information can be pre-cleared in advance and in perpetuity for privacy, security and IP concerns then FOI processes – essential for individual documents and analysis – become unnecessary and one can reduce the transaction costs to citizens wishing to access the information.

Maybe, in the future, the scope of these open data initiatives could become broader, and I hope they will. Indeed, there is ample evidence to suggest that technology could be used to pre-clear or assess the sensitivity of any government document. An algorithm that assesses a mixture of who the author is, the network of people who review it and a scan of the words would probably allow one to ascertain if a document could be released to an ATIP request in seconds, rather than weeks. It could at least give a risk profile and/or strip out privacy related information. These types of reforms would be much more disruptive (in the positive sense) to FOI legislation than open data.
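To give a feel for what such a risk profile might look like, here is a deliberately toy sketch that mixes the three signals mentioned: the author, the reviewers and a scan of the words. Everything in it is hypothetical, including the role names, the term list and the weights; it is not a description of any real government system, and a real tool would need far more careful design.

```python
import re

# Hypothetical roles and terms treated as signs of sensitivity.
HIGH_RISK_ROLES = {"minister's office", "legal counsel", "security"}
SENSITIVE_TERMS = {"cabinet", "confidential", "secret", "medical"}


def risk_score(author_role, reviewer_roles, text):
    """Return a 0-1 score; higher means review by a human before release."""
    score = 0.0
    if author_role in HIGH_RISK_ROLES:
        score += 0.4                                  # who wrote it
    if reviewer_roles:
        risky = sum(role in HIGH_RISK_ROLES for role in reviewer_roles)
        score += 0.3 * risky / len(reviewer_roles)    # who reviewed it
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = words & SENSITIVE_TERMS
    score += min(0.3, 0.1 * len(hits))                # what it says
    return round(score, 2)


low = risk_score("analyst", ["analyst"],
                 "Monthly dairy export totals by province")
high = risk_score("minister's office", ["analyst", "legal counsel"],
                  "Confidential advice to Cabinet")
print(low, high)
```

Documents scoring near zero could be candidates for quick release, while anything above a chosen threshold would still go through the usual case by case review.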

But all that said, just getting the current focus of open data initiatives right would be a big accomplishment. And, even if such initiatives could be expanded, there are limits. I am not so naive as to believe that government can be entirely open. Nor am I sure that would be an entirely good outcome. When trying to foster new ideas or assess how to balance competing interests in society, a private place to initiate and play with ideas may be essential. And despite the ruminations above, the limits of government IT systems mean there will remain a lot of information – particularly non-data information like reports and analysis – that we won’t be able to “pre-clear” for sharing and downloading. Consequently an FOI regime – or something analogous – will continue to be necessary.

So rather than replace or converge with FOI systems, I hope open data will, for the short to medium term, actually divert information out of the FOI system, not because it competes, but because it offers a simpler and more efficient means of sharing (for both government and citizens) certain types of information. That said, open data initiatives offer none of the protections or rights of FOI and so this legislation will continue to serve as the fail safe mechanism should a government choose to stop sharing data. Moreover, FOI will continue to be a necessary tool for documents and information that – for all sorts of reasons (privacy, security, cabinet confidence, etc…) – cannot fall under the rubric of an open data initiative. So convergence… not for now. But co-existence feels both likely and helpful for both.

Calculating the Value of Canada’s Open Data Portal: A Mini-Case Study

Okay, let’s geek out on some open data portal stats from data.gc.ca. I’ve got three parts to this review: First, a look at how to assess the value of data.gc.ca. Second, a look at what are the most downloaded data sets. And third, some interesting data about who is visiting the portal.

Before we dive in, a thank you to Jonathan C, who sent me some of this data the other day after requesting it from Treasury Board, the ministry within the Canadian Government that manages the government’s open data portal.

1. Assessing the Value of data.gc.ca

Here is the first thing that struck me. Many governments talk about how they struggle to find methodologies to measure the value of open data portals/initiatives. Often these assessments focus on things like number of apps created or downloaded. Sometimes (and incorrectly in my mind) pageviews or downloads are used. Occasionally it veers into things like mashups or websites.

However, one fairly tangible value of open data portals is that they cheaply resolve some access to information requests –  a point I’ve tried to make before. At the very minimum they give scale to some requests that previously would have been handled by slow and expensive access to information/freedom of information processes.

Let me share some numbers to explain what I mean.

The Canadian Government is, I believe, only obligated to fulfill requests that originate within Canada. Drawing from the information in the charts later in this post, let’s assume there were a total of 2200 downloads in January and that 33% of these originated from Canada – so a total of 726 “Canadian” downloads. Thanks to some earlier research, I happen to know that the office of the information commissioner has assessed that the average cost of fulfilling an access to information request in 2009-2010 was $1,332.21.

So in a world without an open data portal the hypothetical cost of fulfilling these “Canadian” downloads as formal access to information requests would have been $967,184.46 in January alone. Even if I’m off by 50%, then the cost – again, just for January – would still sit at $483,592.23. Assuming this is a safe monthly average, then over the course of a year the cost savings could be around $11,606,213.52 or $5,803,106.76 – depending on how conservative you’d want to be about the assumptions.
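For anyone who wants to rerun these numbers with their own assumptions, here is a small sketch of the arithmetic. The inputs come from the figures above (2200 downloads, 726 of them "Canadian", which corresponds to a 33% share, and $1,332.21 per request); the variable names are mine, and the 50% haircut is the same rough sensitivity check, not a measured value.

```python
from decimal import Decimal

downloads_in_january = 2200
canadian_share = Decimal("0.33")        # assumed share of downloads from Canada
cost_per_request = Decimal("1332.21")   # average cost per request, 2009-2010

canadian_downloads = int(downloads_in_january * canadian_share)  # 726
monthly_cost = canadian_downloads * cost_per_request
monthly_cost_halved = monthly_cost / 2  # "even if I'm off by 50%"

annual_cost = monthly_cost * 12
annual_cost_halved = monthly_cost_halved * 12

print(f"Canadian downloads: {canadian_downloads}")
print(f"Monthly: ${monthly_cost:,.2f} (halved: ${monthly_cost_halved:,.2f})")
print(f"Annual:  ${annual_cost:,.2f} (halved: ${annual_cost_halved:,.2f})")
```

Changing canadian_share or cost_per_request shows quickly how sensitive the estimate is to each assumption.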

Of course, I’m well aware that not every one of these downloads would have been an information request in a pre-portal world – that process is simply too burdensome. You have to pay a fee, and it has to be by check (who pays for anything by check any more???) so many of these users would simply have abandoned their search for government information. So some of these savings would not have been realized. But that doesn’t mean there isn’t value. Instead the open data portal is able to more cheaply reveal latent demand for data. In addition, only a fraction of the government’s data is presently on the portal – so all these numbers could get bigger still. And finally I’m only assessing downloads that originated inside Canada in these estimates.

So I’m not claiming that we have arrived at a holistic view of how to assess the value of open data portals – but even the narrow scope of assessment I outline above generates financial savings that are not trivial, and this is to say nothing of the value generated by those who downloaded the data – something that is much harder to measure – or of the value of increased access to Canadians and others.

2. Most Downloaded Datasets at data.gc.ca

This is interesting because… well… it’s just always interesting to see what people gravitate towards. But check this out…

Data sets like the Anthropogenic disturbance footprint within boreal caribou ranges across Canada may not seem interesting, but the ground breaking agreement between the Forest Products Association of Canada and a coalition of Environmental Non-Profits – known as the Canadian Boreal Forest Agreement (CBFA) – uses this data set a lot to assess where the endangered woodland caribou are most at risk. There is no app, but the data is critical in both protecting this species and in finding a way to sustainably harvest wood in Canada. (note, I worked as an adviser on the CBFA so am a) a big fan and b) not making this stuff up).

It is fascinating that immigration and visa data tops the list. But it really shouldn’t be a surprise. We are of course, a nation of immigrants. I’m sure that immigration and visa advisers, to say nothing of think tanks, municipal governments, social service non-profits and English as a second language schools are all very keen on using this data to help them understand how they should be shaping their services and policies to target immigrant communities.

There is, of course, weather. The original open government data set. We made this data open for 100s of years. So useful and so important you had to make it open.

And, nice to see Sales of fuel used for road motor vehicles, by province and territory. If you wanted to figure out the carbon footprint of vehicles, by province, I suspect this is a nice dataset to get. It is probably also useful for computing gas prices as it might let you get a handle on demand. Economists probably like this data set.

All this to say, I’m less skeptical than before about the data sets in data.gc.ca. With the exception of weather, these data sets aren’t likely useful to software developers – the group I tend to hear most from – but then I’ve always posited that apps were only going to be a tiny part of the open data ecosystem. Analysis is king for open data and there do appear to be people out there who are finding data of value for analyses they want to make. That’s a great outcome.

Here are the tables outlining the most popular data sets since launch and (roughly) in February.

Top 10 most downloaded datasets, since launch

# | DATASET | DEPARTMENT | DOWNLOADS
1 | Permanent Resident Applications Processed Abroad and Processing Times (English) | Citizenship and Immigration Canada | 4730
2 | Permanent Resident Summary by Mission (English) | Citizenship and Immigration Canada | 1733
3 | Overseas Permanent Resident Inventory (English) | Citizenship and Immigration Canada | 1558
4 | Canada – Permanent residents by category (English) | Citizenship and Immigration Canada | 1261
5 | Permanent Resident Applicants Awaiting a Decision (English) | Citizenship and Immigration Canada | 873
6 | Meteorological Service of Canada (MSC) – City Page Weather | Environment Canada | 852
7 | Meteorological Service of Canada (MSC) – Weather Element Forecasts | Environment Canada | 851
8 | Permanent Resident Visa Applications Received Abroad – English Version | Citizenship and Immigration Canada | 800
9 | Water Quality Indicators – Reports, Maps, Charts and Data | Environment Canada | 697
10 | Canada – Permanent and Temporary Residents – English version | Citizenship and Immigration Canada | 625

Top 10 most downloaded datasets, for past 30 days

# | DATASET | DEPARTMENT | DOWNLOADS
1 | Permanent Resident Applications Processed Abroad and Processing Times (English) | Citizenship and Immigration Canada | 481
2 | Sales of commodities of large retailers – English version | Statistics Canada | 247
3 | Permanent Resident Summary by Mission – English Version | Citizenship and Immigration Canada | 207
4 | CIC Operational Network at a Glance – English Version | Citizenship and Immigration Canada | 163
5 | Gross domestic product at basic prices, communications, transportation and trade – English version | Statistics Canada | 159
6 | Anthropogenic disturbance footprint within boreal caribou ranges across Canada – As interpreted from 2008-2010 Landsat satellite imagery | Environment Canada | 102
7 | Canada – Permanent residents by category – English version | Citizenship and Immigration Canada | 98
8 | Meteorological Service of Canada (MSC) – City Page Weather | Environment Canada | 61
9 | Sales of fuel used for road motor vehicles, by province and territory – English version | Statistics Canada | 52
10 | Government of Canada Core Subject Thesaurus – English Version | Library and Archives Canada | 51

3. Visitor locations

So this is just plain fun. There is not a ton to derive from this – especially as IP addresses can, occasionally, be misleading. In addition, this is page view data, not download data. But what is fascinating is that computers in Canada are not the top source of traffic at data.gc.ca. Indeed, Canada’s share of the traffic is actually quite low. In fact, in January, just taking into account the countries in the chart (and not the long tail of visitors) Canada accounted for only 16% of the traffic to the site. That said, I suspect that downloads were significantly higher from Canadian visitors – although I have no hard evidence of this, just a hypothesis.

[Chart: visits to data.gc.ca by country]

• Total visits since launch: 380,276 user sessions

Joining the Canadian Government's Advisory Panel on Open Government

Some people have already noticed, so I wanted to share the news here as well. Yesterday, the Canadian Government announced the Advisory Panel on Open Government, which I was asked to join.

The purpose of the panel is to serve as a challenge function to the government as it develops its ideas and policies. I see my role as that of pushing the government on where I believe they could be doing more. Obviously, I’ve always been interested in people’s thoughts, hopes and concerns around Open Government (and many of you have been keen to share them with me). My hope is that this will provide another way to inject these ideas into the government’s planning process.

As I make suggestions and recommendations I will attempt to blog about them here. There were, indeed, a number of suggestions I made yesterday during the Advisory Panel’s first meeting, and I hope to write them up as I think they will be helpful to other governments as well.

For those curious about who else is on the panel, it is chaired by Minister Clement and I’m joined by a number of other excellent “outside of government” voices (full list of names and bios here as well). In the list below I’ve tried to include twitter handles wherever possible:

Bernard Courtois, Past President & CEO, Information Technology Association of Canada (ITAC)

Robert Herjavec, Founder and Chief Executive Officer, The Herjavec Group

Alexander B. Howard, Government 2.0 Correspondent, O’Reilly Media

Thomas ‘Tom’ Jenkins, Head of the Canadian Digital Media Network and Executive Chairman and Chief Strategy Officer, OpenText Corporation

Vivek Kundra, Executive Vice President of Emerging Markets, Salesforce.com.

Herb Lainchbury, Chief Technology Officer, MD Databank Corp.

Colin McKay, Public Policy Manager (Canada), Google

Toby Mendel, Executive Director, Centre for Law and Democracy

Alex Miller, President and Founder, ESRI Canada

Marie-Lucie Morin, Executive Director for Canada, Ireland and the Caribbean, The World Bank

Dr. Rufus Pollock, Co-Founder and Director, Open Knowledge Foundation

Dr. Teresa Scassa, Vice-Dean of Research and Professor of Law, University of Ottawa

As an off topic aside, the first meeting took place using Cisco’s telepresence technology. This essentially is fancy videoconferencing where all the rooms are virtually identical so that it feels like people are sitting around the same table. It was the first time I’ve tried using it and I was duly impressed. It did mean that the government didn’t have to fly us in from all around the world to meet face to face – a real cost savings and obviously, good for emissions as well.

 

More on Google Transit and how it is Reshaping a Public Service

Some of you know I’ve written a fair bit on Google Transit and how it is reshaping public transit – this blog post in particular comes to mind. For more reading I encourage you to check out the Xconomy article “Google Transit: How (and Why) the Search Giant is Remapping Public Transportation”, as it provides a lot of good detail on what is going on in this space.

Two things about this article:

First, it really is a story about how the secret sauce for success is combining open data with a common standard across jurisdictions. The fact that the General Transit Feed Specification (a structured way of sharing transit schedules) is used by over 400 transit authorities around the world has helped spur a ton of other innovations.
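To give a feel for why a common standard matters, here is a minimal sketch of reading schedule data laid out the way GTFS lays it out. A GTFS feed is a set of CSV text files, including `stops.txt` and `stop_times.txt`, and the column names used below come from the specification. The sample rows, the stop names and the helper functions `read_table` and `departures_from` are made up for illustration; they are not taken from any real agency's feed or from the article.

```python
import csv
import io

# Hypothetical sample rows, laid out like GTFS stops.txt and stop_times.txt.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St Station,49.2730,-123.1003
S2,Waterfront Station,49.2858,-123.1116
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:01:00,S1,1
T1,08:10:00,08:11:00,S2,2
T2,08:30:00,08:31:00,S1,1
T2,08:40:00,08:41:00,S2,2
"""


def read_table(text):
    """Parse one GTFS-style CSV table into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def departures_from(stop_name, stops, stop_times):
    """Return the sorted departure times for every stop matching stop_name."""
    stop_ids = {row["stop_id"] for row in stops if row["stop_name"] == stop_name}
    times = [row["departure_time"] for row in stop_times if row["stop_id"] in stop_ids]
    # Zero-padded HH:MM:SS strings sort correctly as plain text.
    return sorted(times)


stops = read_table(STOPS_TXT)
stop_times = read_table(STOP_TIMES_TXT)
print(departures_from("Main St Station", stops, stop_times))
```

Because every agency publishing GTFS uses the same file and column names, the same few lines would work against any of their feeds, which is the whole point of the shared standard.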

A couple of money quotes include this one about the initial reluctance of some authorities to share their data for free (I’m looking at you, Translink board):

“I have watched transit agencies try to monetize schedules for years and nobody has been successful,” he says. “Markets like the MTA and the D.C. Metro fought sharing this data for a very long time, and it seems to me that there was a lot of fallout from that with their riders. This is not our data to hoard—that’s my bottom line.”

and this one about iBart, an app that uses GTFS data to help riders plan transit trips:

in its home city, San Francisco, the startup’s app continues to win more users: about 3 percent of all trips taken on BART begin with a query on iBART

3%? That is amazing. Last year Translink, the transit authority in my home town of Vancouver, had 211.3 million trips. If the iBart app were ported here and enjoyed similar success, that would mean roughly 6.3 million trips planned on iBart (or iTranslink?). That’s a lot of trips made easier to plan.
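For anyone who wants to check that back-of-envelope estimate, here is the arithmetic as a tiny sketch. The 3 percent share and the 211.3 million figure come from the text above. The variable names are just illustrative, and applying BART's share to Vancouver is of course an assumption, not a forecast.

```python
# Figures from the post: Translink's annual trips and iBART's share of BART trips.
annual_trips = 211_300_000
share_percent = 3

# Integer arithmetic avoids floating-point rounding noise.
planned_trips = annual_trips * share_percent // 100

print(f"{planned_trips:,} trips planned with the app")
```

That works out to a little over 6.3 million trips a year.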

The second thing I encourage you to think about…

Where else could this model be recreated? What’s the data set, where is the demand from the public, and what is the company or organization that can fulfill the role of Google to give it scale? I’d love to hear thoughts.