Monthly Archives: June 2013

The Uncertain Future of Open Data in the Government of Canada

Open data is arguably at its high-water mark in the Government of Canada. Data.gc.ca has been refreshed and, more importantly, the government has signed the Open Data Charter, committing it to making data “open” by default. A rash of new data sets has also been made available.

In other words, there is a lot of momentum in the right direction. So what could go wrong?

The answer…? Everything.

The reason is the upcoming cabinet shuffle.

I confess that Minister Clement and I have not agreed on all things. I believe, as the evidence shows, that supervised injection sites such as Insite make communities safer, save lives and make it easier for drug users to get help. As Health Minister, Clement did not. I argued strongly against the dismantling of the mandatory long-form census, noting its demise would make our government dumber and, ultimately, more expensive. As Industry Minister, Clement was responsible for the end of a reliable long-form census.

However, when it comes to open data, Minister Clement has been a powerful voice in a government that has, on many occasions, looked for ways to make access to information harder, not easier. Indeed, open data advocates have been lucky to have had two deeply supportive ministers, Clement and, prior to him, Stockwell Day (who also felt strongly about this issue and was incredibly responsive to many of my concerns when I shared them). This run, though, may be ending.

With the Government in trouble, there is widespread acceptance that a major cabinet re-shuffle will be in order. While Minister Clement has been laying a lot of groundwork for the upcoming negotiations with the public sector unions and a rethink of how the public service could be more effective and accountable, he may not be sticking around to see this work (that I’m sure the government sees as essential) through to the end. Nor may he want to. Treasury Board remains a relatively inward-facing ministry and it would not surprise me if both the Minister and the PMO were interested in moving him to a portfolio that was more outward and public facing. Only a notable few politicians dream of wrestling with public servants and figuring out how to reform the public service. (Indeed, Reg Alcock is the only one I can think of.)

If the Minister is moved, it will be a real test for the sustainability of open data at the federal level. Between the Open Data Charter, the expertise and team built up within Treasury Board and, hopefully, some educational work Minister Clement has done within his own caucus, ideally there is enough momentum and infrastructure in place that the open data file will carry on. This is very much what I hope to be the case.

But much may depend on who is made President of the Treasury Board. If that role changes, open data advocates may find themselves busy not doing new things, but rather safeguarding gains already made.

 

Some thoughts on the relaunched data.gc.ca

Yesterday, I talked about what I thought was the real story that got missed in the fanfare surrounding the relaunch of data.gc.ca. Today I’ll talk about the new data.gc.ca itself.

Before I begin, there is an important disclaimer to share (to be open!). Earlier this year Treasury Board asked me to chair five public consultations across Canada to gather feedback on both its open data program and data.gc.ca in particular. As such, I solicited people’s suggestions on how data.gc.ca could be improved – as well as shared my own – but I was not involved in the creation of data.gc.ca. Indeed, the first time I saw the site was on Tuesday when it launched. My role was merely to gather feedback. For those curious, you can read the report I wrote here.

There is, I’m happy to say, much to commend about the new open data portal. Of course, aesthetically, it is much easier on the eye, but this is really trivial compared to a number of other changes.

The most important shift relates to the desire of the site to foster community. Users can now register with the site as well as rate and comment on data sets. There are also places like the Developers’ Corner, which contains documentation that potential users might find helpful, and a sort of app store where government agencies and citizens can post applications they have created. This shift mirrors the evolution of data.gov, data.gov.uk and DataBC, which started out as data repositories but sought to foster and nurture a community of data users. The critical piece here is that simply creating the functionality will probably not be sufficient; in the US, UK and BC it has required dedicated community managers/engagers to help foster such a community. At present it is unclear if that exists behind the website at data.gc.ca.

The other two noteworthy improvements to the site are an improved search and the availability of APIs. While not perfect, the improved search is nonetheless helpful, as previously it was basically impossible to find anything on the site. Today, search for “border time” and a border wait time data set is the top result. However, search for “border wait times” and “Biogeochemical exploration using Douglas-fir tree tops in the Mabel Lake area, southern British Columbia (NTS 82L09 and 10)” becomes the top hit, with the actual border wait time data set pushed down to fifth. That said, the search is still a vast improvement and this alone could be a boon to policy wonks, researchers and developers who elect to make use of the site.

The introduction of APIs is another interesting development. For the uninitiated, an API (application programming interface) provides continuous access to updated data: rather than downloading a file, you are plugging into a socket that delivers data instead of electricity. The aforementioned border wait time data set is a fantastic example. It is less a “data set” than a “data stream” providing the most recent border wait times, like what you would see on the big signs across the highway as you approach the border. With it available through the open data site, it would not, for example, be impossible for Google Maps to scan this data set daily, understand how border wait times fluctuate and incorporate these delays into its predicted travel times. Indeed, it could even query the API in real time and tell you how long it will take to drive from Vancouver to Seattle, with border delays taken into account. The opportunity for developers and, equally intriguing, government employees and contractors to build applications atop these APIs is, in my mind, quite exciting. It is a much, much cheaper and more flexible approach than how a lot of government software is currently built.
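To make the “socket” idea a little more concrete, here is a minimal Python sketch of how an application might fold a border delay into a trip estimate. Everything in it is an assumption for illustration: the JSON layout, the field names (`crossings`, `name`, `wait_minutes`), the crossing names and the numbers are all made up and are not the actual format of the data.gc.ca border wait time API.

```python
import json

# Hypothetical payload: the structure, field names and values are invented
# for illustration and do not reflect the real border wait time API.
SAMPLE_RESPONSE = """
{
  "crossings": [
    {"name": "Example Crossing A", "wait_minutes": 25},
    {"name": "Example Crossing B", "wait_minutes": 10}
  ]
}
"""


def parse_wait_times(raw_json):
    """Return a dict mapping each crossing name to its wait in minutes."""
    payload = json.loads(raw_json)
    return {c["name"]: c["wait_minutes"] for c in payload["crossings"]}


def fastest_crossing(wait_times):
    """Return the name of the crossing with the shortest current wait."""
    return min(wait_times, key=wait_times.get)


def total_trip_minutes(drive_minutes, crossing, wait_times):
    """Add the current delay at one crossing to the base driving time."""
    return drive_minutes + wait_times[crossing]


wait_times = parse_wait_times(SAMPLE_RESPONSE)
best = fastest_crossing(wait_times)
# 150 minutes is an arbitrary stand-in for the delay-free driving time.
print(best, total_trip_minutes(150, best, wait_times))
```

In a real application the sample string would be replaced by whatever the API returns each time it is queried, which is the whole point: the code stays the same while the data underneath it keeps updating.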

I also welcome the addition of the ability to search Access to Information (ATIP) request summaries. That said, I’d like for there to be more than just the summaries; the actual responses would be nice, particularly given that ATIP requests likely represent information people have identified as important. In addition, the tool for exploring government expenditures is interesting, but it is weirdly more notable because, as far as I can tell, none of the data displayed in the tool can be downloaded, meaning it is not very open.

Finally, I will briefly note that the license is another welcome change. For more on that I recommend checking out Teresa Scassa’s blog post on it. Contrary to my above disclaimer I have been more active on this side of things, and hope to have more to share on that another time.

I’m sure, as I and others explore the site in the coming days we will discover more to like and dislike about it, but it is a helpful step forward and another signal that open data is, slowly, being baked into the public service as a core service.

 

The Real News Story about the Relaunch of data.gc.ca

As many of my open data friends know, yesterday the government launched its new open data portal to great fanfare. While there is much to talk about there – something I will dive into tomorrow – that was not the only thing that happened yesterday.

Indeed, I did a lot of media yesterday between flights and only after it was over did I notice that virtually all the questions focused on the relaunch of data.gc.ca. Yet it is increasingly clear to me that the much, much bigger story of the portal relaunch was the Prime Minister announcing that Canada would adopt the Open Data Charter.

In other words, Canada just announced that it is moving towards making all government data open by default. Moreover, it even made commitments to make specific “high value” data sets open in the next couple of years.

As an aside, I don’t think the Prime Minister’s office has ever mentioned open data – as far as I can remember – so that was interesting in and of itself. But what is still more interesting is what the Prime Minister committed Canada to. The Open Data Charter commits the government to make data open by default as well as four other principles:

  • Quality and Quantity
  • Useable by All
  • Releasing Data for Improved Governance
  • Releasing Data for Innovation

In some ways Canada has effectively agreed to implement the equivalent of the Presidential Executive Order on Open Data the White House announced last month (and that I analyzed in this blog post). Indeed, the charter is more aggressive than the executive order since it goes on to lay out the need to open up not just future data, but also current “high value” data sets. Included among these are data sets the Open Knowledge Foundation has been seeking to get opened via its open data census, as well as some data sets I and many others have argued should be made open, such as the company/business register. Other suggested high value data sets include data on crime, school performance, energy and environment pollution levels, energy consumption, government contracts, national budgets, health prescription data and many, many others. Also included on the list… postcodes – something we are presently struggling with here in Canada.

But the charter wasn’t all the government committed to. The final G8 communique contained many interesting tidbits that, again, highlighted commitments to open up data and adhere to international data schemas.

Among these were:

  • Corporate Registry Data: There was a very interesting section on “Transparency of companies and legal arrangements” which is essentially on sharing data about who owns companies. As an advisory board member to OpenCorporates, this was music to my ears. However, the federal government already does this; the much, much bigger problem is with the provinces, like BC and Quebec, that make it difficult or expensive to access this data.
  • Extractive Industries Transparency Initiative: A commitment that “Canada will launch consultations with stakeholders across Canada with a view to developing an equivalent mandatory reporting regime for extractive companies within the next two years.” This is something I fought to get included into our OGP commitment two years ago but failed to succeed at. Again, I’m thrilled to see this appear in the communique and look forward to the government’s action.
  • International Aid Transparency Initiative (IATI) and Busan Common Standard on Aid Transparency: A commitment to make aid data more transparent and downloadable by 2015. Indeed, with all the G8 countries agreeing to take this step it may be possible to get greater transparency around who is spending what money, and where, on aid. This could help identify duplication as well as help in assessing effectiveness. Given how precious aid dollars are, this is a very welcome development. (h/t Michael Roberts of Acclar.org)

So lots of commitments, some on the more vague side (the open data charter) but some very explicit and precise. And that is the real story of yesterday: not that the country has a new open data portal, but that a lot more data is likely going to get put into that portal over the next 2-5 years. And a tsunami of data could end up in it over the next 10-25 years. Indeed, so much data that I suspect a portal will no longer be a logical way to share it all.

And therein lies the deeper business and government story in all this. As I mentioned in my analysis of the White House Executive Order that made open data default, the big change here is in procurement. If implemented, this could have a dramatic impact on vendors and suppliers of equipment and computers that collect and store data for the government. Many vendors try to find ways to make their data difficult to export and share so as to lock the government in to their solution. Again, if (and this is a big if) the charter is implemented it will hopefully require a lot of companies to rethink what they offer to government. This is a potentially huge story as it could disrupt incumbents and lead to either big reductions in the costs of procurement (if done right) or big increases and the establishment of the same, or new, impossible-to-work-with incumbents (if done incorrectly).

There is potentially a tremendous amount at stake in how the government handles the procurement side of all this, because whether it realizes it or not, it may have just completely shaken up the IT industry that serves it.

 

Postscript: One thing I found interesting about the G8 communique was how many times commitments about open data and open data sets occurred in sections that had nothing to do with open data. It will be interesting to see if that trend continues at the next G8 meeting. Indeed, I wouldn’t be surprised if a specific open data section disappears and instead these references just become part of various issue-related commitments.

 

 

 

Policy-Making in a Big Data World

For those interested I appeared on The Agenda with Steve Paikin the other week talking about Big Data and policy making.

There was a good discussion with a cast of characters that included (not counting myself):

There is so much to dive into in this space. There are, obviously, the dangers of thinking that data can solve all our problems, but I think the reverse is also true: there is actually a real shortage of capacity within government (as in the private sector, where these skills are highly sought after and compensated) to think critically about and effectively analyze data. Indeed, sadly, one of the few places in government that seems to understand and have the resources to work in this space is the security/intelligence apparatus.
It’s a great example of the growing stresses I think governments and their employees are going to be facing. One I hope we find ways to manage.

What Traffic Lights Say About the Future of Regulation

I have a piece up on TechPresident about some crazy regulations that took place in Florida that put citizens at greater risk all so the state and local governments can make more money.

Here’s a chunk:

In effect, what the state of Florida is saying is that a $20 million increase in revenue is worth an increase in risk of property damage, injury and death as a result of increased accidents. Based on national statistics, there are likely about 62 deaths and 5,580 injuries caused by red light running in Florida each year. If shorter yellow lights increased that rate by 10 percent (far less than predicted by the USDOT) that could mean an additional 6 deaths and 560 injuries. Essentially the state will raise a measly extra $35,000 for each injury or death its regulations help to cause, and possibly far less.
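For anyone who wants to check the arithmetic in that excerpt, here is a short Python sketch. It assumes (the excerpt does not spell this out) that the $35,000 figure is simply the $20 million in extra revenue divided by the additional deaths and injuries combined; the variable names are mine.

```python
# Figures taken from the excerpt above.
deaths_per_year = 62
injuries_per_year = 5580
increase_percent = 10          # assumed rise in the red-light-running rate
extra_revenue = 20_000_000     # dollars

# Additional casualties implied by a 10 percent increase.
extra_deaths = deaths_per_year * increase_percent / 100
extra_injuries = injuries_per_year * increase_percent / 100
extra_casualties = extra_deaths + extra_injuries

# Assumption: revenue per casualty is revenue divided by deaths plus injuries.
revenue_per_casualty = extra_revenue / extra_casualties

print(f"Extra deaths: {extra_deaths:.1f}")
print(f"Extra injuries: {extra_injuries:.0f}")
print(f"Revenue per death or injury: ${revenue_per_casualty:,.0f}")
```

Under that assumption the result lands a little above $35,000 per death or injury, consistent with the rounded figure in the piece. A larger increase in accidents, as the USDOT prediction implies, would push the per-casualty figure lower, which is the “possibly far less” in the excerpt.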

The Past, Present and Future of Sensor Journalism

This weekend I had the pleasure of being invited to the Tow Center for Digital Journalism at the Columbia Journalism School for a workshop on sensor journalism.

The workshop (hashtag #towsenses) brought together a “community of journalists, hackers, makers, academics and researchers to explore the use of sensors in journalism; a crucial source of information for investigative and data journalists.” And it was fascinating to talk about what role sensors – from the Air Quality Egg to aerial drones – should, could or might play in journalism. It was even more fun with a room full of DIYers, academics and journalists with interesting titles such as “applications division manager” or “data journalist.” Most fascinating was a panel on the ethics of sensors in journalism, about which I hope to write another time.

There is, of course, a desire to treat sensors as something new in journalism. And for good reason. While I’m sure there were early adopters of cameras in the newsroom, cameras probably didn’t radically change the newsroom until they were (relatively) cheap, portable and gave you something your audience wanted. Today we may be experiencing something similar with sensors. The cost of creating sophisticated sensors is falling and/or other objects, like our cell phones, can be repurposed to be sensors. The question is… as with cameras, how can the emergence of sensors help journalists? And how might they distract them?

My point is, well, news organizations already do sensor journalism. Indeed, I’d argue that somewhere between 5-15% of many news broadcasts are consumed with sensor journalism. At the very minimum the weather report is a form of sensor journalism. The meteorological group is a part of the news media organization that is completely reliant on sensors to provide it with information which it must analyze and turn into relevant information for its audience. And it is a very specific piece of knowledge that matters to the audience. They are not asking how the weather came about, but merely for an accurate prediction of what the weather will be. For good or (as I feel) for ill, there is not a lot of discussion about climate change on the 6 o’clock news weather report. (As an aside, Clay Johnson cleverly pointed out that weather data may also be the government’s oldest, most mature and economically impactful open data set.)

Of course weather data is not the only form of sensor journalism going on on a daily basis. Traffic reports frequently rely on sensors, from traffic counting devices to permanently mounted visual sensors (cameras!) that allow one to count, measure, and even model and predict traffic. There may still be others.

So there are already some (small) parts of the journalism world that are dependent on sensors. Of course, some of you may not consider traffic reports and weather reports to be journalism since they are not, well, investigative journalism. But these services are important, have tended to be part of news gathering organizations and are in constant demand by consumers. And while demand may not always be the most important metric, it is an indication that this matters to people. My broader point here is that there is part of the media community that is used to dealing with a type of sensor journalism. Yes, it has low ethical risk (we aren’t pointing these sensors at humans really) but it does mean there are policies, processes, methodologies and practices for thinking about sensors that may exist in news organizations, if not in the newsroom.

It is also a window into the types of stories that sensors have, at least in the past, been good at helping out with. Specifically, there seem to be two criteria: things that occur at a high frequency, and that a large number of people want to know about at a high frequency. Both weather and traffic fit the bill: lots of people want to know about them, often twice a day, if not more frequently. So it might be worth thinking about what other types of issues or problems that interest journalists do, or could, conform to those criteria. In addition, if we are able to lower the cost of gathering and analyzing the data, does it become feasible, or profitable, to serve smaller, niche audiences?

None of this is to say that sensors can’t, won’t or shouldn’t be used to cover investigative journalism projects. The work Public Labs did in helping map the extent of the oil spill along the gulf coast is a fantastic example of where sensors may be critical in journalism (as well as advocacy and evidence building), as is the work of groups like Safecast and others who monitored radioactivity levels in Japan after the Fukushima disaster. Indeed, I think the possibilities of sensors in investigative journalism are both intriguing and potentially very, very bright. I would just love for us to build off of work that is already being done – even if it is in the (journalistically) mundane space of traffic and weather – rather than imagine we are beginning with an entirely blank slate.