Yesterday, I talked about what I thought was the real story that got missed in the fanfare surrounding the relaunch of data.gc.ca. Today I’ll talk about the new data.gc.ca itself.
Before I begin, there is an important disclaimer to share (to be open!). Earlier this year Treasury Board asked me to chair five public consultations across Canada to gather feedback on both its open data program and data.gc.ca in particular. As such, I solicited peoples suggestions on how data.gc.ca could be improved – as well as shared my own – but I was not involved in the creation of data.gc.ca. Indeed the first time I saw the site was on Tuesday when it launched. My role was merely to gather feedback. For those curious you can read the report I wrote here.
There is, I’m happy to say, much to commend about the new open data portal. Of course, aesthetically, it is much easier on the eye, but this is really trivial compared to a number of other changes.
The most important shift relates to the desire of the site to foster community. Users can now register with the site as well as rate and comment on data sets. There are also places like the Developers’ Corner which contains documentation that potential users might find helpful and a sort of app store where government agencies and citizens can posts applications they have created. This shift mirrors the evolution of data.gov, data.gov.uk and DataBC which started out as data repositories but sought to foster and nurture a community of data users. The critical piece here is that simply creating the functionality will probably not be sufficient, in the US, UK and BC it has required dedicated community managers/engagers to help foster such a community. At present it is unclear if that exists behind the website at data.gc.ca.
The other two noteworthy improvements to the site are an improved search and the availability of API’s. While not perfect, the improved search is nonetheless helpful as previously it was basically impossible to find anything on the site. Today a search for “border time” and a border wait time data set is the top result. However, search for “border wait times” and “Biogeochemical exploration using Douglas-fir tree tops in the Mabel Lake area, southern British Columbia (NTS 82L09 and 10)” becomes the top hit with actual border wait time data set pushed down to fifth. That said the search is still a vast improvement and this alone could be a boon to policy wonks, researchers and developers who elect to make use of the site.
The introduction of APIs is another interesting development. For the uninitiated an API (application programming interface) provides continuous access to updated data, so rather than downloading a file, it is more like you are plugging into a socket that delivers data, rather than electricity. The aforementioned border wait time data set is a fantastic example. It is less of a “data set” than of a “data stream” providing the most recent updates of border wait times, like what you would see on the big signs across the highway as you approach the border. By providing it through the open data site it would not, for example, be impossible for Google Maps to scan this data set daily, understand how border wait times fluctuate and incorporate these delays in its predicted travel times. Indeed, it could even querry the API in real time and tell you how long it will take to drive from Vancouver to Seattle, with border delays taken into account. The opportunity for developers and, equally intriguing, government employees and contractors, to build applications a top of these APIs is, in my mind, quite exciting. It is a much, much cheaper and flexible approach than how a lot of government software is currently built.
I also welcome the addition of the ability to search Access to Information (ATIP) requests summaries. That said, I’d like for there to be more than just the summaries, that actually responses would be nice, particularly given that ATIP requests likely represent information people have identified as important. In addition, the tool for exploring government expenditures is interesting, but it is weirdly more notable because, as far as I can tell, none of the data displayed in the tool can be downloaded, meaning it is not very open.
Finally, I will briefly note that the license is another welcome change. For more on that I recommend checking out Teresa Scassa’s blog post on it. Contrary to my above disclaimer I have been more active on this side of things, and hope to have more to share on that another time.
I’m sure, as I and others explore the site in the coming days we will discover more to like and dislike about it, but it is a helpful step forward and another signal that open data is, slowly, being baked into the public service as a core service.