Tag Archives: opendata

City of Vancouver Wins Top Innovator Award from BC Business

To be clear, this is not top innovator among governments, this is top innovator among all organizations – for-profit, non-profit and government – in the province.

You can see the award write up here.

As the article states, Vancouver’s Open Data initiative “floored the [judging] panel.” Indeed, one panellist stated: “I have never seen a municipality open to new ideas in my life. When was the last time any level of government said, Here are our books; fill your boots?”

Back in October BC Business asked me to write a think piece explaining open data, and I ended up penning this piece entitled “The Difference Data Makes”. It’s fantastic to see the business community recognizing the potential of open data and how it could transform both the way government works and the opportunities it creates for private and non-profit organizations as well as citizens.

It’s a great day for the City of Vancouver and for Open Data.

Access to Information is Fatally Broken… You Just Don’t Know it Yet

I’ve been doing a lot of thinking about access to information, and am working on a longer analysis, but in the short term I wanted to share two graphs – graphs that outline why Access to Information (Freedom of Information in the United States) is unsustainable and will, eventually, need to be radically rethought.

First, this analysis is made possible by the enormous generosity of the Canadian Federal Information Commissioner’s Office, which several weeks ago sent me a tremendous amount of useful data regarding access to information requests over the past 15 years at the Treasury Board Secretariat (TBS).

The first figure I created shows both the absolute number of Access to Information (ATIP) requests since 1996 and the running year-on-year percentage increase. The dotted line represents the average percentage increase over this time. As you can see, the number of ATIP requests has almost tripled in this period. This is very significant growth – the kind you’d want to see in a well-run company. Alas, for those processing ATIP requests, I suspect it represents a significant headache.

That’s because, of course, such growth is likely unmanageable. It might be manageable if, say, the cost of handling each request were dropping rapidly. If such efficiencies were being wrested out of the system of routing and sorting requests, then we could simply ignore the chart above. Sadly, as the next chart I created demonstrates, this is not the case.

[Figure: ATIP costs]

In fact, the costs of managing these transactions have not tripled. They have more than quadrupled. This means that not only is the number of transactions increasing at about 8% a year, the cost of fulfilling each of those transactions is itself rising at a rate above inflation.

Now remember, I’m not even talking about the effectiveness of ATIP. I’m not talking about how quickly requests are turned around (as the Information Commissioner has discussed, it is broadly getting worse), nor am I discussing whether less information is being restricted (it isn’t; things are getting worse). These are important – and difficult to assess – metrics.

I am, instead, merely looking at the economics of ATIP and the situation looks grim. Basically two interrelated problems threaten the current system.

1) As the number of ATIP requests increases, the manpower required to answer them also appears to be increasing. At some point the hours required to fulfill all requests sent to a ministry will equal the total hours of manpower at that ministry’s disposal. Yes, that day may be far off, but the day when it hits some meaningful percentage – say 1%, 3% or 5% of total hours worked at Treasury Board – may not be that far off. That’s a significant drag on efficiency. I recall talking to a foreign service officer who mentioned that during the Afghan prisoner scandal an entire department of foreign service officers – some 60 people in all – was working full time on assessing access to information requests. That’s an enormous amount of time, energy and money.

2) Even more problematic than the number of work hours is the cost. According to the data I received, Access to Information requests cost the Treasury Board $47,196,030 last year. Yes, that’s 47 with a “million” behind it. And remember, this is just one ministry. Multiply that by 25 (let’s pretend that’s the number of ministries; there are actually many more, but I’m trying to be really conservative with my assumptions) and it means last year the government may have spent over $1.175 billion fulfilling ATIP requests. That is a staggering number. And it’s growing.
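The back-of-envelope arithmetic above is easy to reproduce. A quick sketch (the $47M figure and the ~8% growth rate come from this post; the 25-ministry multiplier is the same deliberately conservative assumption):

```python
# Rough extrapolation of government-wide ATIP costs from the figures
# discussed above. These are illustrative estimates, not official data.

tbs_cost = 47_196_030      # Treasury Board's ATIP cost last year (from the post)
ministries = 25            # deliberately conservative ministry count

government_wide = tbs_cost * ministries
print(f"Estimated government-wide ATIP cost: ${government_wide:,}")
# → Estimated government-wide ATIP cost: $1,179,900,750

# And if request volumes keep growing ~8% a year, costs compound quickly:
growth = 0.08
decade_multiplier = (1 + growth) ** 10
print(f"Ten years of 8% growth multiplies costs by {decade_multiplier:.2f}x")
# → Ten years of 8% growth multiplies costs by 2.16x
```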

Transparency, apparently, is very, very expensive. At some point, it risks becoming too expensive.

Indeed, ATIP reminds me of healthcare. It’s completely unsustainable, and absolutely necessary.

To be clear, I’m not saying we should get rid of ATIP. That, I believe, would be folly. It is and remains a powerful tool for holding government accountable. Nor do I believe that requesters should pay for ATIP requests as a way to offset costs (as BC Ferries does) – this creates a barrier that punishes the most marginalized and threatened, while enabling only the wealthy or well-financed to hold government accountable.

I do think it suggests that governments need to radically rethink how they manage ATIP. More importantly, I think it suggests that government needs to rethink how it manages information. Open data and digital documents are all part of a strategy that, I hope, can lighten the load. I’ve also felt that if/as governments move their work onto online platforms like GCPEDIA, we should simply make non-classified pages open to the public on something like a 5-year timeline. This could also help reduce requests.

I’ve more ideas, but at its core we need a system rethink. ATIP is broken. You may not know it yet, but it is. The question is, what are we going to do before it goes off the cliff? Can we invent something new and better in time?

Canada launches data.gc.ca – what works and what is broken

Those on twitter will already know that this morning I had the privilege of conducting a press conference with Minister Day about the launch of data.gc.ca – the Federal Government’s Open Data portal. For those wanting to learn more about open data and why it matters, I suggest this and this blog post, and this article – they outline some of the reasons why open data matters.

In this post I want to review what works, and doesn’t work, about data.gc.ca.

What works

Probably the most important thing about data.gc.ca is that it exists. It means that public servants across the Government of Canada who have data they would like to share can now point to a website that is part of government policy. It is an enormous signal from a central agency that gives the many people who want to share data the permission, a process and a vehicle by which to do so. That, in and of itself, is significant.

Indeed, I was informed that already a number of ministries and individuals are starting to approach those operating the portal asking to share their data. This is exactly the type of outcome we as citizens should want.

Moreover, I’ve been told that the government wants to double the number of data sets, and the number of ministries, involved in the site. So the other part that “works” on this site is the commitment to make it bigger. This is also important, as some open data portals have launched with great fanfare, only to languish as no new data sets are added and the existing ones are not updated and so fall out of date.

What’s a work in progress

The number of “high value” datasets is, relatively speaking, fairly limited. I’m always cautious about this as, I feel, what constitutes high value varies from user to user. That said, there are clearly data sets that will have greater impact on Canadians: budget data, line-item spending data by department (as the UK provides), food inspection data, product recall data, pretty much everything on the StatsCan website, Service Canada locations, postal code data, mailbox location data, business license data, and Canada Revenue data on charities and publicly traded companies are a few that quickly come to mind; clearly I can imagine many, many more…

I think the transparency, tech, innovation, mobile and online services communities will be watching data.gc.ca closely to see what data sets get added. What is great is that the government is asking people what data sets they’d like to see added. I strongly encourage people to let the government know what they’d like to see, especially when it involves data the government is already sharing, but in unhelpful formats.

What doesn’t work

In a word: the license.

The license on data.gc.ca is deeply, deeply flawed. Some might go so far as to say that the license does not make the data open at all – a critique that I think is fair. I would say this: presently the license on data.gc.ca effectively kills any possible business innovation, and severely limits its use in non-profit realms.

The first, and most problematic is this line:

“You shall not use the data made available through the GC Open Data Portal in any way which, in the opinion of Canada, may bring disrepute to or prejudice the reputation of Canada.”

What does this mean? Does it mean that any journalist who writes a story, using data from the portal, that is critical of the government, is in violation of the terms of use? It would appear to be the case. From an accountability and transparency perspective, this is a fatal problem.

But it is also problematic from a business perspective. If one wanted to use a data set to help guide citizens around where they might be well, and poorly, served by their government, would you be in violation? The problem here is that the clause is both sufficiently stifling and sufficiently vague that many businesses will see the risk of using this data as simply too great.

UPDATE: Thursday March 17th, 3:30pm, the minister called me to inform me that they would be striking this clause from the contract. This is excellent news and Treasury Board deserves credit for moving quickly. It’s also great recognition that this is a pilot (e.g. beta) project and so hopefully, the other problems mentioned here and in the comments below will also be addressed.

It is worth noting that no other open data portal in the world has this clause.

The second challenging line is:

“you shall not disassemble, decompile except for the specific purpose of recompiling for software compatibility, or in any way attempt to reverse engineer the data made available through the GC Open Data Portal or any part thereof, and you shall not merge or link the data made available through the GC Open Data Portal with any product or database for the purpose of identifying an individual, family or household or in such a fashion that gives the appearance that you may have received or had access to, information held by Canada about any identifiable individual, family or household or about an  organization or business.”

While I understand the intent of this line, it is deeply problematic for several reasons. First, many business models rely on identifying individuals; indeed, individuals frequently ask businesses to do this. Google, for example, knows who I am and offers custom services to me based on the data it has about me. It would appear that the terms of use would prevent Google from using Government of Canada data to improve its service even if I have given it permission. Moreover, the future of the digital economy is in providing customized services. While this data has been digitized, it effectively cannot be used as part of the digital economy.

More disconcerting is that these terms apply not only to individuals, but also to organizations and businesses. This means that you cannot use the data to “identify” a business. Well, over at Emitter.ca we use data from Environment Canada to show citizens facilities that pollute near them. Since we identify both the facilities and the companies that use them (not to mention the politicians whose ridings these facilities sit in), are we not in violation of the terms of use? In a similar vein, I’ve talked about how government data could have prevented $3B of tax fraud. Sadly, data from this portal would not have changed that since, in order to have found the fraud, you’d have to have identified the charitable organizations involved. Consequently, this requirement manifestly destroys any accountability the data might create.

It is again worth noting that no other open data portal in the world has this clause.

And finally:

4.1 You shall include and maintain on all reproductions of the data made available through the GC Open Data Portal, produced pursuant to section 3 above, the following notice:

Reproduced and distributed with the permission of the Government of Canada.

4.2 Where any of the data made available through the GC Open Data Portal is contained within a Value-Added Product, you shall include in a prominent location on such Value-Added Product the following notice:

This product has been produced by or for (your name – or corporate name, if applicable) and includes data provided by the Government of Canada.

The incorporation of data sourced from the Government of Canada within this product shall not be construed as constituting an endorsement by the Government of Canada of our product.

or any other notice approved in writing by Canada.

The problem here is that this creates what we call the “NASCAR effect.” As you use more and more government data, these “prominent” displays of attribution begin to pile up. If I’m using data from three different governments, each of which requires attribution, pretty soon all you’re going to see are the attribution statements, and not the map or other information that you are looking for! I outlined this problem in more detail here. The UK Government has handled this issue much, much more gracefully.

Indeed, speaking of the UK Open Government License, I really wish our government had just copied it wholesale. We have similar government and legal systems, so I see no reason why it would not easily translate to Canada. It is radically better than what is offered on data.gc.ca and, by adopting it, we might begin to move towards a single government license across Commonwealth countries, which would be a real win. Of course, I’d love it if we adopted the PDDL, but the UK Open Government License would be okay too.

In Summary

The launch of data.gc.ca is an important first step. It gives those of us interested in open data and open government a vehicle by which to get more data opened and to improve accountability and transparency as well as business and social innovation. That said, there is much work still to be done: getting more data up and, more importantly, addressing the significant concerns around the license. I have spoken to Treasury Board President Stockwell Day about these concerns and he is very interested in and engaged by them. My hope is that with more Canadians expressing their concerns, and with better understanding among ministerial and political staff, we can land on the right license and help find ways to improve the website and program. That’s why we do beta launches in the tech world; hopefully it is something the government will be able to do here too.


Apologies for any typos, trying to get this out quickly, please let me know if you find any.

MP Jim Abbott: The Face of the Sad State of Open Data in Canada

“I guess my attack to this has always been from the perspective of are we working in a bubble. In other words, when this was… under this initiative by the President, how quick was the takeup by the population at large? Not by the people that we affectionately call geeks, or people who don’t have a life, or don’t come up out of the dark, or whatever. The average person walking through Times Square I guess is what I’m trying to say. How quick was their take up, and in fact has there been a takeup?”

Jim Abbott, ETHI Meeting No. 47, Open Government Study, March 2, 2011

Yes, the above quote comes from Jim Abbott, Member of Parliament (Conservative) for Kootenay—Columbia during the testimony of Beth Noveck, President Obama’s former Deputy Chief Technology Officer for Open Government (her statement can be found here). You can see the remarks in the online video here, at around the 1:17:50 mark.

First, I want to be clear. This is disappointing, not on a political level, but on an individual level. During my testimony before the ETHI committee (which I intend to blog about) I found members of all parties – NDP, Liberal, Bloc Quebecois and Conservative – deeply interested in the subject matter, asking thoughtful questions and expressing legitimate concerns. Indeed, I was struck by Pierre Poilievre, the Conservative MP for Nepean–Carleton, who asked a number of engaging questions, particularly around licenses. That’s a level of sophistication around the issue that many people don’t even think to ask about. Moreover, many of the committee members grasped the economic and social opportunity around open data.

Jim Abbott, in contrast, may believe that describing technologists and geeks as people who “don’t have a life” or “don’t come up out of the dark” is affectionate, but I’m not so sure these stereotypes are so endearing, especially given that they aren’t true. Moreover, his comments are particularly unfortunate as it’s the people he (affectionately) demeans who created RIM, OpenText, Cognos, and thousands of other successful technology companies that pump billions into the Canadian economy, employ hundreds of thousands, and do actually impact the “person on the street.” But a few simple demeaning words can make one forget these contributions or, worse, make them sound insignificant.

Of course, it will be the work of these people that creates the open data applications that, in the US at least, already impact the average person walking through Times Square (consider this lifesaving app that was created by a hacker using open data). Indeed, there are a growing number of businesses consuming and using open data, some even valued in the billions of dollars and used by millions of Americans every day.

The sad part is they will only be available to the people in Times Square, or Trafalgar Square, or on the Champs-Élysées, since the Americans, British and French all have national open data portals (among numerous other countries). There will be no uptake for people on Wellington St., Queen St., Robson St. or wherever, since without a national open data portal in Canada, there can be no uptake. (It’s not easy to be behind the French government on an issue related to the digital economy, but we’ve somehow managed.)

But forget the economic opportunity. There is also the question of government transparency and accountability. What makes the above statement so disappointing is that it exposes how an MP who for so long railed for greater transparency in government has suddenly decided that transparency is no longer important unless “there is sufficient uptake.”

One wonders what the Jim Abbott of 2000 would say of the Jim Abbott of 2011. Because back in the pre-2001 era Jim Abbott had fantastic quotes like this:

I suggest in the strongest way possible to the minister that even if we can get him to clear up the history of the Canada Information Office, which I do not have a lot of hope for but I am asking for, from this point forward there must be proper transparency of the Canada Information Office. The country needs openness and transparency because democracy cannot be true democracy without openness and transparency.

Jim Abbott, June 8th, 2000 / 11:10 a.m.

and this

Second, the difficulty the government has created with the Canada Information Office is that many of the contracts and much of the ongoing activity have been conducted in a way that does not befit what we are in Canada, which is a democracy. In a democracy the people depend on the people in the Chamber to hold the government accountable for the affairs of the government and to be as transparent as possible.

Jim Abbott, June 8th, 2000 / 11:10 a.m.

and this

It will never have the transparency that it must have in a democracy. It is just absolutely unacceptable.

Jim Abbott, June 16th, 1995 / 3:25 p.m.

I could go on…

(If you are wondering how I was able to dig up these quotes, please check out OpenParliament.ca – it really is an extraordinary tool and, again, shows the power of open (parliamentary) data.)

But more importantly, and on point, it seems to me that the Jim Abbott of 2000 would see open data as an important way to ensure greater transparency. Wouldn’t it have been nice if the Canada Information Office had had its budget and expenditures available as open data? Wouldn’t that have brought about some of the accountability the 2000 Jim Abbott would have sought? Sadly, and strangely, the Jim Abbott of 2011 no longer seems to feel that way.

Yes, if only he could meet Jim Abbott of 2000, I think they’d have a great debate.

Of course, Jim Abbott of 2000 can’t meet Jim Abbott of 2011, and so it is up to us to (re)educate him. And on that front, I have, so far, clearly failed the tech community, the open data community and the government accountability community. Hopefully with time and more effort, that will change. Maybe next time I’m in Ottawa, Jim Abbott and I can grab coffee and I can try again.

Launching an Open Data Business: Recollect.net (Vantrash 2.0)

Have you ever forgotten to take the garbage or recycling out? Wouldn’t it be nice if someone sent you a reminder the night before, or the morning of? Maybe an email, or an SMS, or even a phone call?

Now you can set it up so somebody does. Us.

Introducing Recollect: the garbage and recycling collection reminder service.

For People

We’ve got the garbage schedules for a number of Canadian cities big and small (with American ones coming soon) – test our site out to see if we support yours.

You can set up a reminder for the night before – or the day of – your garbage pickup, and we’ll email, text or call you letting you know your garbage day is imminent and what will be picked up (say, recycling, yard waste or garbage). Our email and Twitter reminders are free, and text message and phone calls cost $1.50 a month.
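Under the hood, a service like this mostly amounts to mapping a subscriber’s collection zone to a pickup schedule and firing a notification at the chosen offset. A minimal sketch of that logic (the zone name, schedule data and reminder times below are all hypothetical, not Recollect’s actual implementation):

```python
from datetime import datetime, timedelta

# Hypothetical zone schedules: zone -> list of (pickup date, materials).
# A real service would load these from a city's open data download.
SCHEDULE = {
    "vancouver-north": [
        (datetime(2011, 5, 9), ["garbage", "yard waste"]),
        (datetime(2011, 5, 16), ["recycling"]),
    ],
}

def next_reminder(zone, after, night_before=True):
    """Return (send_at, materials) for the next pickup in `zone` after `after`.

    If night_before is True the reminder fires at 7pm the prior evening;
    otherwise it fires at 7am on pickup day.
    """
    for pickup, materials in sorted(SCHEDULE[zone]):
        if pickup > after:
            send_at = (pickup - timedelta(hours=5) if night_before  # 7pm night before
                       else pickup + timedelta(hours=7))            # 7am day of
            return send_at, materials
    return None

send_at, materials = next_reminder("vancouver-north", datetime(2011, 5, 1))
print(send_at, materials)
# → 2011-05-08 19:00:00 ['garbage', 'yard waste']
```

From there, delivery is just a matter of handing `send_at` to a job queue and sending the email, SMS or phone call when it comes due.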

If you think you, your sibling, friends, or your parents might like a service like this, please come check out our website.

It’s simple and we hope you’ll give it a whirl.

For Cities

We don’t think that Recollect is going to change the world, but we do think we can help cities better manage citizens’ expectations around customer service. For cities (and companies) interested in connecting with their citizens and customers, we have a number of partnering options that we have already started to explore with some cities.

More importantly, if you’d like to see Recollect come to your city, make your garbage schedule and zones available for download – like Edmonton and Vancouver have.

On either of these fronts, if you are a politician, city employee or a business owner who needs a reminder service of some kind, please contact us.

Background – an open data municipal business

In June of 2009, as Vancouver was preparing to launch its open data portal, I wrote a blog post called How Open Data even makes Garbage collection sexier, easier and cheaper, in which I talked about how, using city data, a developer could create a garbage pickup reminder service for Vancouverites. Tim Bray called it his Hello World moment for Open Data. More importantly, Luke Closs and Kevin Jones, two Vancouver programmers (and now good friends), took the idea and made it real. The program was called Vantrash, and in two quiet, low-maintenance years – with no advertising or marketing – it garnered over 3,000 users.

Last week we retired Vantrash. Today, we launched Recollect.

Yes, Recollect is more beautiful than its predecessor, but more importantly it is going to start serving your community. At a high level, we want to see if we can scale an open data business to a continental level. Can we use open data to serve a range of cities across North America?

At a practical level, the goal of Recollect is more basic: To help make citizens’ lives just a little bit easier by providing them customized reminders for services they use, to the device of their choice, at the time of their choice.

Let’s face it: We are all too busy being parents, holding down jobs or enjoying the limited free time we have to remember things like garbage day or little league schedules. Our job is to make your life easier by freeing your mind from wasting time remembering these small details. If you aren’t trying to remember to take out the garbage, hopefully it means you can spend a little more time thinking about your family, your work or whatever your passion may be.

In short, we believe that city services should be built around your life – and we are trying to take a small step to bring that a little closer to reality.

Again, we don’t expect Recollect to change the world. But we do hope that it will serve as a building block for rethinking the government-user experience that will lay the foundations so that others will be able to change the world.

Today in the Toronto Star: End the silence on aid

Sorry for the cross post – I have this piece today on the opinion page of the Toronto Star. They’ve actually done a nice graphic for it, so I do encourage you to check it out.

End the silence on aid

For the past two weeks, Canadians have slowly watched the minister of international development, Bev Oda, implode. Caught in a slowly escalating scandal, it’s become clear that the minister misled Parliament — and the public — about how the government chooses whom it funds to do international development work.

The scandal around Oda, however, is a metaphor for a much larger problem in Canada’s foreign aid. The world is dividing itself into donors who hold forth an open model of evidence, accountability and, above all, transparency, and those who cling to a model of patronage, ideology and opacity.

So the question is: Where will Canada land on this debate? So far, the answer is not promising.

Internationally, the Kairos decision suggests Canada is on the wrong side of the divide. Indeed, the gap between CIDA and the world’s leading institutions is growing. Consider a recent report by the U.K.-based international advocacy group Publish What You Fund. Of the 30 institutions assessed in its 2010 report on aid transparency, the Canadian International Development Agency ranked 23rd. Among countries, Canada ranked 15th out of 22 (the Netherlands, U.K. and Ireland held the top three spots).

We are, by any metric, near the bottom of the pack. For a country and a government that prides itself on accountability and transparency, it’s a damning assessment.

What’s all the more frustrating is that transparency isn’t just about accountability. It’s about effectiveness and saving taxpayers’ money — something our major allies have already figured out.

So while Canada’s international development minister fights allegations of making the decision-making process more opaque, a coalition of leading countries is moving forward — without Canada — to do the opposite.

Take, for example, the newly founded International Aid Transparency Initiative (IATI). A coalition of donor governments, developing countries and NGOs, the IATI has a single goal: to improve aid effectiveness by making information about aid spending easier to access, use and understand.

It’s a deeply pragmatic exercise, one far removed from the partisan politics around aid seen in Canada. In one of its first reports, it outlines how setting up systems to make aid data available would involve a one-time cost of between $50,000 and $500,000, but would save taxpayers in countries like Canada several times that amount every year.

Part of these savings would come just from reducing bureaucracy. Making data publicly available would eliminate the need for civil servants to respond to duplicate information requests from international organizations, other governments and Canadian organizations. Instead, the relevant information could just be downloaded. It’s the kind of efficiency we expect from our government.

It’s also the kind of transparency Canadians are starting to see elsewhere. The World Bank — at one time loathed for its opacity — has made transparency a core value of its operations. It recently launched an open data portal where it shares enormous quantities of information on the global economy and aid projects. It has also promised much more and is slowly rolling out a “mapping for results” website where every project the bank funds and how much money it receives can be viewed on a downloadable map.

Canada sits on the sidelines while others move forward implementing proposals that could — ironically — fund several Kairoses every year.

The costs aren’t borne just by taxpayers, but also by Canadian NGOs. They have to provide the same information, but in different forms, to every government and organization that funds them. This means aid workers spend precious time and money filling out CIDA’s unique forms. Repeat this cost over the hundreds of projects that CIDA funds and the collective waste is enormous.

Perhaps more importantly, making our aid more transparent and accessible would close another gap — our inability to measure our effectiveness. One of the reasons countries like the U.K., Denmark and Sweden have signed up to the International Aid Transparency Initiative is so they can more easily compare the projects they fund with one another. These are countries that are serious about getting bang for their buck — they want to compare the evidence, see which projects work, and which ones fail.

It’s a lesson leading Canadian organizations are taking to heart. Engineers Without Borders, for example, regularly publishes a “failure report” in which it outlines which of its projects didn’t work and why. This honest, open and evidence-based approach to development is exactly what we need to demand of our government. Anything less constitutes a waste of our tax dollars.

And yet, the current debate in Parliament suggests we may be mapping a different route — one of opaque, ideologically driven development that is blind to both effectiveness and accountability. This serves neither Canadians nor donor recipients well.

Regardless of whether Oda resigns, Canadians should not lose sight of the larger issue and opportunity. We are in the midst of a global movement for international development aid transparency.

The benefits are clear, our allies are present, and even five of our focus recipient countries have signed up. And yet, Canada is nowhere to be found.

Sharing Critical Information with the public: Lessons for Governments

Increasingly, governments are looking for new and more impactful ways to communicate with citizens. There is a slow but growing awareness that traditional outreach vehicles, such as TV stories and newspaper advertisements, either fail to reach a significant portion of the population or have little impact on raising awareness of a given issue.

The exciting thing about this is that there is some real innovation taking place in governments as they grapple with this challenge. This blog post will look at one example from Canada and talk about why the innovation pioneered to date – while a worthy effort – falls far short of its potential. Specifically, I’m going to talk about how when governments share data, even when they use new technologies, they remain stuck in a government-centric approach that limits effectiveness. The real impact of new technology won’t come until governments begin to think more radically in terms of citizen-centric approaches.

The dilemma around reaching citizens is probably felt most acutely in areas where there is a greater sense of urgency around the information – like, say, issues relating to health and safety. Consequently, in Canada, it is perhaps not surprising that some of the more innovative outreach work has been pioneered by the national agency responsible for many of these issues, Health Canada.

The most cutting-edge effort I’ve seen is one by Health Canada to share advisories from Health Canada, Transport Canada and the Canadian Food Inspection Agency via three vehicles: an RSS feed; a mobile app available for Blackberry, iPhone (pictured far right) and Android; and finally a widget (pictured near right) that anyone can install in their blog.

I think all of these are interesting ideas and have much to commend them. It is great to see information of a similar type, from three different agencies, being shared through a single vehicle – this is definitely a step forward from a user’s perspective. It’s also nice to see the government experiment with different vehicles for delivery (mobile and other parties’ websites).

But from a citizen-centric perspective, all these innovations share a common problem: They don’t fundamentally change the citizen’s experience with this information. In other words, they are simply efforts to find new ways to “broadcast” the information. As a result, I predict that these initiatives will have a minimal impact as currently structured. There are two reasons why:

The problem isn’t about access: These tools are predicated on the idea that the barrier to conveying this information is access. It isn’t. The truth is, people don’t care. We can debate about whether they should care but the fact of the matter is, they don’t. Most people won’t pay attention to a product recall until someone dies. In this regard these tools are simply the modern day version of newspaper ads, which, historically, very few people actually paid attention to. We just couldn’t measure it, so we pretended like people read them.

The content misses the mark: Scratch a little deeper on these tools and you’ll notice something. They are all, in essence, press releases. All of these tools – the RSS feed, blog widget and mobile apps – are simply designed to deliver a marginally repackaged press release. Given that people tuned out of newspaper ads, pushing these ads onto them on another device will likely have a limited impact.

As a result, I suspect that those likely to pay attention to these innovations are probably those who were already paying attention. This is okay and even laudable: there is a small segment of people for whom these applications reduce the transaction costs of access. However, with regard to expanding the number of Canadians reached by this information or changing behaviour in a broader sense, these tools have limited impact. To be blunt, no one is checking a mobile application before they buy a product, nor are they reading these types of widgets on a blog, nor is anyone subscribing to an RSS feed of recalls and safety warnings. Those who are, are either being paid to do so (it is a requirement of their job) or are fairly obsessive.

In short, this is a government-centric solution – it seeks to share information the government has, in a context that makes sense to government – it is not citizen-centric, sharing the information in a form that matters to citizens or relevant parties, in a context that makes sense to them.

Again, I want to state while I draw this conclusion I still applaud the people at Health Canada. At least they are trying to do something innovative and creative with their data and information.

So what would a citizen-centric approach look like? Interestingly, it would involve trying to reach out to citizens directly.

People are wrestling with a tsunami of information. We can’t simply broadcast information at them, nor can we expect them to consult a resource every time they make a purchase.

What would make this data far more useful would be to structure it so that others could incorporate it into software and applications that could shape people’s behaviors and/or deliver the information in the right context.

Take this warning, for example: “CERTAIN FOOD HOUSE BRAND TAHINI OF SESAME MAY CONTAIN SALMONELLA BACTERIA” posted on Monday by the Canadian Food Inspection Agency. There is a ton of useful information in this press release including things like:

The geography impacted: Quebec

The product name, size and better still the UPC and LOT codes.

Product          | Size   | UPC             | Lot codes
Tahini of Sesame | 400gr  | 6 210431 486128 | Pro: 02/11/2010 and Exp: 01/11/2012
Tahini of Sesame | 1000gr | 6 210431 486302 | Pro: 02/11/2010 and Exp: 01/11/2012
Premium Halawa   | 400gr  | 6 210431 466120 | Pro: 02/11/2010 and Exp: 01/11/2012
Premium Halawa   | 1000gr | 6 210431 466304 | Pro: 02/11/2010 and Exp: 01/11/2012

However, all this information is buried in the text, so it is hard to parse and reuse.

If the data were structured and easily machine-readable (maybe available as an API, but even as a structured spreadsheet), here’s what I could imagine happening:

  1. Retailers could connect the bar code scanners they use on their shop floors to this data stream. If any cashier swiped this product at a checkout counter, they would be immediately notified and could prevent the product from being sold. This we could do today and it would be, in my mind, of high value – reducing the time and cost it takes to notify retailers as well as potentially saving lives.
  2. Mobile applications like RedLaser, which people use to scan bar codes and compare product prices could use this data to notify the user that the product they are looking at has been recalled. Apps like RedLaser still have a small user base, but they are growing. Probably not a game changer, but at least context sensitive.
  3. I could install a widget in my browser that, every time I’m on a website displaying that UPC and/or lot code, would notify me that the product has been recalled and I should not buy it. Here the potential is significant, especially as people buy more and more goods over the web.
  4. As we move towards having “smart” refrigerators that scan the RFID chips on products to determine what is in the fridge, they could simply notify me via a text message that I need to throw out my jar of Tahini of Sesame. This is a next generation use, but the government would be pushing private sector innovation in the space by providing the necessary and useful data. Every retailer is going to want to sell a “smart” fridge that doubles as a “safe” fridge, telling you when you’ve got a recalled item in it.
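
To make the idea concrete, here is a minimal sketch (in Python) of what a structured recall feed and the retailer-side check in point 1 might look like. The field names and the `is_recalled` helper are my invention for illustration – not an actual Health Canada or CFIA schema – and only the product data comes from the press release above (first two rows shown):

```python
from dataclasses import dataclass

@dataclass
class RecallRecord:
    """One machine-readable recall entry. Field names are illustrative."""
    product: str
    size: str
    upc: str        # digits only, for easy matching against scanner output
    lot_codes: str

# The press release table, restructured as data instead of buried in text.
RECALLS = [
    RecallRecord("Tahini of Sesame", "400gr", "6210431486128",
                 "Pro: 02/11/2010 and Exp: 01/11/2012"),
    RecallRecord("Tahini of Sesame", "1000gr", "6210431486302",
                 "Pro: 02/11/2010 and Exp: 01/11/2012"),
]

def is_recalled(scanned_upc: str) -> bool:
    """What a cash register or barcode app would call on every scan."""
    normalized = scanned_upc.replace(" ", "")
    return any(r.upc == normalized for r in RECALLS)

print(is_recalled("6 210431 486128"))  # True – block the sale
print(is_recalled("6 210431 999999"))  # False – not on the recall list
```

A real feed would of course live on a government server and be refreshed continuously, but the point stands: once the table is data rather than prose, the checkout-counter, RedLaser and browser-widget scenarios above all reduce to this one lookup.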

These are all far more citizen-centric, since they don’t require citizens to think, act or pay attention. In short, they aren’t broadcast-oriented; they feel customized, filtering information and delivering it where citizens need it, when they need it, sometimes without them even needing to know. (This is the same argument I made in my post How Yelp Could Help Save Millions in Health Care Costs.) The most exciting thing is that Health Canada already has all the data to do this; it’s just a question of restructuring it so it is of greater use to the various consumers of the data – from retailers, to app developers, to appliance manufacturers. This should not cost that much. (Health Canada, I know a guy…)

Another advantage of this approach is that it also gets the Government out of the business of trying to find ways to determine the best and most helpful way to share information. This appears to be a problem the UK government is also interested in solving. Richard A. sent me this excellent link in which a UK government agency appeals to the country’s developers to help imagine how it can better share information not unlike that being broadcast by Health Canada.

However, at the end of the day even this British example falls into the same problem – believing that the information is most helpfully shared through an app. The real benefit of this type of information (and open data in general) won’t be when you can create a single application with it, but when you can embed the information into systems and processes so that it can notify the right person at the right time.

That’s the challenge: abandoning a broadcast mentality and making things available for multiple contexts and easily embeddable. It’s a big culture shift, but for any government interested in truly exploring citizen-centric approach, it’s the key to success.

The State of Open Data in Canada: The Year of the License

Open Data is now an established fact in a growing list of Canadian cities. Vancouver, Toronto, Edmonton and Ottawa have established portals; Montreal, Calgary, Hamilton and other cities are looking into launching their own; and a few provinces are rumored to be exploring open data portals as well.

This is great news and a significant accomplishment. While at the national level Canada is falling further behind leaders such as England, the United States, Australia and New Zealand, at the local and potentially provincial/state level, Canada could position itself as an international leader.

There is however, one main obstacle: our licenses.

The current challenge:

So far most Open Data portals adopt what has been termed the Vancouver License (it was created by Vancouver for its open data portal and has subsequently been adopted, with occasional minor changes, by virtually every other jurisdiction).

The Vancouver license, however, suffers from a number of significant defects. As someone who was involved in its creation, I can say these “bugs” were a necessary tradeoff: had we held out for a perfect license that satisfied all stakeholders, I suspect we’d still be arguing about it and there would be no open data or data portal with the Vancouver license. Today, thanks in part to the existence of these portals, the understanding of this issue among our politicians, policy makers and government lawyers has expanded. This, combined with a growing number of complaints about the licenses from non-profits and businesses interested in using open data, has fostered growing interest in adjusting it.

This is encouraging. And we must capitalize on the moment. I wish to be clear: until Canadian governments get the licensing issue right, Open Data cannot advance in this country. Open Data released by governments will not enjoy significant reuse, undermining one of the main reasons for doing Open Data in the first place.

There are a few things everyone agrees a new license needs to cover. It must establish there is no warranty to the data and that the government cannot be held liable for any reuse. So let’s focus on the parts that governments most often get wrong.

Here, there are 3 things a new license needs to get right.

1. No Attribution


Nascar Jeff Gordon #24 by Dan Raustadt licensed CC-NC-ND

We need a license that does not require attribution. First, attribution gets messy fast – all those cities’ logos crammed onto a map, on a mobile phone? It’s fine when you are using data from one or two cities, but what happens when you start using data from 10 different governments, or 50? Pretty soon you’ll have NASCAR apps that will look ugly and be unusable.

More importantly, the goal of open data isn’t to create free advertising for governments; it’s to support innovation and reuse. These are different goals, and I think we agree on which one is more important.

Finally, what government is actually going to police this part of the license? Don’t demand what you aren’t going to enforce – and no government should waste precious resources by paying someone to scour the internet to find websites and apps that don’t attribute.

2. No Share alike

One area where the Vancouver license falls down is its share-alike requirement, found in this clause:

If you distribute or provide access to these datasets to any other person, whether in original or modified form, you agree to include a copy of, or this Uniform Resource Locator (URL) for, these Terms of Use and to ensure they agree to and are bound by them but without introducing any further restrictions of any kind.

The last phrase is particularly problematic as it makes the Vancouver license “viral.” Any new data created through a mash-up that involves data under the Vancouver license must also use the Vancouver license. This will pretty much eliminate any private sector use of the data, since companies will want to license any new data set they create in a manner appropriate to their business model. It also has a chilling effect on those who would like to use the data but would need to keep the resulting work private, or restricted to a limited group of people. Richard Weait has an unfortunately named blog post that provides an excellent example of this problem.

Any new license should not be viral, so as to encourage a variety of reuses of any data.

3. Standardized

The whole point of Open Data is to encourage the reuse of a public asset, so anything a government does that impedes this reuse will hamper innovation and undermine the very purpose of the initiative. Indeed, the open data movement has, in large part, come to life because one traditional impediment to using data has disappeared: data can now usually be downloaded in open formats that anyone can use. As this barrier to use has declined, more and more people have become interested.

But the other barrier to reuse is legal. If licenses are not easily understood, then individuals and businesses will not reuse data, even when it is easily downloadable from a government’s website. Building a business or a new non-profit activity on a public asset to which your rights are unclear is simply not viable for many organizations. This is why every government should want its license to be easily understood: lowering the barriers to access means both making data downloadable and reducing the legal barriers.

Most importantly, it is also why it is ideal if there is a single license in the whole country, as this would significantly reduce transaction and legal costs for all players. This is why I’ve been championing Canada’s leading cities to adopt a single common license.

So, there are two ways of doing this.

The easiest is for Canadian governments to align themselves with one of the international standardized open data licenses that already exist. There are a variety out there. My preference is the Open Data Commons Public Domain Dedication and License (PDDL), though Open Data Commons also publishes the Open Database License (ODC-ODbL) and the Attribution License (ODC-By). There is also the Creative Commons CC-0 license, which Creative Commons suggests using for open data. (I actually recommend against all of these except the PDDL for governments, but more on that later.)

These licenses have several advantages.

First, standardized licenses are generally well understood. This means people don’t have to educate themselves on the specifics of dozens of different licenses.

Second, they are stable. Because these licenses are managed by independent authorities and used by many people, they evolve cautiously and balance the interests of consumers and sharers of data or information.

Third, these licenses balance interests responsibly. Their creators have thought through all the issues that pertain to open data, giving both consumers and distributors of data comfort in knowing that they have a license that will work.

A second option is for governments in Canada to align around a self-generated common license. Indeed, this is one area where the Federal Government could show some (presently lacking) leadership, although GeoGratis does have a very good license. This, for example, appears to be happening in the UK, where the national government has created an Open Government Licence.

My hope is that, before the year is out, jurisdictions in Canada will begin to move towards a common license, or begin adopting standard licenses.

Specifically, it would be great to see various Canadian jurisdictions either:

a) Adopt the PDDL (like the City of Surrey, BC). There are some references to European data rights in the PDDL, but these have no meaning in Canada and should not be an obstacle; indeed, they may even reassure foreign consumers of Canadian data. The PDDL is the most open and forward-looking license.

b) Adopt the UK government’s Open Government Licence. This is the best license created by any government to date (with the exception of simply making the data public domain, which, of course, is far more ideal).

c) Use a modified version of the Geogratis license that adjusts the “3.0 PROTECTION AND ACKNOWLEDGEMENT OF SOURCE” clause to prevent the NASCAR effect from taking place.

What I hope does not happen is that:

a) More and more jurisdictions continue to use the Vancouver License. There are better options, and adopting one is an opportunity for any jurisdiction launching an open data policy to leapfrog the current leaders in the space.

b) Jurisdictions adopt a Creative Commons license. Creative Commons was created to help license copyrighted material. Since data cannot be copyrighted, the use of creative commons risks confusing the public about the inherent rights they have to data. This is, in part, a philosophical argument, but it matters, especially for governments. We – and our governments especially – cannot allow people to begin to believe that data can be copyrighted.

c) There is no change to the current licenses being used, or a new license, like Open Database License (ODC-ODbL) which goes against the attributes described above, is adopted.

Let’s hope we make progress on this front in 2011.

Open Knowledge Foundation Open Data Advocate

My colleagues over at the Open Knowledge Foundation have been thinking about recruiting an Open Data Advocate, someone who can coordinate a number of the activities they are up to in the open data space. I offered to think about what the role should entail and how that person could be effective. Consequently, in the interests of transparency, fleshing out my thinking and seeing if there might be feedback (feel free to comment openly, or email me personally if you wish to keep it private), I’m laying out my thinking below.

Context

These are exciting times for open government data advocates. Over the past few years a number of countries, cities and international organizations have launched open data portals and implemented open data policies. Many, many more are contemplating joining the fray. What makes this exciting is that some established players (e.g. the United States, UK, World Bank) continue to push forward and will, I suspect, be refining and augmenting their services in the coming months. At the same time there are still a number of laggards (e.g. Canada federally, Southern Europe, Asia) in which mobilizing local communities, engaging with public servants and providing policy support is still the order of the day.

This makes the role of an Open Data Advocate complex. Obviously, helping pull the laggards along is an important task. Alternatively (or in addition), they may also need to think longer term: where is open data going, and what will second- and third-generation open data portals need to look like (and what policy infrastructure will be needed to support them)?

These are two different goals and so either choosing, or balancing, between them will not be easy.

Key Challenges

Some of the key challenges spring quite obviously from that context. But there are other challenges that I believe are looming as well. So what do I suspect are the key challenges around open data over the next 1-5 years?

  1. Getting the laggards up and running
  2. Getting governments to use standardized licenses that are truly open (be it the PDDL, CC-0 or one of the other available licenses out there)
  3. Cultivating/fostering an eco-system of external data users
  4. Cultivating/fostering an eco-system of internal government users (and vendors) for open data (this is what will really make open data sustainable)
  5. Pushing jurisdictions and vendors towards adopting standard structures for similar types of data (e.g. wouldn’t it be nice if restaurant inspection data from different jurisdictions were structured similarly?)
  6. Raising awareness about abuses of, and the politicization of, data. (e.g. this story about crime data out of New York which has not received nearly enough press)

The Tasks/Leverage Points

There are some basic things that the role will require including:

  1. Overseeing the Working Group on Open Government Data
  2. Managing opengovernmentdata.org
  3. Helping organize the Open Government Data Camp 2011, 2012 and beyond

But what the role will really have to do is figure out the key leverage points that can begin to shift the key challenges listed above in the right direction. The above-mentioned tasks may be helpful in doing that… but they may not be. Success is going to be determined by figuring out how to shift systems (government, vendor, non-profit, etc.) to advance the cause of open data. This will be no small task.

My sense is that some of these leverage points might include:

  1. Organizing open data hackathons – ideally ones that begin to involve key vendors (both to encourage API development, but also to get them using open data)
  2. Leveraging assets like Civic Commons to get open data policies up online so that jurisdictions entertaining the issue can copy them
  3. Building open data communities in key countries around the world – particularly countries such as Brazil and India, where a combination of solid democratic institutions and a sizable developer community could help trigger changes that will have ramifications beyond their borders (I suspect there are also some key smaller countries – need to think more on that)
  4. I’m sure this list could be enhanced…

Metrics/Deliverables

Obviously resolving the above defined challenges in 1-5 years is probably not realistic. Indeed, resolving many of those issues is probably impossible – it will be a case of ensuring each time we peel back one layer of the onion we are well positioned to tackle the next layer.

Given this, at a high level, some key metrics by which the Open Knowledge Foundation could evaluate the person in this role might include:

  • Number of open data portals world wide? (number using CKAN?)
  • Number of groups, individuals, cities participating in Opendata hackathons
  • Number of applications/uses of open data
  • Awareness of CKAN and its mission in the public, developer space, government officials, media?
  • Number of government vendors offering open data as part of their solution

Additional deliverables could include:

  • Running two Global OpenData Hackathons a year?
  • Developing an OKFN consulting arm specializing in open data services/implementation
  • Create an open data implementation policy “in a box” support materials for implementing an open data strategy in government
  • Develop a global network of OKFN chapters to push their local and national governments, share best practices
  • Run opendata bootcamps for public servants and/or activists
  • Create a local open data hackathon in a box kit (to enable local events)
  • Create a local “how to be an open data activist” site
  • Conduct some research on the benefits of open data to advance the policy debate
  • Create a stronger feedback loop on CKAN’s benefits and weaknesses
  • Create a vehicle to connect VCs and/or money with open data-driven companies and app developers (or at least assess what barriers remain to using open data in business processes).

Okay, I’ll stop there, but if you have thoughts please send them or comment below. Hope this stimulates some thinking among fellow open data geeks.

How Yelp Could Help Save Millions in Health Care Costs

Okay, before I dive in, a few things.

1) Sorry for the lack of posts last week. Life’s been hectic. Between Code for America, a number of projects and a few articles I’m trying to get through, the blogging slipped. Sorry.

2) I’m presenting on Open Data and Open Government to the Canadian Parliament Access to Information, Privacy and Ethics Committee today – more on that later this week

3) I’m excited about this post

When it comes to opening up government data, many of us focus on governments: we cajole, we pressure, we try to persuade them to open up their data. It’s an approach we will continue to have to take for a great deal of the data our tax dollars pay to collect and that governments continue not to share. There is, however, another model.

Consider transit data. This data is sought after, intensely useful, and probably the category of data most experimented with by developers. Why is this? Because it has been standardized. Why has it been standardized? Because local governments (responding to citizen demand) have been desperate to get their transit data integrated with Google Maps (see image).

It turns out that, to get your transit data into Google Maps, Google insists you submit the data in a single structured format: what has come to be known as the General Transit Feed Specification (GTFS). The great thing about the GTFS is that it isn’t just Google that can use it. Anyone can play with data converted into the GTFS. Better still, because the data structure is standardized, an application someone develops, or analysis they conduct, can be ported to other cities that share their transit data in a GTFS format (like, say, my home town of Vancouver).
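
The standardization point can be shown in a few lines. A GTFS feed is essentially a zip of CSV files (stops.txt, routes.txt, trips.txt and so on), so the same parsing code works against any agency’s feed, whichever city it comes from. The sample rows below are made up for illustration:

```python
import csv
import io

# Two illustrative rows in the format of a GTFS stops.txt file.
# A real feed would be read from an agency's published zip archive.
sample_stops = """stop_id,stop_name,stop_lat,stop_lon
1,Main St & Broadway,49.2625,-123.1007
2,Granville & Georgia,49.2832,-123.1171
"""

# Because the columns are standardized, this loop is agency-agnostic:
# point it at Vancouver's feed or Portland's and it works unchanged.
stops = list(csv.DictReader(io.StringIO(sample_stops)))
for s in stops:
    print(s["stop_name"], float(s["stop_lat"]), float(s["stop_lon"]))
```

That portability is exactly why transit apps built for one city can be redeployed in another with almost no work.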

In short, what we have here is a powerful model both for creating open data and standardizing this data across thousands of jurisdictions.

So what does this have to do with Yelp! and Health Care Costs?

For those not in the know, Yelp! is a location-based rating service for mobile phones. I’m particularly a fan of its restaurant locator: it will show you what is nearby and how it has been rated by other users. Handy stuff.

But think bigger.

Most cities in North America inspect restaurants for health violations. This is important stuff. Restaurants with more violations are more likely to transmit diseases and food-borne illnesses, give people food poisoning and god knows what else. Sadly, in most cases the results of these inspections are posted in the most useless place imaginable: the local authority’s website.

I’m willing to wager almost anything that the only time anyone visits a food inspection website is after they have been food poisoned. Why? Because they want to know if the jerks have already been cited.

No one checks these agencies’ websites before choosing a restaurant. Consequently, one of the biggest benefits of the inspection data – shifting market demand to more sanitary options – is lost. And of course, there is real evidence that restaurants will improve their sanitation, and people will discriminate against restaurants that get poor ratings from inspectors, when the data is conveniently available. Indeed, in the book Full Disclosure: The Perils and Promise of Transparency, Fung, Graham and Weil noted that after Los Angeles required restaurants to post food inspection results, “Researchers found significant effects in the form of revenue increases for restaurants with high grades and revenue decreases for C-graded (poorly rated) restaurants.” More importantly, the study Fung, Graham and Weil reference also suggested that making the rating system public positively impacted healthcare costs. Again, after inspection results in Los Angeles were posted on restaurant doors (not on some never-visited website), the county experienced a reduction in emergency room visits, the most expensive point of contact in the system. As the study notes, these reductions were:

an 18.6 percent decline in 1998 (the first year of program operation), a 4.8 percent decline in 1999, and a 5.4 percent decline in 2000. This pattern was not observed in the rest of the state.

This is a stunning result.

So, now imagine that rather than just offering contributor-generated reviews of restaurants, Yelp! actually shared real food inspection data! Think of the impact this would have on the restaurant industry. Suddenly, everyone with a mobile phone and Yelp! (it’s free) could make an informed decision not just about the quality of a restaurant’s food, but also based on its sanitation. Think of the millions (100s of millions?) that could be saved in the United States alone.

All that needs to happen is a simple first step: Yelp! needs to approach one major city – say a New York, or a San Francisco – and work with them to develop a sensible way to share food inspection data. This is what happened with Google Maps and the GTFS; it all started with one city. Once Yelp! develops the feed, call it something generic, like the General Restaurant Inspection Data Feed (GRIDF), and tell the world you are looking for other cities to share their data in that format. If they do, you promise to include it in your platform. I’m willing to bet anything that once one major city has it, other cities will start to clamor to get their food inspection data shared in the GRIDF format. What makes it better still is that it wouldn’t just be Yelp! that could use the data. Any restaurant review website or phone app could use it – be it Urban Spoon or the New York Times.
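
For what it’s worth, here is a rough sketch of what a single GRIDF record might look like. The GRIDF doesn’t exist, so every field name here is a guess at a minimal, joinable format; the point is only that any consumer – Yelp!, Urban Spoon, a newspaper – would parse the same structure:

```python
import json

# One hypothetical GRIDF inspection record. All fields and values are
# illustrative guesses at what a minimal shared format might contain.
gridf_record = {
    "restaurant_name": "Example Diner",
    "address": "123 Main St",
    "inspection_date": "2011-01-15",
    "grade": "A",
    "violations": [
        {"code": "04L", "description": "Evidence of pests", "critical": True},
    ],
}

# A city would publish a feed of such records...
feed = json.dumps([gridf_record])

# ...and any review site or app would consume it the same way.
for r in json.loads(feed):
    print(r["restaurant_name"], r["grade"], len(r["violations"]))
```

The design choice that matters is the join key: as long as the record carries enough to match a listing (name plus address, or ideally a shared identifier), every downstream app can overlay the inspection grade on its own restaurant pages.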

The opportunity here is huge. It’s also a win for everyone: Consumers, Health Insurers, Hospitals, Yelp!, Restaurant Inspection Agencies, even responsible Restaurant Owners. It would also be a huge win for Government as platform and open data. Hey Yelp. Call me if you are interested.