Category Archives: open data

Are you a Public Servant? What are your Open Data Challenges?

A number of governments have begun to initiate open data and open government strategies. With more governments moving in this direction a growing number of public servants are beginning to understand the issues, obstacles, challenges and opportunities surrounding open data and open government.

Indeed, these challenges are why many of these public servants frequent this blog.

This is precisely why I’m excited to share that, along with the Sunlight Foundation, the Personal Democracy Forum, Code for America, and GovLoop, I am helping Socrata with a recently launched survey of government employees at the national, regional and local levels, in the US and abroad, about the progress of Open Data initiatives within their organizations.

If you are a government employee, please consider taking time to help us understand the state of Open Data in government. The survey is comprehensive, but given how quickly this field, and the policy questions that come with it, are expanding, I think the collective result of our work could be useful. So, with all that said, I know you’re busy, but I hope you’ll consider taking 10 minutes to fill out the survey. You can find it at: http://www.socrata.com/benchmark-study.

Creating Open Data Apps: Lessons from Vantrash Creator Luke Closs

Last week, as part of the Apps for Climate Action competition (which is open to anyone in Canada), I interviewed the always awesome Luke Closs. Luke, along with Kevin Jones, created VanTrash, a garbage pick up reminder app that uses open data from the City of Vancouver. In it, Luke shares some of the lessons learned while creating an application using open data.

As the deadline for the Apps for Climate Action competition approaches (August 8th), we thought this might help those who are thinking about throwing their hat in the ring at the last minute.

Some key lessons from Luke:

  • Don’t boil the ocean: Keep it simple – do one thing really, really well.
  • Get a beta up fast: Try to scope something you can get a rough version of working in a day or an evening – that is a sure sign that it is doable.
  • Beta test: On friends and family. A lot.
  • Keep it fun: do something that develops a skill or lets you explore a technology you’re interested in.

Ministerial Twitter Battle! $130M of taxpayer dollars wasted! Conspiracy theories!

Who knew the census could be so exciting.

Yesterday, I published Why you should care about the sudden demise of the mandatory long census form on the Globe and Mail website (also can be found here on this blog).

One interesting impact of the piece was that it generated the following debate between the Minister and a Laval statistics professor. Ultimately, the professor’s concerns about the reliability of data generated by a voluntary long form remain unresolved. In short, as the former chief statistician of Statistics Canada also noted, a $100-million survey may generate useless data.

UPDATE: Turns out there is now a petition to save the long form census here.

In addition to the online debate here are some other interesting facts about the end of the mandatory long census form:

1. It will be more expensive to implement.

For a government that is supposed to be fiscally conservative, ending the mandatory form could actually cost Canadian taxpayers an additional $30 million. As the Canadian Press reports:

“The cost of the change could reach $30 million, says Statistics Canada: $5 million for the additional mailout, and $25 million in case there is a major problem in getting people to respond.”

So, to sum up: Canadians will pay more to get data that risks being useless and skewed. Total waste of taxpayer dollars: $130M.

2. No one can identify who wanted this change, not even the Minister.

Again, from the same Canadian Press article:

Industry Minister Tony Clement acknowledged in an interview that no consultations were undertaken on the decision.

He said it was based on the fact that many Canadians had complained of the coercive and intrusive nature of the census, but Clement had not seen polling on the issue.

How many Canadians? What is clear is that a large number of people are stepping forward to say this was a bad idea. So far every major municipality – through the Federation of Canadian Municipalities – is complaining about the decision; so too are the Canadian and Toronto Associations of Business Economists, the Canadian Council on Social Development, the Canadian Association of University Teachers, numerous academics and, of course, the former chief statistician of Canada, who says he would have quit rather than carry out the order to end the mandatory long census form.

3. So who has complained? (this is the best part)

So far all the stories about Canadians who have complained about the census refer to two individuals. The Canadian Press story (which has been reprinted in several forms in a number of newspapers) references Sandra Finley, a Saskatoon activist who is still fighting in court after refusing to fill out the 2006 census. In addition to Ms. Finley, the Globe also referenced Don Rogers, a Kingston, Ont., man who mounted the “Count Me Out” campaign against the census.

In both cases the protesters’ core problem was not the coercive or intrusive nature of the census (the concern the minister seeks to address) but the fact that Statistics Canada bought software from defence manufacturer Lockheed Martin back in 2003. Indeed, the Count Me Out website is plastered with a hodge-podge of conspiracy theories about NAFTA, Lockheed Martin and the Canadian census funding ballistic missile tests.

Is this the complaint upon which the minister is grounding his decision?

4. Canada is alone

As far as the former Chief Statistician can tell, no other country in the world has a voluntary portion to its census.

Taking these four issues together, one is left wondering – is this how cabinet decisions are made?

Update 9:17am July 7th

Liberals issue a press release stating that:

“If the Conservatives don’t reverse their decision, Liberals are prepared to explore the introduction of an amendment to the Statistics Act to ensure a comprehensive, mandatory long-form stays,” said Ms. Jennings. “This decision, made in secret, without any consultation, is dangerous because the information that will be lost is used to help Canadians in their daily lives – particularly our most vulnerable citizens.”

Update 8:30am July 7th

Other articles raising concerns about this decision:

Winnipeg Free-Press: Anti-census crusader not satisfied with federal axing of long form

Globe and Mail: Tories Scrap Mandatory Long Form Census

Saskatoon Star-Phoenix: Detailed census data invaluable for sound policy

Montreal Gazette: Canadians must be able to count on Statistics Canada

Vancouver Sun Editorial: Canada needs its citizens to stand up and be counted

Victoria Times Colonist: Census shortcut bad for Canada

Edmonton Journal: Ridiculous to scrap key census data

(Plus many more who ran the Canadian Press story and others I just didn’t paste in here).

Articles supportive about this decision:

zero

Your Government Just Got Dumber: how it happened and why it matters to you

This piece was published in the Globe and Mail today, so it is always nice if you read it there and let them know it matters to you.

Last week the Conservative Government decided that it would kill the mandatory long census form it normally sends out to thousands of Canadians every five years. On the surface such a move may seem unimportant and, to many, uninteresting, but it has significant implications for every Canadian and every small community in Canada.

Here are 3 reasons why this matters to you:

1. The Death of Smart Government

Want to know who the biggest user of census data is? The government. Understanding what services are needed, where problems or opportunities may arise, or how a region is changing depends on having accurate data. The federal government, but also the provincial and, most importantly, local governments, use Statistics Canada’s data every day to find ways to save taxpayers money, improve services and make plans. Now, at the very moment when – thanks to computers – governments are finding new ways to use this information more effectively than ever before, it is to be cut off.

To be clear, this is a direct attack on the ability of government to make smart decisions. In fact, it is an attack on evidence-based public policy. Moreover, it was a political decision – it came from the Minister’s office and does not appear to reflect what Statistics Canada either wants or recommends. Of course, some governments prefer not to have information; all that data and evidence gets in the way of legislation and policies that are ineffective, costly and that reward vested interests (I’m looking at you, Crime Bill).

2. The Economy is Less Competitive

But it isn’t just government that will suffer. In a 21st-century economy, data and information are at the heart of economic activity: they drive innovation, efficiency and productivity. Starve our governments, NGOs, businesses and citizens of data and you limit the wealth a 21st-century economy will generate.

Like roads in the 20th-century economy, data is the core infrastructure of a 21st-century economy. While it may seem like just a boring public asset, it can nonetheless foster big companies, jobs and efficiencies. Roads spawned GM. Today, people often fail to recognize that the largest company yet created by the new economy – Google – is a data company. Google is effective and profitable not because it sells ads, but because it generates and leverages petabytes of data every day from billions of search queries. This allows it to provide all sorts of useful services, such as pointing us, with uncanny accuracy, to merchandise and services we want, or better yet, spam we’d like to avoid. It can even predict when communities will experience flu epidemics four months in advance.

And yet, it is astounding that the Minister in charge of Canada’s digital economy, the minister who should understand the role of information in a 21st-century economy, is the minister who authorized killing the creation of this data. In doing so he will deprive Canadians and their businesses of information that would make them, and thus our economy, more efficient, productive and profitable. Of course, the big international companies will probably be able to find the money to do their own augmented census, so those that will really suffer will be small and medium-sized Canadian businesses.

3. Democracy Just got Weaker

Of course, the most important users of the data created by the census aren’t governments or businesses. They are ordinary Canadians. In theory, the census creates a level playing field in public policy debates. Were Statistics Canada’s website usable and its data accessible (data, may I remind you, we’ve already paid for), then citizens could use this information to fight ineffective legislation, unjust policies, or wasteful practices. In a world where this information won’t exist, those who are able to pay for the creation of this information – read: large companies – will have an advantage not only over citizens, but over our governments (which, of course, won’t have this data anymore either). Today, the ability of ordinary citizens to defend themselves against government and businesses just got weaker.

So who’s to blame? Tony Clement, the Minister of Industry, who oversees Statistics Canada, is to blame. His office authorized this decision. But Statistics Canada also shares in the blame. In an era where the internet has flattened the cost of distributing information, Statistics Canada: continues to charge citizens for data their tax dollars already paid for; has an unnavigable website where it is impossible to find anything; and often distributes data in formats that are hard to use. In short, for years the department has made its data inaccessible to ordinary Canadians. As a result, it isn’t hard to see why most Canadians don’t know about or understand this issue. Sadly, once they do wake up to the cost of this terrible decision, I fear it will be too late.

Open Canada – Hello Globe and Mail?

Richard Poynder has a wonderful (and detailed) post on his blog Open and Shut about the state of open data in the UK. Much of it covers arguments about why open data matters economically and democratically (the case I’ve been making as well). It is worthwhile reading for policy makers and engaged citizens.

There is however a much more important lesson buried in the article. It is in regard to the role of the Guardian newspaper.

As many of you know, I’ve been advocating for Open Data at all levels of government, and in particular, at the federal level. This is why I and others created datadotgc.ca: if the government won’t create an open data portal, we’ll create one for them. The goal, of course, was to show them that it already does open data, and that it could do a lot, lot more (a v2 of the site, offering much cooler functionality, is in the works).

What is fascinating about Poynder’s article is the important role the Guardian has played in bringing open data to the UK. Consider this small excerpt from his post.

For The Guardian the release of COINS marks a high point in a crusade it began in March 2006, when it published an article called “Give us back our crown jewels” and launched the Free Our Data campaign. Much has happened since. “What would have been unbelievable a few years ago is now commonplace,” The Guardian boasted when reporting on the release of COINS.

Why did The Guardian start the Free Our Data campaign? Because it wanted to draw attention to the fact that governments and government agencies have been using taxpayers’ money to create vast databases containing highly valuable information, and yet have made very little of this information publicly available.

The lesson here is that a national newspaper in the UK played a key role in pressuring a system of government virtually identical to our own (now also governed by a minority, Conservative-led government) to release one of the most important datasets in its possession – the Combined Online Information System (COINS). This on top of postal codes and the equivalent of what we would find in Stats Canada’s databases.

All this leads me to ask one simple question: where is the Globe and Mail? I’m not sure its editors have written a single piece calling for open data (am I wrong here?). Indeed, I’m not even sure the issue is on their radar. The paper has certainly done nothing close to launching a “national campaign.” It could do the Canadian economy, democracy and journalism a world of good. Open data can be championed by individual advocates such as myself, but having a large media player raising the issue time and time again brings to bear the type of pressure few individuals can muster.

All this to say, if the Globe ever gets interested, I’m here. Happy to help.

Canadian Open Cities Update

For those who have not been following the news there have been a couple of exciting developments on the open data front at the municipal level in Canada.

First off, the City of Edmonton has launched its Apps competition, details can be found at the Apps4Edmonton website.

Second, it looks like the City of London, Ontario may do a pilot of open data – thanks to the vocal activism of local developers and community organizers, the Mayor of London expressed interest in doing a pilot at the London ChangeCamp. As mentioned, there is a vibrant and active community in London, Ontario, so I hope this effort takes flight.

Third, and somewhat older news, Ottawa approved doing open data, so keep an eye on this website as things begin to take shape.

The final municipal update is the outlier… It turns out that although Calgary passed a motion to do open data a few months ago, the roll-out keeps getting delayed by a small group of city councillors. The reasons are murky, especially since I’m told by local activists that the funds have already been allocated and that everything is set to go. I will be watching this unfold with interest.

Finally, unrelated to municipal data, but still important (!), Apps4Climate Action has extended the contest deadline due to continued interest in the contest. The new submission deadline is August 8th.

Hope everyone has a great weekend. Oh, and if you haven’t already, please join the Facebook group “Let’s get 100,000 Canadians to opt out of yellow pages delivery.” Already, in less than a week, over 800 Canadians have successfully opted out of receiving the yellow pages. Hope you’ll join too.

Learning from Libraries: The Literacy Challenge of Open Data

We didn’t build libraries for a literate citizenry. We built libraries to help citizens become literate. Today we build open data portals not because we have public policy literate citizens, we build them so that citizens may become literate in public policy.

Yesterday, in a brilliant article on The Guardian website, Charles Arthur argued that a global flood of government data is being opened up to the public (sadly, not in Canada) and that we are going to need an army of people to make it understandable.

I agree. We need a data-literate citizenry, not just a small elite of hackers and policy wonks. And the best way to cultivate that broad-based literacy is not to release in small or measured quantities, but to flood us with data. To provide thousands of niches that will interest people in learning, playing and working with open data. But more than this we also need to think about cultivating communities where citizens can exchange ideas as well as involve educators to help provide support and increase people’s ability to move up the learning curve.

Interestingly, this is not new territory.  We have a model for how to make this happen – one from which we can draw lessons or foresee problems. What model? Consider a process similar in scale and scope that happened just over a century ago: the library revolution.

In the late 19th and early 20th century, governments and philanthropists across the western world suddenly became obsessed with building libraries – lots of them. Everything from large ones like the New York Main Library to small ones like the thousands of tiny, one-room county libraries that dot the countryside. Big or small, these institutions quickly became treasured and important parts of any city or town. At the core of this project was that literate citizens would be both more productive and more effective citizens.

But like open data, this project was not without controversy. It is worth noting that at the time some people argued libraries were dangerous. Libraries could spread subversive ideas – especially about sexuality and politics – and that giving citizens access to knowledge out of context would render them dangerous to themselves and society at large.  Remember, ideas are a dangerous thing. And libraries are full of them.

Cora McAndrews Moellendick, a Master of Library Studies student who draws on the work of Geller, sums up the challenge beautifully:

…for a period of time, censorship was a key responsibility of the librarian, along with trying to persuade the public that reading was not frivolous or harmful… many were concerned that this money could have been used elsewhere to better serve people. Lord Rodenberry claimed that “reading would destroy independent thinking.” Librarians were also coming under attack because they could not prove that libraries were having any impact on reducing crime, improving happiness, or assisting economic growth, areas of keen importance during this period… (Geller, 1984)

Today when I talk to public servants, think tank leaders and others, most grasp the benefit of “open data” – of having the government share the data it collects. A few, however, talk about the problem of just handing data over to the public. Some question whether the activity is “frivolous or harmful.” They ask “what will people do with the data?”, “they might misunderstand it” or “they might misuse it.” Ultimately they argue we can only release this data “in context.” Data, after all, is a dangerous thing. And governments produce a lot of it.

As in the 19th century, these arguments must not prevail. Indeed, we must do the exact opposite. Charges of “frivolousness” or a desire to ensure data is only released “in context” are code to obstruct or shape data portals to ensure that they only support what public institutions or politicians deem “acceptable”. Again, we need a flood of data, not only because it is good for democracy and government, but because it increases the likelihood of more people taking interest and becoming literate.

It is worth remembering: We didn’t build libraries for an already literate citizenry. We built libraries to help citizens become literate. Today we build open data portals not because we have a data or public policy literate citizenry, we build them so that citizens may become literate in data, visualization, coding and public policy.

This is why coders in cities like Vancouver and Ottawa come together for open data hackathons, to share ideas and skills on how to use and engage with open data.

But smart governments should not only rely on small groups of developers to make use of open data. Forward-looking governments – those that want an engaged citizenry, a 21st-century workforce and a creative, knowledge-based economy in their jurisdiction – will reach out to universities, colleges and schools and encourage them to get their students using, visualizing, writing about and generally engaging with open data. Not only to help others understand its significance, but to foster a sense of empowerment and sense of opportunity among a generation that could create the public policy hacks that will save lives, make public resources more efficient and effective and make communities more livable and fun. The recent paper published by the University of British Columbia students who used open data to analyze graffiti trends in Vancouver is a perfect early example of this phenomenon.

When we think of libraries, we often just think of a building with books. But 19th-century libraries mattered not only because they had books, but because they offered literacy programs, book clubs, and other resources to help citizens become literate and thus more engaged and productive. Open data catalogs need to learn the same lesson. While they won’t require the same centralized and costly approach as 19th-century libraries, governments that help foster communities around open data, that encourage their school systems to use it as a basis for teaching, and that support their citizens’ efforts to write and suggest their own public policy ideas will, I suspect, benefit from happier and more engaged citizens, along with better services and stronger economies.

So what is your government/university/community doing to create its citizen army of open data analysts?

Apps for Climate Action Update – Lessons and some new sexy data

Okay, so I’ll be the first to say that the Apps4Climate Action data catalog has not always been the easiest to navigate, and some of the data sets have not been machine readable, or even data at all.

That however, is starting to change.

Indeed, the good news is three fold.

First, the data catalog has been tweaked: it now has better search and an improved capacity to sort out non-machine-readable data sets. This is a great example of a government starting to think like the web, iterating and learning as the program progresses.

Second, and more importantly, new and better sets are starting to be added to the catalog. Most recently the Community Energy and Emissions Inventories were released in an excel format. This data shows carbon emissions for all sorts of activities and infrastructure at a very granular level. Want to compare the GHG emissions of a duplex in Vancouver versus a duplex in Prince George? Now you can.

Moreover, this is the first time any government has released this type of data at all, let alone made it machine readable. So not only have the app possibilities (how green is your neighbourhood, rate my city, calculate my GHG emissions) become much more realizable, but any app using this data will be among the first in the world.
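To give a sense of what an app built on a machine-readable inventory looks like in practice, here is a minimal sketch of the kind of comparison described above. The column names and emissions figures are invented for illustration; the real Community Energy and Emissions Inventory spreadsheet will use different fields and values.

```python
import csv
import io

# Invented excerpt of a Community Energy and Emissions Inventory export;
# real column names and figures will differ.
data = """community,building_type,co2e_tonnes
Vancouver,duplex,4.2
Prince George,duplex,6.8
"""

rows = list(csv.DictReader(io.StringIO(data)))

# Pull out per-community duplex emissions so the two cities can be compared.
duplex_emissions = {
    row["community"]: float(row["co2e_tonnes"])
    for row in rows
    if row["building_type"] == "duplex"
}

for community, tonnes in sorted(duplex_emissions.items()):
    print(f"{community}: {tonnes} tonnes CO2e")
```

Because the data is structured, the same few lines work whether you are comparing two duplexes or ranking every community in the province.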

Finally, probably one of the most positive outcomes of the app competition to date is largely hidden from the public. The fact that members of the public have been asking for better data – or even for data sets at all(!) – has made a number of public servants realize the value of making this information public.

Prior to the competition, making data public was a compliance problem – something you did while figuring no one would ever look at or read it. Now, for a growing number of public servants, it is an innovation opportunity. Someone may take what the government produces and do something interesting with it. Even if they don’t, someone is nonetheless taking an interest in your work – something that has rewards in and of itself. This, of course, doesn’t mean that things will improve overnight, but it does help advance the goal of getting government to share more machine-readable data.

Better still, the government is reaching out to stakeholders in the development community and soliciting advice on how to improve the site and the program, all in a cost-effective manner.

So even within the Apps4Climate Action project we see some of the changes the promise of Government 2.0 holds for us:

  • Feedback from community participants driving the project to adapt
  • Iterations of development conducted “on the fly” during a project or program
  • Successes and failures quickly resulting in improvements (release of more data, a better website)
  • Shifting culture around disclosure and cross sector innovation
  • All on a timeline that can be measured in weeks

Once this project is over I’ll write more on it, but wanted to update people, especially given some of the new data sets that have become available.

And if you are a developer or someone who would like to do a cool visualization with the data, check out the Apps4Climate Action website or drop me an email, happy to talk you through your idea.

Half victory in making BC local elections more transparent

Over the past few months the British Columbia government (my home province – or for my American friends – state) has had a taskforce looking at reforming local (municipal) election rules.

During the process I submitted a suggestion to the taskforce outlining why campaign finance data should be made available online and in a machine-readable format (i.e. so you can open it in Microsoft Excel or Google Docs, for example).
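To illustrate what machine-readable disclosure buys you: once statements are published as structured data rather than scanned ledgers, totalling contributions per candidate takes a few lines of code. The sample below is entirely invented – the field names, candidates and amounts are illustrative only, not the actual disclosure format.

```python
import csv
import io
from collections import defaultdict

# Invented sample of a machine-readable campaign finance disclosure;
# real field names and values would differ.
disclosure = """candidate,donor,amount
A. Smith,Acme Ltd,500.00
A. Smith,J. Doe,150.00
B. Jones,J. Doe,300.00
"""

# Sum contributions per candidate across all disclosure rows.
totals = defaultdict(float)
for row in csv.DictReader(io.StringIO(disclosure)):
    totals[row["candidate"]] += float(row["amount"])

for candidate, total in sorted(totals.items()):
    print(f"{candidate}: ${total:,.2f}")
```

The same approach scales to every candidate in every municipality at once – exactly the kind of analysis a scanned image file makes impossible.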

Yesterday the taskforce published their conclusions and… they kind of got it right.

At first blush, things look great… The press release and taskforce homepage list, as one of the core recommendations:

Establish a central role for Elections BC in enforcement of campaign finance rules and in making campaign finance disclosure statements electronically accessible

Looks promising… yes? Right. But note the actual report (which, ironically, is only available as a PDF, so I can’t link to the specific recommendations… sigh). The recommendation around disclosure reads:

Require campaign finance disclosure information to be published online
and made centrally accessible though Elections BC

and the explanatory text reads:

Many submissions suggested that 120 days is too long to wait for disclosure reports, and that the public should be able to access disclosure information sooner and more easily. Given the Task Force’s related recommendations on Elections BC’s role in overseeing local campaign finance rules, it is suggested that Elections BC act as a central repository of campaign finance disclosure statements. Standardizing disclosure statement forms is of practical importance if the statements are to be published online and centrally available, and would help members of the public, media and academia analyze the information. [my italics]

My take? The spirit of the recommendation is that campaign finance data be machine readable – that you should be able to download it, open it, and play with it on your own computer. However, a literal reading of this text suggests that simply scanning account ledgers and sharing them as image files or unstructured PDFs might suffice.

This would essentially replicate what generally happens at present and so would not mark a step forward. Another equally bad outcome? That the information gets shared in a manner similar to the way federal MP campaign data is shared on the Elections Canada website, where it cannot be easily downloaded and you are only allowed to look at one candidate’s financial data at a time. (The Elections Canada site is almost designed to prevent you from effectively analyzing campaign finance data.)

So in short, the Taskforce members are to be congratulated as I think their intentions were bang on: they want the public to be able to access and analyze campaign finance data. But we will need to continue to monitor this issue carefully as the language is vague enough that the recommendation may not produce the desired outcome.

Open Data: An Example of the Long Tail of Public Policy at Work

As many readers know, Vancouver passed what has locally been termed the Open3 motion a year ago and has had an open data portal up and running for several months.

Around the world, much of the attention paid to open data initiatives has focused on the development of applications like Vancouver’s VanTrash, Washington DC’s Stumble Safely or Toronto’s Childcare locator. But the other use of data portals is to actually better understand and analyze phenomena in a city – all of which can potentially lead to a broader diversity of perspectives, better public policy and a more informed public and decision makers.

I was thus pleased to find out about another example of what I’ve been calling the Long Tail of Public Policy when I received an email from Victor Ngo, a student at the University of British Columbia who just completed his 2nd year in the Human Geography program with an Urban Studies focus (He’s also a co-op student looking for a summer job – nudge to the City of Vancouver).

It turns out that last month, he and two classmates did a project on graffiti occurrence and its relationship to land use, crime rates, and socio-economic variables. As Victor shared with me:

It was a group project I did with two other members in March/April. It was for an introductory GIS class and given our knowledge, our analysis was certainly not as robust and refined as it could have been. But having been responsible for GIS analysis part of the project, I’m proud of what we accomplished.

The “Graffiti sites” shapefile was very instrumental to my project. I’m a big fan of the site and I’ll be using it more in the future as I continue my studies.

So here we have University students in Vancouver using real city data to work on projects that could provide some insights, all while learning. This is another small example of why open data matters. This is the future of public policy development. Today Victor may be a student, less certain about the quality of his work (don’t underestimate yourself, Victor) but tomorrow he could be working for government, a think tank, a consulting firm, an insurance company or a citizen advocacy group. But wherever he is, the open data portal will be a resource he will want to turn to.

With Victor’s permission I’ve uploaded his report, Graffiti in the Urban Everyday – Comparing Graffiti Occurrence with Crime Rates, Land Use, and Socio-Economic Indicators in Vancouver, to my site so anyone can download it. Victor has said he’d love to get people’s feedback on it.

And what was the main drawback of using the open data? There wasn’t enough of it.

…one thing I would have liked was better crime statistics, in particular, the data for the actual location of crime occurrence. It would have certainly made our analysis more refined. The weekly Crime Maps that the VPD publishes is an example of what I mean:

http://vancouver.ca/police/CrimeMaps/index.htm

You’re able to see the actual location where the crime was committed. We had to tabulate data from summary tables found at:

http://vancouver.ca/police/organization/planning-research-audit/neighbourhood-statistics.html

To translate: the city essentially releases this information in a non-machine-readable format, meaning that citizens, public servants at other levels of government and (I’m willing to wager) City of Vancouver public servants outside the police department have to recreate the data in a digital format. What a colossal waste of time and energy. Why not just share the data in a structured digital form? The city already makes it public; why not make it useful as well? This is what Washington DC (search crime) and San Francisco have done.
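For a sense of the difference structured data makes: with incident-level records in even a simple CSV, counting crimes per neighbourhood becomes a one-liner instead of an afternoon of manual tabulation from summary tables. The records below are invented for illustration – the field names and rows are not the VPD’s actual data, which (as noted above) is only published as maps and summary tables.

```python
import csv
import io
from collections import Counter

# Invented incident-level extract, in the spirit of what Washington DC
# and San Francisco publish; field names and rows are illustrative only.
incidents = """neighbourhood,offence,date
Downtown,theft from auto,2010-06-01
Downtown,mischief,2010-06-02
Kitsilano,theft from auto,2010-06-03
"""

# Tally incidents by neighbourhood directly from the structured records.
counts = Counter(
    row["neighbourhood"] for row in csv.DictReader(io.StringIO(incidents))
)
print(counts.most_common())
```

This is precisely the kind of analysis Victor’s team had to reconstruct by hand because the underlying data wasn’t released in a structured form.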

I hope that more apps get created in Vancouver, but as a public policy geek, I’m also hoping that more reports like these (and the one Bing Thom architects published on the future of Vancouver also using data from the open data catalog) get published. Ultimately, more people learning, thinking, writing and seeking solutions to our challenges will create a smarter, more vibrant and more successful city. Isn’t that what you’d want your city government (or any government, really…) to do?