Tag Archives: statscan

StatsCan's free data costs $2M – a rant

So the other day a reader sent me an email pointing me to a story in iPolitics titled “StatsCan anticipates $2M loss from move to open data” and asked me what I thought.

Frustrated, was my response.

$2M is not a lot of money. Not in a federal budget of almost $200B. And the number may be less. The StatsCan person quoted in the article called this expected loss of revenue a “maximum net loss.” This may mean that the estimated loss from making the data free does not take into account the fact that StatsCan’s expenditures may also go down. For instance, if StatsCan no longer has to handle as many financial transactions or chase down invoices and so forth, the reduction in staff and other overhead (unrelated to its core mission, by the way) could result in lower operating costs not reflected in the $2M cited above.

Moreover it is still unclear to me where the $2M figure comes from. As I noted in a blog post earlier this year, in StatsCan’s own reports it outlined that its online database (the one just made free) generated $559,000 in revenue (not profit) in 2007-08 and was estimated to generate $525,000 in revenue in 2010-11. Where does the extra $1.5M come from? I’m open to the fact that I’m reading these reports incorrectly… but it is hard to see how.

But all this is really an aside.

What really, really, really frustrates me is the hard number of $2M. It is a pittance.

This is the unbearable cost that’s been holding up open StatsCan data for years? This may be the tiniest golden goose ever killed. Maybe more like a lame duck. Can anyone believe the loss of $2M (or 500K) was going to break the organization?

Give me a break.

What a colossal lack of imagination and sense of economic and social prosperity on the part of every government since Mulroney (who made StatsCan engage in cost recovery). In the United States open statistical data has helped businesses, the social sector, local and state governments, as well as researchers and academics. Heck, even Canadian teachers tell me that they’ve been forced to train students on US data because they couldn’t afford to train their students on Canadian data. All this lost innovation, efficiency, jobs and social benefits for a measly $2M (if that). Oh lack of vision, at all levels! Both at the top of the political order, and within StatsCan, which has been reluctant to go down this route for years.

Now that we see the “cost” this battle seems more pathetic than ever.

Sigh. Rant over.

Statistics Canada Data to become OpenData – Background, Winners and Next Steps

As some of you learned last night, Embassy Magazine broke the story that all of Statistics Canada’s online data will not only be made free, but released under the Government of Canada’s Open Data License Agreement (updated and reviewed earlier this week) that allows for commercial re-use.

This decision has been in the works for months, and while it does not appear to have been formally announced, Embassy Magazine does appear to have managed to get a Statistics Canada spokesperson to confirm it is true. I have a few thoughts about this story: Some background, who wins from this decision, and most importantly, some hope for what it will, and won’t lead to next.

Background

In the Embassy article, the spokesperson claimed this decision had been in the works for years, something that is probably technically true. Such a decision – or something akin to it – has likely been contemplated a number of times. And there have been a number of trials and projects that have allowed for some data to be made accessible, albeit under fairly restrictive licenses.

But it is less clear that the culture of open data has arrived at StatsCan, and less clear to me that this decision was internally driven. I’ve met many a StatsCan employee who encountered enormous resistance while advocating for open data. I remember pressing the issue during a talk at one of the department’s middle managers’ conferences in November of 2008 and seeing half the room nod vigorously in agreement, while the other half crossed its arms in strong disapproval.

Consequently, with the federal government increasingly interested in open data, coupled with a desire to have a good news story coming out of StatsCan after last summer’s census debacle, and with many decisions in Ottawa happening centrally, I suspect this decision occurred outside the department. This does not diminish its positive impact, but it does mean that a number of the next steps, many of which will require StatsCan to adapt its role, may not happen as quickly as some will hope, as the organization may take some time to come to terms with the new reality and the culture shift it will entail.

This may be compounded by the fact that there may be tougher news on the horizon for StatsCan. With every department required to submit proposals to cut their budgets by either 5% or 10%, and with StatsCan having already seen a number of its programs cut, there may be fewer resources in the organization to take advantage of the opportunity that making its data open creates, or even just to adjust to what has happened.

Winners (briefly)

The winners from this decision are, of course, consumers of StatsCan’s data. Indirectly, this includes all of us, since provincial and local governments are big consumers of StatsCan data and so now – assuming it is structured in such a manner – they will have easier (and cheaper) access to it. This is also true of large companies and non-profits which have used StatsCan data to locate stores, target services and generally allocate resources more efficiently. The opportunity now opens for smaller players to also benefit.

Indeed, this is the real hope. That a whole new category of winners emerges. That the barrier to use for software developers, entrepreneurs, students, academics, smaller companies and non-profits will be lowered in a manner that will enable a larger community to make use of the data and therefore create economic or social goods.

Such a community, however, will take time to evolve, and will benefit from support.

And finally, I think StatsCan is a winner. This decision brings it more profoundly into the digital age. It opens up new possibilities and, frankly, pushes a culture change that I believe is long overdue. I suspect times are tough at StatsCan, although not as a result of this decision. If anything, this decision creates room to rethink how the department works and thinks.

Next Steps

The first thing everybody will be waiting for is to see exactly what data gets shared, in what structure and to what detail. Indeed this question arose a number of times on Twitter with people posting tweets such as “Cool. This is all sorts of awesome. Are geo boundary files included too, like Census Tracts and postcodes?” We shall see. My hope is yes and I think the odds are good. But I could be wrong, at which point all this could turn into the most overhyped data story of the year. (Which actually matters now that data analysts are one of the fastest growing categories of jobs in North America).

Second, open data creates an opportunity for a new and more relevant role for StatsCan to a broader set of Canadians. Someone from StatsCan should talk to the data group at the World Bank about their transformation after they launched their open data portal (I’d be happy to make the introduction). That data portal now accounts for a significant portion of all the Bank’s web traffic, and the group is going through a dramatic transformation, realizing they are no longer curators of data for bank staff and a small elite group of clients around the world but curators of economic data for the world. I’m told that, while the change has not been easy, a broader set of users has brought a new sense of purpose and identity. The same could be true of StatsCan. Rather than just an organization that serves the government of Canada and a select group of clients, StatsCan could become the curator of data for all Canadians. This is a much more ambitious but, I’d argue, more democratized and important goal.

And it is here that I hope other next steps will unfold. In the United States, (which has had free census data for as long as anyone I talked to can remember) whenever new data is released the census bureau runs workshops around the country, educating people on how to use and work with its data. StatsCan and a number of other partners already do some of this, but my hope is that there will be much, much more of it. We need a society that is significantly more data literate, and StatsCan along with the universities, colleges and schools could have a powerful role in cultivating this. Tracey Lauriault over at the DataLibre blog has been a fantastic advocate of such an approach.

I also hope that StatsCan will take its role as data curator for the country very seriously and think of new ways that its products can foster economic and social development. Offering APIs into its data sets would be a logical next step, something that would allow developers to embed census data right into their applications and ensure the data was always up to date. No one is expecting this to happen right away, but it was another question that arose on twitter after the story broke, so one can see that new types of users will be interested in new, and more efficient ways, of accessing the data.

But I think most importantly, the next step will need to come from us citizens. This announcement marks a major change in how StatsCan works. We need to be supportive, particularly at a time of budget cuts. While we are grateful for open data, it would be a shame if the institution that makes it all possible was reduced to a shell of its former self. Good quality data – and analysis to inform public policy – is essential to a modern economy, society, and government. Now that we will have free access to what our tax dollars have already paid for, let’s make sure that it stays that way, by ensuring both that it continues to be available and that there continues to be a quality institution capable of collecting and analyzing it.

(sorry for typos – it’s 4am, will revise in the morning)

Making StatsCan Data Free: Assessing the Cost

Regular readers of my blog will know that I’ve advocated that StatsCan’s data – and particularly its Census data – should be made open (i.e. free, unlicensed, and downloadable in multiple formats). Presently, despite the fact that Canadian tax dollars pay to collect (a sadly diminishing amount, and quality, of) data, it is not open.

The main defense I hear for why StatsCan’s data should not be free is that the department depends on the revenue the data generates.

So exactly how much revenue are we talking about? Thanks to the help of some former public servants I’ve been able to go over the publicly available numbers. The basic assessment – which I encourage people to verify and challenge – turns out not to be a huge number.

The most interesting figure in StatsCan’s finances is the revenue it generates from its online database (e.g. data downloaded from its website). So how much revenue is it? Well in 2007/2008, it was $559,000.

That’s it. For $559,000 in lost government revenue Canadians could potentially have unlimited access to the Statscan census database their tax dollars paid to collect and organize. I suspect this is a tiny fraction of the value (and tax revenue) that might be generated by economic activity if this data were free.

Worse, the $559,000 is not profit. From what I can tell it is only revenue. Consequently, it doesn’t factor in the collection costs StatsCan has to absorb to run and maintain a checkout system on its website, collect credit card info, bill people, etc… I’m willing to bet almost anything that the cost of these functions either exceeds $559,000 a year, or comes pretty close. So the net cost of making the data free could end up being even less.

StatsCan makes another $763,000 selling Statistics Canada publications (these are 243 data releases of the 29 major economic indicators StatsCan measures and the 5 census releases it does annually – in short these are non-customized reports). So for $1,322,000 Canadians could get access to both the online data StatsCan has and the reports the organization generates. This is such a laughably (or depressingly) small number that it raises the question: why are we debating this? (Again, this is revenue, not profit, so the cost could be much lower.)

Of course, the figure that you’ll often hear cited is $100M in revenue. So what accounts for the roughly 100x difference between the above number and the alleged revenue? Well, in 2007/08 StatsCan did make $103,155,000 but this was from value added (i.e. customized) reports. This is a very, very different product than the basic data that is available on its website. My sources tell me this is not related to downloaded data.
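As a quick sanity check on the arithmetic, here is a minimal Python sketch that uses only the 2007/08 revenue figures quoted in this post. The variable and function names are mine, purely for illustration; nothing here comes from StatsCan’s own systems or reports beyond the three numbers themselves.

```python
# Revenue figures for 2007/08 as quoted in this post (revenue, not profit).
ONLINE_DATABASE = 559_000
PUBLICATIONS = 763_000
CUSTOM_REPORTS = 103_155_000  # value-added, customized work


def summarize():
    """Return the combined basic revenue and how it compares to custom work."""
    basic_total = ONLINE_DATABASE + PUBLICATIONS
    return {
        "basic_total": basic_total,
        "custom_vs_basic": CUSTOM_REPORTS / basic_total,
        "custom_vs_online": CUSTOM_REPORTS / ONLINE_DATABASE,
    }


if __name__ == "__main__":
    s = summarize()
    print(f"Online database + publications: ${s['basic_total']:,}")
    print(f"Custom reports vs. both basic products: {s['custom_vs_basic']:.0f}x")
    print(f"Custom reports vs. online database alone: {s['custom_vs_online']:.0f}x")
```

On these numbers the two basic products together come to about $1.3M, and the customized work is somewhere between roughly 80 and 185 times larger, depending on which baseline you pick.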

I think we should concede that if StatsCan’s entire database were made open and free it would impact some of this revenue. But this would also be a good thing. Why is this? Let’s break it down:

  1. Increase Capacity and Data Literacy: By making a great deal of data open and free, StatsCan would make it easier for competitors to enter the market place. More companies and individuals could analyze the country’s census and other data, and more “ordinary” Canadians than ever would be able to access the database (again, that their tax dollars paid to create). This might include groups like senior high school and university students, non-profits and everyday citizens who wanted to know more about their country. So yes, StatsCan would have more competitors, but the country might also benefit from having a more data literate population (and thus potential consumers).
  2. Increase Accessibility of Canadian Data to Marginalized Groups: An increase in the country’s analysis capacity would drop the price for such work. This would make it cheaper and easier for more marginal groups to benefit from this data – charities, religious groups, NGOs, community organizations, individuals, etc…
  3. Improve Competitiveness: It would also be good for Canadian competitiveness: companies would have to spend less to understand and sell into the Canadian market. This would lower the cost of doing business in Canada – helpful to consumers and the Canadian economy.
  4. StatsCan would not lose all or even most of its business: Those at StatsCan who fear the organization would be overwhelmed by a more open world should remember, not all the data can be shared. Some data – particularly economic data gathered from companies – is sensitive and confidential. As a result there will be some data that StatsCan retains exclusive access to, and thus a monopoly over analysis. More importantly, I suspect that were Statscan data made open the demand for data analysis would grow, so arguably new capacity might end up being devoted to new demand, not existing demand.
  5. It will Reduce the Cost of Government: Finally, the crazy thing about StatsCan is that it sells its data and services to other Ministries and layers of government. This means that governments are paying people to move taxpayer money between government ministries and jurisdictions. This is a needless administrative cost that drives up everybody’s taxes and poorly allocates scarce government resources (especially at the local level). Having every town and city in Canada pay $50 to $1,000 to access StatsCan data may not seem like much, but in reality we are paying that plus their and StatsCan’s staff time to manage all these transactions, enforce compliance, etc… all of which probably costs far, far more.

So in summary, the cost to Canada of releasing this data will likely be pretty marginal, while the benefits could be enormous.

At best, it costs half a million dollars in forgone revenue. Given the improved access and enormous benefits, this is a pittance to pay.

At worst, StatsCan would lose maybe 20-30 million – this is a real nightmare scenario that assumes much greater competition in the marketplace (again, a lot of assumptions in this scenario). Of course the improved access to data would lead to economic benefits that would far, far surpass this lost revenue, so the net benefit for the country would be big, but the cost to StatsCan would be real. Obviously, it would be nice if this decline in revenue was offset by improved funding for StatsCan (something a government that was genuinely concerned about Canadian economic competitiveness would jump at doing). However, given the current struggles StatsCan faces on the revenue front (cuts across the board) I could see how a worst-case scenario would be nerve-wracking to the department’s senior public servants, who are also still reeling from the Long Form Census debacle.

Ultimately, however, I think the worst-case scenario is unlikely. Moreover, in either scenario the benefits are significant.

Bonus Material:

Possibly the most disconcerting part of the financial reports on StatsCan on Treasury Board’s website was the stakeholder consultation associated with access to statscan’s database. It claimed that:

Usability and client satisfaction survey were conducted with a sample of clients in early 2005. Declared level of satisfaction with service was very high.

This is stunning. I’ve never talked to anyone who has had a satisfactory experience on StatsCan’s website (in contrast to their phone support – which everyone loves). I refer to the StatsCan site as the place where what you want is always one click away.

I’m willing to bet a great deal that the consultations were with existing long term customers – the type of people that have experience using the website. My suspicion is that if a broader consultation was conducted with potential users (university students, community groups, people like me and you, etc…) the numbers would tank. I dare you to try to use their website. It is virtually unnavigable.

Indeed, had StatsCan made its website and data more accessible, I suspect the department would have engaged more Canadians and had more stakeholders. This would have been the single most powerful thing it could have done to protect itself from cuts and decisions like the Long Form fiasco.

I know this post may anger a number of people at StatsCan. I’m genuinely sorry. I know the staff work hard, are dedicated and are exceedingly skilled and professional. This type of feedback is never flattering – particularly in public. It is because you are so important to the unity, economy and quality of life in our country that it is imperative we hold you to the highest possible bar – not just in the quality of the data you collect (there you already excel) but in the way you serve and engage Canadians. In this, I hope that you get the support you need and deserve.

Census Update: It's the Economy, Stupid

Yesterday during a press conference newly minted House leader John Baird announced “The next few months will be sharply focused on Canadians’ No. 1 priority: jobs and the economy… The economic recovery remains fragile and it is increasingly clear that we are not out of the woods yet.”

Fantastic news.

I just hope someone sends Industry Minister Tony Clement the memo.

The effects and impacts of ending the mandatory long form census continue to spill out, with a number of Canada’s most senior business and economic leaders pointing out how the decision will negatively impact the economy and… job growth.

First, there was Bank of Canada Governor Mark Carney (voted one of the most influential people in the world by Time Magazine) noting that the bank relies on data found in the mandatory long form to assess the economy and, presumably, to inform decisions on interest rates and other issues. The bank’s capacity to make informed decisions has now been compromised – not exactly a win for jobs or the economy.

As an interesting side note, Carney goes on to say that this may cause the bank to have to supplement StatsCan’s research with its own. Expect to hear more and more statements like this from Government agencies (which are still allowed to talk to the press) as more and more ministries and agencies get plunged into the dark regarding what is going on in the country and are no longer able to assess programs and issues they’ve been tasked to monitor. Various arms of the government (and thus you, taxpayer) will be spending 10s if not 100s of millions to pay for Industry Minister Clement’s mistake.

Then, in the same Globe article in which Carney makes these statements, Roger Martin, dean of the Rotman School of Management, notes that ending the long form census hampers Canadian companies’ capacity to both compete globally and boost productivity. More damning, and further echoing arguments I’ve been making here, he states it will prevent Canadians from having “a sophisticated economy that uses information to its best.” Unkind words from one of the world’s recognized business leaders.

Sadly, it doesn’t end there. The always excellent Stephen Gordon lists the emerging academic literature chronicling the havoc the demise of the long form census is about to wreak. Especially relevant is “The Importance of the Long-Form Census to Canada” by UBC economists David Green and Kevin Milligan. Interestingly, it turns out that the Canada Mortgage and Housing Corporation uses long form data to fulfill its legislative mandate, and that local governments and private sector actors also use it to learn about trends in housing. Something that might be of interest to those concerned about the economy and jobs, given Canada is rumored to possibly have a housing bubble.

Still more damning is how Green and Milligan show the mandatory long form serves as the foundation for the Labour Force Survey (LFS) from which we derive unemployment levels. Compromising the long form survey has, in short, compromised our ability to assess how many Canadians actually have jobs, something that, if you really believed Canadians felt the economy and jobs were the number 1 priority, your government should care about measuring accurately.

Maybe John Baird will sit down with Tony Clement and the Prime Minister and explain to them how, if the economy and jobs are priority 1 then perhaps the government should rethink its decision on the long form census.

Just don’t hold your breath. Instead, do write another email or letter to your local MP. Our country’s economic recovery and competitiveness are being eroded by a government either too dumb to understand the implications of its decision or too stubborn to admit a mistake. Those of us who will be paying the price should remind them of how they can best serve their own priorities.

Creating effective open government portals

In the past few years a number of governments have launched open data portals. These sites, like www.data.gov or data.vancouver.ca, share data that government agencies collect, in machine-readable formats (i.e. formats that you can play with on your computer).

Increasingly, people approach me and ask: what makes for a good open data portal? Great question. And now that we have a number of sites out there we are starting to learn what makes a site more or less effective. A good starting point for any of this is the 8 Open Government principles, and for those newer to this discussion, there are the 3 laws of open data (also available in German, Japanese, Chinese, Spanish, Dutch and Russian).

But beyond that, I think there are some pretty tactical things data portal owners should be thinking about. So here are some issues I’ve noticed and thought might be helpful.

1. It’s all about automating the back end

Probably the single greatest mistake I’ve seen governments make is, in the rush to get some PR or meet an artificial deadline, they create a data portal in which the data must be updated manually. This means that a public servant must run around copying the data out of one system, converting it (and possibly scrubbing it of personal and security information) and then posting it to the data portal.

There are a few interrelated problems with this approach. Yes, it allows you to get a site up quickly but… it isn’t sustainable. Most government IT departments don’t have a spare body that can do this work part time, even less so if the data site were to grow to include 100s or 1000s of data sets.

Consequently, this approach is likely to generate ill-will towards the government, especially from the very community of people who could and should be your largest supporters: local tech advocates and developers.

Consider New York: here is a site where, from what I can tell, the data is not regularly updated and grumblings are getting louder. I’ve heard similar grumblings out of some developers and citizens in Canadian cities where open data portals get trumpeted despite infrequent updates and having few data sets available.

If you are going to launch an open data portal, make sure you’ve figured out how to automate the data updates first. It is harder to do, but essential. In the early days open data sites often live and die based on the engagement of a relatively small community of early adopters – the people who will initially make the data come alive and build broader awareness. Frustrate the community and the initiative will have a harder time gaining traction.
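To make “automate the back end” a little more concrete, here is a minimal sketch, in Python, of the kind of job that could run on a schedule instead of a public servant copying files by hand. Everything specific in it is an assumption for illustration: the `permits` table, the column names treated as private, and the use of an in-memory SQLite database as a stand-in for whatever internal system actually holds the data.

```python
import csv
import io
import sqlite3

# Hypothetical column names that must never reach the public portal.
PRIVATE_COLUMNS = {"owner_name", "phone"}


def export_public_csv(conn, table, out, private_columns=PRIVATE_COLUMNS):
    """Write every row of `table` to `out` as CSV, minus private columns.

    `table` is assumed to come from trusted configuration, not user input.
    Returns the number of data rows written.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    keep = [i for i, name in enumerate(columns) if name not in private_columns]
    writer = csv.writer(out)
    writer.writerow([columns[i] for i in keep])
    count = 0
    for row in cursor:
        writer.writerow([row[i] for i in keep])
        count += 1
    return count


def demo():
    """Build a tiny stand-in source system and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE permits "
        "(permit_id INTEGER, address TEXT, owner_name TEXT, phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO permits VALUES (?, ?, ?, ?)",
        [
            (1, "100 Main St", "Alice Example", "555-0100"),
            (2, "200 Oak Ave", "Bob Example", "555-0101"),
        ],
    )
    out = io.StringIO()
    rows = export_public_csv(conn, "permits", out)
    conn.close()
    return rows, out.getvalue()


if __name__ == "__main__":
    rows, text = demo()
    print(f"exported {rows} rows")
    print(text)
```

The point of the design is that the scrubbing rule lives in one place and runs the same way every time, so the portal can be refreshed nightly without anyone touching it.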

2. Keep the barriers low

Both the 8 principles and 3 laws talk a lot about licensing. Obviously there are those who would like the licenses on many existing portals to be more open, but in most cases the licenses are pretty good.

What you shouldn’t do is require users to register. If the data is open, you don’t care who is using it and indeed, as a government, you don’t want the hassle of tracking them. Also, don’t call your data open if users must belong to an educational institution or a non-profit. That is by definition not data that is open (I’m looking at you, StatsCan: it’s not liberated data if only a handful of people can look at it. Sadly, you’re not the only site to do this). Worst is one website where, in order to access the online catalogue, you have to fax in a form outlining who you are.

This is the antithesis of how an open data portal should work.

3. Think like (or get help from) good librarians and designers

The real problem is when sites demand too much of users to even gain access to the data. Readers of this blog know about my feelings regarding Statistics Canada’s website: the data always seems to be one click away. Of course, that’s if you even think you are able to locate the data you are interested in, which usually seems impossible to find.

And yes, I know that Statistics Canada’s phone operators are very helpful and can help you locate datasets quickly – but I submit to you that this is a symptom of a problem. If every time I went to Amazon.com I had to call a help desk to find the book I was interested in I don’t think we’d be talking about how great Amazon’s help desk was. We’d be talking about how crappy their website is.

The point here is that an open data site is likely to grow. Indeed, looking at data.gov and data.gov.uk these sites now have thousands of data sets on them. In order to be navigable they need to have excellent design. More importantly, you need to have a new breed of librarian – one capable of thinking in the online space – to help create a system where data sets can be easily and quickly located.

This is rarely a problem early on (Vancouver has 140 data sets up, Washington DC around 250; these can still be trawled through without a sophisticated system). But you may want to sit down with a designer and a librarian during these early stages to think about how the site might evolve so that you don’t create problems in the future.
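As a rough illustration of what “easily and quickly located” can mean in practice, here is a small Python sketch of a catalogue in which every data set carries a title, a description and librarian-assigned tags, and a search matches against all three. The `Dataset` class, the `search` function and the sample entries are all hypothetical; a real portal would want proper indexing and ranking, which this deliberately leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """One catalogue entry; tags are the librarian-assigned keywords."""
    title: str
    description: str
    tags: set = field(default_factory=set)


def search(catalogue, query):
    """Return entries whose title, description or tags contain every query term."""
    terms = query.lower().split()
    results = []
    for ds in catalogue:
        haystack = " ".join([ds.title, ds.description, *ds.tags]).lower()
        if all(term in haystack for term in terms):
            results.append(ds)
    return results


# Hypothetical sample entries, for illustration only.
CATALOGUE = [
    Dataset("Bikeways", "Line data for cycling routes", {"transportation", "bike"}),
    Dataset("Street trees", "Location and species of trees", {"parks", "environment"}),
    Dataset("Transit stops", "Bus stop locations", {"transportation"}),
]

if __name__ == "__main__":
    for ds in search(CATALOGUE, "bike routes"):
        print(ds.title)
```

Even something this simple shows why the tagging matters: a search for “transportation” finds the bikeways data set only because someone thought to label it that way.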

4. Feedback

Finally, I think good open data portals want, and even encourage, feedback. I like that data.vancouver.ca has a survey on the site which asks people what data sets they would be interested in seeing made open.

But more importantly, this is an area where governments can benefit. No data set is perfect. Most have a typo here or there. Once people start using your data they are going to find mistakes.

The best approach is not to pretend like the information is perfect (it isn’t, and the public will have less confidence in you if you pretend this is true). Instead, ask to be notified about errors. Remember, you are using this data internally, so any errors are negatively impacting your own planning and analysis. By harnessing the eyes of the public you will be able to identify and fix problems more quickly.

And, while I’m sure we all agree this is probably not the case, maybe the fact that the data is public will add a small incentive to fix it quickly. Maybe.


The week in review (or… why I blog and a thank you)

Here’s a few snippets of comments, emails and other communications I’ve had this week in response to specific posts or just the blog in general. Each one touches on why I love blogging and my readers and why this blog has come to mean so much to me.

Venting, and finding out you’re not alone…

So, yesterday I got a little bit into a hate-on for Statistics Canada’s website. It wasn’t the first time, and pretty much every time I do it I find another soul out there who’s had their soul crushed by the website as well. Take this comment from last week:

Re: Stats Canada’s website being unusable. I completely frickin agree. God. Has anyone in government actually tried to use that website? An econ professor gave our class an assigment last year that involved looking stuff up on Statscan. Half of our class failed the assignment because they gave up and the other half had the wrong data, but got the marks anyways for trying. I think he actually took that assigment off of the grading at the end. It’s a bloody gong show…

Sometimes it makes me feel more human knowing that others are out there struggling with the same thing. StatsCan does great work… I just wish they made it accessible.

…and then having some kind souls find some solutions for you.

But as nice as it is to know you’re not alone… even better is how often the internet connects you to others who just happen to have that esoteric piece of knowledge that saves the day.

I agree, Stats Can is one of the worst government websites out there (specifically those stupid CANSIM tables), one that, as a policy analyst with XXXXXXXX Canada, i frequently have to use to get data. I had the data for XXXXXXX and it wasn’t hard to get it for the country.

This kind soul led me straight to a completely different page on statscan that happened to have the data I was looking for. (for those interested, it was here).

And they weren’t the only one. Another reader posted a link to the data over twitter…

Thank god there is an army of good-natured amateur and professional experts experienced in navigating the byzantine structure of the StatsCan website!

So… thank you! I’m going to try to grind out an updated pan-North American version of the Fatness Index this weekend.

Impacting Policy

But this week also had that other rewarding ingredient I love to get: hearing that a post helped, incrementally, to foster better public policy. This came in via email from a public servant about yesterday’s blog post:

Your blog today provided a good example in a meeting with government colleagues about the benefits of opening data. It illustrates the implications of not releasing data to the public (e.g. stifling innovation)… It resonated well with them.

This is a huge part of why I blog. Part of it is to explore ideas, part of it is to introduce ideas and thoughts, but a big piece of it is to enable public servants to do just this: help small internal government meetings (on subjects like open data) go a little more smoothly.

So to everyone out there, be it policy wonks, students, public servants, politicians or ordinary, engaged citizens: thank you. It was a good week. We wrote some good posts, some good comments, had an original story on the stupidity of the census, and maintained sanity in the face of the StatsCan website. Thank you everyone for making it so fun. Hope you all have a great weekend. – Dave

Fatness Index 2 years on: the good, the bad, the ugly

Two years ago I saw that Richard Florida and Andrew Sullivan had re-posted a map created by CalorieLab that colour-coded US states by weight.

As I found it interesting, I created a North America-wide map that included Canadian data (knowing that it probably would not be a perfect apples-to-apples comparison). The map and subsequent blog post turned into one of my best-viewed pages, with well over 20,000 pageviews.

The very cool people over at CalorieLab informed me that they have released an updated version of the American map (posted below; you can see the original at their site here). Not too much has changed, but after looking at the map I have a few comments.

CalorieLab’s release of an updated version of the map has triggered a few thoughts and some lessons that I think should matter to policy makers, health-care professionals and citizens in general. Here they are:

The Good

The amazing people at CalorieLab. When I created the map two years ago I didn’t even check to see if their work was copyrighted. Although the data was public domain, I copied CalorieLab’s colour palette, as I was trying to create a “mash-up” of their work with Canadian data. I wanted the maps to look similar. My map was a derivative work.

Did the people at CalorieLab freak out? No. Quite the opposite. They reached out, said thank you and asked if I needed help.

It seems this year they’ve gotten even cooler. I don’t remember the original map’s license, but with the publication of their 2010 update they wrote:

CalorieLab’s United States of Obesity 2010 map is licensed for use by anyone in any media and can be downloaded in various formats (small GIF, large GIF, SVG, EPS).

There’s a line directed specifically at people like me. It says: please, use this map! Not only is the license open, but they’ve provided it in lots of formats (which is great, because two years ago I had to recreate the thing from scratch and it took hours).

So naturally you are wondering: where is David’s 2010 mash-up North American Fatness Index?

The Bad

The bad is that trying to find the Canadian data is a pain. A couple of times a year I get a cool idea for a visual or graph that Statistics Canada data might help me create. In minutes I’m on their webpage and, within 5 minutes, I’m walking away from my computer fearing I might throw it out the window.

StatsCan’s website may be the worst, most inaccessible government website in the western world. Whatever data you are looking for always seems to be at least one more click away.

I spent an hour trying to find data that StatsCan allegedly wants me to find. (This in an era of Google, where I generally find data people don’t want me to find in minutes.) Ultimately, I think I found the relevant data on overweight/obesity figures by province (but who knows! Should I be choosing peer group A, or B, or C, D, E, F, G? None of which have labels explaining what they mean!).

The Ugly

Sadly, it gets worse. Even if a) you locate the data on StatsCan’s website and b) it is free, it will probably still be inaccessible. The only way the data can be viewed is with a Beyond 20/20 Professional Browser. You need to learn a new software package, one 99.9% of Canadians have never heard of, and that only works on a PC (I’m on a Mac). The data I want is pretty simple: a CSV file, or even an Excel spreadsheet, would be sufficient, something the average Canadian could access. But I guess it is not to be.

So I give up.

You win, StatsCan. There are tens of thousands of Canadians like me who would love to do interesting things with the data our tax dollars paid to collect, but even when your data is free and “open,” it isn’t. You’ve enjoyed tremendous support in the last month from those Canadians who understand why you are important (including me), but many Canadians have had to go up a steeper learning curve around why they should care. I might suggest they’d have gotten up that curve faster if they too could have used your data.

Healthcare professionals, students, countless others and I could paint innumerable stories explaining Canadians and Canada to one another, helping us grasp our history, our social and health challenges, as well as simply who we are. But we can’t.

In the end I’m still one of your biggest supporters, but frankly even I feel alienated.

Note: If someone wants to help me get this data, I’ll take a cut at recreating the map again. Otherwise, as I said before, I give up.

Your Government Just Got Dumber: how it happened and why it matters to you

This piece was published in the Globe and Mail today, so it is always nice when you read it there and let them know it matters to you.

Last week the Conservative Government decided that it would kill the mandatory long census form it normally sends out to thousands of Canadians every five years. On the surface such a move may seem unimportant and, to many, uninteresting, but it has significant implications for every Canadian and every small community in Canada.

Here are 3 reasons why this matters to you:

1. The Death of Smart Government

Want to know who the biggest user of census data is? The government. Understanding what services are needed, where problems or opportunities may arise, or how a region is changing depends on having accurate data. The federal government, but also the provincial and, most importantly, local governments use Statistics Canada’s data every day to find ways to save taxpayers money, improve services and make plans. Now, at the very moment when, thanks to computers, governments are finding new ways to use this information more effectively than ever before, it is to be cut off.

To be clear, this is a direct attack on the ability of government to make smart decisions. In fact, it is an attack on evidence-based public policy. Moreover, it was a political decision: it came from the Minister’s office and does not appear to reflect what Statistics Canada either wants or recommends. Of course, some governments prefer not to have information; all that data and evidence gets in the way of legislation and policies that are ineffective, costly and that reward vested interests (I’m looking at you, Crime Bill).

2. The Economy is Less Competitive

But it isn’t just government that will suffer. In 21st-century economies, data and information are at the heart of economic activity; they are what drive innovation, efficiencies and productivity. Starve our governments, NGOs, businesses and citizens of data and you limit the wealth a 21st-century economy will generate.

As roads were to the 20th-century economy, data is the core infrastructure for a 21st-century economy. While just a boring public asset, it can nonetheless foster big companies, jobs and efficiencies. Roads spawned GM. Today, people often fail to recognize that the largest company already created by the new economy, Google, is a data company. Google is effective and profitable not because it sells ads, but because it generates and leverages petabytes of data every day from billions of search queries. This allows it to provide all sorts of useful services, such as pointing us, with uncanny accuracy, to merchandise and services we want, or better yet, spam we’d like to avoid. It can even predict when communities will experience flu epidemics four months in advance.

And yet, it is astounding that the Minister in charge of Canada’s digital economy, the minister who should understand the role of information in a 21st-century economy, is the minister who authorized killing the creation of this data. In doing so he will deprive Canadians and their businesses of information that would make them, and thus our economy, more efficient, productive and profitable. Of course, the big international companies will probably be able to find the money to do their own augmented census, so those that will really suffer will be small and medium-sized Canadian businesses.

3. Democracy Just Got Weaker

Of course, the most important people who could use the data created by the census aren’t government or businesses. They are ordinary Canadians. In theory, the census creates a level playing field in public policy debates. Were Statistics Canada’s website usable and its data accessible (data, may I remind you, we’ve already paid for), then citizens could use this information to fight ineffective legislation, unjust policies, or wasteful practices. In a world where this information won’t exist, those who are able to pay for the creation of this information (read: large companies) will have an advantage not only over citizens, but over our governments (which, of course, won’t have this data anymore either). Today, the ability of ordinary citizens to defend themselves against government and businesses just got weaker.

So who’s to blame? Tony Clement, the Minister of Industry, who oversees Statistics Canada, is to blame. His office authorized this decision. But Statistics Canada also shares in the blame. In an era where the internet has flattened the cost of distributing information, Statistics Canada: continues to charge citizens for data their tax dollars already paid for; has an unnavigable website where it is impossible to find anything; and often distributes data in formats that are hard to use. In short, for years the department has made its data inaccessible to ordinary Canadians. As a result, it isn’t hard to see why most Canadians don’t know about or understand this issue. Sadly, once they do wake up to the cost of this terrible decision, I fear it will be too late.