Monthly Archives: January 2011

Most Popular Eaves.ca Posts of 2010

Some people have asked me what the 10 most viewed posts from last year were. Well, here are the posts that were written last year, in order of popularity (excluding static pages and the homepage):

Here are the 20 most viewed posts in 2010, including posts written in previous years (and including static pages):

At some point I’ll write a piece about my favourite posts from 2010…

What I love about the latter list is how many old posts are in there. Some just keep logging hits year after year. The Fatness Index has links to it all over the internet, so there is never a day where it doesn’t get at least a few hits. I also think it is hilarious that the R2D2 post is always so strong, since it really just links to a great post on another blog – Google just really likes putting it up high on those searches.

What I really like though is the mix: there are pieces on GCPEDIA, open data, the media, Firefox and open source, and politics. A great mix.

Making StatsCan Data Free: Assessing the Cost

Regular readers of my blog will know that I’ve advocated that StatsCan’s data – and particularly its Census data – should be made open (i.e. free, unlicensed, and downloadable in multiple formats). Presently, despite the fact that Canadian tax dollars pay to collect this data (a sadly diminishing amount, and quality, of it), it is not open.

The main defense I hear for why StatsCan’s data should not be free is that the department depends on the revenue the data generates.

So exactly how much revenue are we talking about? Thanks to the help of some former public servants I’ve been able to go over the publicly available numbers. The basic assessment – which I encourage people to verify and challenge – is that it turns out not to be a huge number.

The most interesting figure in StatsCan’s finances is the revenue it generates from its online database (i.e. data downloaded from its website). So how much revenue is it? Well, in 2007/2008, it was $559,000.

That’s it. For $559,000 in lost government revenue Canadians could potentially have unlimited access to the StatsCan census database their tax dollars paid to collect and organize. I suspect this is a tiny fraction of the value (and tax revenue) that might be generated by economic activity if this data were free.

Worse, the $559,000 is not profit. From what I can tell it is only revenue. Consequently, it doesn’t factor in the costs StatsCan has to absorb to run and maintain a checkout system on its website, collect credit card info, bill people, etc… I’m willing to bet almost anything that the cost of these functions either exceeds $559,000 a year, or comes pretty close. So the net cost of making the data free could end up being even less.

StatsCan makes another $763,000 selling Statistics Canada publications (these are 243 data releases of the 29 major economic indicators StatsCan measures and the 5 census releases it does annually – in short these are non-customized reports). So for $1,322,000 Canadians could get access to both the online data StatsCan has and the reports the organization generates. This is such a laughably (or depressingly) small number that it raises the question – why are we debating this? (Again, this is revenue, not profit, so the net cost could be much lower.)

Of course, the figure that you’ll often hear cited is $100M in revenue. So what accounts for the roughly 100x difference between the above number and the alleged revenue? Well, in 2007/08 StatsCan did make $103,155,000 but this was from value added (i.e. customized) reports. This is a very, very different product than the basic data that is available on its website. My sources tell me this is not related to downloaded data.
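For anyone who wants to verify the arithmetic, the figures quoted above can be added up in a few lines of Python. This is only a rough sketch: the variable and function names are made up for illustration, the dollar amounts are the 2007/08 revenue figures cited in this post, and it assumes the $103,155,000 comes entirely from customized work.

```python
# 2007/08 revenue figures quoted in this post (revenue, not profit).
ONLINE_DATABASE = 559_000        # data downloaded from the website
PUBLICATIONS = 763_000           # non-customized publications
CUSTOM_REPORTS = 103_155_000     # value-added (customized) reports


def basic_data_revenue() -> int:
    """Revenue forgone if online data and publications were free."""
    return ONLINE_DATABASE + PUBLICATIONS


def custom_to_basic_ratio() -> float:
    """How many times larger custom-report revenue is than basic-data revenue."""
    return CUSTOM_REPORTS / basic_data_revenue()


if __name__ == "__main__":
    print(f"Online data + publications: ${basic_data_revenue():,}")
    print(f"Custom reports are about {custom_to_basic_ratio():.0f}x larger")
```

On these numbers the online data and publications together bring in about $1.3 million, and the customized work is closer to 78 times that than a full 100 times, though the order of magnitude is the point.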

I think we should concede that if StatsCan’s entire database were made open and free it would impact some of this revenue. But this would also be a good thing. Why is this? Let’s break it down:

Increase Capacity and Data Literacy: By making a great deal of data open and free, StatsCan would make it easier for competitors to enter the marketplace. More companies and individuals could analyze the country’s census and other data, and more “ordinary” Canadians than ever would be able to access the database (again, that their tax dollars paid to create). This might include groups like senior high school and university students, non-profits and everyday citizens who want to know more about their country. So yes, StatsCan would have more competitors, but the country might also benefit from having a more data literate population (and thus more potential consumers).
Increase Accessibility of Canadian Data to Marginalized Groups: An increase in the country’s analysis capacity would drop the price for such work. This would make it cheaper and easier for more marginal groups to benefit from this data – charities, religious groups, NGOs, community organizations, individuals, etc…
Improve Competitiveness: It would also be good for Canadian competitiveness: companies would have to spend less to understand and sell into the Canadian market. This would lower the cost of doing business in Canada – helpful to consumers and the Canadian economy.
StatsCan would not lose all or even most of its business: Those at StatsCan who fear the organization would be overwhelmed by a more open world should remember that not all the data can be shared. Some data – particularly economic data gathered from companies – is sensitive and confidential. As a result there will be some data that StatsCan retains exclusive access to, and thus a monopoly over its analysis. More importantly, I suspect that were StatsCan data made open the demand for data analysis would grow, so arguably new capacity might end up being devoted to new demand, not existing demand.
It will Reduce the Cost of Government: Finally, the crazy thing about StatsCan is that it sells its data and services to other ministries and layers of government. This means that governments are paying people to move taxpayer money between government ministries and jurisdictions. These are needless administrative costs that drive up everybody’s taxes and poorly allocate scarce government resources (especially at the local level). Every town and city in Canada paying $50 to $1,000 to access StatsCan data may not seem like much, but in reality we are paying that plus their and StatsCan’s staff time to manage all these transactions, enforce compliance, etc… all of which probably costs far, far more.

So in summary, the cost to Canada of releasing this data will likely be pretty marginal, while the benefits could be enormous.

At best, it costs half a million dollars in forgone revenue. Given the improved access and enormous benefits, this is a pittance to pay.

At worst, StatsCan would lose maybe $20-30 million – this is a real nightmare scenario that assumes much greater competition in the marketplace (again, there are a lot of assumptions in this scenario). Of course the improved access to data would lead to economic benefits that would far, far surpass this lost revenue, so the net benefit for the country would be big, but the cost to StatsCan would be real. Obviously, it would be nice if this decline in revenue were offset by improved funding for StatsCan (something a government that was genuinely concerned about Canadian economic competitiveness would jump at doing). However, given the current struggles StatsCan faces on the revenue front (cuts across the board), I could see how a worst-case scenario would be nerve-racking to the department’s senior public servants, who are also still reeling from the Long Form Census debacle.

Ultimately, however, I think the worst-case scenario is unlikely. Moreover, in either scenario the benefits are significant.

Bonus Material:

Possibly the most disconcerting part of the financial reports on StatsCan on Treasury Board’s website was the stakeholder consultation associated with access to StatsCan’s database. It claimed that:

Usability and client satisfaction survey were conducted with a sample of clients in early 2005. Declared level of satisfaction with service was very high.

This is stunning. I’ve never talked to anyone who has had a satisfactory experience on StatsCan’s website (in contrast to their phone support – which everyone loves). I refer to the StatsCan site as the place where what you want is always one click away.

I’m willing to bet a great deal that the consultations were with existing long-term customers – the type of people that have experience using the website. My suspicion is that if a broader consultation were conducted with potential users (university students, community groups, people like me and you, etc…) the numbers would tank. I dare you to try to use their website. It is virtually unnavigable.

Indeed, had StatsCan made its website and data more accessible, I suspect the department would have engaged more Canadians and had more stakeholders. This would have been the single most powerful thing it could have done to protect itself from cuts and decisions like the Long Form fiasco.

I know this post may anger a number of people at StatsCan. I’m genuinely sorry. I know the staff work hard, are dedicated and are exceedingly skilled and professional. This type of feedback is never flattering – particularly in public. It is because you are so important to the unity, economy and quality of life in our country that it is imperative we hold you to the highest possible bar – not just in the quality of the data you collect (there you already excel) but in the way you serve and engage Canadians. In this, I hope that you get the support you need and deserve.

Hello 2011! Coding for America, speaking, open data and licenses

Eaves.ca readers – happy new year! Here’s a little bit of an overview of how we are kicking off 2011.

Thank you

First, if you are a regular reader… thank you. Eaves.ca has just entered its 4th year and it all keeps getting more and more rewarding. I write to organize my thoughts and refine my writing skills, but it is always most rewarding to see others benefit from, or find value in, these posts.

Code for America

I’ve just landed in San Francisco (literally! I’m sitting on the floor of the airport, catching the free SFO wifi) where I’ll be spending virtually all of January volunteering to help launch Code for America. Why? Because I think the organization matters, its projects are important and the people are great. I’ve always enjoyed hanging out with technologists with a positive agenda for change (think Mozilla & OpenMRS) and Code for America takes all that fun and combines it with government – one of my other great passions. I hope to update you on the progress the organization makes and what will be happening over the coming weeks. And yes, I am thinking about how Code for Canada might fit into all this.

Gov 2.0

I’ll also be in Ottawa twice in January. My first trip out is to present a paper about how to write collaboratively using a wiki. With a little bit of work, I’ll be repositioning this paper to make it about how to draft public policy on a wiki within government (think GCPEDIA). With luck I’ll publish this in Policy Options or something similar (maybe US based?). I think this has the potential to be one of my most important pieces of the year and, needless to say, I’m excited and will be grateful for feedback, both good and negative.

Open Data and Government

On my second trip to Ottawa I’ll be presenting on Open Data and Open Government to the Standing Committee on Access to Information, Privacy and Ethics. Obviously, I’m honored and thrilled they’ve asked me to come and talk, and I look forward to helping parliamentarians understand why this issue is so important and how they could make serious progress in short order if they put their minds to the task.

Licenses

So for the last two years we’ve been working hard to get cities to do open data, with significant success. This year may be the year that the Feds (more on that in a later post) and some provinces get in on the game, as well as a larger group of cities. The goal for them will be to build on, and take to the next level, the successes of the first movers like Vancouver, Edmonton, Toronto and Ottawa. This will mean one thing: doing the licensing better. The Vancouver license, which has been widely copied, was a good starting point for governments venturing into unknown territory (i.e. getting their toes wet). But the license has several problems – and there are several significantly better choices out there (I’m looking at you, PDDL – which I notice the City of Surrey has adopted, nice work). So, I think one big goal for 2011 will be to get governments to begin shifting to (ideally) the PDDL and (if necessary) something equivalent. On this front I’m feeling optimistic as well and will blog on this in the near future.

Lots of other exciting things going on as well – I look forward to sharing them here on the blog soon.

All in all, it’s hard not to be excited about 2011 and I hope you are too. Thank you so much for being part of all of this.

eaves.ca

if writing is a muscle, this is my gym
