Category Archives: technology

Why Social Media behind the Government Firewall Matters

This comment, posted four months ago to my blog by Jesse G. in response to this post on GCPEDIA, remains one of my favourite comments ever posted to my blog. This is a public servant who understands the future and is trying to live it. I’ve literally had this comment sitting in my inbox this whole time because I didn’t want to forget about it.

For those opposed to the use of wikis and social media behind the government firewall, this is a must read (of course, I’d say it is a must read for those in favour as well). It’s just a small example of how tiny transaction costs are killing government, and how social media can flatten them.

I wish more elements of the Canadian government got it, but despite the success of GCPEDIA and its endorsement by the Clerk there are still a ton of forces pitted against it, from the procurement officers in Public Works who’d rather pay for a bulky, expensive alternative that no one will use, to middle managers who forbid their staff from using it out of some misdirected fear.

Is GCPEDIA the solution to everything? No. But it is a cheap solution to a lot of problems; indeed, I’ll bet it’s solved more problems per dollar than any other IT solution put forward by the government.

So for the (efficient) future, read on:

Here’s a really untimely comment – GCPEDIA now has over 22,000 registered users and around 11,000 pages of content. Something like 6.5 million pageviews and around .5 million edits. It has ~2,000 visitors a week and around 15,000 pageviews a week. On average, people are using the wiki for around 5.5 minutes per visit. I’m an admin for GCPEDIA and its sister tools – GCCONNEX (a professional networking platform built using Elgg) and GCForums (a forum built using YAF). Collectively the tools are known as GC2.0.

Anyways, I’m only piping up because I love GCPEDIA so much. For me and for thousands of public servants, it is something we use every day and I cannot emphasize strongly enough how friggin’ awesome it is to have so much knowledge in one place. It’s a great platform for connecting people and knowledge. And it’s changing the way the public service works.

A couple of examples are probably in order. I know one group of around 40 public servants from 20 departments who are collaborating on GCPEDIA to develop a new set of standards for IT. Every step of the project has taken place on GCPEDIA (though I don’t want to imply that the wiki is everything – face-to-face can’t be replaced by a wiki), from the initial project planning through producing deliverables. I’ve watched their pages transform since the day they were first created and I can attest that they are really doing some innovative work on the wiki to support their project.

Another example, which is really a thought experiment: Imagine you’re a coop student hired on a 4 month term. Your director has been hearing some buzz about this new thing called Twitter and wants an official account right away. She asks you to find out what other official Twitter accounts are being used across all the other departments and agencies. So you get on the internet, try to track down the contact details for the comms shops of all those departments and agencies, and send an email to ask what accounts they have. Anyone who knows government can imagine that the best-case turnaround time for that kind of answer is at least 24 hours, but probably more like a few days. So you keep making calls and maybe, if everything goes perfectly, you get 8 responses a day (good luck!). There are a couple hundred departments and agencies so you’re looking at about 100 business days to get a full inventory. But by the time you’ve finished, your research is out of date and your 4 month coop term is over. Now a first year coop student makes about $14.50/hour (sweet gig if you can get it, students!), so over a 4 month term that’s about $10,000. Now repeat this process for every single department and agency that wants a Twitter account and you can see it’s a staggering cost. Let’s be conservative and say only 25 departments care enough about Twitter to do this sort of exercise – you’re talking about $275,000 of research. Realistically, there are many more departments that want to get on the Twitter bandwagon, but the point still holds.

Anyways, did you know that on GCPEDIA there is a crowd-sourced page with hundreds of contributors that lists all of the official GC Twitter accounts? That one source is kept up to date through contributions from users that literally take a few seconds to make. The savings are enormous – and this is just one page.

Because I know GCPEDIA’s content so well, I can point anyone to almost any piece of information they want to know – or, because GCPEDIA is also a social platform, if I can’t find the info you’re looking for, I can at least find the person who is the expert. I am not an auditor, but I can tell you exactly where to go for the audit policies and frameworks, resources and tools, experts and communities of practice, and pictures of a bunch of internal auditors clowning around during National Public Service Week. There is tremendous value in this – my service as an information “wayfinder” has won me a few fans.

Final point before I stop – a couple of weeks ago, I was doing a presentation to a managers’ leadership network about unconferences. I made three pages – one on the topic of unconferences, one on the facilitation method for building the unconference agenda, and one that is a practical 12-step guide for anyone who wants to plan and organize their own (this last was a group effort with my co-organizers of Collaborative Culture Camp). Instead of preparing a PowerPoint and handouts I brought the pages up on the projector. I encouraged everyone to check the pages out and to contribute their thoughts and ideas about how they could apply them to their own work. I asked them to improve the pages if they could. But the real value is that instead of me showing up, doing my bit, and then vanishing into the ether, I left behind a valuable information resource that other GCPEDIA users will find, use, and improve (maybe because they are searching for unconferences, or maybe it’s just serendipity). Either way, when public servants begin to change how they think of their role in government – not just as employees of x department, but as an integral person in the greater whole; not in terms of “information is power”, but rather the power of sharing information; not as cogs in the machine, but as responsible change agents working to bring collaborative culture to government – there is a huge benefit for Canadian citizens, whether the wiki is behind a firewall or not.

p.s. To Stephane’s point about approval processes – I confront resistance frequently when I am presenting about GCPEDIA, but there is always someone who “gets” it. Some departments are indeed trying to prevent employees from posting to GCPEDIA – but it isn’t widespread. Even the most security-conscious departments are using the wiki. And Wayne Wouters, the Clerk of the Privy Council, has been explicit in his support of the wiki, going so far as to say that no one requires a manager’s approval to use it. I hope that any employee whose boss says, “You can’t use GCPEDIA” plops the latest PS Renewal Action Plan down on that boss’s desk and says, “You’ve got a lot to learn”.

Shared IT Services across the Canadian Government – three opportunities

Earlier this week the Canadian Federal Government announced it will be creating Shared Services Canada, which will absorb the resources and functions associated with the delivery of email, data centres and network services from 44 departments.

These types of shared services projects are always fraught with danger. While they are sometimes successful, they are often disasters: highly disruptive, with little to show in results, and eventually unwound. However, I suspect there is a significant amount of savings to be made and I remain optimistic. With luck, the analogy here is the work outgoing US CIO Vivek Kundra accomplished as he sought to close down and consolidate 800 data centres across the US, an effort that is yielding some serious savings.

So here’s what I’m hoping Shared Services Canada will mean:

1) A bigger opportunity for Open Source

What I’m still more hopeful about – although not overly optimistic – is the role that open source software could play in the solutions Shared Services Canada will implement. Over on the Drupal site, one contributor claims government officials have been told to hold off buying web content management systems as the government prepares to buy a single solution for use across all departments.

If the government is serious about lowering its costs it absolutely must rethink its procurement models so that open source solutions can at least be made a viable option. If not, the government may still save money, but only of the “we moved from five expensive solutions to one expensive solution” variety.

On the upside, some of that work has clearly taken place. Already there are several federal government websites running on Drupal, such as this Ministry of Public Works website and the NRCAN and DND intranets. Moreover, there are real efforts in the open source community to accommodate government. In the United States, OpenPublic has fostered a version of Drupal designed for government’s needs.

Open source solutions have the added bonus of allowing you the option of using more local talent, which, if stimulus is part of the goal, would be wise. Also, any open source solutions fostered by the federal government could be picked up by the provinces, creating further savings for taxpayers. Better still, you can also fire incompetent implementers, something that needs to happen a little more often in government IT.

2) More accountability

Ministers Ambrose and Clement are laser-focused on finding savings – pretty much every ministry needs to find 5 or 10% savings across the board. I also know both speak passionately about managing taxpayers’ dollars: “Canadians work hard for their money and expect our Government to manage taxpayers dollars responsibly, Shared Services Canada will have a mandate to streamline IT, save money, and end waste and duplication.”

Great. I agree. So one of Shared Services Canada’s first acts should be to follow in the footsteps of another Vivek Kundra initiative and recreate his incredibly successful IT Dashboard. Indeed, it was by using the dashboard that Kundra was able to “cut the time in half to deliver meaningful [IT system] functionality and critical services, and reduced total budgeted [Federal government IT] costs by over $3 billion.” Now that’s some serious savings. It’s a great example of how transparency can drive effective organizational change.

And here’s the kicker. The White House open sourced the IT Dashboard (the code can be downloaded here). So while it will require some work adapting it, the software is there and a lot of the heavy lifting has been done. Again, if we are serious about this, the path forward is straightforward.

3) More open data

Speaking of transparency… one place shared services could really come in handy is creating data warehouses for hosting critical government data sets (ideally in the cloud). I suspect there are a number of important datasets that are used by public servants across ministries, so getting them on a robust, accessible platform would make a lot of sense. This, of course, would also be an ideal opportunity to engage in a massive open data project. It might be easier to create policy for making the data managed by Shared Services Canada “open.” Indeed, this blog post covers some of the reasons why now is the time to think about that issue.

So congratulations on the big move everyone and I hope these suggestions are helpful. Certainly we’ll be watching with interest – we can’t have a 21st century government unless we have 21st century infrastructure, and you’re now the group responsible for it.

Open Source Data Journalism – Happening now at Buzz Data

(there is a section on this topic focused on governments below)

A hint of how social data could change journalism

Anyone who’s heard me speak in the last 6 months knows I’m excited about BuzzData. This week, while still in limited access beta, the site is showing hints of its potential – and it still has only a few hundred users.

First, what is BuzzData? It’s a website that allows data to be easily uploaded and shared among any number of users. (For hackers – it’s essentially GitHub for data, but more social.) It makes it easy for people to copy data sets, tinker with them, share the results back with the original master, and mash them up with other data sets, all while engaging with those who care about that data set.

So, what happened? Why is any of this interesting? And what does it have to do with journalism?

Exactly a month ago Svetlana Kovalyova of Reuters had her article – Food prices to remain high, UN warns – re-published in the Globe and Mail.  The piece essentially outlined that food commodities were getting cheaper because of local conditions in a number of regions.

Someone at the Globe and Mail decided to go a step further and upload the data – the annual food price indices from 1990 to present – onto the BuzzData site, presumably so they could play around with it. This is nothing complicated; it’s a pretty basic chart. Nonetheless, a dozen or so users started “following” the dataset and about 11 days ago, one of them, David Joerg, asked:

The article focused on short-term price movements, but what really blew me away is: 1) how the price of all these agricultural commodities has doubled since 2003 and 2) how sugar has more than TRIPLED since 2003. I have to ask, can anyone explain WHY these prices have gone up so much faster than other prices? Is it all about the price of oil?

He then did a simple visualization of the data.

FoodPrices

In response, someone from the Globe and Mail named Mason answered:

Hi David… did you create your viz based on the data I posted? I can’t answer your question but clearly your visualization brought it to the forefront. Thanks!

But of course, in a process that mirrors what often happens in the open source community, another “follower” of the data showed up and refined the work of the original commenter. In this case, one Alexander Smith noted:

I added some oil price data to this visualization. As you can see the lines for everything except sugar seem to move more or less with the oil. It would be interesting to do a little regression on this and see how close the actual correlation is.

The first thing to note is that Smith has added data, “mashing in” the price of oil per barrel. So now the data set has been made richer. In addition, his graph is quite nice, as it makes the correlation more visible than the graph by Joerg, which only referenced the Oil Price Index. It also becomes apparent, looking at this chart, how much of an outlier sugar really is.

oilandfood

Perhaps some regression is required, but Smith’s graph is pretty compelling. What’s more interesting is that the price of oil is not mentioned once in the article as a driver of food commodity prices. So maybe it’s not relevant. But maybe it deserves more investigation – and a significantly better piece, one that would provide better information to the public, could be written in the future. In either case, this discussion, conducted by non-experts simply looking at the data, helped surface some interesting leads.
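For anyone who wants to test Smith’s hunch, the regression he suggests is only a few lines of analysis. Here is a minimal sketch of my own, assuming the BuzzData set has been exported to a CSV; the file name and column names are hypothetical stand-ins for whatever the actual export uses:

```python
# Rough sketch of the regression Smith suggests: how closely do the food
# price indices track the price of oil? File and column names below are
# hypothetical -- adjust to whatever the exported CSV actually contains.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("food_price_indices.csv")  # hypothetical export from BuzzData

for commodity in ["cereals", "dairy", "meat", "oils_fats", "sugar"]:
    X = sm.add_constant(df["oil_price"])        # oil price per barrel
    model = sm.OLS(df[commodity], X).fit()      # simple linear regression
    corr = df[commodity].corr(df["oil_price"])  # Pearson correlation
    print(f"{commodity}: r={corr:.2f}, R^2={model.rsquared:.2f}")
```

If Smith’s chart is right, everything except sugar should show a fairly strong correlation with oil, and sugar should stand out as the outlier.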

And therein lies the power of social data.

With even only a handful of users, a deeper, better analysis of the story has taken place. Why? Because people are able to access the data and look at it directly. If you’re a follower of Julian Assange of WikiLeaks, you might call this scientific journalism. Maybe it is, maybe it isn’t, but it certainly is a much more transparent way of doing analysis and a potential audience builder – imagine if hundreds or thousands of readers were engaged in the data underlying a story. What would that do to the story? What would that do to journalism? With BuzzData it also becomes easier to imagine a data journalist who spends a significant amount of their time on BuzzData, working with a community of engaged pro-ams trying to find hidden meaning in the data they amass.

Obviously, this back and forth isn’t game changing. No smoking gun has been found. But I think it hints at a larger potential, one that it would be very interesting to see unlocked.

More than Journalism – I’m looking at you government

Of course, it isn’t just media companies that should be paying attention. For years I’ve argued that governments – and especially politicians – interested in open data have an unhealthy appetite for applications. They like the idea of sexy apps on smart phones enabling citizens to do cool things. To be clear, I think apps are cool too. I hope in cities and jurisdictions with open data we see more of them.

But open data isn’t just about apps. It’s about the analysis.

Imagine a city’s budget up on BuzzData. Imagine the flow rates of the water or sewage system. Or the inventory of trees. Think of how a community of interested and engaged “followers” could supplement that data, analyze it, visualize it. Maybe they would be able to explain it to others better, to find savings or potential problems, or to develop new forms of risk assessment.

It would certainly make for an interesting discussion. If 100 or even just 5 new analyses were to emerge, maybe none of them would be helpful, or would provide any insights. But I have my doubts. I suspect it would enrich the public debate.

It could be that the analysis would become as sexy as the apps. And that’s an outcome that would warm this policy wonk’s soul.

How Dirty is Your Data? Greenpeace Wants the Cloud to be Greener

My friends over at Greenpeace recently published an interesting report entitled “How dirty is your data? A Look at the Energy Choices That Power Cloud Computing.”

For those who think that cloud computing is an environmentally friendly business, let’s just say… it’s not without its problems.

What’s most interesting is the huge opportunity the cloud presents for changing the energy sector – especially in developing economies. Consider the following factoids from the report:

  • Data centres to house the explosion of virtual information currently consume 1.5-2% of all global electricity; this is growing at a rate of 12% a year.
  • The IT industry points to cloud computing as the new, green model for our IT infrastructure needs, but few companies provide data that would allow us to objectively evaluate these claims.
  • The technologies of the 21st century are still largely powered by the dirty coal power of the past, with over half of the companies rated herein relying on coal for between 50% and 80% of their energy needs.

The 12% growth rate is astounding. It makes data centres essentially the fastest growing segment in the energy business – so the choices these companies make about how they power their server farms will dictate what the energy industry invests in. If they are content with coal, we’ll burn more coal. If they demand renewables, we’ll end up investing in renewables, and that’s what will end up powering not just server farms but lots of other things. It’s a powerful position big data and the cloud hold in the energy marketplace.
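To put that growth rate in perspective, here is a quick back-of-the-envelope calculation of my own; the 12% and 1.5-2% figures come from the report, the compounding is just arithmetic:

```python
# Back-of-the-envelope: how fast does data-centre electricity demand grow
# at ~12% a year? Illustration only -- the growth rate and the 1.5-2% share
# come from the Greenpeace report; the projection assumes they hold steady.
import math

growth = 0.12
doubling_time = math.log(2) / math.log(1 + growth)
print(f"Doubling time at 12%/yr: {doubling_time:.1f} years")   # ~6.1 years

share_now = 0.02                                  # upper end of the 1.5-2% estimate
share_in_10 = share_now * (1 + growth) ** 10      # relative to *today's* total demand
print(f"Equivalent share of today's global electricity in 10 years: {share_in_10:.1%}")  # ~6.2%
```

In other words, at that pace demand doubles roughly every six years – which is exactly why the energy choices of a handful of cloud companies matter so much.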

And of course, the report notes that many companies say many of the right things:

“Our main goal at Facebook is to help make the world more open and transparent. We believe that if we want to lead the world in this direction, then we must set an example by running our service in this way.”

– Mark Zuckerberg

But then Facebook is patently not transparent about where its energy comes from, so it is not easy to assess how good or bad they are, or how they are trending.

Indeed it is worth looking at Greenpeace’s Clean Cloud report card to see – just how dirty is your data?

Report-card-cloud

I’d love to see a session at the upcoming (or next year’s) Strata Big Data Conference on, say, “How to Use Big Data to Make Big Data More Green.” Maybe even a competition to that effect, if there was some data that could be shared? Or maybe just a session where Greenpeace could present their research and engage the community.

Just a thought. Big data has got some big responsibilities on its shoulders when it comes to the environment. It would be great to see them engage on it.

Lessons for Open Source Communities: Making Bug Tracking More Efficient

This post is a discussion about making bug tracking in Bugzilla for the Mozilla project more efficient. However, I believe it is applicable to any open source project or even companies or governments running service desks (think 311).

Almost exactly a year ago I wrote a blog post titled Some thoughts on improving Bugzilla, in which I made several suggestions for improving the workflow in Bugzilla. Happily, a number of those ideas have been implemented.

One, however, remains outstanding and, I believe, creates an unnecessary amount of triage work as well as a terrible experience for end users. My understanding is that while the bug could not be resolved last year for a few reasons, there is growing interest (exemplified originally in the comment field of my original post) in tackling it once again. This is my attempt at a rallying cry to get that process moving.

For those who are already keen on this idea and don’t want to read anything more below: this refers to bug 444302.

The Challenge: Dealing with Support Requests that Arrive in Bugzilla

I first had this idea last summer while talking to the triage team at the Mozilla Summit. These are the guys who look at the firehose of bugs being submitted to Mozilla every day. They have a finite amount of time, so anything we can do to automate their work is going to help them, and the project, out significantly.

Presently, I’m told that Mozilla gets a huge number of bugs submitted that are not actually bugs, but support issues. This creates several challenges.

First, it means that support related issues, as opposed to real problems with the software, are clogging up the bug tracking system. This increases the amount of noise in the system – making it harder for everyone to find the information they need.

Second, it means the triage team has to spend time filtering out bugs that are actually support issues. Not a good use of their time.

Third, it means that users who have real support issues but submit them accidentally through Bugzilla get a terrible experience.

This last one is a real problem. If you are a user, feeling frustrated (and possibly not behaving as your usual rational self – we’ve all been there) because your software is not working the way you expect, and you then submit what a triage person considers a support issue (Resolve-Invalid), you get an email that looks like this:


If I’m already cheesed that my software isn’t doing what I want, getting an email that says “Invalid” and “Verified” is really going to cheese me off. That, of course, presumes I even know what this email means. More likely, I’ll be thinking that some ancient machine in the bowels of Mozilla, using software created in the late 1990s, received my plea and has, in its 640K confusion, spammed me. (I mean, look at it… from a user’s perspective!)

The Proposal: Re-Automating the Process for a better result

Step 1: My sense is that this issue – especially problem #3 – could be resolved by simply creating a new resolution field. I’ve opted to call it “Support” but am happy to name it something else.

This feels like a simple fix and it would quickly move a lot of bugs that are cluttering up Bugzilla… out.

Step 2: Query the text of bugs marked “Support” against Mozilla’s support database (SUMO). Then insert the results in an email that goes back to the user. I’m imagining something that might look like this:

SUMO-transfer-v2

Such an email has several advantages:

First, if these are users who’ve submitted inappropriate bugs and who really need support, giving them a Bugzilla email isn’t going to help them – they aren’t even going to know how to read it.

Second, there is an opportunity to explain to them where they should go for help – I haven’t done that explicitly enough in this email – but you get the idea.

Third, because we’ve done a query of the Mozilla support database (SUMO), we are able to include some support articles that might resolve their issue.

Fourth, if this really is a bug from a more sophisticated user, we give them a hyperlink back to Bugzilla so they can make a note or comment.

What I like about this is that it is customized engagement at a low cost. More importantly, it helps unclutter things while also making us more responsive and creating a better experience for users.
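To make the two steps concrete, here is a rough sketch of what the automation might look like. This is my own illustration, not an actual Mozilla implementation: it assumes Bugzilla’s REST API, while the SUMO search endpoint and the “SUPPORT” resolution are hypothetical (neither exists today).

```python
# Rough sketch of the proposed flow, for illustration only.
# Uses Bugzilla's REST API; the SUMO search URL and the SUPPORT
# resolution are hypothetical stand-ins for what would need to exist.
import requests

BUGZILLA = "https://bugzilla.mozilla.org/rest"
SUMO_SEARCH = "https://support.mozilla.org/api/search"  # hypothetical endpoint

def handle_support_bug(bug_id, api_key):
    # Step 1: mark the bug with the proposed new "SUPPORT" resolution.
    requests.put(
        f"{BUGZILLA}/bug/{bug_id}",
        params={"api_key": api_key},
        json={"status": "RESOLVED", "resolution": "SUPPORT"},  # proposed resolution
    )

    # Step 2: use the bug's summary to look up likely-relevant SUMO articles.
    bug = requests.get(f"{BUGZILLA}/bug/{bug_id}").json()["bugs"][0]
    articles = requests.get(SUMO_SEARCH, params={"q": bug["summary"]}).json()

    # Build the friendly email: where to get help, a few suggested articles,
    # and a link back to the bug for more sophisticated users.
    lines = [
        "Thanks for your report! It looks like a support question rather than",
        "a bug, so we've moved it to support.mozilla.com. These articles may help:",
    ]
    lines += [f"  * {a['title']}: {a['url']}" for a in articles[:3]]
    lines.append(f"If this really is a bug, add a comment here: "
                 f"https://bugzilla.mozilla.org/show_bug.cgi?id={bug_id}")
    return "\n".join(lines)  # hand off to whatever actually sends the mail
```

The point of the sketch is simply that everything needed already exists or is easy to build: the triage decision stays human, and the follow-up becomes automatic.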

Next Steps:

It’s my understanding that this is all pretty doable. After last year’s post there were several helpful comments, including this one from Bugzilla expert Gervase Markham:

The best way to implement this would be a field on SUMO where you paste a bug number, and it reaches out, downloads the Bugzilla information using the Bugzilla API, and creates a new SUMO entry using it. It then goes back and uses the API to automatically resolve the Bugzilla bug – either as SUPPORT, if we have that new resolution, or INVALID, or MOVED (which is a resolution Bugzilla has had in the past for bugs moved elsewhere), or something else.

The SUMO end could then send them a custom email, and it could include hyperlinks to appropriate articles if the SUMO engine thought there were any.

And Tyler Downer noted in this comment that there may be a dependency bug (#577561) that would also need resolving:

Gerv, I love you point 3. Exactly what I had in mind, have SUMO pull the relevant data from the bug report (we just need BMO to autodetect firefox version numbers, bug 577561 ;) and then it should have most of the required data. That would save the user so much time and remove a major time barrier. They think “I just filed a bug, now they want me to start a forum thread?” If it does it automatically, the user would be so much better served.

So, if there is interest in doing this, let me know. I’m happy to support any discussion, whether it takes place in the comment stream of the bug, the comments below, or somewhere else that might be helpful (maybe I should dial in on this call?). Regardless, this feels like a quick win, one that would better serve Mozilla users, teach them (over time) to go to the right place for support, and improve the Bugzilla workflow. It might be worth implementing even for a trial period, and we can assess any positive or negative feedback after 6 months.

Let me know how I can help.

Additional Resources

Bug 444302: Provide a means to migrate support issues that are misfiled as bugs over to the support.mozilla.com forums.

My previous post: Some thoughts on improving Bugzilla. The comments are worth checking out

Mozilla’s Bugzilla Wiki Page

Why I’m Struggling with Google+

So it’s been a couple of weeks since Google+ launched and I’ll be honest, I’m really struggling with the service. I wanted to give it a few weeks before writing anything, which has been helpful in letting my thinking mature.

First, before my Google friends get upset, I want to acknowledge that the reason I’m struggling has more to do with me than with Google+. My sense is that Google+ is designed to manage personal networks. In terms of social networking, the priority, like at Facebook, is on a soft version of the word “social,” e.g. making the experience friendly and social, not necessarily efficient.

And I’m less interested in the personal experience than in the learning/professional/exchanging experience. Mark Jones, the global communities editor for Reuters, completely nailed what drives my social networking experience in a recent Economist special on the news industry: “The audience isn’t on Twitter, but the news is on Twitter.” Exactly! That’s why I’m on Twitter. ’Cause that’s where the news is. It is where the thought leaders are interacting and engaging one another, which is a very different activity from socializing. And I want to be part of all that. Getting intellectually stimulated and engaged – and maybe even, occasionally, shaping ideas.

And that’s what threw me initially about Google+. Because of where I’m coming from, I (like many people) initially focused on sharing updates, which invited comparisons of Google+ to Twitter, not Facebook. That was a mistake.

But if Google+ is about being social above all else, it is going to be more like Facebook than Twitter. And therein lies the problem. As a directory, I love Facebook. It is great for finding people, checking up on their profile and seeing what they are up to. For some people it is good for socializing. But as a medium for sharing information… I hate Facebook. I so rarely use it, it’s hard to remember the last time I checked my stream intentionally.

So I’m willing to accept that part of the problem is me. But I’m sure I’m not alone, so if you are like me, let me try to further break down why I (and maybe you too) are struggling.

Too much of the wrong information, too little of the right information.

The first problem with Google+ and Facebook is that they have both too much of the wrong information, and too little of the right information.

What do I mean by too much of the wrong? What I love about Twitter is its 140 character limit. Indeed, I’m terrified to read over at Mathew Ingram’s blog that some people are questioning this limit. I agree with Mathew: changing Twitter’s 140 character limit is a dumb idea. Why? For the same reason I thought it made sense back in March of 2009, before Google+ was even a thought:

What I love about Twitter is that it forces writers to be concise. Really concise. This in turn maximizes efficiency for readers. What is it Mark Twain said?  “I didn’t have time to write a short letter, so I wrote a long one instead.” Rather than having one, or even thousands of readers read something that is excessively long, the lone drafter must take the time and energy to make it short. This saves lots of people time and energy. By saying what you’ve got to say in 140 characters, you may work more, but everybody saves.

On the other hand, while I want a constraint over how much information each person can transmit, I want to be able to view my groups (or circles) of people as I please.

Consider the screen shot of TweetDeck below. Look how much information is being displayed in a coherent manner (of my choosing). It takes me maybe, maybe 30-60 seconds to scan all this. In one swoop I see what friends are up to, some of my favourite thought leaders, some columnists I respect… it is super fast and efficient. Even on my phone, switching between these columns is a breeze.

twitter

But now look at Google+. There are comments under each item… but I’m not sure I really care to see them. Rather than the efficient stream of content I want, I essentially have a stream of content I didn’t ask for. Worse, I can see, what, maybe 2-5 items per screen, and of course I see multiple circles on a single screen.

Google+1

Obviously, some of this is because Google+ doesn’t have any applications to display it in alternative forms. I find the Twitter homepage equally hard to use. So some of this could be fixed if (and hopefully when) Google makes public their Google+ API.

But it can’t solve some underlying problems. Because an item can be almost as long as the author wants, and there can be comments, Google+ doesn’t benefit from Twitter’s 140 character limit. As one friend put it, rather than looking at a stream of content, I’m looking at a blog in which everybody I know is a writer submitting content and in which an indefinite number of comments may appear. I’ll be honest: that’s not really a blog I’m interested in reading. Not because I don’t like the individual authors, but because it’s simply too much information, shared inefficiently.

Management Costs are too high

And herein lies the second problem. The management costs of Google+ are too high.

I get why “circles” can help solve some of the problems outlined above. But, as others have written, they create a set of management costs that I really can’t be bothered with. Indeed, this is the same reason Facebook is essentially broken for me.

One of the great things about Twitter is that it’s simple to manage: follow or don’t follow. I love that I don’t need people’s permission to follow them. At the same time, I understand that this isn’t ideal for managing divergent social groups. A lot of people live lives much more private than mine, or want to be able to share just among small, distinct groups of friends. When I want to do this, I go to email… that’s because the groups in my life are always shifting and it’s simple to just pick the email addresses. Managing circles and keeping track of them feels challenging for personal use. So Google+ ends up taking too much time to manage, which is, of course, also true of Facebook…

Using circles for professional reasons makes way more sense. That is essentially what I’ve got with Twitter lists. The downside here is that re-creating those lists as circles is a huge pain.

And now one unfair reason with some insight attached

Okay, so going to the Google+ website is a pain, and I’m sure it will be fixed. But presently my main Google account is centered around my eaves.ca address, and Google+ won’t work with Google Apps accounts, so I have to keep flipping to a Gmail account I loathe using. That’s annoying but not a deal breaker. The bigger problem is that my Google+ social network is now attached to an email account I don’t use. Worse, it isn’t clear I’ll ever be able to migrate it over.

My Google experience is Balkanizing and it doesn’t feel good.

Indeed, this hits on a larger theme: Early on, I often felt that one of the promises of Google was that it was going to give me more opportunities to tinker (like what Microsoft often offers in its products), but at the same time offer a seamless integrated operating environment (like what Apple, despite or because of their control freak evilness, does so well). But increasingly, I feel the things I use in Google are fractured and disconnected. It’s not the end of the world, but it feels less than what I was hoping for, or what the Google brand promise suggested. But then, this is what everybody says Larry Page is trying to fix.

And finally a bonus fair reason that’s got me ticked

Now I also have a reason for actively disliking Google+.

After scanning my address book and social network, it asked me if I wanted to add Tim O’Reilly to a circle. I follow Tim as a thought leader on Twitter, so naturally I thought – let’s get his thoughts via Google+ as well. It turns out, however, that Tim does not have a Google+ account. Later, when I decided to post something, a default setting I failed to notice sent emails to everyone in my circles without a Google+ account. So now I’m inadvertently spamming Tim O’Reilly who, frankly, doesn’t need to get crap spam emails from me or anyone. I feel bad for him because I suspect I’m not the only one doing it. He’s got 1.5 million followers on Twitter. That could be a lot of spam.

My fault? Definitely in part. But I think there’s a chunk of blame that can be heaped onto a crappy UI that seemed to want that outcome. In short: uncool, and not really aligned with the Google brand promise.

In the end…

I remember initially, I didn’t get Twitter; after first trying it briefly I gave up for a few months. It was only after the second round that it grabbed me and I found the value. Today I’m struggling with Google+, but maybe in a few months, it will all crystallize for me.

What I do get is that it is an improvement on Facebook, which seems to be becoming the new AOL – a sort of walled-garden internet that is still connected but doesn’t really want you off in the wilds having fun. Does Google+ risk doing the same to Google? I don’t know. But at least circles are clearly a much better organizing system than anything Facebook has on offer (which I’ve really failed to get into). They are far more flexible and easier to set up. But these features, and their benefits, are still not sufficient to overcome the cost of setting it all up and maintaining it…

Ultimately, if everybody moves, I’ll adapt, but I way prefer the simplicity of Twitter. If I had my druthers, I’d just post everything to Twitter and have it auto-post over to Google+ and/or Facebook as well.

But I don’t think that will happen. My guess is that for socially driven users (e.g. the majority of people) the network effects probably keep them at Facebook. And does Google+ have enough features to pull the more alpha type user away? I’m not sure. I’m not seeing it yet.

But I hope they try, as a little more competition in the social networking space might be good for everyone, especially when it comes to privacy and crazy end-user agreements.

The State of Open Data Licenses in Canada and where to go from here

(for readers less interested in Open Data – I promise something different tomorrow)

In February I wrote how 2011 would be the year of the license for Canada’s open data community. This has indeed been the case. For public servants and politicians overseeing the various open data projects happening in Canada and around the world, here is an outline of where we are, and what I hope will happen next. For citizens I hope this will serve as a primer and help explain why this matters. For non-Canadians, I hope this can help you strategize how to deal with the different levels of government in your own country.

This is important stuff, and it will be essential to success in the next open data challenge: aligning different jurisdictions around common standards.

Why Licenses Matter

Licenses matter because they determine how you are able to use government data – a public asset. As I outlined in the three laws of open data, data is only open if it can be found, be played with and be shared. The license deals with the last of these. If you are able to take government data and find some flaw or use it to improve a service, it means nothing if you are not able to share what you create with others. The more freedom you have in doing this, the better.

What we want from the license regime (and for your government)

There are a number of interests one is trying to balance in creating a license regime. You want it to be:

  • Open: there should be maximum freedom for reuse (see above, and this blog post)
  • Secure: it offers governments appropriate protections for privacy and security
  • Simple: to keep legal costs low, and make it easier for everyone to understand
  • Standardized: so my work is accessible across jurisdictions
  • Stable: so I know that the government won’t change the rules on me

At the moment, two licenses in Canada meet these tests: the Public Domain Dedication and License (PDDL), used by Surrey, Langley and Winnipeg (for its transit data), and the BC government open data portal license (which is a copy of the UK Open Government Licence).

Presently, a bunch of licenses do not. This includes the Government of Canada Open Data Licence Agreement for Unrestricted Use of Canada’s Data (couldn’t they have chosen a better name? For a real critique, read this blog post). It also includes the variants of the license created by Vancouver and now used by Toronto, Ottawa and Edmonton (among others). Full disclosure: I was peripherally involved in the creation of this license – it was necessary at the time.

Neither of these licenses is standardized, both have restrictions not found in the UK/BC Open Government Licence or the PDDL, and they are anything but simple. Nor are they stable: the government can revoke them at any time. In other words, many developers and companies interested in open data dislike them immensely.

Where do we go from here?

At the moment there is a range of licenses in use in Canada – this undermines the ability of developers to create software that uses open data across multiple jurisdictions.

First, the launch of BC’s open data portal and its use of the UK Open Government Licence has reset the debate in this country. The Federal government, which has an awkward, onerous and unloved license, should stop trying to create a new license that simply adds unnecessary complexity and creates confusion for software developers. (I detail the voluminous problems with the Federal license here.)

Instead, the Feds should adopt the UK Open Government Licence and push for it to be a standard, both for the provinces and federal government agencies, as well as for other Commonwealth countries. Their refusal to adopt the UK license is deeply puzzling. They have offered no explanation of why they can’t; indeed, it would be interesting to hear what the Federal Government believes it knows that the UK government (which has been doing this for much longer) and the BC government don’t.

What I predict will happen is that more and more provinces will adopt the UK license and increasingly the Feds will look isolated and ridiculous. Barring some explanation, this silliness should end.

At the municipal level, things are more complicated. If you look at the open data portals of Vancouver, Toronto, Edmonton and Ottawa (sometimes referred to as the G4) you’ll notice each has a similar paragraph:

The Cities of Vancouver, Edmonton, Ottawa and Toronto have recently joined forces to collaborate on an “Open Data Framework”. The project aims to enhance current open data initiatives in the areas of data standards and terms of use agreements. Please contact us for further information.

This paragraph has been sitting on these sites for well over a year now (approaching almost two years), but in terms of data standards and common terms of use the G4 has, to date, produced nothing tangible for end users. (Full disclosure: I have sat in on some of these meetings.) The G4 cities, which were leaders, are now languishing with a license that actually puts them in the middle, not the front, of the pack. They remain ahead of the bulk of Canadian cities that have no open data but, in terms of license, behind the aforementioned cities of Surrey, Langley and Winnipeg (for its transit data).

These second-generation open data cities either had fewer resources or drew the right lessons, and have leap-frogged the G4 cities by adopting the PDDL – something they did because it essentially outsourced the management of the license to a competent third party. It maximized the usefulness of their data while limiting their costs, all while giving them the same level of protection.

The UK and BC versions of the Open Government Licence could work for the cities, but the PDDL is a better license. Also, it is well managed. If the cities were to adopt the OGL it wouldn’t be the end of the world, but it also isn’t necessary. It probably makes more sense for them to simply follow the new leaders in the space and adopt the PDDL, as it is less restrictive and easier to adopt.

Thus, speaking personally, the ideal situation in Canada would be for:

  • the Federal and Provincial Governments to adopt the UK/BC Open Government Licence. I’d love to live in a world where they adopted the PDDL, but my conversations with them lead me to believe this simply is not likely in the near to mid term. I think 99% of software developers out there will agree that the Open Government Licence is an acceptable substitute; and
  • the municipalities to adopt the PDDL. Already several municipalities have done this and the world has not ended. The bar has been set.

The worst outcome would be for:

  • the G4 municipalities to invent some new license. The last thing the world needs is another open data license to confuse users and increase legal costs; and
  • the federal government to continue along the path of evolving its own license. Its license was born broken and is unnecessary.

Sadly, I see little evidence for optimism at the federal level. However, I’m optimistic about the cities and provinces. The fact that most new open data portals at the municipal level have adopted the PDDL suggests that many in these governments “get it”. I also think the launch of data.gov.bc.ca will spur other provinces to be intelligent about their license choice.


Province of BC launches Open Data Catalog: What works

As revealed yesterday, the province of British Columbia became the first provincial government in Canada to launch an open data portal.

It’s still early but here are some things that I think they’ve gotten right.

1. License: Getting it Right (part 1)

Before anything else happens, this is probably the single biggest good news story for Canadians interested in the opportunities around open data. If the license is broken, it pretty much doesn’t matter how good the data is; it essentially gets put in a legal straitjacket and cannot be used. For BC’s open data portal this, happily, is not the case.

There are actually two good news stories here.

The first is that the license is good. Obviously my preference would be for everything to be unlicensed and in the public domain, as it is in the United States. Short of that, however, the most progressive license out there is the UK Government’s Open Government Licence for Public Sector Information. Happily, the BC government has essentially copied it. This means that much of BC’s open data can be used for commercial purposes, political advocacy, personal use and so forth. In short, the restrictions are minimal and, I believe, acceptable. The license addresses the concerns I raised back in March when I said 2011 would be the year of open data licenses in Canada.

2. License: The Virtuous Convergence (part 2)

The other great thing is that this is a standardized license. The BC government didn’t invent something new; they copied something that already worked. This is music to the ears of many, as it means applications and analysis developed in British Columbia can be ported seamlessly to other jurisdictions that use the same license. At the moment, that means all of the United Kingdom. There has been some talk of making the UK Open Government Licence (OGL) a standard that can be used across the Commonwealth – that, in my mind, would be a fantastic outcome.

My hope is that this will also put pressure on other jurisdictions to improve their licenses, converge them with BC/UK, or adopt a better license still. With the exception of the City of Surrey, which uses the PDDL, the BC government’s license is far superior to the licenses being used by other jurisdictions: the municipal licenses based on Vancouver’s (used by Vancouver, Edmonton, Ottawa, Toronto and a few others) and the Federal Government’s open data license (used by Treasury Board and CIDA) are both much more restrictive. Indeed, my real hope is that BC’s move will snap the Federal Government out of its funk, make it realize its own licenses are confusing, problematic and a waste of time, and encourage it to contribute to making the UK’s OGL a new standard for all of Canada. It would be much better than what it has on offer.

3. Tools for non-developers

Another nice thing about the data.gov.bc.ca website is that it provides tools for non-developers, so that they can play with, and learn from, some of the data. This is, of course, standard fare on most newer open data portals – indeed, it seems to be the primary focus of Socrata, a company that specializes in creating open government data portals. The goal everywhere is to increase the number of people who can make use of the data.

4. Meaty Data – Including Public Accounts

One of the charges sometimes leveled against open data portals is that they don’t publish data that is important, or that could drive substantive public policy debates. While this is not true of what has happened in the UK and the United States, that charge probably is somewhat fair in Canada. While I’m still exploring the data available on data.gov.bc.ca, one thing seems clear: there is a commitment to getting the more “high-value” data sets out to the public. For example, I’ve already noticed you can download the Consolidated Revenue Fund Detailed Schedules of Payments-FYE10-Suppliers, which for the fiscal year 2009-2010 details the payees who received $25,000 or more from the government. I also noticed that the Provincial Obstacles to Fish Passage are available for download – something I hope our friends in the environmental movement will find helpful. There is also an entire section dedicated to data on the provincial educational system; I’ll be exploring that in more detail.

I wanted to publish this for now, but I’m definitely keen to hear others’ thoughts and comments on the data portal, data sets you find interesting or helpful, or anything else. If you are building an app using this data, or doing an analysis that is made easier because of the data on this site, I’d love to hear from you.

This is a big step for the province. I’m sure I’ll discover some shortcomings as I dive deeper, but this is a solid start and, I hope, an example to other provinces about what is possible.

Using Data to Make Firefox Better: A mini-case study for your organization

I love Mozilla. Any reader of this blog knows it. I believe in its mission, I find the organization totally fascinating and its processes engrossing. So much so that I spend a lot of time thinking about it – and hopefully, finding ways to contribute.

I’m also a big believer in data. I believe in the power of evidence-based public policy (hence my passion about the long-form census) and in the ability of data to help organizations develop better products, and people make smarter decisions.

Happily, a few months ago I was able to merge these two passions: analyzing data in an effort to help Mozilla understand how to improve Firefox. It was fun. But more importantly, the process says a lot about the potential for innovation open to organizations that cultivate an engaged user community.

So what happened?

In November 2010, Mozilla launched a visualization competition that asked: How do People Use Firefox? As part of the competition, they shared anonymous data collected from Test Pilot users (people who agreed to share anonymous usage data with Mozilla). Working with my friend (and quant genius) Diederik Van Liere, we analyzed the impact of add-on memory consumption on browser performance to find out which add-ons use the most memory and thus are most likely slowing down the browser (and frustrating users!). (You can read about our submission here).
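For the curious, the core of that kind of analysis is straightforward. Here is a minimal sketch of the approach; the file name and column names are hypothetical stand-ins for the actual Test Pilot export, not the data we worked with:

```python
# Sketch of the add-on analysis: which add-ons are associated with the most
# memory use? File and column names are hypothetical stand-ins for the
# Test Pilot export.
import pandas as pd

# One row per browser session, with the add-on observed and memory used.
events = pd.read_csv("test_pilot_sessions.csv")   # hypothetical export

stats = events.groupby("addon_name")["memory_mb"].agg(["mean", "median", "count"])
stats = stats[stats["count"] >= 100]              # ignore rarely-seen add-ons
ranking = stats.sort_values("mean", ascending=False)
print(ranking.head(10))   # the add-ons most likely to be slowing Firefox down
```

The interesting work, of course, was not the ranking itself but deciding how to put it in front of users.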

But doing the analysis wasn’t enough. We wanted Mozilla engineers to know that we thought users should be shown the results – so they could make more informed choices about which add-ons they download. Our hope was to put pressure on add-on developers to make sure they weren’t ruining Firefox for their users. To do that we visualized the data by making a mock-up of their website – with our data inserted.

FF-memory-visualizations2.001

For our efforts, we won an honourable mention. But winning a prize is far, far less cool than actually changing behaviour or prompting a real change. So last week, during a trip to Mozilla’s offices in Mountain View, I was thrilled when one of the engineers pointed out that the add-on site now has a page listing the add-ons that most slow down Firefox’s start-up time.

Slow Performing Add-ons – Add-ons for Firefox (screenshot)

(Sidebar: Anyone else find it ironic that “FastestFox: Browse Faster” is #5?)

This is awesome! Better still, in April, Mozilla launched an add-on performance improvement initiative to help reduce the negative impact add-ons can have on Firefox. I have no idea if our submission to the visualization competition helped kick-start this project; I’m sure there were many smart people at Mozilla already thinking about this. Maybe it was already underway? But I like to believe our ideas helped push their thinking – or, at least, validated some of their ideas. And of course, I hope it continues to. I still believe that the above-cited data shouldn’t be hidden on a webpage well off the beaten path, but should be located right next to every add-on. That’s the best way to create the right feedback loops, and is in line with Mozilla’s manifesto – empowering users.

Some lessons (for Mozilla, companies, non-profits and governments)

First lesson: innovation comes from everywhere. So why aren’t you tapping into it? Diederik and I are all too happy to dedicate some cycles to thinking about ways to make Firefox better. If you run an organization that has a community of interested people larger than your employee base (I’m looking at you, governments), why aren’t you finding targeted ways to engage them, not in endless brainstorming exercises, but in innovation challenges?

Second, get strategic about using data. A lot of people (including myself) talk about open data. Open data is good. But it can’t hurt to be strategic about it as well. I tried to argue for this in the government and healthcare space with this blog post. Data-driven decisions can be made in lots of places; what you need to ask yourself is: What data are you collecting about your product and processes? What, of that data, could you share, to empower your employees, users, suppliers, customers, whoever, to make better decisions? My sense is that the companies (and governments) of the future are going to be those that react both quickly and intelligently to emerging challenges and opportunities. One key to being competitive will be to have better data to inform decisions. (Again, this is the same reason why, over the next two decades, you can expect my country to start making worse and worse decisions about social policy and the economy – they simply won’t know what is going on).

Third, if you are going to share, get a data portal. In fact, Mozilla needs an open data portal (there is a blog post coming on that). Mozilla has always relied on volunteer contributors to help write Firefox and submit patches to bugs. The same is true for analyzing its products and processes. An open data portal would enable more people to help find ways to keep Firefox competitive. Of course, this is also true for governments and non-profits (to help find efficiencies and new services) and for companies.

Finally, reward good behaviour. If contributors submit something you end up using… let them know! Maybe the idea Diederik and I submitted never informed anything the add-on group was doing; maybe it did. But if it did… why not let us know? We are so pumped about the work they are doing, we’d love to hear more about it. Finding out by accident seems like a lost opportunity to engage interested stakeholders. Moreover, back at the time, Diederik was thinking about his next steps – now he works for the Wikimedia Foundation. But it made me realize how an innovation challenge could be a great way to spot talent.

The Audacity of Shaw: How Canada's Internet just got Worse

It is really, really, really hard to believe. But as bad as internet access is in Canada, it just got worse.

Yesterday, Shaw Communications, a Canadian telecommunications company and internet service provider (ISP) that operates mostly in Western Canada, announced it is launching Movie Club, a new service to compete with Netflix.

On the surface this sounds like a good thing. More offerings should mean more competition, more choice and lower prices. All things that would benefit consumers.

Look only slightly closer and you learn the very opposite is going on.

This is because, as the article points out:

“…subscribers to Movie Club — who initially can watch on their TV or computer, with phones and tablets planned to come on line later — can view content without it counting against their data plan.

“There should be some advantage to you being a customer,” Bissonnette said.”

The very reason the internet has been such an amazing part of our lives is that every service that is delivered on it is treated equally. You don’t pay more to look at the Vancouver Sun’s website than you do to look at eaves.ca or CNN or to any other website in the world. For policy and technology geeks this principle of equality of access is referred to as net neutrality. The idea is that ISPs (like Shaw) should not restrict or give favourable access to content, sites, or services on the internet.

But this is precisely what Shaw is doing with its new service.

This is because ISPs in Canada charge what are called “overages.” This means if you use the internet a lot – say you watch a lot of videos – at a certain point you will exceed a “cap” and Shaw charges you extra, beyond your fixed monthly fee. If, for example, you use Netflix (which is awesome and cheap – for $8 a month you get unlimited access to a huge quantity of content), you will obviously be watching a large number of videos, and the likelihood of exceeding the cap is quite high.
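To get a feel for how quickly that happens, here is a quick back-of-the-envelope calculation. The cap size and the per-hour data figure are my own illustrative assumptions, not Shaw’s actual plan details:

```python
# Back-of-the-envelope only: the cap and per-hour figures below are
# illustrative assumptions, not Shaw's actual plan details.
monthly_cap_gb = 60          # hypothetical monthly data cap
gb_per_hour_streaming = 1.0  # rough figure for standard-definition streaming

hours_before_overage = monthly_cap_gb / gb_per_hour_streaming
print(f"Hours of streaming before hitting the cap: {hours_before_overage:.0f}")
# ~60 hours, i.e. about two hours a night -- easy for a Netflix household
# to exceed, and every hour past that point triggers overage fees.
```

Whatever the exact numbers on your plan, the structure is the same: heavy use of a competitor’s video service runs into the cap, while Shaw’s own service does not.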

What Shaw has announced is that if you use their service – Movie Club – none of the videos you watch will count against your cap. In other words they are favouring their service over that of others.

So why should you care? Because, in short, Shaw is making the internet suck. It wants to turn your internet from the awesome experience where you have unlimited choice and can try any service that is out there into the experience of cable, where your choice is limited to the channels they choose to offer you. Today they’ll favour their movie service over the (much better) Netflix service. But tomorrow they may decide: hey, you are using Skype instead of our telephone service, so people who use “our Skype” will get cheaper access than people who use Skype. Shaw is effectively applying a tax on new, innovative and disruptively cheap services on the internet so that you don’t use them. They are determining – through pricing – what you can and cannot do with your computer, while elsewhere in the world people will be using cool new disruptive services that give them better access to more fun content, for cheaper. Welcome to the sucky world of Canada’s internet.

Doubling down on Audacity: The Timing

Of course, what makes this all the more obscene is that Shaw has announced this service at the very moment the CRTC – the body that regulates Canada’s internet service providers – is holding hearings on usage-based billing. One of the reasons Canada’s internet providers say they have to charge “overages” for those who use the internet a lot is that there isn’t enough bandwidth. But how is it that there is enough bandwidth for their own services?

As Steve Anderson of OpenMedia – a consumer advocacy group – shared with me yesterday, “It’s a huge abuse of power,” and “The launch of this service at the time when the CRTC is holding a hearing on pricing regulation should be seen as a slap in the face to the CRTC, and the four hundred and ninety one thousand Canadians that signed the Stop The Meter petition.”

My own feeling is the solution is pretty simple. We need to get the ISPs out of the business of delivering content. Period. Their job should be to deliver bandwidth, and nothing else. Do that, and you’ll have them competing over speed and price very, very quickly. Until then, the incentive of ISPs isn’t to offer good internet service; it’s to do the opposite – to encourage (or force) users to use the services they offer over the internet.

For myself, I’m a Shaw customer and a Netflix customer. Until now I’ve had nothing to complain about with either. Now, apparently, I have to choose between the two. I can tell you right now who is going to win. Over the next few months I’m going to be moving my internet service to another provider. Maybe I’ll still get cable TV from Shaw, I don’t know, but my internet service is going to a company that gives me the freedom to choose the services I want and that doesn’t ding me with fees that, apparently, I’m being charged under false pretenses. I’ll be telling my family members, friends and pretty much everyone I know to do the same.

Shaw, I’m sorry it had to end this way. But as a consumer, it’s the only responsible thing to do.