Category Archives: open source

Shared IT Services across the Canadian Government – three opportunities

Earlier this week the Canadian Federal Government announced it will be creating Shared Services Canada which will absorb the resources and functions associated with the delivery of email, data centres and network services from 44 departments.

These types of shared services projects are always fraught with danger. While they are sometimes successful, they are often disasters: highly disruptive, with little to show in the way of results (and eventually unwound). However, I suspect there is a significant amount of savings to be made, and I remain optimistic. With luck, the analogy here is the work of outgoing US CIO Vivek Kundra, who has sought to close down and consolidate 800 data centres across the US – an effort that is yielding some serious savings.

So here’s what I’m hoping Shared Services Canada will mean:

1) A bigger opportunity for Open Source

What I’m still more hopeful about – although not overly optimistic – is the role that open source could play in the solutions Shared Services Canada will implement. Over on the Drupal site, one contributor claims government officials have been told to hold off buying web content management systems as the government prepares to buy a single solution for use across all departments.

If the government is serious about lowering its costs it absolutely must rethink its procurement models so that open source solutions can at least be made a viable option. If not, this whole exercise may still save the government money, but the savings will be of the “we moved from five expensive solutions to one expensive solution” variety.

On the upside, some of that work has clearly taken place. Already there are several federal government websites running on Drupal, such as this Ministry of Public Works website and the NRCAN and DND intranets. Moreover, there are real efforts in the open source community to accommodate government. In the United States, OpenPublic has fostered a version of Drupal designed for government’s needs.

Open source solutions have the added bonus of giving you the option of using more local talent, which, if stimulus is part of the goal, would be wise. Also, any open source solutions fostered by the federal government could be picked up by the provinces, creating further savings for taxpayers. As a bonus, you can also fire incompetent implementers, something that needs to happen a little more often in government IT.

2) More accountability

Ministers Ambrose and Clement are laser-focused on finding savings – pretty much every ministry needs to find 5 or 10% savings across the board. I also know both speak passionately about managing taxpayers’ dollars: “Canadians work hard for their money and expect our Government to manage taxpayers dollars responsibly. Shared Services Canada will have a mandate to streamline IT, save money, and end waste and duplication.”

Great. I agree. So one of Shared Services Canada’s first acts should be to follow in the footsteps of another Vivek Kundra initiative and recreate his incredibly successful IT Dashboard. Indeed, it was by using the dashboard that Kundra was able to “cut the time in half to deliver meaningful [IT system] functionality and critical services, and reduced total budgeted [Federal government IT] costs by over $3 billion.” Now that’s some serious savings. It’s a great example of how transparency can drive effective organizational change.

And here’s the kicker: the White House open sourced the IT Dashboard (the code can be downloaded here). So while it will require some work to adapt it, the software is there and a lot of the heavy lifting has been done. Again, if we are serious about this, the path forward is straightforward.

3) More open data

Speaking of transparency… one place shared services could really come in handy is in creating data warehouses for hosting critical government data sets (ideally in the cloud). I suspect there are a number of important datasets used by public servants across ministries, so getting them on a robust, accessible platform would make a lot of sense. This, of course, would also be an ideal opportunity to engage in a massive open data project. It might be easier to create policy for making the data managed by Shared Services Canada “open.” Indeed, this blog post covers some of the reasons why now is the time to think about that issue.

So congratulations on the big move everyone and I hope these suggestions are helpful. Certainly we’ll be watching with interest – we can’t have a 21st century government unless we have 21st century infrastructure, and you’re now the group responsible for it.

Open Source Data Journalism – Happening now at BuzzData

(there is a section on this topic focused on governments below)

A hint of how social data could change journalism

Anyone who’s heard me speak in the last 6 months knows I’m excited about BuzzData. This week, while still in limited-access beta, the site is showing hints of its potential – and it still has only a few hundred users.

First, what is BuzzData? It’s a website that allows data to be easily uploaded and shared among any number of users. (For hackers – it’s essentially GitHub for data, but more social.) It makes it easy for people to copy data sets, tinker with them, share the results back with the original master, and mash them up with other data sets, all while engaging with those who care about that data set.

So, what happened? Why is any of this interesting? And what does it have to do with journalism?

Exactly a month ago Svetlana Kovalyova of Reuters had her article – Food prices to remain high, UN warns – re-published in the Globe and Mail.  The piece essentially outlined that food commodities were getting cheaper because of local conditions in a number of regions.

Someone at the Globe and Mail decided to go a step further and upload the data – the annual food price indices from 1990 to the present – to BuzzData, presumably so they could play around with it. This is nothing complicated; it’s a pretty basic chart. Nonetheless, a dozen or so users started “following” the dataset and, about 11 days ago, one of them, David Joerg, asked:

The article focused on short-term price movements, but what really blew me away is: 1) how the price of all these agricultural commodities has doubled since 2003 and 2) how sugar has more than TRIPLED since 2003. I have to ask, can anyone explain WHY these prices have gone up so much faster than other prices? Is it all about the price of oil?

He then did a simple visualization of the data.

[Image: FoodPrices – Joerg’s visualization of the food price data]

In response, someone from the Globe and Mail named Mason answered:

Hi David… did you create your viz based on the data I posted? I can’t answer your question but clearly your visualization brought it to the forefront. Thanks!

But of course, in a process that mirrors what often happens in the open source community, another “follower” of the data shows up and refines the work of the original commentator. In this case, an Alexander Smith notes:

I added some oil price data to this visualization. As you can see the lines for everything except sugar seem to move more or less with the oil. It would be interesting to do a little regression on this and see how close the actual correlation is.

The first thing to note is that Smith has added data, “mashing in” the oil price per barrel. So now the data set has been made richer. In addition, his graph is quite nice, as it makes the correlation more visible than Joerg’s graph, which only referenced the Oil Price Index. It also becomes apparent, looking at this chart, how much of an outlier sugar really is.

[Image: oilandfood – Smith’s chart with the oil price data added]

Perhaps some regression is required, but Smith’s graph is pretty compelling. What’s more interesting is that the price of oil is never mentioned in the article as a driver of food commodity prices. So maybe it’s not relevant. But maybe it deserves more investigation – and a significantly better piece, one that would provide better information to the public, could be written in the future. In either case, this discussion, conducted by non-experts simply looking at the data, helped surface some interesting leads.
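As an aside, the check Smith proposes is only a few lines of code once the data is in hand. Here is a minimal sketch in Python; the file and column names are invented for illustration, since I haven’t inspected the actual BuzzData export.

```python
# Rough sketch of the check Alexander Smith proposes above: how closely does
# each food price index track the oil price? File and column names are hypothetical.
import pandas as pd

# Monthly food price indices with an oil price column "mashed in",
# e.g. exported from BuzzData as a CSV file.
df = pd.read_csv("food_and_oil_prices.csv", parse_dates=["month"], index_col="month")

oil = df["oil_usd_per_barrel"]
food = df.drop(columns=["oil_usd_per_barrel"])

# Pearson correlation of each commodity index against the oil price.
print(food.corrwith(oil).sort_values(ascending=False))
```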

And therein lies the power of social data.

With only a handful of users, a deeper, better analysis of the story has taken place. Why? Because people are able to access the data and look at it directly. If you’re a follower of Julian Assange of WikiLeaks, you might call this scientific journalism – maybe it is, maybe it isn’t – but it certainly is a much more transparent way of doing analysis, and a potential audience builder. Imagine if hundreds or thousands of readers were engaged in the data underlying a story. What would that do to the story? What would that do to journalism? With BuzzData it also becomes easier to imagine a data journalist who spends a significant amount of their time on the site, working with a community of engaged pro-ams trying to find hidden meaning in the data they amass.

Obviously, this back and forth isn’t game changing. No smoking gun has been found. But I think it hints at a larger potential, one that it would be very interesting to see unlocked.

More than Journalism – I’m looking at you, government

Of course, it isn’t just media companies that should be paying attention. For years I’ve argued that governments – and especially politicians – interested in open data have an unhealthy appetite for applications. They like the idea of sexy apps on smart phones enabling citizens to do cool things. To be clear, I think apps are cool too. I hope that in cities and jurisdictions with open data we see more of them.

But open data isn’t just about apps. It’s about the analysis.

Imagine a city’s budget up on BuzzData. Imagine the flow rates of the water or sewage system. Or the inventory of trees. Think of how a community of interested and engaged “followers” could supplement that data, analyze it, and visualize it. Maybe they would be able to explain it to others better, find savings or potential problems, or develop new forms of risk assessment.

It would certainly make for an interesting discussion. If 100 or even just 5 new analyses were to emerge, maybe none of them would be helpful, or would provide any insights. But I have my doubts. I suspect it would enrich the public debate.

It could be that the analysis would become as sexy as the apps. And that’s an outcome that would warm this policy wonk’s soul.

Lessons for Open Source Communities: Making Bug Tracking More Efficient

This post is a discussion about making bug tracking in Bugzilla for the Mozilla project more efficient. However, I believe it is applicable to any open source project or even companies or governments running service desks (think 311).

Almost exactly a year ago I wrote a blog post titled Some thoughts on improving Bugzilla, in which I made several suggestions for improving the workflow in Bugzilla. Happily, a number of those ideas have been implemented.

One, however, remains outstanding and, I believe, creates an unnecessary amount of triage work as well as a terrible experience for end users. My understanding is that while the bug could not be resolved last year for a few reasons, there is growing interest (exemplified originally in the comments on my original post) in tackling it once again. This is my attempt at a rallying cry to get that process moving.

For those who are already keen on this idea and don’t want to read anything more below: this refers to bug 444302.

The Challenge: Dealing with Support Requests that Arrive in Bugzilla

I first had this idea last summer while talking to the triage team at the Mozilla Summit. These are the guys who look at the firehose of bugs being submitted to Mozilla every day. They have a finite amount of time, so anything we can do to automate their work is going to help them, and the project, out significantly.

Presently, I’m told that Mozilla gets a huge number of bugs submitted that are not actually bugs, but support issues. This creates several challenges.

First, it means that support related issues, as opposed to real problems with the software, are clogging up the bug tracking system. This increases the amount of noise in the system – making it harder for everyone to find the information they need.

Second, it means the triage team has to spend time filtering out bugs that are actually support issues. Not a good use of their time.

Third, it means that users who have real support issues, but submit them accidentally through Bugzilla, get a terrible experience.

This last one is a real problem. If you are a user feeling frustrated (and possibly not behaving as your usual rational self – we’ve all been there) because your software is not working the way you expect, and you then submit what a triage person considers a support issue (Resolve-Invalid), you get an email that looks like this:


If I’m already cheesed that my software isn’t doing what I want, getting an email that says “Invalid” and “Verified” is really going to cheese me off. That, of course, presumes I even know what this email means. More likely, I’ll be thinking that some ancient machine in the bowels of Mozilla, using software created in the late 1990s, received my plea and has, in its 640K confusion, spammed me. (I mean, look at it… from a user’s perspective!)

The Proposal: Re-Automating the Process for a better result

Step 1: My sense is that this issue – especially problem #3 – could be resolved by simply creating a new resolution field. I’ve opted to call it “Support” but am happy to name it something else.

This feels like a simple fix and it would quickly move a lot of bugs that are cluttering up bugzilla… out.

Step 2: Query the text of bugs marked “Support” against Mozilla’s support database (SUMO). Then insert the results in an email that goes back to the user. I’m imagining something that might look like this:

[Image: SUMO-transfer-v2 – mockup of the proposed reply email, with links to relevant support articles]
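To make Step 2 a little more concrete, here is a rough sketch of what that automation could look like. The SUMO search endpoint, its parameters, and the wording of the email are all assumptions on my part – the real integration would use whatever API support.mozilla.com actually exposes.

```python
# Hypothetical sketch: search SUMO for articles matching a support-flagged bug
# and build a friendlier reply than the standard Bugzilla "INVALID" mail.
# The SUMO endpoint and response fields below are assumptions, not a documented API.
import requests

def find_support_articles(bug_summary, limit=3):
    resp = requests.get(
        "https://support.mozilla.com/api/search",   # placeholder endpoint
        params={"q": bug_summary, "locale": "en-US"},
        timeout=10,
    )
    resp.raise_for_status()
    return [(hit["title"], hit["url"]) for hit in resp.json()["results"][:limit]]

def build_reply(bug_id, bug_summary):
    lines = [
        "Thanks for your report! It looks like this is a support question",
        "rather than a bug, so we've moved it over to our support site.",
        "These articles may help:",
    ]
    for title, url in find_support_articles(bug_summary):
        lines.append("  * %s - %s" % (title, url))
    lines.append(
        "If you believe this really is a bug in the software, you can comment here: "
        "https://bugzilla.mozilla.org/show_bug.cgi?id=%d" % bug_id
    )
    return "\n".join(lines)
```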

Such an email has several advantages:

First, if these are users who’ve submitted inappropriate bugs and who really need support, giving them a Bugzilla email isn’t going to help them – they aren’t even going to know how to read it.

Second, there is an opportunity to explain to them where they should go for help – I haven’t done that explicitly enough in this email – but you get the idea.

Third, because we’ve done a query of the Mozilla support database (SUMO), we are able to include some support articles that might resolve their issue.

Fourth, if this really is a bug from a more sophisticated user, we give them a hyperlink back to bugzilla so they can make a note or comment.

What I like about this is that it is customized engagement at a low cost. More importantly, it helps unclutter things while also making us more responsive and creating a better experience for users.

Next Steps:

It’s my understanding that this is all pretty doable. After last year’s post there were several helpful comments, including this one from Bugzilla expert Gervase Markham:

The best way to implement this would be a field on SUMO where you paste a bug number, and it reaches out, downloads the Bugzilla information using the Bugzilla API, and creates a new SUMO entry using it. It then goes back and uses the API to automatically resolve the Bugzilla bug – either as SUPPORT, if we have that new resolution, or INVALID, or MOVED (which is a resolution Bugzilla has had in the past for bugs moved elsewhere), or something else.

The SUMO end could then send them a custom email, and it could include hyperlinks to appropriate articles if the SUMO engine thought there were any.

And Tyler Downer noted in this comment that there may be a dependency bug (#577561) that would also need resolving:

Gerv, I love you point 3. Exactly what I had in mind, have SUMO pull the relevant data from the bug report (we just need BMO to autodetect firefox version numbers, bug 577561 ;) and then it should have most of the required data. That would save the user so much time and remove a major time barrier. They think “I just filed a bug, now they want me to start a forum thread?” If it does it automatically, the user would be so much better served.
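To put Gerv’s suggestion in slightly more concrete terms, the Bugzilla side of that hand-off might look something like the sketch below. It follows the conventions of the current Bugzilla REST API (in practice SUMO would call whatever API BMO exposes), and the SUPPORT resolution is, of course, still hypothetical – it is exactly what bug 444302 asks for.

```python
# Sketch of the hand-off Gerv describes: pull the bug's details so SUMO can
# create a forum thread, then resolve the bug with a pointer to that thread.
# Follows Bugzilla REST API conventions; the SUPPORT resolution is hypothetical.
import requests

BUGZILLA = "https://bugzilla.mozilla.org/rest"

def fetch_bug(bug_id):
    resp = requests.get("%s/bug/%d" % (BUGZILLA, bug_id), timeout=10)
    resp.raise_for_status()
    return resp.json()["bugs"][0]   # summary, product, version, creator, ...

def resolve_as_support(bug_id, api_key, sumo_thread_url):
    payload = {
        "status": "RESOLVED",
        "resolution": "INVALID",    # or "SUPPORT", if bug 444302 adds that resolution
        "comment": {"body": "Moved to the support forum: %s" % sumo_thread_url},
    }
    resp = requests.put(
        "%s/bug/%d" % (BUGZILLA, bug_id),
        params={"api_key": api_key},
        json=payload,
        timeout=10,
    )
    resp.raise_for_status()
```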

So, if there is interest in doing this, let me know. I’m happy to support any discussion, whether it takes place in the comment stream of the bug, the comments below, or somewhere else that might be helpful (maybe I should dial in on this call?). Regardless, this feels like a quick win, one that would better serve Mozilla users, teach them (over time) to go to the right place for support, and improve the Bugzilla workflow. It might be worth implementing even for a bit, and we can assess any positive or negative feedback after 6 months.

Let me know how I can help.

Additional Resources

Bug 444302: Provide a means to migrate support issues that are misfiled as bugs over to the support.mozilla.com forums.

My previous post: Some thoughts on improving Bugzilla. The comments are worth checking out

Mozilla’s Bugzilla Wiki Page

Using Data to Make Firefox Better: A mini-case study for your organization

I love Mozilla. Any reader of this blog knows it. I believe in its mission, I find the organization totally fascinating and its processes engrossing. So much so I spend a lot of time thinking about it – and hopefully, finding ways to contribute.

I’m also a big believer in data. I believe in the power of evidence-based public policy (hence my passion about the long-form census) and in the ability of data to help organizations develop better products, and people make smarter decisions.

Happily, a few months ago I was able to merge these two passions: analyzing data in an effort to help Mozilla understand how to improve Firefox. It was fun. But more importantly, the process says a lot about the potential for innovation open to organizations that cultivate an engaged user community.

So what happened?

In November 2010, Mozilla launched a visualization competition that asked: How do People Use Firefox? As part of the competition, they shared anonymous data collected from Test Pilot users (people who agreed to share anonymous usage data with Mozilla). Working with my friend (and quant genius) Diederik Van Liere, I analyzed the impact of add-on memory consumption on browser performance to find out which add-ons use the most memory and thus are most likely slowing down the browser (and frustrating users!). (You can read about our submission here.)

But doing the analysis wasn’t enough. We wanted Mozilla engineers to know we thought that users should be shown the results – so they could make more informed choices about which add-ons they download. Our hope was to put pressure on add-on developers to make sure they weren’t ruining Firefox for their users. To do that we visualized the data by making a mock up of their website – with our data inserted.

[Image: FF-memory-visualizations2.001 – our mockup of the add-ons page with memory data inserted]

For our efforts, we won an honourable mention. But winning a prize is far, far less cool than actually changing behaviour or encouraging an actual change. So last week, during a trip to Mozilla’s offices in Mountain View, I was thrilled when one of the engineers pointed out that the add-on site now has a page where they list add-ons that most slow down Firefox’s start up time.

[Image: screenshot of the Slow Performing Add-ons page on the add-on site]

(Sidebar: Anyone else find it ironic that “FastestFox: Browse Faster” is #5?)

This is awesome! Better still, in April, Mozilla launched an add-on performance improvement initiative to help reduce the negative impact add-ons can have on Firefox. I have no idea if our submission to the visualization competition helped kick-start this project; I’m sure there were many smart people at Mozilla already thinking about this. Maybe it was already underway? But I like to believe our ideas helped push their thinking – or, at least, validated some of their ideas. And of course, I hope it continues to. I still believe that the above-cited data shouldn’t be hidden on a webpage well off the beaten path, but should be located right next to every add-on. That’s the best way to create the right feedback loops, and is in line with Mozilla’s manifesto – empowering users.

Some lessons (for Mozilla, companies, non-profits and governments)

First lesson. Innovation comes from everywhere. So why aren’t you tapping into it? Diederik and I are all too happy to dedicate some cycles to thinking about ways to make Firefox better. If you run an organization that has a community of interested people larger than your employee base (I’m looking at you, governments), why aren’t you finding targeted ways to engage them, not in endless brainstorming exercises, but in innovation challenges?

Second, get strategic about using data. A lot of people (including myself) talk about open data. Open data is good. But it can’t hurt to be strategic about it as well. I tried to argue for this in the government and healthcare space with this blog post. Data-driven decisions can be made in lots of places; what you need to ask yourself is: What data are you collecting about your product and processes? What, of that data, could you share, to empower your employees, users, suppliers, customers, whoever, to make better decisions? My sense is that the companies (and governments) of the future are going to be those that react both quickly and intelligently to emerging challenges and opportunities. One key to being competitive will be to have better data to inform decisions. (Again, this is the same reason why, over the next two decades, you can expect my country to start making worse and worse decisions about social policy and the economy – they simply won’t know what is going on).

Third, if you are going to share, get a data portal. In fact, Mozilla needs an open data portal (there is a blog post that is coming). Mozilla has always relied on volunteer contributors to help write Firefox and submit patches to bugs. The same is true for analyzing its products and processes. An open data portal would enable more people to help find ways to keep Firefox competitive. Of course, this is also true for governments and non-profits (to help find efficiencies and new services) and for companies.

Finally, reward good behaviour. If contributors submit something you end up using… let them know! Maybe the idea Diederik and I submitted never informed anything the add-on group was doing; maybe it did. But if it did… why not let us know? We are so pumped about the work they are doing, we’d love to hear more about it. Finding out by accident seems like a lost opportunity to engage interested stakeholders. Moreover, back at the time, Diederik was thinking about his next steps – now he works for the Wikimedia Foundation. But it made me realize how an innovation challenge could be a great way to spot talent.

Why not create an Open311 add-on for Ushahidi?

This is not a complicated post. Just a simple idea: Why not create an Open311 add-on for Ushahidi?

So what do I mean by that, and why should we care?

Many readers will be familiar with Ushahidi, a non-profit that develops open source mapping software enabling users to collect and visualize data on interactive maps. Its history is now fairly famous, as the Wikipedia article about it outlines: “Ushahidi.com (Swahili for ‘testimony’ or ‘witness’) is a website created in the aftermath of Kenya’s disputed 2007 presidential election (see 2007–2008 Kenyan crisis) that collected eyewitness reports of violence sent in by email and text message and placed them on a Google map.” Ushahidi’s mapping software has also proved to be an important resource in a number of crises since the Kenyan election, most notably during the Haitian earthquake. Here is a great 2 minute video on how Ushahidi works.

But mapping of this type isn’t only important during emergencies. Indeed, it is essential to the day-to-day operations of many governments, particularly at the local level. While many citizens in developed economies may be unaware of it, their cities are constantly mapping what is going on around them: broken infrastructure such as leaky pipes, water mains, clogged gutters and potholes, along with social issues such as crime and homelessness, and business and liquor license locations. More importantly, citizens are often the source of this information – their complaints are the data that end up driving these maps. The gathering of this data generally falls under the rubric of what are termed 311 systems, since in many cities you can call 311 either to tell the city about a problem (e.g. a noise complaint, a service request, or broken infrastructure) or to request information about pretty much any of the city’s activities.

This matters because 311 systems have generally been expensive and cumbersome to run. The beautiful thing about Ushahidi is that:

  1. it works: it has a proven track record of enabling citizens in developing countries to share data – using even the simplest of devices – both with one another and with agencies (like humanitarian organizations)
  2. it scales: Haiti and Kenya are pretty big places, and they generated a fair degree of traffic. Ushahidi can handle it.
  3. it is lightweight: Ushahidi’s technical footprint (yes, I’m making that term up right now) is relatively light. The infrastructure required to run it is not overly complicated
  4. it is relatively inexpensive: as a result of (3) it is also relatively cheap to run, being both lightweight and leveraging a lot of open source software
  5. Oh, and did I mention IT WORKS.

This is pretty much the spec you would want to meet if you were setting up a 311 system in a city with very few resources but an interest in starting to gather data about citizen demands and/or in monitoring newly invested-in infrastructure. Of course, to transform Ushahidi into a tool for mapping 311-type issues you’d need some sort of spec to define what that would look like. Fortunately, Open311 already does just that, and it is supported by some of the large 311 system providers – such as Lagan and Motorola – as well as some of the disruptors, such as SeeClickFix. Indeed, there is an Open311 API specification that any developer could use as the basis for the add-on to Ushahidi.
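To give a sense of how thin such an add-on could be, here is a minimal sketch of the two GeoReport v2 calls it would need to make. Ushahidi itself is written in PHP; Python is used here purely to illustrate the API, and the endpoint URL, API key and service code are placeholders for a real jurisdiction.

```python
# Minimal sketch of an Ushahidi-to-Open311 bridge using the GeoReport v2 spec:
# discover the city's service types, then push a citizen report into its 311 queue.
# The endpoint URL and API key are placeholders.
import requests

ENDPOINT = "https://city.example.org/open311/v2"
API_KEY = "YOUR_API_KEY"

def list_services():
    """What kinds of issues (potholes, leaks, noise, ...) does the city accept?"""
    resp = requests.get("%s/services.json" % ENDPOINT, timeout=10)
    resp.raise_for_status()
    return resp.json()

def submit_report(service_code, lat, lon, description):
    """Push a report collected through Ushahidi into the city's 311 system."""
    resp = requests.post(
        "%s/requests.json" % ENDPOINT,
        data={
            "api_key": API_KEY,
            "service_code": service_code,   # e.g. the city's code for "pothole"
            "lat": lat,
            "long": lon,                    # GeoReport v2 uses "long", not "lon"
            "description": description,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()   # includes the new service_request_id
```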

Already I suspect many cities – even those in developing countries – could probably afford SeeClickFix, so there may already be a solution at the right price point in this space. But maybe not; I don’t know. More importantly, an Open311 module for Ushahidi could get local governments – or better still, local tech developers in developing economies – interested in and contributing to the Ushahidi code base, further strengthening the project. And while the code would be globally accessible, innovation and implementation could continue to happen at the local level, helping drive the local economy and boost know-how. The model here, in my mind, is OpenMRS, which has spawned a number of small tech startups across Africa that manage the implementation and servicing of OpenMRS installations at medical clinics in countries across the region.

I think this is a potentially powerful idea for stakeholders in local governments and startups (especially in developing economies) and our friends at Ushahidi. I can see that my friend Philip Ashlock at Open311 had a similar thought a while ago, so the Open311 people are clearly interested. It could be that the right ingredients are already in place to make some magic happen.

How GitHub Saved Open Source

For a long time I’ve been thinking about just how much GitHub has revolutionized open source. Yes, it has made managing the code base significantly easier, but its real impact has likely been on the social aspects of managing open source. GitHub has rebooted the innovation cycle in open source while simultaneously raising the bar for good community management.

The irony may be that it has done this by making it easy to do the one thing many people thought would kill open source: forking. I remember talking to friends who – before GitHub launched – felt that forking, while a necessary check on any project, was also its biggest threat and so needed to be managed carefully.

Today, nothing could feel further from the truth. By collapsing the transaction costs around forking, GitHub hasn’t killed open source. It has saved it.

The false fear of forking

The concern with forking – as it was always explained to me – was that it would splinter a community, potentially to the degree that none of the emerging groups would have the critical mass necessary to carry the project forward. Yes, it was necessary that forking be allowed – but only as a last resort, to manage the worst excesses of bad community leadership. And forking was messy stuff, even emotionally painful and exhausting: while sometimes it was mutually agreed upon and cordial, many feared it would usually be preceded by ugly infighting and nastiness, culminating in the (sometimes) angry rejection and (almost) political act of forming a new community.

Forking = Innovation Accelerated

Maybe forking was an almost political act – when it was hard to do. But once anyone could do it, anytime, anywhere, the dynamics changed. I believe open source projects work best when contributors are able to engage in low-transaction-cost cooperation and high-transaction-cost collaboration is minimized. The genius of open source is that it does not require a group to debate every issue and work on problems collectively – quite the opposite. It works best when architected so that individuals or functioning sub-groups can grab a part of the whole, take it away, play with it, and bring the solution back to be fitted into the larger project.

In this world innovation isn’t driven by getting lots of people to work together simultaneously, compromising, negotiating solutions, and waiting on others to complete their part. Such a process can be slow, and worse, can allow promising ideas to be killed by early criticism or be watered down before they reach their potential. What people often need is a private place where their idea can be nursed, an innovation cycle driven by enabling people to work on the same problem in isolation, and then bring working solutions back to the group to be debated. (Yes, this is a simplification, but I think the general idea stands).

And this is why GitHub was such a godsend. Yes, it made managing the code base easier, but what it really did was empower contributors. It took something everyone thought would kill open source projects – forking – and made it a powerful tool for experimentation and play. Now, rather than just play with a small part of the code base, you could play with the entire thing. My strong suspicion is that this has rebooted the innovation cycle for many open source projects. The ability to have lots of people innovating in the safety of their private repositories has produced more new ideas than ever before.

Forking = Better Community Management

I also suspect that eliminating the transaction costs around forking has improved open source in another, important way. It has made open source project leads more accountable to the communities they manage.

Why?

Before GitHub, the transaction costs around forking were higher. Setting up a new repository, firing up a bug tracking system and creating all the other necessary infrastructure wasn’t impossible, but neither was it simple. As a result, I suspect it usually only made sense if you could motivate a group of contributors to fork with you – there needed to be a deep grievance to justify all that effort. In short, the barriers to forking were high. That meant project leaders had a lot of leeway in how they engaged their community before the “threat” of forking became real. The high transaction cost of forking created a cushion for lazy, bad, or incompetent open source leadership and community management.

But collapse the transaction costs of forking and the cost of a parallel project emerging also drops significantly. This is not to claim that the cost of forking is zero – but I suspect that open source community leaders now have to be much more sensitive to the needs, demands, wishes and contributions of their community. More importantly, I suspect this has been good for open source in general.

Developing Community Management Metrics and Tools for Mozilla

Background – how we got here

Over the past few years I’ve spent a great deal of time thinking about how we can improve both the efficiency of open source communities and the contributor experience. Indeed, this was the focus, in part, of my talk at the Mozilla Summit last summer. For some years Diederik Van Liere – now with the Wikimedia Foundation’s metrics team – and I have played with Bugzilla data a great deal to see if we could extract useful information from it. This led us to engage closely with some members of the Mozilla Thunderbird team – in particular Dan Mosedale, who immediately saw its potential and became a collaborator. Then, in November, we connected with Daniel Einspanjer of Mozilla Metrics and began to imagine ways to share data that could create opportunities to improve the participation experience.

Yesterday, thanks to some amazing work on the part of the Mozilla Metrics team (listed at the bottom of this post), we started sharing some of this work at the Mozilla all-hands. Specifically, Daniel demoed the first of a group of dashboards that describe what is going on in the Mozilla community and that, we hope, can help enable better community management. While these dashboards deal with the Mozilla community in particular, I nonetheless hope they will be of interest to a number of open source communities more generally. (Presently the link is only available to Mozilla staffers until the dashboard goes through security review – see more below, along with screenshots – but you can see a screencast here.)

Why – the contributor experience is a key driver for success of open source projects

My own feeling is that within the Mozilla community the products, like Firefox, evolve quickly, but the processes by which people work together tend to evolve more slowly. This is a problem. If Mozilla cannot evolve and adopt new approaches with sufficient speed, then potential and current contributors may go where the experience is better and, over time, the innovation and release cycle could itself cease to be competitive.

This task is made all the more complicated since Mozilla’s ability to fulfill its mission and compete against larger, better funded competitors depends on its capacity to tap into a large pool of social capital – a corps of paid and unpaid coders whose creativity can foster new features and ideas. Competing at this level requires Mozilla to provide processes and tools that can effectively harness and coordinate that energy at minimal cost to both contributors and the organization.

As I discussed in my Mozilla Summit talk on community management, processes that limit the size or potential of our community limit Mozilla. Conversely, making it easier for people to cooperate, collaborate, experiment and play enhances the community’s capacity. Consequently, open source projects should – in my opinion – constantly be looking to reduce or eliminate transaction costs and barriers to cooperation. A good example of this is how GitHub showed that forking can be a positive social contribution. Yes, it made managing the code base easier, but what it really did was empower people. It took something everyone thought would kill open source projects – forking – and made it a powerful tool of experimentation and play.

How – Using data to enable better contributor experience

Unfortunately, it is often hard to quantitatively assess how effectively an open source community manages itself. Our goal is to change that. The hope is that these dashboards – and the data that underlies them – will provide contributors with an enhanced situational awareness of the community so they can improve not just the code base, but the community and its processes. If we can help instigate a faster pace of innovation and change in the processes of Mozilla, then I think this will both make it easier to improve the contributor experience and increase the pace of innovation and change in the software. That’s the hope.

That said, this first effort is a relatively conservative one. We wanted to create a dashboard that would allow us to identify some broader trends in the Mozilla Community, as well as provide tangible, useful data to Module Owners – particularly around identifying contributors who may be participating less frequently.


This dashboard is primarily designed to serve two purposes. The first is to showcase what dashboards could be, with the hope of inspiring Mozilla community members to use it and, more importantly, to build their own. The second is to provide module owners with a reliable tool with which to more effectively manage their part of the community. So what are some of the ways I hope this dashboard might be helpful? One important feature is the ability to sort contributors by staff or volunteer. An open source community’s volunteer contributors should be a treasured resource. One nice thing about this dashboard is that you can not only see just the volunteers, but also get a quick sense of those who haven’t submitted a patch in a while.

In the picture below I de-selected all Mozilla employees so that we are only looking at volunteer contributors. Using this view we can see which volunteers are starting to participate less – note the red circle marked “everything okay?” A good community manager might send these people an email asking if everything is okay. Maybe they are moving on, or maybe they just had a baby (and so are busy with a totally different type of patch – diapers), but maybe they had a bad experience and are frustrated, or a bunch of their code is stuck in review. These are things we would want to know, and know quickly, as losing these contributors would be bad. In addition, we can see who the emerging power contributors are – they might be people we want to mentor, or connect with mentors, in order to solidify their positive association with our community and speed up their development. In my view, these should be core responsibilities of community managers, and this dashboard makes it much easier to execute on these opportunities.

[Image: main-dashboard-notes – annotated view of the main community dashboard]
Below you can see how zooming in more closely allows you to see trends for individual contributors over time. Again, sometimes large changes or shifts happen for reasons we know of (they were working on features for a big release and it’s shipped), but where we don’t know the reasons, maybe we should pick up the phone or email this person to check that everything is okay.

[Image: user-dashboard-notes – annotated view of an individual contributor’s activity over time]

Again, if this contributor had a negative experience and was drifting away from the community – wouldn’t we want to know before they silently disappeared and moved on? This is in part the goal.

Some of you may also like the fact that you can dive a little deeper by clicking on a user to see what specific patches that user has worked on (see below).

[Image: User-deep-dive1 – patch-level detail for a single contributor]

Again, these are early days. My hope is that other dashboards will provide still more windows into the community and its processes so as to show us where there are bottlenecks and high transaction costs.

Some of the features we’d like to add to this or other dashboards include:

  • a code-review dashboard that would show how long contributors have been waiting for code-review, and how long before their patches get pushed. This could be a powerful way to identify how to streamline processes and make the experience of participating in open source communities better for users.
  • a semantic analysis of bugzilla discussion threads. This could allow us to flag threads that have become unwieldy or where people are behaving inappropriately so that module owners can better moderate or problem solve them
  • a dashboard that, based on your past bugs and some basic attributes (e.g. skillsets) informs newbies and experienced contributors which outstanding bugs could most use their expertise
  • Ultimately I’d like to see at least 3 core dashboards emerge – one for contributors, one for module owners and one for overall projects – as well as user-generated dashboards developed using Mozilla Metrics data.
  • Access to all the data in Bugzilla so that contributors, module owners, researchers and others can build their own dashboards – they know what they need better than we do (a rough sketch of what that could look like follows below)
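As a small taste of that last point, here is the sketch promised above: the kind of metric anyone could compute for themselves against Bugzilla data. It uses the public Bugzilla REST search API; the parameter names should be checked against the live documentation, and a real dashboard would obviously page through more than one batch of results.

```python
# Rough sketch of a self-serve community metric: how many bugs a given
# contributor has filed per month, as a crude signal of changing activity.
# Parameter and field names follow the Bugzilla REST API but should be verified.
from collections import Counter

import requests

def bugs_filed_per_month(email, limit=500):
    resp = requests.get(
        "https://bugzilla.mozilla.org/rest/bug",
        params={
            "creator": email,
            "include_fields": "id,creation_time",
            "limit": limit,
        },
        timeout=30,
    )
    resp.raise_for_status()
    bugs = resp.json()["bugs"]
    # creation_time looks like "2011-05-27T18:02:00Z"; keep the "YYYY-MM" part.
    return dict(sorted(Counter(bug["creation_time"][:7] for bug in bugs).items()))

# A sudden month-over-month drop might be the cue for a community manager to
# send that "everything okay?" email.
print(bugs_filed_per_month("volunteer@example.org"))
```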

What’s Next – How Do I Get Access to It and How Can I Contribute?

Great questions.

At the moment the dashboard is going through security review which it must complete before being accessible. Our hope is that this will be complete by the end of Q2 (June).

More importantly, we’d love to hear from contributors, developers and other interested users. We have a standing call every other Friday at 9am PST where we discuss development issues with this and the forthcoming code-review dashboard, contributor needs and wanted features, as well as use cases. If you are interested in participating in these calls please either let me know, or join the Mozilla Community Metrics Google group.

Again, a huge shout-out is deserved by Daniel Einspanjer and the Mozilla Metrics team. Here is a list of contributors, both so people know who they are and in case anyone has questions about specific aspects of the dashboard:
Pedro Alves – Team Lead
Paula Clemente – Dashboard implementor
Nuno Moreira – UX designer
Maria Roldan – Data Extraction
Nelson Sousa – Dashboard implementor

Saving Healthcare Billions: Let's fork the VA's Electronic Health Records System

Alternative title for this post: How our Government’s fear of Open Source Software is costing us Billions.

So, I’ve been meaning to blog this for several months now.

Back in November I remember coming across this great, but very short, interview in the Globe and Mail with Ken Kizer. Who, you might ask, is Ken Kizer? He’s a former naval officer and emergency medicine physician who became the US Department of Veterans Affairs’ undersecretary for health in 1994.

While the list of changes he made is startling and impressive, what particularly caught my attention is that he accomplished what the Government of Ontario failed to do with $1 billion in spending: implement an electronic medical record system that works. And, let’s be clear, it not only works, it is saving lives and controlling costs.

And while the VA has spent millions in time and energy developing that code, what is amazing is that it has all been open sourced, so the cost of leveraging it is relatively low. Indeed, today Ken Kizer heads up a company that implements the VA’s now open source solution – called VistA – in hospitals in the US. Consider this extract from his interview:

You have headed a company that promoted “open-source” software for EHR, instead of a pricier proprietary system. Why do you think open source is better?

I believe the solution to health-care information technology lies in the open-source world that basically gives away the code. That is then adapted to local circumstances. With the proprietary model, you are always going back to the vendor for changes, and they decide whether to do them and how much they will cost. In Europe, open source EHR software is zooming. It’s the most widely deployed EHR system in the world, but not here.

Sometimes I wonder: do any Canadian governments ever look at simply forking VistA and creating a Canadian version?

I wonder all the more after reading a Fortune Magazine article on the changes achieved in the VA during this period. The story is impressive, and VistA played a key role. Indeed, during Kizer’s tenure:

  • The VA saw the number of patients it treats almost double, from 2.9 million in 1996 to 5.4 million in 2006.
  • Customer satisfaction ratings within the VA system exceeded those of private health care providers during many of those years.
  • All this was achieved while the cost per patient held steady at roughly $5,000. In contrast, the rest of the US medical system saw costs rise 60 percent, to $6,300.
  • And perhaps most importantly, in a time of crisis the new system proved critical: while Hurricane Katrina destroyed untold numbers of civilians’ (paper) healthcare records, VistA ensured that the health records of veterans in the affected areas could be called up in a heartbeat.

This is a story that any Canadian province would be proud to tell its citizens. It would be fascinating to see some of the smaller provinces begin to jointly fund some e-health open source software initiatives, particularly one to create an electronic healthcare record system. Rather than relying on a single vendor with its coterie of expensive consultants, a variety of vendors, all serving the same platform could emerge, helping keep costs down.

It’s the kind of solution that seems custom built for Canada’s healthcare system. Funny how it took a US government agency to show us how to make it a reality.

Honourable Mention! The Mozilla Visualization Challenge Update

Really pleased to share that Diederik and I earned an honourable mention for our submission to the Mozilla Open Data Competition.

For those who missed it – and who find open data, open source and visualization interesting – you can read a description of, and see images from, our submission to the competition in this blog post I wrote a month ago.

Visualizing Firefox Plugins Memory Consumption

A few months ago the Mozilla Labs and the Metrics Team, together with the growing Mozilla Research initiative, launched an Open Data Visualization Competition.

Using data collected from Test Pilot users (people who agreed to share anonymous usage data with Mozilla and test pilot new features) Mozilla asked its community to think of creative visual answers to the question: “How do people use Firefox?”

As an open data geek and Mozilla supporter the temptation to try to do something was too great. So I teamed up with my old data partner Diederik Van Liere and we set out to create a visualization. Our goals were simple:

  • have fun
  • focus on something interesting
  • create something that would be useful to Firefox developers and/or users
  • advance the cause for creating a Firefox open data portal

What follows is the result.

It turns out that – in our minds – the most interesting data set revolved around plugin memory consumption. Sure, this sounds boring… but plugins (like Adobe Reader, QuickTime or Flash) are an important part of the browser experience – with them we engage with a larger, richer and more diverse set of content. Plugins, however, also impact memory consumption and, consequently, browser performance. Indeed, some plugins can really slow down Firefox (or any browser). If consumers had a better idea of how much performance would be impacted, they might be more selective about which plugins they download, and developers might be more aggressive in trying to make their plugins more efficient.

Presently, if you run Firefox you can go to the Plugin Check page to see if your plugins are up to date. We thought: wouldn’t it be great if that page ALSO showed you memory consumption rates? Maybe something like this (note the Memory Consumption column – it doesn’t exist on the real webpage, and you can see a larger version of this image here):

[Image: Firefox data visualization v2 – mockup of the Plugin Check page with a Memory Consumption column added]

Please understand (and we are quite proud of this): all of the data in this mockup is real. The memory consumption figures are estimates we derived by analyzing the Test Pilot data.

How, you might ask did we (Diederik) do that?

GEEK OUT EXPLANATION: Well, we (Diederik) built a dataset of about 25,000 different Test Pilot users and parsed the data to see which plugins were installed and how much memory was consumed around the time of initialization. This data was analyzed using ordinary least squares regression, where the dependent variable is memory consumption and the different plugins are the explanatory variables. We only included results that are highly significant.
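For those who want the gist of that in code, here is a simplified sketch of the regression. The file and column names are invented for illustration: one row per Test Pilot user, a 0/1 indicator column per plugin, and observed memory consumption as the dependent variable.

```python
# Simplified sketch of the analysis described above: OLS with memory consumption
# as the dependent variable and per-plugin 0/1 indicators as the explanatory
# variables. The file and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("testpilot_plugin_sample.csv")

y = df["memory_mb"]                              # memory consumed near initialization
X = sm.add_constant(df.filter(like="plugin_"))   # one 0/1 column per plugin

model = sm.OLS(y, X).fit()

# Keep only the highly significant coefficients, as in our submission; each one
# is an estimate of the extra memory (in MB) associated with that plugin.
significant = model.params[model.pvalues < 0.01].drop("const", errors="ignore")
print(significant.sort_values(ascending=False))
```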

The following table shows our total results (you can download a bigger version here).

[Image: Plugin_memory_consumption_chart v2 – table of estimated memory consumption per plugin]

Clearly, not all plugins are created equal.

Our point here isn’t that we have created the definitive way of assessing plugin impact on the browser; our point is that creating a solid methodology for doing so is likely within Mozilla’s grasp. More importantly, doing this could help improve the browsing experience. Indeed, it would probably be even wiser to do something like this for add-ons, which is where I’m guessing the real lag in the browsing experience is created.

Also, with such a small data set we were only able to calculate the memory usage for a limited number of plugins, and generally those that are more obscure. Our methodology required having several data points from people who are and who aren’t using a given plugin, and for many popular plugins we didn’t have enough data from people who weren’t using them… a problem, however, that would likely be easily solved with access to more data.

Finally, I hope this contest and our submission help make the case for why Mozilla needs an open data portal. Mozilla collects an incredible amount of data that it does not have the resources to analyze internally. Making it available to the community would do for data what Mozilla has done for code – enable others to create value that could affect the product and help advance the open web. I had a great meeting earlier this week with a number of Mozilla people about this issue; I hope that we can continue to make progress.