Tag Archives: data

Ada Lovelace Day – On Dr. Connie Eaves

For those who don’t know: Today – October 7th – is Ada Lovelace Day. It’s a day where you “share your story about a woman — whether an engineer, a scientist, a technologist or mathematician — who has inspired you to become who you are today.”

It would be remiss of me not to blog about Dr. Connie Eaves. For anyone who thinks I travel a lot, work long hours, or have a passion for evidence and data, I am really just a pale shadow compared to this inspiring and globally recognized cancer researcher. For those not familiar with her – probably anyone outside the field of cancer research who is not an avid reader of the journal Blood – you can catch her bio on Wikipedia here.

She is, of course, also my mom.

Obviously, if you are a woman (or a man) interested in getting into science – particularly human biology and stem cell research – I would point you to my mother (and father) as people to get to know, but for me her inspiration is much simpler. At a basic level, there are two invaluable gifts my mother has given me, which I feel are particularly salient to her scientific achievements.

The first, and most important, was the building blocks of critical thinking: to break down an argument and understand it from every angle, as well as dissect the evidence embedded within it. These lessons were hard ones. I learned a lot of it just through observation, and sometimes – more painfully – from trying to engage her in debate. I’ve seen graduate students tremble in fear at the prospect of debating my mother. While my victories have been few, I’ve been doing it since probably the age of five or earlier, and it has helped shape my brain in powerful ways that, I suspect, many master’s or doctoral students would happily travel around the world to be exposed to. I am exceedingly lucky.

The second gift my mom bestowed on me is her work ethic and drive. I have grown up believing that working 12 hours a day, 7 days a week may actually be normal behaviour. There is good and bad in taking on such norms. Neither one of us probably thinks it is healthy when we skip eating all day because we zone out into our work. But that intensity has its upsides, and I’m grateful to have been exposed to it. Indeed, I’d like to think I work hard, but standing next to her, I still often just feel lazy.

I mention these two traits not just because they have had such a great impact on me, but also because I think they’re a reflection of what extraordinary skills were required of my mother to be a successful woman scientist embarking on a career in the 1960s. The simple fact is that in that era, as much as we’d like to think it was not true, I suspect that to be a woman scientist – to get on tenure track – you had to be smarter and work harder than almost anyone around you. It is one reason why I think the women scientists of that generation are generally so remarkable. The sad truth is: they had to be.

The happy upside, purely selfishly, is that I got the benefit of being raised by someone who survived and thrived in what I imagine was at times a hostile environment for women. Paradoxically, the benefits I enjoyed are those I would wish on any child in a heartbeat, while the asymmetric expectations are those I would wish on no one.

Happy Ada Lovelace Day, mom.

How Dirty is Your Data? Greenpeace Wants the Cloud to be Greener

My friends over at Greenpeace recently published an interesting report entitled “How Dirty Is Your Data? A Look at the Energy Choices That Power Cloud Computing.”

For those who think that cloud computing is an environmentally friendly business, let’s just say… it’s not without its problems.

What’s most interesting is the huge opportunity the cloud presents for changing the energy sector – especially in developing economies. Consider the following factoids from the report:

  • Data centres to house the explosion of virtual information currently consume 1.5-2% of all global electricity; this is growing at a rate of 12% a year.
  • The IT industry points to cloud computing as the new, green model for our IT infrastructure needs, but few companies provide data that would allow us to objectively evaluate these claims.
  • The technologies of the 21st century are still largely powered by the dirty coal power of the past, with over half of the companies rated herein relying on coal for between 50% and 80% of their energy needs.

The 12% growth rate is astounding. It essentially makes it the fastest growing segment in the energy business – so the choices these companies make around how they power their server farms will dictate what the energy industry invests in. If they are content with coal – we’ll burn more coal. If they demand renewables, we’ll end up investing in renewables and that’s what will end up powering not just server farms, but lots of things. It’s a powerful position big data and the cloud hold in the energy marketplace.
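To put that 12% figure in perspective, here is a quick back-of-the-envelope calculation (my own sketch, not from the report) showing how fast compound growth at that rate doubles demand:

```python
import math

# Annual growth rate of data-centre electricity demand cited in the report.
growth_rate = 0.12

# Doubling time under compound growth: solve (1 + r)^t = 2 for t.
doubling_time_years = math.log(2) / math.log(1 + growth_rate)

print(f"At {growth_rate:.0%} annual growth, demand doubles every "
      f"{doubling_time_years:.1f} years")
# → At 12% annual growth, demand doubles every 6.1 years
```

Roughly six years to double – which is exactly why the energy choices these companies make now will echo through the grid for decades.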

And of course, the report notes that many companies say many of the right things:

“Our main goal at Facebook is to help make the world more open and transparent. We believe that if we want to lead the world in this direction, then we must set an example by running our service in this way.”

– Mark Zuckerberg

But then Facebook is patently not transparent about where its energy comes from, so it is not easy to assess how good or bad they are, or how they are trending.

Indeed it is worth looking at Greenpeace’s Clean Cloud report card to see – just how dirty is your data?


I’d love to see a session at the upcoming (or next year’s) Strata Big Data Conference on, say, “How to Use Big Data to Make Big Data More Green.” Maybe even a competition to that effect, if there were some data that could be shared? Or maybe just a session where Greenpeace could present their research and engage the community.

Just a thought. Big data has got some big responsibilities on its shoulders when it comes to the environment. It would be great to see them engage on it.

Honourable Mention! The Mozilla Visualization Challenge Update

Really pleased to share that Diederik and I earned an honourable mention for our submission to the Mozilla Open Data Competition.

For those who missed it – and who find open data, open source and visualization interesting – you can read a description of and see images from our submission to the competition in this blog post I wrote a month ago.

Canada ranks last in freedom of information

For those who missed it over the weekend, it turns out Canada ranks last in a freedom of information study that looked at the world’s western parliamentary democracies. What makes it all the more astounding is that a decade ago Canada was considered a leader.

Consider two quotes from Information Commissioner Suzanne Legault, pulled from the piece:

Only about 16 per cent of the 35,000 requests filed last year resulted in the full disclosure of information, compared with 40 per cent a decade ago, she noted.

And delays in the release of records continue to grow, with just 56 per cent of requests completed in the legislated 30-day period last year, compared with almost 70 per cent at the start of the decade.

These are appalling numbers.

The sad thing is… don’t expect things to get better. Why?

Firstly, the current government seems completely uninterested in access to information, transparency and proactive disclosure, despite these being core planks of its election platform and core values of the reform movement that re-launched Canadian conservatism. Indeed, reforming and improving access to information is the only unfulfilled original campaign promise of the Conservatives – and there appears to be no interest in touching it. Quite the opposite: it is now a well-known and documented fact that political staff intervene to block and restrict Access to Information requests, contravening both the legislation and policy.

Second, this issue is of secondary importance to the public. While everyone will say they care about access to information and open government, the number of people who act on it (while growing) remains small. These types of reports and issues are of secondary importance. This isn’t to say they don’t matter. They do – but generally after something bigger and nastier has come to light and the public begins to smell rot. Then studies like this become the type of thing that hurts a government – they give legitimacy and language to a sentiment people widely feel.

Third, the public seems confused about who they distrust more. However bad the current government is on this issue, the Liberal brand is still badly tarnished on transparent government due to the scandals of almost a decade ago. Sadly, this means there will be less pressure on this government to act, since every time the issue of transparency and open government arises, rather than act, government leaders simply point out the other parties’ failings.

So the world moves on while Canada remains stuck, its government becoming more opaque, distant and less accountable to the people who elect it.

Interestingly, this also has a real cost to Canada’s influence in the world. It means something when the world turns to you as an expert – as we once were on access to information. Ministers are consulted by other world leaders, your public servants are given access to information loops they might otherwise not know about, and there is a general respect, a soft power, that comes from being an acknowledged leader. Today, this is gone.

Indeed, it is worth noting that of the countries surveyed in the above-mentioned study, only Canada and Ireland do not have open data portals which allow for proactive disclosure.

It’s a sign of the times.

Launching Emitter.ca: Open Data, Pollution and Your Community

This week, I’m pleased to announce the beta launch of Emitter.ca – a website for locating, exploring and assessing pollution in your community.

Why Emitter?

A few weeks ago, Nik Garkusha, Microsoft’s Open Source Strategy Lead and an open data advocate, asked me: “Are there any cool apps you could imagine developing using Canadian federal government open data?”

Having looked over the slim pickings of open federal data sets – most of which I saw while linking to them on datadotgc.ca – I remembered one that had real potential: Environment Canada’s National Pollutant Release Inventory (NPRI).


With NPRI data, I felt we could build an application that allowed people and communities to more clearly see who is polluting in their communities, and how much – something that could be quite powerful. The 220 chemicals that NPRI tracks aren’t, on their own, helpful or useful to most Canadians.

We agreed to do something and set for ourselves three goals:

  1. Create a powerful demonstration of how Canadian Federal open data can be used
  2. Develop an application that makes data accessible and engaging to everyday Canadians and provides communities with a tool to better understand their immediate region or city
  3. Be open

With the help of a crew of volunteers we knew and who joined us along the way – Matthew Dance (Edmonton), Aaron McGowan (London, ON), Barranger Ridler (Toronto) and Mark Arteaga (Oakville) – Emitter began to come together.

Why a Beta?

For a few reasons.

  1. There are still bugs, and we’d love to hear about them. Let us know.
  2. We’d like to refine our methodology. It would be great to have a methodology that was more sensitive to chemical types, combinations and other factors. Indeed, I know Matt would love to work with ENGOs or academics who might be able to help provide us with better score cards that can help Canadians understand what the pollution near them means.
  3. More features – I’d love to be able to include more datasets, like data on tumour rates, asthma rates or even employment rates.
  4. I’d LOVE to do mobile – to be able to show pollution data in a mobile app and even using augmented reality.
  5. Trends – once we get 2009 and/or earlier data, we could begin to show trends in pollution rates by facility.
  6. Plus much, much more…

Build on our work

Finally, we have made everything we’ve done open, our methodology is transparent, and anyone can access the data we used through an API that we share. Also, you can learn more about Emitter and how it came to be by reading blog posts by the various developers involved.

Thank yous

Obviously the amazing group of people who made Emitter possible deserve an enormous thank you. I’d also like to thank the Open Lab at Microsoft Canada for contributing the resources that made this possible. We should also thank those who allowed us to build on their work, including Cory Horner, whose Howdtheyvote.ca API for electoral district boundaries we were able to use (why Elections Canada doesn’t offer this is beyond me and, frankly, is an embarrassment). Finally, it is important to acknowledge and thank the good people at Environment Canada who not only collected this data, but have the foresight and wisdom to make it open. I hope we’ll see more of this.

In Sum

Will Emitter change the world? It’s hard to imagine. But hopefully it is a powerful example of what can happen when governments make their data open: people will take that data and make it accessible in new and engaging ways.

I hope you’ll give it a spin and I look forward to sharing new features as they come out.

Update!

Since yesterday, Emitter.ca has picked up some media coverage. Here are some of the links so far…

Hanneke Brooymans of the Edmonton Journal wrote this piece, which was in turn picked up by the Ottawa Citizen, Calgary Herald, Canada.com, Leader Post, The Province, Times Colonist and Windsor Star.

Nestor Arellano of ITBusiness.ca wrote this piece.

Burke Campbell, a freelance writer, wrote this piece on his site.

Kate Dubinski of the London Free Press wrote a piece titled “It’s Easy to Dig up Dirt Online” about Emitter.ca.

Lots of great reading

So with summer having now sped by I haven’t done a reading update in quite some time… here’s a quickie:

1. The Ascent of Money by Niall Ferguson

For a subject that sounds like it should completely bore you – the history of finance – this book is brilliant and, frankly, fun. It’s also timely. The financial system is so old that we often forget that it actually emerged out of something. Money, bonds, stocks, all that good stuff – it hasn’t always been around. We of course know this, but it is great to actually be walked through how it all emerged, especially when it’s so wonderfully told. It’s also nice to take a look at an old system, like finance, which we are now as comfortable with as the air we breathe (even when, at times, it turns toxic and crashes our economy), as so much of my time is spent looking at relatively newer systems – digital networks and the internet. Lots of lessons could be drawn, especially around trust networks (something here for Shirky while he’s at Berkman?).

One additional point. I initially started watching the PBS series of the same name, which is based on the book and also hosted by Niall Ferguson, but was not really riveted by it. It was somewhat slow moving and lacked the historical depth and arc the book has… so if you saw the TV documentaries and were turned off, not to worry: the book is definitely worth picking up.

2. How Government HR Processes are Broken

Check out this fantastic post by an anonymous public servant in Gatineau. It’s a deadly piece about how broken hiring practices are in government, and how it’s unsurprising some would be driven away. It would make you laugh if it weren’t making you cry. I got this from a public servant; then, after tweeting it, a bunch more noted to me how painfully true it felt to them. Sigh.

3. David McCandless: The Beauty of Data Visualization

It’s hard(er) to do visualizations without open data. Here are some beautiful ones from England.

We are going to make a better world, nudging and learning

4. An interview with David Mahfouda and Alex Pasternack, creators of a new app for booking/sharing rides in New York

A fantastic interview about how we can share resources to get around more quickly, cheaply and efficiently by using technology. A riff on Robin Chase’s Zipcar idea, but it’s not about sharing cars – it’s about sharing rides. This is the future of urban transportation. That is, of course, if Ontario’s bus companies don’t try to outlaw it.

The Challenge of Open Data and Metrics

One promise of open data is its ability to inform citizens and consumers about the quality of local services. At the Gov 2.0 Summit yesterday, the US Department of Health and Human Services announced it was releasing data on hospitals, nursing homes and clinics in the hope that developers will create applications that show citizens and consumers how their local hospital stacks up against others. In short, how good, or even how safe, is their local hospital?

In Canada we already have some experience with this type of measuring. The Fraser Institute publishes an annual report card of school performance in Alberta, BC, Ontario and Washington. (For those unfamiliar with the Fraser Institute, it is a right-wing think tank based in Vancouver with, shall we say, dubious research credentials but strong ideological and fundraising goals.)

Perhaps unsurprisingly, private schools do rather well in the Fraser Institute’s report card. Indeed, it would appear (and I may be off by one here) that the top 18 schools on the list are all private. This does support a narrative – consistent with the Fraser Institute’s outlook – that private schools are inherently better than state-run schools. But, of course, that would be a difficult conclusion to sustain. Private schools tend to be populated with kids from wealthy families with better-educated parents, kids who have been given a blessed head start in life. Also, and not noted in the report card, many private schools are comfortable turfing out under-performing or unruly students. This means that the “delayed advancement rate,” one critical metric of a school’s performance, is dramatically less impacted than at a public school that cannot as easily send students packing.

Indeed, the Fraser Institute’s report card is rife with problems, something that teachers’ unions and, say, equally ideological but left-oriented think tanks like the Centre for Policy Alternatives are all too happy to point out.

While I loathe the Fraser Institute’s simplistic report card and think it is of dubious value to parents, I do like that they are at least trying to give parents some tool by which to measure schools. The notion that schools, teachers and education quality can’t be measured, or are too complicated to measure, is untenable. I suspect few parents – especially those in, say, jobs where they are evaluated – believe it. Nor does such a position help parents assess the quality of education their child is receiving. While they may understand, be sympathetic to or even agree that this is a complicated issue, it seems clear, based on the success of Ontario’s school locator, that many parents want and like these tools.

Ultimately the problem here isn’t the open data (despite what critics of the Ontario Government’s school comparison website would have you believe). Besides, are we now going to hide or suppress data so that parents can’t assess their kids’ schools? Nor is the problem school report cards per se. If anything, the problem is that the Fraser Institute has had the field all to itself to play in. If teachers’ groups, other think tanks, or any other group believes that the Fraser Institute’s report cards are too crude, why not design a better one? The data is available (and the government could easily be pressured to make more of it available). Why don’t teachers’ groups share with parents the metrics by which they believe parents should evaluate and compare schools? What this issue could use is some healthy competition and debate – one that generates more options and tools for parents.
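A competing report card need not be complicated. Here is a minimal sketch of what one might look like – the metric names, values and weights below are purely illustrative assumptions of mine, not the Fraser Institute’s methodology or any official scheme:

```python
# A hypothetical, minimal sketch of an alternative school score card.
# Metric names and weights are illustrative assumptions, not any
# real organization's methodology.

def composite_score(metrics, weights):
    """Weighted average of normalized metrics (each scored 0-10)."""
    total_weight = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight

# Illustrative metrics for one school, each already normalized to 0-10.
school = {
    "exam_results": 7.5,
    "delayed_advancement": 6.0,    # adjusted for student-retention policies
    "improvement_over_time": 8.0,  # rewards schools that are getting better
}

# A different group could publish different weights and spark exactly
# the debate this post calls for.
weights = {
    "exam_results": 0.4,
    "delayed_advancement": 0.3,
    "improvement_over_time": 0.3,
}

print(round(composite_score(school, weights), 2))
# → 7.2
```

The point isn’t this particular formula – it’s that once the weights and metrics are published openly, anyone can argue with them, adjust them, and offer parents an alternative view.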

The challenge for government is to make data more easily available. By making educational data more accessible, less time, fewer IT skills and less energy are needed to organize the data, and precious resources can instead be focused on developing and visualizing the scoring methodology. This certainly seems to be Health and Human Services’ approach: lower transaction costs, galvanize a variety of assessment applications and foster a healthy debate. It would be nice if ministries of education in Canada took a similar view.

But the second half of that challenge is also important: groups outside of government need to recognize that they can have a role, and that there are consequences to not participating. The mistake is to ask how to deal with groups like the Fraser Institute that use crude metrics; instead we need to encourage more groups – and our own organizations – to contribute to the debate, to give it more nuance, and to create better tools. Leaving the field to the Fraser Institute is a dangerous strategy, one that will serve few people. This is even more the case since in the future we are likely to have more, not less, data about education, health and a myriad of other services and programs.

So, the challenge for readers is – will your organization participate?