Tag Archives: open311

The 311 Open Data Competition is now Live on Kaggle

As I shared the other week, I’ve been working on a data competition with Kaggle and SeeClickFix involving 311 data from four cities: Chicago, New Haven, Oakland and Richmond.

So first things first – the competition is now live. Indeed, 19 teams have already made 56 submissions. Fortunately, time is on your side: there are 56 days to go.

As I mentioned in my previous post on the subject, I have real hopes that this competition can help test a hypothesis I have about the possibility of an algorithmic open commons:

There is, however, for me, a potentially bigger goal. To date, as far as I know, predictive algorithms of 311 data have only ever been attempted within a city, not across cities. At a minimum it has not been attempted in a way in which the results are public and become a public asset.

So while the specific problem this contest addresses is relatively humble, I see it as creating a larger opportunity for academics, researchers, data scientists, and curious participants to figure out if we can develop predictive algorithms that work for multiple cities. Because if we can, then these algorithms could be a shared common asset. Each algorithm would become a tool not just for one housing non-profit or city program, but for all sufficiently similar non-profits or city programs.

Of course I’m also discovering there are other benefits that arise out of these competitions.

This past weekend there was a mini sub-competition/hackathon involving a subset of the data. It was amazing to watch from afar. First, I was floored by how much cooperation there was, even between competitors and especially after the competition closed. Take a look at the forums: they probably make one of the more compelling cases that open data can encourage more people to learn how to manipulate and engage with data. Here are contestants sharing their approaches and ideas with one another – just like you’d want them to. I’d known that Kaggle had an interesting community and that learning played an important role in it, but “riding along” in a mini competition has caused me to look again at the competitions through a purely educational lens. It is amazing how much people wanted to both learn and share.

As in the current competition, the team at the hackathon also ran a competition around visualizing the data. Some great visualizations came out of it, along with another example of people trying to learn and share. Here are two of my favourites:

I love this visualization by Christoph Molnar because it reveals the difference in request locations in each city. In some they are really dense, whereas in others they are much more evenly distributed. Super interesting to me.

I also love the simplicity of this image created by miswift. There are other things I might have done, like colour-coding similar problems to make them easier to compare across cities. But I still love it.

Congratulations to all the winners from this weekend’s event, and I hope readers will consider participating in the current competition.

Announcing the 311 Data Challenge, soon to be launched on Kaggle

The Kaggle – SeeClickFix – Eaves.ca 311 Data Challenge. Coming Soon.

I’m pleased to share that, in conjunction with SeeClickFix and Kaggle, I’ll be sponsoring a predictive data competition using 311 data from four different cities. My hope is that – if we can demonstrate that there are some predictive and socially valuable insights to be gained from this data – we might be able to persuade cities to work together to share data insights and help everyone become more efficient, address social inequities and tackle other city problems that 311 data might enable us to explore.

Here’s the backstory and some details in anticipation of the formal launch:

The Story

Several months back Anthony Goldbloom, the founder and CEO of Kaggle – a predictive data competition firm – approached me asking if I could think of something interesting that could be done in the municipal space around open data. Anthony generously offered to waive all of Kaggle’s normal fees if I could come up with a compelling contest.

After playing around with some ideas I reached out to Ben Berkowitz, co-founder of SeeClickFix (one of the world’s largest implementers of the Open311 standard) and asked him if we could persuade some of the cities they work for to share their data for a competition.

Thanks to the hard work of Will Cukierski at Kaggle as well as the team at SeeClickFix we were ultimately able to generate a consistent data set with 300,000 lines of data involving 311 issues spanning 4 cities across the United States.

In addition, while we hoped many of those who might choose to participate in a municipal open data challenge would do so out of curiosity or a desire to better understand how cities work, SeeClickFix and I agreed to collectively put up $5000 in prize money to help raise awareness about the competition and hopefully stoke some media (as well as broader participant) interest.

The Goal

The goal of the competition will be to predict the number of votes, comments and views an issue is likely to generate. To be clear, this is not a prediction that is going to radically alter how cities work, but it could be genuinely useful to communications departments, helping them predict problems that are particularly thorny or worth proactively communicating to residents about. In addition – and this remains unclear – my own hope is that it could help us understand discrepancies in how different socio-economic or other groups use online 311 and so enable city officials to respond more effectively to complaints from marginalized communities.

In addition there will be a smaller competition around visualizing the data.

The Bigger Goal

There is, however, for me, a potentially bigger goal. To date, as far as I know, predictive algorithms of 311 data have only ever been attempted within a city, not across cities. At a minimum it has not been attempted in a way in which the results are public and become a public asset.

So while the specific problem this contest addresses is relatively humble, I see it as creating a larger opportunity for academics, researchers, data scientists, and curious participants to figure out if we can develop predictive algorithms that work for multiple cities. Because if we can, then these algorithms could be a shared common asset. Each algorithm would become a tool not just for one housing non-profit or city program, but for all sufficiently similar non-profits or city programs. This could be exceptionally promising – as well as potentially reveal new behavioral or incentive risks that would need to be thought about.

Of course, discovering that every city is unique and that work is not easily transferable, or that predictive models cluster by city size, or by weather, or by some other variable is also valuable, as this would help us understand what types of investments can be made in civic analytics and what the limits of a potential commons might be.
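One rough way to probe the transferability question is to hold out one city at a time, fit a model on the remaining cities, and see how well it predicts the held-out one. The sketch below is only an illustration under stated assumptions: the records are made-up toy values rather than competition data, the "model" is a constant baseline, and the error measure (root mean squared logarithmic error) is my choice for count-like targets, not something specified above.

```python
import math
from collections import defaultdict

# Toy, made-up records for illustration only; NOT drawn from the competition data.
ISSUES = [
    {"city": "Chicago", "votes": 1},
    {"city": "Chicago", "votes": 3},
    {"city": "New Haven", "votes": 2},
    {"city": "New Haven", "votes": 8},
    {"city": "Oakland", "votes": 1},
    {"city": "Oakland", "votes": 5},
    {"city": "Richmond", "votes": 1},
    {"city": "Richmond", "votes": 2},
]


def rmsle(predicted, actual):
    """Root mean squared logarithmic error between two equal-length lists."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mean_log_baseline(train_values):
    """Constant prediction: the mean of the training values taken in log space."""
    mean_log = sum(math.log1p(v) for v in train_values) / len(train_values)
    return math.expm1(mean_log)


def leave_one_city_out(issues, target="votes"):
    """Fit the baseline on every city but one, then score it on the held-out city."""
    by_city = defaultdict(list)
    for issue in issues:
        by_city[issue["city"]].append(issue[target])

    scores = {}
    for held_out, actual in by_city.items():
        train = [v for city, values in by_city.items() if city != held_out for v in values]
        guess = mean_log_baseline(train)
        scores[held_out] = rmsle([guess] * len(actual), actual)
    return scores


if __name__ == "__main__":
    for city, score in leave_one_city_out(ISSUES).items():
        print(f"{city}: RMSLE when held out = {score:.3f}")
```

If a real model scored about as well on a held-out city as on the cities it was trained on, that would be a hint that it transfers; a large gap would point toward the "every city is unique" outcome.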

So be sure to keep an eye on the Kaggle page (I’ll link to it) as this contest will be launching soon.

Why not create an Open311 add-on for Ushahidi?

This is not a complicated post. Just a simple idea: Why not create an Open311 add-on for Ushahidi?

So what do I mean by that, and why should we care?

Many readers will be familiar with Ushahidi, a non-profit that develops open source mapping software that enables users to collect and visualize data in interactive maps. Its history is now fairly famous, as the Wikipedia article about it outlines: “Ushahidi.com (Swahili for ‘testimony’ or ‘witness’) is a website created in the aftermath of Kenya’s disputed 2007 presidential election (see 2007–2008 Kenyan crisis) that collected eyewitness reports of violence sent in by email and text-message and placed them on a Google map.” Ushahidi’s mapping software also proved to be an important resource in a number of crises since the Kenyan election, most notably during the Haitian earthquake. Here is a great 2 minute video on how Ushahidi works.

But mapping of this type isn’t only important during emergencies. Indeed it is essential for the day to day operations of many governments, particularly at the local level. While many citizens in developed economies may be unaware of it, their cities are constantly mapping what is going on around them. Maps of broken infrastructure such as leaky pipes, water mains, clogged gutters and potholes, along with social issues such as crime, homelessness, and business and liquor license locations, are constantly being updated. More importantly, citizens are often the source of this information – their complaints are the data that end up driving these maps. The gathering of this data generally falls under the rubric of what are termed 311 systems – since in many cities you can call 311 either to tell the city about a problem (e.g. a noise complaint, a service request or broken infrastructure) or to request information about pretty much any of the city’s activities.

This matters because 311 systems have generally been expensive and cumbersome to run. The beautiful thing about Ushahidi is that:

1. it works: it has a proven track record of enabling citizens in developing countries to share data, using even the simplest of devices, both with one another and with agencies (like humanitarian organizations)
2. it scales: Haiti and Kenya are pretty big places, and they generated a fair degree of traffic. Ushahidi can handle it.
3. it is lightweight: Ushahidi’s technical footprint (yep, making that up right now) is relatively light. The infrastructure required to run it is not overly complicated
4. it is relatively inexpensive: as a result of (3) it is also relatively cheap to run, being both lightweight and leveraging a lot of open source software
5. Oh, and did I mention IT WORKS.

This is pretty much the spec you would want to meet if you were setting up a 311 system in a city with very few resources but an interest in starting to gather data about citizen demands and/or monitor infrastructure it has newly invested in. Of course, to transform Ushahidi into a tool for mapping 311-type issues you’d need some sort of spec that defines what that would look like. Fortunately Open311 already does just that and is supported by some of the large 311 system providers – such as Lagan and Motorola – as well as some of the disruptive ones – such as SeeClickFix. Indeed there is an Open311 API specification that any developer could use as the basis for the add-on to Ushahidi.
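To give a flavour of what the core of such an add-on might do, here is a minimal sketch that turns a citizen report into the form parameters for an Open311 service request. Treat all of it as assumption: the Report class is a hypothetical, simplified stand-in and not Ushahidi’s actual data model, the category-to-service-code table is invented, and the parameter names (service_code, lat, long, description, jurisdiction_id) are the ones I understand the Open311 GeoReport v2 spec to use, so check them against the spec before relying on them. Nothing here talks to a real server.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlencode


@dataclass
class Report:
    """Hypothetical, simplified stand-in for a citizen report (not Ushahidi's real schema)."""
    title: str
    description: str
    latitude: float
    longitude: float
    category: str


# Invented mapping from report categories to a city's Open311 service codes.
# A real add-on would presumably load these from the city's own service list.
SERVICE_CODES: Dict[str, str] = {"pothole": "001", "streetlight": "002"}


def to_open311_params(report: Report, jurisdiction_id: Optional[str] = None) -> Dict[str, str]:
    """Build the form parameters for creating an Open311 service request."""
    service_code = SERVICE_CODES.get(report.category)
    if service_code is None:
        raise ValueError(f"no Open311 service code for category {report.category!r}")

    params = {
        "service_code": service_code,
        "lat": f"{report.latitude:.6f}",
        "long": f"{report.longitude:.6f}",
        "description": f"{report.title}: {report.description}",
    }
    if jurisdiction_id is not None:
        params["jurisdiction_id"] = jurisdiction_id
    return params


def encode_form_body(params: Dict[str, str]) -> str:
    """URL-encode the parameters as a form body ready to be POSTed."""
    return urlencode(params)


if __name__ == "__main__":
    report = Report("Pothole", "Deep hole near the bus stop", 41.3083, -72.9279, "pothole")
    print(encode_form_body(to_open311_params(report, jurisdiction_id="example.city")))
```

The interesting work in a real add-on would be everything this leaves out: discovering each city’s service list, authenticating, and syncing status updates back onto the map.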

Already I think many cities – even those in developing countries – could probably afford SeeClickFix, so there may already be a solution at the right price point in this space. But maybe not, I don’t know. More importantly, an Open311 module for Ushahidi could get local governments, or better still, local tech developers in developing economies, interested in and contributing to the Ushahidi code base, further strengthening the project. And while the code would be globally accessible, innovation and implementation could continue to happen at the local level, helping drive the local economy and boosting know-how. The model here, in my mind, is OpenMRS, which has spawned a number of small tech startups across Africa that manage the implementation and servicing of a number of OpenMRS installations at medical clinics and countries in the region.

I think this is a potentially powerful idea for stakeholders in local governments and startups (especially in developing economies) and our friends at Ushahidi. I can see that my friend Philip Ashlock at Open311 had a similar thought a while ago, so the Open311 people are clearly interested. It could be that the right ingredients are already in place to make some magic happen.

Interview on Open Source, Open Gov & Open Data with CSEDEV

The other week – in the midst of boarding a plane(!) – I did an interview with CSEDEV on some thoughts around open data, open government and open source.

The kind people at CSEDEV have written up the interview in a kind of paraphrased way and published it as three short blog posts: part 1 here, part 2 here and part 3 here.

Part of what makes this interesting to me is how a broader set of people are becoming interested in open government. Take CSEDEV for example. Here is an Ottawa-based software firm focused on enterprise solutions. It’s one of an increasing number of software companies and IT consulting firms that are taking note of the open government and open data meme. Indeed, another concrete example of this is Lagan, a large supplier of 311 systems, which announced the other week that it would support the Open311 standard. This dramatically alters the benefits of a 311 system and the capacity for it to serve as a platform and innovation driver for a city.

But, even more exciting, the meme is starting to spread beyond IT and software. I was recently asked to write an article on what open data and open government mean for business more generally, here in BC. (Will link to it when published.)

These moments represent an important shift in the open data and open government debate. With vendors and consultants taking notice, governments can more easily push for, and expect, off-the-shelf solutions that support open government initiatives. Not only could this reduce costs to government and improve access for public servants and citizens, it could also be a huge boost for open standards, which could prove transformative to the management of information in the public sector.

Exciting times. Watch the open government space – now that it’s linked to IT, it’s beginning to gain speed.

eaves.ca

if writing is a muscle, this is my gym