Yearly Archives: 2011

Why not create an Open311 add-on for Ushahidi?

This is not a complicated post. Just a simple idea: Why not create an Open311 add-on for Ushahidi?

So what do I mean by that, and why should we care?

Many readers will be familiar with Ushahidi, a non-profit that develops open source mapping software that enables users to collect and visualize data in interactive maps. Its history is now fairly famous, as the Wikipedia article about it outlines: “Ushahidi.com (Swahili for ‘testimony’ or ‘witness’) is a website created in the aftermath of Kenya’s disputed 2007 presidential election (see 2007–2008 Kenyan crisis) that collected eyewitness reports of violence sent in by email and text-message and placed them on a Google map.” Ushahidi’s mapping software also proved to be an important resource in a number of crises since the Kenyan election, most notably during the Haitian earthquake. Here is a great 2 minute video on how Ushahidi works.

But mapping of this type isn’t only important during emergencies. Indeed it is essential for the day-to-day operations of many governments, particularly at the local level. While many citizens in developed economies may be unaware of it, their cities are constantly mapping what is going on around them. Data on broken infrastructure such as leaky pipes, water mains, clogged gutters and potholes, along with social issues such as crime, homelessness, and business and liquor license locations, is constantly being updated. More importantly, citizens are often the source of this information – their complaints are the sources of data that end up driving these maps. The gathering of this data generally falls under the rubric of what are termed 311 systems – since in many cities you can call 311 either to tell the city about a problem (e.g. a noise complaint, a service request or broken infrastructure) or to request information about pretty much any of the city’s activities.

This matters because 311 systems have generally been expensive and cumbersome to run. The beautiful thing about Ushahidi is that:

1. it works: it has a proven track record of enabling citizens in developing countries to share data, using even the simplest of devices, both with one another and with agencies (like humanitarian organizations)
2. it scales: Haiti and Kenya are pretty big places, and they generated a fair degree of traffic. Ushahidi can handle it.
3. it is lightweight: Ushahidi’s technical footprint (yep, making that term up right now) is relatively light. The infrastructure required to run it is not overly complicated
4. it is relatively inexpensive: as a result of (3) it is also relatively cheap to run, being both lightweight and leveraging a lot of open source software

Oh, and did I mention IT WORKS.

This is pretty much the spec you would want to meet if you were setting up a 311 system in a city with very few resources but interested in starting to gather data about citizen demands and/or trying to monitor infrastructure it has newly invested in. Of course, to transform Ushahidi into a platform for mapping 311-type issues you’d need some sort of spec to understand what that would look like. Fortunately Open311 already does just that and is supported by some of the large 311 system providers – such as Lagan and Motorola – as well as some of the disruptive newcomers – such as SeeClickFix. Indeed there is an Open311 API specification that any developer could use as the basis for the add-on to Ushahidi.
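To make the idea a little more concrete, here is a rough sketch of what the core of such an add-on might do: translate a crowd-sourced report into the parameters of an Open311-style service request. Everything about the incoming report (the `title`, `description`, `latitude`, `longitude` and `category` keys) and the category-to-code table is an assumption made up for illustration, not Ushahidi’s actual schema. The output keys (`service_code`, `lat`, `long`, `description`) follow my reading of the Open311 GeoReport v2 spec and should be checked against it before anyone builds on this.

```python
# Hypothetical lookup from report categories to a city's Open311 service codes.
# Real codes would come from the city's own service list.
SERVICE_CODES = {"pothole": "001", "streetlight": "002"}


def to_open311_request(report, service_codes=SERVICE_CODES):
    """Translate a crowd-sourced report dict into Open311-style request parameters."""
    category = report["category"].strip().lower()
    if category not in service_codes:
        raise ValueError(f"no service code for category {category!r}")
    return {
        "service_code": service_codes[category],
        "lat": float(report["latitude"]),
        "long": float(report["longitude"]),
        "description": f'{report["title"]}: {report["description"]}',
    }


# A made-up report, shaped the way this sketch assumes reports arrive.
example = {
    "title": "Deep pothole",
    "description": "Blocking the left lane",
    "latitude": "-1.2921",
    "longitude": "36.8219",
    "category": "Pothole",
}

print(to_open311_request(example))
```

A real add-on would also need to send these parameters to a city’s endpoint and handle the response, which this sketch deliberately leaves out.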

Already I think many cities – even those in developing countries – could probably afford SeeClickFix, so there may already be a solution at the right price point in this space. But maybe not, I don’t know. More importantly, an Open311 module for Ushahidi could get local governments, or better still, local tech developers in developing economies, interested in and contributing to the Ushahidi code base, further strengthening the project. And while the code would be globally accessible, innovation and implementation could continue to happen at the local level, helping drive the local economy and boosting know-how. The model here, in my mind, is OpenMRS, which has spawned a number of small tech startups across Africa that manage the implementation and servicing of a number of OpenMRS installations at medical clinics and countries in the region.

I think this is a potentially powerful idea for stakeholders in local governments and startups (especially in developing economies) and our friends at Ushahidi. I can see that my friend Philip Ashlock at Open311 had a similar thought a while ago, so the Open311 people are clearly interested. It could be that the right ingredients are already in place to make some magic happen.

Code for America – Showing how to get it done on Independence Day

Mind. Prepare to be blown away. Big Data, Wikipedia and Government.

Okay, super psyched about this. Back at the Strata Conference in Feb (in San Diego) I introduced my long time uber-quant friend and now Wikimedia Foundation data scientist Diederik Van Liere to fellow Gov2.0 thinker Nicholas Gruen (Chairman) and Anthony Goldbloom (Founder and CEO) of an awesome new company called Kaggle.

As usually happens when awesome people get together… awesomeness ensued. Mind. Be prepared to be blown.

So first, what is Kaggle? They’re a company that helps companies and organizations post their data and run competitions with the goal of having it scrutinized by the world’s best data scientists towards some specific goal. Perhaps the most powerful example of a Kaggle competition to date was their HIV prediction competition, in which they asked contestants to use a data set to find markers in the HIV sequence which predict a change in the severity of the infection (as measured by viral load and CD4 counts).

Until Kaggle showed up the best science to date had a prediction rate of 70% – a feat that had taken years to achieve. In 90 days contributors to the contest were able to achieve a prediction rate of 77%. That is a 10% relative improvement. I’m told that achieving a similar increment had previously taken something close to a decade. (Data geeks can read how the winner did it here and here.)
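For anyone checking the arithmetic: going from 70% to 77% is a gain of 7 percentage points, which works out to a 10% improvement relative to the 70% starting point. A tiny sketch of that calculation (the function name is mine, purely for illustration):

```python
def relative_improvement(old, new):
    """Return the gain from old to new as a fraction of old."""
    return (new - old) / old


baseline, contest = 0.70, 0.77

# Absolute gain, in percentage points.
print(f"absolute gain: {(contest - baseline) * 100:.0f} points")
# Relative gain, as a share of the baseline.
print(f"relative gain: {relative_improvement(baseline, contest):.0%}")
```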

Diederik and Anthony have created a similar competition, but this time using Wikipedia participation data. As the competition page outlines:

This competition challenges data-mining experts to build a predictive model that predicts the number of edits an editor will make in the five months after the end date of the training dataset. The dataset is randomly sampled from the English Wikipedia dataset from the period January 2001 – August 2010.

The objective of this competition is to quantitively understand what factors determine editing behavior. We hope to be able to answer questions, using these predictive models, why people stop editing or increase their pace of editing.

This is, of course, a subject that is dear to me as I’m hoping that we can do similar analysis in open source communities – something Diederik and I have tried to theorize with Wikipedia and actually do with Bugzilla data.

There is a grand prize of $5000 (along with a few others) and, amazingly, already 15 participants and 7 submissions.

Finally, I hope public policy geeks, government officials and politicians are paying attention. There is power in data and an opportunity to use it to find efficiencies and opportunities. Most governments probably don’t even know how to approach an organization like Kaggle or how to run a competition like this, despite (or because of?) how fast, efficient and effective it is.

It shouldn’t be this way.

If you are in government (or any org), check out Kaggle. Watch. Learn. There is huge opportunity here.

12:10pm PST – UPDATE: More Michael Bay sized awesomeness. Within 36 hours of the Wikipedia challenge being launched the leading submission has improved on internal Wikimedia Foundation models by 32.4%.

CIDA announces Open Data portal: What it means to Canadians

For those who missed it, the Canadian International Development Agency (CIDA) has announced it is launching an open data portal.

This is exciting news. On Monday I was interviewed about the initiative by Embassy Magazine which published the resulting article (behind their paywall) here.

As (I hope) the interview conveys, I’m cautiously optimistic about the Minister’s announcement. I’m conservative in my reaction only because we don’t actually know what the Minister has announced. At the moment the CIDA open data page is, quite literally, a blank slate. I feel positive because pretty much anything that gets more information about Canada’s aid budget available online is a step in the right direction. I’m cautious however, because the text from the Minister’s speech leads me to believe that she is using the term “open data” to describe something that may, in fact, not be open data.

Donors and partner countries must be accountable to their citizens, absolutely, but both must also be accountable to each other.

Transparency underpins these accountabilities.

With this in mind, today I am pleased to announce the Open Data Portal on the CIDA website that will make our searchable database of roughly 3,000 projects quick and simple to access.

The Open Data portal will put our country strategies, evaluations, audits and annual statistical and results reports within easy reach.

One of the core elements of the definition of “open data” is that it be machine readable. I need to actually get the “data” (e.g. an Excel spreadsheet, or a database I can download and/or access) so that I can play with it, mash it up, analyze it, etc… It isn’t clear that this is on offer. The Minister’s announcement talks about a database that allows you to search, and quickly download, reports on the 3000 projects that CIDA funds or operates. A report, however, is not data. It may cite data, it may (and hopefully does) even contain data in charts or tables, but if what we are getting is access to reports then this is not an open data portal.

What I hope is happening – and what I advocated for in an oped in the Toronto Star – is that the Minister is launching a true open data portal which will share actual data – not analysis – with Canadians. More importantly, I hope this means Canada will be joining the efforts of Publish What You Fund, as it pushes donor organizations to share their aid data in a single common structure, so that budgets, contributions, projects, timelines, geography and other information about aid can be compared across countries, agencies, and organizations.

Open data, especially in an internationally recognized standardized format, matters because no one is going to read all 10,000 reports about all 3000 projects CIDA funds. However, if we had access to the data, in a structured manner, there are those at non-profits, in universities and colleges and in the media (among other places) who could map the projects, compare budgets and results more clearly, compare our efforts against those of other countries, and do their own analysis to, say, find duplication and overlap. I don’t, for a second, believe that 99.9% of Canadians will use CIDA’s open data portal, but the 0.1% who do will be able to create products that can inform the rest of us, and allow us to better understand Canada’s role in the world. In other words, an Open Data portal could be empowering and educating to a broad number of people. Access to 10,000 reports, while a good step, simply won’t be able to create a similar outcome on any scale. The difference is, quite frankly, dramatic.

So let’s wait and see. I’m excited that the Minister of International Cooperation is using the language of Open Data – it means that she and her staff understand it has currency. What I also hope is that they understand its meaning. So far we have no data on whether they do or do not, and I remain cautiously optimistic; they should, after all, realize the significance of the language they are using. Either way, they have set high expectations among those of us who think about, talk about and work in this area. As a Canadian, I’m hoping those expectations get fulfilled.

Links on Social Media & Politics: Notes from "We Want Your Thoughts #4"

Last night I had a great time taking the stage with Alexandra Samuel in Vancouver for “We Want Your Thoughts” at the Khafka coffee house on Main St. The night’s discussion was focused on “Social Media – from chit chat to election winner – what next?” (with a little on the social media driven response to the riots thrown in for good measure).

Both Alex and I promised to post some links from our blogs for attendees so what follows is a list of some thoughts on the subject I hope everyone can find engaging.

On Social Media generally, probably the most popular post on this blog is this piece: Twitter is my Newspaper: explaining twitter to newbies. More broadly thinking about the internet and media, this essay I wrote with Taylor Owen is now a chapter in this university textbook on journalism. Along with this post as a sidebar note (different textbook), which has been one of my most read.

On the riots, I encourage you to read Alexandra Samuel’s post on the subject (After a Loss in Vancouver, Troubling Signals of Citizen Surveillance) and my counter thoughts (Social Media and Rioters) – a blogging debate! You can also hear me talk about the issue in an interview on CBC’s Cross Country Checkup (around hour 1).

On social media and politics, maybe some of the most notable pieces include a back and forth between myself and Michael Valpy, who felt that social media was ending our social cohesion and destroying democracy (obviously, this was pre-Middle East Riots and the proroguing of Parliament debate). I responded with a post on why his arguments were flawed and that actually the reverse was true. He responded to that post in The Mark. And I posted a response to that as well. It all makes for a good read.

Rob Cottingham’s Visual Notes of the first 15 minutes

Then there were some pieces on Social Media and the Proroguing of Parliament. I had this piece in the Globe and then this post talking a little more about the media’s confused relationship with social media and politics.

Finally, one of the points I referred to several times yesterday was the problem of assuming social values won’t change when talking about technology adoption and its impact. Probably the most explicit post I’ve written on the subject is this one: Why the Internet Will Shape Social Values (and not the other way around).

Finally, some books/articles I mentioned or on topic:

Everything Bad is Good for You by Steven Johnson

What Technology Wants by Kevin Kelly

Here Comes Everybody by Clay Shirky

The Net Delusion: How Not to Liberate the World by Evgeny Morozov

The Inside Story of How Facebook Responded to Tunisian Hacks, an article in the Atlantic by Alexis Madrigal

I hope this is interesting.

The next Open Data battle: Advancing Policy & Innovation through Standards

With the possible exception of weather data, the most successful open data set out there at the moment is transit data. It remains the data with which developers have experimented and innovated the most. Why is this? Because it’s been standardized. Ever since Google and the City of Portland created the General Transit Feed Specification (GTFS), any developer that creates an application using GTFS transit data can port their application to more than 100 cities around the world with 10s and even 100s of millions of potential users. Now that’s scale!
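To show why a shared structure makes applications portable, here is a small sketch: one function that reads the `stops.txt` file from a GTFS feed and works unchanged on feeds from different cities. The column names (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`) come from the GTFS spec; the two sample feeds are invented rows standing in for real city data, and the function name is my own.

```python
import csv
import io


def load_stops(stops_txt):
    """Parse the text of a GTFS stops.txt file into a list of dicts."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return [
        {
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


# Made-up sample rows standing in for two different cities' feeds.
CITY_A = "stop_id,stop_name,stop_lat,stop_lon\nA1,Main St,45.5,-122.6\n"
CITY_B = "stop_id,stop_name,stop_lat,stop_lon\n900,Broadway,49.26,-123.1\n"

# The same code handles both feeds because both follow the same structure.
for feed in (CITY_A, CITY_B):
    for stop in load_stops(feed):
        print(stop["name"], stop["lat"], stop["lon"])
```

Nothing in `load_stops` knows which city it is looking at, and that is the point of the standard.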

All in all the benefits of a standard data structure are clear. A public good is more effectively used, citizens enjoy better service and companies (both Google and the numerous smaller companies that sell transit related applications) generate revenue, pay salaries, etc…

This is why, with a number of jurisdictions now committed to open data, I believe it is time for advocates to start focusing on the next big issue. How do we get different jurisdictions to align around standard structures so as to increase the number of people to whom an application or analysis will be relevant? Having cities publish open data sets is a great start and has led to real innovation, but the next generation of open data and the next leaps in innovation will require some more standards.

The key, I think, is to find areas that meet three criteria:

Government Data: Is there relevant government data about the service or issue that is available?
Demand: Is this a service for which there is regular demand? (this is why transit is so good, millions of people touch the service on a daily basis)
Business Model: Is there a business that believes it can use this data to generate revenue (either directly or indirectly)?

Two comments on this.

First, I think we should look at this model because we want to find places where the incentives are right for all the key stakeholders. The wrong way to create a data structure is to get a bunch of governments together to talk about it. That process will take 5 years… if we are lucky. Remember, the GTFS emerged because Google and Portland got together; after that, everybody else bandwagoned because the value proposition was so high. This remains, in my mind, not the perfect model, but the fastest and most efficient one for getting more common data structures. I also recognize it won’t work for everything, but it can give us more successes to point to.

Which leads me to point two. Yes, at the moment, I think the target in the middle of this model is relatively small. But I think we can make it bigger. The GTFS shows cities, citizens and companies that there is value in open data. What we need are more examples so that a) more business models emerge and b) more government data is shared in a structured way across multiple jurisdictions. The bottom and right-hand circles in this diagram can, and if we are successful will, move. In short, I think we can create this dynamic:

So, what does this look like in practice?

I’ve been trying to think of services that fall in various parts of the diagram. A while back I wrote a post about using open restaurant inspection data to drive down health costs, specifically around finding a government to work with Yelp!, Bing or Google Maps, Urban Spoon or another company to integrate the inspection data into the application. That for me is an example of something that I think fits in the middle. Governments have the data, it’s a service citizens could touch on a regular basis if the data appeared in their workflow (e.g. Yelp! or Bing Maps) and for those businesses it either helps drive search revenue or gives their product a competitive advantage. The Open311 standard (sadly missing from my diagram) and the emergence of SeeClickFix strike me as another excellent example that is right on the inside edge of the sweet spot.

Here’s a list of what else I’ve come up with at the moment:

You can also now see why I’ve been working on Recollect.net – our garbage pick up reminder service – and helping develop a standard around garbage scheduling data – the Trash & Recycling Object Notation. I think it is a service around which we can help explain the value of common standards to cities.

You’ll notice that I’ve put “democracy data” (e.g. agendas, minutes, legislation, hansards, budgets, etc…) in the area where I don’t think there is a business plan. I’m not fully convinced of this – I could see a business model in the media space for this – but I’m trying to be conservative in my estimate. In either case, that is the type of data the good people at the Sunlight Foundation are trying to get liberated, so there are, at least, non-profit efforts concentrated there in America.

I also put real estate in a category where I don’t think there is real consumer demand. What I mean by this isn’t that people don’t want it, they do, but they are only really interested in it maybe 2-4 times in their life. It doesn’t have the high touch point of transit or garbage schedules, or of traffic and parking. I understand that there are businesses to be built around this data. I love Viewpoint.ca – a site that mashes open data up with real estate data to create a compelling real estate website – but I don’t think it is a service people will get attached to because they will only use it infrequently.

Ultimately I’d love to hear from people on ideas they have on what might fit in this sweet spot (if you are comfortable sharing the idea, of course). Part of this is because I’d love to test the model more. The other reason is because I’m engaged with some governments interested in getting more strategic about their open data use and so these types of opportunities could become reality.

Finally, I just hope you find this model compelling and helpful.

Open Data Job Posting at MaRS in Toronto

The following job posting can be found on the MaRS website here.

So here’s a job for an open data advocate living in, or willing to move to, Toronto. For the right person with the right vision, this could be about getting a group of organizations to open up their data to drive innovation and broaden the adoption or development of public goods. I’ve lots of thoughts on this and think it could be an interesting opportunity, so I thought I’d share.

You’ve got until the end of the month to submit the necessary documents…

Program Director – Regional Strategic Resource Centre Program (ReSRC)

Posted June 14, 2011

Job Title: Program Director – Regional Strategic Resource Centre Program (ReSRC)

Company Name: MaRS

Position Type: Program Director – Regional Strategic Resource Centre Program (ReSRC)

Location: ON – Metro Toronto

Application Deadline: 2011-06-30

Category: Project Management

Position Overview:

The development of Regional Strategic Resource Centres (ReSRCs) is an exciting new initiative in Ontario’s innovation system, supported by the Ministry of Research and Innovation (MRI) and coordinated by MaRS in partnership with a range of stakeholders.

The ReSRCs will advance Ontario’s Innovation Agenda by creating the information infrastructure required to support 21st century knowledge economy decision-making, and by engaging the community to use this “open information” to strengthen innovation in the region.

The premise of this initiative is simple: By sharing and integrating disparate sets of data – often collected in institutional silos – from government, academia as well as the private and non-profit sectors, we will better understand the unique strengths, opportunities and needs of our communities and can more effectively work together to build vibrant, productive regional innovation economies.

Successful communities around the world increasingly rely on the information and insights garnered from a wide range of sources, including civic organizations, businesses, academic institutions, non-profits and governments, among others. Significant amounts of data are collected by these organizations in the course of their work. The aggregation of this data will create a rich virtual information hub that can be accessed by various community stakeholders to make more timely and better decisions on topics ranging from urban planning to drivers of economic development. Different models of these open data platforms are in development elsewhere; there is a unique opportunity for Ontario to become a leader in this arena, given the strong innovation network that is being developed across the Province.

As an example, the ReSRC may specifically focus on the role of high growth firms in the region, given their critical contribution to new job creation. In this case, the hub will integrate relevant data sources and engage entrepreneurs and stakeholders working in/with these firms, in an effort to shed light on the following questions: Where are the high growth firms located? What sectors are they in? How old are these firms? Where do the employees who work in these firms live? How do they get to work? What is their education? How do the high growth firms collaborate with academic institutions in the region? What other firms in the region do the high growth firms rely on or support? Which public policy or program instruments are particularly effective in supporting the growth of these firms, or hinder their progress? How can we do more of what works, to create more high growth firms in the region? What barriers need to be removed?

We believe the time is right for Ontario to take a strong leadership position in the provisioning of open data to foster community leadership and collaboration. This initiative will involve the creation of the inaugural ReSRC, which includes the identification and aggregation of disparate data sources, sourcing a data management infrastructure, and developing an online community engagement portal. The small core ReSRC team will work closely with a range of partners – the success of the initiative will depend on the productive collaboration with stakeholders in different related sectors. An Advisory Board, with representation of key partners, will meet regularly to provide strategic guidance, extend networks, and share expertise.

The qualified candidate for this exciting position will have strong skills and demonstrated experience in project management, open data / information management, data warehousing and integration, community engagement, IP policy / negotiations, and contract management. Outstanding communication skills and ability to lead in a collaborative environment are critical attributes.

Key Responsibilities

Provide overall leadership and management through all elements of project planning and execution including:

Building and managing the project team and working with the Ontario Ministry of Research and Innovation to refine the project scope of work, activities, resources, and timelines

Developing the governance structure of ReSRC including the development and engagement of an Advisory Board to guide and Working Groups to drive the project

Leading and managing all Request for Proposal (RFP) processes and providing oversight and management of technology implementations (including relevant collaboration portal, data warehouses, and integration with legacy systems and data provider systems)

Building and managing negotiations and trusted relationships with key data providers, with respect to IP, privacy, data sharing and level of access, fee-for-service offerings, and support / value add services

Providing consistent project updates to MaRS and the extended project team, select community stakeholders, and the Ministry of Research and Innovation

Managing the overall decision-making processes and proactively identifying, analyzing and resolving issues

Managing the allocation of funds, project budget and ensuring adherence / compliance to Ontario procurement legislation

Managing the development of a brand and marketing strategy

Developing a best in class on-line portal for ReSRC; launch and maintain strong community engagement with the assets of the entity

Managing rollout strategy and maintaining documentation / lessons learned on project implementation to support future ReSRC launches in other Ontario regions

Demonstrate industry knowledge and leadership regarding innovation trends, emerging market shifts, economic development models, current events, major corporate and government initiatives, public policy, regulatory issues etc.

Effectively maintain key contacts related to open data, community engagement, and data integration

Leading the development of original thought leadership related to the project scope including open data, community engagement, data integration, GIS

Educational/Experience Requirements

Minimum Bachelor’s degree and 8+ years of relevant project management and/or business experience in information sciences, market research, management consulting, data management, or related sectors.

Demonstrated project management experience delivering complex technology or other projects; experienced in project management methodologies and the ability to apply them in a flexible manner; PMP certification considered an asset

Demonstrated effective approach to problem solving, understands the context and impact of problems and demonstrates an extensive knowledge of available resources and content

Strong understanding of open data, Geographic Information Systems (GIS), collaboration portals and wikis, data integration, IP / privacy, and market research

Familiar with the “language” and terminology of industry, finance, government and the tech/data community; deep knowledge of the business development process, including an extensive network of business contacts

Highly developed analytical and interpretive skills, conceptual thinker, able to solve complex problems

Demonstrated passion for leading successful, creative, and engaged teams

Strong capabilities in: MS Word, PowerPoint and Excel

Personal Requirements

Self-starter, creative thinker, and strong team player

Strong communications and organizational skills – oral, written, and presentation skills

Superior interpersonal skills; ability to influence others without formal authority

Ability to impact and influence key project participants and stakeholders (including strong negotiation skills)

Strong partnership development capabilities with an array of stakeholders including government, public sector, non-profit, academe, and the private sector

Ability to multi-task, comfortable working in a fast-paced, high energy environment

Consummate professional, able to represent the organization in all circumstances

Personal accountability and commitment to achieving and exceeding goals and objectives

How To Apply: Interested candidates should forward their resume to Elizabeth Pojedyniec at epojedyniec@marsdd.com by Thursday, June 30th, 2011

Visualizing how everything in Beijing is built at a Las Vegas scale

One of the things that struck me most about Beijing was the sheer size of everything. Beijing, it often seems, is built at a Las Vegas scale – the buildings, the roads, the airport – it’s all huge.

The size can be deceptive when looking at a map. It is not unusual for a city block in Beijing to be 500 meters long, so you might look at a map and say: “Hey look, that’s only 3 blocks from here! Let’s walk over there.” A kilometer and a half isn’t a long walk – but it is if you were expecting to walk 3 or 4 hundred meters.

Of course, there are lots of small things in Beijing, lots of weaving alleys (indeed, it sometimes felt like there were only two types of streets in Beijing: 6-10 lane super roads, or tiny alleyways and side streets) but the grandiose parts of it dominate the experience. More interestingly, this is not a recent thing. The Forbidden Palace – world famous and which I first remember being introduced to while watching the visually stunning The Last Emperor in elementary school – is perhaps the best example of how big has always been a part of Beijing. The complex – which was home to China’s emperors from its completion in 1420 – spans an area of 178 acres. If, like me, you find that number hard to assign meaning to, I thought I’d visualize it against an American and a Canadian landmark that would make it easier to understand.

All of these images are taken from Google Maps at the same scale. The top left hand image shows the Forbidden Palace in Beijing, the lower left shows the same area superimposed over Stanley Park in Vancouver and the image on the right shows it superimposed over Central Park in New York.

What’s amazing is you could literally pour all of Stanley Park into the Forbidden Palace and have room left over, while the complex consumes around 40% of Central Park. Mind boggling.

I also hope this blog post demonstrates how a simple visualization can be much more powerful than numbers, even using common tools like Google Maps and Keynote (PowerPoint for Mac).

Social Media and Rioters

My friend Alexandra Samuel penned a piece titled “After a Loss in Vancouver, Troubling Signals of Citizen Surveillance” over at the Harvard Business Review. The piece highlights her concern with the number of people willing to engage in citizen surveillance.

As she states:

It’s one thing to take pictures as part of the process of telling your story, or as part of your (paid or unpaid) work as a citizen journalist. It’s another thing entirely to take and post pictures and videos with the explicit intention of identifying illegal (or potentially illegal) activity. At that moment you are no longer engaging in citizen journalism; you’re engaging in citizen surveillance.

And I don’t think we want to live in a society that turns social media into a form of crowdsourced surveillance. When social media users embrace Twitter, Facebook, YouTube and blogs as channels for curating, identifying and pursuing criminals, that is exactly what they are moving toward.

I encourage you to read the piece, but I’m not sure I agree with much of it, on two levels.

First, I want to steer away from good versus bad and right versus wrong. Social media isn’t going to create only good outcomes, or only bad outcomes; it is going to create both (something I know Alex acknowledges). This technology will, like previous technologies, reset what normal means. In the new world we are becoming more powerful “sensors” in our society. We can enable others to know what, good and bad, is going on around us. To believe that we won’t share, and that others won’t use our shared information to inform their decisions, is simply not logical. As dBarefoot points out in the comments, there is lots of social good that can come from surveillance. In the end you can’t post videos of human rights injustices without also being able to post videos of people at abortion clinics; you can’t post videos of officials taking bribes without also being able to post videos of people smoking drugs at a party. The alternative, a society where people are not permitted to share, strikes me as even more dangerous than a society where we can share but where one element of that sharing ends up being used as surveillance. My suspicion is that we may end up regulating some use – there will be some things people cannot share online (visiting abortion clinics may end up being one of those) – but I’m not confident of even this.

But I suspect that in a few decades my children will be stunned that I grew up in a world of no mutual surveillance, and that we tolerated the risks of a world where mutual surveillance didn’t exist. They may wonder, at a basic level, how we felt safe at night or in certain circumstances (I really recommend David Brin’s science fiction writing, especially Earth, in which he explores this idea). I can also imagine they will find the idea of total anonymity and having an untraceable past at once eerie, frightening and intriguing. Their world, having grown up with social media, will be different: some of the things we feel are bad, they will like, and vice versa.

Another issue missing from Alex’s piece is the role of the state. It is one thing for people to post pictures of each other; it is another to ask how, and if, the state does the same. As many tweeters stated – this isn’t 1994 (the last time there were riots in Vancouver). What social media is going to do is make the enforcement of law, and the role of the state, a much trickier subject. Ultimately, the state cannot ignore photos of rioters engaged in illegal acts. So the question isn’t so much about what we are going to share; it is about what we should allow the state to do, and not to do, with the information we create. The state’s monopoly on violence gives it a unique role, one that will need to be managed carefully. This monopoly, combined with a world of perfect (or at least, a lot more) information, will I imagine necessitate a state and justice system that looks very, very different from the one we have right now if we are to protect civil liberties as we presently understand them. (I suspect I’ll be writing some more about this.)

But I think the place where I disagree the most with Alex is in the last paragraph:

What social media is for — or what it can be for, if we use it to its fullest potential — is to create community. And there is nothing that will erode community faster, both online and off, than creating a society of mutual surveillance.

Here, Alex confuses the society she’d like to live in with what social media enables. I see nothing to suggest that mutual surveillance will erode community; indeed, I think it has already demonstrated that it does the opposite. Mutual surveillance fosters lots of communities – from communities that track human rights abuses, to communities that track abortion providers, to communities that track disabled parking violators. Surveillance builds communities. It may be that, in many cases, those communities pursue the marginalization of another community or the termination of a specific behaviour, but that does not make them any less a part of our society’s fabric. It may not create communities everyone likes, but it can create community. What matters here is not whether we can monitor one another, but what ends up happening with the information we generate, which is why I think we’ll want to think hard, and more and more carefully, about what we allow the state to do and what we permit others to do.

How GitHub Saved OpenSource

For a long time I’ve been thinking about just how much GitHub has revolutionized open source. Yes, it has made managing the code base significantly easier, but its real impact has likely been on the social aspects of managing open source. GitHub has rebooted the innovation cycle in open source while simultaneously raising the bar for good community management.

The irony may be that it has done this by making it easy to do the one thing many people thought would kill open source: forking. I remember talking to friends who – before GitHub launched – felt that forking, while a necessary check on any project, was also its biggest threat and so needed to be managed carefully.

Today, nothing could feel further from the truth. By collapsing the transaction costs around forking, GitHub hasn’t killed open source. It has saved it.

The false fear of forking

The concern with forking – as it was always explained to me – was that it would splinter a community, potentially to the degree that none of the emerging groups would have the necessary critical mass to carry the project forward. Yes, it was necessary that forking be allowed – but only as a last resort to manage the worst excesses of bad community leadership. But forking was messy stuff, even emotionally painful and exhausting: while sometimes it was mutually agreed upon and cordial, many feared that it would usually be preceded by ugly infighting and nastiness that culminated in a (sometimes) angry rejection and the (almost) political act of forming a new community.

Forking = Innovation Accelerated

Maybe forking was an almost political act – when it was hard to do. But once anyone could do it, anytime, anywhere, the dynamics changed. I believe open source projects work best when contributors are able to engage in low transaction cost cooperation and high transaction cost collaboration is minimized. The genius of open source is that it does not require a group to debate every issue and work on problems collectively; quite the opposite. It works best when architected so that individuals or functioning sub-groups can grab a part of the whole, take it away, play with it, and bring the solution back and fit it into the larger project.

In this world innovation isn’t driven by getting lots of people to work together simultaneously, compromising, negotiating solutions, and waiting on others to complete their part. Such a process can be slow, and worse, can allow promising ideas to be killed by early criticism or be watered down before they reach their potential. What people often need is a private place where their idea can be nursed, an innovation cycle driven by enabling people to work on the same problem in isolation, and then bring working solutions back to the group to be debated. (Yes, this is a simplification, but I think the general idea stands).

And this is why GitHub was such a godsend. Yes, it made managing the code base easier, but what it really did was empower contributors. It took something everyone thought would kill open source projects – forking – and made it a powerful tool of experimentation and play. Now, rather than just play with a small part of the code base, you could play with the entire thing. My strong suspicion is that this has rebooted the innovation cycle for many open source projects. The ability to have lots of people innovating in the safety of their private repository has produced more new ideas than ever before.

Forking = Better Community Management

I also suspect that eliminating the transaction costs around forking has improved open source in another important way. It has made open source project leads more accountable to the communities they manage.

Why?

Before GitHub the transaction costs around forking were higher. Setting up a new repository, firing up a bug tracking system and creating all the other necessary infrastructure wasn’t impossible, but neither was it simple. As a result, I suspect it usually only made sense to do if you could motivate a group of contributors to fork with you – there needed to be a deep grievance to justify all this effort. In short, the barriers to forking were high. That meant that project leaders had a lot of leeway in how they engaged with their community before the “threat” of forking became real. The high transaction cost of forking created a cushion for lazy, bad, or incompetent open source leadership and community management.

But collapse the transaction costs of forking and the cost of a parallel project emerging also drops significantly. This is not to claim that the cost of forking is zero – but I suspect that open source community leaders now have to be much more sensitive to the needs, demands, wishes and contributions of their community. More importantly, I suspect this has been good for open source in general.

eaves.ca

if writing is a muscle, this is my gym