The Surveillance State – No Warrant Required

Yesterday a number of police organizations came out in support of bill C-30 – the online surveillance bill proposed by Minister Vic Toews. You can read the Vancouver Police Department’s full press release here – I’m referencing theirs not because it is particularly good or bad, but simply because it is my home town.

For those short on time, the very last statement, at the bottom of the post, is by far the worst and is something every Canadian should know. The authors of these press releases would have been wise to read Michael Geist’s blog posts from yesterday before publishing. Geist’s analysis shows that, at best, the police are misinformed; at worst, they are misleading the public.

So let’s look at some of the details of the press release that are misleading:

Today I speak to you as the Deputy Chief of the VPD’s Investigation Division, but also as a member of the Canadian Association of Chiefs of Police, and I’m pleased to be joined by Tom Stamatakis, President of both the Vancouver Police Union and Canadian Police Association.
The Canadian Association of Chiefs of Police (CACP) is asking Canadians to consider the views of law enforcement as they debate what we refer to as “lawful access,” or Bill C-30 – “An Act to enact the Investigating and Preventing Criminal Electronic Communications Act and to amend the Criminal Code and other Acts.”
This Bill was introduced by government last week and it has generated much controversy. There is no doubt that the Bill is complex and the technology it refers to can be complex as well.
I would, however, like to try to provide some understanding of the Bill from a police perspective. We believe new legislation will:
  • assist police with the necessary tools to investigate crimes while balancing, if not strengthening, the privacy rights for Canadians through the addition of oversight not currently in place

So first bullet point, first problem. While it is true the bill brings in some new process, to say it strengthens privacy rights is misleading. It has become easier, not harder, to gain access to people’s personal data. Before, when the police requested personal information from internet service providers (ISPs), the ISPs could say no. Now, we don’t even have that. Worse, the bill apparently puts a gag order on these warrantless demands, so you can’t even find out if a government agency has requested information about you.

  • help law enforcement investigate and apprehend those who are involved in criminal activity while using new technologies to avoid apprehension due to outdated laws and technology
  • allow for timely and consistent access to basic information to assist in investigations of criminal activity and other police duties in serving the public (i.e. suicide prevention, notifying next of kin, etc.)

This, sadly, is a misleading statement. As Michael Geist notes in his blog post today “The mandatory disclosure of subscriber information without a warrant has been the hot button issue in Bill C-30, yet it too is subject to unknown regulations. These regulations include the time or deadline for providing the subscriber information (Bill C-30 does not set a time limit)…”

In other words, for the police to say the bill will give them timely access to basic information – particularly timely enough to prevent a suicide, which would have to be virtually real time access – is flat out wrong. The bill makes no such promise.

Moreover, this underlying justification is itself fairly ridiculous while the opportunities for abuse are not trivial. It is interesting that none of the examples have anything to do with preventing crime. Suicides are tragic, but do not pose a risk to society. And speedily notifying next of kin is hardly such an urgent issue that it justifies warrantless access to Canadians’ private information. These examples speak volumes about the strength of their case.

Finally, it is worth noting that while the police (and the Minister) refer to this as “basic” information, the judiciary disagrees. Earlier this month the Saskatchewan Court of Appeal concluded in R v Trapp, 2011 SKCA 143 that an individual has a reasonable expectation of privacy in the IP address assigned to him or her by an internet service provider, a point which appeared not to have been considered previously by an appellate court in Canada.

The global internet, cellular phones and social media have all been widely adopted and enjoyed by Canadians, young and old. Many of us have been affected by computer viruses, spam and increasingly, bank or credit card fraud.

This is just ridiculous and is designed to do nothing more than play on Canadians’ fears. I mean spam? Really? Google Mail has virtually eliminated spam for its users. No government surveillance infrastructure was required. Moreover, it is very, very hard to see how the surveillance bill will help with any of the problems cited above – viruses, spam or bank fraud.

Okay, skipping ahead (again, you can read the full press release here):

2. Secondly, the matter of basic subscriber information is particularly sensitive.
The information which companies would be compelled to release would be: name, address, phone number, email address, internet protocol address, and the name of the service provider to police who are in the lawful execution of their duties.
Actually, to claim that these are going to police who are in the lawful execution of their duties is also misleading. This data would be made available to police who, at best, believe they are in the lawful execution of their duties. This is precisely why we have warrants: so that an independent judiciary can assess whether or not the police are actually engaged in the lawful execution of their duties. Strip away that check and there will be abuses. Indeed, the Sun Newspaper phone hacking scandal in the UK serves as a perfect example of the type of abuse that is possible. In this case police officers were able to access, “under extraordinary circumstances” and without a warrant or oversight, the names and phone numbers of people whose phones they wanted to hack, or had already hacked.
While this information is important to police in all types of investigations, it can be critical in cases where it is urgent that police locate a caller or originator of information that reasonably causes the police to suspect that someone’s safety is at risk.
Without this information the police may not be able to quickly locate and help the person who was in trouble or being victimized.
An example would be a message over the internet indicating someone was contemplating suicide where all we had was an email address.
Again, see above. The bill does not stipulate any timelines around sharing this data. This statement is designed to lead readers to believe that the bill will grant necessary and instant access so that a situation could be defused in the moment. The bill does nothing of the sort.
Currently, there is no audited process for law enforcement to gain access to basic subscriber information. In some cases, internet service providers (ISPs) provide the information to police voluntarily — others will not, or often there are lengthy delays. The problem is that there is no consistency in providing this information to police nationally.

This, thankfully, is a sensible statement.

3. Lastly, and one of the most important things to remember, this bill does NOT allow the police to monitor emails, phone calls or internet surfing at will without a warrant, as has been implied or explicitly stated.
There is no doubt that those who are against the legislation want you to believe that it does. I have read the Bill and I cannot find that anywhere in it. There are no changes in this area from the current legislation.

This is the worst part of the press release, as it is definitely not true. See the blog post from yesterday by Michael Geist – the Ottawa professor most on top of this story – which was written before this press release went out. According to Geist, there is a provision in the law that “…opens the door to police approaching ISPs and asking them to retain data on specified subscribers or to turn over any subscriber information – including emails or web surfing activities – without a warrant. ISPs can refuse, but this provision is designed to remove any legal concerns the ISP might have in doing so, since it grants full criminal and civil immunity for the disclosures.” In other words, the police can conduct warrantless surveillance. It just requires the permission of the ISPs. This flat out contradicts the press release.

 

Algorithmic Regulation Spreading Across Government?

I was very, very excited to learn that the City of Vancouver is exploring implementing a program started in San Francisco in which “smart” parking meters adjust their price to reflect supply and demand (story is here in the Vancouver Sun).

For those unfamiliar with the program, here is a breakdown. In San Francisco, the city has the goal of ensuring at least one open parking spot is available on every block in the downtown core. As I learned during San Francisco’s presentation at the Code for America summit, such a goal has several important consequences. Specifically, it reduces the likelihood of people double parking, and it reduces smog and greenhouse gas emissions because people don’t troll for parking as long. And because trolling time is reduced, people searching for parking don’t slow down other traffic and buses as they drive around slowly looking for a spot. In short, it has a very helpful impact on traffic more broadly.

So how does it work? The city’s smart parking meters are networked together and constantly assess how many spots on a given block are free. If, at the end of the week, it turns out that all the spaces are frequently in use, the cost of parking on that block is increased by 25 cents. Conversely, if many of the spots were free, the price is reduced by 25 cents. Generally, each block finds an equilibrium point where the cost meets the demand but is also able to adjust in reaction to changing trends.
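To make the weekly adjustment concrete, here is a minimal sketch in Python. The 25-cent step comes from the description above; the occupancy thresholds, the price floor and all the names are my own assumptions for illustration, not the rules San Francisco actually uses.

```python
from dataclasses import dataclass

PRICE_STEP = 0.25  # the 25-cent weekly adjustment described in the post

# Assumed thresholds: the post does not define "frequently in use"
# or "many of the spots were free", so these numbers are placeholders.
HIGH_OCCUPANCY = 0.85
LOW_OCCUPANCY = 0.60


@dataclass
class Block:
    name: str
    hourly_price: float


def weekly_occupancy(samples):
    """Average share of occupied spots across the week's meter readings."""
    return sum(samples) / len(samples)


def adjust_price(block, samples, floor=0.25):
    """Return next week's hourly price for a block.

    `samples` holds occupancy fractions (0.0 to 1.0) reported by the meters.
    `floor` is an assumed minimum price so the rate never drops to zero.
    """
    occupancy = weekly_occupancy(samples)
    if occupancy >= HIGH_OCCUPANCY:
        # Block was mostly full: raise the price one step.
        return round(block.hourly_price + PRICE_STEP, 2)
    if occupancy <= LOW_OCCUPANCY:
        # Block had plenty of room: lower the price one step.
        return round(max(floor, block.hourly_price - PRICE_STEP), 2)
    # In between: leave the price where it is.
    return block.hourly_price
```

Run week after week, a rule like this nudges each block toward the price at which a spot is usually available, which is the equilibrium described above. Notice how much policy hides in the two threshold constants.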

Technologist Tim O’Reilly has referred to these types of automated systems in the government context as “algorithmic regulation” – a phrase I think could become more popular over the coming decade. As software is deployed into more and more systems, the algorithms will be creating marketplaces and resource allocation systems – in effect regulating us. A little over a year ago I said that, contrary to what many open data advocates believe, open data will make data political – i.e. that open data wasn’t going to depoliticize public policy and make it purely evidence-based; quite the opposite, it will make the choices around what data we collect more contested (Canadians, think long form census). The same is also – and already – true of the algorithms, the code, that will increasingly regulate our lives. Code is political.

Personally, I think the smart parking meter plan is exciting and hope the city will consider it seriously. But be prepared: I’m confident that, much like with smart electrical meters, an army of naysayers will emerge who simply don’t want a public resource (roads and parking spaces) to be efficiently used.

It’s like the Spirit of the West said: Everything is so political.

My Canadian Open Government Consultation Submission

Attached below is my submission to the Open Government Consultation conducted by Treasury Board over the last couple of weeks. There appear to be a remarkable number of submissions that were made by citizens, which you can explore on the Treasury Board website. In addition, Tracey Lauriault has tracked some of the submissions on her website.

I actually wish the submissions on the Government website were both searchable and downloadable in their entirety. That way we could re-organize them, visualize them, search and parse them as well as play with the submissions so as to make the enormous number of answers easier to navigate and read. I can imagine a lot of creative ways people could re-format all that text and make it much more accessible and fun.

Finally, for reference, in addition to my submission I wrote this blog post a couple months ago suggesting goals the government set for itself as part of its Open Government Partnership commitments. Happily, since writing that post, the government has moved on a number of those recommendations.

So, below is my response to the government’s questions (in bold):

What could be done to make it easier for you to find and use government data provided online?

First, I want to recognize that a tremendous amount of work has been done to get the present website and number of data sets up online.

FINDING DATA:

My advice on making data easier to find is to engage Socrata to create the front end. Socrata has an enormous amount of experience in how to share government data effectively. Consider http://data.oregon.gov – here is a site that is clean, easy to navigate and offers a number of ways to access and engage with the government’s data.

More specifically, what works includes:

1. Effective search: a simple search mechanism returns all results
2. Good filters: Because the data is categorized by type (Internal vs. external, charts, maps, calendars, etc…) it is much easier to filter. One thing not seen on Socrata that would be helpful would be the ability to sort by ministry.
3. Preview: Once I choose a data set I’m given a preview of what it looks like, which enables me to assess whether or not it is useful
4. Social: Here there is a ton on offer
– I’m able to sort data sets by popularity – being able to see what others find interesting is, in and of itself, interesting.
– Being able to easily share data sets via email, or Twitter and Facebook, means I’m more likely to find something interesting because friends will tell me about it
– Data sets can also be commented upon so I can see what others think of the data, whether they think it is useful, and what for.
– Finally, it would be nice if citizens could add metadata, to make it easier for others to do keyword searches. If the government was worried about the wrong metadata being added, one could always offer a search with crowdsourced metadata included or excluded
5. Tools: Finally, there are a large number of tools that make it easier to quickly play with and make use of the data, regardless of one’s skills as a developer. This makes the data much more accessible to the general public.

USING DATA

Finding data is part of the problem; being able to USE the data is a much bigger issue.

Here the single most useful thing would be to offer APIs into government data. My own personal hope is that one day there will be a large number of systems both within and outside of government that will integrate government data right into their applications. For example, as I blogged about here – https://eaves.ca/2011/02/18/sharing-critical-information-with-public-lessons-for-governments/ – product recall data would be fantastic to have as an API so that major retailers could simply query the API every time they scan inventory in a warehouse or at the point of sale; any product that appears on the list could then be automatically removed. Internally, Borders and Customs could also query the API when scanning exports to ensure that nothing exported is recalled.
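To illustrate the retailer scenario, here is a rough Python sketch. No such recall API exists as far as this submission is concerned, so `RecallFeed` is a hypothetical in-memory stand-in for the remote service, and the product codes are made up. The point is only to show how little logic a retailer would need on its side.

```python
class RecallFeed:
    """Hypothetical stand-in for a government product recall API.

    A real client would call a remote service; this one keeps the
    recalled product codes in memory so the example runs offline.
    """

    def __init__(self, recalled_codes):
        self._recalled = set(recalled_codes)

    def is_recalled(self, product_code):
        return product_code in self._recalled


def screen_inventory(scanned_codes, feed):
    """Split scanned product codes into (keep, pull) lists."""
    keep, pull = [], []
    for code in scanned_codes:
        if feed.is_recalled(code):
            pull.append(code)  # appears on the recall list: remove from sale
        else:
            keep.append(code)
    return keep, pull


# Made-up codes, for illustration only.
feed = RecallFeed(["0001"])
keep, pull = screen_inventory(["0001", "0002"], feed)
```

The same `screen_inventory` call could run at a warehouse scanner, a checkout, or an export inspection point, which is why a single shared API would be so much more useful than a list published on a web page.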

Second, if companies and non-profits are going to invest in using open data, they need assurances both that they are legally allowed to use the data and that the data isn’t going to suddenly disappear on them. This means a robust license that is clear about reuse. The government would be wise to adopt the OGL or even improve on it. Better still, helping establish a standardized open data license for Canada, and ideally internationally, could help reduce some legal uncertainty for more conservative actors.

More importantly, and missing from Socrata’s sites, would be a way of rating data sets on how secure their longevity is. For example, data sets that are required by legislation – such as the NPRI – are the least likely to disappear, whereas data sets like the long form census, which have no legal protection, could be seen as at higher risk.

 

How would you use or manipulate this data?

I’m already involved in a number of projects that use and share government data. Among those are Emitter.ca – which maps and shares NPRI pollution data and Recollect.net, which shares garbage calendar information.

While I’ve seen dramatically different uses of data, for me personally, I’m interested mostly in using data for thinking and writing about public policy issues. Indeed, much has been made of the use of data in “apps” but I think it is worth noting that the single biggest use of data will be in analysis – government officials, citizens, academics and others using the data to better understand the world around them and lobby for change.

This all said, there are some data sets that are of particular usefulness to people. These include:

1. Data sets on sensitive issues: this includes health, inspection and performance data (say, surgery outcomes for specific hospitals, or restaurant inspection data; crime and procurement data are also often in great demand).
2. Dynamic real-time data: Data that is frequently updated (such as border, passport renewal or emergency room wait times). This data, if shared in the right way, can often help people adjust schedules and plans or reallocate resources more effectively. Obviously this requires an API.
3. Geodata: Because GIS standards are very mature it is easy to “mashup” geo data to create new maps or offer new services. These common standards mean that geo data from different sources will work together or can be easily compared. This is in sharp contrast to, say, budget data, where there are few common standards around naming and organizing the data, making it harder to share and compare.

What could be done to make it easier for you to find government information online?

It is absolutely essential that all government records be machine readable.

Some of the most deplorable moments in open government occur when the government shares documents with the press, citizens or parliamentary officers in paper form. The first and most important thing to make government information easier to find online is to ensure that it is machine readable and searchable by words. If it does not meet these criteria I increasingly question whether or not it can be declared open.

As part of the Open Government Partnership commitments it would be great for the government to commit to guarantee that every request for information made of it would include a digital version of the document that can be searched.

Second, the government should commit that every document it publishes be available online. For example, I remember in 2009 being told that if I wanted a copy of the Health Canada report “Human Health in a Changing Climate: A Canadian Assessment of Vulnerabilities and Adaptive Capacity” I had to request a CD with a PDF copy of the report on it, which was then mailed to me. Why was the report not simply available for download? Because the Minister had ordered it not to appear on the website. Instead, I as a taxpayer had to see more of my tax dollars wasted for someone to receive my mail, process it, then mail me a custom printed CD. Enabling ministers to create barriers to accessing government information, simply because they do not like the contents, is an affront to the use of taxpayer dollars and our right to access information.

Finally, allow government scientists to speak directly to the media about their research.

It has become a recurring embarrassment. Scientists who work for Canada publish an internationally recognized, groundbreaking paper that provides some insight about the environment or geography of Canada, and journalists must talk to government scientists from other countries in order to get the details. Why? Because the Canadian government blocks access. Canadians have a right to hear the perspectives of scientists their tax dollars paid for – and to enjoy the opportunity to get as well informed as the government on these issues.

Thus, lift the ban that blocks government scientists from speaking with the media.

 

Do you have suggestions on how the Government of Canada could improve how it consults with Canadians?

1. Honour Consultation Processes that have started

The process of public consultation is insulted when the government itself intervenes to bring the process into disrepute. The first thing the government could do to improve how it consults is not sabotage processes that are already ongoing. The recent letter from Natural Resources Minister Joe Oliver regarding the public consultation on the Northern Gateway Pipelines has damaged Canadians’ confidence in the government’s willingness to engage in and make effective use of public consultations.

2. Focus on collecting and sharing relevant data

It would be excellent if the government shared relevant data from its data portal on the public consultation webpage. For example, in the United States, the government shares a data set with the number and location of spills generated by Enbridge pipelines; similar data for Canada would be ideal to share on a consultation. Also useful would be economic figures, job figures for the impacted regions, perhaps also data from nearby parks (visitations, acres of land, kml/shape boundary files). Indeed, data about the pipeline route itself that could be downloaded and viewed in Google Earth would be interesting. In short, there are all sorts of ways in which open data could help power public consultations.

3. Consultations should be ongoing

It would be great to see a 311-like application for the federal government: something that, when loaded up, would use GPS to identify the services, infrastructure or other resources near the user that are operated by the federal government and allow the user to give feedback right then and there. Such “ongoing” public feedback could then be used as data when a formal public consultation process is kicked off.
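As a rough sketch of the “what is near me” step such an application would need, the Python below filters a list of federally operated sites by great-circle distance from the user’s GPS position, using the standard haversine formula. The site list, coordinates, field names and 2 km radius are all made up for illustration; a real application would pull them from an open data set.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_services(user_lat, user_lon, services, radius_km=2.0):
    """Return names of services within radius_km, nearest first."""
    hits = []
    for service in services:
        distance = haversine_km(user_lat, user_lon,
                                service["lat"], service["lon"])
        if distance <= radius_km:
            hits.append((distance, service["name"]))
    return [name for _, name in sorted(hits)]


# Hypothetical sites with made-up coordinates.
SERVICES = [
    {"name": "Passport office", "lat": 49.2850, "lon": -123.1150},
    {"name": "Border crossing", "lat": 49.0020, "lon": -122.7570},
]

# A user standing in downtown Vancouver.
result = nearby_services(49.2827, -123.1207, SERVICES)
```

Feedback submitted against a named site then arrives already tagged with a location and a service, which is what would make it usable as data in a later formal consultation.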

 

Are there approaches used by other governments that you believe the Government of Canada could/should model?

1. The UK government’s expense disclosure, and its release of the COINS database more generally, is probably the most radical act of government transparency to date. Given the government’s interest in budget cuts this is one area that might be of great interest to pursue.

2. For critical data sets, those that are either required by legislation or essential to the operation of a ministry or the government generally, it would be best to model the city of Chicago or Washington DC and foster the creation of a data warehouse where this data could be easily shared both internally and externally (as privacy and security permits). These cities are leading governments in this space because they have tackled both the technical challenges (getting the data on a platform where it can be shared easily) and the governance challenges (tackling the problem of managing data sets from various departments on a shared piece of infrastructure).

 

Are there any other comments or suggestions you would like to make pertaining to the Government of Canada’s Open Government initiative?

Some additional ideas:

Redefine Public as Digital: Pass an Online Information Act

a) Any document the government produces should be available digitally, in a machine readable format. The sham that the government can produce 3000-10,000 printed pages about Afghan detainees or the F-35 and claim it is publicly disclosing information must end.

b) Any data collected for legislative reasons must be made available – in machine readable formats – via a government open data portal.

c) Any information that is ATIPable must be made available in a digital format. Any excess costs of generating that information can be borne by the requester up until a certain date (say 2015), at which point the excess costs will be borne by the ministry responsible. There is no reason why, in a digital world, there should be any cost to extracting information – indeed, I fear a world where the government can’t cheaply locate and copy its own information for an ATIP request as it would suggest it can’t get that information for its own operations.

Use Open Data to drive efficiency in Government Services: Require the provinces to share health data – particularly hospital performance – as part of its next funding agreement within the Canada Health Act.

Comparing hospitals to one another is always a difficult task, and open data is not a panacea. However, more data about hospitals is rarely harmful and there are a number of issues on which it would be downright beneficial. The most obvious of these would be deaths caused by infection. The number of deaths that occur due to infections in Canadian hospitals is a growing problem (sigh, if only open data could help ban the antibacterial wipes that are helping propagate them). Having open data that allows for league tables to show the scope and location of the problem will likely cause many hospitals to rethink processes and, I suspect, save lives.

Open data can supply some of the competitive pressure that is often lacking in a public healthcare system. It could also better educate Canadians about their options within that system, as well as make them more aware of its benefits.

Reduce Fraud: Creating a Death List

In an era where online identity is a problem it is surprising to me that I’m unable to locate a database of expired social insurance numbers. Being able to query a list of social insurance numbers that belong to dead people might be a simple way to prevent fraud. Interestingly, the United States has just such a list available for free online. (Side fact: Known as the Social Security Death Index, this database is also beloved by genealogists, who use it to trace ancestry.)

Open Budget and Actual Spending Data

For almost a year the UK government has published all spending data, month by month, for each government ministry (down to £500 in some, £25,000 in others). Moreover, as an increasing number of local governments are required to share their spending data, it has led to savings, as governments begin to learn what other ministries and governments are paying for similar services.

Create a steering group of leading Provincial and Municipal CIOs to create common schema for core data about the country.

While open data is good, open data organized the same way for different departments and provinces is even better. When data is organized the same way it makes it easier for citizens to compare one jurisdiction against another, and for software solutions and online services to emerge that use that data to enhance the lives of Canadians. The Federal Government should use its convening authority to bring together some of the country’s leading government CIOs to establish common data schemas for things like crime, healthcare, procurement, and budget data. The list of what could be worked on is virtually endless, but those four areas all represent data sets that are frequently requested, so might make for a good starting point.
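To show what a common schema buys you, here is a small Python sketch that maps spending records from two jurisdictions onto one shared set of fields. The jurisdictions, field names and numbers are all hypothetical; the real work for a CIO steering group would be agreeing on the contents of `COMMON_FIELDS`.

```python
# Shared schema every jurisdiction would publish to (assumed field names).
COMMON_FIELDS = ("jurisdiction", "category", "year", "amount_cad")

# How each hypothetical jurisdiction's local column names map to the schema.
FIELD_MAPS = {
    "province_a": {"dept": "category", "fiscal_year": "year", "spend": "amount_cad"},
    "city_b": {"program": "category", "yr": "year", "dollars": "amount_cad"},
}


def to_common(jurisdiction, record):
    """Translate one local record into the shared schema."""
    row = {"jurisdiction": jurisdiction}
    for local_name, common_name in FIELD_MAPS[jurisdiction].items():
        row[common_name] = record[local_name]
    # Emit the fields in a fixed order so outputs line up across sources.
    return {field: row[field] for field in COMMON_FIELDS}


# Made-up records in each jurisdiction's own layout.
rows = [
    to_common("province_a", {"dept": "health", "fiscal_year": 2011, "spend": 100.0}),
    to_common("city_b", {"program": "health", "yr": 2011, "dollars": 40.0}),
]
```

Without a shared schema, every developer who wants to compare jurisdictions has to write and maintain a mapping like `FIELD_MAPS` themselves. With one, the mapping disappears and the comparison is a one-line filter.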

Statistics Canada Data to become OpenData – Background, Winners and Next Steps

As some of you learned last night, Embassy Magazine broke the story that all of Statistics Canada’s online data will not only be made free, but released under the Government of Canada’s Open Data License Agreement (updated and reviewed earlier this week) that allows for commercial re-use.

This decision has been in the works for months, and while it does not appear to have been formally announced, Embassy Magazine does appear to have managed to get a Statistics Canada spokesperson to confirm it is true. I have a few thoughts about this story: Some background, who wins from this decision, and most importantly, some hope for what it will, and won’t lead to next.

Background

In the Embassy article, the spokesperson claimed this decision had been in the works for years, something that is probably technically true. Such a decision – or something akin to it – has likely been contemplated a number of times. And there have been a number of trials and projects that have allowed for some data to be made accessible, albeit under fairly restrictive licenses.

But it is less clear that the culture of open data has arrived at StatsCan, and less clear to me that this decision was internally driven. I’ve met many a StatsCan employee who encountered enormous resistance while advocating for open data. I remember pressing the issue during a talk at one of the department’s middle managers’ conferences in November of 2008 and seeing half the room nod vigorously in agreement, while the other half crossed its arms in strong disapproval.

Consequently, with the federal government increasingly interested in open data, coupled with a desire to have a good news story coming out of StatsCan after last summer’s census debacle, and with many decisions in Ottawa happening centrally, I suspect this decision occurred outside the department. This does not diminish its positive impact, but it does mean that a number of the next steps, many of which will require StatsCan to adapt its role, may not happen as quickly as some will hope, as the organization may take some time to come to terms with the new reality and the culture shift it will entail.

This may be compounded by the fact that there may be tougher news on the horizon for StatsCan. With every department required to have submitted proposals to cut their budgets by either 5% or 10%, and with StatsCan having already seen a number of its programs cut, there may be fewer resources in the organization to take advantage of the opportunity that making its data open creates, or even just to adjust to what has happened.

Winners (briefly)

The winners from this decision are, of course, consumers of StatsCan’s data. Indirectly, this includes all of us, since provincial and local governments are big consumers of StatsCan data and so now – assuming it is structured in such a manner – they will have easier (and cheaper) access to it. This is also true of large companies and non-profits which have used StatsCan data to locate stores, target services and generally allocate resources more efficiently. The opportunity now opens for smaller players to also benefit.

Indeed, this is the real hope. That a whole new category of winners emerges. That the barrier to use for software developers, entrepreneurs, students, academics, smaller companies and non-profits will be lowered in a manner that will enable a larger community to make use of the data and therefore create economic or social goods.

Such a community, however, will take time to evolve, and will benefit from support.

And finally, I think StatsCan is a winner. This decision brings it more profoundly into the digital age. It opens up new possibilities and, frankly, pushes a culture change that I believe is long overdue. I suspect times are tough at StatsCan, although not as a result of this decision; this decision creates room to rethink how the department works and thinks.

Next Steps

The first thing everybody will be waiting for is to see exactly what data gets shared, in what structure and to what detail. Indeed this question arose a number of times on Twitter with people posting tweets such as “Cool. This is all sorts of awesome. Are geo boundary files included too, like Census Tracts and postcodes?” We shall see. My hope is yes and I think the odds are good. But I could be wrong, at which point all this could turn into the most overhyped data story of the year. (Which actually matters now that data analysts are one of the fastest growing categories of jobs in North America.)

Second, open data creates an opportunity for a new and more relevant role for StatsCan to a broader set of Canadians. Someone from StatsCan should talk to the data group at the World Bank about their transformation after they launched their open data portal (I’d be happy to make the introduction). That data portal now accounts for a significant portion of all the Bank’s web traffic, and the group is going through a dramatic transformation, realizing they are no longer curators of data for bank staff and a small elite group of clients around the world but curators of economic data for the world. I’m told that, while the change has not been easy, a broader set of users has brought a new sense of purpose and identity. The same could be true of StatsCan. Rather than just an organization that serves the government of Canada and a select group of clients, StatsCan could become the curator of data for all Canadians. This is a much more ambitious but, I’d argue, more democratized and important goal.

And it is here that I hope other next steps will unfold. In the United States (which has had free census data for as long as anyone I talked to can remember), whenever new data is released the Census Bureau runs workshops around the country, educating people on how to use and work with its data. StatsCan and a number of other partners already do some of this, but my hope is that there will be much, much more of it. We need a society that is significantly more data literate, and StatsCan, along with the universities, colleges and schools, could have a powerful role in cultivating this. Tracey Lauriault over at the DataLibre blog has been a fantastic advocate of such an approach.

I also hope that StatsCan will take its role as data curator for the country very seriously and think of new ways that its products can foster economic and social development. Offering APIs into its data sets would be a logical next step, something that would allow developers to embed census data right into their applications and ensure the data was always up to date. No one is expecting this to happen right away, but it was another question that arose on Twitter after the story broke, so one can see that new types of users will be interested in new and more efficient ways of accessing the data.

But I think most importantly, the next step will need to come from us citizens. This announcement marks a major change in how StatsCan works. We need to be supportive, particularly at a time of budget cuts. While we are grateful for open data, it would be a shame if the institution that makes it all possible was reduced to a shell of its former self. Good quality data – and analysis to inform public policy – is essential to a modern economy, society, and government. Now that we will have free access to what our tax dollars have already paid for, let’s make sure that it stays that way, by ensuring both that it continues to be available and that there continues to be a quality institution capable of collecting and analyzing it.

(sorry for typos – it’s 4am, will revise in the morning)

The New Government of Canada Open Data License: The OGL by another name

Last week Minister Clement issued a press release announcing some of the progress the government has made on its Open Government Initiatives. Three things caught my eye.

First, it appears the government continues to revise its open data license with things continuing to trend in the right direction.

As some of you will remember, when the government first launched data.gc.ca it had a license that was so onerous that it was laughable. While several provisions were problematic, my favourite was the sweeping “only-make-us-look-good” clause which said, word for word: “You shall not use the data made available through the GC Open Data Portal in any way which, in the opinion of Canada, may bring disrepute to or prejudice the reputation of Canada.”

After I pointed out the problems with this clause to then Minister Day, he managed to have it revoked within hours – very much to his credit. But it is a good reminder of the starting point of the government license and of the mindset of Government of Canada lawyers.

With the new license, almost all the clauses that would obstruct commercial and non-profit reuse have effectively been eliminated. It is no longer problematic to identify individual companies and the attribution clauses have been rendered slightly easier. Indeed, I would argue that the new license has virtually the same constraints as the UK Open Government License (OGL) and even the Creative Commons CC-BY license.

All this raises the question… why not simply use the language and structure of the OGL, in much the same manner that the British Columbia Government tried to with its own BC OGL? Such a standardized license across jurisdictions might be helpful; it would certainly simplify life for think tanks, academics, developers and other users of the data. This is something I’m pushing for and hope that we might see progress on.

Second, the idea that the government is going to post completed access to information (ATIP) requests online is also a move in the right direction. I suspect that the most common ATIP request is one that someone else has already made. Being able to search through previous requests would enable you to find what you are looking for without having to wait weeks or make public servants redo the entire search and clearing process. What I don’t understand is why only the summaries will be posted. In a digital world it would be better for citizens, and cheaper for the government, to simply post the entire request whenever privacy policies wouldn’t prevent it.

Third, and perhaps most important, were the lines noting that “That number (of data sets) will continue to grow as the project expands and more federal departments and agencies come onboard. During this pilot project, the Government will also continue to monitor and consider national and international best practices, as well as user feedback, in the licensing of federal open data.”

This means that we should expect more data to hit the site. It seems as though more departments are being asked to figure out what data they can share – hopefully this means that real, interesting data sets will be made public. In particular, one hopes that data sets which legislation mandates the government collect will be high on the list of priorities. Also interesting in this statement is the suggestion that the government will consider national and international best practices. I’ve talked to both the Minister and officials about the need to create common standards and structures for open data across jurisdictions. Fostering and pushing these is an area where the government could take a leadership role, and it looks like there may be interest in this.

 

Gov 2.0: Network Analysis for Income Inequality?

I’ve been thinking a lot about these two types of graphs at the moment. The first is a single chart that shows income growth for various segments of the US population broken down by wealth.

The second is a group of graphs that talk about pageviews and visits to various websites on the internet.

[Images: bits-tue71-custom2; Top 10 Social Networking Sites by Market Share of Visits, June 2011; July Search Engine Market Share]

What is fascinating about the internet stats is that they are broadly talking about distribution among the top websites – forget about everyone else, where the pageviews become infinitesimally small. So even the top websites follow a power law distribution, which must be even stronger once one starts talking about all websites.

And this is what I’m frequently told: that the distribution of pageviews, visits and links on the internet looks a lot like the first graph, although possibly even more radically skewed.

In other words, while the after-tax income chart isn’t a clean curve, the trends of the two are likely very similar – except that the top 1% of websites do even better than the top 1% of after-tax income earners. So both charts look like power law distributions.
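
For readers who want a feel for how skewed a power law can be, here is a rough sketch. It assumes a Zipf-style distribution in which the k-th most popular site gets traffic proportional to 1 / k^exponent. The function name, the 10,000-site universe and the exponents are my own illustrative assumptions, not measurements of the real web or of the income data.

```python
def top_share(n_sites: int, top_fraction: float, exponent: float) -> float:
    """Share of total traffic captured by the top `top_fraction` of sites.

    Assumes the k-th ranked site receives traffic proportional to
    1 / k**exponent (a Zipf-style power law). Purely illustrative.
    """
    weights = [rank ** -exponent for rank in range(1, n_sites + 1)]
    top_n = max(1, round(n_sites * top_fraction))
    return sum(weights[:top_n]) / sum(weights)


if __name__ == "__main__":
    # exponent 0.0 means every site gets equal traffic (no skew at all);
    # larger exponents concentrate traffic in the top-ranked sites.
    for exponent in (0.0, 1.0, 1.5):
        share = top_share(n_sites=10_000, top_fraction=0.01, exponent=exponent)
        print(f"exponent {exponent}: top 1% of sites get {share:.1%} of traffic")
```

The point of the exercise is only that a modest change in the exponent moves the top 1% share a long way, which is the kind of sensitivity one would want to compare against the income chart.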

Does this matter? I’m not sure, but I’m playing with some thoughts. While I’m confident that the income chart’s power law distribution has replicated itself several times in history (such as during the lead-up to the Great Depression), what is less clear to me is whether the exponential growth has ever happened so fast (it would be fascinating to know if others have written on this). The rich have often gotten richer – but have they gotten richer this quickly before?

And is this what happens in a faster, more networked economy? Maybe the traits of the online network and its power law distribution are beginning to impact the socioeconomic network of our society at large?

Could this also mean that we need some new ways to ensure social and economic mobility in our economy and society? Network effects are obviously powerful online, but have also, historically, been important offline. In society, your location on that curve creates advantages: it likely gives you access to peers and capital which position you to maintain your status in the network. Perhaps the internet, rather than making the network that is our society more fluid, is actually doing the opposite. It is intensifying the power law distribution, meaning the network effects are getting stronger, further reinforcing advantages and disadvantages. This might have important implications for social and economic mobility.

Either way, applying some network analysis to income inequality and social mobility as well as the social programs we put in place to ensure equality of opportunity, might be a good frame on these problems. I’d love to read anything anyone has written on this – very much open to suggestions.

Using Open Data to drive good policy outcomes – Vancouver’s Rental Database

One of the best signs for open data is when governments start to grasp its potential to achieve policy objectives. Rather than just being about compliance, it is seen as a tool that can support the growth and management of a jurisdiction.

This is why I was excited to see Vision Vancouver (with which I’m involved generally, though I was not involved in the development of this policy proposal) announce the other day that, if elected, it intends to create a comprehensive online registry that will track work orders and property violations in Vancouver apartments, highlighting negligent landlords and giving a new tool to empower renters.

As the press release goes on to state, the database is “Modeled after a successful on-line watchlist created by New York City’s Public Advocate, the database will allow Vancouver residents to search out landlords and identify any building or safety violations issued by the City of Vancouver to specific rental buildings.”

Much like the pieces I’ve written around restaurant inspection and product recall data, this is a great example of a data set that, when shared the right way, can empower citizens to make better choices and foster better behaviour from landlords.

My main hope is that in the implementation of this proposal, the city does the right thing and doesn’t just create a searchable database on its own website, but actually creates an API that software developers and others can tap into. If they do this, someone may develop a mobile app for renters that would show you the repair record of the building you are standing in front of, or in. This could be very helpful for renters; one could even imagine an app where you SMS the postal code of a rental building and it sends you back some basic information. Also exciting to me is the possibility that a university student might look for trends in the data over time. Maybe there is an analysis that may yield an insight that could help landlords mitigate problems and reduce the number of repairs they have to make (and so help reduce their costs).

But if Vancouver and New York actually structured the data in the same way, it might create an incentive for other cities to do the same. That might entice some of the better known services to use the data to augment their offerings as well. Imagine if PadMapper, in addition to allowing a prospective renter to search for apartments based on rent costs and number of rooms, could also search based on number of infractions?
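
To make that concrete, here is a minimal sketch of what a shared structure could enable. Everything in it is hypothetical: the field names, the `building_id` key and the sample records are my own inventions for illustration, not the format used by Vancouver, New York or PadMapper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    city: str
    building_id: str   # hypothetical key shared by listings and violations
    issued: str        # ISO date, e.g. "2011-06-01"
    description: str


@dataclass(frozen=True)
class Listing:
    building_id: str
    rent: int          # monthly rent in dollars
    bedrooms: int


def count_violations(violations):
    """Return a dict mapping building_id to its number of violations."""
    counts = {}
    for violation in violations:
        counts[violation.building_id] = counts.get(violation.building_id, 0) + 1
    return counts


def search(listings, violations, max_rent, min_bedrooms, max_violations):
    """Filter listings by rent, size and number of recorded violations."""
    counts = count_violations(violations)
    return [
        listing
        for listing in listings
        if listing.rent <= max_rent
        and listing.bedrooms >= min_bedrooms
        and counts.get(listing.building_id, 0) <= max_violations
    ]


# Made-up sample data, purely for illustration.
sample_listings = [
    Listing("b1", rent=1200, bedrooms=1),
    Listing("b2", rent=1500, bedrooms=2),
    Listing("b3", rent=1400, bedrooms=2),
]
sample_violations = [
    Violation("Vancouver", "b2", "2011-03-01", "No heat"),
    Violation("Vancouver", "b2", "2011-04-12", "Broken fire alarm"),
    Violation("Vancouver", "b2", "2011-05-30", "Pest infestation"),
    Violation("Vancouver", "b3", "2011-02-14", "Blocked exit"),
]

results = search(sample_listings, sample_violations,
                 max_rent=1600, min_bedrooms=2, max_violations=1)
print([listing.building_id for listing in results])
```

Because the search only depends on the shape of the records, the same code would work for any city that published its violations in that shape, which is the whole argument for a common structure.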

[Image: PadMapper rental search]

That might have a salutary effect on some (but sadly not all) landlords. All in all, an exciting step forward from my friends at Vision who brought open data to Canada.

Neo-Progressive Watch: Rahm Emanuel vs. Teachers Union

Anyone who read Obama’s book, The Audacity of Hope, will have been struck by the amount of time the then aspiring presidential candidate spent writing about public education policy. More notably, he seemed to acknowledge that any effort at education reform was, at some point, going to butt heads with teachers’ unions and that new approaches were either going to have to be negotiated or imposed. It was a point of tension that wasn’t much talked about in the reviews I read. But it always struck me as interesting that here was Obama, a next generation progressive, railing against the conservatism of what is possibly the original pillar of the progressive movement: public education.

All of this has, of course, been decidedly forgotten given both the bigger problems the president has faced and the fact that he’s been basically uninterested in monkeying around in public education policy since taking office. That’s why it is still more fascinating to see what his disciples are doing as they get involved in levels of government that are in more direct contact with this policy area. Here, none is more interesting to watch than Rahm Emanuel.

This Saturday my friend Amy L. pointed me to a New York Times article outlining the most recent battle between Rahm Emanuel and the teachers’ union. My own take is that the specifics of the article are irrelevant; what matters is the broad theme. In short, Rahm Emanuel is on a short timeline. He needs to produce results immediately, since local elections happen more frequently and local government is much, much closer to the citizen. That said, he doesn’t have to deliver uniform results; progress, in and of itself, may be sufficient. Indeed, a little experimentation is profoundly good given it can tease out faster and cheaper ways to deliver said results.

In contrast, the teachers’ union faces few of the pressures experienced by Rahm. It can afford to move at a slower pace and, more importantly, wants a uniform level of treatment across the entire system. Indeed, its entire structure is built around the guarantee of uniform treatment for its members. This uniformity is a value that evolved in parallel to, but not out of, progressive thinking. It is an artifact of industrial production that gets confused with progressive thought because of the common temporal lineage.

This skirmish offers a window into the major battle that is going to dominate our politics in about a decade. I increasingly suspect we are moving into a world where the possibilities for education, thanks to the web and social networks, are going to be completely altered. What we deem possible, what parents demand, and the skills that are seen as essential are all going to shift. Our educational system, its schools, the school boards and, of course, the unions, are still bound in a world of mass production – shifting students from room to room to prepare them for the labour and production jobs of the 20th century. No matter how gifted the teachers (and there are many who are exceedingly gifted), they remain bound by the structure that the education system, the school boards and the unions have built and enforce.

Of course, what is going to be in demand are students who can thrive in the world of mass collaboration and peer production in the 21st century – behaviours that are generally viewed as “cheating” in the current model. And parents who are successful in 21st century jobs are going to be the first to ensure their children get the “right” kind of education. Which is going to put them at odds with the current education system.

All this is to say that the real question is: how quickly will educational systems be able to adapt? Here both the school boards and the unions play an enormous role, but it is the unions that, it would appear, may be a constraining factor. If they find that having Rahm engage schools directly feels like a threat, I suspect they are going to find the next 20 years a rough, rough ride. Something akin to how the newspapers have felt regarding the arrival of the internet and Craigslist.

What terrifies me most is that unless we can devise a system where teachers are measured, so that good results can be both rewarded and shared… and where parents and students have more choices around education, then families (that can afford to) are going to vote with their feet. In fact, you already see it in my home town.

The myth in Vancouver is that high property values are driving families – and thus children – out of the city. But this is patently not true. The fantastic guys over at Bing Thom Architects wrote a report on student populations in Vancouver. According to their research, in the last 10 years the estimated number of elementary and secondary aged children in Vancouver has risen by 3% (around 2,513 new students). And yet, the number of students enrolled in public education facilities has declined by 5.46% (around 3,092 students). In fact, the Vancouver School Board’s numbers seem to indicate the decline may be more pronounced.

In the meantime the number of private/independent schools has exploded by 43% going from 39 to 68 with enrollment increases of 13.8%. (Yes that does leave a surplus of students unaccounted for, I suspect they are also in private/independent schools, but outside of the City of Vancouver’s boundaries). As a public school graduate myself, one who had truly fantastic teachers but who also benefited from enormous choice (IB, French Immersion) the numbers of the past decade are very interesting to immerse oneself in.

Correct or incorrect, it would seem parents are opting for schools that offer a range of choices around education. Of course, it is only the parents who can afford to do this that are doing it. But that makes the outcome worse, not better. With or without the unions, education is going to get radically rethought. It would be nice if it was the public sector that led that revolution, or at least was in the vanguard of it. But if our public sector managers and teachers are caught arguing over how to adjust the status quo by increments, it is hard to see how our education policy is going to make a quantum leap into the 21st century.

Interview with Charles Leadbeater – Monday September 19th

I’m excited to share that I’ll be interviewing British public policy and open innovation expert Charles Leadbeater on September 19th as part of SIG’s webinar series. For readers not familiar with Charles Leadbeater, he is the author of We-Think and numerous chapters, pamphlets and articles ranging in focus from social innovation to entrepreneurship to public sector reform. He served as an adviser to Tony Blair and has a long-standing relationship with the British think tank Demos.

Our conversation will initially focus on open innovation, but I’m sure will range all over, touching on the impact of open source methodologies on the private, non-profit and public sector, the future of government services and, of course, the challenges and opportunities around open data.

If you are interested in participating in the webinar you can register here. I’m told a small fee is being charged to recover some of the costs of running the event.

If you are participating and have a question you’d like to see asked, or a theme or topic you’d like to see covered, please feel free to comment below or, if you prefer more discretion, send me an email.

The Economics of Open Data – Mini-Case, Transit Data & TransLink

TransLink, the company that runs public transit in the region where I live (Vancouver/Lower Mainland), has launched a real time bus tracking app that uses GPS data to figure out how far away the bus you are waiting for really is. This is great news for everyone.

Of course for those interested in government innovation and public policy it also leads to another question. Will this GPS data be open data?

Presently TransLink does make its transit schedule “open” under a non-commercial license (you can download it here). I can imagine a number of senior TransLink officials (and the board) scratching their heads asking: “Why, when we are short of money, would we make our data freely available?”

The answer is that TransLink should make its current data, as well as its upcoming GPS data, open and available under a license that allows for both non-commercial and commercial re-use, not just because it is the right thing to do, but because the economics of it make WAY MORE SENSE FOR TRANSLINK.

Let me explain.

First, there are not a lot of obvious ways TransLink could generate wealth directly from its data. But let’s take two possible opportunities: the first involves selling a transit app to the public (or advertising in such an app), the second is through selling a “next bus” service to companies (say coffee shops or organizations) that believe showing this information might be a convenience to their employees or customers.

TransLink has already abandoned doing paid apps – instead it maintains a mobile website at m.translink.ca – but even if it created an app and charged $1 per download, the revenue would be pitiful. Assuming a very generous customer base of 100,000 users, TransLink would generate maybe $85,000 (once Apple takes its cut from the iPhone downloads, and assuming zero cut for Android). But remember, this is not a yearly revenue stream; it is one time. Maybe 10,000-20,000 people upgrade their phone or arrive in Vancouver and decide to download the app every year, so year-on-year revenue is maybe $15K. Over a five-year period, TransLink ends up with an extra, say, $145,000. Nothing to sneeze at, but not notable.

In contrast, a free application encourages use, so there is also a cost to not giving it away. It could be that having transit data more readily available causes some people to choose transit over, say, walking, taking a taxi or driving. Last year TransLink handled 211.3 million trips. Let’s assume that wider access to the data meant a 0.1% increase in the number of trips. That is a tiny increase, but it means 211,300 more trips. Assuming each rider pays a one-zone $2.50 fare, that would still translate into additional revenue of $528,250 a year. Over the same five-year period cited above… that’s revenue of $2.641M, much better than $145,000. And this is just calculating money. Let’s say nothing of less congested roads, less smog and a lower carbon footprint for the region…
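
For anyone who wants to check or tweak the back-of-the-envelope math, here is a small sketch. All the figures are the assumptions stated in the two paragraphs above; the variable names are mine, and the five-year app total simply adds four years of repeat downloads to the first-year revenue.

```python
# Scenario A: sell a $1 app (figures are the post's assumptions).
first_year_app_revenue = 85_000   # 100,000 downloads, after the app store's cut
repeat_app_revenue = 15_000       # revenue from new downloads in each later year
years = 5

app_total = first_year_app_revenue + repeat_app_revenue * (years - 1)

# Scenario B: give the data away and assume a 0.1% lift in ridership.
annual_trips = 211_300_000
ridership_lift = 0.001            # 0.1%
fare = 2.50                       # one-zone fare in dollars

extra_trips = round(annual_trips * ridership_lift)
fare_revenue_per_year = extra_trips * fare
fare_total = fare_revenue_per_year * years

print(f"Paid app over {years} years:    ${app_total:,.0f}")
print(f"Extra trips per year:         {extra_trips:,}")
print(f"Extra fares per year:         ${fare_revenue_per_year:,.0f}")
print(f"Extra fares over {years} years:  ${fare_total:,.0f}")
```

Changing `ridership_lift` is the interesting experiment: even a lift far smaller than 0.1% has to fall a long way before the paid app comes out ahead.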

When this analysis is applied to licensing data it produces the same result. Will UBC pay to have TransLink’s real time data on terminals in the Student Union building? I doubt it. Would some strategically placed coffee shops? Possibly. Obviously organizations would have to pay for the signs, but adding an annual “data license fee” to the display’s cost would cause some to opt out. And once you take into account managing the signs, legal fees, dealing with the contract and going through the sales process, it is almost inconceivable that TransLink would make more money from these agreements than it would from simply having more signs everywhere, created by other people, that generated more customers for its actual core business: moving people from A to B for a fee. Just to show you the numbers, if shops that weren’t willing to pay for the data put up “next bus” screens that generated a mere 1,000 new regular bus users who did only 40 one-way trips a year (or 40,000 new trips), this would equal revenue of $100,000 every year at no cost to TransLink. Someone else could install and maintain the signs, and no contracts or licenses would need to be managed.
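
The same arithmetic can be sketched for the signs. The rider, trip and fare figures are the ones from the paragraph above; the `license_fee_needed` helper and the idea of 50 paying shops are hypothetical, added only to show what a license fee would have to look like to match free signs before any sales or legal overhead.

```python
# Figures from the post: 1,000 new regular riders, 40 one-way trips each, $2.50 fare.
new_riders = 1_000
trips_per_rider_per_year = 40
fare = 2.50

new_trips = new_riders * trips_per_rider_per_year
sign_driven_revenue = new_trips * fare


def license_fee_needed(paying_shops: int) -> float:
    """Annual fee per shop needed to match the fare revenue from free signs.

    Ignores sales, legal and contract overhead, so the real break-even
    fee would be higher. `paying_shops` is a hypothetical input.
    """
    if paying_shops <= 0:
        raise ValueError("paying_shops must be positive")
    return sign_driven_revenue / paying_shops


print(f"New trips per year:       {new_trips:,}")
print(f"Fare revenue per year:    ${sign_driven_revenue:,.0f}")
print(f"Fee per shop (50 shops):  ${license_fee_needed(50):,.0f}")
```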

From a cost recovery perspective it is almost impossible to imagine a scenario where TransLink is better off not allowing commercial re-use of its data.

My point is that TransLink should not be focused on creating a few bucks from licensing its data (which it doesn’t do right now anyway). It should be focused on shifting the competitive value in the marketplace from access to accessibility.

Being the monopoly holder of transit data does not benefit TransLink. All it means is that fewer people see and engage with its data. When it makes the data open and available, “access” is no longer the defining advantage. When anybody (e.g. TransLink, Google, independent developers) can access the data, the marketplace shifts from competing on access to competing on accessibility. Consumers don’t turn to whoever has the data, they turn to whoever makes the data easiest to use.

For example, TransLink has noted that in 2011 it will have a record number of trips. Part of me wonders to what degree the increase in trips over the past few years is a result of making transit data accessible in Google Maps. (Has anyone done a study on this in any jurisdiction?) The simple fact is that Google Maps is radically easier to use for planning transit journeys than TransLink’s own website AND THAT IS A GOOD THING FOR TRANSLINK. Now imagine if lots of companies were sharing TransLink’s data, from the local Starbucks and Blenz Coffee to colleges, universities and busy buildings downtown. Indeed, the real crime right now is that TransLink has handed Google a de facto monopoly. Google is allowed to use the data for commercial purposes. Local tax-paying developers…? Not so, according to the license they have to click through.

TransLink, you want a world where everyone is competing (including against you) on accessibility. In the end… you win with greater use and revenue.

But let me go further. There are other benefits to having TransLink share its data for commercial re-use.

Procurement

Some riders will note that there are already bus stops in Vancouver which display “next bus” data (e.g. how many minutes away the next bus is). If TransLink made its next bus data freely available via an API it could conceivably alter the procurement process for buying and maintaining these signs. Any vendor could see how the data is structured and so take over the management of the signs, and/or experiment with creating more innovative or cheaper ways of manufacturing them.

The same is true of creating the RFP for TransLink’s website. With the data publicly available, TransLink could simply ask developers to mock up what they think is the most effective way of displaying the data. More development houses might be enticed to respond to the RFP, increasing the likelihood of innovations and putting downward pressure on fees.

Analysis

Of course, making GPS data free could have an additional benefit. Local news companies might be able to use the buses’ GPS data to calculate traffic flow rates and so predict traffic jams. Might they be willing to pay TransLink for the data? Maybe, but again probably not enough to justify the legal and sales overhead. Moreover, TransLink would benefit from this analysis, as it could use the reports to adjust its schedule and notify its drivers of problems beforehand. Of course everyone would benefit as well, as better informed commuters might change their behaviour (including taking transit!), reducing congestion, smog, carbon footprint, etc…

Indeed, the analysis opportunities using GPS data are potentially endless – much of which might be done by bloggers and university students. One could imagine correlating actual bus/subway times with any number of other data sets (crime, commute times, weather) that could yield interesting information that could help TransLink with its planning. There is no world where TransLink has the resources to do all this analysis, so enabling others to do it can only benefit it.
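
As a rough illustration of the traffic-flow idea, here is a minimal sketch that turns two consecutive GPS pings from the same bus into an average speed and flags slow stretches. The `Ping` structure, its field names and the 10 km/h threshold are assumptions of mine; I have no idea what TransLink’s real feed looks like, and a real analysis would need to account for bus stops and traffic lights.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Ping:
    bus_id: str
    timestamp: float  # seconds since the epoch
    lat: float        # degrees
    lon: float        # degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_speed_kmh(earlier: Ping, later: Ping) -> float:
    """Average speed between two pings from the same bus."""
    hours = (later.timestamp - earlier.timestamp) / 3600
    if hours <= 0:
        raise ValueError("pings must be in chronological order")
    distance = haversine_km(earlier.lat, earlier.lon, later.lat, later.lon)
    return distance / hours


def is_congested(earlier: Ping, later: Ping, threshold_kmh: float = 10.0) -> bool:
    """Flag a stretch as congested if the bus averaged under the threshold."""
    return average_speed_kmh(earlier, later) < threshold_kmh
```

Aggregating these flags by street segment and time of day is where the interesting (and harder) work would begin.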

Conclusion

So if you are at TransLink/Coast Mountain Bus Company (or any transit authority in the world), this post is for you. Here’s what I suggest as next steps:

1) Add GPS bus tracking API to your open data portal.

2) Change your license. Drop the non-commercial part. It hurts your business more than you realize and is anti-competitive (why can Google use the data for a commercial application while residents of the Lower Mainland cannot?). My suggestion: adopt the BC Government Open Government License or the PDDL.

3) Add an RSS feed to your GTFS data. Like Google, we’d all like to know when you update your data. Given we live here and are users, it would be nice to extend the same service to us as you do to them.

4) Maybe hold a Transit Data Camp where you could invite local developers and entrepreneurs to meet your staff and encourage people to find ways to get transit data into the hands of more Lower Mainlanders and drive up ridership!