Public Policy: The Big Opportunity For Health Record Data

A few weeks ago Colin Hansen – a politician in the governing party in British Columbia (BC) – penned an op-ed in the Vancouver Sun entitled Unlocking our data to save lives. It’s a paper both the current government and opposition should read, as it is filled with some very promising ideas.

In it, he notes that BC has one of the best collections of health data anywhere in the world and that, data mining these records could yield patterns – like longitudinal adverse affects when drugs are combined or the correlations between diseases – that could save billions as well as improve health care outcomes.

He recommends that the province find ways to share this data with researchers and academics in ways that ensure the privacy of individuals are preserved. While I agree with the idea, one thing we’ve learned in the last 5 years is that, as good as academics are, the wider public is often much better in identifying patterns in large data sets. So I think we should think bolder. Much, much bolder.

Two years ago California based Heritage Provider Network, a company that runs hospitals, launched a $3 Million predictive health contest that will reward the team who, in three years, creates the algorithm that best predicts how many days a patient will spend in a hospital in the next year. Heritage believes that armed with such an algorithm, they can create strategies to reach patients before emergencies occur and thus reduce the number of hospital stays. As they put it: “This will result in increasing the health of patients while decreasing the cost of care.”

Of course, the algorithm that Heritage acquires through this contest will be proprietary. They will own it and I can choose who to share it with. But a similar contest run by BC (or say, the VA in the United States) could create a public asset. Why would we care if others made their healthcare system more efficient, as long as we got to as well. We could create a public good, as opposed to Heritage’s private asset. More importantly, we need not offer a prize of $3 million dollars. Several contests with prizes of $10,000 would likely yield a number of exciting results. Thus for very little money with might help revolutionize BC, and possibly Canada’s and even the world’s healthcare systems. It is an exciting opportunity.

Of course, the big concern in all of this is privacy. The Globe and Mail featured an article in response to Hansen’s oped (shockingly but unsurprisingly, it failed to link back to – why do newspaper behave that way?) that focused heavily on the privacy concerns but was pretty vague about the details. At no point was a specific concern by the privacy commissioner raised or cited. For example, the article could have talked about the real concern in this space, what is called de-anonymization. This is when an analyst can take records – like health records – that have been anonymized to protect individual’s identity and use alternative sources to figure out who’s records belong to who. In the cases where this occurs it is usually only only a handful of people whose records are identified, but even such limited de-anonymization is unacceptable. You can read more on this here.

As far as I can tell, no one has de-anonymized the Heritage Health Prize data. But we can take even more precautions. I recently connected with Rob James – a local epidemiologist who is excited about how opening up anonymized health care records could save lives and money. He shared with me an approach taking by the US census bureau which is even more radical than de-anonymization. As outlined in this (highly technical) research paper by Jennifer C. Huckett and Michael D. Larsen, the approach involves creating a parallel data set that has none of the features of the original but maintains all the relationships between the data points. Since it is the relationships, not the data, that is often important a great deal of research can take place with much lower risks. As Rob points out, there is a reasonably mature academic literature on these types of privacy protecting strategies.

The simple fact is, healthcare spending in Canada is on the rise. In many provinces it will eclipse 50% of all spending in the next few years. This path is unsustainable. Spending in the US is even worse. We need to get smarter and more efficient. Data mining is perhaps the most straightforward and accessible strategy at our disposal.

So the question is this: does BC want to be a leader in healthcare research and outcomes in an area the whole world is going to be interested in? The foundation – creating a high value data set – is already in place. The unknown is if can we foster a policy infrastructure and public mandate that allows us to think and act in big ways. It would be great if government officials, the privacy commissioner and some civil liberties representatives started to dialogue to find some common ground.  The benefits to British Columbians – and potentially to a much wider population – could be enormous, both in money and, more importantly, lives, saved.

3 thoughts on “Public Policy: The Big Opportunity For Health Record Data

  1. Pipson52

    David — Love it.. the sort of things I was talking about with you five and more years ago, the result of conversations with epidemiologists at WHO in the late 90s.. 

  2. Nicholas Gruen

    I’ve suggested to the Kaggle guys running the Heritage comp that it would be great to get an avalanche of similar comps going – the data from each might help others’ model their own data better, and even if not, they would turn up all sorts of interesting differences. 

    And no, no-one has hacked open the anonymization of the Heritage data.

  3. Anonymous

    David, the UK government is already planning this, mainly to attract big pharmaceutical companies to UK. However, as the data is networked, they don’t really need to be based here. There are plans as well for anonymised welfare and court sentencing data to be released.

    I think you are glossing over the problems with anonymisation. The Heritage competition people clearly feel there is some danger when they say:
    DO respect the privacy, confidentiality and security of the Competition’s Data SetsDO observe customary and prudent medical data security and privacy practices (including but not limited to data confidentiality, storage and encryption practices) with respect to all data that you use in connection with the Competition


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s