‘60 Minutes’ Cancer Doctor’s Revolutionary Idea: Taking Big Data Seriously
December 8, 2014
Cancer pioneer and entrepreneur Patrick Soon-Shiong’s appearance on 60 Minutes last night – see this crisp summary by my colleague Matthew Herper – has ignited an impassioned discussion, focused on two key questions: is Soon-Shiong’s approach really revolutionary, and what’s the evidence he’s on the right track?
Soon-Shiong’s characterization of cancer as fundamentally a problem with cell death rather than cell growth isn’t new or radical. Neither is his idea that sequencing tumors is important – most analysts point to oncology as the largest market for next-generation sequencing. The concept of “liquid biopsies” – using emerging technologies to capture and characterize circulating tumor cells – is also not original, and there are a number of researchers and companies focused on this today.
What is truly distinctive about Soon-Shiong’s approach is the combination of scale (as 60 Minutes correspondent Sanjay Gupta points out) plus the rich, centralized data collection.
(Disclosure/reminder: I work at DNAnexus, a cloud-based genomic data company that operates in this general space.)
The big idea motivating Soon-Shiong is this: if you could collect all the cancer data in the world, you might be able to drive significant insight. In fact, if you could collect even a fraction of the cancer data in the world, you could drive significant insight. But the truth is that today we don’t have rich sequencing information on most cancers, and our phenotypic information is often astonishingly limited – and scattered – as well. This is tragic.
Increasingly, medical centers are beginning to capture these data better, as are some companies, such as Foundation Medicine on the genotype side and Flatiron on the phenotype side (disclosure: both, like DNAnexus, are backed by Google Ventures).
As disappointing as the data collection around cancer is, the sharing of data is generally even worse; it’s hard to get large centers to play nicely with each other, and even when they do agree to share data, what they share is often relatively limited. In one example, leading cancer centers are sharing only “cancer-specific” mutations – mutations found in a patient’s tumor but not in the patient’s other cells. Yet such an approach might miss many inherited mutations, present in all of a patient’s cells, that predispose the patient to develop cancer; critical context is almost certain to be lost here.
The word of the day in the research community is “federation” – the idea, essentially, that data ownership is likely to remain siloed, but that through appropriate mechanisms the data can still be shared in a useful way. Organizations involved in these efforts, as I’ve recently discussed, include the Global Alliance for Genomics and Health, Sage Bionetworks, HL7, and others.
Soon-Shiong’s thesis, it seems to me, is that federation may not be enough: rather than trying to pry limited amounts of data from organizations that may not collect enough of it in the first place, it may be better to start from the ground up and build your own rich data repositories. This relates to the concept of the “data-inhaling clinic of the future” I’ve discussed here.
On the one hand, you could argue that he’s just building yet another silo, and thus may be contributing to the data-sharing problem more than solving it. On the other hand, if he’s able to collect denser data than most others, and amass more of it faster, he may be able to turn his data scientists loose on an incredibly rich dataset, at a time when the rest of the world is still negotiating data collection and interoperability standards.
As Herper and others have accurately noted, Soon-Shiong has offered a compelling vision, but not yet proof.
The bottom line is that Soon-Shiong is placing a serious bet on the power of big data to transform cancer care. His success would not only help oncology patients, but also underscore the centrality of dense data collection and sophisticated analysis to the practice and future of medicine.
Soon-Shiong’s ambitious gambit should also remind today’s leading medical centers that if they don’t figure out how to get a lot better at collecting, sharing, and leveraging rich data, they risk losing their place of preeminence to disruptive innovators and their empowered data scientists.