Genomics, big data can thrive through CDS, analytics tools
February 18, 2014
While advanced data analytics are still filtering their way into the healthcare industry, many researchers and data scientists are looking beyond the basics of patient data to the cutting edge: integrating genomic data into clinical decision making on a large scale. Researchers envision a significant boost in good outcomes for patients whose genes predict their response to certain drugs, while hospitals concerned about the bottom line will be pleased to save money by getting it right the first time, preventing lengthier stays and more expensive treatments.
To help further the goal of bringing genomics to the point of care, the National Consortium for Data Science (NCDS) was founded in 2013 to bring best practices and support to an industry struggling to come to grips with the enormous volume of data it produces every day.
“Society has been greatly impacted by the massive amounts of data that are now available for use. Immense streams of data are routinely collected across science, medicine, education, government, industry, social media, and entertainment,” says a new white paper released by NCDS this month. “Yet our ability to efficiently and effectively use these ‘big data’ sets is hampered by our fundamental lack of understanding of the science of data. The interdisciplinary field of data science can provide solutions to the big data impasse by furthering our understanding of the production, management, and use of big data and by developing principles and theories to guide new discoveries and innovations.”
At a summit last April, data science and genomics experts identified six key areas that impact the development of genomic data, including information collection and management, ethics and privacy concerns, data sharing obstacles, and the difficulty of harnessing enough computer processing power to filter through petabytes of raw data and end up with a reliable algorithm that can be presented to clinicians during the decision-making process.
While some healthcare organizations and research teams are making progress with this goal on a relatively small scale, the state of data science in the healthcare industry is still too disorganized to push forward at the necessary rate. “Frequently, the details of the data collection process, which are typically required to reuse the data, are not always transmitted with the data,” the white paper explains. “Data sets derived from multiple sources are often collected without agreements in place to ensure provenance for the primary data sources or broad informed consent to allow data reuse. These issues are exacerbated by industry, as competition among external vendors encourages the differentiation and incompatibility of systems.”
Researchers still haven’t entirely agreed on how to delineate phenotypes, even for something as simple as a patient’s height, where measurements can vary with individual provider guidelines about whether the patient should relax or extend their spine as much as possible. For more complex traits, like psychiatric disorders or cardiovascular disease, coming to a true consensus among experts is exceedingly difficult, even if they agreed on the descriptive coding standards used to define the myriad characteristics of a single tumor, for example.
Clinicians, researchers, data scientists, and software developers must work harder to agree upon standards if the field is to advance, NCDS says. Data sharing should be encouraged, built upon a trust framework and nudged along by incentives that promote information exchange and coordinated approaches to computing problems. Open discussions about bioethics, security, and fair use of data will foster an atmosphere of collaboration as experts come together to break down barriers towards truly personalized medicine.