We are entering an era when algorithms will be used to analyze data in the LIS and EHR databases and predict a patient's risk of developing various diseases in the future. This field is often referred to as predictive healthcare analytics. A recent article discussed this process in terms of predicting the onset of Parkinson's disease (see: Algorithm scans medical records for higher Parkinson’s risk). Below is an excerpt from it:
Researchers have developed an algorithm that could check patients’ medical histories to find signs of increased risk for developing Parkinson’s disease and alert doctors to evaluate patients at greater risk. Before symptoms become pronounced, there is no reliable way to identify who is on track to develop Parkinson’s disease....The algorithm relies on information in patients’ medical records, such as tests and diagnoses of various medical conditions....One of the most interesting findings is that people who are going to develop Parkinson’s have medical histories that are notably different from those who don’t develop the disease.....[The authors of a paper on this topic] analyzed de-identified medical claims data for Medicare beneficiaries nationwide, ages 66 to 90. They found 89,790 people who had been diagnosed with Parkinson’s in 2009, and matched them with 118,095 people in the same age range who had not been diagnosed with Parkinson’s in 2009 or prior years. Then, the researchers sifted through each person’s claims history to draw up a list of all diagnoses received and medical procedures undergone from 2004 to 2009...[They] developed an algorithm using medical history—combined with age, sex, race or ethnicity, and history of tobacco smoking—that correctly identified 73 percent of the people who would be diagnosed with the disease in 2009, and 83 percent of the people who would not. Specifically, many of the claims codes that helped predict the disease referred to problems already known to be associated with Parkinson’s such as tremors, posture abnormalities, psychiatric or cognitive dysfunction, gastrointestinal problems, sleep disturbances, fatigue and trauma, including falls. Other factors associated with the disease included weight loss and multiple forms of chronic kidney disease.
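To put the two percentages in the excerpt into perspective, here is a rough back-of-the-envelope sketch in Python. It treats the 73 percent figure as sensitivity and the 83 percent figure as specificity, and applies them to the case and control counts quoted above. The function names are my own, and the 1 percent prevalence used at the end is a purely hypothetical value chosen for illustration; it does not come from the article. The point is only that the share of flagged patients who really have the disease depends heavily on how common the disease is in the group being screened.

```python
def confusion_counts(n_cases, n_controls, sensitivity, specificity):
    """Approximate confusion-matrix counts implied by the reported rates."""
    tp = sensitivity * n_cases        # cases the algorithm flagged
    fn = n_cases - tp                 # cases the algorithm missed
    tn = specificity * n_controls     # controls correctly left unflagged
    fp = n_controls - tn              # controls flagged in error
    return {"tp": tp, "fn": fn, "tn": tn, "fp": fp}


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value for a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures quoted in the excerpt (the percentages are rounded there).
counts = confusion_counts(89_790, 118_095, 0.73, 0.83)

# PPV inside the matched study sample, where cases are heavily over-represented.
sample_ppv = counts["tp"] / (counts["tp"] + counts["fp"])

# ASSUMPTION: 1% prevalence is a hypothetical value, not taken from the article.
screening_ppv = ppv_at_prevalence(0.73, 0.83, 0.01)

print(f"PPV in the matched sample: {sample_ppv:.3f}")
print(f"PPV at a hypothetical 1% prevalence: {screening_ppv:.3f}")
```

Under these assumptions, most patients flagged in a low-prevalence population would not go on to develop the disease, which is why the excerpt describes the algorithm as a prompt for doctors to evaluate patients rather than as a diagnosis.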
Part of the challenge of predictive healthcare analytics is that much of the relevant clinical data is locked in the EHR, in the sense that it's recorded in natural language and not easily searchable. I blogged about this problem previously (see: Assessing Drugs Using "Real World Evidence" in Addition to Clinical Trials). Along these same lines, I encountered a company on the web called Linguamatics Health that describes itself on its home page as "the world’s leading NLP text mining platform for health science." The home page also makes the following claim:
Quickly develop new predictive models using unstructured data such as clinical notes: Identify high risk patients based on living conditions or life style choices like alcohol or drug use. Using NLP reduces the manual chart review needed to build data sets for machine learning models.
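As a toy illustration of what turning clinical notes into structured data can mean, here is a minimal Python sketch that flags mentions of alcohol or drug use in free text and skips mentions that follow a simple negation word. This is not how Linguamatics or any commercial product works; the keyword lists, the negation rule, and the function name extract_flags are all hypothetical choices made for this example, and a real clinical NLP pipeline would be far more sophisticated.

```python
import re

# ASSUMPTION: hypothetical keyword lists chosen for illustration only.
PATTERNS = {
    "alcohol_use": re.compile(r"\b(alcohol|etoh)\b", re.IGNORECASE),
    "drug_use": re.compile(r"\b(cocaine|heroin|iv drug use)\b", re.IGNORECASE),
}

# A negation cue anywhere earlier in the same sentence suppresses the flag.
NEGATION = re.compile(r"\b(denies|no|negative for|without)\b", re.IGNORECASE)


def extract_flags(note):
    """Return a dict of boolean lifestyle flags found in a free-text note."""
    flags = {name: False for name in PATTERNS}
    # Split on periods and semicolons so negation stays within one sentence.
    for sentence in re.split(r"[.;]\s*", note):
        for name, pattern in PATTERNS.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[:match.start()]):
                flags[name] = True
    return flags


note = "Patient reports heavy alcohol use. Denies IV drug use."
print(extract_flags(note))
```

Each flag could then become one column in a data set for a machine learning model, which is the kind of manual chart review the claim above says NLP can reduce.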
So the brave new world that we are facing in healthcare is one in which a patient will be admitted for the treatment of disease X. Algorithms will scan the patient's medical record and then predict that he or she will probably develop diseases Y and Z in the near future. I have posted previous notes about some of the challenges relating to pre-diseases (see: The Challenge of Diagnosing Predisease in Our Healthcare Delivery System; Predisposition to Disease and Pre-Disease on the Health Continuum). There is also the challenge that patients may not necessarily want to learn which diseases lie in their future (see: CASSANDRA’S REGRET: THE PSYCHOLOGY OF NOT WANTING TO KNOW).