Personalized Medicine

In my previous 2 posts, I have begun to flesh out my concerns with population-based medicine. In this post, I want to (1) elaborate further on the problem of population-based averaging (which forms the basis for epidemiology & randomized-controlled trials), and (2) address the nouveau idea of “personalized medicine.”

Population-based medicine originated with studies of infectious diseases in the mid-1800s, and John Snow’s analysis of cholera in Great Britain is a shining example of how the study of a disease process can allow us to home in on the causative agent (in this case, a dirty water pump). In his studies – and in those that followed him – disease incidence was averaged across general parameters, like geography (e.g. streets, cities, zip codes). At first, his statistics were met with resistance, as physicians (like me) argued that there was too much individual variation to arrive at any useful conclusion. However, he and others showed that when the etiology could be traced back to a single cause – e.g. a bacterium, a water pump, etc. – population-based averaging worked extremely well since the causative factor often affected/infected anyone, regardless of age, gender, race, or creed (or any other personal characteristic you can imagine).
Fast forward a century, when our leading biological killers – heart disease & cancer – are not single-agent mechanisms: there is no single gene, exercise regimen, or food article that uniformly causes these diseases (Ref. 1, 2). In fact, it is precisely our individual-to-individual variation that puts us at risk for these diseases. The lack of a single causative agent is manifested in the size of randomized controlled trials evaluating the efficacy of cholesterol-lowering drugs (Ref. 3): unexplained and pronounced variance in the population requires thousands of individuals (i.e. samples) to tease out the effect of any single factor, like a drug. In population-based averaging, you cannot know what you are “averaging out.” We average disease incidence or intervention outcomes over a group of people amalgamated by age, gender, and race, but imagine the number of personal characteristics that are unaccounted for, which may be as bizarre as eye color, skin pigmentation, or dental crowding (see footnote); these variables are the ones being smoothed over. There are innumerable hidden (i.e. latent) variables beyond our grasp, factors that can subtly or dramatically affect your disease risk. Consider an example: people who eat a Big Mac more than once a week are at an increased risk for coronary artery disease, but people who eat a Big Mac and run 3 miles/day have a decreased risk for coronary artery disease. (This is drawn from a real case. I had a 50-year-old patient who, in spite of eating McDonald’s every day, had perfect cholesterol and a heart that beat every other second. He has also run 4-5 miles every morning for the past 20 years.)
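To put a rough number on why unexplained variance drives trial size, here is a minimal sketch using the standard normal-approximation formula for comparing two group means, n per arm = 2 * ((z_alpha + z_power) * sd / effect)^2. It assumes a continuous outcome measured in standard-deviation units, which is a simplification: trials like the one in Ref. 3 count events rather than compare means. The function name and the default alpha and power are my own choices for illustration.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample_size(effect, sd, alpha=0.05, power=0.80):
    """Approximate participants needed in each arm of a two-arm trial.

    effect: smallest difference in means worth detecting (assumed value)
    sd:     standard deviation of the outcome across individuals (assumed value)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# The required size scales with (sd / effect) squared: doubling the spread
# between individuals roughly quadruples the number of people needed.
for sd in (1.0, 2.0):
    print(sd, per_arm_sample_size(effect=0.1, sd=sd))
```

Under these assumptions, an effect one-tenth the size of the person-to-person spread already calls for well over a thousand people per arm, which is the sense in which variance, not the drug, sets the size of the trial.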
So this leads you to think, “Well, why don’t we improve our stratification of people? Let’s just lump people together by X and Y, instead of X alone.” Though it sounds simple, we have no grasp of the number of features sufficient to define a group. On one hand, we have been averaging people for the past century by race, gender, and age (easily identifiable features). On the other hand, genotyping technology is now allowing us to measure tens of thousands of molecules at once, and the goal of projects like the $1000 genome or of companies like 23andMe is precisely that: take thousands of measurements and calculate risk factors. The problem, however, has not been ameliorated, as we are leagues away from understanding how the genome translates into the phenome, and, consequently, the gross, unpredictable characteristics like BMI or lifestyle still need to be factored into the risk prediction scheme.
Apart from the gargantuan issues underlying clinical genomics (Ref. 4), we are still plagued by the issue of combinatorial risk, as I alluded to earlier in thinking about Big Macs and running: risk factors are analyzed independently of each other, but it is hard to predict how these variables operate in tandem. When 23andMe calculates your risk for a disease, it is based on the presence of a particular allele of a gene, often identified through genome-wide association studies (GWAS) that look for DNA polymorphisms correlated with disease incidence. So, if a single feature (a polymorphism, i.e. an allele) is correlated with disease in the population, they say that it increases (or decreases) your risk for the disease. What we cannot predict, however, is how various alleles blend together in an individual’s genetic soup. If you have a particular allele in MHC and/or IL12B (major histocompatibility complex and interleukin 12b, respectively; two genes with important roles in immunity), you may have an increased risk for psoriasis, a disease associated with an overactive immune system (Ref. 5). What happens, however, if you also possess the allele that predisposes you to tuberculosis (Ref. 6), an infectious disease that can slip below the radar of your immune system? Genetic screens would report your independent risks for these two diseases, yet the two diseases would be dependent entities inside of you by their very coexistence in your body. Would your revved-up immune system (from your predisposition to psoriasis) abrogate your susceptibility to tuberculosis, or are they operating through independent pathways? Perhaps, unpredictably, they synergize with each other to produce a tertiary disease in a separate part of the body.
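The Big Mac and running scenario can be made concrete with a toy calculation. The sketch below is only an illustration: every risk and population share in it is invented, and the function names are hypothetical. It shows how a factor can look harmful when averaged over the population while pointing the other way inside one subgroup.

```python
# Hypothetical risk of coronary artery disease for each combination of
# (eats Big Macs weekly, runs daily). All numbers are invented for illustration.
RISK = {
    (False, False): 0.10,
    (True, False): 0.20,
    (False, True): 0.05,
    (True, True): 0.04,
}

# Hypothetical share of the population in each combination (sums to 1).
SHARE = {
    (False, False): 0.40,
    (True, False): 0.40,
    (False, True): 0.10,
    (True, True): 0.10,
}


def averaged_risk(big_mac):
    """Risk by Big Mac status, averaged over running (the hidden variable)."""
    cells = [cell for cell in RISK if cell[0] == big_mac]
    weight = sum(SHARE[cell] for cell in cells)
    return sum(RISK[cell] * SHARE[cell] for cell in cells) / weight


def relative_risk_population():
    """Relative risk of Big Macs as a population-level study would report it."""
    return averaged_risk(True) / averaged_risk(False)


def relative_risk_among(runs):
    """Relative risk of Big Macs within the runners or the non-runners."""
    return RISK[(True, runs)] / RISK[(False, runs)]


print(relative_risk_population())   # population-wide estimate
print(relative_risk_among(True))    # estimate among daily runners only
```

With these made-up numbers the population-wide relative risk comes out near 1.9, while among daily runners it is about 0.8. A report built on the first figure would tell the runner exactly the wrong thing.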
Breast cancer genetics is another story of unpredictable combinatorics, where the predisposition to cancer imparted by mutations in BRCA1 & BRCA2 is modulated by lifestyle factors and other genes. Bearing children, for instance, is associated with a decreased risk for breast cancer (Ref. 7), while variations in the AIB1 gene seem to increase risk (Ref. 8). Yet, we were unaware of these risk factors when breast cancer was first studied, and the picture of disease risk resembled the blue curve in the illustration below. Having teased out a few of the hidden variables, like parity or AIB1 status, we can better deconstruct risk by identifying subgroups (the orange and red curves) within the larger population.
Whether the hidden variable is the number of Big Macs you eat or the type of mutation in your genotype, the number of possible risk factors is astronomical. Currently, our picture of most complex illnesses resembles the graph above: we started with the blue curve, and, now, we have deconvolved it into a few smaller curves (orange & pink), but how many subpopulations exist within these? What variables are we inadvertently averaging out when we assemble these population-based risk assessments?
This is not personalized medicine. This is stratified medicine (Ref. 9), and the substratification of people is an interminable task due to its fractal-like complexity.
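A back-of-the-envelope count shows how quickly substratification outruns the data. The sketch below multiplies the number of levels of each feature to get the number of distinct subgroups; the feature counts (five age bands, two genders, five race categories, twenty yes/no factors) are assumptions chosen only for illustration, and the trial size is the 20,536 participants named in the title of Ref. 3.

```python
from math import prod


def strata_count(levels_per_feature):
    """Number of distinct subgroups when every feature is crossed with every other."""
    return prod(levels_per_feature)


# Assumed feature sets, for illustration only.
classic = [5, 2, 5]             # age bands, gender, race categories
extended = classic + [2] * 20   # plus twenty yes/no genetic or lifestyle factors

trial_size = 20_536             # participants in the trial cited as Ref. 3

for name, features in (("classic", classic), ("extended", extended)):
    groups = strata_count(features)
    print(name, groups, trial_size / groups)  # subgroups, average people per subgroup
```

Under these assumptions the classic three features give 50 subgroups with a few hundred people in each, while adding twenty binary factors gives over 52 million subgroups, far more than there are participants to fill them.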
Footnote:
Although these features may seem like bizarre clinical variables in allopathic medicine, such phenotypic characteristics form the foundation of diagnosis and treatment in Ayurvedic medicine.
References:
  1. Bogardus, C. Missing Heritability and GWAS Utility. Obesity. 17 (2), 2009.
  2. Kraft P, Hunter DJ. Genetic risk prediction—Are we there yet? N Engl J Med 360:1701–1703 (2009)
  3. Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of Cholesterol Lowering with Simvastatin in 20,536 high-risk individuals: a randomised placebo-controlled trial. Lancet. 2002 Jul 6;360(9326):7-22.
  4. Ng, P.C., et al. An Agenda for Personalized Medicine. Nature (461), 2009
  5. Zhang, X., et al. Psoriasis GWAS Identifies Susceptibility Variants Within LCE Gene Cluster at 1q21. Nature Genetics, 41 (2009).
  6. Thye, T., et al. GWAS Identifies a Susceptibility Locus for Tuberculosis on Chromosome 18q11.2. Nature Genetics, 42 (2010).
  7. MacMahon, B., et al. Age at first birth and breast cancer risk. Bulletin of the World Health Organization, 43 (1970).
  8. Rebbeck, T.R., et al. Modification of BRCA1 and BRCA2 associated Breast Cancer Risk by AIB1 Genotype and Reproductive History. Cancer Research, 61 (2001).
  9. Trusheim, M.R., et al. Stratified Medicine: strategic and economic implications of combining drugs and clinical biomarkers. Nature Reviews Drug Discovery, 6 (2007).
2 comments
  1. \S said:

    Very interesting, keep it up! Here are some of my thoughts on what you’re saying, some of which I guess are pretty standard arguments, but I think need to be at least considered:

    You are making the argument for individualized medicine, and I don’t think anyone is *really* going to argue with you on that front. Population statistics only apply to populations, and not necessarily to the individuals in the population. Except for one thing: personalization of medicine is really difficult. Unfortunately, population correlation studies are sort of the best we can do at the moment, as you sort of mentioned.

    So, since we are relegated to population-based medicine, at least for the time being, we must ask ourselves: what is the meaning of “risk”? “Increased risk”, as you refer to it means that on a population level, having a correlative factor increases the *probability* of getting the correlated disease. However, risk is somewhat additive, so, as in the example of the Big Mac eater, a risk factor and a mitigating factor add together to yield a net negative risk (this combination may or may not be linear, which you can certainly argue, but we can often assume that it is linear enough to still get some useful information from it). You are trying to predict the future disease status of a patient, and by using this tool called “risk” you are able to use a sort of fuzzy logic to assign probabilities. This is not meant to say “you are going to get the disease” or “you are not going to get the disease”, but “you are increasing/decreasing your probability of getting a disease by doing X and/or Y”.

    If you start looking at risk in this sense, rather than a homogenizing term, I think the reasoning for this “population averaging” that you are talking about starts to make more sense; it is precisely the variables that you are talking about that you want to average out, right? For example, if you want to know what the risk attributable to eating a Big Mac every day is, you randomly choose a huge population, so that things like exercise routine (and other things that you might not know about) are averaged out. Once you have this data, I think it is at least somewhat possible to use this information, right? A Big Mac eater who runs 4-5 miles per day probably still has a higher risk of heart disease than a non-Big Mac eater who runs the same amount, assuming everything else is constant. And, even if everything else is not constant, the Big Mac eating probably isn’t helping anything, right?

    In the end, I think these studies of population risk mean what they mean; there is useful information for what different risk factors mean. For example, the link between smoking and lung cancer was found this way (interestingly, it was very difficult to perform these studies, because it was difficult to find people who didn’t smoke – likely similar to what it would take to do a population study on coffee nowadays). It is only when you are not careful about what it means for an individual that you get into trouble. This might be more what you are talking about? The idea in medicine is that standard of care models are designed around these population studies, which force you to do certain things in certain situations, even if you have a suspicion that a particular patient might have some other factor for which the standard of care model has not accounted. The problem that you are going to find if you try this is that you are going to see that there are too many factors for a person to handle, which means that you are going to have a hard time eliciting all of the relevant factors from a person. Furthermore, even if you could, you probably could not keep all of the personalization in your head. So, medicine would in this situation become very dependent on computers to process all of this data, right? There’s nothing wrong with this in principle, I think, but it is going to take a lot of research and time to develop reliable tools like this – and what do you do in the meantime? You have to use the data from the population studies, right?

  2. So it seems that it is our job to find a way to assess a patient more holistically.. you suggest comparing a patient to his own “normal” rather than the population normal. This seems relevant when a patient is telling you they are experiencing a stomach pain that wasn’t there before, or if their lab values go up, but how do you assess a patient’s normal when normal shifts with age and stress.. It seems, also, that people (physicians) are blinded by their own experiences. It is innate for us to recognize patterns.. we see a red rash that ends up being cellulitis in a young woman and that is the first thing on the differential for the next young woman with a red rash. If we don’t do large trials to figure out risk factors, we end up making the same type of assessments based on limited data ex. a few weeks ago me: if you plan to keep smoking then maybe we should think about changing your birth control method so that you are not at higher risk for blood clots. pt: not really interested. me: ok. could you tell me a little about your decision? pt: I have 5 friends who smoke and take ocp’s and they all seem fine.
    a few thoughts:
    1. I think that the role of identifying risk factors is prevention of things that we would only be able to assess in the patient when it was too late. This way we can give the patient choices: ex we find if you take folic acid during pregnancy it can benefit your baby.. the patient takes very little risk upon themselves and potentially benefits the baby.
    2. Given that doctors haven’t done a great job with care when medicine is exclusively personalized and that randomized control trials miss assessing an individual’s characteristics.. what if we made a giant bank of questions.. patients/physicians/public health officers/teenagers/homeless people could suggest them and patients in the hospital could respond to them.. ex: Is there a relationship between eating beets on a Tuesday if you are a Leo and UTIs? As this database grew it would constantly run logistic regressions on all combinations of factors.. it could show potential risk factors to ask about. The beauty of a strong causation relationship is that you don’t need many subjects to prove it.. Then you could use it in reverse.. let’s say your patient has abdominal pain and thinks it might be linked to using green oil paint in a recent painting. You can look if this has happened to people before and if they were in other ways similar to your patient. I am not sure how such a database could be managed.. but maybe it would lead us to some interesting care paradigms.
    At the end of the day the problem that I see with our current system is that it stifles creativity.. and we cannot make great progress without stepping outside of our current norms of research.
    looking forward to your next post
