In my previous two posts, I began to flesh out my concerns with population-based medicine. In this post, I want to (1) elaborate further on the problem of population-based averaging (which forms the basis for epidemiology and randomized controlled trials), and (2) address the nouveau idea of “personalized medicine.”
Population-based medicine originated with studies of infectious diseases in the mid-1800s, and John Snow’s analysis of cholera in Great Britain is a shining example of how the study of a disease process can allow us to home in on the causative agent (in this case, a dirty water pump). In his studies, and in those of the investigators who followed him, disease incidence was averaged across general parameters, like geography (e.g. streets, cities, zip codes). At first, his statistics were met with resistance, as physicians (like me) argued that there was too much individual variation to arrive at any useful conclusion. However, he and others showed that when the etiology could be traced back to a single cause (e.g. a bacterium, a water pump, etc.), population-based averaging worked extremely well, since the causative factor often affected or infected anyone, regardless of age, gender, race, or creed (or any other personal characteristic you can imagine).
Fast forward a century: our leading biological killers, heart disease and cancer, are not single-agent mechanisms. There is no single gene, exercise regimen, or food article that uniformly causes these diseases (Ref. 1, 2). In fact, it is precisely our individual-to-individual variation that puts us at risk for these diseases. The lack of a single causative agent is manifested in the size of randomized controlled trials evaluating the efficacy of cholesterol-lowering drugs (Ref. 3): unexplained and pronounced variance in the population requires thousands of individuals (i.e. samples) to tease out the effect of any single factor, like a drug. In population-based averaging, you cannot know what you are “averaging out.” We average disease incidence or intervention outcomes over a group of people amalgamated by age, gender, and race, but imagine the number of personal characteristics that are unaccounted for, which may be as bizarre as eye color, skin pigmentation, or dental crowding (see footnote); these variables are the ones being smoothed over. There are innumerable hidden (i.e. latent) variables beyond our grasp, factors that can subtly or dramatically affect your disease risk. An example I can imagine: people who eat a Big Mac more than once a week are at an increased risk for coronary artery disease, but people who eat a Big Mac and run 3 miles/day have a decreased risk for coronary artery disease. (This is a real example. I had a 50-year-old patient who, in spite of eating McDonald’s every day, had perfect cholesterol and a heart that beat every other second. He has also run 4-5 miles every morning for the past 20 years.)
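To make the idea of “averaging out” concrete, here is a minimal sketch of the Big Mac and running example as a toy cohort. The counts are invented purely for illustration (they are not patient data), and the names `cohort`, `pooled_risk`, and `stratum_risk` are hypothetical. The sketch compares the risk you would report if running were never measured with the risk inside each running stratum.

```python
# Toy cohort with invented counts (illustration only, not real data).
# Key: (diet, activity); value: (number of people, number who develop disease)
cohort = {
    ("big_mac", "runner"): (1000, 30),
    ("big_mac", "non_runner"): (1000, 150),
    ("no_big_mac", "runner"): (1000, 40),
    ("no_big_mac", "non_runner"): (1000, 80),
}


def risk(cells):
    """Fraction of people who develop disease across the given cells."""
    people = sum(n for n, _ in cells)
    cases = sum(c for _, c in cells)
    return cases / people


def pooled_risk(cohort, diet):
    """Risk for a diet group when activity is averaged out (unmeasured)."""
    return risk([cell for (d, _), cell in cohort.items() if d == diet])


def stratum_risk(cohort, diet, activity):
    """Risk for a diet group within a single activity stratum."""
    return risk([cohort[(diet, activity)]])


if __name__ == "__main__":
    for diet in ("big_mac", "no_big_mac"):
        print(diet, "pooled:", pooled_risk(cohort, diet))
        for activity in ("runner", "non_runner"):
            print("  ", activity, stratum_risk(cohort, diet, activity))
```

With these made-up counts, the pooled comparison says that Big Mac eaters carry more risk (9% versus 6%), while among runners the Big Mac eaters carry slightly less (3% versus 4%). The pooled figure is not wrong; it simply describes nobody in particular.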
So this leads you to think, “Well, why don’t we improve our stratification of people? Let’s just lump people together by X and Y, instead of X alone.” Though it sounds simple, we have no grasp over the number of features sufficient to define a group. On one hand, we have been averaging people for the past century by race, gender, and age (easily identifiable features). On the other hand, genotyping technology is now allowing us to measure tens of thousands of molecules at once, and the goal of projects like the $1000 genome or of companies like 23andme is precisely that: take thousands of measurements and calculate risk factors. The problem, however, has not been ameliorated, as we are leagues away from understanding how the genome translates into the phenome, and, consequently, the gross, unpredictable characteristics like BMI or lifestyle still need to be factored into the risk prediction scheme.
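A rough sketch of why “lump people together by X and Y” does not scale: every added feature multiplies the number of groups. The code below assumes, for simplicity, that each feature is binary and that people spread evenly across groups; both are simplifying assumptions of mine, and the function names are hypothetical. The cohort size of 20,536 is borrowed from the trial title in Ref. 3 only as a convenient number.

```python
import math


def strata_count(levels_per_feature):
    """Number of distinct groups when stratifying on every feature at once."""
    return math.prod(levels_per_feature)


def mean_people_per_stratum(cohort_size, levels_per_feature):
    """Average group size if the cohort were spread evenly over all strata."""
    return cohort_size / strata_count(levels_per_feature)


def features_until_exhausted(cohort_size, levels=2):
    """Smallest number of features (each with `levels` values) whose strata
    outnumber the people in the cohort."""
    k = 0
    while levels ** k <= cohort_size:
        k += 1
    return k


if __name__ == "__main__":
    cohort_size = 20536  # size quoted in the title of Ref. 3
    for k in (1, 5, 10, 15):
        features = [2] * k  # assumption: k binary features
        print(k, strata_count(features), mean_people_per_stratum(cohort_size, features))
    print("features needed to outnumber the cohort:", features_until_exhausted(cohort_size))
```

Under these assumptions, 15 yes/no features already produce more groups (32,768) than there are participants in a trial of that size, so the average group holds less than one person.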
Apart from the gargantuan issues underlying clinical genomics (Ref. 4), we are still plagued by the issue of combinatorial risk, as I alluded to earlier in thinking about Big Macs and running: risk factors are analyzed independently of each other, but it is hard to predict how these variables operate in tandem. When 23andme.com calculates your risk for a disease, it is based on the presence of a particular allele of a gene, often identified through genome-wide association studies (GWAS) that look for DNA polymorphisms correlated with disease incidence. So, if a single feature (a polymorphism, i.e. an allele) is correlated with disease in the population, they say that it increases (or decreases) your risk for the disease. What we cannot predict, however, is how various alleles blend together in an individual’s genetic soup. If you have a particular allele in MHC and/or IL12B (major histocompatibility complex and interleukin 12b, respectively; two genes with important roles in immunity), you may have an increased risk for psoriasis, a disease associated with an overactive immune system (Ref. 5). What happens, however, if you also possess the allele that predisposes you to tuberculosis (Ref. 6), an infectious disease that can slip below the radar of your immune system? Genetic screens would report your independent risks for these two diseases, yet these two diseases would be dependent entities inside of you by their very coexistence in your body. Would your revved-up immune system (from your predisposition to psoriasis) abrogate your susceptibility to tuberculosis, or are they operating through independent pathways? Perhaps, unpredictably, they synergize with each other to produce a tertiary disease in a separate part of the body.
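The gap between independently reported risks and what happens when factors coexist can be sketched with a toy table. Everything here is invented for illustration: two hypothetical alleles, A and B, acting on a single disease, with carrier shares and risks that I made up, including an assumed antagonistic interaction in people who carry both. It is not a model of psoriasis or tuberculosis, and the function names are hypothetical.

```python
# Invented numbers for two hypothetical alleles, A and B (illustration only).
# Key: (carries_a, carries_b); value: (share of population, disease risk)
population = {
    (False, False): (0.64, 0.02),
    (True, False): (0.16, 0.04),
    (False, True): (0.16, 0.04),
    (True, True): (0.04, 0.01),  # assumed antagonistic interaction
}


def marginal_risk(population, allele_index, carrier):
    """Risk among carriers (or non-carriers) of one allele, averaged over the other."""
    cells = [v for k, v in population.items() if k[allele_index] == carrier]
    total_share = sum(share for share, _ in cells)
    return sum(share * r for share, r in cells) / total_share


def marginal_relative_risk(population, allele_index):
    """Single-allele relative risk, as a population-level association would see it."""
    return marginal_risk(population, allele_index, True) / marginal_risk(
        population, allele_index, False
    )


def joint_relative_risk(population):
    """Relative risk of carrying both alleles versus carrying neither."""
    return population[(True, True)][1] / population[(False, False)][1]


if __name__ == "__main__":
    rr_a = marginal_relative_risk(population, 0)
    rr_b = marginal_relative_risk(population, 1)
    print("allele A alone:", rr_a)
    print("allele B alone:", rr_b)
    # Naive assumption: the two effects are independent and simply multiply.
    print("naive product:", rr_a * rr_b)
    print("actual joint:", joint_relative_risk(population))
```

In this toy table each allele looks like a risk factor when examined alone (relative risk above 1), and multiplying the two would predict roughly a doubling of risk, yet the people who carry both have half the risk of those who carry neither. The single-allele numbers cannot reveal that on their own.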
Breast cancer genetics is another story of unpredictable combinatorics, where the predisposition to cancer imparted by mutations in BRCA1 & BRCA2 is modulated by lifestyle factors and other genes. Bearing children, for instance, is associated with a decreased risk for breast cancer (Ref. 7), while variations in the AIB1 gene seem to increase risk (Ref. 8). Yet, we were unaware of these risk factors when breast cancer was first studied, and the picture of disease risk resembled the blue curve in the illustration below. Having teased out a few of the hidden variables, like parity or AIB1 status, we can better deconstruct risk by identifying subgroups (the orange and red curves) within the larger population.

Whether the hidden variable is the number of Big Macs you eat or the type of mutation in your genotype, the number of possible risk factors is astronomical. Currently, our picture of most complex illnesses resembles the graph above: we started with the blue curve, and, now, we have deconvolved it into a few smaller curves (orange & pink), but how many subpopulations exist within these? What variables are we inadvertently averaging out when we assemble these population-based risk assessments?
This is not personalized medicine. This is stratified medicine (Ref. 9), and the substratification of people is an interminable task due to its fractal-like complexity.
Footnote:
Although these features may seem like bizarre clinical variables in allopathic medicine, such phenotypic characteristics form the foundation of diagnosis and treatment in Ayurvedic medicine.
References:
1. Bogardus, C. Missing Heritability and GWAS Utility. Obesity, 17 (2) (2009).
2. Kraft, P., Hunter, D.J. Genetic risk prediction—Are we there yet? N Engl J Med, 360:1701–1703 (2009).
3. Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of Cholesterol Lowering with Simvastatin in 20,536 high-risk individuals: a randomised placebo-controlled trial. Lancet, 360(9326):7-22 (2002).
4. Ng, P.C., et al. An Agenda for Personalized Medicine. Nature, 461 (2009).
5. Zhang, X., et al. Psoriasis GWAS Identifies Susceptibility Variants Within LCE Gene Cluster at 1q21. Nature Genetics, 41 (2009).
6. Thye, T., et al. GWAS Identifies a Susceptibility Locus for Tuberculosis on Chromosome 18q11.2. Nature Genetics, 42 (2010).
7. MacMahon, B., et al. Age at first birth and breast cancer risk. Bulletin of the World Health Organization, 43 (1970).
8. Rebbeck, T.R., et al. Modification of BRCA1 and BRCA2 associated Breast Cancer Risk by AIB1 Genotype and Reproductive History. Cancer Research, 61 (2001).
9. Trusheim, M.R., et al. Stratified Medicine: strategic and economic implications of combining drugs and clinical biomarkers. Nature Reviews Drug Discovery, 6 (2007).