HUMAN GENOME EPIDEMIOLOGY
Second Edition
Building the Evidence for
Using Genetic Information to Improve Health and Prevent Disease
Oxford University Press, Inc., publishes works that further
Oxford University’s objective of excellence
in research, scholarship, and education.
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Copyright © 2010 by Oxford University Press, Inc.
Published by Oxford University Press, Inc.
198 Madison Avenue, New York, New York 10016
www.oup.com
Oxford is a registered trademark of Oxford University Press.
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press.
Materials appearing in this book prepared by individuals as part of their official duties as
United States government employees are not covered by the above-mentioned copyright,
and any views expressed therein do not necessarily represent the views of the United
States government. Such individuals’ participation in the Work is not meant to serve as
an official endorsement of any statement to the extent that such statement may conflict
with any official position of the United States government.
Library of Congress Cataloging-in-Publication Data
Human genome epidemiology : building the evidence for using genetic information to improve health and prevent disease / edited by Muin J Khoury [et al.] — 2nd ed.
In the first edition of Human Genome Epidemiology, published in 2004, we discussed how the epidemiologic approach provides an important scientific foundation for studying the continuum from gene discovery to the development, applications, and evaluation of human genome information in improving health and preventing disease. Since 2004, advances in human genomics have continued to occur at a breathtaking pace. Although the concept of personalized healthcare and disease prevention often promised by enthusiastic scientists and the media is yet to be fulfilled, we are now seeing progress and rapid accumulation of data in many “omics”-related research fields. New methods to measure genome variation on an unprecedentedly large scale have propelled a new generation of genome-wide association studies. Evaluation of rare variants and full sequencing at large scale are rapidly becoming a reality. Also, we have seen the emergence of population-based biobanks in many countries with the objectives of quantifying longitudinally the joint influences of genetic and environmental factors on the occurrence of common diseases.
With all these ongoing developments, we have invited many authors who are leaders in the field to produce the second edition of Human Genome Epidemiology. Our aim is to inform readers of new developments in the genomics field and how epidemiologic methods are being used to make sense of this information. We do realize that the material presented in this book will be outdated even before it is published. However, the methodologic challenges and possible solutions to them will remain with us for quite some time. There is very little material remaining from the first edition of Human Genome Epidemiology.
This new edition is divided into five parts. In Part I, we revisit the fundamentals of human genome epidemiology. We first give an overview of the development and progress in applications of genomic technologies with a focus on genomic sequence variation (Chapter 2). We then give an overview of the multidisciplinary field of public health genomics that includes a fundamental role of epidemiologic methods and approaches (Chapter 3). We also present a brief overview of evolving methods for tracking and compiling information on genetic factors in disease (Chapter 4).
In Part II, we discuss methodologic developments in collection, analysis, and synthesis of data from human genome epidemiologic studies. We discuss the emergence of biobanks around the world (Chapter 5), the evolution of case-control studies and cohort studies in the era of genome-wide association studies (GWAS) (Chapter 6), and the emerging role of consortia and networks (Chapter 7). Next, we discuss methodologic analytic issues in GWAS (Chapter 8) and the analytic challenges of gene-gene and gene-environment interaction (Chapter 9). We then address issues of reporting of genetic associations (Chapter 10), evolving methods for integrating the evidence (Chapter 11), as well as assessment of cumulative evidence and field synopses (Chapter 12).
In Part III, we provide several case studies that attempt to present an evolving knowledge base of the cumulative evidence on genetic variation in a variety of human diseases. As the information undoubtedly will change (even before the publication of the book), we stress here the importance of a strong methodologic foundation for analysis and synthesis of information from various studies. The diseases shown in this section include three cancers: colorectal cancer (Chapter 13), childhood leukemia (Chapter 14), and bladder cancer (Chapter 15). We also present data from type 2 diabetes (Chapter 16), osteoporosis (Chapter 17), preterm birth (Chapter 18), coronary heart disease (Chapter 19), and schizophrenia (Chapter 20). Collectively, these chapters cover an impressive array of common complex human diseases and provide an epidemiologic approach to rapidly emerging data on gene-disease and gene-environment interactions.
In Part IV, we discuss methodologic issues surrounding specific applications of human genomic information for medicine and public health. We start in Chapter 21 with a review of the concept of Mendelian Randomization, an approach that allows us to assess the role of environmental factors and other biomarkers in the occurrence of human diseases using data on the association of genetic variation and disease endpoints. In Chapter 22, we discuss how clinical epidemiologic concepts and methods can be used to assess whether or not one or more genetic variants (e.g., genome profiles) can be used to predict risk for human diseases. Chapter 23 presents a major milestone for public health genomics, namely the publication of methods of systematic review and assessment of the clinical validity and utility of genomic applications in clinical practice. This chapter is a reprint of the published paper from the independent multidisciplinary panel, the EGAPP working group, supported by CDC and many partners. Chapter 24 briefly summarizes how reviews of the evidence on validity and utility of genomic information can be done systematically and rapidly, even in the face of incomplete information. Chapter 25 focuses on the crucial role of the behavioral and social sciences in assessing the impact and value of epidemiologic information on gene-disease associations. Chapter 26 addresses issues in evaluating developments in newborn screening. Chapter 27 provides an epidemiologic framework for the evaluation of pharmacogenomic applications in clinical and public health practice. Chapter 28 presents an overview of the relevance and impact of epigenomics in clinical practice and disease prevention. Finally, Chapter 29 presents an epidemiologic framework for evaluating family health history as a tool for disease prevention and health promotion. Even in this genomics era, family history remains a strong foundation, not only for identifying single gene disorders, but also for stratifying individuals and populations by different levels of disease risk and implementing personalized interventions.
Finally, in Part V of the book, we present a few case studies of the application of epidemiologic methods of assessment of clinical validity and utility for several disease examples. These include two pharmacogenomic testing examples: initial treatment of depression with SSRIs (Chapter 30) and warfarin therapy (Chapter 31). We also present information on population screening for hereditary hemochromatosis (Chapter 32), a genetic disorder with incomplete penetrance that has attracted some attention over the past decade as a possible example of population screening in the genomics era.
The second edition of Human Genome Epidemiology is primarily targeted to basic, clinical, and population scientists involved in studying genetic factors in common diseases. In addition, the book focuses on practical applications of human genome variation in clinical practice and disease prevention. We hope that students, clinicians, public health professionals, and policy makers will find the book useful in learning about evolving epidemiologic methods for approaching the discovery and the use of genetic information in medicine and public health in the twenty-first century.
Cambridge JH
Ottawa JL 2009
Contributors
PART I Fundamentals of Human Genome
Epidemiology Revisited
Muin J Khoury, Sara R Bedrosian, Marta Gwinn,
Julian Little, Julian P T Higgins, and John P A Ioannidis
Jesus Gonzalez-Bosquet and Stephen J Chanock
Philippa Brice and Ron Zimmern
4 Navigating the Evolving Knowledge of Human Genetic
Marta Gwinn and Wei Yu
PART II Methods and Approaches for Data
Collection, Analysis, and Integration
5 The Global Emergence of Epidemiological Biobanks: Opportunities
Paul R Burton, Isabel Fortier, and Bartha M Knoppers
6 Case-Control and Cohort Studies in the Age of Genome-wide
Associations
Teri Manolio
7 The Emergence of Networks in Human Genome
Daniela Seminara, Muin J Khoury, Thomas R O’Brien,
Teri Manolio, Marta Gwinn, Julian Little, Julian P T Higgins,
Jonine L Bernstein, Paolo Boffetta, Melissa L Bondy,
Molly S Bray, Paul E Brenchley, Patricia A Buffler,
Juan Pablo Casas, Anand P Chokkalingam, John Danesh,
George Davey Smith, Siobhan M Dolan, Ross Duncan,
Nelleke A Gruis, Mia Hashibe, David J Hunter, Marjo-Riitta Jarvelin, Beatrice Malmer, Demetrius M Maraganore, Julia A Newton-Bishop,
Elio Riboli, Georgia Salanti, Emanuela Taioli, Nic Timpson,
André G Uitterlinden, Paolo Vineis, Nick Wareham, Deborah M Winn, Ron Zimmern, and John P A Ioannidis
8 Design and Analysis Issues in Genome-wide Association
Studies
Duncan C Thomas
9 The Challenge of Assessing Complex Gene–Environment
Peter Kraft and David J Hunter
10 STrengthening the REporting of Genetic Association Studies
Julian Little, Julian P T Higgins, John P A Ioannidis,
David Moher, France Gagnon, Erik von Elm, Muin J Khoury,
Barbara Cohen, George Davey Smith, Jeremy Grimshaw,
Paul Scheet, Marta Gwinn, Robin E Williamson, Guang Yong Zou,
Kimberley Hutchings, Candice Y Johnson, Valerie Tait,
Miriam Wiens, Jean Golding, Cornelia M van Duijn,
John McLaughlin, Andrew Paterson, George Wells, Isabel Fortier,
Matthew Freedman, Maja Zecevic, Richard A King,
Claire Infante-Rivard, Alexandre Stewart, and Nick Birkett
11 Integration of the Evidence on Gene-Disease Associations:
Julian P T Higgins and Julian Little
12 Genome-wide Association Studies, Field Synopses, and the
Development of the Knowledge Base on Genetic Variation
Muin J Khoury, Lars Bertram, Paolo Boffetta, Adam S Butterworth,
Stephen J Chanock, Siobhan M Dolan, Isabel Fortier,
Montserrat Garcia-Closas, Marta Gwinn, Julian P T Higgins,
A Cecile J W Janssens, James M Ostell, Ryan P Owen,
Roberta A Pagon, Timothy R Rebbeck, Nathaniel Rothman,
Jonine L Bernstein, Paul R Burton, Harry Campbell,
Anand P Chokkalingam, Helena Furberg, Julian Little,
Thomas R O’Brien, Daniela Seminara, Paolo Vineis,
Deborah M Winn, Wei Yu, and John P A Ioannidis
PART III Case Studies: Cumulative Assessment of the Role of Human Genome Variation in Specific Diseases
Harry Campbell, Steven Hawken, Evropi Theodoratou, Alex Demarsh, Kimberley Hutchings, Candice Y Johnson, Lindsey Masson, Linda Sharp, Valerie Tait, and Julian Little
Anand P Chokkalingam and Patricia A Buffler
Jonine D Figueroa, Montserrat Garcia-Closas,
and Nathaniel Rothman
Eleftheria Zeggini and Mark I McCarthy
André G Uitterlinden, Joyce B J van Meurs,
and Fernando Rivadeneira
Siobhan M Dolan
Adam S Butterworth, Julian P T Higgins,
Nadeem Sarwar, and John Danesh
21 Mendelian Randomization: The Contribution of Genetic Epidemiology
George Davey Smith and Shah Ebrahim
22 Evaluation of Predictive Genetic Tests for Common Diseases:
A Cecile J W Janssens, Marta Gwinn, and Muin J Khoury
23 The Evaluation of Genomic Applications in
Practice and Prevention (EGAPP) Initiative:
Steven M Teutsch, Linda A Bradley, Glenn E Palomaki,
James E Haddow, Margaret Piper, Ned Calonge, W David Dotson,
Michael P Douglas, and Alfred O Berg
James M Gudgeon, Glenn E Palomaki, and Marc S Williams
25 Role of Social and Behavioral Research in Assessing the Utility of
Saskia C Sanderson, Christopher H Wade, and Colleen M McBride
26 Assessing the Evidence for Clinical Utility in Newborn Screening
Rodolfo Valdez, Muin J Khoury, and Paula W Yoon
PART V Case Studies: Assessing the Use of Genetic Information in Practice for Specific Diseases
Iris Grossman, Mugdha Thakur, and David B Matchar
31 A Rapid-ACCE Review of CYP2C9 and VKORC1 Allele Testing to
Inform Warfarin Dosing in Adults at Elevated Risk for Thrombotic
Monica R McClain, Glenn E Palomaki, Margaret Piper, and
James E Haddow
32 Hereditary Hemochromatosis: Population Screening for Gene Mutations
Diana B Petitti
Index
Sara R Bedrosian, BA, BFA
McKing Consulting Corporation
Office of Public Health Genomics
Centers for Disease Control and
Providence, RI
Molly S Bray, PhD
Center for Human Genetics Institute of Molecular Medicine and School of Public Health
University of Texas Houston, TX
Paul E Brenchley, PhD
Renal Research Laboratories Manchester Institute of Nephrology and Transplantation
Royal Infirmary Manchester, United Kingdom
Philippa Brice, PhD
Foundation for Genomics and Population Health (PHG Foundation) Cambridge, United Kingdom
Patricia A Buffler, PhD, MPH
Division of Epidemiology University of California Berkeley School of Public Health Berkeley, CA
Paul R Burton, MD
Department of Health Sciences University of Leicester Leicester, United Kingdom
UK HuGENet Coordinating Centre
Cambridge, United Kingdom
Public Health Sciences
College of Medicine and Vet Medicine
University of Edinburgh
Edinburgh, United Kingdom
Juan Pablo Casas, MD
Department of Epidemiology and
Laboratory of Translational Genomics
Division of Cancer Epidemiology and
School of Public Health
University of California at Berkeley
Berkeley, CA
Barbara Cohen, PhD
Former Senior Editor Public Library of Science San Francisco, CA
John Danesh, MD, MBChB, MSc, DPhil, FRCP
Department of Public Health and Primary Care
University of Cambridge Cambridge, United Kingdom
George Davey Smith, MD, DSc, FRCP, F Med Sci
MRC Centre for Causal Analyses in Translational Epidemiology Department of Social Medicine University of Bristol
Bristol, United Kingdom
Alex Demarsh, MSc
Department of Epidemiology and Community Medicine
University of Ottawa Ottawa, ON, Canada
Siobhan M Dolan, MD, MPH
Albert Einstein College of Medicine Montefiore Medical Center Bronx, NY
W David Dotson, PhD
Office of Public Health Genomics Centers for Disease Control and Prevention
Atlanta, GA
Michael P Douglas, MS
McKing Consulting Corporation Office of Public Health Genomics Centers for Disease Control and Prevention
Atlanta, GA
Cornelia M van Duijn, PhD
Department of Epidemiology
Erasmus University Medical Center
Rotterdam, The Netherlands
Ross Duncan, PhD, MA
Department of Dermatology
Leiden University Medical Center
Leiden, The Netherlands
Shah Ebrahim, MSc, DM,
FRCP, FFPHM
London School of Hygiene and Tropical
Medicine
London, United Kingdom
Erik von Elm, MD, MSc
Institute of Social and Preventive
Medicine
University of Bern
Bern, Switzerland
and
German Cochrane Centre
Department of Medical Biometry and
National Cancer Institute
Department of Health and Human
Paediatric and Perinatal Epidemiology Bristol, United Kingdom
Jesus Gonzalez-Bosquet,
MD, PhD
Laboratory of Translational Genomics Division of Cancer Epidemiology and Genetics
National Cancer Institute, National Institutes of Health
Bethesda, MD
Jeremy Grimshaw, MBChB, PhD, FRCGP
Canada Research Chair in Health Knowledge Transfer and Uptake Clinical Epidemiology Program Ottawa Health Research Institute Department of Medicine University of Ottawa Ottawa, ON, Canada
Clinical Genetics Institute
Salt Lake City, UT
Marta Gwinn, MD, MPH
Office of Public Health Genomics
Centers for Disease Control and
Gene–Environment Epidemiology Group
International Agency for Research on
Claire Infante-Rivard,
MD, PhD
Department of Epidemiology, Biostatistics, and Occupational Health Faculty of Medicine
McGill University Montreal, QC, Canada
Center for Genetic Epidemiology and Modeling
Department of Medicine Tufts University School of Medicine Boston, MA
A Cecile J W Janssens, PhD
Department of Epidemiology
Erasmus University Medical Center
Rotterdam, The Netherlands
Office of Public Health Genomics
Centers for Disease Control and
Centre of Genomics and Policy Department of Human Genetics McGill University
Bethesda, MD
Demetrius M Maraganore, MD
Department of Neurology Mayo Clinic
Rochester, MN
Duke-NUS Graduate Medical School
Program in Health Services Research
Singapore
Colleen M McBride, PhD
Social and Behavioral Research Branch
National Human Genome Research
Institute
Washington, DC
Mark I McCarthy, MD,
FRCP, FMedSci
Oxford Centre for Diabetes,
Endocrinology and Metabolism
Division of Medical Screening
Women & Infants Hospital
Prosserman Centre for Health Research
at the Samuel Lunenfeld Research Institute
Toronto, ON, Canada
Joyce B J van Meurs, PhD
Department of Internal Medicine Erasmus MC
Rotterdam, The Netherlands
David Moher, PhD
Department of Epidemiology and Community Medicine
University of Ottawa Ottawa, ON, Canada
Julia A Newton-Bishop, PhD
Genetic Epidemiology Division CR-UK Clinical Centre Leeds, United Kingdom
National Library of Medicine, NIH Bethesda, MD
Ryan P Owen, PhD
PharmGKB Genetics Department Stanford University Stanford, CA
Genetics of Complex Diseases
Hospital for Sick Children (SickKids)
Toronto, ON, Canada
Blue Cross Blue Shield Association
Technology Evaluation Center
Departments of Internal Medicine and Epidemiology
Erasmus MC Rotterdam, The Netherlands
Nathaniel Rothman, MD, MPH, MHS
Division of Cancer Epidemiology and Genetics
National Cancer Institute Bethesda, MD
Georgia Salanti, PhD
School of Medicine and Biomedical Research Institute
University of Ioannina Ioannina, Greece
Nadeem Sarwar, MPhil, PhD
Department of Public Health and Primary Care
University of Cambridge Cambridge, United Kingdom
Saskia C Sanderson, PhD
Genetics and Genomic Sciences Mount Sinai School of Medicine New York, NY
Paul Scheet, PhD
MD Anderson Cancer Center Department of Epidemiology University of Texas
Houston, TX
Daniela Seminara, PhD, MPH
Epidemiology and Genetics Research Program
Division of Cancer Control and Population Sciences
National Cancer Institute, NIH Bethesda, MD
Linda Sharp, PhD
National Cancer Registry (NCR)
Cork, Ireland, United Kingdom
Alexandre Stewart, PhD,
BScH, MSc
University of Ottawa Heart Institute
Ottawa, ON, Canada
Emanuela Taioli, MD, PhD
University of Pittsburgh Cancer Institute
University of Pittsburgh Medical Center
Verna Richter Chair in Cancer Research
Department of Preventive Medicine
University of Southern California
Rodolfo Valdez, PhD, MSc
Office of Public Health Genomics Centers for Disease Control and Prevention
Atlanta, GA
David L Veenstra, PhD, PharmD
Pharmaceutical Outcomes Research and Policy Program, and Institute for Public Health Genetics
University of Washington Seattle, WA
Mukesh Verma, PhD
Methods and Technologies Branch Epidemiology and Genetics Research Program
Division of Cancer Control and Population Sciences
National Cancer Institute (NCI) National Institutes of Health (NIH) Bethesda, MD
Paolo Vineis, MD, MPH
Environmental Epidemiology Imperial College
London, United Kingdom
Christopher H Wade, PhD,
MPH
Social and Behavioral Research Branch
& Genome Technology Branch
National Human Genome Research
Elsie Widdowson Laboratories
Cambridge, United Kingdom
George Wells, MSc, PhD
Cardiovascular Research Methods
Centre
University of Ottawa Heart Institute
Ottawa, ON, Canada
Clinical Genetics Institute
Salt Lake City, UT
Atlanta, GA
Maja Zecevic, PhD, MPH
Lancet New York, NY
Ron Zimmern, MA, FRCP, FFPHM
Foundation for Genomics and Population Health (PHG Foundation) Cambridge, United Kingdom
Guang Yong Zou, PhD
Department of Epidemiology and Biostatistics
University of Western Ontario London, ON, Canada and
Robarts Clinical Trials Robarts Research Institute London, ON, Canada
Eleftheria Zeggini, PhD
Wellcome Trust Centre for Human Genetics
University of Oxford Oxford, United Kingdom and
Wellcome Trust Sanger Institute Wellcome Trust Genome Campus Cambridge, United Kingdom
FUNDAMENTALS OF HUMAN GENOME EPIDEMIOLOGY REVISITED
1
Human genome epidemiology:
the road map revisited
Muin J Khoury, Sara R Bedrosian, Marta Gwinn,
Julian Little, Julian P T Higgins, and John P A Ioannidis
In 2004, we published the book entitled Human Genome Epidemiology: A Scientific Foundation for Using Genetic Information to Improve Health and Prevent Disease (1). In it, we discussed how the epidemiologic approach provides an important scientific foundation for studying the continuum from gene discovery to the development, applications, and evaluation of human genome information in improving health and preventing disease. We called this continuum human genome epidemiology (or HuGE) to denote an evolving field of inquiry that uses epidemiologic applications to assess the population impact of human genetic variation on health and disease, and how the resulting information can be used to improve population health. We discussed and gave examples that illustrated that after the discovery of genetic variants associated with diseases, additional well-conducted epidemiologic studies are needed to characterize the population impact of gene variants on the risk for adverse health outcomes and to identify and measure the impact of modifiable risk factors that interact with gene variants. Epidemiologic studies are also required for evaluating the clinical validity and utility of new genetic tests, monitoring population use of genetic tests, and determining the impact of genetic information on the health and well-being of different populations. The results of such studies will help medical and public health professionals integrate human genomics into practice.
The Rationale for a Second Edition of
Human Genome Epidemiology
Since 2004, advances in human genomics have continued to occur at a breathtaking pace. Although the concept of personalized healthcare and disease prevention often promised by enthusiastic scientists and the media is yet to be fulfilled, we are now seeing rapid progress and accumulation of data in many “omics”-related research fields such as transcriptomics, proteomics, and metabolomics (2). Results of the International HapMap project were published in 2005 (3), paving the way to more efficient methods to discover human genetic variations associated with a variety of common diseases of public health significance. New methods to measure genome variation on an unprecedentedly large scale (hundreds of thousands of genetic variants) have propelled a new generation of genome association studies (4). Evaluation of rare variants and full sequencing at large scale are rapidly becoming a reality. Also, we have seen the emergence of population-based biobanks in many countries with the objectives of quantifying longitudinally the joint influences of genetic and environmental factors on the occurrence of common diseases (5).
Perhaps the single most important development in human genome epidemiology has been the emergence of genome-wide association studies (GWAS; 6). The continuous improvements in genome-wide analysis technologies, coupled with drastic reductions in price, have led to widespread applications of these technologies in large collaborative case-control, cross-sectional, and cohort studies. These studies have interrogated agnostically, without a priori hypotheses, variation in the whole genome, looking for differences in the distribution of genetic polymorphisms between individuals with and without disease. As of August 2009, more than 400 gene variants have been discovered and replicated as risk markers (but not necessarily true culprits) for a variety of common diseases of public health significance (7). As a result, we are seeing an unprecedented expansion in the number of publications of GWAS as well as studies of candidate genes with varying methodological quality. While the deposition of GWAS data in potentially accessible databases (8,9) could lead to avoidance of selective publication, protection from other biases (e.g., selection, confounding, misclassification) is still a real concern even with large GWA studies that are based on selected or noncomparable samples of cases and controls. In addition, new technology such as full genomic sequencing is likely to replace the current genome-wide SNP analysis platforms. Furthermore, we are seeing the emergence of the novel approaches of systems biology, as well as the development of biomarkers based on gene expression profiles, epigenetic patterns, proteomic profiles, and so on. Each new development taxes our ability to make sense of the ever-increasing amount of data. We must continue to develop, apply, and sharpen our epidemiological approaches to study designs, analysis, interpretation, and knowledge synthesis.
From Gene Discovery to Clinical
and Public Health Applications
The ongoing success of GWAS in uncovering genetic risk markers for many common diseases has renewed expectations of a new era of health care and public health practice (6,10,11). Already, we have a few examples of applications in clinical medicine and population health (see Table 1.1 for emerging examples). By and large, emerging applications are relatively rare in spite of the rapid advances in gene discovery, and for many of them, their benefits and cost-effectiveness are not well known. Therefore, there is an urgent need to understand the benefits and harms and to ensure high-quality implementation of new technologies (12). This includes improving the evidence base of outcomes of these technologies; the development of evidence-based guidelines for the use of genomic applications (13); the use of policy and legislation to prevent discrimination on the basis of genetic information (14); and the effective engagement of providers, researchers, and the general public. More recently, “direct to consumer” (DTC) offerings of genome-wide profiles have been developed and marketed by several companies, with the implicit, if not explicit, goal of providing information for improving individual health and preventing common diseases (15). The ready availability and complexity of these new DTC tests could strain the ability of consumers and the health care delivery system to determine the true value of applying extensive quantities of genomic data to health management. Proponents of DTC genome-wide profiles feel strongly that this approach can empower and educate individuals about disease prevention and health promotion. Others are concerned that the use of genome-wide profiles is based on an incomplete knowledge about the relationship between genetic variations and human diseases, and the lack of a full understanding of the optimal specific medical or lifestyle interventions that should be offered based on these test results (16). Questions also remain regarding the scope of individual genetic tests that should be included in genomic profiles, whether the underlying technologies are robust, and where the balance lies between potential benefits and harms (clinical utility) of these tests to individuals and populations (16,17). A 2007 report found several limitations in the existing US-based research and healthcare delivery infrastructure to create an evidence base of utilization and outcomes of gene-based applications (18). In addition, providers and the public have little understanding of genomics and genomics services (10). Overcoming these limitations would require coordinating efforts that span multiple disciplines of laboratory sciences, medicine and public health, including health services research, and outcomes research. The epidemiologic approach is at the intersection of all these disciplines.
The Emergence of Public Health Genomics
In the face of evolving technologies, we have witnessed in the past few years the emergence of “public health genomics,” a multidisciplinary field concerned with the effective and responsible translation of genome-based knowledge and technologies to improve population health. This field is thriving in many countries and uses epidemiologic methods as a foundation for knowledge integration of genetic information in medicine and public health (19–21). Public health genomics uses population-based data on genetic variation and gene-environment interactions to develop, implement, and evaluate evidence-based tools for improving health and preventing disease. Public health genomics also applies systematic, evidence-based assessments of genomic applications in health practice and works to ensure the delivery of validated, useful genomic tools in practice.
Table 1.1 Examples of emerging applications of human genome discoveries for clinical practice and disease prevention
Type of Application Examples of Proposed Applications
Even with impressive advances in the basic sciences of gene discovery and characterization, reservations have been voiced about the potential benefits of medical applications of genomics; these reservations are based in part on the complex relationship between genetic variation and the environment with disease occurrence, as reflected in the modest associations between individual gene variants and disease outcomes, and the limited clinical validity and utility of using genetic information in the prediction of disease. Moreover, prematurely optimistic claims by researchers, the media, test developers, and commercial genomic enterprises may lead to unrealistic expectations among consumers and inappropriate use of genetic information. Also, an overemphasis on the genetics of human disease may divert attention from the importance of environmental exposures, social structure, and lifestyle factors (22). In public health practice, skepticism about genomics runs high among some practitioners whose traditional domains are the control of infectious diseases, environmental exposures, and health promotion for chronic disease prevention. To some, genomics research is perceived as a low-yield investment, as well as an opportunity cost, undercutting social efforts to address environmental causes of ill health. To others, public health applications of genomics are viewed only in terms of population screening, remaining limited to newborn screening programs (23). Still others reject genomics research as an unwarranted extension of the individual risk paradigm (24), citing the distinction between prevention in populations and in high-risk persons set out by Geoffrey Rose in 1985 (25). However, Rose was careful to present these approaches as complementary rather than mutually exclusive (25).
It can be argued that the integration of genomics into healthcare and disease prevention requires a strong medicine–public health partnership (26). Public health and health care often operate in different spheres, although medicine is part of the “public health system” (27). This “schism” can be overcome in genomics using a population approach to a joint translational agenda that includes (a) a focus on prevention, a traditional public health concern that is now a promise of genomics in the realm of personalized medicine; (b) a population perspective that requires a large amount of population-level data to validate gene discoveries for clinical and population-level applications, especially given the modest associations between genetic factors and disease burden; (c) commitments to evidence-based knowledge synthesis and guideline development, especially with thousands of potential genomic applications emerging into practice; and (d) emphasis on health services research and the surveillance of population health to evaluate health outcomes, costs, and benefits in the “real world” (27).
Epidemiology and the Phases of Genomics Translation
As shown in Table 1.2, there are four phases of translation research in genomics, from gene discovery to population health impact (28). In addition to traditional genetic epidemiology, which has focused by and large on gene discovery, epidemiologic methods and approaches play a role in all four phases (see Table 1.2). Phase 1 (T1) research seeks to move a basic genome-based discovery into a candidate health application (e.g., genetic test/intervention). Phase 2 (T2) research assesses the value of a genomic application for health practice, leading to the development of evidence-based guidelines. Phase 3 (T3) research attempts to move evidence-based guidelines into health practice, through delivery, dissemination, and diffusion research. Phase 4 (T4) research seeks to evaluate the “real world” health outcomes of a genomic application in practice. Because the development of evidence-based guidelines is a moving target, the types of translation research can overlap and provide feedback loops to allow integration of new knowledge. Although it is difficult to quantify how much of human genomics research is T1, we have estimated that no more than 3% of published research focuses on T2 and beyond (28). Indeed, evidence-based guidelines and T3 and T4 research currently are rare (except in newborn screening, and selected testing for genetic disorders such as hereditary breast and ovarian cancer).
Table 1.2 Human genome epidemiology and the phases of genomics translation: examples and application
candidate health application.
Phases 1 and 2 clinical trials; observational studies.
What is the magnitude of the association between genetic variants and disease risks? Is there gene-environment interaction?
Phase 4 clinical trials.
What proportion of individuals who meet guidelines criteria receive recommended care and what are the barriers to implementing practice guidelines?
population health impact.
Outcomes research (includes many disciplines); population monitoring of morbidity, mortality, benefits and risks.
Does implementation of practice guidelines reduce disease incidence/improve outcomes?
Source: Adapted from Reference 28.
The Continued Need for Methodological Standards in Human Genome Epidemiology
Thus, the need for making sense of the avalanche of genetic and genomic data is more urgent than ever. This urgency is behind the continued growth of the Human Genome Epidemiology Network (HuGENet), a global collaboration of individuals and organizations who are interested in accelerating the development of the knowledge base on human genetic variation and population health and the use of this information in improving health and preventing disease (29). HuGENet has focused on developing methods and guidance to integrate and disseminate a global knowledge base on assessing the prevalence of genetic variants in different populations, genotype-disease associations, and gene-gene and gene-environment interactions, and evaluating genetic tests for screening and prevention. During the past three years, HuGENet has made many methodological and substantive contributions to the field. HuGENet has developed a Web-based searchable knowledge base (the HuGE Navigator) that captures ongoing publications in human genome epidemiology (30). The HuGE Navigator is searchable by disease, gene, and disease risk factors. Furthermore, in collaboration with several journals, HuGENet has sponsored systematic reviews of the evidence on genotype-disease associations, using specific published guidelines and recommendations (the HuGENet handbook; 31) for carrying out this work, as well as for applying quantitative methods of synthesis. Since 2000, HuGENet collaborators have carried out more than 80 reviews on various diseases ranging from single gene conditions to common complex diseases. In 2005, HuGENet formed a network of investigator networks (32), which currently has 35 consortia, mostly disease-specific networks that are represented by hundreds of collaborators interested in sharing knowledge, experience, and resources in the conduct, analysis, and dissemination of results of human genome epidemiology investigations. In 2006, HuGENet conducted a workshop in collaboration with the global movement STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) to extend the now well-studied “STROBE reporting checklist” to include genetic associations, under the rubric of STREGA (STrengthening the REporting of Genetic Associations; 33). In addition, the HuGENet “network of networks” published a “road map” for using consortia-driven pooled meta-analyses to accelerate the knowledge base on gene-disease associations (34). With the publication of the HuGENet roadmap, the editors of Nature Genetics called for the development and online publication of peer-reviewed, curated expert knowledge bases called “field synopses” that are regularly updated and freely accessible (35). HuGENet implemented the field synopsis concept in a meeting held in 2006 in Venice (36). The workshop participants generated interim guidelines for grading the cumulative evidence in genetic associations based on three criteria: (1) the amount of evidence; (2) the extent of replication; and (3) protection from bias. The proposed scheme allows for three categories of descending credibility for each of these criteria and also for a composite assessment of “strong,” “moderate,” or “weak” credibility (36).
In 2008, HuGENet collaborators conducted a workshop to discuss insights and experiences from several field synopses that represented the first efforts by multiple authors at grading the credibility of these associations on a massive scale. HuGENet participants emerged with a vision for collaboration that builds a reliable body of cumulative evidence for genetic associations and a transparent, distributed, and authoritative knowledge base on genetic variation and human health (37).
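The three-criterion grading scheme described above can be illustrated with a small sketch. The text states only that each criterion has three categories of descending credibility and that a composite of “strong,” “moderate,” or “weak” is derived; the letter grades A, B, and C and the combination rule used here (strong only when all three grades are A, weak when any grade is C, moderate otherwise) are assumptions made for illustration, and the function name is hypothetical.

```python
def composite_credibility(amount, replication, bias_protection):
    """Combine three per-criterion grades into one composite label.

    Each argument is "A", "B", or "C" (A = highest credibility).
    The combination rule below is an assumption for illustration:
    all A -> "strong", any C -> "weak", everything else -> "moderate".
    """
    grades = (amount, replication, bias_protection)
    for grade in grades:
        if grade not in ("A", "B", "C"):
            raise ValueError(f"unknown grade: {grade!r}")
    if all(grade == "A" for grade in grades):
        return "strong"
    if any(grade == "C" for grade in grades):
        return "weak"
    return "moderate"


# Hypothetical grades for three gene-disease associations.
for grades in [("A", "A", "A"), ("A", "B", "A"), ("A", "A", "C")]:
    print(grades, composite_credibility(*grades))
```

Under this assumed rule a single poor grade dominates the composite, which reflects the idea that abundant evidence cannot compensate for weak protection from bias.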
The HuGE Roadmap Revisited
With all these ongoing developments, we have invited many authors who are leaders in the field to produce the second edition of Human Genome Epidemiology. Our aim is to inform readers of new developments in the genomics field and how epidemiologic methods are being used to make sense of this information. We do realize that the material presented in this book will be outdated even before it is published. However, the methodological challenges and possible solutions to them will remain with us for quite some time. There is very little material remaining from the first edition of Human Genome Epidemiology.
This new edition is divided into five parts. In Part I, we give an overview of the development and progress in applications of genomic technologies, with a focus on genomic sequence variation (Chapter 2). We then give an overview of the multidisciplinary field of public health genomics that includes a fundamental role of epidemiologic methods and approaches (Chapter 3). We also present a brief overview of evolving methods for tracking and compiling information on genetic factors in disease (Chapter 4).
In Part II, we discuss methodological developments in collection, analysis, and synthesis of data from human genome epidemiologic studies. We discuss the emergence of biobanks around the world (Chapter 5), the evolution of case-control studies and cohort studies in the era of GWAS (Chapter 6), and the emerging role of consortia and networks (Chapter 7). Next, we discuss methodological analytic issues in GWAS (Chapter 8) and the analytic challenges of gene-gene and gene-environment interaction (Chapter 9). We then address issues of reporting of genetic associations (Chapter 10), evolving methods for integrating the evidence (Chapter 11), and assessment of cumulative evidence and field synopses (Chapter 12).
In Part III, we provide several case studies related to various diseases that attempt to present an evolving knowledge base of the cumulative evidence on genetic variation in a variety of human diseases. As the information undoubtedly will change (even before the publication of the book), we stress here the importance of a strong methodological foundation for analysis and synthesis of information from various studies. The diseases shown in this section include three cancers: colorectal cancer (Chapter 13), childhood leukemia (Chapter 14), and bladder cancer (Chapter 15). We also present data from type 2 diabetes (Chapter 16), osteoporosis (Chapter 17), preterm birth (Chapter 18), coronary heart disease (Chapter 19), and schizophrenia (Chapter 20). Collectively, these chapters cover an impressive array of common complex human diseases and provide an epidemiologic approach to rapidly emerging data on gene-disease and gene-environment interactions.
In Part IV, we discuss methodological issues surrounding specific applications of human genomic information for medicine and public health. We start in Chapter 21 with a review of the concept of Mendelian Randomization, an approach that allows us to assess the role of environmental factors and other biomarkers in the occurrence of human diseases using data on the association of genetic variation and disease endpoints. In Chapter 22, we discuss how clinical epidemiologic concepts and methods can be used to assess whether one or more genetic variants (e.g., genome profiles) can be used to predict risk for human diseases. Chapter 23 presents a major milestone for public health genomics, namely the publication of methods of systematic review and assessment of the clinical validity and utility of genomic applications in clinical practice. This chapter is a reprint of the published paper from the independent multidisciplinary panel, the EGAPP working group, sponsored by CDC and many partners. Chapter 24 briefly summarizes how reviews of the evidence on validity and utility of genomic information can be done systematically and rapidly, even in the face of incomplete information. Chapter 25 focuses on the crucial role of the behavioral and social sciences in assessing the impact and value of epidemiologic information on gene-disease associations. Chapter 26 addresses issues in evaluating developments in newborn screening. Chapter 27 provides an epidemiologic framework for the evaluation of pharmacogenomic applications in clinical and public health practice. Chapter 28 presents an overview of the relevance and impact of epigenomics in clinical practice and disease prevention. Finally, Chapter 29 presents an epidemiologic framework for evaluating family health history as a tool for disease prevention and health promotion. Even in this genomics era, family history remains a strong foundation, not only for identifying single gene disorders, but also for stratifying individuals and populations by different levels of disease risk and implementing personalized interventions.
Finally, in Part V of the book, we present a few case studies of the application of epidemiologic methods of assessment of clinical validity and utility for several disease examples. These include two pharmacogenomic testing examples: initial treatment of depression with SSRIs (Chapter 30) and warfarin therapy (Chapter 31). We also present information on population screening for hereditary hemochromatosis (Chapter 32), a genetic disorder with incomplete penetrance that has attracted some attention over the past decade as a possible example of population screening in the genomics era.
The second edition of Human Genome Epidemiology is primarily targeted at basic, clinical, and population scientists involved in studying genetic factors in common diseases. In addition, the book focuses on practical applications of human genome variation in clinical practice and disease prevention. We hope that students, clinicians, public health professionals, and policy makers will find the book useful in learning about evolving methods for approaching the discovery and the use of genetic information in medicine and public health in the twenty-first century.
References
Khoury MJ, Little J, Burke W, eds
Foundation for Using Genetic Information to Improve Health and Prevent Disease
New York: Oxford University Press; 2004.
Nature Omics Gateway. Available at http://www.nature.com/omics/. Accessed May 28, 2009.
International HapMap Consortium A haplotype map of the human genome
Knoppers BM Biobanking: international norms
Manolio TA, Brooks LD, Collins FS. A HapMap harvest of insights into the genetics of common disease. J Clin Invest 2008;118:1590–1605.
National Human Genome Research Institute-Office of Population Genomics A
at http://cgems.cancer.gov/ Accessed May 28, 2009.
Feero WG, Guttmacher AE, Collins FS The genome gets personal-almost
2008;299:1351–1352.
Department of Health and Human Services: personalized healthcare initiative. Available at http://www.hhs.gov/myhealthcare/. Accessed May 28, 2009.
Secretary’s Advisory Committee on Genetics, Health and Society US system of
Information Nondiscrimination Act of 2008 N Engl J Med 2008;358:2661–2663.
Hogarth S, Javitt G, Melzer D The Current Landscape for Direct-to-Consumer
wish N Engl J Med 2008;358:105–107.
Agency for Healthcare Research and Quality. Infrastructure to monitor utilization and outcomes of gene-based applications: an assessment. Available at http://effectivehealthcare.ahrq.gov/healthInfo.cfm?infotype=nr&ProcessID=63. Accessed May 28, 2009.
Burke W, Khoury MJ, Stewart A, et al. The path from genome-based research to population health: development of an international public health genomics network. Genet Med 2006;8:451–458.
Khoury MJ, Bowen S, Bradley LK, et al A decade of public health genomics in the
in a name Public Health Genomics 2009;12:1–3.
Buchanan AV, Weiss KM, Fullerton SM. Dissecting complex disease: the quest for the philosopher’s stone? Int J Epidemiol 2006;35:562–571.
Rockhill B. Theorizing about causes at the individual level while estimating effects at the population level: implications for prevention. Epidemiology 2005;16:124–129.
Holtzman NA What role for public health in genetics and vice versa?
medicine: how can we accelerate the appropriate integration of human genome discoveries into healthcare and disease prevention? Genet Med 2007;9:665–674.
Centers for Disease Control and Prevention The Human Genome Epidemiology Network
Ioannidis JPA, Bernstein J, Boffetta P, et al. A network of investigator networks in human genome epidemiology. Am J Epidemiol 2005;162:302–304.
STrengthening the REporting of Genetic Associations (STREGA)
interim guidelines Int J Epidemiol 2008;37:120–132.
HuGENet workshop 2008: Networks, genomewide association studies and the knowledge base on genetic variation and human health. Available at http://www.cdc.gov/genomics/hugenet/hugewkshp_jan08.htm. Accessed May 28, 2009.
2
Principles of analysis of germline genetics
Jesus Gonzalez-Bosquet and Stephen J Chanock
to complex human diseases and traits. Moreover, the age of the genomics revolution has spawned an opportunity to examine the interplay between environmental/lifestyle factors and genetic variation as well as the genetics of individual responses to medical intervention (e.g., pharmacogenomics) (4). A seminal step has been the characterization of common haplotypes in three continental populations, known as the International HapMap Project (http://www.hapmap.org); it has already reaped over 200 novel loci in the genome associated with human diseases/traits, primarily discovered by genome-wide association studies (GWAS) (5–8). Though these advances have focused on a component of genomic architecture, namely common genetic variants, parallel programs in comprehensive resequence analysis should yield a catalog of uncommon variants (1,000 genome project-HapMap3, http://www.hapmap.org/cgi-perl/gbrowse/hapmap3_B36) that will enable analysis of less common variants. In concert with the assessment of germline genetic variation, genomic characterization is underway using different platforms to integrate with gene expression data; these programs include the ENCODE (the ENCyclopedia Of DNA Elements) Project, which seeks to define functional elements (http://www.genome.gov/10005107) (9), and the Cancer Genome Atlas (TCGA), which is interrogating somatic and germline alterations in select cancers (10). Together, these new developments promise to accelerate the discovery and characterization of novel genomic mechanisms in human diseases and traits.
The age of genomics has ushered in a more ambitious approach toward scientific discovery, “team” science, in which resources and study populations are pooled to identify novel genetic markers (Figure 2.1). In this regard, GWAS survey thousands of the most common genetic variants across the genome, single nucleotide polymorphisms (SNPs), in an “agnostic manner” (in other words, unfettered by prior hypotheses) and require adequately powered follow-up studies for replication (11).
It is this latter point that is central to the search for moderate- to high-frequency low-penetrance variants associated with human diseases and traits (12). Efforts to replicate are necessary to guard against the large number of apparent false positives, which can be due to chance or methodological biases in study design and execution (11,13,14). The emergence of high-fidelity, highly parallel genotyping technologies makes possible what was unimaginable a few years before. The generation of dense data sets with millions of genotypes creates new statistical challenges that are as daunting as are the issues of archiving and storage. Careful delineation of responsibilities among a team of scientists is necessary to ensure quality control and stable analytical results.
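The concern about chance false positives can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes every tested variant is truly unassociated and that tests are independent, and the number of SNPs and the significance levels are hypothetical values rather than figures from any particular study. The Bonferroni correction shown is one common, conservative adjustment, not necessarily the approach used by the studies cited here.

```python
def expected_false_positives(n_tests, alpha):
    """Expected count of chance 'hits' when all n_tests null hypotheses are true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test p-value cutoff that bounds the family-wise error rate by family_alpha."""
    return family_alpha / n_tests


n_snps = 500_000  # hypothetical number of SNPs tested in one scan

# At a nominal 0.05 level, tens of thousands of SNPs would pass by chance alone.
print("expected chance hits at p < 0.05:", expected_false_positives(n_snps, 0.05))

# A Bonferroni-style cutoff is far more stringent than 0.05.
print("per-SNP cutoff:", bonferroni_threshold(n_snps))
```

Even with a stringent cutoff, independent replication remains the safeguard against bias, which no statistical threshold can remove.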
Until recently, the primary engine for gene discovery was the candidate approach, but it had yielded only a modicum of success (15). Usually, due to technical or budget constraints, a handful of genetic markers, either SNPs, microsatellites, or other markers, were chosen within one or several known genes (16). These genes were the “best bet,” usually based on prior knowledge drawn from either laboratory or published association studies. The markers were chosen because they satisfied one or more conditions: (a) known or putative function either altering the coding region or the regulation of the gene or genomic region, (b) prior functional evidence emanating from the laboratory or a prior association study, or (c) exploration of regions flanking a locus based on patterns of linkage disequilibrium. Other approaches included studies of families with high penetrance of certain complex diseases.
Previously, family linkage studies were used to identify rare genetic variants in high-penetrance susceptibility genes (17,18), but they failed to be informative on more common genetic variants with low to moderate effect (16). The majority of linkage analysis studies also used genetic markers other than SNPs for mapping. In their seminal paper, Risch and Merikangas argued, “that the method that has been used successfully (linkage analysis) to find major genes has limited power to detect genes of modest effect, but that a different approach (association studies) has far greater power, even if one needs to test every gene in the genome. Thus, the future of the genetics of complex diseases is likely to require large-scale testing by association analysis” (19).
Figure 2.1 Workflow of a genotyping study: The panel depicts critical steps in the execution of a successful, high-quality, high-throughput genotyping study, starting from the design of the study with either a candidate approach or a genome-wide association study (GWAS) approach, and followed with an efficient Laboratory Information Management System (LIMS), required to track samples and processes, as well as with quality control capabilities. Powerful, specially designed and highly scalable software (PLINK, GLU) is needed for the increasingly complex data output analysis. These processes may include principal component analysis (PCA), association analysis, and haplotype reconstruction and association.
Genetic Variation
Single Nucleotide Polymorphisms (SNPs)
The spectrum of human genetic variation is defined by both the frequency of polymorphisms, which can vary substantially between populations, and the size of the variants. Interestingly, the difference between any two single human genomes is less than 0.5%. The most common sequence variation in the genome is the stable substitution of a single base, known as a single nucleotide polymorphism (SNP), which, by definition, is observed in at least 1% of a population. The minor allele frequency (MAF) is the frequency of the less common allele at a locus in one particular population; current estimates are that there are at least 8–10 million SNPs with a MAF greater than 1% (20–22), and 5 million SNPs with a MAF greater than 10% (2,20). As Figure 2.2 shows, there are a greater number of SNPs with lower MAFs. Interestingly, the majority of SNPs with a MAF greater than 15–20% are common to all human populations (7,23); for instance, nearly 85% of the more than 1.5 million SNPs are common to European American, Han Chinese, and African American populations. A small subset of high-frequency SNPs, less than 10%, appears to be private to a single population, again suggesting the common ancestry of all (23).
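The definition of minor allele frequency above translates directly into a short calculation. The following is a minimal sketch, assuming a biallelic locus and genotypes encoded as two-letter strings such as "AG"; the function name, the encoding, and the sample counts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return (minor_allele, frequency) for one biallelic locus.

    genotypes: iterable of two-character strings, one per individual,
    e.g. "AA", "AG", "GG". Returns (None, 0.0) if only one allele is seen.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("this sketch handles biallelic loci only")
    if len(counts) < 2:
        return None, 0.0
    total = sum(counts.values())
    minor_allele, minor_count = min(counts.items(), key=lambda item: item[1])
    return minor_allele, minor_count / total


# Hypothetical sample of 100 individuals from one population.
sample = ["AA"] * 90 + ["AG"] * 9 + ["GG"] * 1
allele, maf = minor_allele_frequency(sample)
print(allele, maf)
print("meets the 1% SNP definition:", maf >= 0.01)
```

Because the MAF is specific to the population sampled, the same locus can fall above the 1% cutoff in one population and below it in another.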
Human genetic variation is greatly influenced by geography, with genetic differentiation between populations increasing with geographic distance, and genetic diversity decreasing with distance from Africa; populations of African ancestry have the greatest diversity, resulting in shorter segments of linkage disequilibrium