NIH Campus, Natcher Building, Room D
Bethesda, Maryland
William Sharrock, PhD, NIAMS
Stephen I. Katz, MD, PhD, NIAMS
Clifford Rosen, MD, Maine Center for Osteoporosis Research and Education and The Jackson Laboratory
In the past two years, the genome-wide association (GWA) approach to identifying genetic loci related to disease risk has matured from an intriguing concept to a widely used scientific tool. In a number of cases, novel insights have emerged from initial studies and, critically, have been confirmed by replication in additional cohorts. A comprehensive NIH policy has been developed, setting unprecedented expectations for data sharing. The NIH Database of Genotype and Phenotype (dbGaP) has been established, and already contains a large amount of data from a number of disease areas. For example, investigators supported by the NIAMS to study the genetics of psoriasis have contributed to dbGaP through participation in the Genetic Association Information Network (GAIN).
Initiatives such as GAIN and the ongoing Genes, Environment and Health Initiative (GEI) have explored the potential of GWA studies (GWAS) through broad, centrally managed competitions. The GWAS area is now in transition to a mode in which many GWAS efforts are likely to be proposed in unsolicited applications, and will be considered for direct support by individual NIH Institutes and Centers (ICs). There are many areas of the NIAMS mission in which GWAS could furnish critical new insights. However, the nature and scale of GWAS-based investigations may require the NIAMS to take specific steps or develop new policies to ensure that the potential of this approach is realized in NIAMS mission areas.
Successful GWAS examples
Recent advances in GWAS have borne out the need for large population samples, to provide sufficient statistical power for this approach. The design of a study must not only consider the demands of the initial genome-wide scan, but, because initial scans yield many false positives, also provide for the replication of initial results in additional population samples. As a result, successful studies have frequently depended upon broad consortia, sometimes crossing international boundaries. There are clearly multiple paths to successful consortia. In some cases, long-standing professional relationships between investigators have provided the basis for the formation of these groups. Often, seed funding from patient advocacy organizations has proven crucial in developing the patient registries and sample repositories necessary to launch GWAS efforts. In other situations, the NIH has provided resources, and even mandated collaboration, in order to achieve the necessary scale of operations.
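The scale of the false-positive problem, and the reason replication matters, can be illustrated with simple arithmetic. The sketch below is illustrative only: the marker count (500,000) and the significance thresholds are assumptions chosen for the example, not figures from the roundtable, and the function names are hypothetical. It treats every marker as unassociated with disease and counts how many would be expected to pass a threshold by chance.

```python
# Illustrative arithmetic only. The marker count and thresholds below are
# assumptions made for this example, not values reported at the roundtable.

def expected_false_positives(n_markers, alpha):
    """Expected number of unassociated markers that pass a per-test threshold."""
    return n_markers * alpha


def bonferroni_threshold(n_markers, family_alpha=0.05):
    """Per-test threshold that holds the family-wise error rate at family_alpha."""
    return family_alpha / n_markers


def expected_after_replication(n_markers, alpha_scan, alpha_replication):
    """Expected chance findings that pass both the scan and an independent replication."""
    return n_markers * alpha_scan * alpha_replication


if __name__ == "__main__":
    n_markers = 500_000  # assumed size of a genome-wide marker panel
    print("Chance hits at a nominal 0.05 threshold:",
          expected_false_positives(n_markers, 0.05))
    print("Bonferroni per-marker threshold:",
          bonferroni_threshold(n_markers))
    print("Chance hits surviving scan (1e-4) and replication (0.05):",
          expected_after_replication(n_markers, 1e-4, 0.05))
```

Under these assumptions, a nominal threshold would pass tens of thousands of markers by chance alone, while a stringent per-marker threshold or an independent replication step reduces the expected number of chance findings to a handful or fewer. Stringent thresholds in turn demand larger samples to retain power, which is consistent with the reliance on consortia described above.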
Scientific challenges and opportunities
GWAS projects in the NIAMS mission areas are in various stages of development. Recent findings in rheumatic diseases, such as systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA), have originated in studies of simple case-control design. However, some diseases in the NIAMS portfolio may require more complex approaches to defining phenotypes. Some disorders, such as osteoarthritis (OA), are characterized by multiple subtypes. Specific categorization (e.g., bilateral hand OA vs. knee OA) is required for a well-designed GWAS, but this specificity significantly reduces the number of applicable cases. Both initial scans and replication studies will require rigorous subtype descriptions. Bone mineral density, an important factor in osteoporotic fracture risk, could be treated as a continuously variable trait or, alternatively, used to define cases and controls at the extremes of the distribution. In such areas, considerable discussion and planning are likely to be necessary in order to arrive at optimal study designs.
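The second option for bone mineral density, defining groups at the extremes of the distribution, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 10 percent tail fraction, and the convention of treating the low tail as cases and the high tail as controls are all hypothetical choices made for the example, not a design endorsed by the participants.

```python
# Minimal sketch of an extremes-of-distribution design. The tail fraction and
# the case/control labeling are assumptions made for illustration.

def split_extremes(values, fraction=0.1):
    """Return (low_tail, high_tail): the lowest and highest `fraction` of values."""
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    ordered = sorted(values)
    # Keep at least one observation in each tail.
    k = max(1, int(len(ordered) * fraction))
    return ordered[:k], ordered[-k:]


if __name__ == "__main__":
    # Hypothetical bone mineral density measurements (arbitrary units).
    bmd = [0.71, 0.95, 1.02, 0.88, 1.10, 0.64, 0.99, 1.21, 0.83, 0.92]
    low_tail, high_tail = split_extremes(bmd, fraction=0.2)
    print("Low-BMD group (treated here as cases):", low_tail)
    print("High-BMD group (treated here as controls):", high_tail)
```

The trade-off is visible in the sketch: the extremes design discards the middle of the distribution, so a much larger measured cohort is needed to fill each tail, whereas a continuous-trait analysis uses every participant.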
Research teams are navigating a variety of genotyping platforms. Reliability, efficiency, cost, and availability are important considerations in selecting technologies. Some replication studies deliberately use different genotyping platforms from the initial scan, to reduce bias or artifacts from a single approach. Lack of consistent standards for genotyping and other data gathering at multiple study sites raises concerns about pooling results or conducting replication studies. A single, designated reviewer of information (such as subtype categorization data) from all of a consortium's sites can help ensure uniformity.
Population stratification, or genetic differences that arise from ancestry rather than from genes associated with a disease, is a concern throughout GWAS because of the risk of generating false positives. Although analytical strategies, such as the use of ancestry-informative markers, can minimize these potential errors, correction becomes more challenging when replication studies utilize racial and ethnic populations very different from those of the initial scan. Similarly, many initial scans have had predominantly Caucasian patient populations, but some control populations, which may be drawn from the general population, have had more racial and ethnic diversity. Such differences might be hidden through erroneous self-reporting of ancestry, and may generate apparent associations with variants that are unrelated to the disease risk. Some GWAS designs deliberately select control groups from the same geographic region as the patient sample in order to reduce the potential for error.
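A small worked example shows how a mismatch in ancestry between cases and controls can produce an apparent association. All counts below are invented for illustration, and the stratum labels are hypothetical. The variant is constructed to have no association with disease within either ancestry group (carrier frequency is identical in cases and controls within each group), but the two groups differ in carrier frequency and contribute unequally to the case and control samples.

```python
# Worked example with invented counts. Within each ancestry group the variant
# is unrelated to disease, yet the pooled table suggests an association.

def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Odds ratio for carrying the variant, cases versus controls."""
    return (case_carriers * ctrl_noncarriers) / (case_noncarriers * ctrl_carriers)


# (case carriers, case non-carriers, control carriers, control non-carriers)
strata = {
    "ancestry_A": (360, 540, 200, 300),  # carrier frequency 0.4 in cases and controls
    "ancestry_B": (10, 90, 50, 450),     # carrier frequency 0.1 in cases and controls
}

# Pool the two groups by summing each column, as if ancestry were ignored.
pooled = tuple(sum(column) for column in zip(*strata.values()))

if __name__ == "__main__":
    for name, counts in strata.items():
        print(name, "odds ratio:", odds_ratio(*counts))
    print("pooled odds ratio:", round(odds_ratio(*pooled), 2))
```

In this constructed example the odds ratio is 1 within each group, but the pooled odds ratio is well above 1 because cases come mostly from the group with the higher carrier frequency. Matching controls to cases on ancestry or geography, or adjusting for ancestry in the analysis, addresses exactly this kind of artifact.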
Control groups constructed for specific disease GWAS are screened carefully to eliminate individuals with the condition under study. The same control group may sometimes be used for multiple studies, and some investigators have successfully used shared control groups to increase statistical power. However, shared control groups may not be suitable for use in all studies, because of differing prevalence of disease subtypes in populations of different ancestry.
Because GWAS generally reveal marker polymorphisms rather than the actual genetic variant associated with disease, attention must ultimately shift to identifying causal variants and understanding the biological mechanisms of their effects. "Deep sequencing," or medical sequencing, reveals detailed information about an entire genomic region. It provides important confirmation of gene variations identified by GWAS, and may uncover variants and causal targets that elude broader mapping efforts. However, at present, it is labor-intensive and expensive, particularly for large populations. In time, high-throughput sequencing techniques may become standard, affordable, commercial services that will be used to identify therapeutic targets. Other approaches to associating biological mechanisms with GWAS results will require multidisciplinary teams that can test specific genetic variants in animal models and structure-function studies.
It will be important to look beyond the initial genome-wide studies, as shared data resources grow and the number of reported disease associations increases. Independent analyses of existing data, testing of reported associations in new populations, and the combination of existing datasets into larger studies of greater power may require targeted support.
Administrative challenges and opportunities
Sample ownership and individual institutions' policy differences (e.g., institutional review boards, data sharing requirements) may create problems in these research collaborations. Many projects involving clinical samples grapple with older informed consent forms that restrict sample use to the specific, initial research project; this restriction makes the samples unavailable as controls for a study of another disease or for replication studies. International consortia also encounter differences between countries in intellectual property policies and database security. Authorship can also become controversial; some collaborative groups agree that the named consortium should be the author, but individual recognition remains an important criterion of professional advancement in many scientific communities.
Distribution of credit in GWAS is important, if not essential, for investigators in critical stages of their careers, in order to encourage participation in collaborative efforts. Junior investigators can be positioned to lead important follow-up studies, particularly functional studies linked to previously identified causal variants. However, analytical scientists, many of whom are pioneers in this new field and early in their careers, may have a more difficult time obtaining individual recognition. Clinicians, who play important roles in developing and characterizing patient cohorts, face increased clinical duties that leave little time for participation in research. Seasoned investigators are also challenged to juggle the logistics and politics of assembling multidisciplinary teams, in addition to other research duties.
Researchers conducting NIH-supported GWAS are expected to deposit data into dbGaP in a timely fashion. The database is an important resource for sharing information with the broader scientific community; the data may be utilized for further study after requests are reviewed and approved by NIH staff. The investigators providing the data retain exclusive rights to publish for no more than 12 months after the data are made available, but many authors have found it difficult to produce significant, high-quality research articles within this brief period. Researchers also encounter disparate views on GWAS standards among funding and journal reviewers, which affect the timely launch or reporting of GWAS.
Lessons learned from GAIN and other efforts
GAIN and other early GWAS efforts have yielded useful lessons for current and future projects. Steering committees are important for gaining consensus on a wide range of topics, from defining disease subtypes to distributing resources. Because databases and repositories are difficult to change after they have been implemented, it is valuable to plan ahead while they are under development. Although many funding agencies need to impose policies for such projects, particularly to orchestrate large, complex initiatives, overly restrictive rules may hamper novel or multiple strategies in the future. Many efforts have seen success by maintaining flexibility, to take advantage of new approaches in this rapidly developing field. A resource pool (including funding) within a consortium can be beneficial for launching individual studies that make important contributions to the collective effort. Large, international collaborations have been essential to important, recent discoveries.
Potential NIAMS/NIH approaches
The NIAMS could provide additional central coordination of research efforts in particular diseases, such as the development or funding of central repositories, or even organize the formation of consortia. It could also hold a workshop to explore different consortium models, or provide planning grants to help develop collaborative groups.
As noted, many NIH-supported GWAS have been centrally organized. However, NIH is receiving a growing number of individual, investigator-initiated GWAS proposals. It is recognized that individual investigators remain the key driving force for NIAMS-supported research. Still, collaborative approaches to GWAS design seem most likely to produce studies that have adequate statistical power and make efficient use of additional populations for replication. Many investigators feel strongly that the current peer review system is successful in identifying the most scientifically meritorious proposals, based on the criteria that are essential for good GWAS design. Nevertheless, one way the NIAMS could manage the GWAS portfolio and encourage collaboration would be to require prior acceptance of applications proposing GWAS. Many GWAS applications are already subject to such a requirement, imposed on applications requesting more than $500,000 per year. Advance consideration of proposed GWAS applications could help to ensure efficient use of existing resources, such as data, samples, control populations, and bioinformatics expertise.
NIH Data Sharing Policy for GWAS
Genetic Association Information Network (GAIN)
Genes, Environment and Health Initiative (GEI)
NIH Database of Genotype and Phenotype (dbGaP)
BOWCOCK, Anne, PhD
Professor of Genetics
Co-Director, Division of Human Genetics
Washington University School of Medicine
CHRISTIANO, Angela, PhD
Associate Professor of Dermatology, and Genetics and Development
GLASS, David, MD
Professor of Pediatrics
Associate Director, Cincinnati Children's Research Foundation
Cincinnati Children's Hospital Medical Center
GREGERSEN, Peter K., MD
Director, Robert S. Boas Center for Genomics and Human Genetics
Feinstein Institute for Medical Research
Professor of Medicine and Pathology
New York University School of Medicine
HARLEY, John B., MD, PhD
Member and Program Chair, Arthritis and Immunology Research Program
Oklahoma Medical Research Foundation
Professor of Medicine
University of Oklahoma Health Sciences Center
HOCHBERG, Marc C., MD, MPH
Professor of Medicine
Head, Division of Rheumatology and Clinical Immunology
University of Maryland School of Medicine
JACKSON, Rebecca, MD
Associate Professor of Internal Medicine and Physical Medicine
Ohio State University Medical Center
KIEL, Douglas P., MD, MPH
Associate Professor of Medicine, Harvard Medical School
Director, Medical Research, Institute for Aging Research
Hebrew Senior Life
LANE, Nancy, MD
Professor of Medicine and Rheumatology
Director and Endowed Chair, Center for Healthy Aging
University of California at Davis Medical Center
LOUGHLIN, John, PhD
Professor, Musculoskeletal Research Group, Institute of Cellular Medicine
REVEILLE, John, MD
Professor of Internal Medicine
Director, Division of Rheumatology and Clinical Immunogenetics
University of Texas Health Science Center
ROSEN, Clifford, MD
Executive Director, Maine Center for Osteoporosis Research and Education
St. Joseph Hospital
Senior Staff Scientist, The Jackson Laboratory
SPRITZ, Richard A., MD
Professor and Director, Human Medical Genetics Program
University of Colorado, Denver