MOLECULAR & CELLULAR NEUROBIOLOGY - NEUROGENOMICS 
Master Course Cognitive Neuroscience - Radboud University, Nijmegen

Chapter 5: Molecular biology and genetics

The Human Genome Project: purpose and goal  

The Human Genome Project, the first large international effort in the history of biological research, was initiated on October 1, 1990, with completion planned for 2005. However, improvements in technology and competition from the private sector accelerated the timetable. A rough draft covering 90% of the genome was completed in 2000, and the complete sequence became available in 2003. The Human Genome Project sequenced the DNA blueprint for the development of a single fertilized egg into a complex organism. This blueprint is written in the coded message given by the sequence of nucleotide bases (the A's, C's, G's, and T's) that are strung together to make the DNA molecules in the human genome.

While the overall objective was to sequence the human genome, other goals were completed along the way that markedly accelerated the efforts of all investigators involved in biological or medical research. The first goal was to develop a genetic map. This meant developing markers (unique DNA sequences) along each chromosome, each with a readily identifiable chromosomal position, to provide highly informative signposts for the identification of nearby genes. This goal provided thousands of markers spaced 5 to 10 million base pairs apart, spanning the entire human genome, and led to the creation of a genetic "road map" for each chromosome.

As will become evident in a future section of this text, it is this genetic map, with DNA sequences (markers) of known positions (loci) along each chromosome, that enables the mapping of a gene's chromosomal location by genetic linkage analysis. Genetic linkage analysis accelerated the mapping of numerous genes responsible for diseases. Currently over 1500 disease-causing genes are known, owing to the more rapid identification of genes facilitated by the Human Genome Project.

 

The policy of the Human Genome Project is that the entire human DNA sequence, including all identified genes, will be available to the public. Each gene, as it is sequenced, is entered into a publicly accessible database and is available at no cost. In the United States, GenBank (at http://www.ncbi.nlm.nih.gov) is run by the National Center for Biotechnology Information (NCBI) and serves as the public repository of DNA sequence information. The output of the publicly funded Human Genome Project comprises not only the DNA sequences of the various genes but also the intervening sequences.

 

Another goal was to develop a physical map of the regions of the DNA that are expressed as genes. The markers for this map are referred to as expressed sequence tags (ESTs) and consist of short sequences of 200 to 300 bp. Each EST is unique and represents a fragment of a specific gene that has yet to be fully characterized. ESTs are generated by extracting all of the mRNAs in a cell type, which together represent all of the genes expressed in that cell at that time. The mRNA is converted to cDNA with the enzyme reverse transcriptase, the sequences are amplified by the polymerase chain reaction (PCR), and unique sequences are selected and entered into GenBank as ESTs. The sequences of these ESTs are then matched against the plethora of sequences available in the DNA sequence repository. ESTs mapped to their chromosomal locations can thus be used as markers to identify novel genes responsible for disease. The development of this physical map has tremendously accelerated the efforts of investigators to identify novel genes relevant to normal physiology or disease. When a locus harboring a disease gene is mapped to a region, the ESTs in that region serve as candidate genes and greatly facilitate the identification of the gene of interest.
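
The EST workflow described above (mRNA, then cDNA, then a match against known sequence) can be illustrated with a minimal Python sketch. It is a toy illustration only: the sequences are invented, the function names are hypothetical, and exact string matching stands in for the alignment tools that are used in practice.

```python
# Toy illustration of the EST idea: mRNA -> first-strand cDNA -> lookup
# in a reference sequence. All sequences and names here are hypothetical.

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return first-strand cDNA (written 5'->3'): the reverse complement of the mRNA."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def locate_est(est, reference):
    """Find an EST in the reference by exact match on either strand.

    Returns (position, strand) or None. Real pipelines use sequence
    alignment rather than exact matching; this is a simplification.
    """
    position = reference.find(est)
    if position != -1:
        return position, "+"
    position = reference.find(reverse_complement(est))
    if position != -1:
        return position, "-"
    return None


reference = "TTGACCATGGCTAGCTAGGATCCGTACGATTACA"  # invented genomic fragment
mrna = "AUGGCUAGCUAGGAUCCGUACG"                   # invented transcript

cdna = reverse_transcribe(mrna)
hit = locate_est(cdna, reference)
print(cdna, hit)
```

Because first-strand cDNA is complementary to the mRNA, the sketch checks both strands of the reference before reporting a position.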

 

The HapMap Project   

While the Human Genome Project was completed in 2003, other large-scale human genome projects continue. The sequence of the human genome differs by only 0.1% among human beings. This one-tenth of 1%, however, translates into 3 million of the roughly 3 billion bases in the genome. These 3 million bases are now considered to be responsible for essentially all human variation, including predisposition or resistance to diseases. Identifying the sequences responsible for human variation therefore became a major quest for the following decade.
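
The arithmetic behind the 3 million figure can be checked in a few lines of Python. The genome size of about 3 billion base pairs is an assumption implied by the numbers in the text, and the variable names are illustrative.

```python
# Assumption: a genome of roughly 3 billion base pairs, as implied by
# "0.1% translates into 3 million bases".
GENOME_SIZE_BP = 3_000_000_000

# 0.1% is one base in every thousand; integer division keeps the result exact.
differing_bases = GENOME_SIZE_BP // 1000

print(f"{differing_bases:,} bases differ between two individuals")
```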

A great deal of human variation appears to be due to single-base differences referred to as single-nucleotide polymorphisms (SNPs), which are distributed throughout the human genome at an average frequency of about one SNP per 1000 base pairs. While identifying the SNPs responsible for human variation, and the mechanisms by which they exert their effects, is of crucial importance, it is perhaps of even more immediate importance to identify those SNPs that predispose to disease. Their potential to facilitate diagnosis, prevention, and treatment could be enormous.

The difficulty lies in how to identify those SNPs. Searching for SNPs that predispose to disease is quite a different task from identifying mutations responsible for single-gene disorders. A particular SNP is neither necessary nor sufficient for a particular disease and thus contributes only a small percentage of the predisposition to the disease. Inheriting several of these SNPs may have a cumulative effect, expressed in the phenotype of a polygenic disease. The diseases that ultimately must be understood are those due to multiple genes that interact significantly with the environment, such as cardiac diseases, cancer, and mental illness.

To facilitate future studies identifying SNPs and their related phenotypes in polygenic diseases, a consortium consisting of Canada, Japan, the United Kingdom, China, Nigeria, and the United States was formed to sequence and identify SNPs. The overriding question was whether SNPs are coinherited in blocks, known as haplotypes, hence the name HapMap Project. The published results do indeed indicate that several of the SNPs are coinherited as blocks and exert a combined effect. One can therefore select SNPs that tag other SNPs in the same block, making it practical to scan the genome with 300,000 to 500,000 SNPs as opposed to several million.
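
The idea of tagging can be made concrete with a small Python sketch. It is not the HapMap consortium's procedure: it assumes phased haplotypes coded 0/1, measures coinheritance with the squared correlation r^2, and uses a simple greedy selection with an assumed threshold of 0.8. The SNP names and data are invented.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1 per haplotype."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x & y for x, y in zip(a, b)) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return 0.0  # a monomorphic SNP carries no information
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denominator


def pick_tag_snps(haplotypes, threshold=0.8):
    """Greedily choose SNPs so that every SNP has r^2 >= threshold with a chosen tag."""
    names = list(haplotypes)
    covers = {name: {name} for name in names}  # each SNP tags itself
    for s, t in combinations(names, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            covers[s].add(t)
            covers[t].add(s)
    untagged = set(names)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(names, key=lambda name: len(covers[name] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Invented data: 8 haplotypes, 5 SNPs forming two blocks.
haplotypes = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp3": [1, 1, 1, 1, 0, 0, 0, 0],
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
    "snp5": [0, 1, 0, 1, 0, 1, 0, 1],
}

tags = pick_tag_snps(haplotypes)
print(tags)
```

In this toy data set two tags capture all five SNPs, which is the same kind of reduction, on a far smaller scale, that lets a few hundred thousand SNPs stand in for several million.
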
While each human being carries only about 3 million SNPs, the general population is estimated to harbor about 17 million. It now appears that 500,000-SNP chips can be used for genome-wide scans, which significantly decreases the cost compared with using 2 or 3 million SNPs. One continuing challenge is the low frequency of many of these SNPs: many occur at a frequency of less than 5%, which makes detection by current technology very difficult. Common SNPs, occurring at a frequency of 5 or 10%, can however be detected in genome-wide scans that use 500,000 SNPs as markers. Probably only 50,000 to 100,000 SNPs are responsible for significant change in humans, since most SNPs do not affect coding regions, although the proportion of SNPs in noncoding promoter regions that may markedly influence transcription remains to be determined. See also under "Genetic variations: SNPs and CNVs".
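
The 5% frequency cutoff mentioned above can be illustrated with a short Python sketch that computes a minor allele frequency per SNP and keeps only the common ones. The genotype coding (0, 1, or 2 copies of the alternate allele per person), the SNP names, and the data are assumptions made for illustration.

```python
COMMON_THRESHOLD = 0.05  # the 5% cutoff used in the text


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele, given 0/1/2 alternate-allele counts per person."""
    alternate_alleles = sum(genotypes)
    total_alleles = 2 * len(genotypes)  # two chromosome copies per person
    p = alternate_alleles / total_alleles
    return min(p, 1 - p)


# Invented genotypes for 20 people at three hypothetical SNPs.
genotypes = {
    "snp_common": [0, 1, 1, 2, 0, 1, 0, 0, 1, 0] * 2,  # alternate allele at 30%
    "snp_rare": [0] * 19 + [1],                        # alternate allele at 2.5%
    "snp_major_alt": [2] * 16 + [1] * 4,               # alternate allele at 90%
}

frequencies = {name: minor_allele_frequency(g) for name, g in genotypes.items()}
common_snps = [name for name, maf in frequencies.items() if maf >= COMMON_THRESHOLD]

print(frequencies)
print(common_snps)
```

Taking the minimum of p and 1 - p means a SNP counts as common whenever its rarer allele passes the cutoff, regardless of which allele was labeled as the alternate.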

 

