Transcription

MOLECULAR & CELLULAR NEUROBIOLOGY

Master Course Cognitive Neuroscience - Radboud University, Nijmegen

Chapter 2: Cells and within cells

Transcription: DNA → RNA

The central dogma of molecular biology is well established: DNA is transcribed into RNA, which in turn is translated into a polypeptide, and polypeptides make up the proteins that provide the cell's structure and perform its functions (Figures below). The genetic information inherited by each individual is encoded in the sequence of bases of the DNA in the genome (the genotype); it is expressed as proteins, which give rise to the recognizable characteristics of the individual (the phenotype), such as height and weight. For DNA to produce proteins, it must first go through the intermediary step of RNA. The double-stranded DNA unwinds locally, and one strand is copied into a single-stranded RNA molecule that serves as the template for protein synthesis. Because this process goes from one nucleic acid to another, it is referred to as transcription. It is a key regulatory step in the age-old process of replicating and maintaining life.

There are several aspects to the regulation of transcription, as indicated in the table below. Transcription is initiated when the enzyme RNA polymerase attaches to specific recognition sites on the double-stranded DNA; upon activation by the enzyme, the strands locally unwind and separate. The RNA polymerase copies the DNA sequence into a similar molecule called messenger RNA (mRNA). The binding site for RNA polymerase II is always at the 5′ end of the gene, and the enzyme travels along the single-stranded DNA towards the 3′ end. Besides being single-stranded, mRNA differs from DNA in that the deoxyribose sugar found in DNA is replaced by ribose. Furthermore, uracil (U) replaces thymine (T) and, like T, U pairs exclusively with adenine (A). The RNA transcribed from DNA is usually referred to as the primary transcript and is a complementary copy of the DNA template strand.

The DNA stays inside the nucleus, while the mRNA exits the nucleus. Prior to transport, however, the mRNA undergoes extensive posttranscriptional processing, primarily through three main events: (1) addition of a methylated guanosine to the 5′ end, referred to as the cap, which is important for the initiation of translation; (2) addition of a long tail of repeated adenine nucleotides, called the polyadenine or poly(A) tail, to the 3′ end of the mRNA, which is essential for stability as the mRNA passes out into the cytoplasm to serve as a template for protein synthesis; (3) splicing of the primary transcript, which contains introns and exons, whereby the introns are removed and the exons are joined together prior to exit from the nucleus. The result is referred to as the mature mRNA. The sequence at the 3′ end does not code for protein but carries signals that terminate translation and direct the addition of the poly(A) tail. The mature mRNA exits the nucleus through nuclear pores and, upon entering the cytoplasm, attaches to ribosomes.
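The base-pairing and processing steps described above can be illustrated with a small Python sketch. It is a toy model only: the template sequence, the exon coordinates, the `m7G-` cap label and the tail length are all assumptions chosen for illustration, not values from the course material.

```python
# Toy model of transcription and mRNA processing (illustrative assumptions only).

# Pairing rules when copying a DNA template into RNA: U is used instead of T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (read 5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Keep only the exon intervals (0-based, end-exclusive) and join them in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def process(pre_mrna: str, exons: list[tuple[int, int]], tail_length: int = 10) -> str:
    """Splice the primary transcript, then add a symbolic 5' cap and a poly(A) tail."""
    return "m7G-" + splice(pre_mrna, exons) + "A" * tail_length


# Hypothetical template strand, written 3'->5'; positions 6-11 act as the intron.
template = "TACGGCCATTTCCCGATT"
pre_mrna = transcribe(template)  # primary transcript
mature = process(pre_mrna, exons=[(0, 6), (12, 18)], tail_length=5)

print(pre_mrna)
print(mature)
```

In a real cell the splice sites are recognised from sequence signals by the spliceosome; here the exon coordinates are simply passed in to keep the sketch short.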

The majority of genes are expressed as the proteins they encode. The process thus occurs in two steps:

Transcription = DNA → RNA (transcription of the information encoded in DNA into a molecule of RNA; described here) and
Translation = RNA → protein (translation of the information encoded in the nucleotides of mRNA into a defined sequence of amino acids in a protein; described under "Translation").

Taken together, they make up the "central dogma" of biology: DNA → RNA → protein. Here is an overview:

DNA controls cell function; RNA is synthesized from a gene on the DNA template in the nucleus. The protein is then synthesized from RNA to carry out cell function.

Regulation of gene transcription

Transcription start site

This is where a molecule of RNA polymerase II (pol II) binds. Pol II is a complex of 12 different proteins (shown in the figure in yellow with small colored circles superimposed on it). The start site is where transcription of the gene into RNA begins.

The basal promoter

The basal promoter contains a sequence of 7 bases (TATAAAA) called the TATA box. It is bound by a large complex of some 50 different proteins, including transcription factors. The basal or core promoter is found in all protein-coding genes. This is in sharp contrast to the upstream promoter whose structure and associated binding factors differ from gene to gene. Although the figure is drawn as a straight line, the binding of transcription factors to each other probably draws the DNA of the promoter into a loop. Many different genes and many different types of cells share the same transcription factors — not only those that bind at the basal promoter but even some of those that bind upstream. What turns on a particular gene in a particular cell is probably the unique combination of promoter sites and the transcription factors that are chosen. Transcription factors represent only a small fraction of the proteins in a cell. Hormones exert many of their effects by forming transcription factors.

The complexes of hormones with their receptor represent one class of transcription factor. Hormone "response elements", to which the complex binds, are promoter sites. Embryonic development requires the coordinated production and distribution of transcription factors.
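As a rough illustration of how a basal promoter motif can be located in a sequence, the following Python sketch scans for exact matches of the TATA box quoted above. The promoter sequence is made up, and exact string matching is a simplifying assumption: real TATA boxes vary around the consensus, so practical tools use scoring matrices rather than literal matches.

```python
import re

TATA_BOX = "TATAAAA"  # the 7-base sequence quoted in the text


def find_tata_boxes(sequence: str) -> list[int]:
    """Return 0-based start positions of every exact TATA-box match (overlaps allowed)."""
    # A zero-width lookahead reports each start position, including overlapping ones.
    pattern = re.compile(f"(?={TATA_BOX})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


promoter = "GGCGCTATAAAAGGCTCCGGGACTCAGT"  # hypothetical promoter fragment
positions = find_tata_boxes(promoter)
print(positions)
```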

Transcriptional enhancers, silencers, insulators

Important functional properties are embedded in the non-coding portion of the human genome, but identifying and defining these features remains a major challenge. An initial estimate of the amount of functional non-coding DNA was derived from comparative analysis of the first available mammalian genomes (human and mouse), which indicated that fewer than half of the evolutionarily constrained sequences in the human genome encode proteins, a finding that gained further support when additional vertebrate genomes became available for comparative genomic analyses. These approaches are based on the assumption that the sequences of gene regulatory elements, like those of protein-coding genes, are under negative evolutionary selection, because most changes in functional sequences have deleterious consequences. Thus, statistical measures of evolutionary sequence constraint provide a way to identify potential enhancer sequences within the vast amount of non-coding sequence in the human genome. A large proportion of these constrained non-coding sequences give robust positive signals in various assays and may represent tissue-specific in vivo enhancers active during development.

The overall impact of these presumably functional non-coding sequences on human biology was initially unclear. A considerable urgency to define their locations and functions came from a growing number of known associations of non-coding sequence variants with common human diseases. Specifically, genome-wide association studies (GWAS; see also under “GWAS”) have revealed a large number of disease susceptibility regions that do not overlap protein-coding genes but rather map to non-coding intervals. One possibility that could explain some of these GWAS hits is that the non-coding intervals contain enhancers, a category of gene regulatory sequence that can act over long distances; other categories of functional elements in the non-coding portion of the genome include insulators, negative regulators, promoters and non-coding RNAs (Table 1).

Table 1. Major categories of non-coding functional elements.

Enhancers

Some transcription factors ("enhancer-binding proteins") bind to regions of DNA that are thousands of base pairs away from the gene they control. Binding increases the rate of transcription of the gene. Enhancers can be located upstream, downstream, or even within the gene they control. How does the binding of a protein to an enhancer regulate the transcription of a gene thousands of base pairs away? One possibility is that enhancer-binding proteins, in addition to their DNA-binding site, have sites that bind to transcription factors ("TF") assembled at the promoter of the gene. This would draw the DNA into a loop (as shown in the figure).

Promoters and enhancers

A simplified view of the current understanding of the role of enhancers in regulating genes is summarized in Figure 1. The docking of RNA polymerase II to proximal promoter sequences and transcription initiation are fairly well characterized (see above); by contrast, the mechanisms by which insulator and silencer elements buffer or repress gene regulation, respectively, are less well understood. Transcriptional enhancers are regulatory sequences that can be located upstream of, downstream of or within their target gene and can modulate expression independently of their orientation. In vertebrates, enhancer sequences are thought to comprise densely clustered aggregations of transcription-factor-binding sites. When appropriate occupancy of transcription-factor-binding sites is achieved, recruitment of transcriptional coactivators and chromatin-remodelling proteins occurs. The resultant protein aggregates are thought to facilitate DNA looping and ultimately promoter-mediated gene activation. A number of studies on individual loci suggest that variation in distant-acting enhancer sequences and the resultant changes in their activities can contribute to human disorders (e.g. thalassaemias resulting from deletions or rearrangements of β-globin gene (HBB) enhancers, preaxial polydactyly resulting from sonic hedgehog (SHH) limb-enhancer point mutations, and susceptibility to Hirschsprung's disease associated with a RET proto-oncogene enhancer variant). In addition to the pathological consequences of the removal or the repositioning of distant-acting enhancers, there are also examples of single-nucleotide changes within enhancer elements as a cause of human disorders (e.g. the disease-causing non-coding mutation in the limb-specific long-distance enhancer ZRS of SHH). It remains unclear whether these are rare exceptions or whether variation in enhancers contributes to disease on a pervasive scale.

Figure 1. Gene regulation by distant-acting enhancers. a, For many genes, the regulatory information embedded in the promoter is insufficient to drive the complex expression pattern observed at the messenger RNA level. For example, a gene could be expressed both in the brain and in the limbs during embryonic development (red), even if the promoter by itself is not active in either of these structures, suggesting that appropriate expression depends on additional sequences that are distant-acting and cis-regulatory. However, defining the genomic locations of such regulatory elements (question marks) and their activities in time and space (arrows) is a major challenge. b, c, Tissue-specific enhancers are thought to contain combinations of binding sites for different transcription factors. Only when all required transcription factors are present in a tissue does the enhancer become active: it binds to transcriptional coactivators, relocates into physical proximity with the gene promoter (through a looping mechanism) and activates transcription by RNA polymerase II. In any given tissue, only a subset of enhancers is active, as schematically shown in b and c for the example gene pictured in a, whose expression is controlled by two separate enhancers with brain-specific and limb-specific activities. Insulator elements prevent enhancer–promoter interactions and can thus restrict the activity of enhancers to defined chromatin domains. In addition to activation by enhancers, negative regulatory elements (including repressors and silencers) can contribute to transcriptional regulation.
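The combinatorial logic in the caption (an enhancer is active only when all of its required transcription factors are present, and a gene can be driven by any one of its enhancers) can be sketched in Python. This is a deliberately simplified Boolean model; the factor names `TF_A` to `TF_D`, the tissue compositions and the two enhancers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enhancer:
    name: str
    required_factors: frozenset[str]

    def is_active(self, factors_present: set[str]) -> bool:
        """Active only when every required transcription factor is present."""
        return self.required_factors <= factors_present


def gene_expressed(enhancers: list[Enhancer], factors_present: set[str]) -> bool:
    """The gene is expressed in a tissue if at least one enhancer is active there."""
    return any(e.is_active(factors_present) for e in enhancers)


# Hypothetical enhancers for the example gene with brain and limb expression.
enhancers = [
    Enhancer("brain enhancer", frozenset({"TF_A", "TF_B"})),
    Enhancer("limb enhancer", frozenset({"TF_C", "TF_D"})),
]

# Hypothetical transcription-factor content of three tissues.
tissues = {
    "brain": {"TF_A", "TF_B"},
    "limb": {"TF_C", "TF_D"},
    "liver": {"TF_A", "TF_C"},  # has factors from both sets, but neither set is complete
}

expression = {tissue: gene_expressed(enhancers, tfs) for tissue, tfs in tissues.items()}
print(expression)
```

The liver case shows why sharing individual factors between tissues does not by itself switch the gene on: no single enhancer has its full set of binding sites occupied.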

Silencers

Silencers are control regions of DNA that, like enhancers, may be located thousands of base pairs away from the gene they control. However, when transcription factors bind to them, expression of the gene they control is repressed.

Insulators

Enhancers can turn on promoters of genes located thousands of base pairs away. What is to prevent an enhancer from inappropriately binding to and activating the promoter of some other gene in the same region of the chromosome? One answer: an insulator. Insulators are:

stretches of DNA (as few as 42 base pairs may do the trick)
located between the enhancer(s) and promoter or between the silencer(s) and promoter of adjacent genes or clusters of adjacent genes.

Their function is to prevent a gene from being influenced by the activation (or repression) of its neighbors.
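The positional idea behind insulators can be sketched as follows. This is a one-dimensional simplification in which a regulatory element can act on a promoter unless an insulator lies between them on the chromosome; the coordinates are made up, and real insulator function depends on chromatin structure that this sketch ignores.

```python
def can_regulate(element_pos: int, promoter_pos: int, insulator_positions: list[int]) -> bool:
    """Return True if no insulator lies strictly between the element and the promoter."""
    low, high = sorted((element_pos, promoter_pos))
    return not any(low < pos < high for pos in insulator_positions)


# Hypothetical coordinates in base pairs along one chromosome.
enhancer = 1_000
promoter_gene_1 = 15_000
promoter_gene_2 = 50_000
insulators = [20_000]

print(can_regulate(enhancer, promoter_gene_1, insulators))  # same side of the insulator
print(can_regulate(enhancer, promoter_gene_2, insulators))  # insulator in between
```

The same function applies to silencers, since the text describes insulators as sitting between either kind of element and the promoter.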
