Arabidopsis MAGIC lines

Proteins are the workhorses of the cell: they facilitate chemical reactions, act as gene switches and have structural roles. For cells to work efficiently, proteins need to be produced in the right place, at the right time and in the right amount. They also need to be removed when no longer needed. Crick's Central Dogma states that coding sequences of DNA are transcribed into mRNAs, which in turn are translated into proteins.

There are many levels at which this process is regulated and there are still many gaps in our knowledge. We expect both inherited and environmental differences between individuals to play important roles in the control of proteins. This project seeks to use the model plant, Arabidopsis thaliana, to answer fundamental questions about the control of protein expression, including which mechanisms are important and how they interact in a complex multi-cellular organism. We also aim to determine to what extent the protein content of a given cell, tissue or organ predicts observable traits (the phenotype) of the plant.

To address these questions, we have designed an integrated programme of experiments and sophisticated mathematical analysis around a genetically variable population of Arabidopsis (known as the MAGIC population). This is a powerful genetic resource for mapping sections of DNA that correlate with variation in a trait (known as quantitative trait loci, QTL), to identify causal variants and dissect the regulation of genome expression. We will characterise and compare the following different processes that potentially influence protein expression in the MAGIC lines:
1. Structural variation within the genome (including small-scale variation and large-scale structural rearrangements).
2. Chromatin accessibility, a measure of the availability of a given region of DNA for transcription.
3. Chemical modifications to DNA that do not involve a change in DNA sequence, known as epigenetic marks, which often indicate environmental perturbation.

It is important to take a holistic approach, because the amount of any given protein in an individual is determined by the balance of these processes. Much effort has been spent studying gene transcription, because it is relatively easy to measure on a genome-wide scale. However, evidence suggests that transcription is a poor predictor of protein abundance, because the control of translation and protein degradation is also important, particularly in plants. Less research has been done on measuring translation, protein amount and protein breakdown, but advances in technology now let us do so. Although it is relatively straightforward to measure genomic structural variation and epigenetic marks such as DNA methylation, their impact on protein expression is unclear. Therefore, we are in an exciting position to provide enormous insight into protein regulation.

The power of this project derives from innovative computational analysis that will enable us to apportion the relative contributions of genotype, transcription, protein synthesis and protein degradation and identify networks controlling protein expression. Because collecting genome-scale data from many samples is expensive and time-consuming, we will use novel statistical methods to get more information without significantly increasing sample size, including combining different layers of information. This will be the first study of this kind on this scale. As well as depositing our data in public repositories, our findings will be made available to the academic community via a user-friendly knowledge discovery and gene mining resource. The approaches developed in this project will provide valuable fundamental insights that will be applicable to other organisms and which will also pave the way to future crop improvement.

This project aims to understand how protein abundance is controlled in plants and to determine the phenotypic consequences of proteomic variation, together with genotypic, structural, epigenotypic and transcriptomic variation. We propose an integrated programme of quantitative trait loci (QTL) analysis of an Arabidopsis multiparental advanced generation intercross (MAGIC) population. Firstly, we will determine all variation in the 19 MAGIC founders, and interactions between different 'omic layers, via a comprehensive set of assays. Long-read sequencing of 18 founders' genomes will be performed for comparison of structural variation relative to the 19th founder, the Col-0 reference. We will measure the epigenetic mark of cytosine DNA methylation, and chromatin accessibility by ATAC-seq. Transcript abundance and regulatory RNA species will be analysed by RNA-seq, and protein translation and abundance quantified by Ribo-seq and proteomics, respectively.
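
As an illustration of what apportioning the relative contributions of transcription, protein synthesis and protein degradation to protein abundance could look like, the sketch below uses a simple covariance decomposition. It is not the project's method. It assumes, purely for illustration, a steady-state model in which log protein abundance is the sum of a log mRNA term, a log translation term and a negative log degradation term. The data are simulated and the function names (`covariance`, `apportion_variance`) are hypothetical.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def apportion_variance(components):
    """Return each component's share of the variance of their sum.

    components: dict mapping a name to a list of per-line values (log scale).
    share_i = Cov(component_i, total) / Var(total), so the shares sum to 1.
    """
    names = list(components)
    n = len(components[names[0]])
    total = [sum(components[name][i] for name in names) for i in range(n)]
    var_total = covariance(total, total)
    return {
        name: covariance(components[name], total) / var_total
        for name in names
    }


# Simulated, purely illustrative values for 200 hypothetical lines.
rng = random.Random(0)
n_lines = 200
log_mrna = [rng.gauss(0.0, 1.0) for _ in range(n_lines)]
log_translation = [rng.gauss(0.0, 0.8) for _ in range(n_lines)]
# Degradation lowers abundance, so it enters the sum with a negative sign.
neg_log_degradation = [-rng.gauss(0.0, 0.6) for _ in range(n_lines)]

shares = apportion_variance({
    "transcription": log_mrna,
    "translation": log_translation,
    "degradation": neg_log_degradation,
})
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

Under this assumed additive model the shares always sum to one, which makes them easy to compare across proteins. Real data would also need terms for measurement noise and for genotype, which this sketch omits.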