Escherichia coli, originally known as Bacterium coli commune, was identified in 1885 by the German pediatrician Theodor Escherich. It is a Gram-negative, straight, rod-shaped bacterium of the Enterobacteriaceae family. Owing to its rapid growth rate, simple nutritional requirements and established genetic manipulation techniques, E. coli has become a model organism in the basic biomolecular sciences for understanding various biological phenomena. As a result, well-established information about E. coli's genetics and several completed genome sequences are available. Among them, the genome sequences of two closely related non-pathogenic K-12 strains, MG1655 and W3110, have been accurately determined. Resequencing of PCR products of selected regions indicates that there are only eight true insertion/deletion or base differences between the two strains, in addition to the 13 sites where differences are due to insertion sequences or defective prophages and two sites that differ because of the W3110 inversion between the ribosomal RNA genes rrnD and rrnE. The rate of nucleotide change between the two strains is estimated to be relatively low, and their genome structures are almost identical. Hence the following important questions arise:
- Is this high degree of similarity at the nucleotide level reflected in the metabolic phenotype?
- Do sub-strains with almost identical genome structures exhibit similar behaviour in cellular metabolism?
- How do global aspects of cell metabolism, protein synthesis and gene expression differ among closely related sub-strains of the same species, revealing possible complexities of cellular metabolism?
To address these questions, we analysed the growth behaviour under strictly controlled conditions, together with the global proteome and transcriptome pools, of the closely related E. coli sub-strains W3110 and MG1655. For global proteomic profiling we applied conventional 2-dimensional polyacrylamide gel electrophoresis, which is still the major method for global proteome analysis. Global changes in gene expression levels were analysed using microarrays, which provide quantitative information about gene expression.
Being the most extensively studied model organism, E. coli is frequently used in molecular evolutionary studies. Potential reasons for this include its capacity to propagate and reproduce quickly, which allows evolution experiments to run for many generations in a short time span, and the possibility of storing both evolved and ancestor strains, which allows direct comparison between them. Several studies have accordingly used gene expression and proteome profiling methods to study molecular evolution, but these studies were confined to a single type of evolution process and focused on a single molecular aspect that characterizes a cell (transcript or protein abundance). Metabolome profiling has frequently been applied to obtain quantitative information on metabolites in studies of mutational or environmental effects, but not in an evolutionary context. In our study, we characterized molecular evolution processes in the laboratory in two strains, MG1655 and DH10B, under three different evolutionary conditions and at all three functional levels of the cell (transcriptome, proteome and metabolome). Data sets from all three functional levels are of vital importance for obtaining a global picture of the experimental sample in question.
To rule out the possibility that the observed evolution is a strain-dependent phenomenon, and to examine the parallelism of the laboratory evolution processes, we examined all the evolutionary processes in two strains. The major questions that arose during our study were:
- What are the transcriptome, proteome and metabolome changes occurring during the excess-nutrient adaptive evolution process?
- Which genes, proteins and metabolites are vitally involved in the prolonged stationary phase evolution process?
- What transcript, protein and metabolite changes result from the pleiotropic effects of an environmental shift?
- To what extent are the changes occurring during these evolutionary processes seen in both strains?
- Is the path of evolution in these evolutionary processes similar in the two strains (parallelism)?
By applying global protein profiling technologies and integrating the multidimensional datasets generated, we aimed to find the vital genes, proteins and metabolites involved in the evolutionary processes under three conditions in two E. coli K-12 strains. The datasets generated at all three functional levels would provide an initial resource for the systems biology of microbial evolution.