The ability of a bacterial cell to respond to a complex environment depends on the network of genetic interactions that processes information about the state of the cell and its local environment. The aim of this study is to better understand bacterial genetic regulatory networks, which conceptually consist of nodes (the genes) and edges (the interactions between the genes). Two problems are addressed, one focusing on the nodes in the network, the other on the edges.
The problem concerning the nodes in the network is that a large fraction of bacterial genes have no known function. Even the most studied microbe, E. coli, still has around 30 percent of its genes uncharacterised. Using phenotype data and genomic data, a method for gene function prediction is improved and implemented in a tool called GeneTrawler. The tool is validated against a number of well-studied phenotypes and, when the phenotype is well defined, successfully predicts the majority of genes known to be involved in it. In addition, it can suggest genes that depend on a phenotype but are not directly involved in it.
The second problem addressed in this thesis is reconstructing gene networks from DNA microarray data. DNA microarrays can measure the expression levels of thousands of genes simultaneously. Elucidating the topology of biochemical or genetic regulatory networks is a common problem in molecular biology, but piecing together gene regulatory networks with standard molecular methods not only requires significant time and effort, but may also miss the global or modular structure of the network. In addition, different stimuli can substantially change the "wiring" of a network, meaning that understanding how a network functions over time requires a global analysis of its dynamics combined with a fine-grained approach.
There has been much interest in the possibility of reconstructing genetic regulatory interactions from microarray data. We investigate the limitations of such reconstruction procedures, in terms of what kinds of networks can (and cannot) be reconstructed. A method of reconstruction is investigated that uses evolutionary computation to "evolve" a population of networks to resemble a target network. The results demonstrate that the topology of gene regulatory networks can be reconstructed from time series data if the network displays a particular category of dynamics, in which negative feedback loops determine the dynamical behaviour of the network.
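To make the idea of "evolving" a population of networks towards a target more concrete, the following is a minimal illustrative sketch and not the method used in the thesis. Everything in it is an assumption made for illustration: the discrete-time tanh dynamics, the three-gene negative feedback loop used as the target, the mutation-only search with elitist selection, and all function and parameter names (`simulate`, `error`, `mutate`, `evolve`).

```python
import math
import random


def simulate(weights, x0, steps):
    """Assumed toy dynamics: x[t+1] = tanh(W x[t]); returns the time series."""
    n = len(x0)
    series = [list(x0)]
    for _ in range(steps):
        prev = series[-1]
        series.append([
            math.tanh(sum(weights[i][j] * prev[j] for j in range(n)))
            for i in range(n)
        ])
    return series


def error(weights, x0, target_series):
    """Sum of squared differences between simulated and target time series."""
    sim = simulate(weights, x0, len(target_series) - 1)
    return sum(
        (a - b) ** 2
        for sim_row, target_row in zip(sim, target_series)
        for a, b in zip(sim_row, target_row)
    )


def mutate(weights, rng, scale=0.3):
    """Return a copy of the network with one randomly chosen edge perturbed."""
    child = [row[:] for row in weights]
    i = rng.randrange(len(child))
    j = rng.randrange(len(child))
    child[i][j] += rng.gauss(0.0, scale)
    return child


def evolve(target_series, x0, pop_size=30, generations=100, seed=0):
    """Evolve candidate weight matrices to reproduce the target time series."""
    rng = random.Random(seed)
    n = len(x0)
    population = [
        [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
        for _ in range(pop_size)
    ]
    history = []  # best error seen at each generation
    for _ in range(generations):
        population.sort(key=lambda w: error(w, x0, target_series))
        history.append(error(population[0], x0, target_series))
        # Elitist selection: keep the better half unchanged, refill by mutation.
        survivors = population[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    population.sort(key=lambda w: error(w, x0, target_series))
    history.append(error(population[0], x0, target_series))
    return population[0], history


# Hypothetical target: a three-gene negative feedback loop
# (gene 2 represses gene 0, gene 0 represses gene 1, gene 1 represses gene 2).
target_weights = [
    [0.0, 0.0, -2.0],
    [-2.0, 0.0, 0.0],
    [0.0, -2.0, 0.0],
]
x0 = [0.5, -0.2, 0.1]
target_series = simulate(target_weights, x0, steps=20)

best, history = evolve(target_series, x0)
print("initial best error:", history[0])
print("final best error:  ", history[-1])
```

Because the better half of each generation is carried over unchanged, the best error in this sketch can never increase from one generation to the next. A close fit to the time series does not by itself guarantee that the recovered edges match the target topology, which is the kind of limitation the thesis sets out to investigate.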