Undergraduate Honors Thesis Presentation: Gene Ontology Analysis for Evaluation of Gene Regulatory Networks

Speaker: Thomas Westbrook, Washington University in Saint Louis

Abstract: Gene regulation is the process that determines which genes in a cell’s genome are active and producing proteins and which are kept inactive. A common way to describe this critical system is to infer a gene regulatory network (GRN), which can be represented as a graph whose nodes are genes and whose edges represent regulation. There are many methods of inferring GRNs from biological data, but there is no ground truth against which to assess the quality of the resulting networks. This creates a need for systematic metrics to compare GRNs produced by different inference methods. In this project, an R pipeline was written to produce comparable metrics that evaluate the biological information contained in a GRN. For each regulator in a network, the pipeline first performs over-representation analysis for annotation to Gene Ontology (GO) terms on the set of genes that the regulator regulates. Annotation of a gene to a GO term indicates that the gene is involved in a particular biological process. Several candidate metrics that aggregate these results to compare GRNs were then explored. Two were chosen as key metrics for comparison: the percentage of regulators annotated to their over-represented GO terms, and the penalized sum of the negative log of the smallest over-representation p-value for each regulator.
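
The following is a minimal illustrative sketch of the kind of computation the abstract describes. It is not the thesis pipeline, which was written in R and is not reproduced here. It assumes over-representation is tested with a one-sided hypergeometric test, that a regulator counts as "annotated to its over-represented GO terms" when it is annotated to at least one term with p-value at or below a cutoff alpha, and that the logarithm is base 10. The abstract does not specify the penalty in the second metric, so the sketch computes only the unpenalized sum, and it omits multiple-testing correction. All function names, gene names, and term labels are hypothetical toy placeholders.

```python
from math import comb, log10


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric: population M, n annotated genes, N drawn."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def over_representation(targets, annotations, background):
    """Return {term: p_value} for one regulator's targets (terms with no overlap skipped)."""
    targets = set(targets) & background
    pvals = {}
    for term, genes in annotations.items():
        genes = set(genes) & background
        k = len(targets & genes)
        if k == 0:
            continue
        pvals[term] = hypergeom_upper_tail(k, len(background), len(genes), len(targets))
    return pvals


def summarize(network, annotations, background, alpha=0.05):
    """Return (percent of self-annotated regulators, unpenalized sum of -log10 min p)."""
    n_self_annotated = 0
    neg_log_sum = 0.0
    for regulator, targets in network.items():
        pvals = over_representation(targets, annotations, background)
        if not pvals:
            continue
        neg_log_sum += -log10(min(pvals.values()))
        enriched = [t for t, p in pvals.items() if p <= alpha]
        # Assumption: "at least one" enriched term containing the regulator counts.
        if any(regulator in annotations[t] for t in enriched):
            n_self_annotated += 1
    pct = 100.0 * n_self_annotated / len(network)
    return pct, neg_log_sum


# Toy data (hypothetical): 10 background genes, 2 terms, 2 regulators.
background = {"R1", "R2", "g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"}
annotations = {
    "term_A": {"R1", "g1", "g2", "g3"},
    "term_B": {"g5", "g6"},
}
network = {
    "R1": {"g1", "g2", "g3"},  # regulator -> set of regulated genes
    "R2": {"g5", "g7"},
}

pct, neg_log_sum = summarize(network, annotations, background)
print(pct, neg_log_sum)
```

Under these assumptions, two networks inferred from the same data could be compared by running `summarize` on each with the same annotations and background. The thesis metric would additionally apply its penalty term, which this sketch leaves out.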