Development of integrative statistical algorithms for the analysis of gene expression data.

Postgraduate Thesis uoadl:2878644

Unit:
Specialty Molecular Biomedicine: Mechanisms of Disease, Molecular and Cellular Therapies, and Bioinnovation
Library of the School of Health Sciences
Deposit date:
2019-07-15
Year:
2019
Author:
Fanidis Dionysios
Supervisors info:
Pantelis Hatzis, Researcher B, Biomedical Sciences Research Center "Alexander Fleming"
Despina Sanoudou, Associate Professor, Medical School, National and Kapodistrian University of Athens (NKUA)
Aristotelis Chatziioannou, Researcher B, National Hellenic Research Foundation
Original Title:
Development of integrative statistical algorithms for the analysis of gene expression data.
Languages:
English
Translated title:
Development of integrative statistical algorithms for the analysis of gene expression data.
Summary:
In the past few years, RNA-seq has become the technology of choice for monitoring gene expression at massive scale. Although its benefits outweigh its potential pitfalls, RNA-seq, like every high-throughput technique, exhibits certain technical and systematic biases. Such biases become more evident in real-life experimental settings, such as searching for a signature that differentiates healthy from diseased tissue or finding a set of genes whose expression varies significantly across a time course or a drug dosage study. To confront the biases inherent in RNA-seq data, many different statistical analysis approaches have been proposed, each with its own advantages and drawbacks. Given the limited research dedicated to developing meta-analysis pipelines capable of improving on the results of individual methods, we present metaseqR2, an upgraded version of the previously released metaseqR Bioconductor package. By including some of the best-performing and most popular statistical tools for differential expression analysis, as well as a newly supported organism, metaseqR2 is an all-in-one, powerful tool for RNA-seq data analysis. Moreover, we demonstrate that PANDORA, the main p-value combination method behind the metaseqR2 package, not only continues to perform very well within the metaseqR2 statistical environment, but also behaves very robustly under different analysis pipelines. Finally, in the presence of RNA-seq biases such as the gene length bias and the recently discovered bias in the detection of differentially expressed lncRNAs, PANDORA is probably the most reliable solution to work with.
Main subject category:
Health Sciences
Keywords:
RNA-sequencing, Differential gene expression, p-value combination
Index:
Yes
Number of index pages:
2
Contains images:
Yes
Number of references:
58
Number of pages:
75
DionysiosFanidis_MSc_Thesis_FinalText.pdf (13 MB)