Gene clustering of transcriptomic data using Deep Learning algorithms

Postgraduate Thesis uoadl:3395607

Unit:
Specialization in Bioinformatics-Computational Biology
Library of the School of Science
Deposit date:
2024-04-05
Year:
2024
Author:
Tsotra Ioanna
Supervisors info:
Βασιλική Οικονομίδου, Associate Professor, Department of Biology, National and Kapodistrian University of Athens (NKUA) (Supervisor),
Ιωάννης Τρουγκάκος, Professor, Department of Biology, NKUA,
Ιωάννης Μιχαλόπουλος, Staff Research Scientist (Grade B), Biomedical Research Foundation of the Academy of Athens
Original Title:
ΟΜΑΔΟΠΟΙΗΣΗ ΓΟΝΙΔΙΩΝ ΑΠΟ ΔΕΔΟΜΕΝΑ ΜΕΤΑΓΡΑΦΩΜΙΚΗΣ ΜΕ ΧΡΗΣΗ ΑΛΓΟΡΙΘΜΩΝ ΒΑΘΙΑΣ ΜΑΘΗΣΗΣ
Languages:
Greek
Translated title:
Gene clustering of transcriptomic data using Deep Learning algorithms
Summary:
Studying the transcriptome is important for investigating the molecular components of cells and tissues, interpreting the functional elements of the genome, and understanding how a disease develops. Transcriptomic data (RNA-seq or microarrays) are analyzed with two kinds of approaches: differential gene expression analysis and gene co-expression analysis. The present work studied gene co-expression, which centers on combining many transcriptomic datasets from the same organism, obtained from different tissues or developmental stages. Genes with similar expression patterns tend to participate in common biological processes. The aim of the work was to group human genes into clusters of correlated genes according to their co-expression levels, using Machine Learning and, more specifically, Deep Learning techniques. The data were Next Generation Sequencing RNA-seq data, both bulk and single-cell, drawn from the public GTEx (Genotype-Tissue Expression) database, which studies the expression and regulation of human genes as well as genetic polymorphisms. The bulk RNA-seq data contained the read counts of each gene in each human sample, while from the single-cell RNA-seq data, after some modifications, the part of the table holding the gene counts was extracted. Using techniques and algorithms drawn mainly from Neural Networks, we ran clustering experiments, first on samples and then on genes. The most effective types of neural network appear to be Autoencoders, Variational Autoencoders and Graph Neural Networks. Beyond the goal of generating gene clusters, we also investigated Deep Learning algorithms for analyzing and generating better input data (gene expression arrays) for the clustering algorithms.
Grouping genes with similar expression patterns can be a useful approach, as it enables the identification of groups of genes that are functional partners whose relationship has not yet been fully determined, the elucidation of molecular mechanisms, and even the discovery of potential metabolic pathways.
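As a rough illustration of what grouping genes by co-expression means, the sketch below clusters genes whose log-transformed count profiles are strongly correlated across samples. It is not the method of the thesis, which relied on neural networks; it is a plain correlation-threshold baseline. The function name, the 0.9 cut-off and the toy count matrix are all assumptions made for the example.

```python
import numpy as np

def coexpression_clusters(counts, threshold=0.9):
    """Group genes whose log-expression profiles are strongly correlated.

    counts: array-like of shape (n_genes, n_samples) holding read counts.
    threshold: hypothetical Pearson correlation cut-off (an assumption).
    Returns a list with one integer cluster label per gene.
    """
    # log1p dampens the effect of very large counts
    log_expr = np.log1p(np.asarray(counts, dtype=float))
    # Gene-by-gene Pearson correlation across samples
    corr = np.corrcoef(log_expr)
    # Two genes are linked when their correlation reaches the cut-off
    linked = corr >= threshold
    n_genes = linked.shape[0]
    labels = [-1] * n_genes
    current = 0
    # Connected components of the link graph, found by depth-first search
    for start in range(n_genes):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            gene = stack.pop()
            for other in np.flatnonzero(linked[gene]):
                if labels[other] == -1:
                    labels[other] = current
                    stack.append(int(other))
        current += 1
    return labels

# Toy data (invented): two genes rising across samples, two genes falling
toy_counts = [
    [10, 20, 40, 80],
    [11, 19, 42, 79],
    [90, 45, 20, 10],
    [88, 50, 18, 12],
]
labels = coexpression_clusters(toy_counts)
```

On the toy matrix, the two rising genes fall into one group and the two falling genes into another. A deep learning approach of the kind the summary describes would typically replace the raw profiles with a learned low-dimensional representation before any grouping step.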
Main subject category:
Science
Keywords:
Gene co-expression, Deep Learning
Index:
No
Number of index pages:
0
Contains images:
No
Number of references:
122
Number of pages:
118
File:
File access is restricted until 2025-04-08.

tsotra.pdf
5 MB