Analysis of small RNAs from deep sequencing data using spipeRNA

Postgraduate Thesis uoadl:1326236

Unit:
Specialization in Bioinformatics
Informatics
Deposit date:
2016-12-19
Year:
2016
Author:
Handzlik Joanna-Elzbieta
Supervisors info:
Hatzigeorgiou Artemis, Professor of Bioinformatics, Department of Computer, Telecommunications and Network Engineering, University of Thessaly
Georgios M. Spyrou, Head of the Bioinformatics Group, Cyprus Institute of Neurology and Genetics
Ioannis Vlachos, Researcher, Medical School, Harvard University
Original Title:
Ανάλυση των μικρών μορίων RNA από δεδομένα αλληλούχησης επόμενης γενιάς με τη χρήση του spipeRNA
Languages:
Greek
Translated title:
Analysis of small RNAs from deep sequencing data using spipeRNA
Summary:
Some of the most important technological developments in biotechnology in recent
years are summarized under the term “Next Generation Sequencing (NGS)”. While the
sequencing of the first human genome (3 gigabases per haploid genome) took about 15
years and roughly 100 million US dollars in material costs alone, today the raw
sequencing data for a complete human genome (100 gigabases at 30x coverage) can
be produced by a single machine within a few days for just 1,000 US dollars. This
technological quantum leap has paved the way for numerous exciting applications such
as de novo sequencing, transcriptome (RNA-Seq) and methylome (methyl-Seq)
analysis, the determination of transcription factor binding sites (ChIP-Seq), the detection
of disease-causing mutations, and many others.
The purpose of this study was the design and implementation of a computational tool
dedicated to the analysis of small RNA-Seq data, which forms part of the overall
transcriptome (RNA-Seq) analysis. This analysis aims to quantify the expressed small
RNA molecules and to detect new, non-annotated expression regions in various
biological samples.
The implemented algorithm, called “spipeRNA”, tries to overcome several open
challenges: it quantifies all types of small RNAs, not only miRNAs; resolves the
problem of multi-mapped reads; and appropriately handles reads without existing
annotation.
This study presents the results obtained by applying the tool to simulated data
and to 8 small RNA-Seq datasets, which include tumor and healthy lung and pancreas
samples. A comparison between spipeRNA and a very popular tool for miRNA
analysis showed that, in some cases, spipeRNA may produce more precise and
accurate output.
spipeRNA is an integrated data analysis pipeline, based on a reliable, flexible and
fully automated workflow, useful for fast and efficient analysis of small RNA-Seq data
produced by next-generation sequencers.
Main subject category:
Science
Keywords:
small non-coding RNAs, Next Generation Sequencing, NGS, reads alignment, annotation of genomic regions, microRNA, snoRNA, snRNA, tRNA, rRNA, siRNA
Index:
Yes
Number of index pages:
2
Contains images:
Yes
Number of references:
14
Number of pages:
79
master_thesis.pdf (2 MB)
spipeRNA.zip (9 MB). File access is restricted.