Development and validation of wide-scope retention time prediction models to support suspect and non-target screening of emerging contaminants in environmental samples

Postgraduate Thesis uoadl:1320987

Unit:
Specialization: Analytical Chemistry
Library of the School of Science
Deposit date:
2015-07-21
Year:
2015
Author:
Aalizadeh Reza
Supervisors info:
Thomaidis Nikolaos - Associate Professor, NKUA, Koupparis Michael - Professor, NKUA, Efstathiou Konstantinos - Professor, NKUA
Original Title:
Development and validation of wide-scope retention time prediction models to support suspect and non-target screening of emerging contaminants in environmental samples
Languages:
English
Translated title:
Ανάπτυξη και επικύρωση μοντέλων πρόβλεψης χρόνου ανάσχεσης για την ταυτοποίηση αναδυόμενων ρύπων σε περιβαλλοντικά δείγματα με μη στοχευμένη σάρωση και τεχνικές φασματομετρίας μαζών υψηλής διακριτικής ικανότητας
Summary:
Over the last decade, the application of liquid chromatography-high resolution
mass spectrometry (LC-HRMS) has grown extensively due to its ability to
identify a wide range of suspect and unknown compounds in environmental
samples. However, several lines of evidence, such as the mass accuracy and
isotopic pattern of the precursor ion, MS/MS spectra evaluation and retention
time plausibility, are needed to confirm the identity of each compound. In this
context, a comprehensive workflow based on computational tools was developed to
understand the retention time behavior of a large number of emerging
contaminants. An extensive dataset was compiled, containing retention times of
528 and 303 compounds for positive and negative electrospray ionization mode,
respectively, to expand the applicability domain of the developed models. The
dataset was then split into training and test sets by k-nearest neighbor
clustering, so as to build the models and validate their internal and external
prediction ability. The best subset of molecular descriptors was selected using
genetic algorithms, which are based on evolutionary computation and can yield a
representative selection of descriptors. Multiple Linear Regression, Artificial
Neural Networks and Support Vector Machines were used to correlate the selected
descriptors with the experimental retention times. Several validation
techniques were used to measure the accuracy and precision of the models,
including the Golbraikh-Tropsha criteria for acceptable models, a
Euclidean-distance-based applicability domain, r2m and the concordance
correlation coefficient. The best linear and non-linear models for each dataset
were derived and used to predict the retention times of suspect compounds in a
wide-scope survey, which served as the evaluation data set. Overall, the
proposed workflow was fast, reliable and cost-effective, and it can be employed
as an effective filtering tool for decreasing false positives in wide-scope
HRMS screening of environmental samples.
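As an illustration of two ideas named in the summary, the sketch below computes a concordance correlation coefficient between experimental and predicted retention times, and applies a simple retention time window as a plausibility filter. It is a minimal sketch and not the thesis's implementation: the function names, the use of Lin's formula with population moments, and the tolerance parameter are all assumptions made for this example.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient, using population (1/n) moments.

    Assumption: this is the standard Lin formulation; the thesis may use a variant.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_obs = fmean(observed)
    mean_pred = fmean(predicted)
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    covariance = sum(
        (o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted)
    ) / n
    # The squared mean difference penalizes systematic bias, unlike Pearson's r.
    denominator = var_obs + var_pred + (mean_obs - mean_pred) ** 2
    if denominator == 0:
        raise ValueError("both sequences are constant and equal; CCC is undefined")
    return 2 * covariance / denominator


def is_plausible(experimental_rt, predicted_rt, tolerance_min):
    """Hypothetical filter: keep a suspect candidate only if its measured
    retention time lies within +/- tolerance_min of the model prediction."""
    return abs(experimental_rt - predicted_rt) <= tolerance_min
```

A constant offset between prediction and experiment lowers the coefficient even when the two series are perfectly correlated, so it captures accuracy as well as precision. The tolerance passed to `is_plausible` is a placeholder; in practice it would be chosen from the prediction error of the validated model.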
Keywords:
Retention Time, Suspect Screening, Non-target Screening, High Resolution Mass Spectrometry, Support Vector Machines
Index:
Yes
Number of index pages:
1-6
Contains images:
Yes
Number of references:
63
Number of pages:
132
document.pdf (3 MB)