TY - JOUR
TI - A comparison of batch effect removal methods for enhancement of
prediction performance using MAQC-II microarray gene expression data
AU - Luo, J.
AU - Schumacher, M.
AU - Scherer, A.
AU - Sanoudou, D.
AU - Megherbi, D.
AU - Davison, T.
AU - Shi, T.
AU - Tong, W.
AU - Shi, L.
AU - Hong, H.
AU - Zhao, C.
AU - Elloumi, F.
AU - Shi, W.
AU - Thomas, R.
AU - Lin, S.
AU - Tillinghast, G.
AU - Liu, G.
AU - Zhou, Y.
AU - Herman, D.
AU - Li, Y.
AU - Deng, Y.
AU - Fang, H.
AU - Bushel, P.
AU - Woods, M.
AU - Zhang, J.
JO - The Pharmacogenomics Journal
PY - 2010
VL - 10
IS - 4
SP - 278
EP - 291
PB - Nature Publishing Group
SN - 1470-269X, 1473-1150
DO - 10.1038/tpj.2010.57
KW - batch effect
KW - batch effect removal
KW - cross-batch prediction
KW - microarray
KW - gene expression
KW - MAQC-II
AB - Batch effects are the systematic non-biological differences between
batches (groups) of samples in microarray experiments due to various
causes such as differences in sample preparation and hybridization
protocols. Previous work focused mainly on the development of methods
for effective batch effect removal. However, their impact on
cross-batch prediction performance, which is one of the most important
goals in microarray-based applications, has not been addressed. This
paper uses a broad selection of data sets from the Microarray Quality
Control Phase II (MAQC-II) effort, generated on three microarray
platforms with different causes of batch effects to assess the efficacy
of their removal. Two data sets from cross-tissue and cross-platform
experiments are also included. In the 120 cases studied, using support
vector machines (SVM) and k-nearest neighbors (KNN) as classifiers and
the Matthews correlation coefficient (MCC) as the performance metric, we find
that Ratio-G, Ratio-A, EJLR, mean-centering and standardization methods
perform better than or equivalent to no batch effect removal in 89, 85, 83,
79 and 75% of the cases, respectively, suggesting that the application
of these methods is generally advisable and ratio-based methods are
preferred.
ER -